Happy New Peek

By Jaromir D.B. Nemec

It is 6 A.M. of the New Years day, some DBA recapitulate the resolutions while others are already sleeping. In neither of those situations an alerting call is welcome. Though, exactly that happens. “Our nightly Oracle job is hanging, but nothing was changed!”, is the plain message. “Nothing was changed, nothing was changed”, flashes through the DBA’s head before going to resolve the problem, “why do we ever need a sophisticated database if our business never changes!”.

The Problem

Let’s demonstrate the troubleshooting step by step[1]. If you want to execute the example in your own environment, skip to the Example section at the end where you find the scripts to perform the setup.

The first step is to understand the logic of the running batch. The algorithm is rather simple; the status of some entities (selected based on defined business rules) in the fact table should be updated.

The tricky part is that the fact table is versioned, i.e. each entity consists of a number of versions. The business policy says that only the status of the last version should be updated.

As our business is rather dynamic, each entity contains up to a thousand of versions. In order to handle such a large number of records, the table is partitioned by a business year, defined as the year of the transaction date.

The next step in the troubleshooting is to get the SQL statement that is currently running in the session of the hanging batch. This is not a big task, accessing the views v$session and v$sqltext you get the "hanging" statement. See Figure 1.

UPDATE MY_FACT A

SET STATUS = 'O'

WHERE ENTITY_ID = :B2 AND BUSINESS_YEAR = :B1 AND

VERSION =

(SELECT MAX(VERSION) FROM MY_FACT B

WHERE A.ENTITY_ID = B.ENTITY_ID AND

A.BUSINESS_YEAR = B.BUSINESS_YEAR)

;

Figure 1 The “hanging” Statement

The statement is a relatively simple update corresponding well to the job description above. The entity and the business year are fixed with bind variables (the fact table is partitioned by a business_year – see the table DDL in the Example section). The last version is determined by using a subquery. Apparently the update statement we see is performed in a loop over all selected business entities.

The next thing you should do is to check the execution plan of this statement. With the explain plan statement and dbms_xplan utility this task is quickly done. See Figure 2.

EXPLAIN PLAN set statement_id = 'N3' into plan_table

FOR

UPDATE MY_FACT A

SET STATUS = 'O'

WHERE ENTITY_ID = :B2 AND BUSINESS_YEAR = :B1 AND

VERSION =

(SELECT MAX(VERSION) FROM MY_FACT B

WHERE A.ENTITY_ID = B.ENTITY_ID AND

A.BUSINESS_YEAR = B.BUSINESS_YEAR)

;

---

SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'N3','ALL')) ;

-----------------------------------------------------------------------------------

-----------------------------------------------------------------------------------

| 0| UPDATE STATEMENT | | 1| 16| | |

|* 2| TABLE ACCESS BY LOC.IND.ROWID | MY_FACT | 1| 3| | |

| 3| NESTED LOOPS | | 1| 16| | |

| 4| VIEW | VW_SQ_1 | 2| 10| | |

| 5| SORT GROUP BY | | 2| 10| | |

| 6| PARTITION RANGE SINGLE | | 604| 10| KEY | KEY |

|* 7| TABLE ACCESS BY LOC.IND.ROWID| MY_FACT | 604| 10| KEY | KEY |

|* 8| INDEX RANGE SCAN | MY_FACT_IX1| 1208| 6| KEY | KEY |

| 9| PARTITION RANGE SINGLE | | 1| 2| KEY | KEY |

|* 10| INDEX RANGE SCAN | MY_FACT_IX1| 1| 2| KEY | KEY |

-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - filter("BUSINESS_YEAR"=:B1)

7 - filter("B"."BUSINESS_YEAR"=:B1)

8 - access("B"."ENTITY_ID"=TO_NUMBER(:B2))

10 - access("ENTITY_ID"=TO_NUMBER(:B2) AND "VERSION"="VW_COL_1")

Figure 2 The Execution Plan (output modified to optimise presentation)

What is wrong with this execution plan? First an index range scan on one index partition of our fact table is performed to access all versions of the processed entity (see access and filter predicates in lines 8 and 7 of the execution plan). Then using sort group by operation the maximal version is found. The nested loops operation (line 3) is performed exactly once. The reason is that the version number is unique within each entity. Therefore there exists only one record with the maximal version.

Where should be a problem to access thousand of records via index, sort them and get the record with the maximal version? This execution plan should not cause a hanging job, even if applied in a loop to process a number of entities.

Well, the explain plan apparently doesn’t provide the clue, let's try to effectively execute the statement and observe the execution plan with the autotrace option. Of course, it is not possible to execute the statement without initialising the bind variables. We simply choose some random entity_id and use the current business year. We will see the importance of the choice of the bind variables later on.

SQL> set autotrace traceonly;

SQL> --

SQL> variable ent_id number;

SQL> variable bus_year varchar2(10);

SQL> --

SQL> execute :ent_id := 4; :bus_year := '2006-01-01';

PL/SQL procedure successfully completed.

SQL> --

SQL> UPDATE MY_FACT A

2 SET STATUS = 'O' WHERE ENTITY_ID = :ent_id AND BUSINESS_YEAR = TO_DATE(:bus_year,'YYYY-MM-DD') AND

3 VERSION = (SELECT MAX(VERSION) FROM MY_FACT B

4 WHERE A.ENTITY_ID = B.ENTITY_ID AND A.BUSINESS_YEAR = B.BUSINESS_YEAR)

5 ;

1 row updated.

Execution Plan

----------------------------------------------------------

0 UPDATE STATEMENT Optimizer=ALL_ROWS (Cost=16 Card=1 Bytes=52

)

1 0 UPDATE OF 'MY_FACT'

2 1 TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'MY_FACT' (TABLE)

(Cost=3 Card=1 Bytes=17)

3 2 NESTED LOOPS (Cost=16 Card=1 Bytes=52)

4 3 VIEW OF 'VW_SQ_1' (VIEW) (Cost=10 Card=2 Bytes=70)

5 4 SORT (GROUP BY) (Cost=10 Card=2 Bytes=32)

6 5 PARTITION RANGE (SINGLE) (Cost=10 Card=604 Bytes

=9664)

7 6 TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'MY_FAC

T' (TABLE) (Cost=10 Card=604 Bytes=9664)

8 7 INDEX (RANGE SCAN) OF 'MY_FACT_IX1' (INDEX)

(Cost=6 Card=1208)

9 3 PARTITION RANGE (SINGLE) (Cost=2 Card=1)

10 9 INDEX (RANGE SCAN) OF 'MY_FACT_IX1' (INDEX) (Cost=

2 Card=1)

Statistics

----------------------------------------------------------

220 recursive calls

4 db block gets

125 consistent gets

0 physical reads

64 redo size

637 bytes sent via SQL*Net to client

830 bytes received via SQL*Net from client

3 SQL*Net roundtrips to/from client

11 sorts (memory)

0 sorts (disk)

1 rows processed

SQL> ----

SQL> quit;

Figure 3 The Autotrace Run

Surprisingly, what we see is exactly the same execution plan like the one before. So again, even autotrace provides no hint to the root of the problem. Luckily, the next thought directs the troubleshooting in the right direction. You realise that the database version is 10g and therefore the system view v$sql_plan (introduced in 9i) shows the execution plan of the currently running statement. For further details how to extract the execution plan directly from the view v$sql_plan See [4].

----------------------------------------------------------------------

----------------------------------------------------------------------

| 0| UPDATE STATEMENT | | | 15| | |

|* 2| HASH JOIN | | 1| 15| | |

| 3| PARTITION RANGE SINGLE | | 998| 7| KEY | KEY |

|* 4| TABLE ACCESS FULL | MY_FACT| 998| 7| KEY | KEY |

| 5| VIEW | VW_SQ_1| 998| 7| | |

| 6| SORT GROUP BY | | 998| 7| | |

| 7| PARTITION RANGE SINGLE| | 998| 7| KEY | KEY |

|* 8| TABLE ACCESS FULL | MY_FACT| 998| 7| KEY | KEY |

----------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("VERSION"="VW_COL_1" AND "A"."ENTITY_ID"="ENTITY_ID" AND

"A"."BUSINESS_YEAR"="BUSINESS_YEAR")

4 - filter(("ENTITY_ID"=:B2 AND "BUSINESS_YEAR"=:B1))

8 - filter(("B"."ENTITY_ID"=:B2 AND "B"."BUSINESS_YEAR"=:B1))

Figure 4 The Real Execution Plan

The Cause

Apparently, the real execution plan is completely different from both previous plans generated by explain plan and autotrace. What is the cause of this difference? The bind variables contained in the SQL statement. In Oracle 9i a feature called "bind variable peek" was introduced. For a full explanation of this feature see [1] in the Reference Section. Here we give only a brief overview.

In some situations, when a statement is parsed Oracle peeks (here is the origin of the name of the feature) on the actual value of the bind variables and uses this information while optimising the execution plan. This is certainly an important information that could significantly influence the decision made by the optimiser.

Bind variable peek is not always possible, a typical example of a situation where a peek is not performed is the parsing of a statement within the explain plan statement. The syntax of the explain plan statement allows to process a query containing bind variables without actually defining these variables, so there is no chance to perform the peek. Neither value nor data type of those bind variables is known when the statement is parsed. The optimiser must use the default rule to calculate the selectivity of the predicates with bind variables. The result could be very different from the "real" execution plan based on the peeked values and known data types of bind variables. This is the reason why the execution plan generated by explain plan was so different from the actual one.

What was wrong with autotrace? A run with autotrace actually executes the statement, so the bind variables in this case must be well defined. The presented execution plan is not the “real” executed plan. After the execution an ordinary explain plan of the executed statement is performed and the result plan is sent to the output. There is no peeked information in the execution plan.

We see there is no difference between the explain plan and autotrace option; both can produce misleading execution plans if bind variables are used.

The Trap

The plan switch dependent on the actual value peeked is not an unknown issue. It is well described in case of a histogram defined of the predicate column. Yes of course, if a particular column contains skew data, there is probably more than one method how to access the table based on a predicate on this column (e.g. an index range scan and a table full scan). Depending on the actual value of the bind variable used first time when the statement is parsed, one of those access methods will be chosen. This decision will be used in all other executions of our statement; executions that potentially contain different values of the bind variables making the former optimise plan rather sub optimal. Once upon a time, the cursor will be aged out from the shared pool and the game starts from the beginning. Again, the first parse wins and defines the execution plan for all its successors.

The Partition Peek

Less known is the case in which through the peek the predicate relevant to partition pruning is enriched. Such a predicate (e.g. partition_key_column = :bind_variable) leads without the peek to the runtime pruning marked as KEY-KEY in the execution plan, resulting in using the optimiser statistics from the table level. If a peek is performed, the situation is completely different. Using the peeked value, the exact partition is known (marked as N-N in the columns Pstart and Pstop of the execution plan, where N is the partition number). In the letter case the optimiser statistics from the partition level are used. Depending on the actual peeked value, the optimiser statistics of the proper partition are used.

This is the last piece of puzzle needed to solve the problem.

Explain plan statement (as no peek could be done) uses the table level statistics providing an acceptable execution plan. (see Figure 2)
The whole year 2005 the bind variable peek results in the pruning of the 2005 partition (i.e. the bind variable peek returns the value 2005 for the business_year) that means that the optimiser statistics from the 2005 partition are used. This leads to the same execution plan as in point 1.
Shortly after allocating the partition for the year 2006 there may occur a situation in which the bind variable peek hits the business year 2006. This leads to the usage of statistics from the partition 2006 and in the end to a different execution plan. As the 2006 partition is nearly empty, a hash join with a full scan is considered as the best access.
So far this is not bad as this execution plan is the right one for the 2006 partition. The problem is that the same plan is used repeatedly for all the subsequent executions, especially for those entities stored in the 2005 partition. This makes the processing effectively hanging as the partition for the year 2005 is heavily populated. Full partition scan performed in a loop is a deadly scenario.

The Example

The whole story is demonstrated using an example. This example was tested in Oracle 9i and 10g (versions 9.2.0.5.0 and 10.2.0.1.0).

To create the environment use the script set_up.sql.

The batch behaviour is simulated in the script run.sql. Figure 5 shows the difference between a good and a bad run.

SQL> @run

SQL> -- run bind variable peek example

SQL> -- J.Nemec 2006

SQL> -- http://www.db-nemec.com

SQL> -- this script demonstrates a instability caused by bind

SQL> -- variable peeking.

SQL> -- Due to peek the partition level statistics from different

SQL> -- partition are used leading to a dramatic plan change

SQL> --

SQL> -- Statement 1

SQL> --- peek decides to use statistics from the partition 2005 (good)

SQL> -- Statement 2

SQL> --- peek decides to use statistics from the partition 2006 (bad)

SQL> --

SQL> begin

2 declare

3 i number;

4 cursor c is

5 select entity_id, max(business_year) business_year

6 from my_fact where entity_id between 1 and 1000

7 group by entity_id order by entity_id;

8 begin

9 for x in c loop

10 update my_fact a

11 set status = 'D'

12 where entity_id = x.entity_id and business_year = x.business_year

13 and version = (select max(version) from my_fact b

14 where a.entity_id = b.entity_id and

15 a.business_year = b.business_year);

16 dbms_output.put_line('updating ' || x.entity_id ||

17 ' ' || x.business_year);

18 end loop;

19 end;

20 end;

21 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:09.86

SQL> -- now define the loop with cursor returning the date - 2006 first

SQL> -- note, that the statement must differ from the previous one

SQL> -- to enforce the statement to be parsed with BVP

SQL> begin

2 declare

3 i number;

4 cursor c is

5 select entity_id, max(business_year) business_year

6 from my_fact where entity_id between 4 and 1000

7 group by entity_id order by entity_id;

8 begin

9 for x in c loop

10 update my_fact a

11 set status = 'O' --- status was changed

12 where entity_id = x.entity_id and business_year = x.business_year

13 and version = (select max(version) from my_fact b

14 where a.entity_id = b.entity_id and

15 a.business_year = b.business_year);

16 dbms_output.put_line('updating ' || x.entity_id ||

17 ' ' || x.business_year);

18 end loop;

19 end;

20 end;

21 /

PL/SQL procedure successfully completed.

Elapsed: 00:27:24.03

Figure 5 Two completely different executions of the same statement

To update one thousand of records in the fact table takes approximately ten seconds (see the first run). The second run, differing only in the fact that the first update peeks the new partition, needs nearly half an hour to be completed. This degradation factor is the reason why the job seems to be hanging.

To get an overview of the engaged execution plans use the script generate_plan.sql.

The Peek

To make the story even worse, to check what value would be used in a bind variable peek isn’t the only important thing. Even more important is the information if the peek will be performed at all. In other words, not all database clients perform the bind variable peek. For example while using the dbms_sql to parse a statement no bind variable peek is performed (as a feature). Other clients can omit the peek as a cause of bugs – see [2].

How can we see if a bind variable peek was performed? The best way is to use a 10053 trace while parsing the statement. The Figure 6 shows the relevant part of the trace file which contains the evidence that a peek was done.

*******************************************

Peeked values of the binds in SQL statement

*******************************************

kkscoacd

Bind#0

oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00

oacflg=13 fl2=206001 frm=00 csi=00 siz=24 off=0

kxsbbbfp=07ac1c54 bln=22 avl=02 flg=09

value=1

Bind#1

oacdty=12 mxl=07(07) mxlc=00 mal=00 scl=00 pre=00

oacflg=13 fl2=206001 frm=00 csi=00 siz=8 off=0

kxsbbbfp=07ac455c bln=07 avl=07 flg=09

value="1/1/2005 0:0:0"

Figure 6 – bind variable peek in a 10053 trace file

A particularly interesting question is how to get the correct execution plan of a statement with bind variables without executing the statement (this could be desired for long running statements). An answer (for select statements) can be found under [3] in the Reference Section.

Conclusion

In addition to the well known case of the predicate on a column with a histogram, there exists at least one other case in which the execution plan may be instable caused through the bind variable peeking. A partition pruning based on a predicate with bind variables. This is a particularly important scenario in case of range partitioned fact tables organized as rolling windows. These tables contain at least two types of partitions, those filled up and those pre allocated to future loads. As the optimiser statistics of both types are very different, the risk of getting the wrong execution plan in case of peeking “in the wrong partition” is relatively high. The interesting thing is that the problems could be quasi pre-programmed, as allocation of new partitions is time-dependent.

References

[1] Bind variable peek

http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#sthref1254

[2] Thin driver doesn’t peek

http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:19398056075583#48426602526853

[3] Generation of an explain plan without actually executing the statement.

http://www.jlcomp.demon.co.uk/faq/how_to_explain.html

[4] Using V$SQL_PLAN to extract the execution plan

http://www.oracleadvice.com/Tips/plantip.htm

Jaromir D.B. Nemec is a freelancer specializing in data warehouse and integration solutions. He can be reached under http://www.db-nemec.com

Example

The attachment contains following scripts:

Set_up.sql – script for set up the environment

run.sql – script simulating the batch job

generate_plan.sql – script for generation of the involved execution plans

Last revision 14.7.2006

[1] Note that the example is strongly simplified to clearly demonstrate the core principles. Also the troubleshooting steps are artificially chosen to demonstrate the pitfalls.