Happy New Peek
By Jaromir D.B. Nemec
It is 6 A.M. of the New Years day, some DBA recapitulate the resolutions
while others are already sleeping. In neither of those situations an alerting call
is welcome. Though, exactly that happens. “Our nightly Oracle job is hanging,
but nothing was changed!”, is the plain message. “Nothing was changed, nothing was changed”, flashes through the
DBA’s head before going to resolve the problem, “why do we ever need a
sophisticated database if our business never changes!”.
Let’s demonstrate the troubleshooting step by step[1].
If you want to execute the example in your own environment, skip to the Example section at the end where you find the scripts to
perform the setup.
The first step is to understand the logic of the running batch. The
algorithm is rather simple; the status of some entities (selected based on
defined business rules) in the fact table should be updated.
The tricky part is that the fact table is versioned, i.e. each entity
consists of a number of versions. The business policy says that only the status
of the last version should be updated.
As our business is rather dynamic, each entity contains up to a thousand
of versions. In order to handle such a large number of records, the table is
partitioned by a business year, defined as the year of the transaction date.
The next step in the troubleshooting is to get the SQL statement that is
currently running in the session of the hanging batch. This is not a big task,
accessing the views v$session and v$sqltext you get the "hanging"
statement. See Figure 1.
UPDATE MY_FACT A
SET STATUS = 'O'
WHERE ENTITY_ID
= :B2 AND BUSINESS_YEAR = :B1 AND
VERSION =
(SELECT MAX(VERSION) FROM MY_FACT B
WHERE A.ENTITY_ID = B.ENTITY_ID AND
A.BUSINESS_YEAR = B.BUSINESS_YEAR)
;
Figure 1 The “hanging” Statement
The statement is a relatively simple update corresponding well to the job
description above. The entity and the business year are fixed with bind
variables (the fact table is partitioned by a business_year – see the table DDL
in the Example section). The last version is determined by using a subquery. Apparently the
update statement we see is performed in a loop over all selected business
entities.
The next thing you should do is to check the execution plan of this
statement. With the explain plan statement and dbms_xplan utility this task is
quickly done. See Figure 2.
EXPLAIN PLAN set statement_id = 'N3' into plan_table
FOR
UPDATE MY_FACT A
SET STATUS = 'O'
WHERE ENTITY_ID
= :B2 AND BUSINESS_YEAR = :B1 AND
VERSION =
(SELECT MAX(VERSION) FROM MY_FACT B
WHERE A.ENTITY_ID = B.ENTITY_ID AND
A.BUSINESS_YEAR = B.BUSINESS_YEAR)
;
---
SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'N3','ALL')) ;
--
-----------------------------------------------------------------------------------
| Id | Operation | Name |Rows |Cost|Pstart| Pstop |
-----------------------------------------------------------------------------------
| 0| UPDATE
STATEMENT | | 1| 16| |
|
| 1| UPDATE | MY_FACT | |
| | |
|* 2| TABLE ACCESS BY LOC.IND.ROWID | MY_FACT | 1| 3|
| |
| 3| NESTED LOOPS |
| 1| 16| | |
| 4| VIEW | VW_SQ_1 | 2| 10|
| |
| 5| SORT GROUP BY |
| 2| 10| | |
| 6| PARTITION RANGE SINGLE | | 604| 10|
KEY | KEY |
|* 7| TABLE ACCESS BY LOC.IND.ROWID|
MY_FACT | 604| 10| KEY |
KEY |
|* 8| INDEX RANGE SCAN | MY_FACT_IX1| 1208| 6|
KEY | KEY |
| 9| PARTITION RANGE SINGLE | | 1| 2|
KEY | KEY |
|* 10|
INDEX RANGE SCAN |
MY_FACT_IX1| 1| 2|
KEY | KEY |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 -
filter("BUSINESS_YEAR"=:B1)
7 -
filter("B"."BUSINESS_YEAR"=:B1)
8 -
access("B"."ENTITY_ID"=TO_NUMBER(:B2))
10 -
access("ENTITY_ID"=TO_NUMBER(:B2) AND
"VERSION"="VW_COL_1")
Figure 2 The Execution Plan (output modified to optimise presentation)
What is wrong with this execution plan? First an index range scan
on one index partition of our fact table is performed to access all versions of
the processed entity (see access and filter predicates in lines 8 and 7 of the
execution plan). Then using sort group by operation the maximal version
is found. The nested loops operation (line 3) is performed exactly once.
The reason is that the version number is unique within each entity. Therefore
there exists only one record with the maximal version.
Where should be a problem to access thousand of records via index, sort
them and get the record with the maximal version? This execution plan should
not cause a hanging job, even if applied in a loop to process a number of
entities.
Well, the explain plan apparently doesn’t provide the clue, let's try to
effectively execute the statement and observe the execution plan with the
autotrace option. Of course, it is not possible to execute the statement
without initialising the bind variables. We simply choose some random entity_id
and use the current business year. We will see the importance of the choice of
the bind variables later on.
SQL> set autotrace
traceonly;
SQL> --
SQL> variable
ent_id number;
SQL> variable
bus_year varchar2(10);
SQL> --
SQL> execute
:ent_id := 4; :bus_year := '2006-01-01';
PL/SQL procedure
successfully completed.
SQL> --
SQL> UPDATE
MY_FACT A
2
SET STATUS = 'O' WHERE ENTITY_ID = :ent_id AND BUSINESS_YEAR =
TO_DATE(:bus_year,'YYYY-MM-DD') AND
3
VERSION = (SELECT MAX(VERSION) FROM MY_FACT B
4 WHERE A.ENTITY_ID = B.ENTITY_ID AND A.BUSINESS_YEAR
= B.BUSINESS_YEAR)
5 ;
1 row updated.
Execution Plan
----------------------------------------------------------
0
UPDATE STATEMENT Optimizer=ALL_ROWS (Cost=16 Card=1 Bytes=52
)
1
0 UPDATE OF 'MY_FACT'
2
1 TABLE ACCESS (BY LOCAL
INDEX ROWID) OF 'MY_FACT' (TABLE)
(Cost=3 Card=1 Bytes=17)
3
2 NESTED LOOPS (Cost=16
Card=1 Bytes=52)
4
3 VIEW OF 'VW_SQ_1'
(VIEW) (Cost=10 Card=2 Bytes=70)
5
4 SORT (GROUP BY)
(Cost=10 Card=2 Bytes=32)
6
5 PARTITION RANGE
(SINGLE) (Cost=10 Card=604 Bytes
=9664)
7
6 TABLE ACCESS (BY LOCAL INDEX ROWID) OF
'MY_FAC
T' (TABLE) (Cost=10 Card=604
Bytes=9664)
8
7 INDEX (RANGE
SCAN) OF 'MY_FACT_IX1' (INDEX)
(Cost=6 Card=1208)
9
3 PARTITION RANGE
(SINGLE) (Cost=2 Card=1)
10
9 INDEX (RANGE SCAN) OF
'MY_FACT_IX1' (INDEX) (Cost=
2 Card=1)
Statistics
----------------------------------------------------------
220
recursive calls
4
db block gets
125
consistent gets
0
physical reads
64
redo size
637
bytes sent via SQL*Net to client
830
bytes received via SQL*Net from client
3
SQL*Net roundtrips to/from client
11
sorts (memory)
0
sorts (disk)
1
rows processed
SQL> ----
SQL> quit;
Figure 3 The Autotrace Run
Surprisingly, what we see is exactly the same execution plan like the
one before. So again, even autotrace provides no hint to the root of the
problem. Luckily, the next thought directs the troubleshooting in the right
direction. You realise that the database version is 10g and therefore the
system view v$sql_plan (introduced in 9i) shows the execution plan of the currently
running statement. For further details how to extract the execution plan
directly from the view v$sql_plan See [4].
----------------------------------------------------------------------
| Id | Operation | Name |Rows | Cost|Pstart|Pstop |
----------------------------------------------------------------------
| 0| UPDATE STATEMENT | | | 15|
| |
| 1|
UPDATE |
MY_FACT| | | | |
|* 2|
HASH JOIN |
| 1| 15| | |
| 3|
PARTITION RANGE SINGLE | |
998| 7| KEY |
KEY |
|* 4|
TABLE ACCESS FULL |
MY_FACT| 998| 7| KEY | KEY |
| 5|
VIEW |
VW_SQ_1| 998| 7| | |
| 6|
SORT GROUP BY | |
998| 7| |
|
| 7|
PARTITION RANGE SINGLE|
| 998| 7| KEY | KEY |
|* 8|
TABLE ACCESS FULL |
MY_FACT| 998| 7| KEY | KEY |
----------------------------------------------------------------------
Predicate Information
(identified by operation id):
---------------------------------------------------
2 -
access("VERSION"="VW_COL_1" AND
"A"."ENTITY_ID"="ENTITY_ID" AND
"A"."BUSINESS_YEAR"="BUSINESS_YEAR")
4 - filter(("ENTITY_ID"=:B2 AND
"BUSINESS_YEAR"=:B1))
8 -
filter(("B"."ENTITY_ID"=:B2 AND
"B"."BUSINESS_YEAR"=:B1))
Figure 4 The Real Execution Plan
Apparently, the real execution plan is completely different from both previous
plans generated by explain plan and autotrace. What is the cause of this
difference? The bind variables contained in the SQL statement. In Oracle 9i a
feature called "bind variable peek" was introduced. For a full explanation of this feature see [1]
in the Reference Section. Here we give only a brief overview.
In some situations, when a statement is parsed Oracle peeks (here is the
origin of the name of the feature) on the actual value of the bind variables
and uses this information while optimising the execution plan. This is certainly an important information
that could significantly influence the decision made by the optimiser.
Bind variable peek is not always possible, a typical example of a
situation where a peek is not performed is the parsing of a statement within
the explain plan statement. The syntax of the explain plan statement allows to
process a query containing bind variables without actually defining these
variables, so there is no chance to perform the peek. Neither value nor data type
of those bind variables is known when the statement is parsed. The optimiser
must use the default rule to calculate the selectivity of the predicates with
bind variables. The result could be very different from the "real"
execution plan based on the peeked values and known data types of bind
variables. This is the reason why the execution plan generated by explain plan
was so different from the actual one.
What was wrong with autotrace? A
run with autotrace actually executes the statement, so the bind variables in
this case must be well defined. The presented execution plan is not the “real”
executed plan. After the execution an
ordinary explain plan of the executed statement is performed and the result
plan is sent to the output. There is no peeked information in the execution
plan.
We see there is no difference between the explain plan and autotrace
option; both can produce misleading execution plans if bind variables are used.
The plan switch dependent on the actual value peeked is not an unknown
issue. It is well described in case of a histogram defined of the predicate
column. Yes of course, if a particular column contains skew data, there is
probably more than one method how to access the table based on a predicate on
this column (e.g. an index range scan and a table full scan). Depending on the
actual value of the bind variable used first time when the statement is parsed,
one of those access methods will be chosen. This decision will be used in all
other executions of our statement; executions that potentially contain
different values of the bind variables making the former optimise plan rather
sub optimal. Once upon a time, the cursor will be aged out from the shared pool
and the game starts from the beginning.
Again, the first parse wins and defines the execution plan for all its
successors.
Less known is the case in which through the peek the predicate relevant
to partition pruning is enriched. Such a predicate (e.g. partition_key_column
= :bind_variable) leads without the
peek to the runtime pruning marked as KEY-KEY in the execution plan, resulting
in using the optimiser statistics from the table level. If a peek is performed,
the situation is completely different. Using the peeked value, the exact
partition is known (marked as N-N in the columns Pstart and Pstop of the
execution plan, where N is the partition number). In the letter case the
optimiser statistics from the partition level are used. Depending on the actual
peeked value, the optimiser statistics of the proper partition are used.
This is the last piece of puzzle needed to solve the problem.
The whole story is demonstrated using an example. This example was
tested in Oracle 9i and 10g (versions 9.2.0.5.0 and 10.2.0.1.0).
To create the environment use the script set_up.sql.
The batch behaviour is simulated in the script run.sql. Figure 5 shows
the difference between a good and a bad run.
SQL> @run
SQL> -- run bind
variable peek example
SQL> -- J.Nemec
2006
SQL> --
http://www.db-nemec.com
SQL> -- this
script demonstrates a instability caused by bind
SQL> -- variable peeking.
SQL> -- Due to
peek the partition level statistics from different
SQL> -- partition are used leading to a dramatic
plan change
SQL> --
SQL> -- Statement
1
SQL> --- peek
decides to use statistics from the partition 2005 (good)
SQL> -- Statement
2
SQL> --- peek
decides to use statistics from the partition 2006 (bad)
SQL> --
SQL> begin
2
declare
3 i number;
4 cursor c is
5 select entity_id, max(business_year) business_year
6 from my_fact where entity_id between 1 and
1000
7 group by entity_id order by entity_id;
8
begin
9
for x in c loop
10 update my_fact a
11 set status = 'D'
12 where entity_id = x.entity_id and
business_year = x.business_year
13 and version = (select max(version) from
my_fact b
14 where a.entity_id = b.entity_id and
15 a.business_year = b.business_year);
16 dbms_output.put_line('updating ' ||
x.entity_id ||
17 ' ' || x.business_year);
18
end loop;
19
end;
20
end;
21 /
PL/SQL procedure
successfully completed.
Elapsed: 00:00:09.86
SQL> -- now define
the loop with cursor returning the date - 2006 first
SQL> -- note, that
the statement must differ from the previous one
SQL> -- to enforce
the statement to be parsed with BVP
SQL> begin
2
declare
3 i number;
4 cursor c is
5 select entity_id, max(business_year)
business_year
6 from my_fact where entity_id between 4 and
1000
7 group by entity_id order by entity_id;
8
begin
9
for x in c loop
10 update my_fact a
11 set status = 'O' --- status was changed
12 where entity_id = x.entity_id and
business_year = x.business_year
13 and version = (select max(version) from
my_fact b
14 where a.entity_id = b.entity_id and
15 a.business_year = b.business_year);
16 dbms_output.put_line('updating ' ||
x.entity_id ||
17 ' ' || x.business_year);
18
end loop;
19
end;
20
end;
21 /
PL/SQL procedure
successfully completed.
Elapsed: 00:27:24.03
Figure 5 Two completely different executions of the same statement
To update one thousand of records in the fact table takes approximately
ten seconds (see the first run). The second run, differing only in the fact
that the first update peeks the new partition, needs nearly half an hour to be
completed. This degradation factor is the reason why the job seems to be
hanging.
To get an overview of the engaged execution plans use the script
generate_plan.sql.
To make the story even worse, to check what value would be used in a
bind variable peek isn’t the only important thing. Even more important is the
information if the peek will be performed at all. In other words, not all
database clients perform the bind variable peek. For example while using the
dbms_sql to parse a statement no bind variable peek is performed (as a
feature). Other clients can omit the
peek as a cause of bugs – see [2].
How can we see if a bind variable peek was performed? The best way is to
use a 10053 trace while parsing the statement. The Figure 6 shows the relevant
part of the trace file which contains the evidence that a peek was done.
*******************************************
Peeked values of the
binds in SQL statement
*******************************************
kkscoacd
Bind#0
oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00
pre=00
oacflg=13 fl2=206001 frm=00 csi=00 siz=24
off=0
kxsbbbfp=07ac1c54 bln=22 avl=02 flg=09
value=1
Bind#1
oacdty=12 mxl=07(07) mxlc=00 mal=00 scl=00
pre=00
oacflg=13 fl2=206001 frm=00 csi=00 siz=8
off=0
kxsbbbfp=07ac455c bln=07 avl=07 flg=09
value="1/1/2005 0:0:0"
Figure 6 – bind variable peek in a 10053 trace file
A particularly interesting question is how to get the correct execution plan
of a statement with bind variables without executing the statement (this could
be desired for long running statements). An answer (for select statements) can
be found under [3] in the Reference Section.
In addition to the well known case of the predicate on a column with a
histogram, there exists at least one other case in which the execution plan may
be instable caused through the bind variable peeking. A partition pruning based
on a predicate with bind variables. This is a particularly important scenario
in case of range partitioned fact tables organized as rolling windows. These
tables contain at least two types of partitions, those filled up and those pre
allocated to future loads. As the
optimiser statistics of both types are very different, the risk of getting the
wrong execution plan in case of peeking “in the wrong partition” is relatively high. The interesting thing is
that the problems could be quasi pre-programmed, as allocation of new partitions
is time-dependent.
References
[1] Bind variable peek
http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#sthref1254
[2] Thin driver doesn’t peek
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:19398056075583#48426602526853
[3] Generation of an explain plan without actually executing the
statement.
http://www.jlcomp.demon.co.uk/faq/how_to_explain.html
[4] Using V$SQL_PLAN to extract the
execution plan
http://www.oracleadvice.com/Tips/plantip.htm
Jaromir D.B. Nemec is a freelancer specializing in data warehouse and
integration solutions. He can be reached under http://www.db-nemec.com
The attachment contains following scripts:
Set_up.sql – script for set up the environment
run.sql – script simulating the batch job
generate_plan.sql – script for generation of the involved execution plans
[1] Note that the example is strongly
simplified to clearly demonstrate the core principles. Also the troubleshooting
steps are artificially chosen to demonstrate the pitfalls.