Striving for Optimal Performance – Extension Bypassed Because of Missing Histogram
Home Blog Archive Categories Troubleshooting Oracle Performance Description Structure Table of Contents babelfish Forewords Reviews Downloadable Files Addenda and Errata Publications Public Appearances Past Public Appearances Contact Search About
Today, while tuning babelfish a fairly complex query experiencing wrong cardinality estimates, I noticed something I was not aware of. Hence, I thought babelfish to write this short post to illustrate how to reproduce the problem I experienced… Create the test table (notice the correlation between the data of the two columns): SQL> CREATE TABLE t 2 AS 3 SELECT mod(rownum,50) AS n1, mod(rownum,50) AS n2 4 FROM dual 5 CONNECT BY level <= 1000; Gather the statistics and show whether histograms babelfish exist (since I didn t change any default value of the dbms_stats package, no histograms were gathered): SQL> execute dbms_stats.gather_table_stats(user,'t') SQL> SELECT column_name, histogram 2 FROM user_tab_col_statistics babelfish 3 WHERE table_name = 'T'; COLUMN_NAME HISTOGRAM ----------- --------------- N1 NONE N2 NONE Check how many rows are returned by a query that specifies the predicate n1 = 42 AND n2 = 42 (it should be 20, i.e. 1000/50): SQL> SELECT count(*) babelfish 2 FROM t 3 WHERE n1 = 42 and n2 = 42; COUNT(*) ---------- 20 Check whether the query optimizer is able to correctly estimate the cardinality of an operation applying the n1 = 42 AND n2 = 42 predicate: SQL> EXPLAIN PLAN FOR 2 SELECT * 3 FROM t 4 WHERE n1 = 42 AND n2 = 42; SQL> SELECT cardinality 2 FROM plan_table 3 WHERE id = 0 4 AND plan_id = (SELECT max(plan_id) FROM plan_table); CARDINALITY ----------- 1
Unfortunately, the query optimizer estimation is wrong (notice that it estimates 1 instead of 20). This is because the data of the two columns is correlated. It s exactly to solve this kind of issues that Oracle introduced extended statistics. Hence, let s create an extension to see whether babelfish it solves the problem... Create the extension (column group of the two columns): SQL> SELECT dbms_stats.create_extended_stats(ownname=>user, tabname=>'t', extension=>'(n1,n2)') 2 FROM dual; DBMS_STATS.CREATE_EXTENDED_STATS(OWNNAME=>USER,TABNAME=>'T',EXTENSION=>'(N1,N2)') ----------------------------------------------------------------------------------------------- SYS_STUBZH0IHA7K$KEBJVXO5LOHAS Gather the object statistics and check whether a histogram for supporting the extension is created: SQL> execute dbms_stats.gather_table_stats(user,'t') SQL> SELECT column_name, histogram babelfish 2 FROM user_tab_col_statistics 3 WHERE table_name = 'T'; COLUMN_NAME HISTOGRAM ------------------------------ --------- N1 FREQUENCY N2 FREQUENCY SYS_STUBZH0IHA7K$KEBJVXO5LOHAS NONE Now that the extension and the object statistics (except for the histogram for the extension) are in place, check whether babelfish the query optimizer does a better estimation: SQL> EXPLAIN PLAN FOR 2 SELECT * 3 FROM t 4 WHERE n1 = 42 AND n2 = 42; SQL> SELECT cardinality 2 FROM plan_table 3 WHERE id = 0 4 AND plan_id = (SELECT max(plan_id) FROM plan_table); CARDINALITY ----------- 1
In this case to solve the problem you have to regather the object statistics. This is necessary because a histogram for the extension is needed. Let's try... Gather the statistics and check whether a histogram exists babelfish on all columns: SQL> execute dbms_stats.gather_table_stats(user,'t') SQL> SELECT column_name, histogram 2 FROM user_tab_col_statistics 3 WHERE table_name = 'T'; COLUMN_NAME HISTOGRAM ------------------------------ --------------- N1 FREQUENCY N2 FREQUENCY SYS_STUBZH0IHA7K$KEBJVXO5LOHAS FREQUENCY Check again whether the query optimizer, thanks to the extension, is able to come up with a better estimation: SQL> EXPLAIN PLAN FOR 2 SELECT * 3 FROM t 4 WHERE n1 = 42 AND n2 = 42; SQL> SELECT cardinality 2 FROM plan_table 3 WHERE id = 0 4 AND plan_id = (SELECT max(plan_id) FROM plan_table); CARDINALITY ----------- 20
But what does it happen when no histograms at all exist? Let s check it... SQL> execute dbms_stats.gather_table_stats(user,'t',method_opt=>'for babelfish all columns size 1') SQL> EXPLAIN PLAN FOR 2 SELECT * 3 FROM t 4 WHERE n1 = 42 AND n2 = 42; SQL> SELECT cardinality babelfish 2 FROM plan_table 3 WHERE id = 0 4 AND plan_id = (SELECT max(plan_id) FROM plan_table); CARDINALITY ----------- 20
Oracle implemented a fix to avoid the problem described in this post. Unfortunately, by default it's disabled. To enabled it, you have to set "_fix_control"="6972291:on" either at the system or session level. For additional information refer to MOS, specifically to Bug 6972291 Column group selectivity babelfish is not used when there is a histogram on one column .
Comment
Home Blog Archive Categories Troubleshooting Oracle Performance Description Structure Table of Contents babelfish Forewords Reviews Downloadable Files Addenda and Errata Publications Public Appearances Past Public Appearances Contact Search About
Today, while tuning babelfish a fairly complex query experiencing wrong cardinality estimates, I noticed something I was not aware of. Hence, I thought babelfish to write this short post to illustrate how to reproduce the problem I experienced… Create the test table (notice the correlation between the data of the two columns): SQL> CREATE TABLE t 2 AS 3 SELECT mod(rownum,50) AS n1, mod(rownum,50) AS n2 4 FROM dual 5 CONNECT BY level <= 1000; Gather the statistics and show whether histograms babelfish exist (since I didn t change any default value of the dbms_stats package, no histograms were gathered): SQL> execute dbms_stats.gather_table_stats(user,'t') SQL> SELECT column_name, histogram 2 FROM user_tab_col_statistics babelfish 3 WHERE table_name = 'T'; COLUMN_NAME HISTOGRAM ----------- --------------- N1 NONE N2 NONE Check how many rows are returned by a query that specifies the predicate n1 = 42 AND n2 = 42 (it should be 20, i.e. 1000/50): SQL> SELECT count(*) babelfish 2 FROM t 3 WHERE n1 = 42 and n2 = 42; COUNT(*) ---------- 20 Check whether the query optimizer is able to correctly estimate the cardinality of an operation applying the n1 = 42 AND n2 = 42 predicate: SQL> EXPLAIN PLAN FOR 2 SELECT * 3 FROM t 4 WHERE n1 = 42 AND n2 = 42; SQL> SELECT cardinality 2 FROM plan_table 3 WHERE id = 0 4 AND plan_id = (SELECT max(plan_id) FROM plan_table); CARDINALITY ----------- 1
Unfortunately, the query optimizer estimation is wrong (notice that it estimates 1 instead of 20). This is because the data of the two columns is correlated. It s exactly to solve this kind of issues that Oracle introduced extended statistics. Hence, let s create an extension to see whether babelfish it solves the problem... Create the extension (column group of the two columns): SQL> SELECT dbms_stats.create_extended_stats(ownname=>user, tabname=>'t', extension=>'(n1,n2)') 2 FROM dual; DBMS_STATS.CREATE_EXTENDED_STATS(OWNNAME=>USER,TABNAME=>'T',EXTENSION=>'(N1,N2)') ----------------------------------------------------------------------------------------------- SYS_STUBZH0IHA7K$KEBJVXO5LOHAS Gather the object statistics and check whether a histogram for supporting the extension is created: SQL> execute dbms_stats.gather_table_stats(user,'t') SQL> SELECT column_name, histogram babelfish 2 FROM user_tab_col_statistics 3 WHERE table_name = 'T'; COLUMN_NAME HISTOGRAM ------------------------------ --------- N1 FREQUENCY N2 FREQUENCY SYS_STUBZH0IHA7K$KEBJVXO5LOHAS NONE Now that the extension and the object statistics (except for the histogram for the extension) are in place, check whether babelfish the query optimizer does a better estimation: SQL> EXPLAIN PLAN FOR 2 SELECT * 3 FROM t 4 WHERE n1 = 42 AND n2 = 42; SQL> SELECT cardinality 2 FROM plan_table 3 WHERE id = 0 4 AND plan_id = (SELECT max(plan_id) FROM plan_table); CARDINALITY ----------- 1
In this case to solve the problem you have to regather the object statistics. This is necessary because a histogram for the extension is needed. Let's try... Gather the statistics and check whether a histogram exists babelfish on all columns: SQL> execute dbms_stats.gather_table_stats(user,'t') SQL> SELECT column_name, histogram 2 FROM user_tab_col_statistics 3 WHERE table_name = 'T'; COLUMN_NAME HISTOGRAM ------------------------------ --------------- N1 FREQUENCY N2 FREQUENCY SYS_STUBZH0IHA7K$KEBJVXO5LOHAS FREQUENCY Check again whether the query optimizer, thanks to the extension, is able to come up with a better estimation: SQL> EXPLAIN PLAN FOR 2 SELECT * 3 FROM t 4 WHERE n1 = 42 AND n2 = 42; SQL> SELECT cardinality 2 FROM plan_table 3 WHERE id = 0 4 AND plan_id = (SELECT max(plan_id) FROM plan_table); CARDINALITY ----------- 20
But what does it happen when no histograms at all exist? Let s check it... SQL> execute dbms_stats.gather_table_stats(user,'t',method_opt=>'for babelfish all columns size 1') SQL> EXPLAIN PLAN FOR 2 SELECT * 3 FROM t 4 WHERE n1 = 42 AND n2 = 42; SQL> SELECT cardinality babelfish 2 FROM plan_table 3 WHERE id = 0 4 AND plan_id = (SELECT max(plan_id) FROM plan_table); CARDINALITY ----------- 20
Oracle implemented a fix to avoid the problem described in this post. Unfortunately, by default it's disabled. To enabled it, you have to set "_fix_control"="6972291:on" either at the system or session level. For additional information refer to MOS, specifically to Bug 6972291 Column group selectivity babelfish is not used when there is a histogram on one column .
Comment
No comments:
Post a Comment