Friday, July 19, 2024

Oracle Database 23ai – Real-Time SQL Plan Management

One of the most frequent questions I get asked is “why did the Oracle Optimizer change the execution plan on a SQL statement?”  Generally, this arises when a statement goes from running well to suddenly performing poorly.  In short, something has changed in a way that the Optimizer thinks the data should be accessed.  Which then leads to the question, why doesn’t it know that the new performance is worse?

Well not to solve all problems but Oracle database 23ai introduces a new feature called Real-Time SQL Plan Management.  This feature builds on-top of existing features Automatic SQL Tuning Sets (Auto STS) and the current SQL Plan Management (SPM).  To summarize, the Oracle database uses these features in together in the following manor:

  • Auto STS captures execution plans of “most” SQL statements and stores then in a system managed SQL tuning set called “SYS_AUTO_STS”
  • At SQL execution time, the Optimizer decides if a new plan is warranted for a SQL statement based on previous execution information and current estimated execution plan.  The Optimizer will then execute the new adjusted plan and compare it to previous runs.
  • If the new plan is better a SQL Plan Baseline will be automatically created for that new plan.  If during future executions of the SQL the performance regresses again the Optimizer will re-analyze and possible disable the baseline or create another baseline as needed.

All that work you may have been doing to identify a poor running SQL, see how to run it better, and then putting in a “fix” such as a baseline is now 100% automated.

Often, we shy away from new features for fear that they will cause more harm than benefit.  Personally, I have adopted many new features from Oracle over the past 30+ years.  My goal here is to shed some light on how this new feature works to help reduce the fear, uncertainty, and doubt that might arise. 

Let’s look at a quick example, note this is running in Oracle Cloud Infrastructure (OCI) with Oracle Database Enterprise Edition 23.4.1.24.06 as we patiently await the on-premises release.  I’m skipping over some of the mechanics to focus more on the results of real-time SPM that you might see and want to understand what occurred.

First for this to work we need to enable the Real-Time SPM (RT SPM) task:

exec dbms_spm.configure('AUTO_SPM_EVOLVE_TASK', 'AUTO')

Note in 19c the EVOLVE task was either on or off and did not do real time evaluation.  AUTO is a new setting for 23ai.

You can check your current settings with the following query:

SELECT PARAMETER_NAME, PARAMETER_VALUE     FROM DBA_SQL_MANAGEMENT_CONFIG;

PARAMETER_NAME                           PARAMETER_VALUE
---------------------------------------- ------------------------------
AUTO_SPM_EVOLVE_TASK                     AUTO

Once the Optimizer detects that a better execution plan exists, it may decide to try it.  If it does you will see a note in the explain plan:

Note -----    - This is SQL Plan Management Test Plan

You will see this when the Optimizer runs the SQL with the possibly better execution plan.  Note, you may find it hard to capture this moment in a real system.  

If the new execution plan performs better, a SQL Plan Baseline (or baseline) will be created for the new execution plan as well as the previous execution plan:

SQL_HANDLE           PLAN_NAME                      ORIGIN             ENA ACC   COST  -------------------- ------------------------------ ------------------ --- --- ------ SQL_2044b318726d1c30 SQL_PLAN_20j5m31t6u71h169efe64 FOREGROUND-CAPTURE YES YES 151184 SQL_2044b318726d1c30 SQL_PLAN_20j5m31t6u71h8059011b FOREGROUND-CAPTURE YES NO  271739

Note the source of the baseline “FOREGROUND-CAPTURE” this means that the Optimizer made this baseline during a SQL execution after analysis.  We can see a little bit more information from the new NOTES column of the DBA_SQL_PLAN_BASELINES view:

SELECT p.plan_name, p.foreground_last_verified,
       pfspm.status result, pfspm.ver verify_type,
       pfspm.test_phv, pftp.elps, pfspm.ref_phv, pfrp.elps
  FROM dba_sql_plan_baselines p,
    XMLTABLE(
         '/notes'
         passing xmltype(p.notes)
         columns
             plan_id         NUMBER    path 'plan_id',
             flags           NUMBER    path 'flags',
             fg_spm          XMLType   path 'fg_spm') pf,
     XMLTABLE(
         '/fg_spm'
         passing pf.fg_spm
         columns
             test_phv        NUMBER         path 'test_phv',
             ref_phv         NUMBER         path 'ref_phv',
             ver             VARCHAR2(8)    path 'ver',
             status          VARCHAR2(8)    path 'status',
             ref_perf        XMLType        path 'ref_perf',
             test_perf       XMLType        path 'test_perf' ) pfspm,
    XMLTABLE(
        '/ref_perf'
        passing pfspm.ref_perf
        columns
             bg              NUMBER         path 'bg',
             cpu             NUMBER         path 'cpu',
             elps            NUMBER         path 'elps' ) pfrp,
    XMLTABLE(
        '/test_perf'
        passing pfspm.test_perf
        columns
             bg              NUMBER         path 'bg',
             cpu             NUMBER         path 'cpu',
             elps            NUMBER         path 'elps' ) pftp
  WHERE notes IS NOT NULL
    AND p.sql_handle = 'SQL_2044b318726d1c30'
  ORDER BY p.foreground_last_verified;

PLAN_NAME                      FOREGROUND_LAST_VERIFIED     RESULT  VERIFY_T  TEST_PHV ELPS    REF_PHV ELPS ------------------------------ ---------------------------- ------- -------- --------- ---- ---------- ---- SQL_PLAN_20j5m31t6u71h169efe64 19-JUL-24 01.27.50.000000 PM better  normal   379518564 2775 2153316635 3674

This shows us that the elapsed time was better for the tested execution plan compared to the reference (existing) plan.  There is more data in the NOTES column, but this is a helpful way to see what the Optimizer did to verify the performance.  In this case the elapsed time was lower for the newly tested plan.

You may wonder, what happens if things change again?   What if this new plan is not the best plan in the future?  Well, the Real Time SPM will look for that also, when the SQL would degrade due to Optimizer execution plan generation estimation, it will not get used.  Again, in the notes section of an explain plan you may see this:

Note

-----

   - Failed to use SQL plan baseline for this statement

To view the execution plan for a given baseline, use the DBMS_XPLAN package:

SELECT *
    FROM table(DBMS_XPLAN.DISPLAY_SQL_PLAN_BASELINE(
             sql_handle=>'SQL_2044b318726d1c30 ',
             plan_name=>'SQL_PLAN_20j5m31t6u71h169efe64',
             format=>'typical'));

-------------------------------------------------------------------------------- SQL handle: SQL_2044b318726d1c30 SQL text: select /* SPM_TEST_QUERY_Q1 */ sum(t2.amount) from   sales_area1 t1,                 sales_area2 t2 where  t1.sale_code = t2.sale_code and           t1.sale_type  = 1 -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- Plan name: SQL_PLAN_20j5m31t6u71h169efe64         Plan id: 2727377809 Enabled: YES     Fixed: NO      Accepted: YES     Origin: FOREGROUND-CAPTURE Plan rows: From Auto SQL Tuning Set -------------------------------------------------------------------------------- Plan hash value: 2727377809 ----------------------------------------------------------------------------------------------------- | Id  | Operation                             | Name        | Rows  | Bytes | Cost (%CPU)| Time     | ----------------------------------------------------------------------------------------------------- |   0 | SELECT STATEMENT                      |             |       |       |   151K(100)|          | |   1 |  SORT AGGREGATE                       |             |     1 |    33 |            |          | |*  2 |   HASH JOIN                           |             |    62M|  1966M|   151K  (1)| 00:00:06 | |   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| SALES_AREA1 | 25000 |   170K| 15238   (1)| 00:00:01 | |*  4 |     INDEX RANGE SCAN                  | SALES_TYP1I | 25000 |       |   235   (0)| 00:00:01 | |   5 |    TABLE ACCESS FULL                  | SALES_AREA2 |  2500K|    61M|   135K  (1)| 00:00:06 | -----------------------------------------------------------------------------------------------------

Note that there is additional information about the baseline provided by the explain plan and that the plan itself comes from the auto SQL Tuning Set which is where the baseline was created from.

I’m really excited for this new feature and how it can help proactively with SQL performance issues.  Oracle will now detect if the SQL performance has degraded from a previously capture execution plan and then test a possible fix and implement a baseline if the fix is acceptable.  Seems like the answer to “why did the Optimizer change the execution plan?” is, just wait a moment and it will fix itself.

Be sure to check out the Oracle blog by Nigel Bayliss which includes a link to example code he created that was shown in this blog.