Wednesday, May 28, 2025

Oracle database predicate mistakes - 23ai SQL Analysis Report

Working with the Oracle database Optimizer can often feel a bit mesmerizing. Why is it ignoring an index when you think it should be used? Historically, answering that took a bit of wizardry, or at least in-depth knowledge. Fortunately, version 23ai adds a new SQL Analysis Report that can help provide you with the answers.

This new report is part of DBMS_XPLAN, and it also appears in the SQL Monitor report if you are using OEM or DBMS_SQL_MONITOR. Here is a quick example: we are going to run a SQL statement and use DBMS_XPLAN to get the execution plan. You will want to make sure that the alias and predicate information sections are shown in addition to the SQL Analysis Report.

var bind1 NUMBER
exec :bind1 := 1

SELECT t1.p_category, t2.tpmethod
  FROM products t1, products t2
  WHERE t1.prod_category || '_1' = 'SFD_1'
    AND t2.method_typ != 'SEA'
UNION
SELECT t3.p_category, t4.tpmethod
  FROM products t3, sources t4
  WHERE t3.scid = t4.scid
    AND t4.carrier   = 'AAC'
    AND t4.s_area    = :bind1;

set linesize 200
set tab off
set trims on
set pagesize 1000
column plan_table_output format a180

SELECT *
  FROM table(DBMS_XPLAN.DISPLAY_CURSOR(format=>'typical +alias +sql_analysis_report'));

In the explain plan output there will be a new section for the SQL Analysis Report.

-------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |              |       |       |    24 (100)|          |
|   1 |  HASH UNIQUE                           |              | 10482 | 83866 |    24   (5)| 00:00:01 |
|   2 |   UNION-ALL                            |              | 10482 | 83866 |    24   (5)| 00:00:01 |
|   3 |    MERGE JOIN CARTESIAN                |              | 10481 | 83848 |    18   (0)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL                  | PRODUCTS     |    10 |    40 |     3   (0)| 00:00:01 |
|   5 |     BUFFER SORT                        |              |  1024 |  4096 |    15   (0)| 00:00:01 |
|*  6 |      TABLE ACCESS FULL                 | PRODUCTS     |  1024 |  4096 |     2   (0)| 00:00:01 |
|*  7 |    HASH JOIN                           |              |     1 |    18 |     5   (0)| 00:00:01 |
|*  8 |     TABLE ACCESS BY INDEX ROWID BATCHED| SOURCES      |     1 |    14 |     2   (0)| 00:00:01 |
|*  9 |      INDEX RANGE SCAN                  | SRC_CARR_IDX |     1 |       |     1   (0)| 00:00:01 |
|  10 |     TABLE ACCESS FULL                  | PRODUCTS     |  1024 |  4096 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SET$1
   3 - SEL$1
   4 - SEL$1 / "T1"@"SEL$1"
   6 - SEL$1 / "T2"@"SEL$1"
   7 - SEL$2
   8 - SEL$2 / "T4"@"SEL$2"
   9 - SEL$2 / "T4"@"SEL$2"
  10 - SEL$2 / "T3"@"SEL$2"

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("T1"."PROD_CATEGORY"||'_1'='SFD_1')
   6 - filter("T2"."METHOD_TYP"<>'SEA')
   7 - access("T3"."SCID"="T4"."SCID")
   8 - filter(TO_NUMBER("T4"."S_AREA")=:BIND1)
   9 - access("T4"."CARRIER"='AAC')

SQL Analysis Report (identified by operation id/Query Block Name/Object Alias):
-------------------------------------------------------------------------------
   1 -  SET$1
           -  The query block contains UNION which may be expensive.
              Consider using UNION ALL if duplicates are allowed or
              uniqueness is guaranteed.

   3 -  SEL$1
           -  The query block has 1 cartesian product which may be
              expensive. Consider adding join conditions or removing the
              disconnected tables or views.

   4 -  SEL$1 / "T1"@"SEL$1"
           -  The following columns have predicates which preclude their
              use as keys in index range scan. Consider rewriting the
              predicates.
                "PROD_CATEGORY"

   6 -  SEL$1 / "T2"@"SEL$1"
           -  The following columns have predicates which preclude their
              use as keys in index range scan. Consider rewriting the
              predicates.
                "METHOD_TYP"

   8 -  SEL$2 / "T4"@"SEL$2"
           -  The following columns have predicates which preclude their
              use as keys in index range scan. Consider rewriting the
              predicates.
                "S_AREA"

Here we see that there are five items that should be reviewed. The UNION may be expensive, so if it is not required, remove it. If there are no duplicate rows between the two queries, or if duplicates are allowed in the results, then change it to a UNION ALL. In this query the predicates (filters) in the two queries are wildly different, so I’m guessing UNION ALL would work fine.
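To illustrate the difference, here is a minimal sketch against DUAL: UNION must deduplicate the combined result (the HASH UNIQUE step in the plan above), while UNION ALL simply concatenates the branches.

-- UNION removes duplicates across the branches, so only one row comes back
SELECT 1 AS n FROM dual
UNION
SELECT 1 AS n FROM dual;

-- UNION ALL keeps both rows and skips the deduplication step entirely
SELECT 1 AS n FROM dual
UNION ALL
SELECT 1 AS n FROM dual;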

The second issue is the existence of a cartesian product; basically, a join is missing in the first query. We have asked for data from two copies of the PRODUCTS table but did not join them together, so each row of the first table will be paired with every row of the second. I'm not sure of a quick fix without knowing more about the query (see the generic sketch below), so let's continue to the next item.
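For illustration only, removing a cartesian product means adding a join condition between the disconnected tables. The PROD_ID column below is purely hypothetical, since the real relationship between the two PRODUCTS references is not shown in this post:

SELECT t1.p_category, t2.tpmethod
  FROM products t1, products t2
  WHERE t1.prod_id = t2.prod_id   -- hypothetical join condition removes the cartesian product
    AND t1.prod_category || '_1' = 'SFD_1'
    AND t2.method_typ != 'SEA';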

The final issue occurs multiple times, and it is the one I want to spend more time on.

“The following columns have predicates which preclude their use as keys in index range scan. Consider rewriting the predicates”

A bit of a mouthful, isn’t it? What it means is that the code in the SQL specifically prevents Oracle from using an index, so changing the code may allow an index to be used. Let’s walk through three of these examples, starting by tracing the message back to the specific line of code.

 4 -  SEL$1 / "T1"@"SEL$1"
           -  The following columns have predicates which preclude their
              use as keys in index range scan. Consider rewriting the
              predicates.
                "PROD_CATEGORY"

If we look at the predicate information we will see that line 4 of the plan has the following filter predicates:

 4 - filter("T1"."PROD_CATEGORY"||'_1'='SFD_1')

From this we can flow back to the query block name section and see that line 4 is part of SEL$1 (the first select statement), specific to the T1 table alias.

 4 - SEL$1 / "T1"@"SEL$1"

Which is a reference to this specific SQL code:

 WHERE t1.prod_category || '_1' = 'SFD_1'

So what is wrong here? Well, a concatenation operator (||) has been applied to the table column, so every record must be modified during the lookup. This modification prevents Oracle from using the index. The simple fix is to make the lookup value (variable) look like the data instead of the other way around.

 WHERE t1.prod_category  = 'SFD'

Note: I often see this issue with date lookups, where the date column is being truncated or formatted to look up specific days. Again, this immediately precludes using an index. A better way to deal with the date is to use a range predicate to get to the day you want. So instead of this:

 WHERE TO_CHAR(t1.ship_date, 'YYYY-MM-DD') = '2025-05-28'

Instead do this:

 WHERE t1.ship_date BETWEEN TO_DATE('2025-05-28 00:00:00', 'YYYY-MM-DD HH24:MI:SS')
                        AND TO_DATE('2025-05-28 23:59:59', 'YYYY-MM-DD HH24:MI:SS')

See how we have kept the table column unconverted, allowing Oracle to use indexes once again. Also note that the predicate values are now in the source data type (DATE).
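A variation I often use is a half-open range with ANSI date literals; it avoids spelling out 23:59:59 while still leaving the column untouched. A sketch, equivalent to the BETWEEN above for whole days:

 WHERE t1.ship_date >= DATE '2025-05-28'
   AND t1.ship_date <  DATE '2025-05-29'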

To continue with our original statement, now let’s analyze again with the change.

SELECT t1.p_category, t2.tpmethod
  FROM products t1, products t2
  WHERE t1.prod_category  = 'SFD'
    AND t2.method_typ != 'SEA'
  UNION ALL
SELECT t3.p_category, t4.tpmethod
  FROM products t3, sources t4
  WHERE t3.scid = t4.scid
    AND t4.carrier   = 'AAC'
    AND t4.s_area    = :bind1;
------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |              |       |       |    10 (100)|          |
|   1 |  UNION-ALL                            |              |   513 |  4114 |    10   (0)| 00:00:01 |
|   2 |   MERGE JOIN CARTESIAN                |              |   512 |  4096 |     5   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| PRODUCTS     |     1 |     4 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                  | PROD_CAT_IDX |     1 |       |     1   (0)| 00:00:01 |
|   5 |    BUFFER SORT                        |              |  1024 |  4096 |     3   (0)| 00:00:01 |
|*  6 |     TABLE ACCESS FULL                 | PRODUCTS     |  1024 |  4096 |     3   (0)| 00:00:01 |
|*  7 |   HASH JOIN                           |              |     1 |    18 |     5   (0)| 00:00:01 |
|*  8 |    TABLE ACCESS BY INDEX ROWID BATCHED| SOURCES      |     1 |    14 |     2   (0)| 00:00:01 |
|*  9 |     INDEX RANGE SCAN                  | SRC_CARR_IDX |     1 |       |     1   (0)| 00:00:01 |
|  10 |    TABLE ACCESS FULL                  | PRODUCTS     |  1024 |  4096 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SET$1
   2 - SEL$1
   3 - SEL$1 / "T1"@"SEL$1"
   4 - SEL$1 / "T1"@"SEL$1"
   6 - SEL$1 / "T2"@"SEL$1"
   7 - SEL$2
   8 - SEL$2 / "T4"@"SEL$2"
   9 - SEL$2 / "T4"@"SEL$2"
  10 - SEL$2 / "T3"@"SEL$2"

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T1"."PROD_CATEGORY"='SFD')
   6 - filter("T2"."METHOD_TYP"<>'SEA')
   7 - access("T3"."SCID"="T4"."SCID")
   8 - filter(TO_NUMBER("T4"."S_AREA")=:BIND1)
   9 - access("T4"."CARRIER"='AAC')

SQL Analysis Report (identified by operation id/Query Block Name/Object Alias):
-------------------------------------------------------------------------------
   2 -  SEL$1
           -  The query block has 1 cartesian product which may be
              expensive. Consider adding join conditions or removing the
              disconnected tables or views.

   6 -  SEL$1 / "T2"@"SEL$1"
           -  The following columns have predicates which preclude their
              use as keys in index range scan. Consider rewriting the
              predicates.
                "METHOD_TYP"

   8 -  SEL$2 / "T4"@"SEL$2"
           -  The following columns have predicates which preclude their
              use as keys in index range scan. Consider rewriting the
              predicates.
                "S_AREA"

Notice we are now using an index (PROD_CAT_IDX), but let's continue with our other issues. The S_AREA column has the same predicate concern. Again, let's trace the message back:

 8 -  SEL$2 / "T4"@"SEL$2"
           -  The following columns have predicates which preclude their
              use as keys in index range scan. Consider rewriting the
              predicates.
                "S_AREA"

Which points to this predicate:

 8 - filter(TO_NUMBER("T4"."S_AREA")=:BIND1)

Which points to this table / line of the plan / query:

 8 - SEL$2 / "T4"@"SEL$2"

And finally, it is this bit of code in the SQL:

 and t4.s_area    = :bind1;

So what’s wrong here? Well, the predicate filter shows a TO_NUMBER function being applied to the column. Just like the last one, this will have to be applied to every value of that column in the table and prevents index usage. But that function is not in the code, so where did it come from?

If we look at the bind variable definition we can see that it was defined as a NUMBER:

 var bind1 NUMBER;
exec :bind1 := 1;

The column S_AREA is defined as CHAR(4), so Oracle must convert the column to match the bind variable. This may seem backwards, but that is how Oracle's implicit conversion rules work. The fix is to make sure our bind variable data type matches the column data type:

 var bind1 CHAR(4);
exec :bind1 := '1';
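If you are unsure of a column's declared data type before picking a bind variable type, the data dictionary will tell you. A quick sketch, assuming the SOURCES table is in your own schema:

SELECT column_name, data_type, data_length
  FROM user_tab_columns
  WHERE table_name = 'SOURCES'
    AND column_name = 'S_AREA';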

Now let's analyze this SQL again:

var bind1 CHAR(4);
exec :bind1 := '1';

SELECT t1.p_category, t2.tpmethod
  FROM products t1, products t2
  WHERE t1.prod_category  = 'SFD'
    AND t2.method_typ != 'SEA'
UNION ALL
SELECT t3.p_category, t4.tpmethod
  FROM products t3, sources t4
  WHERE t3.scid = t4.scid
    AND t4.carrier   = 'AAC'
    AND t4.s_area    = :bind1;
------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |              |       |       |    10 (100)|          |
|   1 |  UNION-ALL                            |              |   513 |  4114 |    10   (0)| 00:00:01 |
|   2 |   MERGE JOIN CARTESIAN                |              |   512 |  4096 |     5   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| PRODUCTS     |     1 |     4 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                  | PROD_CAT_IDX |     1 |       |     1   (0)| 00:00:01 |
|   5 |    BUFFER SORT                        |              |  1024 |  4096 |     3   (0)| 00:00:01 |
|*  6 |     TABLE ACCESS FULL                 | PRODUCTS     |  1024 |  4096 |     3   (0)| 00:00:01 |
|*  7 |   HASH JOIN                           |              |     1 |    18 |     5   (0)| 00:00:01 |
|*  8 |    TABLE ACCESS BY INDEX ROWID BATCHED| SOURCES      |     1 |    14 |     2   (0)| 00:00:01 |
|*  9 |     INDEX RANGE SCAN                  | SRC_CARR_IDX |     1 |       |     1   (0)| 00:00:01 |
|  10 |    TABLE ACCESS FULL                  | PRODUCTS     |  1024 |  4096 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SET$1
   2 - SEL$1
   3 - SEL$1 / "T1"@"SEL$1"
   4 - SEL$1 / "T1"@"SEL$1"
   6 - SEL$1 / "T2"@"SEL$1"
   7 - SEL$2
   8 - SEL$2 / "T4"@"SEL$2"
   9 - SEL$2 / "T4"@"SEL$2"
  10 - SEL$2 / "T3"@"SEL$2"

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."PROD_CATEGORY"='SFD')
   6 - filter("T2"."METHOD_TYP"<>'SEA')
   7 - access("T3"."SCID"="T4"."SCID")
   8 - filter("T4"."S_AREA"=:BIND1)
   9 - access("T4"."CARRIER"='AAC')

SQL Analysis Report (identified by operation id/Query Block Name/Object Alias):
-------------------------------------------------------------------------------

   2 -  SEL$1
           -  The query block has 1 cartesian product which may be
              expensive. Consider adding join conditions or removing the
              disconnected tables or views.

   6 -  SEL$1 / "T2"@"SEL$1"
           -  The following columns have predicates which preclude their
              use as keys in index range scan. Consider rewriting the
              predicates.
                "METHOD_TYP"

Note that an additional index did not get picked up; this is because the data set is not yet large enough. If more data shows up in the future, the index can now be used, whereas with the original code it could not.

This time we only have one predicate issue left on the METHOD_TYP column. Tracing this one back:

 6 -  SEL$1 / "T2"@"SEL$1"
           -  The following columns have predicates which preclude their
              use as keys in index range scan. Consider rewriting the
              predicates.
                "METHOD_TYP"

Points to predicate and table information:

 6 - filter("T2"."METHOD_TYP"<>'SEA')
   6 - SEL$1 / "T2"@"SEL$1"

Which is this bit of the SQL code:

 and   t2.method_typ != 'SEA'

The last predicate issue here is that we are asking for all values except one. That is probably most of the table, so it is expected that an index will not get used. This may be unavoidable based on what the query needs to return; sometimes that will just be the case, and hopefully the other two fixes and their index usage will make the SQL perform well enough.
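One possible rewrite, if the domain of the column is small and known, is to turn the inequality into an explicit IN list, which can be used for index access. The values here are purely hypothetical, since the real contents of METHOD_TYP are not shown in this post:

 AND t2.method_typ IN ('AIR', 'RAIL', 'TRUCK')  -- everything except 'SEA', spelled out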

I don’t think we have fully fixed this query, but I do hope this explanation of the issues and the usage of the new SQL Analysis Report helps your understanding. Index usage is critical to most SQL query performance, and anyone who writes code needs to understand that the code itself can have a major impact on how the SQL can be optimized. With 23ai there are now even more tools to help with that, as long as you are looking in the right places.

For more information check out Nigel Bayliss's blog post New SQL Analysis Report in Oracle Database 23c Free! or the 23ai documentation.

Be sure to try out the 23ai FREE edition!

For more great Oracle content, check out some of my previous presentations on ORAPUB website. Also see where I will be presenting in the near future at the Viscosity events page.

Gary

Friday, July 19, 2024

Oracle Database 23ai – Real-Time SQL Plan Management

One of the most frequent questions I get asked is “why did the Oracle Optimizer change the execution plan on a SQL statement?”  Generally, this arises when a statement goes from running well to suddenly performing poorly.  In short, something has changed in a way that the Optimizer thinks the data should be accessed.  Which then leads to the question, why doesn’t it know that the new performance is worse?

Well, it won't solve all problems, but Oracle Database 23ai introduces a new feature called Real-Time SQL Plan Management.  This feature builds on top of the existing Automatic SQL Tuning Sets (Auto STS) and SQL Plan Management (SPM) features.  To summarize, the Oracle database uses these features together in the following manner:

  • Auto STS captures execution plans of “most” SQL statements and stores them in a system-managed SQL tuning set called “SYS_AUTO_STS”
  • At SQL execution time, the Optimizer decides if a new plan is warranted for a SQL statement based on previous execution information and the current estimated execution plan.  The Optimizer will then execute the new adjusted plan and compare it to previous runs.
  • If the new plan is better, a SQL Plan Baseline will be automatically created for that new plan.  If during future executions of the SQL the performance regresses again, the Optimizer will re-analyze and possibly disable the baseline or create another baseline as needed.

All that work you may have been doing to identify a poor running SQL, see how to run it better, and then putting in a “fix” such as a baseline is now 100% automated.

Often, we shy away from new features for fear that they will cause more harm than benefit.  Personally, I have adopted many new features from Oracle over the past 30+ years.  My goal here is to shed some light on how this new feature works to help reduce the fear, uncertainty, and doubt that might arise. 

Let’s look at a quick example, note this is running in Oracle Cloud Infrastructure (OCI) with Oracle Database Enterprise Edition 23.4.1.24.06 as we patiently await the on-premises release.  I’m skipping over some of the mechanics to focus more on the results of real-time SPM that you might see and want to understand what occurred.

First for this to work we need to enable the Real-Time SPM (RT SPM) task:

exec dbms_spm.configure('AUTO_SPM_EVOLVE_TASK', 'AUTO')

Note in 19c the EVOLVE task was either on or off and did not do real time evaluation.  AUTO is a new setting for 23ai.

You can check your current settings with the following query:

SELECT PARAMETER_NAME, PARAMETER_VALUE
    FROM DBA_SQL_MANAGEMENT_CONFIG;

PARAMETER_NAME                           PARAMETER_VALUE
---------------------------------------- ------------------------------
AUTO_SPM_EVOLVE_TASK                     AUTO

Once the Optimizer detects that a better execution plan exists, it may decide to try it.  If it does, you will see a note in the explain plan:

Note
-----
   - This is SQL Plan Management Test Plan

You will see this when the Optimizer runs the SQL with the possibly better execution plan.  Note, you may find it hard to capture this moment in a real system.  

If the new execution plan performs better, a SQL Plan Baseline (or baseline) will be created for the new execution plan as well as the previous execution plan.
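A query along these lines against DBA_SQL_PLAN_BASELINES shows them both (a sketch using the standard view columns; ENA, ACC, and COST in the output are ENABLED, ACCEPTED, and OPTIMIZER_COST trimmed for display):

SELECT sql_handle, plan_name, origin, enabled, accepted, optimizer_cost
  FROM dba_sql_plan_baselines
  WHERE sql_handle = 'SQL_2044b318726d1c30';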

SQL_HANDLE           PLAN_NAME                      ORIGIN             ENA ACC   COST
-------------------- ------------------------------ ------------------ --- --- ------
SQL_2044b318726d1c30 SQL_PLAN_20j5m31t6u71h169efe64 FOREGROUND-CAPTURE YES YES 151184
SQL_2044b318726d1c30 SQL_PLAN_20j5m31t6u71h8059011b FOREGROUND-CAPTURE YES NO  271739

Note the origin of the baseline, “FOREGROUND-CAPTURE”: this means that the Optimizer made this baseline during a SQL execution after analysis.  We can see a little bit more information from the new NOTES column of the DBA_SQL_PLAN_BASELINES view:

SELECT p.plan_name, p.foreground_last_verified,
       pfspm.status result, pfspm.ver verify_type,
       pfspm.test_phv, pftp.elps, pfspm.ref_phv, pfrp.elps
  FROM dba_sql_plan_baselines p,
    XMLTABLE(
         '/notes'
         passing xmltype(p.notes)
         columns
             plan_id         NUMBER    path 'plan_id',
             flags           NUMBER    path 'flags',
             fg_spm          XMLType   path 'fg_spm') pf,
     XMLTABLE(
         '/fg_spm'
         passing pf.fg_spm
         columns
             test_phv        NUMBER         path 'test_phv',
             ref_phv         NUMBER         path 'ref_phv',
             ver             VARCHAR2(8)    path 'ver',
             status          VARCHAR2(8)    path 'status',
             ref_perf        XMLType        path 'ref_perf',
             test_perf       XMLType        path 'test_perf' ) pfspm,
    XMLTABLE(
        '/ref_perf'
        passing pfspm.ref_perf
        columns
             bg              NUMBER         path 'bg',
             cpu             NUMBER         path 'cpu',
             elps            NUMBER         path 'elps' ) pfrp,
    XMLTABLE(
        '/test_perf'
        passing pfspm.test_perf
        columns
             bg              NUMBER         path 'bg',
             cpu             NUMBER         path 'cpu',
             elps            NUMBER         path 'elps' ) pftp
  WHERE notes IS NOT NULL
    AND p.sql_handle = 'SQL_2044b318726d1c30'
  ORDER BY p.foreground_last_verified;

PLAN_NAME                      FOREGROUND_LAST_VERIFIED     RESULT  VERIFY_T  TEST_PHV ELPS    REF_PHV ELPS
------------------------------ ---------------------------- ------- -------- --------- ---- ---------- ----
SQL_PLAN_20j5m31t6u71h169efe64 19-JUL-24 01.27.50.000000 PM better  normal   379518564 2775 2153316635 3674

This shows us that the elapsed time was lower for the newly tested execution plan compared to the reference (existing) plan.  There is more data in the NOTES column, but this is a helpful way to see what the Optimizer did to verify the performance.

You may wonder, what happens if things change again?  What if this new plan is not the best plan in the future?  Well, Real-Time SPM will look for that also: if the Optimizer's plan estimates show the SQL would degrade under the baseline, the baseline will not get used.  Again, in the Note section of an explain plan you may see this:

Note
-----
   - Failed to use SQL plan baseline for this statement

To view the execution plan for a given baseline, use the DBMS_XPLAN package:

SELECT *
    FROM table(DBMS_XPLAN.DISPLAY_SQL_PLAN_BASELINE(
             sql_handle=>'SQL_2044b318726d1c30',
             plan_name=>'SQL_PLAN_20j5m31t6u71h169efe64',
             format=>'typical'));

--------------------------------------------------------------------------------
SQL handle: SQL_2044b318726d1c30
SQL text: select /* SPM_TEST_QUERY_Q1 */ sum(t2.amount) from sales_area1 t1,
          sales_area2 t2 where t1.sale_code = t2.sale_code and
          t1.sale_type = 1
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
Plan name: SQL_PLAN_20j5m31t6u71h169efe64         Plan id: 2727377809
Enabled: YES     Fixed: NO      Accepted: YES     Origin: FOREGROUND-CAPTURE
Plan rows: From Auto SQL Tuning Set
--------------------------------------------------------------------------------

Plan hash value: 2727377809

-----------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |             |       |       |   151K(100)|          |
|   1 |  SORT AGGREGATE                       |             |     1 |    33 |            |          |
|*  2 |   HASH JOIN                           |             |    62M|  1966M|   151K  (1)| 00:00:06 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| SALES_AREA1 | 25000 |   170K| 15238   (1)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                  | SALES_TYP1I | 25000 |       |   235   (0)| 00:00:01 |
|   5 |    TABLE ACCESS FULL                  | SALES_AREA2 |  2500K|    61M|   135K  (1)| 00:00:06 |
-----------------------------------------------------------------------------------------------------

Note that there is additional information about the baseline provided by the explain plan, and that the plan itself comes from the automatic SQL tuning set, which is where the baseline was created from.
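If you want to look at the captured plans yourself, the automatic SQL tuning set can be queried directly; a sketch, assuming you have access to the DBA views:

SELECT sql_id, plan_hash_value
  FROM dba_sqlset_plans
  WHERE sqlset_name = 'SYS_AUTO_STS'
  FETCH FIRST 10 ROWS ONLY;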

I’m really excited about this new feature and how it can help proactively with SQL performance issues.  Oracle will now detect if SQL performance has degraded from a previously captured execution plan, test a possible fix, and implement a baseline if the fix is acceptable.  It seems the answer to “why did the Optimizer change the execution plan?” is: just wait a moment and it will fix itself.

Be sure to check out the Oracle blog by Nigel Bayliss which includes a link to example code he created that was shown in this blog.

Thursday, June 6, 2024

Oracle RAC install and Linux 8 - SSH connectivity issue INS-06006

Many systems are getting upgraded to Linux version 8 due to the impending end of life for Linux 7.  This includes both RedHat and Oracle Enterprise Linux.  In general, most are choosing to do a new build-out and migrate their databases from the previous system to the new system.  This is the most supportable model, though not the only one.

One issue you may run into is installing either Oracle RAC Grid Infrastructure or Oracle Database in a clustered environment.  During the runInstaller or gridSetup.sh steps you may receive an error about "Passwordless SSH connectivity not setup" even though you have set it up and manually verified it.  This is due to a change in OpenSSH 8 to deal with a security vulnerability (CVE-2019-6111).  You will receive a message like this:

[FATAL] [INS-06006] Passwordless SSH connectivity not set up between the following node(s): [srvr02].

   CAUSE: Either passwordless SSH connectivity is not setup between specified node(s) or they are not reachable. Refer to the logs for more details.

   ACTION: Refer to the logs for more details or contact Oracle Support Services.

When you manually verify SSH connectivity everything works:

$ for i in srvr01 srvr02; do ssh $i hostname -s ; done
srvr01
srvr02

The fix is to make sure you are applying the RU / patch to the software during the install process, basically forcing the usage of --applyRU for runInstaller or gridSetup.sh.  This is also the recommended method for all patching / installs going forward (expect more of this in the 23ai version).  For 19c you need to use the 19.6 RU or above.

For GI install your command line would look something like:

/u01/app/19/grid_1/gridSetup.sh --applyRU /u01/app/grid/patch/36233126

There is a little caveat for this when it comes to the database software install.  Normally you would run the install from the first node of the cluster, and the installer would replicate the software to each node using SSH.  The issue is that the RU for the database does not directly include the fix for SSH (Bug 30159782 - Remote Copy Fails if using openssh-7.2, 7.4, etc. (Doc ID 30159782.8)).  You have to include the OCW (Oracle ClusterWare) patch in the runInstaller steps.  The OCW patch is included in the GI RU, not the database RU.

[Image 1: 19.23 GI RU patch contents on the download screen]
The easiest way to deal with this is to use the Grid Infrastructure (GI) RU patch for both the GI home and the database home, though the syntax will look slightly different for the database home.  We will use the recent April 2024 RU as an example.  For the 19.23 GI RU we see the following patch contents on the download screen (see Image 1).  In the red boxes we can see that both the database patch and the OCW patch are included.

Once you have the patch downloaded and unzipped, we can verify that from the directory structure:

36233126/
├── 36233263  <- Database RU 19.23
├── 36233343
├── 36240578  <- OCW patch 19.23
├── 36383196
├── 36460248
├── automation
├── bundle.xml
├── README.html
└── README.txt

So the steps for setting up a database home on RAC with Linux 8 would look like this:
  1. Stage the patch files, e.g. /u01/app/oracle/patch (along with any additional patches)

    unzip -qo p36233126_190000_Linux-x86-64.zip -d /u01/app/oracle/patch


  2. Unzip the install media (19.3) into the home, e.g. /u01/app/oracle/product/19/dbhome_2

    unzip -qo LINUX.X64_193000_db_home.zip -d /u01/app/oracle/product/19/dbhome_2


  3. Update the OPatch version in the new home (be sure to download the latest OPatch from patch 6880880 in MOS)

    rm -rf /u01/app/oracle/product/19/dbhome_2/OPatch

    unzip -qo p6880880_190000_Linux-x86-64.zip -d /u01/app/oracle/product/19/dbhome_2


  4. Install the software and apply the patches in one command (be sure to update the items below based on your configuration).  Here we are installing the RU, OCW, OJVM, and Datapump bundle patches.

    export CV_ASSUME_DISTID=OEL7.8

    /u01/app/oracle/product/19/dbhome_2/runInstaller \
    -silent -ignoreprereqfailure \
    -applyRU /u01/app/oracle/patch/36233126/36233263 \
    -applyOneOffs \
    /u01/app/oracle/patch/36233126/36240578,/u01/app/oracle/patch/36420641,/u01/app/oracle/patch/36199232 \
    oracle.install.db.InstallEdition=EE \
    oracle.install.option=INSTALL_DB_SWONLY \
    INVENTORY_LOCATION=/u01/app/OraInventory \
    ORACLE_BASE=/u01/app/oracle \
    UNIX_GROUP_NAME=oinstall \
    oracle.install.db.CLUSTER_NODES=srvr01,srvr02 \
    oracle.install.db.OSDBA_GROUP=dba \
    oracle.install.db.OSOPER_GROUP=oper \
    oracle.install.db.OSBACKUPDBA_GROUP=backupdba \
    oracle.install.db.OSDGDBA_GROUP=dgdba \
    oracle.install.db.OSKMDBA_GROUP=kmdba \
    oracle.install.db.OSRACDBA_GROUP=racdba 

    Note: backslashes (\) are used to continue the command onto separate lines.  Also, the patch list is comma-separated with no spaces, and be sure to export CV_ASSUME_DISTID, which is required when installing 19c on Linux 8.

  5. After runInstaller finishes, run the root.sh script on every node you installed on.

If you make a mistake or have issues during runInstaller, just remove the dbhome directory and start over.

Now you should have a new 19.23 home installed on all nodes of the cluster with the OCW patch, JVM, and DataPump patches all ready to go!

Thursday, May 16, 2024

Serious issue - server crash - Oracle database 19c and Oracle Enterprise Linux 8 Update 8 and above

 

I'm a tad behind on this, but I think it's a critical issue.  This could be a repeat of something you may have already heard or read about.  I just want to make sure the information is getting out.

There is a serious issue with Oracle database 19c running on Oracle Enterprise Linux 8 Update 8 and above.  The issue starts with database 19.21 and above (October 2023 RU) when running on OEL 8 with the UEK kernel UEK7U1 and above (5.15.0-10* kernel released in April 2023).  If you are running a system with HUGEPAGES memory configured and have the ASM Filter Driver or ASM Clustered Filesystem (ACFS) kernel modules installed, this can lead to a server crash (Linux kernel panic).

We were able to re-create the issue consistently when running an RMAN restore.  The criticality to me is increased as this would affect any large database system (those using HUGEPAGES and ASM together).

See the following MOS document:
ACFS Kernel Panic (RIP: ofs_teardown_diop_osd) After Updating OS Kernel to UEK7U1 or Later (Doc ID 2998947.1)

Or Mike Dietrich's blog:

Can you use the UEK7 Linux kernel, or may you get some trouble?

The fix for this problem? You have two basic solutions:

  • Stay on UEK7 (5.15.0-3.xxx) until you can upgrade your GI / ASM home and kernel modules to 19.23 (April 2024 RU)
  • Apply the April 2024 RU to your GI / ASM home and make sure the ACFS / AFD kernel modules are updated (this is done automatically)
There is a one-off patch, but it seems to be for a specific configuration / combination.  If you are on the specific kernel version 5.15.0-201.135.6 and running 19.22 ASM / GI (Jan 2024 RU) then there is a patch that can be applied:
     Patch 35983839 - UEK7U2 SUPPORT FOR ACFS, AFD (V5.15.0-201.135.6)

Since Linux kernels update frequently, and corporate security teams want frequent updates, I would lean towards taking up 19.23 (April 2024 RU), since that is the minimum fully supported version for UEK7U2 based on the support matrix:


The issue seems to have first been acknowledged in November of 2023; I'm a little behind, as I always recommend enterprise customers use an N-1 patching methodology.  So here we are in Q2 of 2024 hitting an issue that is part of Q1 2024, as we take up an October 2023 Linux kernel with a Jan 2024 database patch.  The good news is, the fix is readily available at this point and our client can move forward instead of backwards.

Hopefully this gives you enough information to be preemptive on the fix and not have systems crashing.

One final reference, a good blog on the OEL kernel release names and dates:


Gary

Wednesday, April 10, 2024

Putting the You In User Groups: UTOUG Training Days 2024

Gary Gordhamer, Managing Principal Consultant, Viscosity NA

We both recently attended and spoke at the Utah Oracle User Group (UTOUG) Training Days event that took place in Salt Lake City, Utah on March 21st and 22nd.  This was an outstanding event, and the volunteers and conference committee of UTOUG did an outstanding job bringing together dozens of excellent presenters, many of them (like us) Oracle ACEs.

Finding time to learn, meet our fellow IT specialists, or even have a few moments to plan what the future holds is almost impossible.  Attending a well-planned in-person event is one of the best ways to meet all these needs.

In case you doubt that, here’s the list of most in-demand skills for 2024, according to LinkedIn. The top three “soft” skills are Communication, Customer Service, and Leadership.

Still not convinced? In 2023, LinkedIn listed the most in-demand “hard skills”, things that are not common across most jobs, and the top three on the list were Software Development, SQL, and Finance.

We both agree that UTOUG Training Days hit every one of these items, on both lists. Whether you might have been utterly new to IT or a seasoned professional looking to learn new skills while strengthening others, this event compressed all of that into two short days.  If you’re feeling a bit of “Fear of Missing Out” (FOMO), well, that’s our point. And if you’re a manager and you feel like your team missed out, well, that’s good too: that’s your call to action.

A couple of unique items that UTOUG included in their event: bringing students from nearby regional colleges into the conference, funding scholarships for students, and finishing with a ski day on Saturday for those interested, to keep the community atmosphere.  These are critical components of building the next generation of IT professionals and helping employers build their teams.  Again, managers: if you are struggling to find the right talent, maybe you need to invest one or two days into looking where the talent actually is, as opposed to casting wide nets into the ocean of the internet.

Day One: Leaders Are Made … and They Are Magnetic

Our first day kicked off with an amazing keynote by Elena Marquetti-Ali about magnetic leadership.  A little tidbit from the session: 85% of managers have no leadership training! So that means everyone who attended Training Days now has more training than 85% of the leaders out there.  This speaks directly to two of the most in-demand skills.  Her “magnetic” 30-minute discussion was covered in a single slide.  Impactful and direct; for those of us armchair speakers, this was a sight to behold beyond the great content.

Following the keynote were 25 deep technical sessions over five time slots that covered everything from SQL, Cloud, and Machine Learning to team building.  Breakfast and lunch were included, and we had 15 minutes between each session to spend time meeting our fellow attendees and finding out about their takeaways or key challenges.  Hmmm … problem-solving, communication, teamwork, and research … it seems we just hit a few more of those key skills the folks at LinkedIn already identified as crucial to success.

You may ask yourself, how good could these sessions be?  Well, they were presented by 31 great people from three different continents.  How great?  Well, at least 20 of them are recognized by Oracle Corporation as Oracle ACEs for their knowledge and outreach in the technical field.  Meanwhile, all the topics and speakers were selected through a volunteer review process.  That means users help drive the content, and the content is created by some of the best.  Have a topic that needs a deep technical dive and more outreach or training?  Let the user group know!

Day Two: Humans 25, AI 1

Day Two started with breakfast and another gathering time for all the attendees to learn from each other, followed by another 25 deep technical sessions covering performance tuning, mentoring, operations, and even a great discussion about how even in the face of ever-more-powerful Artificial Intelligence, humans are still irreplaceable. 

In our combined opinion, this was a wonderful conference that hit all the requirements of learning, community, skill building and putting you the IT professional first. Thanks to all the leaders in the UTOUG, especially their conference committee members, for building an amazingly well-rounded event!  

And for those of you who really hoped to attend, but missed it for any number of reasons: Be sure to remember to watch out for this UTOUG event in 2025, don’t forget to look for other regional events nearby, and be sure to tell your local user group what you want to see.

Jim - Gary

Tuesday, January 24, 2023

Running Oracle database containers on Apple M1 using PODMAN

INTRODUCTION


In this blog we will talk about running Oracle database in a PODMAN container on Apple silicon (M1 / M2) under Mac OS X.  As you might know, Oracle does not provide native Oracle database binaries for Apple silicon, which is a different architecture than Intel / AMD x86.  If you previously ran a Linux container under Docker on an x86 Intel-based Mac, you will find a very different situation on a newer Apple M1 or M2 processor.

Docker for Mac OS X is still available and it should be able to emulate x86 automatically.  I personally had some issues getting Oracle’s pre-built database containers to work under docker, and instead decided to look into using PODMAN.

I should mention that I still find PODMAN somewhat buggy. It’s not clear to me if this is due to the QEMU emulation of x86 on Apple silicon, or Podman, or my specific setup, or some combination.  So be warned: I do see a number of times where I start the Oracle database container and it fails.  Then I shut down my PODMAN machine, restart it, and the container works again.  Generally, once the container is running it has been solid.

Also, since x86 is emulated on Apple silicon, it runs much slower than what I was used to on my previous Intel-based MacBook using Docker.  In my anecdotal tests it runs about 70% slower.  Note this is not a knock on Apple silicon; this is just the nature of the full processor emulation that is required.

REQUIREMENTS

PODMAN has a slightly different architecture than Docker and is an open source product.  This means that installing and using it will be based on setting up and using a number of open source items.  I won’t cover all of them in these directions; you can find many links on how to install these items.  You will need:

  • HomeBrew (https://brew.sh) or some method to install and run open source software
  • QEMU (https://www.qemu.org) Install this through HomeBrew; this is the open source emulation software that will emulate the x86_64 (AMD64) processor.  Note you will see just about everything referenced in the Open Source world as AMD64, as AMD built the original 64-bit instruction set that Intel uses today.
  • Podman (https://podman.io) the container platform, you can install this using HomeBrew or you can directly download the packages and install them
  • Podman desktop (optional) (https://podman-desktop.io) a useful GUI for working with Podman, though you will still need to do some of the work on command line.

AMD64 MACHINE

The next step is to create a machine (VM) for PODMAN based on amd64 processor.  I borrowed these steps from a great IBM article.  I’ll paraphrase / reuse some of their commands but the source article is here:

https://developer.ibm.com/tutorials/running-x86-64-containers-mac-silicon-m1/

Note: when you install PODMAN, you will probably already have an ARM-based M1 machine (VM) created.  This default machine cannot run the Oracle-provided containers for database.

The machine is a VM that will host your containers in PODMAN.  This is the main difference from Docker, which runs an OS daemon, while PODMAN uses a VM that is technically outside of the OS it is running on.

For the purpose of running Oracle database, we will have PODMAN run a Linux amd64 VM under QEMU emulation, and then our containers will run inside of that VM (machine).  Open a terminal prompt (without Rosetta) and execute the following commands:

  1. Verify PODMAN is version 4.1.1 or newer:
    podman --version
  2. Make sure you have stopped the current machine in PODMAN:
    podman machine stop
  3. Verify your QEMU version 7.0 or higher:
    qemu-system-x86_64 --version 
  4. Create a new PODMAN machine using the latest FEDORA CORE image:
    podman machine init --image-path https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/37.20221225.3.0/x86_64/fedora-coreos-37.20221225.3.0-qemu.x86_64.qcow2.xz intel
    Note: you can update the link by navigating to:
    https://getfedora.org/en/coreos/download?tab=metal_virtualized&stream=stable&arch=x86_64
    and grabbing the URL for DOWNLOAD from “Virtualized -> QEMU”
    Note: the name “intel” will be the name of your PODMAN machine, you can change this to be whatever is useful for you.

Before starting this machine, we need to reconfigure it for the proper settings.  This is critical, and I have made some suggestions to help with performance.  

Edit the configuration file in your favorite text editor: 

~/.config/containers/podman/machine/qemu/intel.json

Note: if you changed the machine name, then replace intel.json with the name your_name.json

Note: this is a JSON file, so watch your syntax closely.  A quick JSON primer: https://guide.couchdb.org/draft/json.html
we will be updating JSON arrays and objects.

First in the “CmdLine” section, change the command that will be run:

"/opt/homebrew/bin/qemu-system-aarch64",

Change to:

"/opt/homebrew/bin/qemu-system-x86_64",

Also in the “CmdLine” section you will have to remove the following options that are for the ARM architecture:

"-accel",
"hvf",
"-accel",
"tgc",
"-cpu",
"host",
"-M",
"virt,highmem=on",
"-drive",
"file=/Users//.local/share/containers/podman/machine/qemu/intel_ovmf_vars.fd,if=pflash,format=raw"

Note: the -cpu option may say something besides host; if it does, remove that as well.

Add the following command line options in the spot you just removed:

  "-cpu",
  "qemu64",
  "-accel",
  "tcg,thread=multi",
  "-m",
  "2048",
  "-smp",
  "4",

This is basically saying these are the first command line options to the qemu-system-x86_64 command.

Note: I have set the memory (-m) to 2GB and the CPU core (-smp) count to 4.  You can adjust these settings for your needs.

Next change the line for the VM firmware:

"file=/opt/homebrew/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on",

to

"file=/opt/homebrew/share/qemu/edk2-x86_64-code.fd,if=pflash,format=raw,readonly=on",

This is setting the correct firmware for the machine (VM).

Finally, at the bottom of the configuration file is an object section that is used to display the machine information in the GUI.  Update the CPUs and Memory settings appropriately:

"CPUs": 4,

"Memory": 2048,

Your machine file should now be set up correctly.  Next we need to launch the machine and make sure PODMAN is healthy.  From the terminal run:

podman machine start intel

Note: this will take a few minutes to start.

Verify the machine is running and healthy:

podman machine list

NAME        VM TYPE     CREATED      LAST UP            CPUS        MEMORY      DISK SIZE

intel*      qemu        4 weeks ago  Currently running  4           2.147GB     107.4GB


If you are only going to use this machine for your containers, you can set it to be the default by running from the terminal:

podman system connection default intel

Note: you need to restart the terminal for this to take effect.

Also, you can change back to the original ARM machine with:

podman system connection default podman-machine-default

Setting up your Oracle Database Container

I had some additional issues here.  As I noted, the emulated amd64 machine is not very fast; after I increased the SMP count to 4, I ran into another issue with the Java-based Oracle setup utilities that are executed during container creation.  If you try to load your Oracle Database container and it is hanging, you should adjust your SMP count down to 1 (one) and restart your machine.  Once the container is loaded and usable, you can shut down the machine and reset the SMP back to 4 (four) or whatever you had.  Starting the container after this is fine.

I could verify the hang by connecting to the OS of the container:

podman exec -it DB213 sh

I could see the java process running oracle.assistants.dbca.driver.DBConfigurator, and the log file showed no progress happening (under /opt/oracle/cfgtoollogs/dbca/ORCLCDB).

Steps to get an Oracle Database Container

  1. Login to the website (container-registry.oracle.com) and accept the license.  Sign in with the button in the upper left corner.  Click on the repository (Database) then accept the license.
  2. From the terminal, login to the same repository with PODMAN (use same login you accepted the license with):
    podman login container-registry.oracle.com
  3. Pull the container of your choice into PODMAN (use the links on the website to select the specific version you want):
    podman pull container-registry.oracle.com/database/enterprise:21.3.0.0
  4. List the installed images, you may want to note the image ID if you want a specific version:
    podman image list
  5. Now build a container from the image (using ID, you can also use REPOSITORY)
    podman run -d --name DB213 container-registry.oracle.com/database/enterprise:21.3.0.0
    Note: DB213 is my container name; use whatever name works for you.

Note: please see my note above on SMP CPU count if you are having issues with your container not creating an Oracle database properly.

To get a shell on the container:

podman exec -it DB213 sh

Additional tip

I was having some issues with timezone for my demo database.  The containers will be UTC by default.  If you want to change this you can set the parameter tz in the file:

$HOME/.config/containers/containers.conf


[containers]

# Set time zone in container. Takes IANA time zones as well as "local",
# which sets the time zone in the container to match the host machine.
#
tz = "local"

Friday, February 5, 2021

Finding Your SQL Trace File in Oracle Database

So this question was asked of me a few months back in one of my presentations: how do I find a trace file from SQL trace on my Oracle database?

Sorry for the delay, but here are my suggestions.  First you will need to know where your trace file will be located.  By default it will be in the trace directory as part of the diagnostic destination. Make sure you know where your diagnostic destination is located.

SQL> show parameter diag
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
diagnostic_dest                      string      /u01/app/oracle

There are basically four main ways to find your trace file.

Option 0 - Just ask

Well, I learned something new, so I wanted to update this article.  There is a view you can directly query to see your current trace file name.

SELECT value 
      FROM v$diag_info 
      WHERE name = 'Default Trace File';

This will show the correct trace file name even when you update your TRACEFILE_IDENTIFIER, so I would still recommend setting that as outlined in Option 1.
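For example, the two work nicely together; a minimal sketch (the identifier MYTRACE is arbitrary):

ALTER SESSION SET tracefile_identifier = 'MYTRACE';

SELECT value
  FROM v$diag_info
  WHERE name = 'Default Trace File';

-- The VALUE returned will now end in ..._MYTRACE.trc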

Option 1 - Manually set the trace file name

For this option, we set a session level parameter to drive the name of the trace file.

ALTER SESSION SET tracefile_identifier = 'SQLID_fua0hb5hfst77';
Note: if you receive an error turning on tracing, make sure the user has been granted the alter session privilege.

Next we enable tracing:
EXEC DBMS_SESSION.SET_SQL_TRACE(sql_trace => true);
Run the SQL statement you want to trace:
SELECT sum(t1.c), sum(t2.c)
  FROM t1, t2
  WHERE t1.a = t2.a
   AND t1.d = 10;
Then disable the tracing:
EXEC DBMS_SESSION.SET_SQL_TRACE(sql_trace => false);
Here we can now see the trace file, which contains the identifier we gave above:
cd /u01/app/oracle/diag/rdbms/t1db/t1db/trace/

[oracle@srvr03 trace]$ ls -ltrh *fua0*
-rw-r----- 1 oracle oinstall 4.3K Feb  5 08:29 t1db_ora_4978_SQLID_fua0hb5hfst77.trc
-rw-r----- 1 oracle oinstall 2.0K Feb  5 08:29 t1db_ora_4978_SQLID_fua0hb5hfst77.trm
If you are trying to do extended SQL or CBO tracing, you can also use event-based options for the SQL trace.  If the trace is going to be large, you should also check the size of the maximum dump file; this can be adjusted at the session level.
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1'; -- CBO tracing
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12'; -- Additional SQL trace information

SQL> show parameter max_dump_file_size;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
max_dump_file_size                   string      unlimited
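If max_dump_file_size is limited on your system, it can be raised for just your session before tracing; a minimal sketch:

ALTER SESSION SET max_dump_file_size = unlimited;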

Option 2 - Get your session diagnostic information


You can query the name of the current diagnostic trace file for the session.
set linesize 100
column name format a20
column value format a70
select name, value from v$diag_info where name = 'Default Trace File';

NAME                 VALUE
-------------------- ----------------------------------------------------------------------
Default Trace File   /u01/app/oracle/diag/rdbms/t1db/t1db/trace/t1db_ora_31523.trc
Now repeat the process of turning on trace, running your SQL, and turning trace back off.

Option 3 - Searching for the trace file


You can search for the trace file at the OS level.  This will probably give you a list of files to verify which one contains the trace file in question.  If you get the SID from the session you can include that in the file search.

SQL> select userenv('SID') from dual;
USERENV('SID')
--------------
            19
Then search the trace directory (based on the diagnostic destination setting of your database):
find /u01/app/oracle/diag/rdbms/t1db/t1db/trace -name t1db_ora\*.trc -mtime 0 -exec grep -l "*** SESSION ID:(19." {} \;
/u01/app/oracle/diag/rdbms/t1db/t1db/trace/t1db_ora_26655.trc
/u01/app/oracle/diag/rdbms/t1db/t1db/trace/t1db_ora_4978_SQLID_fua0hb5hfst77_.trc
/u01/app/oracle/diag/rdbms/t1db/t1db/trace/t1db_ora_31523.trc
/u01/app/oracle/diag/rdbms/t1db/t1db/trace/t1db_ora_14396.trc
Note the mtime option for find is looking for files that were updated in the past 24 hours.  If you want to look for older files, adjust the number after mtime; e.g. +1 will look for files modified at least 48 hours ago.

There are other ways, but these should be some of the easiest ways.

Gary