Wednesday, April 10, 2024

Putting the You In User Groups: UTOUG Training Days 2024

Gary Gordhamer, Managing Principal Consultant, Viscosity NA

We both recently attended and spoke at the Utah Oracle User Group (UTOUG) Training Days event that took place in Salt Lake City, Utah on March 21st and 22nd.  This was an outstanding event, and the volunteers and conference committee of UTOUG did an excellent job bringing together dozens of great presenters, many of them (like us) Oracle ACEs.

Finding time to learn, meet our fellow IT specialists, or even have a few moments to plan what the future holds is almost impossible.  Attending a well-planned in-person event is one of the best ways to meet all of these needs.

In case you doubt that, here’s the list of the most in-demand skills for 2024, according to LinkedIn. The top three “soft” skills are Communication, Customer Service, and Leadership.

Still not convinced? In 2023, LinkedIn listed the most in-demand “hard” skills - things that are not common across most jobs - and the top three on the list were Software Development, SQL, and Finance.

We both agree that UTOUG Training Days hit every one of these items - on both lists. Whether you were utterly new to IT or a seasoned professional looking to learn new skills while strengthening others, this event compressed all of that into two days.  If you’re feeling a bit of “Fear of Missing Out” (FOMO), well, that’s our point. And if you’re a manager and you feel like your team missed out, that’s good too - that’s your call to action.

A few unique items that UTOUG included in their event: bringing students from nearby regional colleges into the conference, funding scholarships for students, and finishing with a ski day on Saturday for those interested, to keep the community atmosphere going.  These are critical components of building the next generation of IT professionals and helping employers build their teams.  Again, managers: if you are struggling to find the right talent, maybe you need to invest one or two days into looking where the talent actually is, as opposed to casting wide nets into the ocean of the internet.

Day One: Leaders Are Made … and They Are Magnetic

Our first day kicked off with an amazing keynote by Elena Marquetti-Ali about magnetic leadership.  A little tidbit from the session: 85% of managers have no leadership training!  So that means everyone who attended Training Days now has more training than 85% of the leaders out there.  This speaks directly to two of the most in-demand skills.  Her “magnetic” 30-minute discussion was covered in a single slide.  Impactful and direct; for those of us armchair speakers, it was a sight to behold beyond the great content.

Following the keynote were 25 deep technical sessions over five time slots that covered everything from SQL, Cloud, and Machine Learning to team building.  Breakfast and lunch were included, and we had 15 minutes between sessions to meet our fellow attendees and find out about their takeaways and key challenges.  Hmmm … problem-solving, communication, teamwork, and research … it seems we just hit a few more of those key skills the folks at LinkedIn identified as crucial to success.

You may ask yourself, how good could these sessions be?  Well, they were presented by 31 great people from three different continents.  How great?  At least 20 of them are recognized by Oracle Corporation as Oracle ACEs for their knowledge and outreach in the technical field.  Meanwhile, all the topics and speakers were selected through a volunteer review process.  That means users help drive the content, and the content is created by some of the best.  Have a topic that needs a deep technical dive and more outreach or training?  Let the user group know!

Day Two: Humans 25, AI 1

Day Two started with breakfast and another gathering time for all the attendees to learn from each other, followed by another 25 deep technical sessions covering performance tuning, mentoring, operations, and even a great discussion about how even in the face of ever-more-powerful Artificial Intelligence, humans are still irreplaceable. 

In our combined opinion, this was a wonderful conference that hit all the requirements of learning, community, and skill building while putting you, the IT professional, first. Thanks to all the leaders in UTOUG, especially their conference committee members, for building an amazingly well-rounded event!

And for those of you who really hoped to attend but missed it for any number of reasons: watch for this UTOUG event in 2025, look for other regional events nearby, and be sure to tell your local user group what you want to see.

Jim - Gary

Tuesday, January 24, 2023

Running Oracle database containers on Apple M1 using PODMAN

INTRODUCTION


In this blog we will talk about running an Oracle database in a PODMAN container on Apple silicon (M1 / M2) under Mac OS X.  As you might know, Oracle does not provide native Oracle database binaries for Apple silicon, which is a different architecture than Intel / AMD x86.  If you previously ran a Linux container under Docker on an x86 Intel-based Mac, you will find a very different situation on a newer Apple M1 or M2 processor.

Docker for Mac OS X is still available, and it should be able to emulate x86 automatically.  I personally had some issues getting Oracle’s pre-built database containers to work under Docker, and instead decided to look into using PODMAN.

I should mention that I still find PODMAN somewhat buggy.  It’s not clear to me if this is due to the QEMU emulation of x86 on Apple silicon, or Podman, or my specific setup, or some combination.  So be warned: there are a number of times where I start the Oracle database container and it fails.  Then I shut down my PODMAN machine, restart it, and the container works again.  Generally, once the container is running it has been solid.

Also, since x86 is emulated on Apple silicon, it runs much slower than what I was used to on my previous Intel-based MacBook using Docker.  In my anecdotal tests it runs about 70% slower.  Note this is not a knock on Apple silicon; this is just the nature of the full processor emulation that is required.

REQUIREMENTS

PODMAN has a slightly different architecture than Docker and is an open source product.  This means that installing and using it will be based on setting up and using a number of open source items.  I won’t cover all of them in these directions; you can find many links on how to install these items, and a quick HomeBrew sketch follows the list.  You will need:

  • HomeBrew (https://brew.sh), or some method to install and run open source software
  • QEMU (https://www.qemu.org): install this through HomeBrew.  This is the open source emulation software that will emulate the x86_64 (AMD64) processor.  Note you will see just about everything referenced in the open source world as AMD64, as AMD built the original 64-bit instruction set that Intel uses today.
  • Podman (https://podman.io): the container platform.  You can install this using HomeBrew, or you can directly download the packages and install them.
  • Podman Desktop (optional) (https://podman-desktop.io): a useful GUI for working with Podman, though you will still need to do some of the work on the command line.
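If you already use HomeBrew, the installs are quick.  A minimal sketch (package and cask names as published by HomeBrew at the time of writing):

brew install qemu                     # provides qemu-system-x86_64, the x86 emulator
brew install podman                   # the container engine and CLI
brew install --cask podman-desktop    # optional GUI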

AMD64 MACHINE

The next step is to create a machine (VM) for PODMAN based on the amd64 processor.  I borrowed these steps from a great IBM article.  I’ll paraphrase / reuse some of their commands, but the source article is here:

https://developer.ibm.com/tutorials/running-x86-64-containers-mac-silicon-m1/

Note: when you install PODMAN, you will probably already have an ARM-based M1 machine (VM) created.  This default machine cannot run the Oracle-provided database containers.

The machine is a VM that will host your containers in PODMAN.  This is the main difference from Docker, which runs an OS daemon, while PODMAN uses a VM that is technically outside of the OS it is running on.

For the purpose of running an Oracle database, we will have PODMAN run a Linux amd64 VM under QEMU emulation, and then our containers will run inside of that VM (machine).  Open a terminal prompt (without Rosetta) and execute the following commands:

  1. Verify PODMAN is version 4.1.1 or newer:
    podman --version
  2. Make sure you have stopped the current machine in PODMAN:
    podman machine stop
  3. Verify your QEMU version 7.0 or higher:
    qemu-system-x86_64 --version 
  4. Create a new PODMAN machine using the latest FEDORA CORE image:
    podman machine init --image-path https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/37.20221225.3.0/x86_64/fedora-coreos-37.20221225.3.0-qemu.x86_64.qcow2.xz intel
    Note: you can update the link by navigating to:
    https://getfedora.org/en/coreos/download?tab=metal_virtualized&stream=stable&arch=x86_64
    and grabbing the URL for DOWNLOAD from “Virtualized -> QEMU”
    Note: the name “intel” will be the name of your PODMAN machine; you can change this to whatever is useful for you.

Before starting this machine, we need to reconfigure it for the proper settings.  This is critical, and I have made some suggestions to help with performance.  

Edit the configuration file in your favorite text editor: 

~/.config/containers/podman/machine/qemu/intel.json

Note: if you changed the machine name, then replace intel.json with your_name.json

Note: this is a JSON file, so watch your syntax closely.  A quick JSON primer: https://guide.couchdb.org/draft/json.html.  We will be updating JSON arrays and objects.

First in the “CmdLine” section, change the command that will be run:

"/opt/homebrew/bin/qemu-system-aarch64",

Change to:

"/opt/homebrew/bin/qemu-system-x86_64",

Also in the “CmdLine” section you will have to remove the following options, which are for the ARM (aarch64) machine:

"-accel",
"hvf",
"-accel",
"tcg",
"-cpu",
"host",
"-M",
"virt,highmem=on",
"-drive",
"file=/Users//.local/share/containers/podman/machine/qemu/intel_ovmf_vars.fd,if=pflash,format=raw"

Note: the -cpu option may say something besides host; if it does, remove that as well.

Add the following command line options in the spot you just removed:

  "-cpu",
  "qemu64",
  "-accel",
  "tcg,thread=multi",
  "-m",
  "2048",
  "-smp",
  "4",

This is basically saying these are the first command line options to the qemu-system-x86_64 command.

Note: I have set the memory (-m) to 2GB and the CPU core (-smp) count to 4.  You can adjust these settings for your needs.

Next change the line for the VM firmware:

"file=/opt/homebrew/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on",

to

"file=/opt/homebrew/share/qemu/edk2-x86_64-code.fd,if=pflash,format=raw,readonly=on",

This is setting the correct firmware for the machine (VM).

Finally, at the bottom of the configuration file is an object section that is used to display the machine information in the GUI.  Update the CPUs and Memory settings appropriately:

"CPUs": 4,

"Memory": 2048,
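Putting these edits together, the relevant parts of intel.json should look roughly like this (a sketch showing only the lines we changed; the surrounding options stay as generated, and the lines marked ... are omitted here):

"CmdLine": [
    "/opt/homebrew/bin/qemu-system-x86_64",
    "-cpu",
    "qemu64",
    "-accel",
    "tcg,thread=multi",
    "-m",
    "2048",
    "-smp",
    "4",
    ...
    "file=/opt/homebrew/share/qemu/edk2-x86_64-code.fd,if=pflash,format=raw,readonly=on",
    ...
],
"CPUs": 4,
"Memory": 2048,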

Your machine file should now be set up correctly.  Next we need to launch the machine and make sure PODMAN is healthy.  From the terminal run:

podman machine start intel

Note: this will take a few minutes to start.

Verify the machine is running and healthy:

podman machine list

NAME        VM TYPE     CREATED      LAST UP            CPUS        MEMORY      DISK SIZE

intel*      qemu        4 weeks ago  Currently running  4           2.147GB     107.4GB


If you are only going to use this machine for your containers, you can set it to be the default by running from the terminal:

podman system connection default intel

Note: you need to restart the terminal for this to take effect.

Also, you can change back to the original ARM machine with:

podman system connection default podman-machine-default

Setting up your Oracle Database Container

I had some additional issues here.  As I noted, the emulated amd64 machine is not very fast, and after I increased the SMP count to 4 I ran into another issue with the Java-based Oracle setup utilities that are executed during container creation.  If you try to load your Oracle Database container and it is hanging, adjust your SMP count down to 1 (one) and restart your machine.  Once the container is loaded and usable, you can shut down the machine and reset the SMP back to 4 (four) or whatever you had.  Starting the container after this is fine.

I could verify the hang by connecting to the OS of the container:

podman exec -it DB213 sh

I could see the java process running oracle.assistants.dbca.driver.DBConfigurator, and the log file showed no progress happening (under /opt/oracle/cfgtoollogs/dbca/ORCLCDB).
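If you want to check for the same hang on your own container, here is a quick sketch, assuming the container name DB213 and the default ORCLCDB paths from above (the exact log file name may differ):

podman exec DB213 sh -c 'ps -ef | grep -i dbca'
podman exec DB213 sh -c 'tail -n 20 /opt/oracle/cfgtoollogs/dbca/ORCLCDB/*.log'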

Steps to get an Oracle Database Container

  1. Log in to the website (container-registry.oracle.com): sign in with the button in the upper left corner, click on the repository (Database), then accept the license.
  2. From the terminal, login to the same repository with PODMAN (use same login you accepted the license with):
    podman login container-registry.oracle.com
  3. Pull the container of your choice into PODMAN (use the links on the website to select the specific version you want):
    podman pull container-registry.oracle.com/database/enterprise:21.3.0.0
  4. List the installed images, you may want to note the image ID if you want a specific version:
    podman image list
  5. Now build a container from the image (using ID, you can also use REPOSITORY)
    podman run -d --name DB213 container-registry.oracle.com/database/enterprise
    Note: DB213 is my container name; use whatever name works for you.

Note: please see my note above on SMP CPU count if you are having issues with your container not creating an Oracle database properly.

To get a shell on the container:

podman exec -it DB213 sh
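From there you can watch for the database to open and then connect with SQL*Plus.  A sketch, assuming the setPassword.sh helper that ships in Oracle’s database container images and the default ORCLCDB service (substitute your own password):

podman logs -f DB213                           # wait for the "DATABASE IS READY TO USE!" banner
podman exec DB213 ./setPassword.sh MyPassw0rd  # reset the SYS / SYSTEM passwords
podman exec -it DB213 sqlplus system/MyPassw0rd@ORCLCDB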

Additional tip

I was having some issues with the time zone for my demo database.  The containers will be UTC by default.  If you want to change this, you can set the parameter tz in the file:

$HOME/.config/containers/containers.conf


[containers]
# Set time zone in container. Takes IANA time zones as well as "local",
# which sets the time zone in the container to match the host machine.
#
tz = "local"
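Note: the tz setting applies to containers created after the change.  A quick way to verify it took effect on a new container:

podman exec DB213 date    # should now report your local time zone, not UTC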

Friday, February 5, 2021

Finding Your SQL Trace File in Oracle Database

So this question was asked of me a few months back in one of my presentations: how do I find a trace file from SQL Trace on my Oracle database?

Sorry for the delay, but here are my suggestions.  First you will need to know where your trace file will be located.  By default it will be in the trace directory under the diagnostic destination, so make sure you know where your diagnostic destination is located.

SQL> show parameter diag
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
diagnostic_dest                      string      /u01/app/oracle

There are basically four main ways to find your trace file.

Option 0 - Just ask

Well, I learned something new, so I wanted to update this article.  There is a view you can directly query to see your current trace file name.

SELECT value 
      FROM v$diag_info 
      WHERE name = 'Default Trace File';

This will show the correct trace file name even after you update your TRACEFILE_IDENTIFIER.  So I would still recommend setting that as outlined in Option 1.

Option 1 - Manually set the trace file name

For this option, we set a session level parameter to drive the name of the trace file.

ALTER SESSION SET tracefile_identifier = 'SQLID_fua0hb5hfst77';
Note: if you receive an error turning on tracing, make sure the user has been granted the ALTER SESSION privilege.

Next we enable tracing:
EXEC DBMS_SESSION.SET_SQL_TRACE(sql_trace => true);
Run the SQL statement you want to trace:
SELECT sum(t1.c), sum(t2.c)
  FROM t1, t2
  WHERE t1.a = t2.a
   AND t1.d = 10;
Then disable the tracing:
EXEC DBMS_SESSION.SET_SQL_TRACE(sql_trace => false);
Here we can now see the trace file, which contains the identifier we gave above:
cd /u01/app/oracle/diag/rdbms/t1db/t1db/trace/

[oracle@srvr03 trace]$ ls -ltrh *fua0*
-rw-r----- 1 oracle oinstall 4.3K Feb  5 08:29 t1db_ora_4978_SQLID_fua0hb5hfst77.trc
-rw-r----- 1 oracle oinstall 2.0K Feb  5 08:29 t1db_ora_4978_SQLID_fua0hb5hfst77.trm
If you are trying to do extended SQL or CBO tracing, you can also use different options for doing the SQL trace.  If the trace is going to be large, you should also check the size of the maximum dump file.  This can also be adjusted at the session level.
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1'; -- CBO tracing
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12'; -- Additional SQL trace information

SQL> show parameter max_dump_file_size;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
max_dump_file_size                   string      unlimited
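If you are worried about a large trace being truncated, you can raise the limit for just your tracing session before enabling the events; for example:

ALTER SESSION SET max_dump_file_size = unlimited;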

Option 2 - Get your session diagnostic information


You can query the name of the current diagnostic trace file for the session.
set linesize 100
column name format a20
column value format a70
select name, value from v$diag_info where name = 'Default Trace File';

NAME                 VALUE
-------------------- ----------------------------------------------------------------------
Default Trace File   /u01/app/oracle/diag/rdbms/t1db/t1db/trace/t1db_ora_31523.trc
Now repeat the process of turning on trace, running your SQL, and turning trace back off.

Option 3 - Searching for the trace file


You can search for the trace file at the OS level.  This will probably give you a list of files, and you will need to verify which one contains the trace in question.  If you get the SID of the session, you can include that in the file search.

SQL> select userenv('SID') from dual;
USERENV('SID')
--------------
            19
Then search the trace directory (based on the diagnostic destination setting of your database):
find /u01/app/oracle/diag/rdbms/t1db/t1db/trace -name t1db_ora\*.trc -mtime 0 -exec grep -l "*** SESSION ID:(19." {} \;
/u01/app/oracle/diag/rdbms/t1db/t1db/trace/t1db_ora_26655.trc
/u01/app/oracle/diag/rdbms/t1db/t1db/trace/t1db_ora_4978_SQLID_fua0hb5hfst77_.trc
/u01/app/oracle/diag/rdbms/t1db/t1db/trace/t1db_ora_31523.trc
/u01/app/oracle/diag/rdbms/t1db/t1db/trace/t1db_ora_14396.trc
Note the -mtime option for find is looking for files that were updated in the past 24 hours.  If you want to look for older files, adjust the number after -mtime; e.g., +1 will look for files modified more than 48 hours ago.

There are other ways, but these should be some of the easiest.

Gary

Friday, September 4, 2020

Oracle 19c Database Upgrade - Performance testing options

You may be able to relate to this: upgrading Oracle databases can be a real drag.

Databases tend to be at the heart of critical business processes, and you may have spent a lot of time getting things into perfect working order.  Upgrading your Oracle database from 11g to 19c might be very frightening due to the risk of things changing and the loss of all the work already done.  This is normally where testing comes into play.  What are your options for testing performance changes when moving from Oracle 11g to 19c?

There are basically three options:

  • Oracle Real Application Testing or RAT
  • Simulated application workload either manually or with tools like LoadRunner, NeoLoad, or a number of web specific tools
  • Look at individual SQL either manually or through tools like SQL Tuning Sets in Oracle Tuning Pack

First let’s put a few rules in place when it comes to testing.  For a test to be valid, you need to use an environment that is as close as possible to the target production system you will be moving to.  That means if you are moving to new servers or changing hardware / storage during the upgrade, make sure you're running tests on that new hardware or storage.  Oracle 19c has many new features that take advantage of new hardware.  Testing 19c on an 8-year-old piece of hardware that was running 11g is not the same as testing 19c on a brand-new server.  The optimizer in Oracle is very aware of the underlying infrastructure and will not act the same, so you would not really be testing the upgrade.


Let’s give a quick description of each of these options that might take some of the stress out of upgrading your Oracle database.

Oracle Real Application Testing or RAT

Oracle makes a very specific database tool for application workload testing: Real Application Testing, or RAT.  Acronym aside, the tool is pretty interesting.  RAT captures SQL statements from your production database; you can then clone the database to another environment and re-run all the SQL.  This is one of the most complete ways to test just the database component of your application.  There are a couple of downsides to this tool.  First, it only tests the database component.  If you are just upgrading the database software, changing the database hardware, or only adjusting database settings, then this tool is great.  If you are making application changes or updating the database schema at the same time, then this tool will not work, or at minimum will not be able to test those changes.  Also, RAT is an additional-cost license.


RAT will run through all the application SQL statements in order, but not necessarily in the same amount of time.  E.g., if you capture 8 hours of production workload, the replay will probably take longer than 8 hours on a similarly sized test system, even if 19c is faster at individual SQL statements.  Also, RAT cannot simulate everything, such as Data Guard and GoldenGate.  These replication tools do cause some overhead, and RAT will not simulate that, though most of that overhead can be very small.


High-level steps for a RAT replay (a sketch of the corresponding API calls follows the list):

  1. Capture the workload on your production system, be sure to include AWR data
  2. Clone the production database to test area based on the start time of your capture
  3. Upgrade the cloned test system to 19c and apply any database configuration changes
  4. Transfer the capture files from production to test, and process the capture files
  5. Run the database replay on test
  6. Load your AWR data from production and run comparison reports between production and test
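For reference, the capture and replay are driven by the DBMS_WORKLOAD_CAPTURE and DBMS_WORKLOAD_REPLAY packages.  A bare-bones sketch; the directory object RAT_DIR, the names, and the 8-hour duration are illustrative only, and replay clients must also be started with the wrc utility:

-- On production: capture the workload into a directory object
BEGIN
  DBMS_WORKLOAD_CAPTURE.START_CAPTURE(name => 'UPG_CAPTURE', dir => 'RAT_DIR', duration => 28800);
END;
/
BEGIN
  DBMS_WORKLOAD_CAPTURE.FINISH_CAPTURE;   -- or let the duration expire on its own
END;
/
-- On the upgraded test system: process the capture, then initialize and run the replay
BEGIN
  DBMS_WORKLOAD_REPLAY.PROCESS_CAPTURE(capture_dir => 'RAT_DIR');
  DBMS_WORKLOAD_REPLAY.INITIALIZE_REPLAY(replay_name => 'UPG_REPLAY', replay_dir => 'RAT_DIR');
  DBMS_WORKLOAD_REPLAY.PREPARE_REPLAY(synchronization => 'SCN');
END;
/
-- start one or more replay clients from the OS (wrc mode=replay), then:
BEGIN
  DBMS_WORKLOAD_REPLAY.START_REPLAY;
END;
/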

The great thing about RAT is you get a very detailed report of database effort and work.  You can see exactly what the 19c and possibly new hardware did differently at the database level. 


The downside to RAT is you cannot change the application schema, other than adding indexes or updating statistics.  Overall this is a great way to test your database upgrade, and you can roll back your test database and re-run the RAT testing multiple times.  This allows for iterative changes to the database configuration.

Simulating Application Workload

Another option is to do application testing.  In this scenario you will have a full test system that includes database and application tiers and a full setup of application software.  The basic idea is to set up a test environment that can be logged into, and then create transactions or usage on the application.


There can be a lot of complexity in setting up another full environment with all the application components.  It can also be difficult to replicate enough hardware to achieve production-level performance or load.  A second challenge can be the actual testing.  It might be easy to have a user log in to the test environment and run a few application transactions.  This might give you a feel for some individual performance, but it really can’t compare to having many users.  A system with a single user accessing it will react very differently than a system with 100s or even 1,000s of active users.  In order to simulate those levels of workload you will need to use tools like LoadRunner or others that simulate end users running transactions through the application GUI or API.  Creating tests that simulate many active users can also be extremely complex and fall short of real workload patterns.


The high-level steps for application testing are very dependent on your testing process and environment.  You may have a test system with a full technology stack and redundant application servers, load balancers, web servers, and databases.  Or you may have a very simple setup just to perform basic smoke testing.


The plus side of this is that you can test both application changes and the database upgrade at the same time.  Though you may have a hard time during testing teasing out what is due to an application change versus a database upgrade change.  So that might extend your tuning effort, though it can decrease the amount of testing work.

Manually running SQL

A third option is to just manually run some SQL statements in your test database.  This can be done by capturing the SQL from the production system using the SQL history in ASH or AWR, or with a SQL Tuning Set (STS) if you are licensed.  SQL can also be captured using SQL or session tracing.  These statements can then be re-run in a test environment to see how they perform after the database has been modified or upgraded.


As you might already be guessing, this is very complex.  Finding the statements you want to re-run, possibly updating them, and then making sure the environment is set correctly for the SQL to work in the test environment can all take real effort.  If you have no other avenue for testing, this can allow you to pick out specific long-running or high-impact SQLs.  This is really a very minimal test, but it may show how the software changes affected the SQL performance profile.


The high-level steps for SQL tuning vary a little based on whether you are using a tool like SQL Tuning Sets or manually capturing statements.  Either way it looks something like this:

  • Identify and capture the specific SQL in production (be sure to capture explain plans and run time statistics)
  • Have a test environment that is similar to production in data volume and configuration (or clone from production)
  • Upgrade the test system to 19c
  • Transfer the SQL statements to test (text files, STS, etc.)
  • Run the statements in test and compare explain plans and run time statistics

This can be a little bit time consuming without a tool like STS, which will do most of this for you in a few steps, or even through tools like Oracle Enterprise Manager (OEM).
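As a sketch of the STS route (a Tuning Pack license is required; the set name, snapshot range, and schema filter here are made up for illustration):

-- On production: create a tuning set and load it from an AWR snapshot range
BEGIN
  DBMS_SQLTUNE.CREATE_SQLSET(sqlset_name => 'UPG_STS', description => '11g workload for 19c test');
END;
/
DECLARE
  cur DBMS_SQLTUNE.SQLSET_CURSOR;
BEGIN
  OPEN cur FOR
    SELECT VALUE(p)
      FROM TABLE(DBMS_SQLTUNE.SELECT_WORKLOAD_REPOSITORY(
                   begin_snap   => 100,
                   end_snap     => 110,
                   basic_filter => 'parsing_schema_name = ''APP''')) p;
  DBMS_SQLTUNE.LOAD_SQLSET(sqlset_name => 'UPG_STS', populate_cursor => cur);
END;
/

DBMS_SQLTUNE.CREATE_STGTAB_SQLSET and PACK_STGTAB_SQLSET can then stage the set into a table that you move to the test system with Data Pump.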


The plus side of running individual SQL statements is the ability to look at them in depth.  You can get very detailed performance information by running traces or SQLT explain plans.  The negative side is that you won’t be able to get a multi-user view of the system.

Summary

All of this comes down to simulating the workload of the application in an upgraded database environment.  The goal is to reduce the anxiety or risk of doing your upgrade.  To be clear, none of this will fully eliminate that risk, though all of it will make you more prepared for that go-live date.  No matter which path you choose, you will get more hands-on experience with the 19c upgrade process and see how your application performs differently, or the same, on a 19c database.

Thursday, July 30, 2020

Keeping your Oracle Database Healthy

I recently gave a virtual presentation about database patching and managing risk.  One of the key takeaways from the session was keeping your database healthy.  Although patching is part of keeping things healthy, there is a lot more to it.

One key is being proactive about health checks.  You should always do some basic database health checks during every patch or upgrade. In fact, you may also want to proactively perform some database health checks weekly or monthly, since many things can happen outside of DB patching.

I've personally been in many patching exercises where we planned a specific amount of time, having tested the patch in non-production and feeling confident things would go well, only to find out that something else in the database was not healthy.  Something only in production: some code is invalid, or some system issue is going on.  The patching window is now twice as long as we fix the problem we didn't know about while still trying to patch.

Oracle provides many great tools to help: ORAchk, EXAchk, and the Autonomous Health Framework (AHF).

See note: Autonomous Health Framework (AHF) - Including TFA and ORAchk/EXAChk (Doc ID 2550798.1)

Running these tools proactively and reviewing the results on a regular basis is critical to a healthy system.  Always do some basic database health checks on your database:
  • Every patch or upgrade
  • Proactively perform weekly or monthly
  • Before and after a patch (compare)
Notice this is not just during patching; many things can happen outside of DB patching to corrupt your system!
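For example, a simple routine might look like this (a sketch; orachk ships with AHF, and profile names and report locations can vary by version):

orachk                  # full health check; produces an HTML report to review
orachk -profile dba     # narrower run with just the DBA-focused checks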

Knowing there is an issue will help you plan your work and make sure your outages have the time you need, and that everyone is clear on the change that is needed.  The fewer surprises there are in the work you do, the more the business and management will trust what you are doing in the future.

There are other tools that Oracle provides outside of AHF.  Here is a list of some more Oracle-provided scripts and technical notes on things you can check in your database to show it is healthy.  This does not guarantee that the application that uses the database is healthy; this is more about the technical aspects of the database itself.
  • FAQ: Database Upgrade and Migration (Doc ID 1352987.1)
  • Autonomous Health Framework (AHF) - Including TFA and ORAchk/EXAChk (Doc ID 2550798.1)
  • Script to Collect DB Upgrade / Migrate Diagnostic Information (dbupgdiag.sql) (Doc ID 556610.1)
  • hcheck.sql Script to Check for Known Problems in Oracle8i, Oracle9i, Oracle10g, Oracle 11g and Oracle 12c (Doc ID 136697.1)
  • Assistant: Download Reference for Oracle Database/GI Update, Revision, PSU, SPU(CPU), Bundle Patches, Patchsets and Base Releases (Doc ID 2118136.2)
  • Best Practices for running catalog, catproc and utlrp script (Doc ID 863312.1)
  • Debug and Validate Invalid Objects (Doc ID 300056.1)
  • Where Can I Find the Parallel Version of Utlrp.sql? (Doc ID 230136.1)
  • Master Note: Troubleshooting Oracle Data Dictionary (Doc ID 1506140.1)
  • Overview of Refreshing Materialized Views (Doc ID 549874.1)
  • How to Monitor the Progress of a Materialized View Refresh (MVIEW) (Doc ID 258021.1)
  • Script to Check the Status or State of the JVM within the Database (Doc ID 456949.1)
Hopefully these notes point you in the right direction on how to keep your database healthy during patching activities, and even when you are not patching.


Wednesday, June 17, 2020

Oracle ACFS / ASM Filter Driver and Oracle Enterprise Linux

In the world of Oracle databases, DBAs and system administrators don’t always see eye to eye.  Mostly I have found that this is really just about coming at challenges from different angles.  One of the major changes in Oracle since the 10g release has been the introduction of Oracle Automatic Storage Management, or ASM.  This has led to many features such as ASMLIB, ASM Clustered Filesystem (ACFS), and now the ASM Filter Driver (AFD or ASMFD).  These features require OS-specific compatible libraries and kernel modules, and information on compatibility can be confusing.

Background References

Let’s start with some background.  For this blog, I’m going to focus on Oracle Enterprise Linux (OEL).  The information here could be assembled for any supported OS.  This information is from a few sources, mainly My Oracle Support notes.
Before making any major patching or upgrade decisions, please reference these notes and Oracle support in general.

Oracle does a pretty good job of providing information on Linux and Oracle database / ASM compatibility.  Though the notes are not always clear as to what the goal is.  For example, I have heard many people say that ASM has stopped working during a Linux upgrade, or that ASM drivers are not compatible with Linux changes, or that you can’t patch Linux without breaking ASM / AFD / ACFS.  Well, these are anecdotal statements, and the truth is a little more complex.

Linux Kernel Version

Oracle Enterprise Linux (OEL) recommends using the Unbreakable Enterprise Kernel, or UEK.  The main reason behind this is that Oracle controls the destiny of the UEK.  As noted above, ACFS and AFD are kernel dependent, so having the ability to set the course for both the Linux kernel and the ASM-related kernel modules is important.  For the scope of this discussion, let’s look at OEL releases 5 through 8 and Oracle Database releases 11gR2 through 19c.  First, here is a high-level view of OEL and UEK releases.
Support Dates         OEL Version
Jun 2007 - Jun 2017   OEL5
Feb 2011 - Mar 2021   OEL6
Jul 2014 - Jul 2024   OEL7
Jul 2019 - Jul 2029   OEL8

UEK YUM Channel     UEKR1     UEKR2      UEKR3      UEKR4      UEKR5      UEKR6
Major UEK Version   2.6.32    2.6.39     3.8.13     4.1.12     4.14.35    5.4.17
Released            -         Mar 2012   Oct 2013   Jan 2016   Jun 2018   Mar 2020

So this is a nice swath of about 10 years of versions.  Please note that the major version number of the kernel does not change if you don’t change UEK channels.  For example, if you move from OEL6 Update 5 to Update 8 but stick with UEKR3, your kernel major number will remain 3.8.13.  The minor numbers beyond the .13 will change, but not the major number.  This is also true for OEL7 and UEKR5, where the major release number is 4.14.35.  Within this major version, Oracle releases new security fixes and patch updates monthly or quarterly for each kernel version that is within the support dates.  For example, OEL6 UEKR2, which was first released in March of 2012, just released kernel UEK 2.6.39-400.321.1 in April of 2020.  That is a long history of fixing the same set of code.

ACFS / AFD Linux Kernel Compatibility

Now let's move over to the database support for ACFS / AFD on OEL.  Here is another summary now listing Oracle database versions.
UEK YUM Channel            UEKR1        UEKR2        UEKR3        UEKR4            UEKR5          UEKR6
Major UEK Version          2.6.32       2.6.39       3.8.13       4.1.12           4.14.35        5.4.17
Released                   -            Mar 2012     Oct 2013     Jan 2016         Jun 2018       Mar 2020

DB Version  OEL Versions   UEKR1        UEKR2        UEKR3        UEKR4            UEKR5          UEKR6
11.2.0.3    OEL5 - OEL6    2.6.32-100   2.6.39-100   -            -                -              -
11.2.0.4    OEL5 - OEL6    2.6.32-100   2.6.39-100   3.8.13-118   4.1.12-112.16.4  -              -
12.1.0.1    OEL5 - OEL6    2.6.32-100   2.6.39-100   -            -                -              -
12.1.0.2    OEL6 - OEL7    -            2.6.39-100   3.8.13-118   -                -              -
12.2.0.1    OEL6 - OEL7    -            2.6.39-100   3.8.13-118   4.1.12-112.16.4  4.14.35-1902   -
18c         OEL6 - OEL7    -            2.6.39-100   3.8.13-118   4.1.12-112.16.4  4.14.35-1902   TBD
19c         OEL7           -            -            -            4.1.12-112.16.4  4.14.35-1902   TBD

Here again I’ve covered about 10 years, and you can see clearly which high-level database versions are compatible with which OEL and UEK versions.  You might also see Oracle’s commitment to long-term support for 11.2.0.4, similar to the expected support for 19c going until 2024.  Also note that UEKR6 is not yet supported.

Now, the information above is still high level.  You may assume that every 19c version of Oracle supports all UEK4 and UEK5 kernels out of the box.  That is not true.  Database releases and kernel releases are not specifically synchronized.  So perhaps you have tried to test whether the ACFS drivers can be installed:
$ acfsdriverstate supported -v
ACFS-9459: ADVM/ACFS is not supported on this OS version: '4.14.35-1902.10.8.el7uek.x86_64'
ACFS-9201: Not Supported

ACFS-9553: Operating System: linux
ACFS-9554: Machine Architecture: x86_64-linux-thread-multi
ACFS-9555: Operating system name and information: Linux srvr03 4.14.35-1902.10.8.el7uek.x86_64 #2 SMP Thu Feb 6 11:02:28 PST 2020 x86_64 x86_64 x86_64 GNU/Linux
ACFS-9556: Release package: oraclelinux-release-7.7-1.0.5.el7.x86_64
ACFS-9557: Version: ADVM/ACFS is not supported on 4.14.35-1902.10.8.el7uek.x86_64
ACFS-9558: Variable _ORA_USM_NOT_SUPPORTED is defined: no


ASM Required Patches

If your combination of grid software and Linux kernel is not supported, you will see the above message “ACFS-9201: Not Supported”.  How can you fix this?  Oracle provides patches at specific versions of GI to support UEK versions.
DB Version       OS Version   UEK Channel  UEK Minimum      Base bug or Base version
11.2.0.3.6       OEL5 - OEL6  UEKR1        2.6.32-100       15986571
11.2.0.3.7       OEL5 - OEL6  UEKR1        2.6.32-100       12983005
11.2.0.4         OEL5         UEKR1        2.6.32-100       Base
11.2.0.4         OEL5 - OEL6  UEKR2        2.6.39-100       Base
11.2.0.4.4       OEL6         UEKR3        3.8.13-118       16318126
11.2.0.4.6       OEL7         UEKR3        3.8.13-35        18321597
11.2.0.4.180717  OEL6 - OEL7  UEKR4        4.1.12-112.16.4  22810422, 28171094, 27463879
12.1.0.1         OEL5 - OEL6  UEKR1        2.6.32-100       Base
12.1.0.1         OEL5 - OEL6  UEKR2        2.6.39-100       Base
12.1.0.2.170718  OEL5 - OEL6  UEKR2        2.6.39-100       Base
12.1.0.2.170718  OEL6         UEKR3        3.8.13-13        Base
12.1.0.2.170718  OEL7         UEKR3        3.8.13-35        18321597
12.1.0.2.181016  OEL6 - OEL7  UEKR4        4.1.12-112.16.4  22810422, 27942938, 28171094, 27942938
12.1.0.2.190716  OEL7         UEKR5        4.14.35-1902     27494830
12.2.0.1         OEL6         UEKR2        2.6.39-100       Base
12.2.0.1         OEL6         UEKR3        3.8.13-13        Base
12.2.0.1.180717  OEL6         UEKR4        4.1.12-112.16.4  27463879, 28171094
12.2.0.1         OEL7         UEKR3        3.8.13-35        Base
12.2.0.1.180717  OEL7         UEKR4        4.1.12-112.16.4  27463879, 28171094, 27463879
12.2.0.181016    OEL7         UEKR5        4.14.35-1902     28069955
18.3.0.0         OEL6         UEKR2        2.6.39-100       Base
18.3.0.0         OEL6         UEKR3        3.8.13-13        Base
18.4.0.0.181016  OEL6         UEKR4        4.1.12-112.16.4  28069955
18.3.0.0         OEL7         UEKR3        3.8.13-35        Base
18.4.0.0.181016  OEL7         UEKR4        4.1.12-112.16.4  27463879, 28171094, 27463879
18.4.0.0.181016  OEL7         UEKR5        4.14.35-1902     27494830
19.3.0.0         OEL7         UEKR4        4.1.12-112.16.4  Base
19.4.190716      OEL7         UEKR5        4.14.35-1902     27494830

The example above on my test system is a fresh install of GI 19.3.0.0 on an OEL 7.8 system using UEKR5.  Based on the list above, we can tell that for UEKR5 support I need either 19.4.190716 or patch 27494830.  Oracle provides many patches whose title contains “ACFS RU” that you can search for in the patches section of MOS.  I decided to use the patch “MERGE ON ACFS RU 19.6.0.0.0 OF 28531803 30685278 (Patch 31055785)”, which includes the following merge of patches:
Patch  31055785     : applied on Fri May 22 14:15:34 CDT 2020
Unique Patch ID:  23448796
Patch description:  "ACFS Interim patch for 31055785"
   Created on 20 Mar 2020, 03:53:18 hrs PST8PDT
   Bugs fixed:
     28531803, 30685278, 27494830, 27917085, 28064731, 28293236, 28321248
     28375150, 28553487, 28611527, 28687713, 28701011, 28740425, 28818513
     28825772, 28844788, 28855761, 28860451, 28900212, 28951588, 28960047
     28995524, 29001307, 29030106, 29031452, 29039918, 29115917, 29116692
     29127489, 29127491, 29167352, 29173957, 29198743, 29229120, 29234059
     29250565, 29264772, 29302070, 29313039, 29318169, 29338628, 29339084
     29350729, 29363565, 29379932, 29391373, 29411007, 29417321, 29428155
     29437701, 29482354, 29484738, 29520544, 29524859, 29527221, 29551699
     29560075, 29586338, 29590813, 29604082, 29643818, 29704429, 29705711
     29721120, 29760083, 29779338, 29791186, 29848987, 29851205, 29862693
     29872187, 29893148, 29929003, 29929061, 29937236, 29941227, 29963428
     30003321, 30032562, 30046061, 30051637, 30057972, 30076951, 30093449
     30140896, 30179531, 30239575, 30251503, 30264950, 30269395, 30275174
     30324590, 30363621, 30655657
     
You can see this includes the required patch 27494830 (first row, third column of the bugs-fixed list).  Depending on your version of Oracle and quarterly bundle patch, there are many patch options available.  The quarterly bundle patches for GI generally contain an updated ACFS patch as well.
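To check whether a given fix is already in your GI home before going patch hunting, you can ask the OPatch inventory; a quick sketch, run as the grid software owner:

$GRID_HOME/OPatch/opatch lspatches
$GRID_HOME/OPatch/opatch lsinventory -bugs_fixed | grep 27494830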


Details of the ASM Kernel Modules 

In 19c, under the GRID_HOME is a directory called usm. On my test system under here I can find this:
$ ls $GRID_HOME/usm/install/Oracle/EL7UEK/x86_64
4.1.12/  4.1.12-112.16.4/  4.14.35-1902/
This is the location of the libraries used to compile the kernel modules for ACFS / AFD.  As expected, there is a directory for each major kernel version that pretty much matches our list above.  The command “asmcmd afd_configure -e” is used to install the AFD kernel modules.  The AFD modules use the kernel Application Binary Interface and are installed under the /lib/modules location on your Linux system.  As they are considered weak modules, they are placed in one location and then symbolically linked into each kernel-specific directory when the kernel is updated.
[root@srvr03 sbin]# ls -l /lib/modules/4.14.35-1902*/{extra,weak-updates}/oracle
/lib/modules/4.14.35-1902.0.9.el7uek.x86_64/extra/oracle:
total 9824
-rw-r--r-- 1 root root 10055682 Jun  5 13:16 oracleafd.ko

/lib/modules/4.14.35-1902.10.8.el7uek.x86_64/weak-updates/oracle:
total 0
lrwxrwxrwx 1 root root 69 Jun  5 13:17 oracleafd.ko -> /lib/modules/4.14.35-1902.0.9.el7uek.x86_64/extra/oracle/oracleafd.ko

/lib/modules/4.14.35-1902.11.3.1.el7uek.x86_64/weak-updates/oracle:
total 0
lrwxrwxrwx 1 root root 69 Jun  5 13:44 oracleafd.ko -> /lib/modules/4.14.35-1902.0.9.el7uek.x86_64/extra/oracle/oracleafd.ko

Here we can see that the primary module was installed under 4.14.35-1902.0.9.  Then when the kernel 4.14.35-1902.11.3.1 was installed, a symbolic link was created to the original module.

The AFD module is normally loaded by a Linux RC script called afd (/etc/init.d/afd).  You can verify it will be started at boot with the following command as the root user:
[root@srvr03 init.d]# chkconfig --list afd

Note: This output shows SysV services only and does not include native
      systemd services. SysV configuration data might be overridden by native
      systemd configuration.

      If you want to list systemd services use 'systemctl list-unit-files'.
      To see services enabled on particular target use
      'systemctl list-dependencies [target]'.

afd            	0:off	1:off	2:off	3:on	4:off	5:on	6:off

Here you can see that the AFD module will attempt to be loaded at run levels 3 and 5.  You can also check which version of the AFD module you have loaded with the following Linux command:

[oracle@srvr03 ~]$ modinfo oracleafd
filename:       /lib/modules/4.14.35-1902.11.3.1.el7uek.x86_64/weak-updates/oracle/oracleafd.ko
license:        Oracle Corporation
description:    ASM Filter Driver
author:         Oracle Corporation
srcversion:     533BB7E5866E52F63B9ACCB
depends:        
retpoline:      Y
name:           oracleafd
vermagic:       4.14.35-1902.0.9.el7uek.x86_64 SMP mod_unload modversions 
signat:         PKCS#7
signer:         
sig_key:        
sig_hashalgo:   md4
parm:           oracleafd_use_logical_block_size:Non-zero value implies using logical sectorsize (int)
parm:           oracleafd_use_maxio_softlimits:Non-zero value implies using conservative maximum block io limits (int)

Use the lsmod command to see if it is loaded.  If this command returns no lines, then it is not loaded.

[oracle@srvr03 ~]$ lsmod |grep oracleafd
oracleafd             229376  0 
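You can also ask ASM itself about the filter driver state; a sketch using asmcmd from the grid environment, where the output should look something like this:

[oracle@srvr03 ~]$ asmcmd afd_state
ASMCMD-9526: The AFD state is 'LOADED' and filtering is 'ENABLED' on host 'srvr03'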


Conclusion

OK, so hopefully this has given you a clear picture of what you need to plan for when using the ASM Filter Driver (AFD) or ACFS on your OEL system with UEK.  Once you have a base version of the kernel and GI, you should have few issues unless you jump UEK channels or do a GI major version change.  As many people are eyeing 19c upgrades, hopefully the above information helps you prepare your Linux OS for the change.  For myself, what started as a somewhat simple post about kernel drivers turned into a pretty long delve into the workings of the ASM Filter Driver.  Please leave your comments and questions.