Production-grade Oracle DBA insights from a practising ACE Apprentice. Oracle 23ai, Exadata, AWR, GoldenGate, OCI — the hard problems, solved in writing.
Specialising in Oracle 23ai, RAC, Exadata, OCI, and core banking performance engineering. Oracle ACE Apprentice based in Riyadh.
Four migration methods compared (Data Guard, RMAN, Data Pump, remote PDB clone), real workload pre-analysis, TDE keystore gotchas, RAC connection string traps, and a cutover checklist that actually held up on the night.
Root-cause diagnosis from hot block identification through GC parameter tuning and the application-level fix that actually worked in production.
Deep technical posts on Oracle performance, RAC, Exadata, and AI-assisted DBA workflows.
I've done enough database migrations to stop believing in "smooth" ones. There are migrations that go according to plan, and migrations that teach you something. This one did both.
Earlier this year we migrated several core databases from Oracle Base Database Service (DBCS) on OCI to Oracle Exadata Database Service on Dedicated Infrastructure (ExaCS). Same region, same VCN, but a completely different platform under the hood.
Three specific pain points drove the decision:
I/O latency. DBCS VM shapes use iSCSI block volumes for storage. It works, but it is not Exadata. Once you see latency numbers from Exadata's RDMA over Converged Ethernet storage fabric, it's hard to go back.
Smart Scan. Our batch reporting jobs are heavy full-scan workloads. On DBCS, there is no storage offloading; every byte goes through the standard I/O path. On ExaCS, those same queries push predicate filtering down to the storage cells.
Infrastructure flexibility. ExaCS allows multiple VM Clusters on the same dedicated Exadata infrastructure, each with scalable OCPUs and storage.
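Once the ExaCS side was live, the Smart Scan claim was easy to verify from the offload statistics. A quick sanity check (a sketch; run it against your own workload window):

```sql
-- How much physical I/O was eligible for predicate offload, and how
-- much actually came back from the storage cells via Smart Scan
SELECT name,
       ROUND(value / 1024 / 1024 / 1024, 2) AS gb
FROM   v$sysstat
WHERE  name IN (
         'cell physical IO bytes eligible for predicate offload',
         'cell physical IO interconnect bytes returned by smart scan');
```

If the second figure is a small fraction of the first, the cells are doing the filtering for you.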
Before touching a single OCI resource, I ran a thorough workload analysis on the source DBCS environment:
-- Sum the per-snapshot delta columns per sql_id over the last 14 days;
-- the *_total columns accumulate across snapshots and would double-count
SELECT sql_id,
       ROUND(SUM(elapsed_time_delta) / 1000000, 2) AS elapsed_sec,
       SUM(executions_delta) AS executions,
       ROUND(SUM(elapsed_time_delta) /
             NULLIF(SUM(executions_delta), 0) / 1000000, 4) AS avg_elapsed_sec
FROM dba_hist_sqlstat
WHERE snap_id IN (
        SELECT snap_id FROM dba_hist_snapshot
        WHERE begin_interval_time > SYSDATE - 14)
GROUP BY sql_id
ORDER BY elapsed_sec DESC
FETCH FIRST 25 ROWS ONLY;
I also ran a feature usage check to avoid licensing surprises:
SELECT name, detected_usages, currently_used
FROM dba_feature_usage_statistics
WHERE currently_used = 'TRUE'
ORDER BY detected_usages DESC;
We had four realistic options. We didn't use the same method for every database.
My preferred method for large, critical databases. Build ExaCS as a physical standby of the source DBCS, let it sync fully, then perform a controlled switchover. Application downtime is limited to the switchover itself — typically 2–4 minutes on a healthy configuration.
-- Verify on SOURCE DBCS
SELECT log_mode, force_logging, db_unique_name
FROM gv$database;
-- Expect: LOG_MODE=ARCHIVELOG, FORCE_LOGGING=YES
-- Set Data Guard parameters
ALTER SYSTEM SET log_archive_dest_2 =
'SERVICE=exacs_standby ASYNC
VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
DB_UNIQUE_NAME=exacs_standby' SCOPE=BOTH;
-- Monitor lag
SELECT name, value, unit
FROM gv$dataguard_stats
WHERE name IN ('transport lag', 'apply lag', 'apply finish time');
-- Switchover: run on the DBCS primary
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY
WITH SESSION SHUTDOWN;
-- On ExaCS:
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY
WITH SESSION SHUTDOWN;
ALTER DATABASE OPEN;
Take a full RMAN backup from DBCS to OCI Object Storage, then restore onto the ExaCS target. Simpler to set up, but requires a downtime window proportional to database size. For databases under 500GB with an acceptable maintenance window, this is fast and reliable.
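The shape of the RMAN path, roughly (a sketch; it assumes the OCI Object Storage backup module is configured on both sides, and the tag is illustrative):

```sql
-- On SOURCE DBCS: full backup plus archivelogs to Object Storage
RMAN> BACKUP DEVICE TYPE SBT
        TAG 'EXACS_MIG'
        DATABASE PLUS ARCHIVELOG;

-- On TARGET ExaCS: restore and recover from the same bucket
RMAN> RESTORE DATABASE;
RMAN> RECOVER DATABASE;
RMAN> ALTER DATABASE OPEN RESETLOGS;
```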
Best for smaller schemas or logical migrations. Not practical for large transactional databases due to export/import overhead, but ideal for dev/test environments.
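A minimal expdp/impdp sketch (directory, schema, and file names are illustrative placeholders):

```
$ expdp system@dbcs_pdb SCHEMAS=app_owner \
    DIRECTORY=dump_dir DUMPFILE=app_owner_%U.dmp \
    PARALLEL=4 LOGFILE=exp_app_owner.log

$ impdp system@exacs_pdb SCHEMAS=app_owner \
    DIRECTORY=dump_dir DUMPFILE=app_owner_%U.dmp \
    PARALLEL=4 LOGFILE=imp_app_owner.log
```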
If both source DBCS and target ExaCS are running the CDB/PDB architecture, you can clone a PDB directly across a database link. We put the source PDB in READ ONLY mode for the clone; note that with local undo and ARCHIVELOG mode (12.2 onwards), a hot clone from a read-write source is also possible.
-- On TARGET ExaCS — create the DB link
CREATE DATABASE LINK dbcs_source_link
CONNECT TO clone_link_user IDENTIFIED BY "<password>"
USING '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)
(HOST=<source_scan>)(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=<cdb_service>)))';
-- Put source PDB in READ ONLY (brief outage starts here)
ALTER PLUGGABLE DATABASE pdb_prod CLOSE IMMEDIATE;
ALTER PLUGGABLE DATABASE pdb_prod OPEN READ ONLY;
-- Execute remote clone on TARGET
CREATE PLUGGABLE DATABASE pdb_prod_exacs
FROM pdb_prod@dbcs_source_link
FILE_NAME_CONVERT = ('+DATAC1', '+DATAX1')
STORAGE UNLIMITED TEMPFILE REUSE;
TDE is mandatory on DBCS. Make sure your TDE wallet password is known, documented, and synchronized between source and target before any cutover activity. It sounds obvious. It becomes non-obvious at 11pm.
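A pre-cutover sanity check worth scripting on both sides (a sketch):

```sql
-- Run on source and target; STATUS should be OPEN on both before cutover
SELECT wrl_type, status, wallet_type, con_id
FROM   v$encryption_wallet;
```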
RAC conversion happens automatically with Data Guard. But your application needs to connect via the SCAN address. Hunt down every hardcoded connection string before your cutover window.
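What a post-cutover connect descriptor should look like (host and service names are placeholders):

```
# tnsnames.ora: point at the ExaCS SCAN, never at an individual node
APP_PROD =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = <exacs-scan-dns-name>)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = <application_service_name>)
    )
  )
```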
Backup configuration doesn't transfer. Whatever automatic backup policy was running on your DBCS needs to be manually re-established on ExaCS post-migration.
We had a 2-hour window. Actual database activities took about 75 minutes. Two things saved us: we validated SCAN resolution from all application servers the week before the window, and we had a rehearsed rollback plan with the DBCS connection string kept live until we were satisfied.
After a week of hypercare monitoring, the results were clear. Peak-hour transaction processing time dropped noticeably. Batch reporting jobs finished hours earlier. Smart Scan offloading on full-scan queries is a genuinely different experience. Keep the source DBCS running for at least a full week after cutover.
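For the before/after numbers, an AWR period comparison is more defensible than anecdotes. A sketch (snap IDs and DBIDs are bind-variable placeholders; the retained DBCS history provides the baseline period):

```sql
-- Compare a pre-migration window with a post-migration window
SELECT output
FROM   TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_DIFF_REPORT_HTML(
         :dbcs_dbid,  1, :dbcs_begin_snap,  :dbcs_end_snap,
         :exacs_dbid, 1, :exacs_begin_snap, :exacs_end_snap));
```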
If you've spent any meaningful time managing Oracle RAC, you've developed a personal relationship with gc buffer busy acquired. Not a good relationship. The kind where you see the name in an AWR report and reach for your coffee.
We run a 2-node RAC cluster for a financial processing system with a very specific access pattern: lots of small, targeted transactions hitting a relatively small set of "hot" rows — account balance tables, transaction status tables.
For months it ran fine. Then we upgraded the application, and suddenly response times for certain transaction types doubled. Not crashed, not errored out — just doubled. Which, in financial services, is enough to get people very upset very quickly.
The top wait event was gc buffer busy acquired with an average wait time around 15–20ms. In RAC, this means your session is trying to get a buffer that another session is in the process of transferring between nodes.
SELECT inst_id, event, total_waits,
time_waited_micro / 1000000 AS time_waited_sec,
average_wait
FROM gv$system_event
WHERE event LIKE 'gc buffer busy%'
ORDER BY time_waited_micro DESC;
Node 1 was the aggressor — it was generating the hot block requests. Node 2 was mostly the victim.
SELECT owner, object_name, object_type,
SUM(CASE WHEN statistic_name = 'gc current blocks received'
THEN value ELSE 0 END) AS current_blocks_received,
SUM(CASE WHEN statistic_name = 'gc cr blocks received'
THEN value ELSE 0 END) AS cr_blocks_received
FROM gv$segment_statistics
WHERE statistic_name IN (
'gc current blocks received',
'gc cr blocks received')
GROUP BY owner, object_name, object_type
ORDER BY current_blocks_received DESC
FETCH FIRST 10 ROWS ONLY;
One table came back as a clear outlier — our account balance table. Several "hot" rows that get updated by almost every transaction were sitting in just a handful of blocks, and every node was fighting over them constantly.
The developer had made an "optimisation" — they changed a query to use an index range scan on a status column. The problem: this index access pattern was now hitting the same small set of "active status" rows repeatedly, concentrating I/O on very few blocks.
The old code was scattering the I/O slightly more — and that slight scatter was actually better in a RAC context because it reduced per-block contention.
Short term: We rebuilt the hot table's status index as a global hash-partitioned index with more partitions, spreading the contention across more blocks. (DBMS_STATS cannot change partitioning; that takes DDL.) Reverse key indexes were not suitable here because we still needed range scans.
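The short-term index change can be sketched like this (index, table, and column names are invented for illustration; including a selective trailing column in the hash key spreads entries with the same status across partitions):

```sql
-- Rebuild the hot status index as a global hash-partitioned index so
-- rows with the same status no longer pile into the same leaf blocks
CREATE INDEX app.idx_txn_status
  ON app.transactions (status, account_id)
  GLOBAL PARTITION BY HASH (status, account_id)
  PARTITIONS 16
  ONLINE;
```

Range scans on the status column still work; the optimizer simply probes each hash partition, which is an acceptable trade against per-block contention.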
The real fix: We added a sequence-generated "shard key" to the hot rows, and the application was updated to distribute updates across a set of N "slots" for the same logical account balance. It reduced gc buffer busy wait time by about 85% within the first day.
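Conceptually, the slot scheme looks like this (table, sequence, and column names are invented for illustration):

```sql
-- Writers: pick one of N slots per account, then update only that row,
-- so concurrent updates land in different blocks
SELECT MOD(balance_slot_seq.NEXTVAL, 8) AS slot FROM dual;  -- app keeps :slot

UPDATE account_balance_slots
SET    balance_delta = balance_delta + :amount
WHERE  account_id = :acct
AND    slot_id    = :slot;

-- Readers: the logical balance is the sum of the slots
SELECT SUM(balance_delta) AS balance
FROM   account_balance_slots
WHERE  account_id = :acct;
```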
Infrastructure side: We also reviewed gv$cluster_interconnects and found packet retransmits pointing to a firmware issue on one of the network cards. That got patched during the next maintenance window.
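The interconnect review started with a query like this (a sketch; the OS-level follow-up commands vary by platform):

```sql
-- Confirm which interfaces carry the private interconnect on each node
SELECT inst_id, name, ip_address, is_public, source
FROM   gv$cluster_interconnects;
-- Then check retransmit counters at the OS level on those interfaces,
-- e.g. netstat -s and ethtool -S <interface> on each node
```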
gc buffer busy acquired is not an "Oracle RAC problem." It's an application design meeting a shared-everything architecture problem. Before you tune the database, understand what the application is actually doing to those blocks. And when a developer tells you an "optimisation" is making things worse in production — they're usually not wrong.
Every seasoned DBA eventually faces the "Big Table" problem. In our case, it was a monolithic, non-partitioned table that had ballooned to 20 Terabytes. Full table scans were glacial, index rebuilds were an all-weekend affair, and storage costs were escalating rapidly.
We needed to implement a robust Data Lifecycle Management strategy — online, with minimal to zero downtime. Our solution: Range-Interval Partitioning, conversion to an Index-Organized Table (IOT), advanced LOB compression, and DBMS_REDEFINITION.
| Feature | Technical Benefit | Impact |
|---|---|---|
| Online Redefinition | Uses DBMS_REDEFINITION while original table stays fully accessible for DML | Zero Downtime — only brief lock at FINISH_REDEF_TABLE |
| Range-Interval Partitioning | Partitions monthly on CREATION_TIMESTAMP | Enables Partition Pruning — queries scan only a fraction of data |
| Advanced Compression | COMPRESS for historical, COMPRESS ADVANCED for older partitions | Significant storage cost reduction |
| LOB Optimization | SECUREFILE with COMPRESS HIGH and DEDUPLICATE | Can cut size dramatically for repetitive API payloads |
ALTER SESSION FORCE PARALLEL DML PARALLEL 32;
ALTER SESSION FORCE PARALLEL DDL PARALLEL 32;
ALTER SESSION SET DDL_LOCK_TIMEOUT=900;
BEGIN
DBMS_REDEFINITION.CAN_REDEF_TABLE(
uname => 'SCHEMA_OWNER',
tname => 'BIG_LOG_TABLE',
options_flag => DBMS_REDEFINITION.CONS_USE_ROWID
);
DBMS_OUTPUT.PUT_LINE('Table can be redefined.');
END;
/
CREATE TABLE SCHEMA_OWNER.BIG_LOG_TABLE_INT (
MESSAGE_ID VARCHAR2(50 BYTE),
CREATION_TIMESTAMP TIMESTAMP(6),
PAYLOAD_CLOB CLOB,
CONSTRAINT PK_BIG_LOG_TABLE_INT
PRIMARY KEY (CREATION_TIMESTAMP, MESSAGE_ID)
)
ORGANIZATION INDEX
COMPRESS 1
TABLESPACE TS_DATA_HOT
PARTITION BY RANGE (CREATION_TIMESTAMP)
INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
( PARTITION P_2026_01
VALUES LESS THAN (TIMESTAMP '2026-02-01 00:00:00')
TABLESPACE TS_DATA_HOT NOCOMPRESS )
LOB (PAYLOAD_CLOB) STORE AS SECUREFILE (
COMPRESS HIGH DEDUPLICATE CACHE LOGGING)
PARALLEL 32;
BEGIN
DBMS_REDEFINITION.START_REDEF_TABLE(
uname => 'SCHEMA_OWNER',
orig_table => 'BIG_LOG_TABLE',
int_table => 'BIG_LOG_TABLE_INT',
options_flag => DBMS_REDEFINITION.CONS_USE_ROWID
);
END;
/
DECLARE num_errors PLS_INTEGER;
BEGIN
DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS(
uname => 'SCHEMA_OWNER',
orig_table => 'BIG_LOG_TABLE',
int_table => 'BIG_LOG_TABLE_INT',
copy_indexes => DBMS_REDEFINITION.CONS_ORIG_PARAMS,
copy_triggers => TRUE,
copy_constraints => TRUE,
copy_privileges => TRUE,
num_errors => num_errors
);
END;
/
For a 20TB table, run this periodically while the bulk copy is running to minimize final synchronization time:
BEGIN
DBMS_REDEFINITION.SYNC_INTERIM_TABLE(
uname => 'SCHEMA_OWNER',
orig_table => 'BIG_LOG_TABLE',
int_table => 'BIG_LOG_TABLE_INT'
);
END;
/
This is the switchover — the only point of application downtime. On a 20TB table, this typically takes under 30 seconds:
BEGIN
DBMS_REDEFINITION.FINISH_REDEF_TABLE(
uname => 'SCHEMA_OWNER',
orig_table => 'BIG_LOG_TABLE',
int_table => 'BIG_LOG_TABLE_INT'
);
END;
/
-- The interim table now holds the old heap structure; drop it once verified
DROP TABLE SCHEMA_OWNER.BIG_LOG_TABLE_INT PURGE;
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(
ownname => 'SCHEMA_OWNER',
tabname => 'BIG_LOG_TABLE',
degree => 32,
cascade => TRUE,
granularity => 'ALL'
);
END;
/
Result: A 20TB monolithic table transformed into a monthly-partitioned, IOT-organized, LOB-compressed structure — entirely online, with only a brief exclusive lock during the final switchover step.
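A quick way to confirm the pruning benefit on the redefined table (a sketch):

```sql
-- A bounded predicate on the partition key should hit a single partition
EXPLAIN PLAN FOR
SELECT COUNT(*)
FROM   SCHEMA_OWNER.BIG_LOG_TABLE
WHERE  CREATION_TIMESTAMP >= TIMESTAMP '2026-01-01 00:00:00'
AND    CREATION_TIMESTAMP <  TIMESTAMP '2026-02-01 00:00:00';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
-- Look for PARTITION RANGE SINGLE and matching Pstart/Pstop values
```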
It has been a long time — years, actually — since I last posted here on OraDiscuss. Life, work, and everything in between took over. But I'm back, and I couldn't think of a better way to re-launch this journey than by sharing a major professional milestone: I've officially joined the Oracle ACE program as an ACE Apprentice!
Think of it as a "thank you" from Oracle to people who help others figure out their tech. It's not just about what you know; it's about how much you're willing to share.
I used to think you had to be a "wizard" with 30 years of experience to join, but the program is actually designed to support you at every stage of your career.
I'm sharing this because this blog — and you, the readers — are a huge part of why I want to give back. Whether you love writing, speaking, coding, or just answering questions on forums, there is a path for you in the ACE program.
My goal for OraDiscuss moving forward is to share what I'm learning as I learn it. No more waiting until I'm an "expert" to hit publish. We're going to grow together.
You can find my Oracle ACE profile at ace.oracle.com/ords/ace/oapt/profile/madarwish.
It feels good to be back. If you've been following me since the early days, or if you're just joining now, drop a comment on the original post. I'd love to catch up and hear what you've been working on!
It has been a long time since I last wrote on this blog, but here we are, back again! I needed a proper, accurate way to monitor the progress of running RMAN backups; the scripts below do exactly that.
This gives you an overview of the running backup including compression ratio and estimated completion time:
SELECT recid,
output_device_type,
dbsize_mbytes,
input_bytes/1024/1024 AS input_mbytes,
output_bytes/1024/1024 AS output_mbytes,
(output_bytes/NULLIF(input_bytes,0)*100) AS compression_pct,
(mbytes_processed/dbsize_mbytes*100) AS pct_complete,
TO_CHAR(start_time +
(SYSDATE - start_time) /
(mbytes_processed/dbsize_mbytes),
'DD-MON-YYYY HH24:MI:SS') AS est_complete
FROM v$rman_status rs,
(SELECT SUM(bytes)/1024/1024 dbsize_mbytes FROM v$datafile)
WHERE status = 'RUNNING'
AND output_device_type IS NOT NULL;
This shows all long-running operations including RMAN with percentage completion:
SELECT SID, SERIAL#, opname, SOFAR, TOTALWORK,
ROUND(SOFAR/TOTALWORK*100,2) AS pct_complete
FROM V$SESSION_LONGOPS
WHERE TOTALWORK != 0
AND SOFAR != TOTALWORK
ORDER BY 1;
Both scripts complement each other — use V$RMAN_STATUS for the high-level backup view and V$SESSION_LONGOPS for the detailed operation-by-operation progress, including index rebuilds, stats gathering, and other long-running DBA tasks.
Today I will simplify the Oracle GoldenGate configurations for the bidirectional path setup between homogeneous environments (Oracle to Oracle). This was implemented on Oracle Solaris 11.2 SPARC with Oracle Database 12.1.0.2.0 and Oracle GoldenGate 12c.
Both source and target must be in ARCHIVELOG mode with supplemental logging enabled:
-- Verify archivelog mode
SELECT log_mode FROM v$database;

-- Enable supplemental logging and forced logging
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
ALTER DATABASE FORCE LOGGING;
ALTER SYSTEM SWITCH LOGFILE;

-- Verify
SELECT force_logging, supplemental_log_data_min
FROM v$database;
-- Both should return YES
-- On SOURCE (Test1):
CREATE USER oggadm1 IDENTIFIED BY ****;
GRANT dba TO oggadm1;
EXEC DBMS_GOLDENGATE_AUTH.GRANT_ADMIN_PRIVILEGE(grantee=>'OGGADM1', privilege_type=>'capture', grant_select_privileges=>TRUE, do_grants=>TRUE);

-- On TARGET (Test2):
CREATE USER oggadm2 IDENTIFIED BY ****;
GRANT dba TO oggadm2;
-- Primary Extract EXTXP01 (in dirprm/EXTXP01.prm)
Extract EXTXP01
ExtTrail ./dirdat/aa
UserID oggadm1@TEST1, Password *****
TranLogOptions ExcludeUser oggadm1
Table schema.*;

-- Add in GGSCI:
GGSCI> Add Extract EXTXP01, TranLog, Begin Now
GGSCI> Add ExtTrail ./dirdat/aa, Extract EXTXP01
-- Secondary Extract EXTSE01 (in dirprm/EXTSE01.prm)
Extract EXTSE01
RmtHost oragg2, MgrPort 7810, Compress
RmtTrail ./dirdat/se
Passthru
Table schema.table_name;

-- Add in GGSCI:
GGSCI> Add Extract EXTSE01, ExtTrailSource ./dirdat/aa
GGSCI> Add RmtTrail ./dirdat/se, Extract EXTSE01
-- Replicat REPPR01 (in dirprm/REPPR01.prm)
Replicat REPPR01
UserID oggadm2@TEST2, Password *****
AssumeTargetDefs
DiscardFile dirrpt/oratabs.dsc, Append
Map schema_name.table_name, Target schema_name.table_name;
-- (AssumeTargetDefs is correct when source and target definitions are
--  identical; if they differ, use SourceDefs dirdef/oratabs.def INSTEAD
--  of AssumeTargetDefs -- the two parameters are mutually exclusive)

-- Add in GGSCI:
GGSCI> Add Replicat REPPR01, ExtTrail ./dirdat/se
-- Reverse path: Extract on TARGET
GGSCI> Add Extract EXTPR01, TranLog, Begin Now
GGSCI> Add ExtTrail ./dirdat/bb, Extract EXTPR01
GGSCI> Add Extract EXTSEC01, ExtTrailSource ./dirdat/bb
GGSCI> Add RmtTrail ./dirdat/ra, Extract EXTSEC01

-- Replicat on SOURCE:
GGSCI> Add Replicat REPSE01, ExtTrail ./dirdat/ra
-- On both Source and Target:
GGSCI> Start extract *
GGSCI> Start replicat *
GGSCI> Info all

-- Monitor reports:
GGSCI> Send extract EXTSE01, Report
GGSCI> View report EXTSE01
Note: The count of inserts/updates/deletes for the Replicat should match the Extract. Always tail ggserr.log during startup: tail -100f $OGG_HOME/ggserr.log
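To compare those counts side by side (a sketch):

```
-- Totals on the Extract side...
GGSCI> Stats extract EXTXP01, Totalsonly *

-- ...should line up with the Replicat side
GGSCI> Stats replicat REPPR01, Totalsonly *
```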
Recently, I was trying to install a single Oracle Database Enterprise Edition 11.2.0.4 with ASM. Everything completed successfully from the RAW device format through to the grid infrastructure and database software installation. But when I started to create the database using DBCA, I got this error:
Can not use ASM for database storage due to the following reason: Could not connect to ASM due to the following error: ORA-01017: invalid username/password; logon denied.
Everyone's first guess is a wrong password; that was my first thought too. But it turned out the ASMSNMP user simply had not been created.
-- Step 1: Recreate the password file (this did NOT fix it)
orapwd file=$ORACLE_HOME/dbs/orapw+ASM password=oracle entries=5
-- Step 2: Try connecting as ASMSNMP (fails)
sqlplus asmsnmp/oracle@+ASM as sysdba
-- Step 3: Check existing ASM users
$ asmcmd
ASMCMD> lspwusr
Username sysdba sysoper sysasm
SYS TRUE TRUE FALSE
-- Only SYS exists. ASMSNMP is missing.
-- Login to +ASM instance using SYSASM privilege
sqlplus / as sysasm
SQL> CREATE USER asmsnmp IDENTIFIED BY oracle;
User created.
SQL> GRANT sysdba TO asmsnmp;
Grant succeeded.
-- Verify
ASMCMD> lspwusr
Username sysdba sysoper sysasm
SYS TRUE TRUE FALSE
ASMSNMP TRUE FALSE FALSE
After creating the ASMSNMP user and granting SYSDBA, DBCA was able to connect to ASM and the database creation completed successfully. The root cause appears to be a bug in certain 11.2.0.4 grid infrastructure installations where the ASMSNMP user is not automatically created.
This is a pure licensing post, but it is really important for DBAs to know how to do it correctly. After installing a new 11.2 Enterprise Edition, Oracle installs all database options by default, including ones you may not be licensed to use.
-- Check if Partitioning is enabled:
SELECT * FROM v$option WHERE parameter = 'Partitioning';

PARAMETER         VALUE
----------------- -----
Partitioning      TRUE

-- Check the value programmatically:
SELECT value FROM v$option WHERE parameter = 'Partitioning';
-- Returns TRUE if enabled, FALSE if disabled
Starting with Oracle 11.2, a utility called chopt can be used on Unix/Linux and Windows to enable or disable specific database options. It is located in $ORACLE_HOME/bin.
Important: Shut down the database and all services in the same ORACLE_HOME before running chopt, as it rebuilds the Oracle executable.
-- Disable the Partitioning option:
$ chopt disable partitioning
Writing to /u01/app/oracle/product/11.2.0/dbhome_1/install/disable_partitioning.log...

-- Enable it back:
$ chopt enable partitioning
$ chopt
usage:
chopt <enable|disable> <option>

options:
  dm           = Oracle Data Mining RDBMS Files
  dv           = Oracle Database Vault
  lbac         = Oracle Label Security
  olap         = Oracle OLAP
  partitioning = Oracle Partitioning
  rat          = Oracle Real Application Testing
-- After restart, verify:
SELECT * FROM v$option WHERE parameter = 'Partitioning';

PARAMETER         VALUE
----------------- -----
Partitioning      FALSE
This can be used to reduce licensing exposure from unused features. Always verify with your Oracle licensing team before disabling options in production environments. Refer to MOS Doc ID 1312416.1 for common questions on the Partitioning option.
A passionate technologist and community leader, Mahmoud Darwish has spent his career mastering the complexities of the Oracle ecosystem. With deep specialisation in Oracle 23ai, RAC, Performance Tuning, AI Vector Search, and large-scale Database Migrations, Mahmoud is recognised for his ability to manage high-availability environments and solve the most complex database challenges.
As the industry shifts toward the cloud, Mahmoud has established significant expertise in Oracle Exadata Cloud@Customer (ExaCC), Oracle Cloud Infrastructure (OCI), specifically focusing on Exadata Cloud Services (ExaCS) and Database Cloud Service (DBCS). He excels at bridging the gap between legacy on-premises systems and modern cloud architectures, implementing complex techniques to ensure peak performance and reliability.
Beyond his technical role, Mahmoud is a dedicated advocate for the Oracle community. He actively shares his real-world troubleshooting experiences and cloud transformation insights through forums and blogs, helping fellow DBAs and architects navigate the nuances of the Oracle landscape.
Mahmoud is an Oracle ACE Apprentice — part of Oracle's recognition programme for technical community contributors. You can view his Oracle ACE profile and contributions at the link below.