Some less known facts about log file sync and other LGWR-related waits
Nikolay Savvinov, senior database performance specialist, Deutsche Bank TechCentre

A few words about me
- 10 years with Oracle databases
- was doing particle physics before
- last 5 years: focus on performance optimization
- Twitter: oradiag
- Blog: savvinov.com
10 June 2015, HARMONY – 2015, TALLINN N. Savvinov, Some less known facts about LGWR

Outline
- Well known stuff about LGWR
- Not-so-well-known stuff about LGWR
  - log parallelism and how it can backfire
  - contention-related log file sync waits
  - excessive commits aren’t always excessive

What is log file sync (LFS)?
- Changes generate redo (redo makes changes recoverable)
- Redo needs to be written to persistent storage
- For performance reasons, this is done asynchronously
- When a user commits changes, they want to be sure the changes are protected
- LFS is the delay between the commit and the confirmation that the redo is on disk
- Normally, the main part of LFS is log file parallel write (LFPW)
- Other components: latch manipulation, inter-process communication, CPU etc.

Why care about LFS?
- LFS measures the delay introduced by a commit
- i.e. systems that commit a lot can spend a lot of time on LFS
- LFS is critical for low-latency OLTP systems
- LFS is one of the major sources of replication delays
- LFS is responsible for “commit gaps” => errors in logic

How redo generation works
- redo is simple (flows in, flows out)
- log buffer is small
- log files are written to in a circular manner
- redo incoming rate ≈ rate of change
  - can be affected by hot backup
- flush triggered by:
  - commit
  - rollback
  - log buffer 1/3 full
  - 1 MB of redo in the log buffer
  - every 3 seconds
- no balance between redo in- and out-flow => delays
[Diagram: redo generation rate -> log buffer -> redo flush (redo write speed) -> log files]
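
The flush triggers listed on this slide can be written down as a small decision function. This is only a toy model of the list above, not Oracle's actual implementation: the class `LogBufferState`, its fields and the function `flush_reasons` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

ONE_MB = 1024 * 1024


@dataclass
class LogBufferState:
    """Hypothetical snapshot of the log buffer (illustration only)."""
    buffer_size: int            # total log buffer size, bytes
    unflushed_bytes: int        # redo sitting in the buffer, not yet written
    seconds_since_flush: float  # time since the last redo flush
    commit_or_rollback: bool    # a session has just committed or rolled back


def flush_reasons(state: LogBufferState) -> list[str]:
    """Return every trigger from the slide that currently applies."""
    reasons = []
    if state.commit_or_rollback:
        reasons.append("commit/rollback")
    if state.unflushed_bytes * 3 >= state.buffer_size:
        reasons.append("buffer 1/3 full")
    if state.unflushed_bytes >= ONE_MB:
        reasons.append("1 MB of redo")
    if state.seconds_since_flush >= 3.0:
        reasons.append("3-second timeout")
    return reasons


# 2 MB of unflushed redo in a 16 MB buffer: only the 1 MB trigger applies
state = LogBufferState(16 * ONE_MB, 2 * ONE_MB, 0.5, False)
print(flush_reasons(state))
```

With a large log buffer the 1 MB trigger applies long before the buffer is a third full, which is why both conditions appear in the list.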

Piggyback commit: mechanics

Averages lie BIG TIME
- LFPW: (1 ms x 10 + 10 ms x 1) / (10 + 1) ≈ 1.8 ms
- LFS: (1 ms x 10 + 5 ms x 10) / (10 + 10) = 3 ms
- the bigger (and the more frequent) the outliers, the bigger the gap between LFS and LFPW
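
The arithmetic on this slide can be reproduced in a few lines. The sketch assumes one reading of the numbers: LGWR performs ten 1 ms writes and one 10 ms outlier, while on the session side ten commits wait 1 ms each and ten more commits, arriving during the slow write, wait 5 ms each on average. The helper `mean_ms` is a hypothetical name.

```python
def mean_ms(groups):
    """Weighted average of (duration_ms, count) pairs."""
    total_time = sum(duration * count for duration, count in groups)
    total_count = sum(count for _, count in groups)
    return total_time / total_count


# LGWR's view: ten fast writes and a single slow one
lfpw = mean_ms([(1, 10), (10, 1)])

# Sessions' view: ten fast commits, ten commits caught behind the slow write
lfs = mean_ms([(1, 10), (5, 10)])

print(f"avg LFPW = {lfpw:.1f} ms")  # 1.8 ms
print(f"avg LFS  = {lfs:.1f} ms")   # 3.0 ms
```

A single slow write counts once in the LFPW average but once per waiting session in the LFS average, so the two averages drift apart as outliers grow.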

I/O related LFS
- most common scenario
- mark of I/O related LFS: LGWR waiting almost exclusively on LFPW
- I/O performance cannot be judged by time alone, need redo size as well
- must take into account RAID write penalties
- synchronous storage-level replication: another common scenario
- a (rather common) special case: storage-level contention

“log file parallel write” => “log file serialized write”
- on the database level, LFPW is a simple single event
- on the OS level: a bunch of write requests to several destinations (multiplexing!)
- LFPW parameters:
    select p1, p1text, p2, p2text, p3, p3text
    from v$active_session_history
    where event = 'log file parallel write'
    ===============================================
    1 files 2050 blocks 2 requests
- when requests > multiplexing, we see log parallelism in action
- introduced to reduce latching, has nasty side effects when there are many CPUs
- _log_parallelism_max, _log_parallelism_dynamic (note 34583.1)
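
The three LFPW parameters can be interpreted with a short helper. This is a sketch under two assumptions that are not stated on the slide: the `files` value (p1) is taken as the degree of multiplexing, and the redo block size is taken to be 512 bytes, which is typical but platform dependent. The function name `describe_lfpw` is hypothetical.

```python
def describe_lfpw(files, blocks, requests, redo_block_bytes=512):
    """Interpret p1 (files), p2 (blocks), p3 (requests) of one LFPW wait."""
    return {
        # size of the write, under the assumed redo block size
        "bytes_written": blocks * redo_block_bytes,
        # more than one request per file hints at log parallelism
        "requests_per_file": requests / files,
        "log_parallelism_visible": requests > files,
    }


# The sample row from the slide: 1 files, 2050 blocks, 2 requests
info = describe_lfpw(files=1, blocks=2050, requests=2)
print(info)
```

For the sample row this gives 1,049,600 bytes written in two requests to a single file, i.e. requests > multiplexing.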

CPU-related LFS waits
- redo logging is NOT CPU intensive
- still, it does need CPU
- CPU starvation, priority inversion etc. can lead to CPU-related LFS
- identified by % of time spent “ON CPU” by LGWR (or big outliers)
- can sometimes be fixed by changing LGWR priority in the OS, or by using the database parameter _high_priority_processes
- on the plot: spikes correspond to CPU scheduler quanta

Contention-related LFS
- Apart from I/O and CPU, another scenario for LFS is contention
- e.g. writes to the control file required when switching log files
- Signature: LGWR spending a significant % of time on “enq: CF - contention” (or big outliers)
- Small log files increase the risk of this problem!
- Rather than relying on reducing log switch frequency alone, the best approach is to identify the root cause (e.g. RMAN issues) of the excessive (or slow) control file writes that lead to the contention
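
The three signatures described so far (I/O, CPU, contention) can be combined into a rough classifier of LGWR activity samples. This is a minimal sketch, assuming each sample is either a wait event name or the string "ON CPU" for samples taken while LGWR was on CPU. The function name `classify_lgwr` and the 50% threshold are hypothetical, and the sketch ignores the "big outliers" case mentioned on the slides.

```python
from collections import Counter


def classify_lgwr(samples, threshold=0.5):
    """Label an LGWR sample set by its dominant signature."""
    counts = Counter(samples)
    total = sum(counts.values())
    if total == 0:
        return "no samples"
    shares = {
        "I/O": counts["log file parallel write"] / total,
        "CPU": counts["ON CPU"] / total,
        "contention": counts["enq: CF - contention"] / total,
    }
    label, share = max(shares.items(), key=lambda item: item[1])
    # Only report a scenario when it clearly dominates the samples
    return label if share >= threshold else "mixed/unclear"


samples = ["log file parallel write"] * 9 + ["ON CPU"]
print(classify_lgwr(samples))  # I/O
```

A "mixed/unclear" result is a signal to look at individual outliers rather than shares of time.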

Excessive commits?
- “You’re committing too much” is one of the most popular responses to complaints about LFS
- Record: the lowest commit rate declared “excessive” by an investigating DBA was … 30 commits per second
- In reality, an Oracle database can handle thousands, and even tens of thousands, of commits per second
- Excessive commits are primarily a problem for transactional integrity
- Excessive commits slow down the process that issues them
- Removing them can turn LFS into LBS (log buffer space waits) or cause a bottleneck elsewhere

Bad reasons to worry about LFS
- because without LFS everything would go faster (no it won’t)
- because it’s in top events in AWR (so what?)
- because LFS is contributing to locking/contention (no it doesn’t)
- because LFS is increasing CPU consumption (no it doesn’t)

Summary
- LFS is a common problem, more common for OLTP
- not always a performance issue as such; could also be a “staleness” issue, or cause errors in logic
- % DB time in AWR is rarely a useful measure of LFS impact
- caused by redo I/O, CPU/scheduling issues, or contention (e.g. for CF)
- I/O problems can be related to the “serialized parallelism” issue; workaround: disable log parallelism
- reducing commits can transform LFS into LBS
- when troubleshooting LFS, understanding the scope is of key importance (or performance can be made worse instead of better)