This blog post was originally released on Pythian blog.
This is a short blog post on how one can prove that a particular partition of the table or index is accessed by a specific query.
The solution to this question seemed interesting enough to share with you all. Here’s a short description of what I had:
- A SQL with sql_id =
9kuvj1g38hja2is the TOP 1 statement in “User I/O” section of AWR report.
- The statement queries data from multiple tables. One of them is a fairly large table named “ALERT”. It is range partitioned by date, and each year has it’s own partition (Y2013, Y2012, …).
- The SQL takes 5 – 10 seconds to complete and is executed 500 – 5000 times each hour (sometimes even ~10 of them are running simultaneously).
- ~95% of elapsed time is spent on db file sequential reads.
- Diagnostic pack is licensed (it is required for this method as ASH is used).
We also had a statement so that none of the queries should be reading data older then 1 year, but looking at the statistics I suspected there was something wrong. So here’s how I found out older data from the ALERT table was accessed:
- As the query is executed very often, it has many appearances in ASH.
- As the statement does lots of db file sequential reads, there are many p1 (file_id) and p2 (block_id) combinations available from ASH.
- file_id and block_id can be joined to DBA_EXTENTS to find out the segment which the block belongs to. That is where the partition names being accessed by the SQL where found out.
The challenge with this approach was the fact that I had more then 100K samples for
9kuvj1g38hja2 waiting on db file sequential reads in DBA_HIST_ACTIVE_SESS_HISTORY, and there were millions of extents in the database. How do I join them efficiently? I can fetch only the segments we’re interested in from DBA_EXTENTS before joining it all to DBA_HIST_ACTIVE_SESS_HISTORY (Thanks Yury for this great hint)! So this is the query that solved the mystery:
with exts as (select /*+ materialize */ partition_name, file_id, block_id, blocks from dba_extents where owner='SECRET' and segment_name='ALERT'), stats as (select /*+ materialize */ p1, p2, count(*) cnt from DBA_HIST_ACTIVE_SESS_HISTORY where sql_id='9kuvj1g38hja2' and event='db file sequential read' group by p1, p2) select partition_name, sum(s.cnt) read_count from exts e, stats s where e.file_id=s.p1 and s.p2 between e.block_id and e.block_id+e.blocks-1 group by partition_name order by 2;
It’s short and simple, and it also answered the question I asked. Is old data accessed by the SQL statement? Yes.
PARTITION_NAME READ_COUNT ------------------------------ ---------- Y2006 111 Y2007 5645 Y2008 5698 Y2013 10388 Y2009 13473 Y2010 21600 Y2011 37186 Y2012 56184 8 rows selected
Be careful when interpreting the results! If the SQL does not report a partition as being accessed, it doesn’t mean it was not accessed. ASH is based on sampling – maybe we got “lucky” and samples didn’t cover reads from that partition? Maybe all blocks belonging to the partition were in buffer cache and no reads were needed? But, if the SQL reported a partition as being accessed, we can be 100% sure it really was, and for me that was enough.