This blog post was originally released on Pythian blog.

This is a short blog post on how one can prove that a particular partition of the table or index is accessed by a specific query.

The solution to this question seemed interesting enough to share with you all. Here’s a short description of what I had:

  • A SQL with sql_id = 9kuvj1g38hja2 is the TOP 1 statement in “User I/O” section of AWR report.
  • The statement queries data from multiple tables. One of them is a fairly large table named “ALERT”. It is range partitioned by date, and each year has it’s own partition (Y2013, Y2012, …).
  • The SQL takes 5 – 10 seconds to complete and is executed 500 – 5000 times each hour (sometimes even ~10 of them are running simultaneously).
  • ~95% of elapsed time is spent on db file sequential reads.
  • Diagnostic pack is licensed (it is required for this method as ASH is used).

We also had a statement so that none of the queries should be reading data older then 1 year, but looking at the statistics I suspected there was something wrong. So here’s how I found out older data from the ALERT table was accessed:

  1. As the query is executed very often, it has many appearances in ASH.
  2. As the statement does lots of db file sequential reads, there are many p1 (file_id) and p2 (block_id) combinations available from ASH.
  3. file_id and block_id can be joined to DBA_EXTENTS to find out the segment which the block belongs to. That is where the partition names being accessed by the SQL where found out.

The challenge with this approach was the fact that I had more then 100K samples for 9kuvj1g38hja2 waiting on db file sequential reads in DBA_HIST_ACTIVE_SESS_HISTORY, and there were millions of extents in the database. How do I join them efficiently? I can fetch only the segments we’re interested in from DBA_EXTENTS before joining it all to DBA_HIST_ACTIVE_SESS_HISTORY (Thanks Yury for this great hint)! So this is the query that solved the mystery:

with
exts as (select /*+ materialize */ partition_name, file_id, block_id, blocks
         from dba_extents where owner='SECRET' and segment_name='ALERT'),
stats as (select /*+ materialize */ p1, p2, count(*) cnt
          from DBA_HIST_ACTIVE_SESS_HISTORY where sql_id='9kuvj1g38hja2'
           and event='db file sequential read' group by p1, p2)
select partition_name, sum(s.cnt) read_count
  from exts e, stats s
 where e.file_id=s.p1
   and s.p2 between e.block_id and e.block_id+e.blocks-1
 group by partition_name
 order by 2;

It’s short and simple, and it also answered the question I asked. Is old data accessed by the SQL statement? Yes.

PARTITION_NAME                 READ_COUNT
------------------------------ ----------
Y2006                                 111
Y2007                                5645
Y2008                                5698
Y2013                               10388
Y2009                               13473
Y2010                               21600
Y2011                               37186
Y2012                               56184

 8 rows selected

Be careful when interpreting the results! If the SQL does not report a partition as being accessed, it doesn’t mean it was not accessed. ASH is based on sampling – maybe we got “lucky” and samples didn’t cover reads from that partition? Maybe all blocks belonging to the partition were in buffer cache and no reads were needed? But, if the SQL reported a partition as being accessed, we can be 100% sure it really was, and for me that was enough.