I’m not saying the context indexes on Materialize Views are always large, but sometimes they can be. Today we found an issue in production database where the token information table (named DR$<index_name>$I) of the context index named <index name> was ~1.7Gb in size, but the MV table segment which the index was created on was only ~8 Mb. I decided to dig and and find out what the hell was going on there. After a short while it was discovered that there are ~500 sets of token information in the index table for each row in the materialized view. The following demo illustrates how that happened:

-- Create sample table and materialized view
SQL> create table t_tab2 (x varchar2(10));

Table created.

SQL>
SQL> create materialized view t_mview2
 2  build immediate
 3  using index
 4  as select * from t_tab2;

Materialized view created.

-- insert 3 rows of data
SQL> insert into t_tab2 values ('John');

1 row created.

SQL> insert into t_tab2 values ('Josh');

1 row created.

SQL> insert into t_tab2 values ('Jack');

1 row created.

SQL>
SQL> commit;

Commit complete.

-- verify the contents of table and MV
-- 3 rows
SQL> select * from t_tab2;

X
----------
John
Josh
Jack

-- 0 rows, refresh needed
SQL> select * from t_mview2;

no rows selected

-- refresh MV
SQL> begin
 2    dbms_mview.refresh('T_MVIEW2');
 3  end;
 4  /

PL/SQL procedure successfully completed.

-- verify the contents again
-- 3 rows
SQL> select * from t_tab2;

X
----------
John
Josh
Jack

-- 3 rows, after refresh
SQL> select * from t_mview2;

X
----------
John
Josh
Jack

-- create context index with sync on commit (can be without, but then you have to sync it manually using CTX_DDL.SYNC_INDEX)
SQL> create index t_ctx2 on t_mview2(x) indextype is  ctxsys.context parameters('sync(on commit)');

Index created.

-- check token info, contains 3 values (correct)
SQL> select * from dr$t_ctx2$i;

TOKEN_TEXT TOKEN_TYPE TOKEN_FIRST TOKEN_LAST TOKEN_COUNT TOKEN_INFO
---------- ---------- ----------- ---------- ----------- ----------
JACK                0           3          3           1 008801
JOHN                0           1          1           1 008801
JOSH                0           2          2           1 008801

-- do refresh of the MV again
SQL> begin
 2    dbms_mview.refresh('T_MVIEW2');
 3  end;
 4  /

PL/SQL procedure successfully completed.

-- 3 rows in the table (correct)
SQL> select * from t_tab2;

X
----------
John
Josh
Jack

-- 3 rows in the MV (correct)
SQL> select * from t_mview2;

X
----------
John
Josh
Jack

-- 6 rows ??? not what was expected!
SQL> select * from dr$t_ctx2$i;

TOKEN_TEXT TOKEN_TYPE TOKEN_FIRST TOKEN_LAST TOKEN_COUNT TOKEN_INFO
---------- ---------- ----------- ---------- ----------- ----------
JACK                0           3          3           1 008801
JOHN                0           1          1           1 008801
JOSH                0           2          2           1 008801
JACK                0           6          6           1 008801
JOHN                0           4          4           1 008801
JOSH                0           5          5           1 008801

6 rows selected.

-- rebuild the index
SQL> alter index t_ctx2 rebuild;

Index altered.

-- check again
-- 3 rows! Now it's again correct!
SQL> select * from dr$t_ctx2$i;

TOKEN_TEXT TOKEN_TYPE TOKEN_FIRST TOKEN_LAST TOKEN_COUNT TOKEN_INFO
---------- ---------- ----------- ---------- ----------- ----------
JACK                0           3          3           1 008801
JOHN                0           1          1           1 008801
JOSH                0           2          2           1 008801

-- truncate table and refresh the MV
SQL> truncate table t_tab2;

Table truncated.

SQL> begin
 2    dbms_mview.refresh('T_MVIEW2');
 3  end;
 4  /

PL/SQL procedure successfully completed.

-- we still get those 3 rows in the token info table.
SQL> select * from dr$t_ctx2$i;

TOKEN_TEXT TOKEN_TYPE TOKEN_FIRST TOKEN_LAST TOKEN_COUNT TOKEN_INFO
---------- ---------- ----------- ---------- ----------- ----------
JACK                0           3          3           1 008801
JOHN                0           1          1           1 008801
JOSH                0           2          2           1 008801

-- rebuild again
SQL> alter index t_ctx2 rebuild;

Index altered.

-- 0 rows! As expected!
SQL> select * from dr$t_ctx2$i;

no rows selected

I’m not sure it’s a bug (but looks like that certainly), I reproduced it on 10.2.0.4 (HP-UX) and also 11.2.0.2 (Linux x86). Digging further into the issue showed that a complete refresh of the materialized view changes rowids for all rows and this is most likely why there are multiplexed entries in the token information table (in fact the multiplexion factor should be the same as number of combinations of row data in MV and rowids in MV table segment for that row, unless an index rebuild has been done at some point), it does look so that in case of MV refreshes the maintenance (cleanup part) of the context indexes are not properly done.

Here is a select you can use to identify the context indexes that are created on table segments of materialized views, and check the sizes of related segments to find out if they look reasonable:

SQL> column IDX_OWNER format a10
SQL> column MV_OWNER format a10
SQL> column IDX_NAME format a30
SQL> column MV_NAME format a30
SQL> select u2.username   idx_owner,
            i.idx_name,
            u.username    mv_owner,
            o.OBJECT_NAME mv_name
       from ctxsys.dr$index i, dba_users u, dba_users u2, dba_objects o
      where i.idx_table_owner# = u.user_id
        and i.idx_owner# = u2.user_id
        and u.username = o.OWNER
        and o.OBJECT_ID = i.idx_table#
         and o.object_type = 'TABLE'
         and (o.OWNER, o.object_name) in
             (select o2.OWNER, o2.object_name
                from dba_objects o2
               where object_type = 'MATERIALIZED VIEW');

IDX_OWNER  IDX_NAME                       MV_OWNER   MV_NAME
---------- ------------------------------ ---------- ------------------------------
TEST       T_CTX2                         TEST       T_MVIEW2