Sunday, March 20, 2011

Materialized View fast refresh taking a long time

I have a large table that is replicated from Oracle 10.2.0.4 to and Oracle 9i database using MView replication over the network. The master table is about 50GB, 160M rows and there are about 2 - 3M new or updates rows per day.

The master table has a materialized view log created using rowid.

The full refresh of the view works and takes about 5 hours, which we can live with.

However the fast refresh is struggling to keep up. Oracle seems to require two queries against the mlog and master table to do the refresh, the first looks like this:

SELECT          /*+ */
   DISTINCT "A1"."M_ROW$$"
       FROM "GENEVA_ADMIN"."MLOG$_BILLSUMMARY" "A1"
      WHERE "A1"."M_ROW$$" <> ALL (SELECT "A2".ROWID
                                     FROM "GENEVA_ADMIN"."BILLSUMMARY" "A2"
                                    WHERE "A2".ROWID = "A1"."M_ROW$$")
        AND "A1"."SNAPTIME$$" > :1
        AND "A1"."DMLTYPE$$" <> 'I'

The current plan is:

---------------------------------------------------------------
| Id  | Operation                     | Name                  |
---------------------------------------------------------------
|   0 | SELECT STATEMENT              |                       |
|   1 |  HASH UNIQUE                  |                       |
|   2 |   FILTER                      |                       |
|   3 |    TABLE ACCESS BY INDEX ROWID| MLOG$_BILLSUMMARY     |
|   4 |     INDEX RANGE SCAN          | MLOG$_BILLSUMMARY_AK1 |
|   5 |    TABLE ACCESS BY USER ROWID | BILLSUMMARY           |

When there are 3M rows changed, this query literally runs forever - its basically useless. However, if I rewrite it slightly and tell it to full scan the master table and mlog table, it completes in 20 minutes.

The problem is that the above query is coming out of the inners of Oracle and I cannot change it. The problem is really the FILTER operation on line 2 - if I could get it to full scan both tables and hash join / anti-join, I am confident I can get it to complete quick enough, but no receipe of hints I offer will get this query to stop using the FILTER operation - maybe its not even valid. I can use hints to get it to full scan both the tables, but the FILTER operation remains, and I understand it execute long 5 for each row returned by line 3, which will be 2- 3M rows.

Has anyone got any ideas on how to trick this query into the plan I want without changing the actual query, or better, any ways of getting replication to take a more sensible plan for my tablesizes?

Thanks,

Stephen.

From stackoverflow
  • Hi Stephen,

    As you wrote the queries are part of an internal Oracle mechanism so your tuning options are limited. The fast-refresh algorithm seems to behave differently in the more recent versions, check Alberto Dell’Era’s analysis.

    You could also look into SQL profiles (10g feature). With the package DBMS_SQLTUNE this should allow you to tune individual SQL statements.

    Stephen ODonnell : Good information in Alberto's posts - I had worked a good bit of that out myself, but its good to see someone else say the same things. I am thinking of 'rolling my own' fast refresh on this - its only 3 or 4 queries at the end of the day.
    Vincent Malgrat : @Stephen: good luck with that, I think it may be your best option too. Share it here when you have done it :>
  • How do the estimated cardinalities look for the refresh query in comparison to the actual cardinalities? Maybe the MLOG$ table statistics are incorrect.

    It might be better to have no statistics on the table and lock them in order to invoke dynamic sampling, which ought to give a reasonable estimation based on the multiple predicates in the query.

    Stephen ODonnell : We have tried no stats, accurate stats and 'faking the stats' by setting the number of rows and blocks in the MLog table way higher that it should be and the plan doesn't change. Extracting the query, I can change the access paths with hints (or stored outlines) but getting rid of that filter step in the plan doesn't seem to be possible probably due to the <> ALL ... part of the query. I was hoping an anti-join or something would work, but those hints are being ignored.
    David Aldridge : When you tried no stats did you verify that dynamic sampling was being invoked (the trace file would show the dynamic sampling query)? What is the optimiser mode? The RBO would be disinclined to full scan if it could avoid it, of course.

0 comments:

Post a Comment