Saturday, August 24, 2013

materialized view that is verified to be fast refresh


                 



A materialized view that is verified to be fast refresh should update relatively fast. But, what happens when there are few changes to the master table, no network issues, no aggregation in the snapshot query and the refresh still runs slow?

About Materialized Views
One of the most useful replication and data warehousing features in Oracle is materialized views. Materialized views, also known as snapshots, have been a feature of Oracle for several years. A simple way to conceptualize this is to think of a view of a master table that has actual data that can be refreshed. Since the introduction in 8i, Oracle has consistently enhanced the technology for each subsequent release, including 8i, 9i, and 10g.

The data in a materialized view is updated by either a complete or incremental refresh. An incremental or fast refresh uses a log table to keep track of changes on the master table. A materialized view log (snapshot log) is a schema object that records changes to a master table's data so that a materialized view defined on that master table can be refreshed incrementally. The frequency of this refresh can be configured to run on-demand or at regular time intervals.

In practice, many Oracle customers use materialized views outside of the data warehousing environment. Some companies use fast refresh materialized views on remote databases in order to improve performance and security when using distributed computing for online transaction processing. The speed of a fast refresh will be determined by how much data has changed since the last refresh. If the master table's data is updated very often, then the log table will have more recorded changes to process in order to update the materialized view.

However, the entire concept of the fast refresh is that it should be a relatively quick operation. There is a substantial volume of documentation regarding how to ensure that the refresh is actually doing a fast refresh. New features in 10g including DBMS_MVIEW.EXPLAIN_MVIEW and DBMS_ADVISOR.TUNE_MVIEW provide insight and advice on materialized views. While these features can help you to get an optimal materialized view, it cannot help when the underlying problem is not the materialized view. In fact, the underlying problem might not even be on the consuming site.

A Real Life Example...
A recent experience with a client exposed such a situation. The client complained that a user process was running slow. After a quick analysis, the culprit was determined to be a materialized view refresh run on-demand from a trigger in the process. The master table of the materialized view was a remote table in the same data center. The user process inserted a row into the master table and then refreshed the materialized view. Testing revealed that even when there were only a few changes on the master table, the refresh would still take 10-20 seconds. In this situation I would generally consider the following possibilities: complete versus fast refresh, network bound, many changes on master table, and complex aggregation on MV query. The most likely solution was that a complete refresh was happening. However, the materialized view refresh was confirmed to be a fast refresh by querying USER_MVIEWS.

SELECT mview_name, refresh_mode, refresh_method,
            last_refresh_type, last_refresh_date
       FROM user_mviews;
At the master site, it was verified that the snapshot log existed.

SELECT log_owner, master, log_table
       FROM dba_mview_logs;

So, the two basic requirements for a fast refresh were confirmed. Next, I tested the network bound by running copying 30,000 rows from all_objects from the master to the consumer site in 1-2 seconds. Then, I checked the query of the materialized view and confirmed that it was a simple select from the master table without any aggregation or sorting. To be prudent, DBMS_MVIEW.explain_mview was run on the materialized view. MV_CAPABILITIES_TABLE did not reveal any problems. Furthermore, the master table had approximately 10-30 changes per minute while the refresh was happening 1-2 times per minute. There was no doubt that a fast refresh was occurring, there were no aggregations in the query, there was a small number of changes to the master table, and network issues were not the problem. So what was causing this fast refresh to go so slow?

After determining that the problem was definitely not on the consumer side, the focus of the analysis shifted to the master site. We had already confirmed that the snapshot logs were present so the next step was to see what was contained in those logs. Much to our surprise, there were over 13 million rows in the snapshot log dating back several months. This was unexpected since the user process was refreshing the materialized view a few times every minute. After the refresh, why were the records in that table not being deleted? One possibility was that there was another snapshot using that log. Multiple simple snapshots can use the same snapshot log, meaning that records already used to refresh one snapshot might still be needed to refresh another snapshot. Therefore, Oracle does not delete rows from the log until all snapshots have used them. If the snapshot log has grown very large, then the time to complete a fast refresh will increase as well since more records must be scanned by the consuming site before determining which records to use for the refresh. This explained why a fast refresh with almost no changes on the master table would still take 10-20 seconds to complete.

While the client insisted that no other snapshots existed for this table, the evidence showed that to be the most likely cause. A query of DBA_REGISTERED_SNAPSHOTS seemed to support the client's position since no other snapshots appeared to be using this log.

SELECT owner, NAME, snapshot_site, TO_CHAR(current_snapshots,'mm/dd/yyyy hh24:mi') current_snapshots
FROM dba_registered_snapshots, dba_snapshot_logs
WHERE dba_registered_snapshots.snapshot_id = dba_snapshot_logs.snapshot_id (+)
AND dba_snapshot_logs.MASTER='&table_name'
/
However, the one risk with using materialized views over remote databases is that sometimes a network or site failure can prevent a master of becoming aware that a snapshot has been dropped. Oracle keeps track of snapshots that are using entries in SYS.SLOG$ at the master site. In fact, when deciding whether to purge snapshot log records, Oracle compares SYS.SLOG$.SNAPTIME for the table with MLOG$_.SNAPTIME$$. The rows with a MLOG$_.SNAPTIME$$ equal to or older than the oldest SYS.SLOG$.SNAPTIME for the table are purged from the log. If an orphan entry exists in SYS.SLOG$ at the master site for a deleted snapshot, the SNAPTIME in SLOG$ will not be updated. Consequently, any records in the snapshot log will never be purged during a refresh.

The following query can be useful in identifying situations where a snapshot entry exists in SLOG$ but is not registered and has not been updated in a long time.

SELECT r.NAME snapname, snapid,
NVL(r.snapshot_site, 'not registered') snapsite,
snaptime
FROM   sys.slog$ s, dba_registered_snapshots r
WHERE  s.snapid=r.snapshot_id(+) AND
                                 mowner LIKE UPPER('&owner')                                   
AND       MASTER LIKE UPPER('&table_name');
Solution
After verifying the existing snapshots on the consumer site by querying SYS.SNAP$, it was easy to determine which entries in SYS.SLOG$ at the master site were no longer being used. After using DBMS_SNAPSHOT.PURGE_SNAPSHOT_FROM_LOG to remove the orphaned entries, the old records from the snapshot log were finally purged upon the next refresh. In order to realize this improvement, the high watermark of the materialized view log needed to be reset by using ALTER MATERIALIZED VIEW LOG '&snapshot_log' SHRINK SPACE. It was later found that these orphaned snapshots were the result of obsolete development instances having materialized views on this production table.

Upon realizing the root cause, it was apparent that the impact of this issue was likely to be broader than a slow user process. If the database were scanning 13 million rows about 2 times per minute, then this should have caught the attention of the DBAs who were monitoring the database activity. A recent Statspack report on the master site showed that the MLOG$ table was among the top I/O consumers. The Active Workload Repository (AWR) segment statistics revealed that the logical and physical reads on the MLOG$ table had consistently increased since the earliest measurement. The moment that an MLOG$ table shows up on a Statspack report, it is prudent to determine if there is a problem. In this case, the problem could have been identified much earlier.

Conclusion
Oracle’s materialized views are a great tool for replication and each subsequent release has proven to add new features and enhancements. As with any technology, you must be certain that you are using the features correctly and have implemented a clearly defined and strictly enforced change management policy. While these new technologies will always introduce new possibilities for problems, the old standards of a strict operating procedure can mitigate these risks.


No comments:

Post a Comment