Monday, September 5, 2022

Oracle for Analytics - Materialized View


1 Introduction


In data warehousing environment, performance is an eye-catching theme as always. Materialized view provides an efficient and effective means to improve performance. It enables a mechanism to precompute summaries or expensive joins and store the results. Materialized view can be used for 
improving query performance as well as providing replicated data.

Generally, there are 3 types of materialized view:
  • Materialized View with Aggregates
  • Materialized View Containing Only Joins
  • Nested Materialized Views
We'll create them one by one in the below sections.


2 How to Create a Materialized View?


In this demonstration, I use table OFA_ORDERS to keep measures, along with the other 3 dimensional tables such as OFA_CUSTOMERS, OFA_RETURNS and OFA_PRODUCTS. The model is shown as the diagram below.


ER Diagram

A materialized view is created by using CREATE MATERIALIZED VIEW statementLet's create 3 types of materialized views one by one.

2-1 Create Materialized View Logs


Materialized view logs are not mandatory. However, if you want to use fast refresh, you will need to create materialized view logs. In case the detail table supports partition change tracking (PCT), the log on that table is not required. 
The materialized view log creation statements used in this exercise are as follows.

CREATE MATERIALIZED VIEW LOG ON OFA_ORDERS WITH ROWID (ORDER_ID, ORDER_DATE, SHIP_DATE, SHIP_MODE, CUSTOMER_ID, PRODUCT_ID, SALES, QUANTITY, DISCOUNT, PROFIT) INCLUDING NEW VALUES;

 CREATE MATERIALIZED VIEW LOG ON OFA_PRODUCTS WITH ROWID (PRODUCT_ID, CATEGORY, SUB_CATEGORY, PRODUCT_NAME) INCLUDING NEW VALUES;

 CREATE MATERIALIZED VIEW LOG ON OFA_CUSTOMERS WITH ROWID (CUSTOMER_ID, CUSTOMER_NAME, SEGMENT, COUNTRY_REGION, CITY, STATE, REGION, POSTAL_CODE) INCLUDING NEW VALUES ;

CREATE MATERIALIZED VIEW LOG ON OFA_RETURNS WITH ROWID (ORDER_ID, RETURNED) INCLUDING NEW VALUES ;

 

2-2 Materialized View with Aggregates


The first example is to create a summary materialized view with a column alias list added to the CREATE statement.

CREATE MATERIALIZED VIEW MV_SALES_PRODUCT
(ORDER_DATE, PRODUCT_ID, CATEGORY, SUB_CATEGORY, PRODUCT_NAME, SUM_SALES, SUM_QUANTITY, SUM_PROFIT)
PARALLEL
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
ENABLE QUERY REWRITE
AS
SELECT
    ORDER_DATE
   ,ORD.PRODUCT_ID
   ,CATEGORY
   ,SUB_CATEGORY
   ,PRODUCT_NAME
   ,SUM(SALES)
   ,SUM(QUANTITY)
   ,SUM(PROFIT)
FROM
OFA_ORDERS ORD
LEFT JOIN 
OFA_PRODUCTS PRD
    ON ORD.PRODUCT_ID = PRD.PRODUCT_ID
GROUP BY
    ORDER_DATE
   ,ORD.PRODUCT_ID
   ,CATEGORY
   ,SUB_CATEGORY
   ,PRODUCT_NAME
;

2-3 Materialized View Containing Only Joins


When JOINs become very expensive, you can create a materialized view precalculating the joins as a solution, as shown by the following example. Then build queries and aggregate calculations upon the materialized view.
 
CREATE MATERIALIZED VIEW MV_SALES_JOIN
PARALLEL
BUILD IMMEDIATE
REFRESH FORCE
ENABLE QUERY REWRITE
AS
SELECT
    ORD.ORDER_ID
   ,ORDER_DATE
   ,SHIP_DATE
   ,SHIP_MODE
   ,ORD.CUSTOMER_ID
   ,ORD.PRODUCT_ID
   ,SALES
   ,QUANTITY
   ,DISCOUNT
   ,PROFIT
   ,CATEGORY
   ,SUB_CATEGORY
   ,PRODUCT_NAME
   ,CUSTOMER_NAME
   ,SEGMENT
   ,COUNTRY_REGION
   ,CITY
   ,STATE
   ,CTM.REGION
   ,POSTAL_CODE
   ,RETURNED
FROM
OFA_ORDERS ORD
LEFT JOIN 
OFA_PRODUCTS PRD
    ON ORD.PRODUCT_ID = PRD.PRODUCT_ID
LEFT JOIN 
OFA_CUSTOMERS CTM
    ON ORD.CUSTOMER_ID = CTM.CUSTOMER_ID
LEFT JOIN 
OFA_RETURNS RET
    ON ORD.ORDER_ID = RET.ORDER_ID
;

2-4 Nested Materialized View


The below view, MV_SALES_ANALYTICS, is built on the join materialized view created in the above section, with a column alias assigned to each field.

CREATE MATERIALIZED VIEW MV_SALES_ANALYTICS
PARALLEL
BUILD IMMEDIATE
REFRESH COMPLETE
ENABLE QUERY REWRITE
AS
SELECT
    TO_CHAR(ORDER_DATE, 'YYYYMM')  ORDER_YM
   ,CATEGORY                                                  CATEGORY
   ,SUB_CATEGORY                                        SUB_CATEGORY
   ,SEGMENT                                                    SEGMENT
   ,COUNTRY_REGION                                  COUNTRY_REGION
   ,CITY                                                             CITY
   ,STATE                                                           STATE 
   ,REGION                                                       REGION
   ,SUM(SALES)                                               SUM_SALES
   ,SUM(QUANTITY)                                      SUM_QUANTITY
   ,SUM(PROFIT)                                             SUM_PROFIT
FROM
MV_SALES_JOIN
GROUP BY
    TO_CHAR(ORDER_DATE, 'YYYYMM')
   ,CATEGORY
   ,SUB_CATEGORY
   ,SEGMENT
   ,COUNTRY_REGION
   ,CITY
   ,STATE
   ,REGION
;

Using the above materialized view, we can carry out further analytic tasks. For example, we may want to compare a city's monthly profit with the top in that month.

SELECT
   ORDER_YM
  ,STATE
  ,CITY
  ,SUM(SUM_SALES) TOTAL_SALES
  ,SUM(SUM_PROFIT) TOTAL_PROFIT
  ,SUM(SUM_PROFIT) / SUM(SUM_SALES) PROFIT_RATIO
  ,NTH_VALUE(SUM(SUM_PROFIT), 1) OVER 
  (PARTITION BY ORDER_YM ORDER BY SUM(SUM_PROFIT) DESC 
  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) TOPPROFIT_YM
FROM
MV_SALES_ANALYTICS
GROUP BY
   ORDER_YM
  ,STATE
  ,CITY
ORDER BY
   ORDER_YM
  ,STATE
  ,CITY
;


3 Partitioned Materialized View


In data warehousing context, partitioning is a significant technique. Partitioning fact tables can provide you with many benefits in terms of scalability, administration, performance and etc. You can partition a materialized view as well.

CREATE MATERIALIZED VIEW MV_SALES_PRODUCT_PARTITIONED
(ORDER_DATE, PRODUCT_ID, CATEGORY, SUB_CATEGORY, PRODUCT_NAME, SUM_SALES, SUM_QUANTITY, SUM_PROFIT)
PARALLEL
PARTITION BY RANGE (ORDER_DATE) INTERVAL (NUMTOYMINTERVAL(1, 'YEAR'))
(PARTITION "P2014"  VALUES LESS THAN (TO_DATE('2015-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')),
PARTITION "P2015"  VALUES LESS THAN (TO_DATE('2016-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')))
BUILD IMMEDIATE
REFRESH FAST
ENABLE QUERY REWRITE
AS
SELECT
    ORDER_DATE
   ,ORD.PRODUCT_ID
   ,CATEGORY
   ,SUB_CATEGORY
   ,PRODUCT_NAME
   ,SUM(SALES)
   ,SUM(QUANTITY)
   ,SUM(PROFIT)
FROM
OFA_ORDERS ORD
LEFT JOIN 
OFA_PRODUCTS PRD
    ON ORD.PRODUCT_ID = PRD.PRODUCT_ID
GROUP BY
    ORDER_DATE
   ,ORD.PRODUCT_ID
   ,CATEGORY
   ,SUB_CATEGORY
   ,PRODUCT_NAME
;


4 Refresh Materialized View


There are 3 packages available for refreshing materialized view.
  • DBMS_MVIEW.FEFRESH - Refresh one or more materialized view.
  • DBMS_MVIEW.FEFRESH_ALL_MVIEWS - Refresh all materialized views.
  • DBMS_MVIEW.FEFRESH_DEPENDENT - Refresh all materialized views relying on the specified tables or materialized views.
For example, we can use this statement to refresh MV_SALES_JOIN materialized view created in Section 2-3.
EXEC DBMS_MVIEW.REFRESH('MV_SALES_JOIN', METHOD => 'C');

Additionally, atomic refresh option ensures the views refreshed as a whole in a single transaction. For example, the all views will be refreshed in a single transaction if atomic refresh is enabled in DBMS_MVIEW.FEFRESH_ALL_MVIEWS.


5 A Note about Compression


If a materialized view is not build on a prebuilt table, it consumes storage space for sure. The table below suggests how much storage space our materialized views take up. You can consider compression technique as an option to save space especially when your data is redundant.






No comments:

Post a Comment

AWS - Build A Serverless Web App

 ‘Run your application without servers’. The idea presented by the cloud service providers is fascinating. Of course, an application runs on...