Thursday, August 3, 2023
Friday, June 16, 2023
Wednesday, June 14, 2023
Monday, September 5, 2022
Oracle for Analytics - SQL Modeling
The Basics
SELECTSEGMENT,CATEGORY,YEAR,SFROM SALES_VIEW_TBLMODELRETURN ALL ROWSPARTITION BY (SEGMENT)DIMENSION BY (CATEGORY, YEAR)MEASURES (SALES S)UNIQUE DIMENSIONRULES UPSERT SEQUENTIAL ORDER(S['Appliance', 2020] = S[CV(CATEGORY), 2019] + S[CV(CATEGORY), 2018],S['Furniture', 2020] = S[CV(CATEGORY), 2019] + S[CV(CATEGORY), 2018])ORDER BYSEGMENT,CATEGORY,YEAR;
- Partition columns. They divide the data set into logical blocks, in other words, define the boundaries for calculations. The concept is similar with partitions in the analytic functions.
- Dimension columns. They are attributes describe the measures. A full combination of them can identify one cell in a partition.
- Measure columns. These columns have numeric values and similar to facts in a multi-dimensional fact table.
RETURN ALL ROWS: This returns all selected including updated rows by the rules. RETURN UPDATED ROWS will return only those updated rows.
UNIQUE DIMENSION: This says the combination of partition and dimension columns uniquely identify one cell.
SEQUENTIAL ORDER: This means the rules will be executed in the written order in the clause.
Rules for SQL Modeling
Single Cell Reference - Constants are used for dimensions, for example, S['Appliance', 2020].
Multi-Cell Reference - Here are the two examples for the left side and right side respectively, S['Appliance', YEAR >= 2018], MAX(S)['Appliance', Year BETWEEN 1998 AND 2002].
CV Function - It provides a means to use the dimension values on the right side which are referenced on the left side as illustrated in this formula, S['Appliance', 2020] = S[CV(CATEGORY), 2019] + S[CV(CATEGORY), 2018].
UPDATE - If the referenced cell in the left side exists, its measure is updated with the calculation result. No new cells will not be inserted.
UPSERT - If the cell exists, its measure will be updated. If it doesn't exist, a new cell will be inserted.
UPSERT ALL - Simply put, it enables the model rules on the left side to have UPSERT capability. For example, S[ANY, 2020] = S[CV(), 2019] + S[CV(), 2018].
- Measures are the only updatable columns
- In terms of execution order, all other clauses, except for Select Distinct and Order By, are evaluated before the Model clause
- You can't use an aggregate or analytic function in either Select list or Order By list when you have a Model clause in your query
- You can't use subqueries in the rules except in FOR loop
- And etc.
Using Aggregate/Analytic Functions in SQL Modeling
An Example of Using an Aggregate Function
SELECTSEGMENT,CATEGORY,YEAR,SFROM SALES_VIEWWHERE YEAR = 2019MODELDIMENSION BY (SEGMENT, CATEGORY, YEAR)MEASURES (SALES S)RULES UPSERT(S['Retail', 'Sub-total', 2019] = SUM(S)[CV(SEGMENT), ANY, 2019],S['Wholesale', 'Sub-total', 2019] = SUM(S)[CV(SEGMENT), ANY, 2019],S['Total', '', 2019] = SUM(S)[ANY, 'Sub-total', 2019])ORDER BYSEGMENT,YEAR,CATEGORY;
SELECTYEAR,SEGMENT,CATEGORY,S,CSUMFROM SALES_VIEWMODELPARTITION BY ( YEAR )DIMENSION BY ( SEGMENT, CATEGORY )MEASURES (SALES S, 0 CSUM )RULES UPSERT(CSUM[ANY, ANY] = SUM(S) OVER ( PARTITION BY YEARORDER BY YEAR, SEGMENT, CATEGORYROWS UNBOUNDED PRECEDING ))ORDER BYYEAR,SEGMENT,CATEGORY;
Parallel Execution
Oracle for Analytics - Pivot and Unpivot
Pivoting is a very useful technique in data analytics, which transforms rows into columns. On the contrary, unpivoting operation brings columns to rows. If you are familiar with pivot table in Excel, you can simply see this as a implementation at database level. Tableau also incorporates this technique into its data source layer.
Pivoting works as shown by the diagram below.
Oracle for Analytics - Materialized View
1 Introduction
- Materialized View with Aggregates
- Materialized View Containing Only Joins
- Nested Materialized Views
2 How to Create a Materialized View?
A materialized view is created by using CREATE MATERIALIZED VIEW statement. Let's create 3 types of materialized views one by one.
2-1 Create Materialized View Logs
CREATE MATERIALIZED VIEW LOG ON OFA_ORDERS WITH ROWID (ORDER_ID, ORDER_DATE, SHIP_DATE, SHIP_MODE, CUSTOMER_ID, PRODUCT_ID, SALES, QUANTITY, DISCOUNT, PROFIT) INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW LOG ON OFA_PRODUCTS WITH ROWID (PRODUCT_ID, CATEGORY, SUB_CATEGORY, PRODUCT_NAME) INCLUDING NEW VALUES;
CREATE MATERIALIZED VIEW LOG ON OFA_CUSTOMERS WITH ROWID (CUSTOMER_ID, CUSTOMER_NAME, SEGMENT, COUNTRY_REGION, CITY, STATE, REGION, POSTAL_CODE) INCLUDING NEW VALUES ;
CREATE MATERIALIZED VIEW LOG ON OFA_RETURNS WITH ROWID (ORDER_ID, RETURNED) INCLUDING NEW VALUES ;
2-2 Materialized View with Aggregates
CREATE MATERIALIZED VIEW MV_SALES_PRODUCT(ORDER_DATE, PRODUCT_ID, CATEGORY, SUB_CATEGORY, PRODUCT_NAME, SUM_SALES, SUM_QUANTITY, SUM_PROFIT)PARALLELBUILD IMMEDIATEREFRESH FAST ON COMMITENABLE QUERY REWRITEASSELECTORDER_DATE,ORD.PRODUCT_ID,CATEGORY,SUB_CATEGORY,PRODUCT_NAME,SUM(SALES),SUM(QUANTITY),SUM(PROFIT)FROMOFA_ORDERS ORDLEFT JOINOFA_PRODUCTS PRDON ORD.PRODUCT_ID = PRD.PRODUCT_IDGROUP BYORDER_DATE,ORD.PRODUCT_ID,CATEGORY,SUB_CATEGORY,PRODUCT_NAME;
2-3 Materialized View Containing Only Joins
CREATE MATERIALIZED VIEW MV_SALES_JOINPARALLELBUILD IMMEDIATEREFRESH FORCEENABLE QUERY REWRITEASSELECTORD.ORDER_ID,ORDER_DATE,SHIP_DATE,SHIP_MODE,ORD.CUSTOMER_ID,ORD.PRODUCT_ID,SALES,QUANTITY,DISCOUNT,PROFIT,CATEGORY,SUB_CATEGORY,PRODUCT_NAME,CUSTOMER_NAME,SEGMENT,COUNTRY_REGION,CITY,STATE,CTM.REGION,POSTAL_CODE,RETURNEDFROMOFA_ORDERS ORDLEFT JOINOFA_PRODUCTS PRDON ORD.PRODUCT_ID = PRD.PRODUCT_IDLEFT JOINOFA_CUSTOMERS CTMON ORD.CUSTOMER_ID = CTM.CUSTOMER_IDLEFT JOINOFA_RETURNS RETON ORD.ORDER_ID = RET.ORDER_ID;
2-4 Nested Materialized View
CREATE MATERIALIZED VIEW MV_SALES_ANALYTICSPARALLELBUILD IMMEDIATEREFRESH COMPLETEENABLE QUERY REWRITEASSELECTTO_CHAR(ORDER_DATE, 'YYYYMM') ORDER_YM,CATEGORY CATEGORY,SUB_CATEGORY SUB_CATEGORY,SEGMENT SEGMENT,COUNTRY_REGION COUNTRY_REGION,CITY CITY,STATE STATE,REGION REGION,SUM(SALES) SUM_SALES,SUM(QUANTITY) SUM_QUANTITY,SUM(PROFIT) SUM_PROFITFROMMV_SALES_JOINGROUP BYTO_CHAR(ORDER_DATE, 'YYYYMM'),CATEGORY,SUB_CATEGORY,SEGMENT,COUNTRY_REGION,CITY,STATE,REGION;
SELECTORDER_YM,STATE,CITY,SUM(SUM_SALES) TOTAL_SALES,SUM(SUM_PROFIT) TOTAL_PROFIT,SUM(SUM_PROFIT) / SUM(SUM_SALES) PROFIT_RATIO,NTH_VALUE(SUM(SUM_PROFIT), 1) OVER(PARTITION BY ORDER_YM ORDER BY SUM(SUM_PROFIT) DESCROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) TOPPROFIT_YMFROMMV_SALES_ANALYTICSGROUP BYORDER_YM,STATE,CITYORDER BYORDER_YM,STATE,CITY;
3 Partitioned Materialized View
CREATE MATERIALIZED VIEW MV_SALES_PRODUCT_PARTITIONED(ORDER_DATE, PRODUCT_ID, CATEGORY, SUB_CATEGORY, PRODUCT_NAME, SUM_SALES, SUM_QUANTITY, SUM_PROFIT)PARALLELPARTITION BY RANGE (ORDER_DATE) INTERVAL (NUMTOYMINTERVAL(1, 'YEAR'))(PARTITION "P2014" VALUES LESS THAN (TO_DATE('2015-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')),PARTITION "P2015" VALUES LESS THAN (TO_DATE('2016-01-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS')))BUILD IMMEDIATEREFRESH FASTENABLE QUERY REWRITEASSELECTORDER_DATE,ORD.PRODUCT_ID,CATEGORY,SUB_CATEGORY,PRODUCT_NAME,SUM(SALES),SUM(QUANTITY),SUM(PROFIT)FROMOFA_ORDERS ORDLEFT JOINOFA_PRODUCTS PRDON ORD.PRODUCT_ID = PRD.PRODUCT_IDGROUP BYORDER_DATE,ORD.PRODUCT_ID,CATEGORY,SUB_CATEGORY,PRODUCT_NAME;
4 Refresh Materialized View
- DBMS_MVIEW.FEFRESH - Refresh one or more materialized view.
- DBMS_MVIEW.FEFRESH_ALL_MVIEWS - Refresh all materialized views.
- DBMS_MVIEW.FEFRESH_DEPENDENT - Refresh all materialized views relying on the specified tables or materialized views.
EXEC DBMS_MVIEW.REFRESH('MV_SALES_JOIN', METHOD => 'C');
5 A Note about Compression
Wednesday, May 13, 2020
Oracle - How to Retrieve Execution Plan?
1. A Simple Way
If you have the SQL statement at hand, you can simply execute the following commands to retrieve the execution plan.
Here I use "SELECT * FROM EMP" script as an example.
EXPLAIN PLAN FOR ( SELECT * FROM EMP );
SELECT PLAN_TABLE_OUTPUT FROM TABLE( DBMS_XPLAN.DISPLAY );
2. Retrieve Execution Plan Using SQL_ID
2.1 Get SQL_ID
Basically using the SQL script shown as below can fetch the SQL_ID of the intended SQL statement. MODULE column will help you identify who issued the statement.
SELECT MODULE, SQL_ID, CHILD_NUMBER, LAST_LOAD_TIME, SQL_TEXT
FROM V$SQL
WHERE SQL_FULLTEXT LIKE '%FROM EMP%'
ORDER BY MODULE, LAST_LOAD_TIME DESC ;
There are some points that you may need to pay attention to.
- If your have a lengthy SQL statement, it is recommended to use SQL_FULLTEXT instead of SQL_TEXT in WHERE clause.
- If you are using an Oracle Real Application Cluster, you may want to use GV$SQL rather than V$SQL, which retrieves information from all eligible instances.
- LAST_LOAD_TIME help you identify the latest time when the query plan was loaded into the library cache.
2.2 Retrieve Execution Plan
After you get the proper SQL_ID, you can retrieve the Execution Plan via the following script. The 2nd parameter is CHILD_NUMBER, please make sure you have the right number as well.
SELECT PLAN_TABLE_OUTPUT
FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR('4ttqgu8uu8fus', 0, 'TYPICAL'));
As you might have noticed, TYPICAL is specified as the 3rd parameter in this example. It indicates what level information to be retrieved, as listed below.
- BASIC
- TYPICAL
- SERIAL
- ALL
To use it properly, you may need to set a corresponding statistical level, which can be done by the following commands.
Check statistics level: SHOW PARAMETER STATISTICS_LEVEL
Change statistics level: ALTER SESSION SET STATISTICS_LEVEL
React - Makeover in React: W3Schools How To
Last updated on: When it comes to building a React app, I get used to searching for a UI library to build the front pages, a straightforwar...