WL#2866: Materialized Views (reduced support)
Affects: Server-7.1
—
Status: Assigned
In September 2005, at a meeting in Chicago, there was sentiment for materialized views, Oracle-like, but without refreshing. This preliminary description is taken from an old worklog task, WL#941 "Views". Feature: Permanent Materialized Views ------------------------------------- This feature goes by different names: AST or MQT in DB2, Indexed Views in SQL Server, and Materialized Views in Oracle (an older Oracle incarnation was "Snapshots"). The idea is: a view (usually a grouped view or join view) can be materialized permanently, so that queries of summary data will be quick. In other words, the "temporary table" that we produce with the temporary-table algorithm isn't a temporary table, it's a permanent table. It's hard to maintain the summary if anyone changes the base table. I am proposing: we would not "refresh" permanent materialized views frequently -- instead, we would allow them to become stale. When the server restarts, the permanent materialized views empty. When a permanent materialized view is queried for the first time, it is populated. The above proposal is unsophisticated. Our competitors allow periodic refresh, and can update the permanent materialized view by checking log entries. At one time I believed that there would be a standard, but "CREATE MATERIALIZED VIEW" isn't in SQL:2008. A comment from a major customer is: "Usually database server restarts never or restarts once within x months. Therefore it is of no use to wait for the next restart for refreshing the view. On the other hand: the first application preparing the view will not like to wait for its preparation." At a meeting with that customer, the decisions were: (a) this is not relevant for the main product, and not needed in the data warehouse environment (b) "it is a permanent temporary table that is crontabbed", (c) "this is not really a view", (d) Permanent Materialized Views should be materialized (persistent) forever, (e) There should be options for resynching, as part of the VIEW definition.
Support creation of materialized views, with only the bare minimum -- no automatic refresh, no query rewrite. Syntax, stated simply --------------------- CREATE [IGNORE] MATERIALIZED VIEW materialized_view_name [(column_list)] [table_options] [partition_options] AS select_statement ; Example: CREATE MATERIALIZED VIEW mv AS SELECT * FROM t1; The effect is like "CREATE TABLE mv AS SELECT * FROM t1;", except that the new object 'mv' is considered to be a view (although it's really more like a base table). Syntax, with detail notes ------------------------- CREATE /* Note 1 */ [IGNORE] /* Note 2 */ MATERIALIZED VIEW /* Note 3 */ materialized_view_name /* Note 4 */ [(column_list)] /* Note 5 */ [table_options] /* Note 6 */ [partition_options] /* Note 7 */ AS /* Note 9 */ select_statement ; /* Note 10 */ Note 1: The syntax in this area differs from CREATE VIEW: no "CREATE OR REPLACE MATERIALIZED VIEW ..." (see also WL#3129) no "CREATE ALGORITHM = temptable MATERIALIZED VIEW ..." no "CREATE MATERIALIZED VIEW ... AS SELECT ... WITH CHECK OPTION" no "CREATE DEFINER=current_user MATERIALIZED VIEW ..." no "CREATE SQL SECURITY DEFINER MATERIALIZED VIEW ...". We also disallow options that are possible with CREATE TABLE: no "CREATE TEMPORARY MATERIALIZED VIEW ..." no "CREATE MATERIALIZED VIEW IF NOT EXISTS ..." (but see also WL#3129 "Consistent clauses in CREATE and DROP"). Note 2: See the later section "The IGNORE Clause". Note 3: MATERIALIZED (with a Z) is the only legal spelling. MATERIALIZED is not a new reserved word. Note 4: materialized_view_name is the name of a [viewed] table so it follows the same rules. It can be qualified. Its maximum length is 64 Unicode characters. It's in the same namespace as base tables and views, so if there's a base table named 'x', you can't create a materialized view named 'x'. (This is also true in Oracle.) Note 5: the "column_list" is a comma-delimited list of column names, comparable to "CREATE VIEW ... (column_list)". But we allow these MySQL-specific additional clauses which are comparable to clauses in CREATE TABLE: * COMMENT 'string' * [COLUMN_FORMAT {FIXED|DYNAMIC|DEFAULT}] * [STORAGE {DISK|MEMORY}] Note 6: The "[table_options]" clause is like the "[table_options]" clause in CREATE TABLE. See the description of CREATE TABLE: http://dev.mysql.com/doc/mysql/en/create-table.html Originally the thought was that [table_options] would come after "AS SELECT ...", but that might be harder to parse, and people are used to "CREATE TABLE ... [table_options] AS SELECT ...". Certainly the table_option "ENGINE=" makes sense, for example "CREATE MATERIALIZED VIEW v (column1) ENGINE=FALCON AS SELECT 5;". Other sensible table options might be AVG_ROW_LENGTH, CHECKSUM, COMMENT, DATA DIRECTORY, DELAY_KEY_WRITE, INDEX DIRECTORY, MAX_ROWS, MIN_ROWS, PACK_KEYS, ROW_FORMAT, TABLESPACE. But the implementor may choose to ignore any or all table options. Note 7: The [partition_options] clause is accepted. This may introduce a form of constraint, causing either CREATE MATERIALIZED VIEW or REFRESH TABLE to fail. Note 8: It's "AS SELECT" not "[AS] SELECT". AS is not optional. Note 9: The select_statement rules are the same as for "CREATE VIEW ... AS SELECT". Since this is like an ALGORITHM=TEMPTABLE view, there are no restrictions. Oracle would require a selection including a primary key. Example showing every legal clause: CREATE IGNORE MATERIALIZED VIEW test.`Materialized View` (column1, column2) ENGINE=FALCON AS SELECT * FROM base_table; Privileges ---------- Required privileges: same as for views. See the manual description: http://dev.mysql.com/doc/mysql/en/create-view.html (This differs from Oracle. Oracle requires two privileges: CREATE [ANY] MATERIALIZED VIEW and CREATE [ANY] TABLE.) TODO note: Peter will look into 'definer' semantics. The privilege required for REFRESH probably must be more than SELECT on materialized_view_name, it should be SELECT on underlying objects unless DEFINER is user. Other statements that may refer to materialized views directly -------------------------------------------------------------- These statements will work: CREATE INDEX index_name ON materialized_view_name ...; DROP INDEX index_name ON materialized_view_name ...; DROP VIEW materialized_view_name; (This differs from Oracle. Oracle has a statement "DROP MATERIALIZED VIEW".) GRANT SELECT ON materialized_view_name ...; REVOKE SELECT ON materialized_view_name ...; SELECT ... FROM materialized_view_name; (All forms of SELECT will work. The materialized view looks just like any 'base table', internally.) CREATE VIEW ... AS SELECT ... FROM materialized_view_name; LOCK TABLES ... materialized_view_name ...; (LOCK TABLE materialized_view_name READ prelocks only the materialized_view_name. LOCK TABLE materialized_view_name WRITE prelocks all the tables involved in the view definition.) These statements will not work: CREATE TABLE (... FOREIGN KEY ... REFERENCES materialized_view_name ...) HANDLER materialized_view_name OPEN ... RENAME TABLE materialized_view_name ... But these statements will work: ALTER TABLE materialized_view_name ... (for partition changes and renaming), ANALYZE, CHECK TABLE, CHECKSUM, OPTIMIZE, REPAIR. And all forms of data-change statement will not work! INSERT INTO materialized_view_name ... -- no! UPDATE materialized_view_name ... -- no! DELETE FROM materialized_view_name ... -- no! LOAD DATA ... materialized_view_name ... -- no! TRUNCATE materialized_view_name ... no! REPLACE materialized_view_name ... no! It is illegal to refer to a materialized view in any data-change statement, directly or indirectly (that is, you can't refer to a materialized view in a function or stored procedure or trigger which an update/insert/delete uses). The intent of this restriction is to avoid replication. Also, data-change effects would be cancelled during refresh. So this is a difference from DB2, which allows updatable materialized views, or in DB2 dialect "user-maintainable MQTs". Refresh ------- We allow materialized views to get out of date (preferred word: "stale"). It is possible to bring a single one back up to date with: REFRESH [IGNORE] [MATERIALIZED] VIEW materialized_view_name; Privileges required: SELECT on materialized_view_name. Example: REFRESH VIEW mv; The effect is like "DELETE FROM mv;" followed by "INSERT INTO mv SELECT view_select_clause". Thus the refreshing is always complete. There are no tricks for keeping a materialized view up to date by changing it when the underlying tables change. If materialized_view_2 is based on materialized_view_1, then "REFRESH MATERIALIZED VIEW materialized_view_2" does not guarantee that materialized_view_2 is up to date, because materialized_view_1 might also be out of date. Refreshing can happen according to a schedule, if you use CREATE EVENT. Oracle has no REFRESH statement. Oracle has clauses in the CREATE MATERIALIZED VIEW statement for refreshing, including scheduled refreshes or "on commit" refreshes. DB2 has a REFRESH TABLE statement but we decided not to copy DB2 http://publib.boulder.ibm.com/infocenter/db2luw/v8//index.jsp?topic=/com.ibm.db2.udb.doc/core/c0009318.htm REFRESH is not a "data manipulation statement". The IGNORE Clause ----------------- It is possible to cause an error due to partitioning. For example: CREATE TABLE t (s1 INT); INSERT INTO t VALUES (1),(2),(3); CREATE MATERIALIZED VIEW mv PARTITION BY LIST (s1) (PARTITION P1 VALUES IN (1,3)) AS SELECT * FROM t; The result is: ERROR 1526 (HY000): Table has no partition for value 2 But we have decided that partition errors can be ignorable, see WL#4103 "Define IGNORE". So, for both CREATE MATERIALIZED VIEW and REFRESH, we allow IGNORE. The effect would be: the row is skipped, there is no error. Since there is a syntax "ALTER IGNORE TABLE" (with the word IGNORE preceding the word TABLE), the syntax is CREATE IGNORE MATERIALIZED VIEW and REFRESH IGNORE MATERIALIZED VIEW rather than CREATE MATERIALIZED VIEW IGNORE and REFRESH MATERIALIZED VIEW IGNORE. If one says CREATE IGNORE, then that is a characteristic of the table which affects subsequent statements. The metadata will show that the view was created with IGNORE. REFRESH TABLE will have the same effect as REFRESH IGNORE TABLE. ALTER [IGNORE] TABLE materialized_view_name is also legal. Query Rewrite ------------- We won't do query rewrite. Although "query rewrite" is the more common term, MySQL will use "materialized view rewrite", due to majority vote during the March 2008 Stockholm meeting (in favour Timour and Konstantin and Mikael, against Igor and Peter). This will have significance for later worklog tasks. Metadata -------- We will treat materialized views like views, therefore SHOW CREATE VIEW and SHOW CREATE TABLE will show materialized views, SHOW [OPEN] TABLES will show materialized views, SELECT * FROM INFORMATION_SCHEMA.VIEWS will show materialized views. In INFORMATION_SCHEMA.TABLES, TABLE_TYPE = 'MATERIALIZED VIEW'. And INFORMATION_SCHEMA.TABLES.UPDATE_TIME = time of last refresh, since no other form of update is possible. In INFORMATION_SCHEMA.VIEWS, the CHECK_OPTION value is always 'NONE' and IS_UPDATABLE is always 'NO' because materialized views are not updatable. We will want three more columns: * VIEW_TYPE CHAR(12). Possible values: 'MATERIALIZED' or 'ORDINARY' * LAST_REFRESH_TIME TIMESTAMP. Possible values: the timestamp of the CREATE MATERIALIZED VIEW statement, or the timestamp of the last REFRESH statement. Perhaps this isn't necessary, since you can say "CREATE MATERIALIZED VIEW ... AS SELECT CURRENT_TIMESTAMP ..." if you want. * IGNORABLE CHAR(3). Possible values: 'YES' or 'NO'. There is no way to find out whether the materialized view is up to date, because we don't have triggers or timestamps of-last-change for every database object and variable that the materialized view might refer to. This differs from Oracle, which would have separate metadata tables for materialized versus ordinary views, for example ALL_MVIEWS. http://download.oracle.com/docs/cd/B28359_01/server.111/b28320/statviews_1140.htm#sthref1243 Backup and Replication ---------------------- Since you can't update a materialized view: * There will be no effect on replication. REFRESH MATERIALIZED VIEW won't be replicated. * BACKUP DATABASE won't backup materialized view contents. It is true that materialized view contents will be different on different machines, because the SELECT is not deterministic. But that happens with any view. It is true that in the Oracle world one of the uses of materialized views is selective replication. But that is not our interest for this worklog item. Examples -------- CREATE TABLE t (s1 INT); INSERT INTO t VALUES (1); INSERT INTO t VALUES (2); INSERT INTO t VALUES (3); CREATE MATERIALIZED VIEW mv1 AS SELECT * FROM t; CREATE VIEW v (view_column_1) AS SELECT COUNT(*) FROM t; CREATE IGNORE MATERIALIZED VIEW mv4 AS SELECT * FROM t,v; CREATE MATERIALIZED VIEW mv5 (column1, column2 COMMENT 'comment', column3 FORMAT DEFAULT, column4 STORAGE DISK) ENGINE=MEMORY PARTITION BY KEY(column1) AS SELECT 1,2,3,4,COUNT(view_column_1) FROM v,information_schema.tables GROUP BY column1, column2; ALTER MATERIALIZED VIEW mv5 REMOVE PARTITIONING; CREATE INDEX i ON mv5 (column2); DROP INDEX i ON mv5; GRANT SELECT on mv5 TO joe@localhost; REVOKE SELECT ON mv5 FROM joe@localhost; ANALYZE TABLE mv5; CHECK TABLE mv5; CHECKSUM TABLE mv5; REPAIR TABLE mv5; INSERT INTO t VALUES (4); /* Now materialized views may be out of date, because they refer ultimately to table t, and table t has a new row. So refresh them. There are 5 separate statements because we didn't allow "REFRESH TABLE table [,table...];". */ REFRESH MATERIALIZED VIEW mv1; REFRESH MATERIALIZED VIEW mv2; REFRESH MATERIALIZED VIEW mv3; REFRESH MATERIALIZED VIEW mv4; REFRESH MATERIALIZED VIEW mv5; SELECT * FROM INFORMATION_SCHEMA.VIEWS WHERE table_name = 'mv5'; SELECT * FROM mv5; DROP VIEW mv5; References ---------- Since the request was for "materialized views" we'll follow Oracle for most things, but some study of the DB2 and SQL Server features could be profitable. "CREATE MATERIALIZED VIEW" (Oracle11g) http://download-uk.oracle.com/docs/cd/B28359_01/server.111/b28286/statements_6002.htm#i2063793 "Materialized Query Tables" (DB2 v8.2) http://publib.boulder.ibm.com/infocenter/db2help/index.jsp?topic=/com.ibm.db2.udb.doc/core/c0009318.htm "Improving Performance with SQL Server 2005 Indexed Views" http://www.microsoft.com/technet/prodtechnol/sql/2005/ipsql05iv.mspx "Refactoring MySQL to Support Materialized Views" http://www.cs.uwaterloo.ca/~hsmit/reports/cs746_report.pdf
Copyright (c) 2000, 2024, Oracle Corporation and/or its affiliates. All rights reserved.