Table of contents
- What is a materialized view?
- Do I have to recalculate the entire materialized view
- MySQL already supports views, why use Flexviews?
- What are the requirements?
- Why Can't I just use regular SQL to create the views?
- I don't want to learn a SQL API!
- Do I need to use triggers?
- Can I update views synchronously (on commit?)
- Is there a query rewrite facility available?
- What SQL features can be used in combination with incremental refresh?
Frequently asked questions (FAQ):
Q: What is a materialized view?
A materialized view is like a regular view, except that the results are stored into an actual database table, not a virtual one. Since the results are stored in a physical table, it would be very expensive to update the table every time a row changes. Instead the table is updated periodically. This is called a refresh.
Q: Do I have to recalculate the entire materialized view when I refresh it?
A: Flexviews supports two different refresh methods: COMPLETE and INCREMENTAL. You specify the refresh method when you create the view. The SQL converter will tell you if the view can be incrementally refreshed based on the input SQL. See below for a list of limitations for incremental refresh. An incremental refresh will only read data based on the row values which have changed since the last delta computation. This is much more efficient, and can be orders of magnitude faster than a complete refresh. Complete refresh rebuilds the table from scatch based on the SQL definition for the view.
Q: MySQL already supports views, why use Flexviews?
MySQL does not support materialized views. It does support CREATE TABLE AS .. SELECT (CTAS), but this is a poor substitute for a number of reasons.
- MySQL does not store metadata about the SELECT statement that created the table. Flexviews keeps this metadata.
- This means that with CTAS it may not be easy to recreate the table because you may not know the exact SQL statement that created the table. With flexviews it is a single call: flexviews.refresh(...)
- The entire table has to be recalculated from scratch. Wouldn't it be nice to be able to just figure out what changed? The refresh function does this for incrementally refreshable views!
Q: What are the requirements?
- MySQL 5.1
- binlog_format=row, log_slave_updates (may use 5.0 master)
- large stack size
- shell access to mysqlbinlog utility
- SUPER privileges WITH GRANT OPTION
- PHP 5.2 or greater
- CLI
- mysql extension
- pcntl extension(optional)
- PHP is used for FlexCDC and for utility scripts, but the majority of Flexviews is written in MySQL stored procedures.
- Unix shell access (not tested with windows)
- RoboDoc (only if you want to rebuild the documentation)
What is the SQL_API?
The SQL_API is a set of stored procedures for creating materialized views, instead of creating them directly from a SELECT statement like a regular view or like other database systems.
You don't use the SQL_API to access the materialized view, just to create it!
Q: Why Can't I just use regular SQL to create the views?
Unfortunately, Flexviews doesn't have access to a SQL parser, since MySQL doesn't expose that functionality. Instead, it implements a functional API that allows you to programatically build a SQL statement from a series of function calls that combine SQL fragments into a SQL statement stored in the Flexviews data dictionary. Users of Map/Reduce type applications which functionally define operations on a data set may find this somewhat familiar. There are only a small handful of functions that you need to use to define a view. See the manual for more information.
Q: I don't want to learn a SQL API!
The performance benefits of view materialization outweigh the learning curve of the API. If that doesn't convince you, you can try to use the online SQL conversion tool. It will try to convert your query into a set of API calls. YMMV.
Q: How does it tell what changed in my database? Do I need to use triggers?
Flexviews includes FlexCDC, a change-data-capture utility which records changes into log tables, transactionally. You do not need to create triggers on any tables. FlexCDC reads from the binary log and then writes information about the changes into log tables. For this reason you must use the binlog_format=ROW option in my.cnf. Flexviews reads from the change log tables.
Q: What are the FlexCDC requirements?
PHP 5.2 or greater. A user with SUPER privileges. FlexCDC must be configured to use the 'flexviews' database.
Q: Can I update views synchronously (on commit?)
No. Only asynchronous refreshing is supported.
Q: Is there a query rewrite facility available?
If you use a tool like Mondrian, it can advise you on which summary tables to build, and it will use them automatically as long as they are named according to the proper naming specification. In general you will need to modify your queries to use the summary tables. If you already have summary tables which you application uses, then you can probably convert them directly to materialized views easily.
Q: What SQL features can be used in combination with incremental refresh?
Operation | Supported | Notes |
GROUP BY | Yes | All non-aggregate expressions must be in the GROUP BY clause. |
COUNT | Yes | (*) and (expression) are supported |
COUNT_DISTINCT | Yes | Experimental |
SUM | Yes | |
SUM_DISTINCT | No | |
AVG | Yes | Experimental |
AVG_DISTINCT | No | |
MIN | Yes | Experimental |
MIN_DISTINCT | No | |
MAX | Yes | Experimental |
MAX_DISTINCT | No | |
JOIN | Yes | INNER JOIN only. No CROSS JOIN, OUTER JOIN or cartesian products. |
WHERE | Yes | |
HAVING | No | (use a where clause when you select from the view) |
ORDER BY | No | (use an order by clause when you select from the view) |