Materialized Views
YdeevneForudberegnede forespørgselsresultater gemt som fysiske tabeller for hurtig adgang til komplekse aggregeringer og JOINs.
Beskrivelse
Materialized Views er forespørgselsresultater der er gemt fysisk på disk, i modsætning til normale views der er virtuelle. En normal view er bare en gemt forespørgsel der udføres hver gang den kaldes, mens en materialized view faktisk gemmer resultatet. Dette gør dem perfekte til dyre forespørgsler med komplekse JOINs, aggregeringer eller beregninger der bruges ofte men ændres sjældent. Afvejningen er at data kan blive forældet - materialized views skal opdateres periodisk for at afspejle ændringer i underliggende tabeller. Opdatering kan være COMPLETE (genopbyg hele viewet) eller INCREMENTAL (kun ændringer siden sidste opdatering). PostgreSQL understøtter materialized views indbygget. Oracle har avancerede funktioner som query rewrite hvor optimizeren automatisk bruger materialized views. I systemer uden indbygget understøttelse kan samme effekt opnås med denormaliserede tabeller og triggers. Materialized views er særligt værdifulde i data warehouses og analyse hvor samme komplekse forespørgsler køres ofte (dashboards, rapporter).
Problem
Komplekse analytiske forespørgsler med mange JOINs og aggregeringer kan tage minutter at køre. Hvis samme forespørgsel køres ofte (fx dashboard der opdateres hver 5 min), spildes enorme ressourcer på at genberegne samme resultater. Brugere oplever langsomme dashboards.
Løsning
Materialized views forudberegner og gemmer forespørgselsresultater. Første forespørgsel tager tid, men efterfølgende forespørgsler er øjeblikkelige fordi de læser fra gemte resultater. Views opdateres periodisk (fx hver time) for at holde data nogenlunde friske. Dette giver 1000x hastighedsforøgelse for komplekse forespørgsler.
Eksempel
-- PostgreSQL Materialized View
-- Complex query der tager lang tid:
SELECT
c.category_name,
COUNT(DISTINCT o.order_id) as order_count,
COUNT(DISTINCT o.customer_id) as customer_count,
SUM(oi.quantity * oi.price) as total_revenue,
AVG(oi.quantity * oi.price) as avg_order_value
FROM categories c
JOIN products p ON c.category_id = p.category_id
JOIN order_items oi ON p.product_id = oi.product_id
JOIN orders o ON oi.order_id = o.order_id
WHERE o.order_date >= '2024-01-01'
GROUP BY c.category_id, c.category_name
ORDER BY total_revenue DESC;
-- Takes 30 seconds med millions of rows
-- Opret materialized view
CREATE MATERIALIZED VIEW sales_by_category AS
SELECT
c.category_name,
COUNT(DISTINCT o.order_id) as order_count,
COUNT(DISTINCT o.customer_id) as customer_count,
SUM(oi.quantity * oi.price) as total_revenue,
AVG(oi.quantity * oi.price) as avg_order_value,
CURRENT_TIMESTAMP as last_updated
FROM categories c
JOIN products p ON c.category_id = p.category_id
JOIN order_items oi ON p.product_id = oi.product_id
JOIN orders o ON oi.order_id = o.order_id
WHERE o.order_date >= '2024-01-01'
GROUP BY c.category_id, c.category_name;
-- Create index på materialized view
CREATE INDEX idx_sales_category
ON sales_by_category(category_name);
-- Query materialized view (instant!)
SELECT * FROM sales_by_category
ORDER BY total_revenue DESC;
-- Takes milliseconds!
-- Refresh materialized view
REFRESH MATERIALIZED VIEW sales_by_category;
-- Rebuilds view med fresh data
-- Concurrent refresh (no locks)
REFRESH MATERIALIZED VIEW CONCURRENTLY sales_by_category;
-- Requires UNIQUE index
-- Automatic refresh med cron/scheduler
-- Kør hver time:
REFRESH MATERIALIZED VIEW sales_by_category;
-- Scheduled refresh (PostgreSQL pg_cron)
SELECT cron.schedule(
'refresh-sales',
'0 * * * *', -- Every hour
$$REFRESH MATERIALIZED VIEW sales_by_category$$
);
-- Conditional refresh (kun hvis data ændret)
DO $$
BEGIN
IF EXISTS (
SELECT 1 FROM orders
WHERE updated_at > (
SELECT last_updated FROM sales_by_category LIMIT 1
)
) THEN
REFRESH MATERIALIZED VIEW sales_by_category;
END IF;
END $$;
-- MySQL doesn't have native materialized views
-- Workaround: Create table + refresh logic
-- Create summary table
CREATE TABLE sales_by_category_mv (
category_name VARCHAR(100),
order_count INT,
customer_count INT,
total_revenue DECIMAL(12,2),
avg_order_value DECIMAL(10,2),
last_updated TIMESTAMP,
PRIMARY KEY (category_name)
);
-- Procedure to refresh
DELIMITER $$
CREATE PROCEDURE refresh_sales_by_category()
BEGIN
TRUNCATE TABLE sales_by_category_mv;
INSERT INTO sales_by_category_mv
SELECT
c.category_name,
COUNT(DISTINCT o.order_id),
COUNT(DISTINCT o.customer_id),
SUM(oi.quantity * oi.price),
AVG(oi.quantity * oi.price),
NOW()
FROM categories c
JOIN products p ON c.category_id = p.category_id
JOIN order_items oi ON p.product_id = oi.product_id
JOIN orders o ON oi.order_id = o.order_id
WHERE o.order_date >= '2024-01-01'
GROUP BY c.category_id, c.category_name;
END$$
DELIMITER ;
-- Call procedure to refresh
CALL refresh_sales_by_category();
-- Schedule med MySQL Event Scheduler
CREATE EVENT refresh_sales_hourly
ON SCHEDULE EVERY 1 HOUR
DO CALL refresh_sales_by_category();
-- Incremental refresh pattern
-- Track changes i separate table
CREATE TABLE sales_changes (
order_id INT,
changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Trigger to track changes
CREATE TRIGGER track_order_changes
AFTER INSERT OR UPDATE ON orders
FOR EACH ROW
INSERT INTO sales_changes (order_id) VALUES (NEW.order_id);
-- Incremental refresh (only changed data)
UPDATE sales_by_category_mv mv
SET
order_count = (
SELECT COUNT(*) FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
WHERE oi.product_id IN (
SELECT product_id FROM products
WHERE category_name = mv.category_name
)
),
last_updated = NOW()
WHERE category_name IN (
SELECT DISTINCT p.category_name
FROM sales_changes sc
JOIN orders o ON sc.order_id = o.order_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
);
-- Node.js scheduled refresh
const cron = require('node-cron');
// Refresh every hour
cron.schedule('0 * * * *', async () => {
console.log('Refreshing materialized view...');
await db.query(
'REFRESH MATERIALIZED VIEW CONCURRENTLY sales_by_category'
);
console.log('Refresh complete');
});
-- Monitoring view freshness
SELECT
schemaname,
matviewname,
last_updated,
NOW() - last_updated as age
FROM sales_by_category;
-- Query rewrite (Oracle feature)
-- Oracle can automatically use materialized view
-- even if query references base tables!
CREATE MATERIALIZED VIEW sales_mv
ENABLE QUERY REWRITE AS
SELECT category_id, SUM(revenue) FROM sales
GROUP BY category_id;
-- Query base table:
SELECT category_id, SUM(revenue) FROM sales
GROUP BY category_id;
-- Oracle automatically uses sales_mv!
-- Real-world dashboard example
-- Daily active users materialized view
CREATE MATERIALIZED VIEW daily_active_users AS
SELECT
DATE(login_time) as date,
COUNT(DISTINCT user_id) as dau,
COUNT(*) as total_logins
FROM user_sessions
WHERE login_time >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY DATE(login_time);
-- Refresh daily at 1 AM
SELECT cron.schedule(
'refresh-dau',
'0 1 * * *',
$$REFRESH MATERIALIZED VIEW daily_active_users$$
);Fordele
- ✓Drastisk hurtigere komplekse forespørgsler
- ✓Reduceret databasebelastning
- ✓Konsistent forespørgselsydeevne
- ✓Kan indeksere på view-resultater
- ✓Perfekt til dashboards og rapporter
Udfordringer
- ⚠Data kan være forældet
- ⚠Opdatering kan tage lang tid
- ⚠Lagrings-overhead
- ⚠Kompleksitet ved samtidig opdatering
- ⚠Vedligeholdelse af opdateringsplan
Anvendelsesområder
- •Analyse-dashboards
- •Rapportgenerering
- •Data warehousing
- •Komplekse aggregeringer
- •Læsetunge arbejdsbelastninger
Eksempler fra den virkelige verden
- •Business intelligence-dashboards
- •E-handels salgsrapporter
- •Sociale medie-analyser
- •Finansielle opsummeringer
- •Kunde 360-visninger