
InfluxDB

Time series

A purpose-built time series database optimized for high-speed writes and efficient time-based queries.

Description

InfluxDB is one of the most widely used open-source time series databases, designed specifically for metrics, events, and analytics data that carry timestamps. Unlike general-purpose databases, InfluxDB is optimized for workloads where data arrives continuously at a high rate and queries are primarily time-based. Data is organized into measurements (similar to tables), tags (indexed metadata), fields (the actual values), and timestamps. Tags are indexed automatically for fast querying, whereas fields are not, so filtering on a field value means scanning the matching data. The storage engine is built on the Time-Structured Merge Tree (TSM), which compresses data efficiently and makes time-range queries fast. InfluxDB 1.x has built-in retention policies (automatic deletion of old data) and continuous queries (automatic aggregation), which together support downsampling. InfluxQL is its SQL-like query language with time-series-specific functionality. InfluxDB 2.0 added Flux, a more powerful functional query language, as its primary query language. InfluxDB is widely used for monitoring, IoT sensor data, application metrics, and DevOps observability.
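To make the data model concrete, the following is a minimal sketch of how a point (measurement, tags, fields, timestamp) maps onto a single line of line protocol. The helper name `toLineProtocol` and the shape of the point object are assumptions made for illustration, and the escaping rules are simplified; in practice the official client libraries handle serialization.

```javascript
// Hypothetical helper: serializes one point to InfluxDB line protocol.
// Layout: measurement,tag1=v1,tag2=v2 field1=v1,field2=v2 timestamp

// Escape commas, equals signs, and spaces in tag keys/values and field keys.
const escapeKey = (s) => String(s).replace(/[,= ]/g, (c) => '\\' + c);

// Measurement names only need commas and spaces escaped.
const escapeMeasurement = (s) => String(s).replace(/[, ]/g, (c) => '\\' + c);

function formatFieldValue(v) {
  if (typeof v === 'number') return String(v);            // stored as a float
  if (typeof v === 'bigint') return `${v}i`;              // "i" suffix marks an integer
  if (typeof v === 'boolean') return v ? 'true' : 'false';
  // Strings are double-quoted; escape backslashes and double quotes.
  return `"${String(v).replace(/[\\"]/g, (c) => '\\' + c)}"`;
}

function toLineProtocol({ measurement, tags = {}, fields, timestampNs }) {
  const fieldPairs = Object.entries(fields)
    .map(([k, v]) => `${escapeKey(k)}=${formatFieldValue(v)}`);
  if (fieldPairs.length === 0) throw new Error('a point needs at least one field');

  // Tags are optional; sorting them by key gives a stable series key.
  const tagPart = Object.keys(tags).sort()
    .map((k) => `,${escapeKey(k)}=${escapeKey(tags[k])}`)
    .join('');

  const ts = timestampNs === undefined ? '' : ` ${timestampNs}`;
  return `${escapeMeasurement(measurement)}${tagPart} ${fieldPairs.join(',')}${ts}`;
}

const line = toLineProtocol({
  measurement: 'cpu',
  tags: { region: 'eu-west', host: 'server01' },
  fields: { value: 64.3 },
  timestampNs: 1609459200000000000n, // BigInt: nanosecond timestamps exceed Number precision
});
console.log(line);
// cpu,host=server01,region=eu-west value=64.3 1609459200000000000
```

The tags end up in the series key (and therefore in the index), while the field value is only stored, which is why the choice between tag and field matters for query speed.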

Features

  • Purpose-built for time series data
  • High write throughput
  • Automatic data retention policies
  • Continuous queries for downsampling
  • Built-in HTTP API
  • InfluxQL and Flux query languages
  • Efficient compression (10-100x)
  • Kapacitor for alerting
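As an illustration of what downsampling means in practice, the sketch below computes per-window means over raw points, roughly what a continuous query with `MEAN(value)` and `GROUP BY time(5m)` produces. It runs entirely in memory; the function name `downsampleMean` and the `{ time, value }` point shape are assumptions for this example, not part of any InfluxDB API.

```javascript
// Hypothetical sketch of the arithmetic behind a 5-minute MEAN downsample.
// Points are { time: epoch milliseconds, value: number }.
function downsampleMean(points, windowMs) {
  const buckets = new Map(); // window start -> { sum, count }
  for (const { time, value } of points) {
    // Window boundaries are aligned to the Unix epoch.
    const start = Math.floor(time / windowMs) * windowMs;
    const bucket = buckets.get(start) ?? { sum: 0, count: 0 };
    bucket.sum += value;
    bucket.count += 1;
    buckets.set(start, bucket);
  }
  // Emit one aggregated point per window, ordered by time.
  return [...buckets.entries()]
    .sort(([a], [b]) => a - b)
    .map(([start, { sum, count }]) => ({ time: start, mean: sum / count }));
}

const t0 = Date.UTC(2021, 0, 1); // 2021-01-01T00:00:00Z
const raw = [
  { time: t0, value: 60 },
  { time: t0 + 60_000, value: 70 },     // same 5-minute window as the first point
  { time: t0 + 5 * 60_000, value: 80 }, // next window
];
const downsampled = downsampleMean(raw, 5 * 60_000);
console.log(downsampled);
// [ { time: 1609459200000, mean: 65 }, { time: 1609459500000, mean: 80 } ]
```

Storing only the aggregated points under a longer retention policy, and the raw points under a short one, is the usual way these two features are combined.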

Query example

-- InfluxDB InfluxQL (SQL-like, InfluxDB 1.x)

-- Insert data (INSERT is an influx CLI command; over HTTP, POST the line protocol to /write)
INSERT cpu,host=server01,region=eu-west value=64.3 1609459200000000000
INSERT cpu,host=server01,region=eu-west value=72.1 1609459260000000000
INSERT memory,host=server01,region=eu-west used=8.2,total=16.0 1609459200000000000

-- SELECT basics
SELECT * FROM cpu WHERE time > now() - 1h

-- Aggregations
SELECT MEAN(value) FROM cpu 
WHERE time > now() - 1h 
GROUP BY time(5m), host

-- Multiple fields
SELECT used, total, (used/total)*100 AS usage_percent 
FROM memory 
WHERE time > now() - 24h

-- Window functions
SELECT DERIVATIVE(MEAN(value), 1s) 
FROM cpu 
WHERE time > now() - 1h 
GROUP BY time(1m)

-- SHOW commands
SHOW MEASUREMENTS
SHOW TAG KEYS FROM cpu
SHOW TAG VALUES FROM cpu WITH KEY = "host"
SHOW FIELD KEYS FROM cpu

-- Retention policy
CREATE RETENTION POLICY "one_week" ON "mydb" 
DURATION 7d 
REPLICATION 1 
DEFAULT

-- Continuous query (auto-aggregation)
CREATE CONTINUOUS QUERY "cpu_mean_5m" ON "mydb"
BEGIN
  SELECT MEAN(value) AS mean_value 
  INTO "cpu_mean_5m"
  FROM "cpu"
  GROUP BY time(5m), *
END

// Flux query language (InfluxDB 2.0+); Flux comments start with //
from(bucket: "metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu")
  |> filter(fn: (r) => r.host == "server01")
  |> aggregateWindow(every: 5m, fn: mean)
  |> yield(name: "mean")

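// JavaScript client example (@influxdata/influxdb-client, InfluxDB 2.x)
const { InfluxDB, Point } = require('@influxdata/influxdb-client');

const client = new InfluxDB({ url: 'http://localhost:8086', token: 'mytoken' });

// Top-level await is not available in CommonJS, so wrap the calls in an async function.
async function main() {
  // Write data
  const writeApi = client.getWriteApi('myorg', 'mybucket');
  const point = new Point('cpu')
    .tag('host', 'server01')
    .tag('region', 'eu-west')
    .floatField('value', 64.3);

  writeApi.writePoint(point);
  await writeApi.close(); // flushes buffered points before closing

  // Query data
  const queryApi = client.getQueryApi('myorg');
  const query = `from(bucket: "mybucket")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu")`;

  // Each row arrives as raw values plus table metadata; toObject maps them to named columns.
  for await (const { values, tableMeta } of queryApi.iterateRows(query)) {
    console.log(tableMeta.toObject(values));
  }
}

main().catch(console.error);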

Use cases

  • Server and application monitoring (metrics)
  • IoT sensor data collection
  • Real-time analytics dashboards
  • DevOps observability (Prometheus alternative)
  • Financial tick data

Advantages

  • Very fast time series writes
  • Excellent data compression
  • Built-in retention policies
  • SQL-like query language (InfluxQL)
  • Good integration with Grafana

Disadvantages

  • Not well suited to non-time-series data
  • Clustering only available in the Enterprise edition
  • Limited UPDATE and DELETE support
  • Memory-intensive at high series cardinality
  • Schema design requires understanding tags vs. fields
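The cardinality and tags-vs-fields points are related: every distinct combination of measurement and tag values forms its own series, and each series costs index memory. The sketch below counts series in a batch of points to show how a high-variance tag inflates that number. The function names and point shape are assumptions for illustration, and the count is simplified (InfluxDB's own definition of series cardinality also takes field keys into account).

```javascript
// Hypothetical sketch: count distinct series (measurement + tag set) in a batch.
function seriesKey({ measurement, tags = {} }) {
  // Sort tag keys so that tag order does not create spurious series.
  const tagPart = Object.keys(tags).sort().map((k) => `${k}=${tags[k]}`).join(',');
  return tagPart ? `${measurement},${tagPart}` : measurement;
}

function seriesCardinality(points) {
  return new Set(points.map(seriesKey)).size;
}

const points = [
  { measurement: 'cpu', tags: { host: 'server01', region: 'eu-west' } },
  { measurement: 'cpu', tags: { region: 'eu-west', host: 'server01' } }, // same series as above
  { measurement: 'cpu', tags: { host: 'server02', region: 'eu-west' } },
  // A unique ID stored as a tag creates a new series for every point:
  { measurement: 'http', tags: { request_id: 'a1' } },
  { measurement: 'http', tags: { request_id: 'a2' } },
];
console.log(seriesCardinality(points)); // 4
```

Values that are unbounded or unique per point, such as request IDs, are usually better stored as fields; values you filter or group by and that come from a small set, such as host or region, are good tag candidates.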

Best for

  • Monitoring and metrics collection (Prometheus alternative)
  • High-frequency IoT sensor data
  • Application performance monitoring (APM)
  • Real-time analytics on streaming data
  • DevOps observability stacks

Not recommended for

  • Transactional workloads (OLTP)
  • Complex joins and relational queries
  • Frequent updates to historical data
  • Document storage
  • Data without timestamps

Related databases

TimescaleDB, Prometheus, Graphite, OpenTSDB, Druid