Data Modeling

Query-First Design Principle

In Cassandra, design tables around the queries you need to serve, not around entity relationships. One query maps to one table.

-- Queries to support:
-- Q1: Get all orders for a user
-- Q2: Get orders by status (across users)
-- Q3: Get order by ID

-- Q1: orders_by_user
CREATE TABLE orders_by_user (
  user_id  UUID,
  order_id TIMEUUID,
  status   TEXT,
  total    DECIMAL,
  PRIMARY KEY (user_id, order_id)
) WITH CLUSTERING ORDER BY (order_id DESC);

-- Q2: orders_by_status (denormalized copy)
CREATE TABLE orders_by_status (
  status   TEXT,
  order_id TIMEUUID,
  user_id  UUID,
  total    DECIMAL,
  PRIMARY KEY (status, order_id)
) WITH CLUSTERING ORDER BY (order_id DESC);

-- Q3: orders_by_id
CREATE TABLE orders_by_id (
  order_id TIMEUUID PRIMARY KEY,
  user_id  UUID,
  status   TEXT,
  total    DECIMAL
);
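Each table above answers exactly one of the queries it was designed for. A sketch of the corresponding reads (parameter placeholders stand for bound values):

```sql
-- Q1: all orders for a user, newest first (clustering order is order_id DESC)
SELECT order_id, status, total FROM orders_by_user WHERE user_id = ?;

-- Q2: all orders in a given status, across users
SELECT order_id, user_id, total FROM orders_by_status WHERE status = ?;

-- Q3: a single order by ID
SELECT user_id, status, total FROM orders_by_id WHERE order_id = ?;
```

Note that every WHERE clause hits the table's partition key directly, so none of these queries needs ALLOW FILTERING.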

Partitioning Strategy

A good partition key distributes data evenly across the cluster; a bad one creates hot spots. Keep each partition to roughly 100 MB or less.

-- Anti-pattern: date as partition key (hot partition on today)
CREATE TABLE logs_bad (
  date    DATE,
  ts      TIMEUUID,
  message TEXT,
  PRIMARY KEY (date, ts)
);
-- All of today's logs land in ONE partition (held by just its replicas) = hot spot!

-- Better: bucket by hour to spread load
CREATE TABLE logs_by_hour (
  sensor_id TEXT,
  hour      TEXT,          -- '2024-01-15-10' (YYYY-MM-DD-HH)
  ts        TIMEUUID,
  message   TEXT,
  PRIMARY KEY ((sensor_id, hour), ts)
);
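The hour bucket is just a formatted string the application computes at write time. A minimal sketch in JavaScript (the helper name `hourBucket` is my own, and UTC is assumed):

```javascript
// Hypothetical write-side helper: compute the 'YYYY-MM-DD-HH' bucket
// string used in the composite partition key (UTC assumed).
function hourBucket(date) {
  const pad = (n) => String(n).padStart(2, "0");
  return `${date.getUTCFullYear()}-${pad(date.getUTCMonth() + 1)}` +
    `-${pad(date.getUTCDate())}-${pad(date.getUTCHours())}`;
}

// e.g. hourBucket(new Date(Date.UTC(2024, 0, 15, 10, 30))) → '2024-01-15-10'
```

Reads for a time range then query one (sensor_id, hour) partition per hour covered by the range.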

-- Or use a random bucket number to spread further
CREATE TABLE logs_bucketed (
  bucket    INT,           -- random 0-9
  sensor_id TEXT,
  ts        TIMEUUID,
  message   TEXT,
  PRIMARY KEY ((bucket, sensor_id), ts)
);
-- Write: pick bucket = random 0-9
-- Read: query all 10 buckets (fan-out read)
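The write/read split described in the comments can be sketched as follows (function names are mine; 10 buckets assumed, matching the comment):

```javascript
const NUM_BUCKETS = 10;

// Write path: pick a random bucket so writes spread across partitions.
function writeBucket() {
  return Math.floor(Math.random() * NUM_BUCKETS);
}

// Read path: a fan-out read must hit every bucket for the sensor.
// Returns one [bucket, sensor_id] parameter pair per partition to query.
function fanOutParams(sensorId) {
  return Array.from({ length: NUM_BUCKETS }, (_, b) => [b, sensorId]);
}
```

The trade-off: writes get cheaper (no hot partition), but every read becomes NUM_BUCKETS queries whose results the application must merge.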

Denormalization Patterns

// Pattern: write to multiple tables, one per query pattern.
// Application logic handles consistency: CQL INSERTs are idempotent
// upserts, so the whole write can be retried after a partial failure.

// Write function (JavaScript pseudocode, hypothetical db.execute API)
async function createOrder(order) {
  await Promise.all([
    db.execute(
      "INSERT INTO orders_by_user (user_id, order_id, status, total) VALUES (?,?,?,?)",
      [order.userId, order.id, order.status, order.total]
    ),
    db.execute(
      "INSERT INTO orders_by_status (status, order_id, user_id, total) VALUES (?,?,?,?)",
      [order.status, order.id, order.userId, order.total]
    ),
    db.execute(
      "INSERT INTO orders_by_id (order_id, user_id, status, total) VALUES (?,?,?,?)",
      [order.id, order.userId, order.status, order.total]
    )
  ]);
}

-- A logged BATCH gives atomicity across tables/partitions (at a performance
-- cost); UNLOGGED batches should only target a single partition.
BEGIN BATCH
  INSERT INTO orders_by_user (user_id, order_id, status) VALUES (?, ?, ?);
  INSERT INTO orders_by_id   (order_id, user_id, status) VALUES (?, ?, ?);
APPLY BATCH;

Time-Series Patterns

-- Time series with wide rows (efficient for range reads)
CREATE TABLE sensor_readings (
  sensor_id TEXT,
  bucket    TEXT,         -- e.g., '2024-01-15' groups daily data
  ts        TIMESTAMP,
  value     DOUBLE,
  PRIMARY KEY ((sensor_id, bucket), ts)
) WITH CLUSTERING ORDER BY (ts ASC)
  AND compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': 1
  };

-- Read one day's data for a sensor (a single bucket; a rolling
-- 24-hour window would span two buckets)
SELECT ts, value FROM sensor_readings
WHERE sensor_id = 'temp-001' AND bucket = '2024-01-15'
AND ts >= '2024-01-15 00:00:00' AND ts <= '2024-01-15 23:59:59';
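A rolling window that crosses midnight spans two daily buckets, so the application must enumerate every bucket covered by the range and query each one. A sketch (helper name mine, UTC assumed):

```javascript
// List the daily 'YYYY-MM-DD' bucket strings covered by [start, end] (UTC).
// The caller issues one query per bucket and merges the results.
function dayBuckets(start, end) {
  const buckets = [];
  const d = new Date(Date.UTC(
    start.getUTCFullYear(), start.getUTCMonth(), start.getUTCDate()));
  while (d <= end) {
    buckets.push(d.toISOString().slice(0, 10)); // 'YYYY-MM-DD'
    d.setUTCDate(d.getUTCDate() + 1);
  }
  return buckets;
}
```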

-- TimeWindowCompactionStrategy: ideal for time-series
-- Groups SSTables by time window, efficient TTL eviction

Golden Rules of Data Modeling

Design for queries: list all queries first, then design the tables.
Avoid ALLOW FILTERING: it triggers a full-cluster scan.
No JOINs: denormalize instead, writing to multiple tables.
Avoid hot partitions: add a bucket/shard to high-traffic partition keys.
Cap partition size: keep partitions under 100 MB, ideally under 10 MB.
Use TTL: set a TTL on time-bounded data (logs, sessions).
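The TTL rule can be applied per write or as a table-level default; a sketch against the sensor_readings table defined above (values are illustrative):

```sql
-- Per-write TTL: this row expires 24 hours after insertion
INSERT INTO sensor_readings (sensor_id, bucket, ts, value)
VALUES ('temp-001', '2024-01-15', toTimestamp(now()), 21.5)
USING TTL 86400;

-- Table-level default TTL in seconds, applied to every write
ALTER TABLE sensor_readings WITH default_time_to_live = 604800;  -- 7 days
```

Combined with TimeWindowCompactionStrategy, expired data is dropped efficiently when whole SSTables age out of their window.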