Data Modeling
Query-First Design
In Cassandra, design tables around your queries, not around entity relationships. Each query gets its own table.
-- Queries to support:
-- Q1: Get all orders for a user
-- Q2: Get orders by status (across users)
-- Q3: Get order by ID
-- Q1: orders_by_user
CREATE TABLE orders_by_user (
user_id UUID,
order_id TIMEUUID,
status TEXT,
total DECIMAL,
PRIMARY KEY (user_id, order_id)
) WITH CLUSTERING ORDER BY (order_id DESC);
-- Q2: orders_by_status (denormalized copy)
-- Caution: status is low-cardinality, so each status value becomes one
-- large partition; add a bucket (as shown below) if volume is high
CREATE TABLE orders_by_status (
status TEXT,
order_id TIMEUUID,
user_id UUID,
total DECIMAL,
PRIMARY KEY (status, order_id)
) WITH CLUSTERING ORDER BY (order_id DESC);
-- Q3: orders_by_id
CREATE TABLE orders_by_id (
order_id TIMEUUID PRIMARY KEY,
user_id UUID,
status TEXT,
total DECIMAL
);
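The three tables above each serve exactly one access pattern. A minimal sketch (the helper and dictionary names are hypothetical, not part of any driver API) of how application code keeps that mapping explicit, so every SELECT hits a single table by its full partition key:

```python
# Hypothetical mapping: one access pattern -> one table -> one
# primary-key-only WHERE clause. No query ever spans tables.
QUERY_TABLES = {
    "orders_for_user":    ("orders_by_user",   "user_id = ?"),
    "orders_with_status": ("orders_by_status", "status = ?"),
    "order_by_id":        ("orders_by_id",     "order_id = ?"),
}

def cql_for(access_pattern: str) -> str:
    """Build the single-table SELECT that serves one access pattern."""
    table, where = QUERY_TABLES[access_pattern]
    return f"SELECT * FROM {table} WHERE {where}"
```

If a new query appears that no entry can serve, the query-first rule says to add a new (denormalized) table rather than filter an existing one.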
Partitioning Strategy
A good partition key distributes data evenly; a bad one creates hotspots. Aim to keep each partition under roughly 100 MB.
-- Anti-pattern: date as partition key (hot partition on today)
CREATE TABLE logs_bad (
date DATE,
ts TIMEUUID,
message TEXT,
PRIMARY KEY (date, ts)
);
-- All of today's logs land in ONE partition (one replica set) = hot spot!
-- Better: bucket by hour to spread load
CREATE TABLE logs_by_hour (
sensor_id TEXT,
hour TEXT, -- '2024-01-15-10' (YYYY-MM-DD-HH)
ts TIMEUUID,
message TEXT,
PRIMARY KEY ((sensor_id, hour), ts)
);
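Every writer must derive the same bucket string for the composite partition key `(sensor_id, hour)`. A minimal sketch, assuming timestamps are timezone-aware and normalized to UTC so all writers agree on bucket boundaries (helper names are illustrative):

```python
from datetime import datetime, timedelta, timezone

def hour_bucket(ts: datetime) -> str:
    """Format a timestamp into the 'YYYY-MM-DD-HH' bucket used in the
    (sensor_id, hour) composite partition key."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d-%H")

def hour_buckets(start: datetime, end: datetime) -> list[str]:
    """Buckets covering [start, end]. A range read that spans several
    hours must issue one query per bucket (one partition each)."""
    buckets = []
    t = start.replace(minute=0, second=0, microsecond=0)
    while t <= end:
        buckets.append(hour_bucket(t))
        t += timedelta(hours=1)
    return buckets
```

The trade-off: smaller buckets spread load better but force more queries per time-range read.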
-- Or use a random bucket number to spread further
CREATE TABLE logs_bucketed (
bucket INT, -- random 0-9
sensor_id TEXT,
ts TIMEUUID,
message TEXT,
PRIMARY KEY ((bucket, sensor_id), ts)
);
-- Write: pick bucket = random 0-9
-- Read: query all 10 buckets (fan-out read)
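The write/read asymmetry of the bucketed table can be sketched end to end. This is a simulation only: an in-memory dict stands in for the `(bucket, sensor_id)` partitions, so the fan-out-and-merge logic is visible without a real cluster:

```python
import random

NUM_BUCKETS = 10
# Simulated partitions: (bucket, sensor_id) -> clustering rows
partitions: dict = {}

def write_log(sensor_id: str, ts: str, message: str) -> None:
    # Write path: pick a random bucket so one sensor's writes spread
    # across NUM_BUCKETS partitions (and replica sets).
    bucket = random.randrange(NUM_BUCKETS)
    partitions.setdefault((bucket, sensor_id), []).append((ts, message))

def read_logs(sensor_id: str) -> list:
    # Read path: fan out to all buckets, then merge client-side, since
    # rows for one sensor now live in up to NUM_BUCKETS partitions.
    rows = []
    for bucket in range(NUM_BUCKETS):
        rows.extend(partitions.get((bucket, sensor_id), []))
    return sorted(rows)  # re-establish the per-sensor time ordering
```

The sort replaces the per-partition clustering order that a single-partition read would have given for free; that is the cost of the bucketing.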
Denormalization Pattern
// Pattern: write to multiple tables for different query patterns
// Application logic handles consistency
// Write function (pseudocode)
async function createOrder(order) {
await Promise.all([
db.execute(
"INSERT INTO orders_by_user (user_id, order_id, status, total) VALUES (?,?,?,?)",
[order.userId, order.id, order.status, order.total]
),
db.execute(
"INSERT INTO orders_by_status (status, order_id, user_id, total) VALUES (?,?,?,?)",
[order.status, order.id, order.userId, order.total]
),
db.execute(
"INSERT INTO orders_by_id (order_id, user_id, status, total) VALUES (?,?,?,?)",
[order.id, order.userId, order.status, order.total]
)
]);
}
-- Use a LOGGED batch (the default) when the denormalized writes must all
-- eventually apply: it provides atomicity even across partitions, at a
-- latency cost. UNLOGGED batches skip the batch log and are only
-- appropriate for statements within a single partition. Note that
-- batches give atomicity, not isolation.
BEGIN BATCH
INSERT INTO orders_by_user (user_id, order_id, status) VALUES (?, ?, ?);
INSERT INTO orders_by_id (order_id, user_id, status) VALUES (?, ?, ?);
APPLY BATCH;
Time-Series Pattern
-- Time series with wide rows (efficient for range reads)
CREATE TABLE sensor_readings (
sensor_id TEXT,
bucket TEXT, -- e.g., '2024-01-15' groups daily data
ts TIMESTAMP,
value DOUBLE,
PRIMARY KEY ((sensor_id, bucket), ts)
) WITH CLUSTERING ORDER BY (ts ASC)
AND compaction = {
'class': 'TimeWindowCompactionStrategy',
'compaction_window_unit': 'DAYS',
'compaction_window_size': 1
};
-- Read one day's data for a sensor (a single daily bucket)
SELECT ts, value FROM sensor_readings
WHERE sensor_id = 'temp-001' AND bucket = '2024-01-15'
AND ts >= '2024-01-15 00:00:00' AND ts <= '2024-01-15 23:59:59';
-- TimeWindowCompactionStrategy: ideal for time-series
-- Groups SSTables by time window, efficient TTL eviction
Golden Rules of Data Modeling
| Rule | Explanation |
|---|---|
| Design for queries | List every query first, then design the tables |
| Avoid ALLOW FILTERING | It forces a full-cluster scan |
| No JOINs | Denormalize: write to multiple tables instead |
| Avoid hot partitions | Add a bucket/shard to high-traffic partition keys |
| Cap partition size | Stay under 100 MB; ideally under 10 MB |
| Use TTLs | Set a TTL on time-bounded data (logs, sessions) |