Cassandra Guide

Data Modeling Rules

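  1. Design tables around queries, not around relationships
  2. Use one table per query pattern (denormalization is fine)
  3. The partition key distributes data across nodes
  4. Clustering keys sort data within a partition
  5. Avoid large partitions (more than 100 MB or 100,000 rows)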
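
Rule 5 is often handled by adding a time bucket to the partition key, so that one user's rows are spread over several smaller partitions. The following is a minimal sketch, not a definitive schema: the table name posts_by_user_bucketed, the month_bucket column, and its 'YYYY-MM' format are all assumptions made for illustration, and the bucket value would have to be computed by the application.

```cql
-- Hypothetical table: one partition per user per month instead of one per user
CREATE TABLE posts_by_user_bucketed (
    user_id      UUID,
    month_bucket TEXT,        -- assumed format 'YYYY-MM', set by the application
    created_at   TIMESTAMP,
    post_id      UUID,
    title        TEXT,
    PRIMARY KEY ((user_id, month_bucket), created_at, post_id)
) WITH CLUSTERING ORDER BY (created_at DESC, post_id ASC);

-- Reads must now supply every partition key column, including the bucket
SELECT * FROM posts_by_user_bucketed
WHERE user_id = ? AND month_bucket = '2024-01'
LIMIT 20;
```

The trade-off is that a query spanning several months has to read one bucket at a time and merge the results on the client.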

CQL Examples

-- Create keyspace
CREATE KEYSPACE my_app
WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};

-- Create table (query: get user's recent posts by date)
CREATE TABLE posts_by_user (
    user_id    UUID,
    created_at TIMESTAMP,
    post_id    UUID,
    title      TEXT,
    content    TEXT,
    tags       SET<TEXT>,
    metadata   MAP<TEXT, TEXT>,
    PRIMARY KEY ((user_id), created_at, post_id)  -- ((partition key), clustering columns...)
) WITH CLUSTERING ORDER BY (created_at DESC, post_id ASC);

-- Insert
INSERT INTO posts_by_user (user_id, created_at, post_id, title)
VALUES (uuid(), toTimestamp(now()), uuid(), 'Hello World')
USING TTL 2592000;  -- 30 days TTL

-- Query (must include full partition key)
SELECT * FROM posts_by_user
WHERE user_id = ? AND created_at > '2024-01-01'
LIMIT 20;
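
The "one table per query pattern" rule means that a second access pattern, such as fetching a single post by its ID, gets its own table rather than a filter on posts_by_user. A rough sketch follows; the table name posts_by_id is hypothetical and not part of the schema above.

```cql
-- Hypothetical second table for a different query: look up one post by ID
CREATE TABLE posts_by_id (
    post_id    UUID,
    user_id    UUID,
    created_at TIMESTAMP,
    title      TEXT,
    content    TEXT,
    PRIMARY KEY ((post_id))
);

-- Write the same post to both tables. A logged batch ensures that if one
-- insert is applied, the other is eventually applied as well.
BEGIN BATCH
    INSERT INTO posts_by_user (user_id, created_at, post_id, title)
    VALUES (?, ?, ?, ?);
    INSERT INTO posts_by_id (post_id, user_id, created_at, title)
    VALUES (?, ?, ?, ?);
APPLY BATCH;

-- Query the new table by its full partition key
SELECT * FROM posts_by_id WHERE post_id = ?;
```

The application is responsible for keeping the duplicated rows in sync on every write, which is the cost of denormalization.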

Cassandra vs MongoDB vs DynamoDB

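|                   | Cassandra                    | MongoDB                | DynamoDB            |
|-------------------|------------------------------|------------------------|---------------------|
| Best for          | High-write time-series data  | Flexible documents     | AWS serverless      |
| Query flexibility | Low (partition key required) |                        |                     |
| Write throughput  | Excellent                    |                        | Excellent (managed) |
| ACID              | Lightweight transactions     | Multi-doc transactions | Single-item ACID    |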
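
The "lightweight transactions" entry for Cassandra refers to conditional writes that use an IF clause. As a brief sketch against the posts_by_user table defined earlier (the literal title values are placeholders):

```cql
-- Insert only if no row with this primary key exists yet
INSERT INTO posts_by_user (user_id, created_at, post_id, title)
VALUES (?, ?, ?, 'Hello World')
IF NOT EXISTS;

-- Update only if the current title matches the expected value
UPDATE posts_by_user
SET title = 'Hello again'
WHERE user_id = ? AND created_at = ? AND post_id = ?
IF title = 'Hello World';
```

Each statement returns an [applied] column indicating whether the condition held. Conditional writes go through a consensus round and cost noticeably more than plain writes, so they are best kept for the cases that need them.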