“It’s just one more column.” Famous last words.
If you use ClickHouse long enough, you’ll eventually need to add a column to a big, busy table. Maybe product wants a new flag, maybe you forgot to log a correlation ID, or maybe you’re mid-migration and need a transitional field. In a row-store, adding a column often smells like a table rewrite. In ClickHouse, it’s different—and wonderfully so—but there are still sharp edges worth understanding.
This post walks through how adding a column works under the hood in ClickHouse, the options you have, and the gotchas to consider when doing this on real, high-throughput clusters. We’ll start with mental models, drop to storage-engine details, then climb back up to practical checklists and patterns.
TL;DR (But Please Don’t)
- Adding a column to a `MergeTree` table is metadata-only and fast. Existing parts are not rewritten immediately; reads synthesize the new column for old parts using its default.
- You choose when (or if) to physically backfill. Leave it virtual forever (cheap for constants), or materialize later via merges/mutations if the default is non-constant or you want full on-disk data.
- Replicated clusters propagate the ALTER. Use `ON CLUSTER` to keep schemas in lockstep across shards/replicas.
- Design for compatibility: prefer `Nullable(...)` for transitional changes, be careful with `MATERIALIZED`/`ALIAS`, and think about downstream materialized views and ingestion pipelines.
- Operationally: watch mutation backlogs, disk I/O, and merges if you choose to materialize; test on one shard; plan codecs and indices intentionally.
Okay—now the fun version with the why.
A Mental Model: Parts, Not Pages
ClickHouse’s MergeTree family stores data in parts: immutable columnar chunks on disk. Inserts create new parts; background merges compact them. Each part is self-describing (it has its own columns, indexes, checksums). Reads stitch together results across parts and across columns.
When you run:
```sql
ALTER TABLE events ADD COLUMN request_id UUID DEFAULT generateUUIDv4();
```
ClickHouse does not open every part and append a column. It simply updates the table metadata. From that point on:
- New parts written after the ALTER will include the `request_id` column physically (materialized on insert).
- Old parts do not contain `request_id`. When a query touches those parts and asks for `request_id`, ClickHouse synthesizes the value using the column's default expression. No files changed.
This is the core reason ADD COLUMN is fast and online: it doesn’t rewrite history unless you ask it to.
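To see it for yourself, here's a minimal sketch (a hypothetical throwaway table `t`; names are illustrative):

```sql
-- A part is written before the ALTER, so it has no `flag` files on disk.
CREATE TABLE t (x UInt32) ENGINE = MergeTree ORDER BY x;
INSERT INTO t VALUES (1), (2);

-- Metadata-only change: the existing part is not rewritten.
ALTER TABLE t ADD COLUMN flag UInt8 DEFAULT 1;

-- Reads synthesize flag = 1 for rows in the old part.
SELECT x, flag FROM t;
```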
What “Default” Really Means Here
In ClickHouse, a column can have:
- No explicit default – the engine uses the type's default value: `0`/`0.0` for numbers, `''` for strings, `1970-01-01` for `Date`, `1970-01-01 00:00:00` for `DateTime`, empty arrays/maps, `false` for `Bool`, and `NULL` for `Nullable(T)`.
- `DEFAULT <expr>` – an expression evaluated on insert and, for missing data in old parts, also evaluated at read time.
- `MATERIALIZED <expr>` – computed and stored; you can't insert into it directly.
- `ALIAS <expr>` – virtual; never stored; always computed on read.
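To make the four flavors concrete, here's a sketch with a hypothetical `defaults_demo` table (column names are illustrative):

```sql
CREATE TABLE defaults_demo
(
    id      UInt64,
    name    String,                              -- no explicit default: '' is synthesized
    seen_at DateTime DEFAULT now(),              -- evaluated on insert (and on read for old parts)
    id_hash UInt64 MATERIALIZED cityHash64(id),  -- computed and stored; not directly insertable
    id_hex  String ALIAS hex(id)                 -- never stored; computed on every read
)
ENGINE = MergeTree ORDER BY id;
```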
Key implication: When you add a column to an existing table, any old parts that predate the change don’t have the column’s files. If the column is:
- A constant default (e.g., `DEFAULT 0`, `DEFAULT 'unknown'`): ClickHouse can synthesize it cheaply on read; it literally behaves like a column filled with that constant for old parts.
- A non-constant expression (e.g., `DEFAULT cityHash64(user_id)`): ClickHouse computes it per row for old parts. That's CPU at query time. You can later materialize to push that CPU to the background once.
What Actually Happens On Disk
Let’s peek at the lifecycle:
- Before ALTER: parts on disk contain files like `user_id.bin`, `user_id.mrk3`, `ts.bin`, `ts.mrk3`, etc.
- Run `ADD COLUMN`: ClickHouse updates table metadata (`columns.txt` in the table metadata path and in system tables). No old part files are touched.
- New inserts: insert paths and materialized views now see the new column in the schema. If you include it in the INSERT, your value is stored. If not, the default expression runs, and the column's files are written for the new part being created.
- Reads: for new parts, data is loaded from the new column's files. For old parts, the column is missing; ClickHouse plugs in the default at read time (constant fast-path if applicable).
- Merges (optional impact): during background merges, if the column has a non-constant default or is `MATERIALIZED`, ClickHouse may materialize it into the merged part (engine/version dependent). If the default is a constant, many deployments happily leave it virtual forever; materializing doesn't improve correctness.
- Explicit materialization (your call): you can force a backfill via a mutation:

```sql
-- Force materialization for all existing rows
ALTER TABLE events UPDATE request_id = request_id WHERE 1;

-- or, if you added a DEFAULT and want it baked in:
ALTER TABLE events MATERIALIZE COLUMN request_id;  -- engine/version-dependent syntax
```

This creates new parts with the column physically present, replacing old parts as the mutation completes. It's I/O and CPU heavy; plan it.
Tip: Leave constant defaults virtual for a while (or forever) if you don’t need the column physically. It often saves a lot of churn.
Schema Placement: AFTER and Logical Order
ClickHouse lets you control logical order:
```sql
ALTER TABLE events
    ADD COLUMN request_id UUID DEFAULT generateUUIDv4()
    AFTER user_id;
```
This does not rewrite old parts either. Column order mostly affects:
- `SELECT *` output order,
- how you (humans) read `DESCRIBE TABLE`,
- insert formats like CSV without headers (be explicit instead!).
On disk, ClickHouse stores columns independently. Order is a metadata concern.
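Because header-less formats bind values by position, an explicit column list is the habit that keeps inserts immune to order changes. A sketch (values are illustrative):

```sql
-- Robust to column reordering and later additions: name what you send.
INSERT INTO events (ts, user_id, action)
VALUES ('2025-01-01 00:00:00', 42, 'click');
```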
Types, Nullability, and Safety for Rolling Changes
Choose types with operational safety in mind:
- Prefer `Nullable(T)` for additive fields when producers/consumers will roll out over time. Old writers unaware of the column can keep inserting; the default for `Nullable(T)` is `NULL`. New readers can check `col IS NULL` to distinguish missing historical values from real ones.
- Use `DEFAULT` values carefully. If you choose a sentinel (e.g., `DEFAULT 0`), be sure that value has unambiguous semantics. With `Nullable`, the sentinel can be `NULL` instead of overloading `0`.
- `LowCardinality(String)` is great for new enumerated fields. Combine with a constant default for cheap reads on old parts.
- Compression codecs: define them at add time if you have strong preferences.

```sql
ALTER TABLE events
    ADD COLUMN feature_flag LowCardinality(String) DEFAULT '' CODEC(ZSTD(3));
```

You can change codecs later, but that requires a rewrite (mutation) to take full effect.
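If you do change a codec later, a minimal sketch of what that looks like (assuming the `feature_flag` column from above; `OPTIMIZE ... FINAL` is the heavy-handed way to force the rewrite):

```sql
-- Metadata changes immediately; parts keep their old encoding until rewritten.
ALTER TABLE events
    MODIFY COLUMN feature_flag LowCardinality(String) CODEC(ZSTD(5));

-- Merges re-encode gradually; this forces a full (expensive) rewrite now.
OPTIMIZE TABLE events FINAL;
```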
Replication & Distributed Topologies
Adding a column in a replicated/sharded environment requires a little choreography.
Replicated Tables
For ReplicatedMergeTree engines:
- The `ALTER` is replicated: it's recorded in ZooKeeper/ClickHouse Keeper and executed in order on each replica.
- Existing data on each replica behaves the same way: old parts synthesize the column on read.
Distributed Tables
For Distributed tables:
- `Distributed` itself holds no data (just a local schema and the routing settings).
- You must ensure underlying shard tables have the same schema. Use:

```sql
ALTER TABLE shard_db.events ON CLUSTER my_cluster
    ADD COLUMN request_id UUID DEFAULT generateUUIDv4();
```

- Then, optionally, update the `Distributed` table schema (so `DESCRIBE` reflects it locally). With recent versions, `ON CLUSTER` ensures uniformity across all hosts; a quick parity check like the one below confirms it.
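One way to run that parity check, sketched with the hypothetical `my_cluster`/`shard_db` names from above:

```sql
-- Every host should report the column exactly once.
SELECT hostName() AS host, count() AS has_column
FROM clusterAllReplicas('my_cluster', system.columns)
WHERE database = 'shard_db' AND table = 'events' AND name = 'request_id'
GROUP BY host;
```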
Operational tip: If you run blue/green clusters or have heterogeneous versions, test the ALTER on a canary shard first. Ensure materialized views and ingestion pipelines don’t explode when the new column appears.
Materialized Views, Projections, and Secondary Indexes
Schema changes often ripple into acceleration structures.
Materialized Views (MVs)
- If an MV's `SELECT` references `*`, a new source column may land in the target (sometimes good, sometimes not).
- If an MV's `SELECT` lists columns explicitly, it won't automatically include the new column. You may need to ALTER the target table and update the view if you want to propagate it; pinning the select list, as sketched below, keeps the contract explicit.
- If you add a column with a `MATERIALIZED` expression in the source, remember: you can't insert into it, but it will be computed for new rows. Old rows remain virtual unless you materialize.
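A sketch of a pinned select list, assuming a hypothetical target table `events_slim`:

```sql
-- New source columns cannot silently change what lands in the target.
CREATE MATERIALIZED VIEW events_slim_mv TO events_slim AS
SELECT ts, user_id, action    -- request_id is excluded on purpose
FROM events;
```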
Projections
- Projections are like per-table materialized layouts. If the new column is used in queries that hit a projection, add the column, then `ALTER TABLE ... MATERIALIZE PROJECTION` to refresh or rebuild the projection so it includes the column (see the sketch below).
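A sketch of that flow, assuming the non-Nullable `request_id` added earlier and a hypothetical projection name:

```sql
-- A projection re-sorted by the new column, useful if you filter on it a lot.
ALTER TABLE events ADD PROJECTION by_request
(
    SELECT * ORDER BY request_id
);

-- Rebuild it for existing parts (a background rewrite: budget CPU/I/O).
ALTER TABLE events MATERIALIZE PROJECTION by_request;
```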
Data-Skipping Indexes
You can add secondary indexes that reference the new column:
```sql
ALTER TABLE events
    ADD COLUMN user_tier LowCardinality(String) DEFAULT ''
    AFTER user_id;

ALTER TABLE events
    ADD INDEX idx_tier user_tier TYPE set(0) GRANULARITY 1;  -- example index

ALTER TABLE events MATERIALIZE INDEX idx_tier;
```
The index definition is metadata; MATERIALIZE INDEX builds it for existing parts. This is a rewrite (background mutation), so account for CPU/I/O.
Performance: When to Materialize vs. Stay Virtual
Rule of thumb:
- If your default is constant (e.g., `DEFAULT 0`, `DEFAULT ''`, or `DEFAULT toDate('1970-01-01')`), don't rush to materialize. ClickHouse can produce a constant column for old parts efficiently.
- If your default is computed per row (e.g., hashing another column, parsing JSON, arithmetic) and queries frequently select/filter on it, consider materializing:
  - Short-term: leave it virtual until the change stabilizes (no rollbacks), and observe CPU impact.
  - Medium-term: schedule a mutation off-hours to materialize.
  - Long-term: rely on merges to gradually materialize if your version/settings do so, and if the pace is acceptable.
Measuring impact:
- Compare query profiles before/after with `system.query_log` CPU and read rows for representative queries that reference the new column (a sample query follows).
- Watch `system.mutations` for progress (if you materialize) and `system.disks` for free space. Mutations need headroom.
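A sample profile query, assuming a recent version where `ProfileEvents` is a `Map` column:

```sql
-- Recent finished queries that touch the new column, with CPU and rows read.
SELECT
    event_time,
    query_duration_ms,
    read_rows,
    ProfileEvents['OSCPUVirtualTimeMicroseconds'] AS cpu_us
FROM system.query_log
WHERE type = 'QueryFinish' AND query ILIKE '%request_id%'
ORDER BY event_time DESC
LIMIT 20;
```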
Safety Patterns for Rolling Releases
You rarely change only the database. Producers (writers), consumers (readers), ETL jobs, and dashboards all have opinions.
A battle-tested rollout plan:
1. Add a `Nullable` column with a safe default.

```sql
ALTER TABLE events ADD COLUMN request_id Nullable(UUID);  -- default is NULL
```

2. Update readers first to tolerate `NULL` and prefer the new column if present:

```sql
SELECT coalesce(request_id, old_request_id) AS request_id ...
```

   Or, if new logic depends on the column, guard it with `WHERE request_id IS NOT NULL`.

3. Update writers to start filling the column.

4. (Optional) Backfill historical data (if needed for analytics or joins). For very large tables, stage it: backfill hot partitions first (recent months), and use `WHERE` ranges to limit mutation scope:

```sql
ALTER TABLE events
    UPDATE request_id = generateUUIDv4()
    WHERE ts >= toDate('2025-01-01') AND request_id IS NULL;
```

5. (Optional) Tighten the type later: if you no longer need `NULL`, you can `MODIFY COLUMN` to non-nullable once all data is filled and apps no longer write `NULL`. This is a rewrite; plan it (see the sketch after this list).

6. (Optional) Add indexes/projections once the column's query patterns are clear.
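For step 5, a minimal sketch of the tightening, assuming every row is populated and no writer sends `NULL` anymore:

```sql
-- A mutation that rewrites the column's data: schedule it deliberately.
ALTER TABLE events MODIFY COLUMN request_id UUID;
```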
Failure Modes & Gotchas
Let’s list the traps so you can step around them:
- Downstream ingestion breaks: Kafka engines, `INSERT SELECT` jobs, or external loaders that use `INSERT INTO table VALUES (...)` without column lists may misalign values once a new column appears. Always prefer `INSERT INTO table (col1, col2, ...) VALUES ...`.
- Materialized views with strict schema: if an MV's target table doesn't have the new column, and a `SELECT *` in the MV implicitly pulls it, the insert into the target can fail. Either pin the view's select list or evolve the target in sync.
- Non-deterministic defaults: `DEFAULT now()` or `generateUUIDv4()` is evaluated per insert (for new rows) and per read (for old parts) unless you materialize. That means old rows will "change" over time if you query them repeatedly with `now()` as a default. Avoid non-deterministic defaults for added columns unless you materialize immediately.
- Expression cost at read time: `DEFAULT complex_json_extract(payload)` across billions of historical rows can be expensive. Either keep the column out of hot queries until materialized, or schedule a staged backfill.
- Disk space during mutations: backfills create new parts alongside old ones until the mutation commits. Ensure enough free space (rule of thumb: more than 2× the data volume you'll rewrite). Use partitioned `WHERE` clauses to control scope.
- Codec mismatch expectations: adding a column with a codec doesn't retroactively change old parts (they don't have the column yet). After materialization, your chosen codec applies to new/rewritten parts only.
- `ALIAS` isn't stored: great for derived values, but if you expect to filter heavily on it, remember you're always computing on read. For heavy workloads, prefer a real stored column (`MATERIALIZED`) and backfill it.
- `AFTER` doesn't "reorder" old parts: it sets logical order; nothing performance-critical changes.
- Distributed schema drift: if one shard doesn't receive the `ALTER`, queries via the `Distributed` table can fail with schema mismatch errors. Use `ON CLUSTER` and monitor.
Practical Walkthrough
Let’s simulate a common scenario: add a request correlation ID to a hot events table.
1) Inspect
```sql
DESCRIBE TABLE events;

SELECT
    partition,
    name,
    active,
    rows
FROM system.parts
WHERE database = currentDatabase() AND table = 'events' AND active;
```
Know your partitions, row counts, and disk headroom before you do anything.
2) Add the Column (Safe First)
```sql
ALTER TABLE events
    ADD COLUMN request_id Nullable(UUID);
```
Instant. No rewrite. Existing queries that don’t reference request_id are unaffected.
3) Roll Out Writers
Application producers start sending request_id on new inserts. If they don’t, it’s NULL.
Good practice:
```sql
INSERT INTO events (ts, user_id, action, request_id)
VALUES (..., ..., ..., generateUUIDv4());
```
4) Update Readers
Be tolerant:
```sql
SELECT
    ts,
    user_id,
    action,
    request_id
FROM events
WHERE (request_id IS NULL AND ts < now() - INTERVAL 7 DAY)   -- transitional logic
   OR (request_id IS NOT NULL AND ts >= now() - INTERVAL 7 DAY);
```
Or for joins/analytics:
```sql
SELECT
    coalesce(request_id, toUUID('00000000-0000-0000-0000-000000000000')) AS rid,
    count()
FROM events
GROUP BY rid;
```
5) Optional: Backfill Recent Partitions
If analytics benefits from full coverage for, say, this quarter:
```sql
ALTER TABLE events
    UPDATE request_id = reinterpretAsUUID(MD5(toString(user_id)))  -- example deterministic derivation
    WHERE ts >= toDate('2025-07-01') AND request_id IS NULL;
```
Watch progress:
```sql
SELECT * FROM system.mutations WHERE table = 'events' ORDER BY create_time DESC;
```
6) Optional: Add an Index
If you filter a lot by request_id:
```sql
ALTER TABLE events
    ADD INDEX idx_req request_id TYPE bloom_filter GRANULARITY 4;

ALTER TABLE events MATERIALIZE INDEX idx_req;
```
This builds the index for historical parts (I/O heavy); new parts build it on insert.
Observability While You Change Things
- `system.mutations`: status, progress, failures.
- `system.replication_queue`: check for stuck replicated ALTERs.
- `system.parts`: track part counts and sizes; big spikes during materialization are normal.
- `system.query_log`: compare CPU time and read rows for queries that touch the new column.
- Disk/I/O metrics: backfills and index materialization are bandwidth-hungry.
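A quick health-check query that combines a few of these (column names as in `system.mutations`):

```sql
-- Unfinished mutations for the table, and why they might be stuck.
SELECT mutation_id, command, parts_to_do, is_done, latest_fail_reason
FROM system.mutations
WHERE table = 'events' AND NOT is_done;
```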
Rolling Back
Need to undo? Two paths:
- Stop using it: simplest operationally; leave the column in place and update queries to ignore it. Zero risk.
- Drop it:

```sql
ALTER TABLE events DROP COLUMN request_id;
```

  Metadata-only and fast. If you materialized data for the column, its files are removed from parts as mutations/merges rewrite them. Existing queries that referenced it will fail, so coordinate the change.
Edge Cases & Advanced Notes
- Adding to a `ReplacingMergeTree` with a `version` column: backfills can create multiple versions of rows with/without the column populated. That's fine; the engine's rules still pick the newest by `version` (or `sign` for collapsing engines). Be mindful of the mutation order.
- TTL expressions: if you later add a `TTL` clause that uses the column, you'll trigger rewrites as TTLs fire on parts. Plan TTL materialization and disk layout (`MOVE TO VOLUME`/`DISK`) after the column stabilizes.
- Constraints and default order: if your default references another column, ensure that column already exists and has a value at the time of evaluation. Generally safe when adding, but it can surprise you if you interleave multiple ALTERs quickly. One ALTER per deployment step is kinder.
- Version skew: clusters running mixed ClickHouse versions may handle "materialize on merge" behavior slightly differently. If you rely on merges to backfill, validate in staging that merged parts indeed contain the column; the check below is one way.
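One way to run that validation, via `system.parts_columns`:

```sql
-- Which active parts physically contain the column? Parts written before
-- the ALTER won't appear here until a merge or mutation rewrites them.
SELECT name AS part, rows
FROM system.parts_columns
WHERE table = 'events' AND active AND column = 'request_id';
```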
A Deployment Checklist
Print this and stick it to your on-call wall:
- Plan
  - Decide on type (`Nullable`?), codec, and default (constant if possible).
  - Inventory MVs, projections, and downstream jobs.
  - Check disk headroom if you will materialize or create indices.
- Stage
  - Apply on staging with representative data volume.
  - Run queries that will use the new column; measure query CPU.
- Alter
  - `ALTER TABLE ... ADD COLUMN ... [ON CLUSTER ...]`
  - For `Distributed`, ensure all shards have the change.
- App rollout
  - Update readers first to tolerate absence/NULLs.
  - Update writers to populate the new column.
- (Optional) Backfill
  - Decide scope (which partitions).
  - Run `ALTER TABLE ... UPDATE ... WHERE ...` or `MATERIALIZE COLUMN`.
  - Monitor `system.mutations`, disk, and query impact.
- (Optional) Optimize
  - Add indices/projections and materialize.
  - Consider `MODIFY COLUMN` to non-nullable once stable.
- Validate
  - Compare query performance and correctness.
  - Confirm schema parity across shards/replicas.
Frequently Asked “Wait, But Does It…”
…lock the table?
No full lock. ADD COLUMN is metadata-only and online. Inserts/queries continue.
…rewrite data immediately?
No. Old parts remain untouched. Reads synthesize values; you can materialize later.
…affect primary key or sorting key?
Only if you explicitly modify those keys (separate ALTER). Plain ADD COLUMN does not.
…break SELECT * clients?
It can change result shape. Stabilize consumer schemas or avoid * for external interfaces.
…work for non-MergeTree engines?
Semantics can differ (e.g., Memory, Log engines). The discussion here focuses on MergeTree and descendants (ReplicatedMergeTree, ReplacingMergeTree, SummingMergeTree, etc.), which are most common in production.
Key Takeaways
- Adding a column is fast because ClickHouse doesn’t rewrite old parts. It updates metadata and synthesizes values on read for historical data.
- Your main decision is when (or if) to materialize. Constant defaults? Usually don’t bother. Computed defaults used in hot paths? Materialize when convenient.
- Think in terms of compatibility and rollouts. Prefer `Nullable` for transitional changes, update readers before writers, and test materialized views.
- Mind the cluster. Use `ON CLUSTER`, verify schema parity, and watch mutations and merges.
- Be intentional about codecs, indices, and projections. Add them with eyes open to the rewrite costs.
Further Reading & Ideas to Explore
- ClickHouse official docs on `ALTER TABLE` (ADD/MODIFY/DROP, materialization, and mutations).
- System tables: `system.parts`, `system.mutations`, `system.query_log`, `system.replication_queue`.
- Designing rollouts with `Nullable` and sentinel defaults in columnar stores.
- Performance patterns for data-skipping indexes and `LowCardinality`.
- Projections for accelerating specific query shapes after schema changes.