
Database Migrations


TL;DR — Quick Summary

  • Migrations are version-controlled, numbered SQL files that evolve your schema incrementally and reproducibly.
  • Every migration needs both up (apply) and down (rollback) functions.
  • Always add new columns as nullable first to avoid table locks on large tables.
  • Never modify an already-deployed migration — always create a new numbered migration file instead.

Lesson Overview

🚀 The Schema Change Problem

Your application's database schema will change constantly as features are added. The challenge: how do you change the schema in development, staging, and production reliably, with your whole team, without manual SQL scripts scattered in Slack messages?

Database migrations are version-controlled SQL scripts that evolve your schema step by step. Each migration is a discrete, numbered change — like a commit for your database structure.

📋 How Migration Tools Work

Tools like Flyway, Liquibase, Prisma Migrate, Knex, Alembic (Python), and ActiveRecord (Rails) all share the same core approach:

  1. Store migrations as numbered files in your repo (001_create_users.sql, 002_add_email_column.sql)
  2. Track which migrations have run in a special schema history table
  3. On deployment, run only new migrations that haven't been applied yet
  4. Each migration has an up function (apply the change) and a down function (reverse it)
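The "run only new migrations" step can be sketched in a few lines. This is a hypothetical in-memory model: `applied` stands in for the schema history table, and the file names are invented for illustration.

```javascript
// In-memory stand-in for the schema history table (what has already run).
const applied = new Set(['001_create_users', '002_add_email_column']);

// Stand-in for the numbered files found in the migrations/ directory.
const migrationFiles = [
  '003_add_status_column',
  '001_create_users',
  '004_create_posts',
  '002_add_email_column',
];

// Core loop shared by Flyway, Knex, Alembic, etc.: sort by number,
// skip anything already recorded, and return the rest in order.
function pendingMigrations(files, history) {
  return files
    .slice()                               // don't mutate the input
    .sort()                                // zero-padded prefixes sort correctly
    .filter((name) => !history.has(name));
}

console.log(pendingMigrations(migrationFiles, applied));
// [ '003_add_status_column', '004_create_posts' ]
```

Each new migration the tool runs is then recorded in the history table, so the next deploy starts from where this one left off.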

⚠️ Zero-Downtime Migrations

Adding a column is trivial in development. In production with 500 concurrent users, an ALTER TABLE on a large table can lock it for minutes. Zero-downtime migration strategies include:

  • Expand-Contract pattern: Add the new column (nullable), deploy code that writes to both, backfill existing data, then drop the old column.
  • Online DDL: MySQL 8+ and PostgreSQL support non-blocking index creation and column additions.
  • Shadow tables: Create the new table structure, migrate data in background, swap with atomic rename.

Rule: never run a raw ALTER TABLE that locks a large production table during peak hours. Plan zero-downtime migration strategy for any schema change on tables with >1M rows.

Conceptual Deep Dive

Think of migrations like Git commits for your database schema. Just as you wouldn't hand-edit code files directly on a production server, you don't manually run SQL ALTER TABLE statements in production. Instead, every schema change is a migration file — version controlled, peer reviewed, tested in staging, and automatically applied on deployment.

The migration history table is the source of truth: it records every migration that has ever been applied, with a timestamp and checksum. When you deploy, the tool reads this table, compares it to your migration files, and runs only the new ones in order.

Architecture & Data Flow

Migration Deployment Flow (diagram)
Expand-Contract Pattern — Zero Downtime Column Rename (diagram)

Implementation Lab

Knex.js Migrations — Up and Down
// migrations/20240115_001_create_users_table.js
exports.up = async function(knex) {
  await knex.schema.createTable('users', function(table) {
    table.increments('id').primary();
    table.string('name', 100).notNullable();
    table.string('email', 255).notNullable().unique();
    table.string('password_hash', 255).notNullable();
    table.enum('status', ['active', 'inactive', 'banned']).defaultTo('active');
    table.timestamps(true, true);  // created_at, updated_at
  });
};
 
exports.down = async function(knex) {
  await knex.schema.dropTableIfExists('users');
};
 
// --- Next migration ---
// migrations/20240120_002_add_profile_fields.js
exports.up = async function(knex) {
  await knex.schema.alterTable('users', function(table) {
    table.string('avatar_url').nullable();  // always nullable for new columns!
    table.text('bio').nullable();
    table.date('birthdate').nullable();
    table.index('status');  // add index
  });
};
 
exports.down = async function(knex) {
  await knex.schema.alterTable('users', function(table) {
    table.dropColumn('avatar_url');
    table.dropColumn('bio');
    table.dropColumn('birthdate');
    table.dropIndex('status');
  });
};
Flyway SQL Migrations
-- File: V1__Create_users_table.sql  (Flyway naming: V{version}__{description})
CREATE TABLE users (
  id         BIGINT PRIMARY KEY AUTO_INCREMENT,
  name       VARCHAR(100) NOT NULL,
  email      VARCHAR(255) NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  UNIQUE INDEX ux_users_email (email)
);
 
-- File: V2__Create_posts_table.sql
CREATE TABLE posts (
  id         BIGINT PRIMARY KEY AUTO_INCREMENT,
  user_id    BIGINT NOT NULL,
  title      VARCHAR(255) NOT NULL,
  body       TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
  INDEX idx_posts_user_id (user_id)
);
 
-- File: V3__Add_soft_delete_to_users.sql
-- Safe: nullable column addition doesn't require a full table lock
ALTER TABLE users
ADD COLUMN deleted_at TIMESTAMP NULL DEFAULT NULL,
ADD INDEX idx_users_deleted_at (deleted_at);
Zero-Downtime Migration — Rename a Column Safely
-- UNSAFE: Direct rename breaks any code still reading the old name
-- ALTER TABLE users RENAME COLUMN username TO display_name;
-- During a rolling deploy, servers running old code fail the moment this runs!
 
-- SAFE: Expand-Contract pattern (3 separate deployments)
 
-- PHASE 1: Add new column (nullable — no default needed, fast)
ALTER TABLE users ADD COLUMN display_name VARCHAR(100) NULL;
 
-- PHASE 2: Backfill existing data (run in batches to avoid long lock)
UPDATE users SET display_name = username
WHERE display_name IS NULL
LIMIT 10000;  -- run repeatedly in small batches
-- Application now writes to BOTH username AND display_name
 
-- PHASE 3: Make new column NOT NULL, drop old column
-- (only after app code only uses display_name)
ALTER TABLE users MODIFY display_name VARCHAR(100) NOT NULL;
ALTER TABLE users DROP COLUMN username;
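The PHASE 2 backfill is usually driven by a small script that repeats the batched UPDATE until no rows remain. A sketch with in-memory rows standing in for the table (row count and batch size are illustrative):

```javascript
// In-memory stand-in for a users table mid-migration.
const rows = Array.from({ length: 25 }, (_, i) => ({
  username: `user${i}`,
  display_name: null,
}));

// One batch = one "UPDATE users SET display_name = username
//               WHERE display_name IS NULL LIMIT batchSize".
// Returns the affected row count, like a real UPDATE does.
function backfillBatch(table, batchSize) {
  let updated = 0;
  for (const row of table) {
    if (updated >= batchSize) break;
    if (row.display_name === null) {
      row.display_name = row.username;
      updated++;
    }
  }
  return updated;
}

// Keep going until a batch touches zero rows; in production you would
// also pause between batches to let replication catch up.
let batches = 0;
while (backfillBatch(rows, 10) > 0) batches++;
console.log(`backfilled in ${batches} batches`); // 3 batches for 25 rows
```

Because each batch holds row locks only briefly, reads and writes continue throughout the backfill.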

Best Practices — Interactive Comparison

New columns must be nullable first — NOT NULL on a large table locks it for minutes in production

-- On engines without instant ADD COLUMN (MySQL < 8.0, PostgreSQL < 11),
-- this rewrites ALL 10 million rows to set the default,
-- locking the entire table for minutes while your app is degraded
ALTER TABLE users
ADD COLUMN display_name VARCHAR(100) NOT NULL DEFAULT '';
 
-- Users see: ERROR 1205 Lock wait timeout exceeded

Pro Tips — Senior Dev Insights

  1. Always back up production before running migrations: mysqldump -u root -p mydb > backup_pre_migration_$(date +%Y%m%d).sql

  2. For large tables, use tools designed for zero-downtime migrations: gh-ost (MySQL), pg_repack (PostgreSQL), or pt-online-schema-change (Percona).

  3. Use Prisma Migrate for TypeScript/Node projects — it generates migrations from your schema definition and keeps them in sync with your TypeScript types.

  4. Adopt a convention for migration naming: YYYYMMDD_NNN_description.sql makes it easy to see what changed and when, and sorts correctly in any file browser.
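The YYYYMMDD_NNN_description convention above is easy to enforce with a tiny helper (hypothetical function, zero-padding the sequence so names sort correctly):

```javascript
// Build a sortable migration filename: YYYYMMDD_NNN_description.sql
function migrationFilename(date, seq, description) {
  const ymd = date.toISOString().slice(0, 10).replace(/-/g, ''); // 20240115
  const nnn = String(seq).padStart(3, '0'); // zero-pad so 2 sorts before 10
  return `${ymd}_${nnn}_${description}.sql`;
}

console.log(migrationFilename(new Date('2024-01-15'), 1, 'create_users'));
// 20240115_001_create_users.sql
```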

⚖️ Popular Migration Tools

Feature                   | Flyway        | Liquibase        | Prisma Migrate   | Knex       | Alembic
Primary language          | SQL / Java    | SQL / XML / YAML | TypeScript       | JavaScript | Python
Auto-generates migrations | No            | Via diff tooling | Yes              | No         | Yes
Rollback support          | Pro only      | Yes              | Manual           | Yes        | Yes
Schema drift detection    | Via checksums | Yes              | Yes              | No         | Limited
Popular with              | Java / Spring | Enterprise       | Next.js / Prisma | Node.js    | Python / Flask

Common Developer Pitfalls

  • Editing deployed migration files — this breaks the checksum check and desynchronizes environments.

  • Adding NOT NULL columns without defaults to large production tables — causes a full table scan and lock for minutes.

  • Not taking a database backup before running production migrations — you lose the ability to restore if something goes wrong.

  • Running migrations manually instead of in CI/CD — leads to environment drift where staging and production have different schemas.

Interview Mastery

Q: Why should you never edit an already-deployed migration file?
Migration tools store a checksum of each migration file in the schema history table. If you modify a file, the checksum no longer matches what was applied. The migration tool will detect this and throw an error, refusing to run. In the best case, this blocks your CI/CD pipeline. In the worst case, if you somehow bypass the check, you'll have different schema states across environments with no way to reconcile them. New migrations should always be new numbered files.

Q: Explain the expand-contract pattern for zero-downtime schema changes.
The expand-contract (also called parallel change) pattern involves three phases: (1) Expand — add the new column/table alongside the old one (purely additive, no locks). (2) Migrate — update application code to write to both old and new structures simultaneously; backfill historical data in small batches. (3) Contract — once all data is migrated and no code references the old structure, drop the old column/table. This approach never locks tables and is fully reversible at each phase.

Q: How do you recover from a failed production migration?
Strategies depend on the migration type: (1) If it failed mid-run and the tool left the DB in a partial state, manually ROLLBACK or run the down migration. (2) If the migration was pure DDL (no data loss), write and run a corrective migration. (3) If data was corrupted, restore from the pre-migration backup (this is why you always take a backup before running migrations in production). (4) Fix the migration file, reset it in the schema history table, and re-run. Always keep a pre-migration backup for production deployments.

Real-World Blueprint

"GitHub runs hundreds of database migrations per year on tables with billions of rows. Their engineering blog has documented how adding a single column to the repositories table (with 40 million rows) required weeks of planning. They used GitHub's own tool gh-ost (GitHub's online schema migration tool for MySQL) to ghost-copy the table with the new schema, replicate changes in real time, and atomically swap tables — zero downtime, zero locks, zero data loss. The migration was a single migration file in their repo, reviewed by the DBA team, and executed automatically."

Hands-on Lab Exercises

  1. Set up Knex or Flyway in a Node.js or Java project. Create 5 sequential migrations building a complete blog schema.

  2. Practice the full rollback cycle: run migrations up, verify the schema, run migrations down, verify the rollback worked.

  3. Implement the expand-contract pattern on a sample table: add a new column, backfill it, then drop the old column across 3 migration files.

  4. Set up a CI pipeline (GitHub Actions) that automatically runs migrations on push to main, and fails the build if a migration fails.

Real-World Practice Scenarios

  1. You need to add a NOT NULL column with no default to a 10-million-row users table in production without downtime. What's your strategy?

  2. A developer modified an already-deployed migration file to fix a typo. The CI/CD pipeline is now failing with a checksum error. How do you fix it?

  3. You deployed a migration that renamed a column. The new deployment is failing. You need to roll back the code AND the migration. What's the sequence?

  4. Your app runs on 5 servers. During deployment, servers 1-2 are running the new code but servers 3-5 are still running old code. The new code expects a new column. How should the migration be timed?