Misk Schema Migrations: Versioned SQL Without Flyway

Series: Building Production Services with Misk — Part 14 of 24

Every service that owns a database eventually owns a schema-evolution problem. You add a column, you backfill, you drop the old one — and you need that to happen the same way in every environment, in order, exactly once. The reflexive answer in JVM-land is Flyway or Liquibase: pull in a dependency, point it at a folder of SQL, done. But Misk already ships a schema migration mechanism baked into misk-jdbc, and once you’ve got a data source configured, it runs at startup with no extra wiring. This post is about that built-in migrator, the standalone Gradle plugin that runs it without booting your service, and the honest question of when you’d still reach for Flyway.

The built-in misk schema migration model

Misk’s migrator is deliberately small. The contract is two methods on a SchemaMigrator interface:

interface SchemaMigrator {
  /** Applies available migration to the database. */
  fun applyAll(author: String): MigrationStatus

  /** Validates that all migrations have been applied to the database. */
  fun requireAll(): MigrationStatus
}

applyAll runs the migrations you haven’t run yet. requireAll runs nothing — it just asserts the database is already up to date and fails if it isn’t. That split is the whole philosophy: in dev and test you let the service migrate itself, and in production you verify rather than mutate. The SchemaMigratorService that runs at startup encodes exactly that distinction:

if (deployment.isTest || deployment.isLocalDevelopment) {
  retry(retryConfig.build()) { migrationState = schemaMigrator.applyAll("SchemaMigratorService") }
} else {
  migrationState = schemaMigrator.requireAll()
}

So when you install a JdbcModule for a data source, a SchemaMigratorService is installed alongside it. Boot the service locally and your tables appear; boot it in production and it refuses to start if a migration is missing, but it will not silently run DDL against your prod database. That’s the right default — production schema changes should be a deliberate, observable step, not a side effect of a deploy that happened to restart a pod first.
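To make the apply-versus-verify split concrete, here is a minimal in-memory sketch. It is not Misk's implementation: ToyMigrator, startUp, and the isDevOrTest flag are hypothetical stand-ins for SchemaMigrator, the startup service, and the deployment check, and the "database" is just a sorted set of version numbers.

```kotlin
// Toy model only: a hypothetical stand-in for Misk's migrator, not its real code.
class ToyMigrator(private val available: Set<Int>) {
  // Plays the role of the tracking table: versions already applied.
  val applied = sortedSetOf<Int>()

  /** Applies every available version that is not yet recorded, in increasing order. */
  fun applyAll(): List<Int> {
    val pending = available.filter { it !in applied }.sorted()
    applied.addAll(pending)
    return pending
  }

  /** Changes nothing; fails if any available version is unrecorded. */
  fun requireAll() {
    val missing = available.filter { it !in applied }.sorted()
    check(missing.isEmpty()) { "Missing migrations: $missing" }
  }
}

/** Mirrors the startup decision: mutate in dev/test, verify everywhere else. */
fun startUp(migrator: ToyMigrator, isDevOrTest: Boolean) {
  if (isDevOrTest) migrator.applyAll() else migrator.requireAll()
}

fun main() {
  val dev = ToyMigrator(setOf(1, 2, 100))
  startUp(dev, isDevOrTest = true)
  println(dev.applied) // [1, 2, 100]

  val prod = ToyMigrator(setOf(1, 2, 100))
  val result = runCatching { startUp(prod, isDevOrTest = false) }
  println(result.isFailure) // true: nothing was applied, so verification fails
}
```

The point of the toy is the asymmetry: the dev path converges on its own, while the production path can only succeed if something else has already applied the migrations.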

State is tracked in a schema_version table the migrator creates itself: one row per applied migration version, plus who applied it. No external bookkeeping, no flyway_schema_history you have to remember exists.

How migrations are defined

A migration is a plain .sql file on the classpath, named by convention. From the migrator’s own docs:

Each file should contain SQL statements terminated by a ;. The files should be named like v100__exemplar.sql with a v, an integer version, two underscores, a description, and the .sql suffix.

So: v1__create_buckets.sql, v2__add_index.sql, v100__whatever.sql. The integer after the v is the version; everything after the __ is a human-readable description that the migrator ignores. Versions are applied in increasing order and — usefully — do not have to be sequential. You can jump from v5 to v100 if you want room to backport, and the migrator just sorts and applies what’s missing.

You tell Misk where to find these files through the data source config. The exemplar’s YAML shows the shape:

data_source_clusters:
  exemplar-001:
    writer:
      type: MYSQL
      username: root
      password: ""
      database: exemplar_testing
      migrations_resource: "classpath:/migrations"

migrations_resource is a Misk resource URL — classpath:/migrations points at src/main/resources/migrations/. Drop your v*.sql files there and they’re picked up. (There’s also a plural migrations_resources if you need to pull from several locations, and a migrations_resources_regex that defaults to (^|.*/)v(\d+)__[^/]+\.sql — the pattern that defines what counts as a migration file. You rarely touch the regex, but it’s there if your naming has to differ.)
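As a rough illustration of what that default pattern accepts, the sketch below runs it over a few paths and orders the matches by version. The helper names (versionOf, orderedMigrations) are hypothetical, and treating the pattern as a full match against the path is an assumption made here rather than something this post confirms about Misk's internals.

```kotlin
// The default pattern quoted above; group 2 captures the integer version.
val migrationPattern = Regex("""(^|.*/)v(\d+)__[^/]+\.sql""")

/** Returns the version number if the path looks like a migration file, else null. */
fun versionOf(path: String): Int? =
  migrationPattern.matchEntire(path)?.groupValues?.get(2)?.toInt()

/** Keeps only migration files and orders them by numeric version, lowest first. */
fun orderedMigrations(paths: List<String>): List<String> =
  paths.filter { versionOf(it) != null }.sortedBy { versionOf(it) }

fun main() {
  val files = listOf(
    "migrations/v100__whatever.sql",
    "migrations/v2__add_index.sql",
    "migrations/README.md",
    "migrations/v1__create_buckets.sql",
  )
  // README.md is dropped; the rest come back as v1, v2, v100.
  println(orderedMigrations(files))
}
```

Note that the ordering is numeric, not alphabetical: a plain string sort would put v100 ahead of v2, which is why the version is parsed to an integer first.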

The format matters too: the default is migrations_format: TRADITIONAL, where each file is a schema change. There’s a newer DECLARATIVE mode (Skeema-style, where each file describes a table’s desired state), but traditional versioned migrations are what most services run and what the rest of this post assumes.

The standalone Gradle plugin

Here’s the part that makes the built-in migrator genuinely competitive: you can run it without your service. The misk-schema-migrator-gradle-plugin packages the same migrator into a Gradle task, so CI can migrate a database before, say, jOOQ codegen runs against it — no Guice, no booted application, no web server.

The plugin id and config are:

plugins {
  id("com.squareup.misk.schema-migrator") version "<latest>"
}

miskSchemaMigrator {
  database = "codegen"
  host = "localhost"        // optional, defaults to localhost
  port = 3306               // optional, defaults to 3306
  username = "root"
  password = ""
  migrationsDir = layout.projectDirectory.dir("src/main/resources/db-migrations")
  migrationsFormat = "TRADITIONAL"   // optional, defaults to TRADITIONAL
}

That miskSchemaMigrator extension registers a single task named migrateSchema. Under the hood the task doesn’t reimplement anything — it shells out via JavaExec to misk.jdbc.SchemaMigratorRunner, the same migrator code from misk-jdbc, passing config over stdin (so your password never lands in a process argument list). It’s the production migrator, minus the production service.

The headline use case is jOOQ. If you generate type-safe SQL from a live schema, you need that schema to exist before codegen runs. The plugin’s README spells out the wiring:

tasks.withType<nu.studer.gradle.jooq.JooqGenerate>().configureEach {
  dependsOn("migrateSchema")
}

Now ./gradlew generateJooq (or whatever your codegen task is) migrates a throwaway database first, then generates code against the real, current schema. Flyway can also migrate as a build step through its own Gradle plugin; the appeal here is that the build step uses the same engine that checks your migrations at runtime, so there’s no drift between “the schema codegen saw” and “the schema the service enforces.”

Worked example

Say you’re adding a rate-limit bucket table. Create the migration under src/main/resources/migrations/ (matching whatever migrations_resource points at):

-- v1__create_rate_limit_buckets.sql
CREATE TABLE rate_limit_buckets(
    `id` varchar(255) NOT NULL PRIMARY KEY,
    `state` BLOB
);

In dev and test, that’s all you do: start the service (or run your tests), and SchemaMigratorService calls schemaMigrator.applyAll(...), which creates the table and writes a row into schema_version recording version 1 and the author. Add a second file later:

-- v2__add_bucket_window.sql
ALTER TABLE rate_limit_buckets ADD COLUMN `window_start` BIGINT;

Next boot, only v2 runs — v1 is already recorded, so it’s skipped.

For CI codegen, you don’t boot anything. With the plugin applied and miskSchemaMigrator configured to point at a scratch database (and migrationsDir set to the folder that holds the files above), you run:

./gradlew migrateSchema

That spins up a connection, applies v1 and v2 to the codegen database, and exits. Wire dependsOn("migrateSchema") onto your jOOQ task and it happens automatically as part of the build. Same SQL files, same migrator, two entry points.

Going deeper: the Flyway comparison

So is this a real Flyway alternative? For a Misk service, yes — and the plugin’s own README says as much: “It can be used as an alternative to Flyway.” You get versioned, ordered, exactly-once migrations; a tracking table; an apply path and a verify path; and a build-time runner. That covers the 90% case, and it does it without adding a dependency you have to configure separately from the data source you already declared.

Where Flyway and Liquibase still pull ahead:

Repeatable and undo migrations. Flyway has repeatable migrations (R__) for things like re-creating views, and (paid) undo support. Misk’s model is forward-only versioned files — there’s no built-in down. If your workflow leans on repeatable or reversible migrations, Misk won’t replace that.
Rich callbacks and placeholders. Flyway’s beforeMigrate/afterMigrate callbacks and placeholder substitution have no direct Misk equivalent. Misk runs the SQL; that’s the feature set.
Database breadth. Flyway and Liquibase support a long tail of databases and dialect quirks. Misk’s migrator is built around what Misk runs in production — MySQL and Vitess-flavored MySQL — and that’s where its road is paved.
Liquibase’s abstraction layer. If you genuinely want database-agnostic changelogs in XML/YAML/JSON, that’s Liquibase’s whole pitch, and Misk doesn’t try to be that. Misk migrations are SQL, on purpose.

The trade is the usual one: Misk gives you a smaller, opinionated tool that’s already integrated, while Flyway/Liquibase give you a bigger toolbox you have to bolt on. For a service already living in Misk, “already integrated” wins most arguments.

Production notes & gotchas

requireAll in prod means migrations are a separate step. Production boots verify, they don’t apply. Your deploy pipeline needs an explicit migration stage (the Gradle plugin, or a dedicated job) before the new code goes live, or the service will refuse to start against an un-migrated database. This is a feature, but it surprises people expecting deploy-time auto-migration.
Duplicate versions are a hard failure. The migrator builds a map of version to file and asserts there are no collisions: literally require(duplicates.isEmpty()) { "Duplicate migrations found $duplicates" }. Two files both numbered v7__... and nothing migrates. With multiple engineers merging branches, version-number collisions are easy to introduce, so coordinate your numbers.
One statement batch per file, terminated by ;. Each file’s SQL is run as a batch; make sure every statement ends in a semicolon. A migration that quietly does half of what you intended because of a missing terminator is a miserable thing to debug in prod.
Migrations are forward-only. There’s no rollback. “Fixing” a bad migration means writing a new migration that corrects it — never editing an already-applied file, because its version is recorded in schema_version and won’t re-run.
Vitess migrations aren’t applied by Misk. For VITESS_MYSQL data sources, SchemaMigratorService short-circuits — Vitess schema changes are applied externally (via vtctl/your Vitess tooling), and Misk just gets out of the way. Don’t expect the startup migrator to touch a sharded Vitess keyspace.
The author string is validated against \w+. The migrator inserts an author into schema_version and guards it with require(author.matches(Regex("\\w+"))) to prevent SQL injection. You won’t normally set this by hand, but if you do, keep it to word characters.
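Two of the guards above, duplicate versions and the author pattern, are easy to check before anything touches a database. The sketch below is one way you might do that in a CI step; findDuplicateVersions and isValidAuthor are hypothetical helpers written for this post, not Misk APIs, and the file-name pattern is the default quoted earlier, applied here as a full match by assumption.

```kotlin
// Default migration file-name pattern from the config section; group 2 is the version.
val versionPattern = Regex("""(^|.*/)v(\d+)__[^/]+\.sql""")

/** Groups migration paths by version and returns only versions claimed by more than one file. */
fun findDuplicateVersions(paths: List<String>): Map<Int, List<String>> =
  paths
    .mapNotNull { path ->
      versionPattern.matchEntire(path)?.let { it.groupValues[2].toInt() to path }
    }
    .groupBy({ it.first }, { it.second })
    .filterValues { it.size > 1 }

/** Mirrors the word-characters-only rule for the author string. */
fun isValidAuthor(author: String): Boolean = author.matches(Regex("\\w+"))

fun main() {
  val paths = listOf(
    "migrations/v6__add_index.sql",
    "migrations/v7__add_window.sql",
    "migrations/v7__add_owner.sql",
  )
  println(findDuplicateVersions(paths).keys) // [7]
  println(isValidAuthor("SchemaMigratorService")) // true
  println(isValidAuthor("ci-bot")) // false: '-' is not a word character
}
```

Failing the build when findDuplicateVersions returns a non-empty map surfaces a collision at merge time instead of at the next service boot.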

What’s next

Migrations get your schema into shape; the next question is how you talk to it. Misk leans on generated, type-safe SQL rather than hand-rolled JDBC, and it has real opinions about sharding. In Part 15: Misk SqlDelight, jOOQ & Vitess Sharding we’ll generate query code from the schema we just migrated and look at what running on Vitess actually demands of your data model.
