Misk Rate Limiting & Tokens: Bucket4j Backends and Safe ID Generation

Series: Building Production Services with Misk — Part 17 of 24

Two unglamorous problems show up in every service eventually, and Misk has a module for each. The first: you need to mint identifiers (order IDs, payment tokens, idempotency keys) that nobody can guess and nobody can confuse for one another over the phone. The second: you need to stop a misbehaving client (or a runaway retry loop) from flattening an expensive endpoint. Misk rate limiting and misk tokens are the two small, sharp tools for exactly this, and they’re worth knowing before you reinvent either with a Random and a hand-rolled counter. Let’s wire both.

Tokens: safe ID generation

The misk-tokens module exists so you never write UUID.randomUUID().toString() and call it an ID again. Its public surface is one interface, misk.tokens.TokenGenerator:

interface TokenGenerator {
  fun generate(label: String? = null, length: Int = 25): String
}

You inject it like any other dependency. The exemplar’s HelloWebAction does exactly that:

@Singleton
class HelloWebAction @Inject constructor(
  private val tokenGenerator: TokenGenerator,
) : WebAction {
  @Get("/hello/{name}")
  @Unauthenticated
  @ResponseContentType(MediaTypes.APPLICATION_JSON)
  fun hello(@PathParam name: String, /* ... */): HelloResponse {
    return HelloResponse(
      greetings?.joinToString(separator = " ") ?: tokenGenerator.generate(),
      // ...
    )
  }
}

What makes a Misk token better than a raw UUID is the encoding. Tokens use a Crockford Base32 alphabet, 0123456789abcdefghjkmnpqrstvwxyz, which deliberately omits the letters that get misread (i, l, o) along with u, which is dropped to avoid accidental profanity. The default length is 25 characters at 5 bits each, which is 125 bits of entropy: slightly more than a random UUID’s 122, in fewer, friendlier characters. A production token looks like 75dsma7kscyvbgz7ea1yy3qe8: short enough to fit in a URL, long enough that collisions are vanishingly unlikely, and resistant to the “is that a one or an ell?” failure mode that plagues anything a human might transcribe.
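To make the arithmetic concrete, here is a minimal sketch of drawing a token from that alphabet. It is not Misk’s implementation: randomToken and entropyBits are hypothetical names, and the only things taken from the text above are the alphabet and the default length of 25.

```kotlin
import java.security.SecureRandom

// Crockford Base32 alphabet as quoted above: no i, l, o, or u.
val ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"

// Hypothetical helper, not Misk's code: picks `length` symbols uniformly
// at random from the 32-symbol alphabet.
fun randomToken(length: Int = 25, random: SecureRandom = SecureRandom()): String {
  require(length > 0) { "length must be positive" }
  return buildString(length) {
    repeat(length) { append(ALPHABET[random.nextInt(ALPHABET.length)]) }
  }
}

// Each symbol carries log2(32) = 5 bits when drawn uniformly.
fun entropyBits(length: Int): Int = length * 5

fun main() {
  println(randomToken())   // a different 25-character token on every run
  println(entropyBits(25)) // 125
}
```

Because 32 is a power of two, nextInt(32) maps cleanly onto 5 random bits per character with no modulo bias to worry about.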

That transcription-friendliness isn’t accidental. TokenGenerator ships a canonicalize helper that maps visually-confusable characters back to their canonical form (o and O become 0, i/I/l/L become 1) and strips spaces. Accept a token a customer read off a receipt, canonicalize it, and 0OoIiLl ambiguity stops being your problem. You don’t need to canonicalize tokens you generated yourself; they’re already canonical.
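The mapping is simple enough to sketch. The function below is a hypothetical stand-in, not the real helper: it assumes input is lowercased, and it skips any validation (length, illegal characters) that Misk’s own canonicalize may perform.

```kotlin
// Hypothetical sketch of the mapping described above. Lowercasing the
// remaining letters is an assumption; no validation is attempted.
fun canonicalizeSketch(input: String): String = buildString {
  for (c in input) {
    when (val lower = c.lowercaseChar()) {
      ' ' -> Unit              // strip spaces
      'o' -> append('0')       // o and O read as zero
      'i', 'l' -> append('1')  // i, I, l, and L read as one
      else -> append(lower)
    }
  }
}

fun main() {
  println(canonicalizeSketch("75DS MA7K SCYV")) // 75dsma7kscyv
  println(canonicalizeSketch("O0o IiLl"))       // 0001111
}
```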

The label argument is a namespace hint. In production it’s effectively ignored — generate() draws from SecureRandom regardless — but it shines in tests. Install TokenGeneratorModule for the real thing, or FakeTokenGeneratorModule for tests, and the fake produces sequential, predictable tokens prefixed with your label:

tokenGenerator.generate("payment")   // payment000000000000000034
tokenGenerator.generate("cst0mer")   // cst0mer000000000000000035

This is the detail that makes misk-tokens lovely to test against: you can hard-code expected tokens in assertions because the fake is deterministic. FakeTokenGenerator is a FakeFixture, so it resets between tests automatically. (One small gotcha baked into the fake: it strips u from labels, because Crockford Base32 omits u to dodge accidental profanity. Don’t be surprised when your "august" label generates as "agst".)
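If you want a feel for how little machinery a deterministic generator needs, here is a rough sketch. SequentialTokens is a hypothetical class written for this post, not the real FakeTokenGenerator, and the exact label normalization the fake applies may differ from the u-stripping shown here.

```kotlin
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for a deterministic test generator; the real fake
// may normalize labels differently.
class SequentialTokens(start: Long = 0) {
  private val next = AtomicLong(start)

  fun generate(label: String? = null, length: Int = 25): String {
    // Drop 'u' from the label, mirroring the gotcha described above.
    val prefix = (label ?: "").lowercase().replace("u", "")
    require(prefix.length < length) { "label leaves no room for a counter" }
    val counter = next.getAndIncrement().toString()
    // Zero-pad the counter so label + counter fills the requested length.
    return prefix + counter.padStart(length - prefix.length, '0')
  }
}

fun main() {
  val tokens = SequentialTokens(start = 34)
  println(tokens.generate("payment")) // 25 chars: "payment", zero padding, then 34
  println(tokens.generate("august"))  // 25 chars: "agst", zero padding, then 35
}
```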

There’s some in-flight churn here worth flagging: the source carries a parallel TokenGenerator2 interface mid-migration, with a comment-laden plan to eventually collapse the two via typealias. Today misk.tokens.TokenGenerator is a typealias for the (deprecated-in-wisp) wisp.token.TokenGenerator. Inject misk.tokens.TokenGenerator, ignore the 2, and you’ll be on the right side of history when the dust settles.

Rate limiting: the token bucket

The other half of this post is wisp.ratelimiting.RateLimiter — and despite the shared word “token,” it’s an entirely separate concern. Here a token is a unit of capacity in a token bucket: each request tries to consume one, the bucket refills over time, and when it’s empty you’re throttled.

The interface is small and honest:

interface RateLimiter {
  fun consumeToken(key: String, configuration: RateLimitConfiguration, amount: Long = 1): ConsumptionData
  fun testConsumptionAttempt(key: String, configuration: RateLimitConfiguration, amount: Long = 1): TestConsumptionResult
  fun releaseToken(key: String, configuration: RateLimitConfiguration, amount: Long = 1)
  fun availableTokens(key: String, configuration: RateLimitConfiguration): Long
  fun resetBucket(key: String, configuration: RateLimitConfiguration)
  fun <T> withToken(key: String, configuration: RateLimitConfiguration, f: () -> T): ExecutionResult<T>
}

Two pieces define a limit. The key is what you’re limiting (a source IP, a user ID, an API key), and each key gets its own bucket. The RateLimitConfiguration is the policy:

interface RateLimitConfiguration {
  val capacity: Long          // max tokens the bucket holds
  val name: String            // identifies this config (shows up in metrics)
  val refillAmount: Long      // tokens added each refill period
  val refillPeriod: Duration  // how often refillAmount is added
  val version: Long?          // bump this when you change the config
}
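To see how capacity, refillAmount, and refillPeriod interact, here is a toy single-process bucket. It is purely illustrative: SketchBucket is a hypothetical class, it refills in whole periods (which may or may not match the refill strategy Misk configures in Bucket4j), and it does none of the cross-process coordination the real backends exist for.

```kotlin
import java.time.Duration

// Hypothetical in-memory bucket for illustration only. Refill happens in
// whole periods; partial periods add nothing.
class SketchBucket(
  private val capacity: Long,
  private val refillAmount: Long,
  private val refillPeriod: Duration,
  startMillis: Long,
) {
  private var tokens = capacity          // buckets start full
  private var lastRefillMillis = startMillis

  init {
    require(refillPeriod.toMillis() > 0) { "refillPeriod must be at least 1 ms" }
  }

  @Synchronized
  fun tryConsume(nowMillis: Long, amount: Long = 1): Boolean {
    refill(nowMillis)
    if (tokens < amount) return false    // throttled
    tokens -= amount
    return true
  }

  private fun refill(nowMillis: Long) {
    val periods = (nowMillis - lastRefillMillis) / refillPeriod.toMillis()
    if (periods <= 0) return
    // Add refillAmount per elapsed period, never exceeding capacity.
    tokens = minOf(capacity, tokens + periods * refillAmount)
    lastRefillMillis += periods * refillPeriod.toMillis()
  }
}

fun main() {
  val bucket = SketchBucket(10, 10, Duration.ofMinutes(1), startMillis = 0)
  repeat(10) { bucket.tryConsume(nowMillis = 1_000) }
  println(bucket.tryConsume(nowMillis = 1_000))  // false: bucket is empty
  println(bucket.tryConsume(nowMillis = 60_000)) // true: one period elapsed
}
```

Passing the time in as a parameter keeps the sketch deterministic, which is the same property the real modules get from an injected Clock.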

The one method you’ll actually call is withToken. It’s a default method that consumes a token, runs your lambda only if a token was available, and hands back an ExecutionResult<T> carrying both your result (or null) and the ConsumptionData — didConsume, remaining, and resetTime. That last field is what you’d surface as a Retry-After header. The other methods are escape hatches: testConsumptionAttempt is a dry run (the data may be stale by the time it returns — other pods are racing you), releaseToken refunds capacity if an operation you’d already charged for didn’t happen, and resetBucket refills to max.
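Turning resetTime into a Retry-After value is a small calculation. The sketch below assumes resetTime is a java.time.Instant, which the text above does not confirm, and retryAfterSeconds is a hypothetical helper rather than anything Misk provides.

```kotlin
import java.time.Duration
import java.time.Instant

// Hypothetical helper: whole seconds until the bucket resets, rounded up
// and never negative. Assumes resetTime is an Instant.
fun retryAfterSeconds(now: Instant, resetTime: Instant): Long {
  val remaining = Duration.between(now, resetTime)
  if (remaining.isNegative || remaining.isZero) return 0
  // Round up so a client is never told to retry a fraction of a second early.
  return remaining.seconds + if (remaining.nano > 0) 1 else 0
}

fun main() {
  val now = Instant.parse("2024-01-01T00:00:00Z")
  println(retryAfterSeconds(now, now.plusMillis(1_500))) // 2
  println(retryAfterSeconds(now, now.plusSeconds(30)))   // 30
  println(retryAfterSeconds(now, now.minusSeconds(5)))   // 0
}
```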

Choosing a backend

RateLimiter is just the interface. The teeth come from Bucket4j, and Misk ships four modules that wire a distributed Bucket4j proxy manager to a backing store. Pick by what you already run:

Redis — misk-rate-limiting-bucket4j-redis, RedisBucket4jRateLimiterModule. Backed by Jedis (UnifiedJedis), CAS-based, with TTLs tuned to the refill period so idle buckets evict themselves. This is the one to reach for: rate-limit state is ephemeral and high-churn, which is exactly what Redis is good at. The module takes sensible defaults — additionalTtl, maxRetries, retryTimeout, an optional qualifier — so RedisBucket4jRateLimiterModule() with no args is a real configuration.
MySQL — misk-rate-limiting-bucket4j-mysql, MySQLBucket4jRateLimiterModule. Uses SELECT … FOR UPDATE against a table you specify (qualifier, tableName, idColumn, stateColumn). Reasonable if you already have a database and don’t want to operate Redis just for throttling — but you’re putting hot, contended writes on your relational store, and it ships a RateLimitPruner you’re expected to run to garbage-collect dead buckets.
DynamoDB v2 — misk-rate-limiting-bucket4j-dynamodb-v2, DynamoDbV2Bucket4jRateLimiterModule(tableName). Built on the AWS SDK v2 DynamoDbClient. The right DynamoDB choice today.
DynamoDB v1 — misk-rate-limiting-bucket4j-dynamodb-v1, DynamoDbV1Bucket4jRateLimiterModule. Deprecated in the source itself: “the AWS Java v1 SDK is EoL since Dec ’25.” It exists; don’t start here. If you’re on it, the deprecation note points straight at v2.

All four bind the same RateLimiter interface, so your action code never knows which backend it got. Swap modules, not call sites.

Worked example

Here’s the exemplar’s RateLimitedAction, verbatim — it’s about as compact as a rate-limited endpoint gets:

@Singleton
class RateLimitedAction @Inject constructor(private val rateLimiter: RateLimiter) : WebAction {
  @Unauthenticated
  @Get("/expensive-rate-limited-action")
  @ResponseContentType(MediaTypes.APPLICATION_JSON)
  fun rateLimitedExample(): RateLimitedExampleResponse {
    val sourceIp = "192.168.1.1"
    val result =
      rateLimiter.withToken(sourceIp, ExampleRateLimitConfiguration) {
        RateLimitedExampleResponse(Random.nextLong())
      }

    val consumptionData = result.consumptionData
    return if (consumptionData.didConsume) {
      result.result!!
    } else {
      throw TooManyRequestsException()
    }
  }
}

object ExampleRateLimitConfiguration : RateLimitConfiguration {
  override val capacity = 10L
  override val name = "ExpensiveRateLimitedAction"
  override val refillAmount = 10L
  override val refillPeriod: Duration = Duration.ofMinutes(1L)
  override val version = 0L
}

Ten requests per minute, keyed by source IP (hard-coded here for the demo — in real code you’d pull it off the request). withToken runs the body only if a token’s available; when didConsume is false, the action throws TooManyRequestsException, which Misk maps to a 429. The config is a plain object implementing the interface — no annotations, no DSL, just five values.

Testing it is where the pieces lock together. The exemplar’s AbstractRateLimitedActionTests injects a FakeClock and a MeterRegistry, then drives the limiter to its edge:

@Test
fun `should throw when we reach limit`() {
  repeat(ExampleRateLimitConfiguration.capacity.toInt()) {
    assertDoesNotThrow { rateLimitedAction.rateLimitedExample() }
  }
  assertThrows<TooManyRequestsException> { rateLimitedAction.rateLimitedExample() }

  fakeClock.add(ExampleRateLimitConfiguration.refillPeriod)
  assertDoesNotThrow { rateLimitedAction.rateLimitedExample() }   // refilled
}

The FakeClock is the trick: instead of Thread.sleep-ing for a minute to watch the bucket refill, you advance time. This works because every backend module wires the Bucket4j proxy manager with a ClockTimeMeter(clock) instead of System.currentTimeMillis(). Refill is computed off the injected Clock, so fast-forwarding the fake clock fast-forwards the refill. The abstract test runs concretely against each backend; RedisRateLimitedActionTests is the whole wiring in one small test module:
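The clock-injection idea is worth seeing in isolation. This is a generic sketch, not Misk code: ManualClock is a hypothetical test clock standing in for FakeClock, and Window is a hypothetical piece of time-dependent logic that reads whatever Clock it is handed.

```kotlin
import java.time.Clock
import java.time.Duration
import java.time.Instant
import java.time.ZoneId
import java.time.ZoneOffset

// Hypothetical test clock: time only moves when add() is called.
class ManualClock(private var now: Instant) : Clock() {
  override fun instant(): Instant = now
  override fun getZone(): ZoneId = ZoneOffset.UTC
  override fun withZone(zone: ZoneId): Clock = this // zone is irrelevant here
  fun add(d: Duration) { now = now.plus(d) }
}

// Time-dependent logic reads the injected clock, never the system time.
class Window(private val clock: Clock, private val period: Duration) {
  private val openedAt = clock.instant()
  fun hasElapsed(): Boolean = !clock.instant().isBefore(openedAt.plus(period))
}

fun main() {
  val clock = ManualClock(Instant.EPOCH)
  val window = Window(clock, Duration.ofMinutes(1))
  println(window.hasElapsed()) // false
  clock.add(Duration.ofMinutes(1))
  println(window.hasElapsed()) // true, with no sleeping involved
}
```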

@MiskTest(startService = true)
class RedisRateLimitedActionTests : AbstractRateLimitedActionTests() {
  @MiskTestModule
  val module: Module = object : KAbstractModule() {
    override fun configure() {
      install(ExemplarTestModule())
      install(RedisModule(DockerRedis.replicationGroupConfig, ConnectionPoolConfig(), useSsl = false))
      install(RedisBucket4jRateLimiterModule())
      install(RedisTestFlushModule())
      bind<MeterRegistry>().toInstance(SimpleMeterRegistry())
    }
  }

  override fun setException() { redis.close() }   // force a backend failure
}

The RateLimiterMetrics the tests assert on (consumptionAttempts tagged SUCCESS / REJECTED / EXCEPTION) are recorded through the same Micrometer MeterRegistry binding your dashboards read from in production. You get throttle observability for free.

Production notes & gotchas

Bind a Clock and a MeterRegistry, or the module won’t load. Every backend module calls requireBinding<Clock>() and requireBinding<MeterRegistry>() in configure(). If you forget either, Guice fails at startup — which is the good failure mode, but the error won’t say “rate limiter,” it’ll say “missing binding.” Now you know.
Don’t key on a spoofable header without thinking. The exemplar keys on a hard-coded IP for demonstration. A real X-Forwarded-For is client-controllable; rate-limit by something you trust — an authenticated caller, a verified API key — or you’ve built a limiter that the abusive client simply rotates around.
Backend exceptions propagate. consumeToken raises whatever the proxy manager throws — JedisException and friends — rather than silently allowing or denying. Decide your fail-open vs. fail-closed policy explicitly; the test’s setException() closing the Redis connection exists precisely to exercise that path. The EXCEPTION metric is there to alert on it.
MySQL buckets need pruning. The MySQL module ships a RateLimitPruner and a prunerPageSize because dead buckets accumulate as rows. Redis evicts via TTL on its own; MySQL does not. Schedule the pruner or watch the table grow.
testConsumptionAttempt is advisory, not a reservation. It tells you whether a token could have been consumed — but between that call and the real consumeToken, another pod may have drained the bucket. Use it for “show the user their remaining quota,” never as a check-then-act guard.
misk tokens and rate-limiter “tokens” are unrelated. Same word, different module, different problem. A TokenGenerator token is an unguessable ID; a rate-limiter token is a unit of capacity. Don’t let the vocabulary collision leak into your variable names.
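
On the note about backend exceptions: one way to make the fail-open vs. fail-closed decision explicit is to put it in a type. The sketch below is hypothetical and deliberately library-free; OnBackendFailure and isAllowed are names invented here, and attempt stands in for whatever call returns true when a token was consumed and throws when the backend is down.

```kotlin
// Hypothetical policy: what to do when the limiter's backend throws.
enum class OnBackendFailure { ALLOW, DENY }

// `attempt` returns true if a token was consumed, false if throttled,
// and throws if the backing store is unavailable.
fun isAllowed(policy: OnBackendFailure, attempt: () -> Boolean): Boolean =
  try {
    attempt()
  } catch (e: Exception) {
    // A real service would log and count this; here we only apply the policy.
    policy == OnBackendFailure.ALLOW
  }

fun main() {
  val backendDown: () -> Boolean = { throw IllegalStateException("backend down") }
  println(isAllowed(OnBackendFailure.ALLOW, backendDown)) // true: fail open
  println(isAllowed(OnBackendFailure.DENY, backendDown))  // false: fail closed
  println(isAllowed(OnBackendFailure.DENY) { true })      // true: token consumed
}
```

Fail-open protects availability at the cost of letting abuse through during an outage; fail-closed does the reverse. Which one fits depends on what the endpoint guards.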

What’s next

We’ve covered minting IDs and rejecting excess load — both synchronous, both in the request path. Plenty of work, though, has no business blocking a response: sending the email, reindexing the record, settling the batch. In Part 18: Misk Job Queues we’ll hand that work to a queue and get it off the hot path entirely.
