Rate Limiting & Circuit Breakers with Resilience4j
Add resilience to Spring Boot 4 services with Resilience4j — circuit breakers, retries, rate limiters, bulkheads, and how to configure them for production.
External services fail. Databases slow down. APIs get overloaded. Without resilience patterns, one failing dependency brings down your entire application. Resilience4j provides circuit breakers, retries, rate limiters, and bulkheads — and integrates cleanly with Spring Boot.
Dependencies
dependencies {
    implementation("org.springframework.boot:spring-boot-starter-web")
    implementation("org.springframework.boot:spring-boot-starter-actuator")
    implementation("org.springframework.boot:spring-boot-starter-aop")
    implementation("io.github.resilience4j:resilience4j-spring-boot3:2.2.0")
    implementation("io.github.resilience4j:resilience4j-micrometer:2.2.0")
}
The AOP starter is required — Resilience4j uses annotations that are implemented as aspects.
Circuit breaker
A circuit breaker tracks failures. When failures exceed a threshold, it “opens” and immediately rejects requests without calling the downstream service. After a wait period, it allows a few test requests through. If those succeed, it “closes” again.
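The state machine can be sketched in plain Kotlin. This is a simplified illustration of the transitions only, not Resilience4j's implementation (which adds sliding windows, clocks, and thread safety); all names here are hypothetical:

```kotlin
// Simplified circuit breaker state machine: CLOSED -> OPEN -> HALF_OPEN -> CLOSED.
enum class State { CLOSED, OPEN, HALF_OPEN }

class SimpleCircuitBreaker(
    private val failureThreshold: Int,    // consecutive failures before opening
    private val permittedTrialCalls: Int  // test calls required to close from HALF_OPEN
) {
    var state = State.CLOSED
        private set
    private var failures = 0
    private var trialSuccesses = 0

    fun onSuccess() {
        when (state) {
            State.CLOSED -> failures = 0
            State.HALF_OPEN -> {
                trialSuccesses++
                if (trialSuccesses >= permittedTrialCalls) {
                    state = State.CLOSED
                    failures = 0
                }
            }
            State.OPEN -> Unit
        }
    }

    fun onFailure() {
        when (state) {
            State.CLOSED -> if (++failures >= failureThreshold) state = State.OPEN
            State.HALF_OPEN -> state = State.OPEN  // a failed trial call reopens
            State.OPEN -> Unit
        }
    }

    // Called once wait-duration-in-open-state has elapsed
    fun onWaitDurationElapsed() {
        if (state == State.OPEN) {
            state = State.HALF_OPEN
            trialSuccesses = 0
        }
    }
}
```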
Configuration
resilience4j:
  circuitbreaker:
    instances:
      paymentService:
        sliding-window-type: COUNT_BASED
        sliding-window-size: 10
        failure-rate-threshold: 50
        wait-duration-in-open-state: 10s
        permitted-number-of-calls-in-half-open-state: 3
        minimum-number-of-calls: 5
        record-exceptions:
          - java.io.IOException
          - java.net.SocketTimeoutException
          - org.springframework.web.client.HttpServerErrorException
What these settings mean:
- Track the last 10 calls
- If 50% or more fail, open the circuit
- Stay open for 10 seconds
- Then allow 3 test calls (half-open state)
- Need at least 5 calls before calculating failure rate
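The failure-rate decision itself is simple to express. A sketch of the count-based window logic described above (a hypothetical helper in plain Kotlin, not a Resilience4j API):

```kotlin
// Decide whether the circuit should open, based on a count-based sliding window
// of call outcomes. Mirrors sliding-window-size, failure-rate-threshold, and
// minimum-number-of-calls from the configuration above.
fun shouldOpen(
    outcomes: List<Boolean>,             // true = failure, most recent call last
    windowSize: Int = 10,                // sliding-window-size
    failureRateThreshold: Double = 50.0, // failure-rate-threshold (percent)
    minimumCalls: Int = 5                // minimum-number-of-calls
): Boolean {
    val window = outcomes.takeLast(windowSize)
    if (window.size < minimumCalls) return false  // not enough data yet
    val failureRate = window.count { it } * 100.0 / window.size
    return failureRate >= failureRateThreshold
}
```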
Usage with annotations
package com.example.demo.service

import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker
import org.slf4j.LoggerFactory
import org.springframework.stereotype.Service
import org.springframework.web.client.RestClient

@Service
class PaymentService(
    private val restClient: RestClient
) {
    private val log = LoggerFactory.getLogger(PaymentService::class.java)

    @CircuitBreaker(name = "paymentService", fallbackMethod = "paymentFallback")
    fun processPayment(orderId: String, amount: Double): PaymentResult {
        val response = restClient.post()
            .uri("/payments")
            .body(PaymentRequest(orderId, amount))
            .retrieve()
            .body(PaymentResult::class.java)
        return response ?: throw PaymentException("Empty response from payment gateway")
    }

    private fun paymentFallback(orderId: String, amount: Double, ex: Exception): PaymentResult {
        log.warn("Payment circuit breaker fallback for order $orderId: ${ex.message}")
        return PaymentResult(
            success = false,
            message = "Payment service temporarily unavailable. Order queued for retry."
        )
    }
}

data class PaymentRequest(val orderId: String, val amount: Double)
data class PaymentResult(val success: Boolean, val message: String = "")

class PaymentException(message: String) : RuntimeException(message)
The fallback method must have the same parameters as the original method, plus the exception as the last parameter, and the same return type. It runs when the circuit is open or when the call fails.
Programmatic usage
When you need more control:
package com.example.demo.service

import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry
import org.springframework.stereotype.Service
import org.springframework.web.client.RestClient

@Service
class InventoryService(
    circuitBreakerRegistry: CircuitBreakerRegistry,
    private val restClient: RestClient
) {
    private val circuitBreaker = circuitBreakerRegistry.circuitBreaker("inventoryService")

    fun checkStock(productId: String): StockResult {
        return circuitBreaker.executeSupplier {
            restClient.get()
                .uri("/inventory/$productId")
                .retrieve()
                .body(StockResult::class.java)
                ?: throw RuntimeException("No response from inventory service")
        }
    }
}

data class StockResult(val productId: String, val quantity: Int, val available: Boolean)
Retry
Retries handle transient failures — network glitches, temporary 503s, connection resets.
Configuration
resilience4j:
  retry:
    instances:
      paymentService:
        max-attempts: 3
        wait-duration: 500ms
        enable-exponential-backoff: true
        exponential-backoff-multiplier: 2
        retry-exceptions:
          - java.io.IOException
          - java.net.SocketTimeoutException
        ignore-exceptions:
          - com.example.demo.service.PaymentDeclinedException
Exponential backoff: the first retry waits 500ms, the second 1000ms. Note that max-attempts: 3 counts the initial call, so there are at most two retries.
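The wait sequence follows wait-duration multiplied by the backoff multiplier raised to the retry index. A quick sketch to compute it (a hypothetical helper, not part of Resilience4j):

```kotlin
// Retry wait times under exponential backoff: initialWait * multiplier^retryIndex.
// max-attempts counts the initial call, so maxAttempts attempts = maxAttempts - 1 retries.
fun backoffDelaysMs(maxAttempts: Int, initialWaitMs: Long, multiplier: Double): List<Long> {
    return (0 until maxAttempts - 1).map { retryIndex ->
        (initialWaitMs * Math.pow(multiplier, retryIndex.toDouble())).toLong()
    }
}
```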
Usage
@Retry(name = "paymentService", fallbackMethod = "paymentFallback")
@CircuitBreaker(name = "paymentService", fallbackMethod = "paymentFallback")
fun processPayment(orderId: String, amount: Double): PaymentResult {
    // ... call external service
}
Order matters. Resilience4j applies decorators in this order: Retry → CircuitBreaker → RateLimiter → Bulkhead. The retry wraps the circuit breaker — each retry attempt is tracked by the circuit breaker.
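The wrapping can be visualized with plain higher-order functions. Each guard decorates the next, and the trace records the order in which they run (names here are illustrative, not Resilience4j types):

```kotlin
// Each resilience pattern behaves like a decorator: it takes a call and returns
// a guarded call. Composing them reproduces the Retry -> CircuitBreaker ->
// RateLimiter -> Bulkhead order described above.
fun interface Guard {
    fun decorate(call: () -> String): () -> String
}

val trace = mutableListOf<String>()

// A guard that records its name before delegating to the wrapped call
fun named(name: String) = Guard { call ->
    { trace.add(name); call() }
}

// Retry is outermost; the bulkhead is innermost, closest to the real call
val guarded = named("retry").decorate(
    named("circuitBreaker").decorate(
        named("rateLimiter").decorate(
            named("bulkhead").decorate { trace.add("call"); "ok" }
        )
    )
)
```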
Rate limiter
Rate limiting prevents your service from being overwhelmed — either by external callers or by your own code calling a downstream service too fast.
Configuration
resilience4j:
  ratelimiter:
    instances:
      apiRateLimit:
        limit-for-period: 100
        limit-refresh-period: 1s
        timeout-duration: 0s
      paymentGateway:
        limit-for-period: 10
        limit-refresh-period: 1s
        timeout-duration: 500ms
apiRateLimit: 100 requests per second. If exceeded, immediately reject (timeout 0).
paymentGateway: 10 requests per second. Wait up to 500ms for a permit before failing.
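The permit model behind these settings can be sketched with a fixed refresh window and an injected clock for determinism. This is a simplification; Resilience4j's real limiter is more sophisticated, and the class below is hypothetical:

```kotlin
// Fixed-window permit model: each refresh period restores the full permit
// budget (limit-for-period); with a timeout of zero, callers are rejected
// immediately once the budget is spent.
class SimpleRateLimiter(
    private val limitForPeriod: Int,
    private val refreshPeriodMs: Long
) {
    private var permits = limitForPeriod
    private var windowStartMs = 0L

    // nowMs is passed in so the behavior is deterministic and testable
    fun tryAcquire(nowMs: Long): Boolean {
        if (nowMs - windowStartMs >= refreshPeriodMs) {
            windowStartMs = nowMs
            permits = limitForPeriod  // refresh the budget for the new window
        }
        if (permits == 0) return false  // timeout-duration: 0s -> reject now
        permits--
        return true
    }
}
```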
Rate limiting an API endpoint
package com.example.demo.controller

import io.github.resilience4j.ratelimiter.annotation.RateLimiter
import org.springframework.http.ResponseEntity
import org.springframework.web.bind.annotation.GetMapping
import org.springframework.web.bind.annotation.RequestMapping
import org.springframework.web.bind.annotation.RequestParam
import org.springframework.web.bind.annotation.RestController

@RestController
@RequestMapping("/api/v1/search")
class SearchController(
    private val searchService: SearchService
) {
    @GetMapping
    @RateLimiter(name = "apiRateLimit", fallbackMethod = "rateLimitFallback")
    fun search(@RequestParam q: String): ResponseEntity<List<SearchResult>> {
        return ResponseEntity.ok(searchService.search(q))
    }

    private fun rateLimitFallback(q: String, ex: Exception): ResponseEntity<List<SearchResult>> {
        return ResponseEntity.status(429)
            .header("Retry-After", "1")
            .build()
    }
}
Rate limiting outbound calls
@RateLimiter(name = "paymentGateway")
fun processPayment(orderId: String, amount: Double): PaymentResult {
    return restClient.post()
        .uri("/payments")
        .body(PaymentRequest(orderId, amount))
        .retrieve()
        .body(PaymentResult::class.java)!!
}
This prevents your service from overwhelming the payment gateway even under high load.
Bulkhead
A bulkhead limits concurrent calls to a service. If one downstream service is slow, it doesn’t consume all your threads.
Configuration
resilience4j:
  bulkhead:
    instances:
      paymentService:
        max-concurrent-calls: 10
        max-wait-duration: 500ms
Maximum 10 concurrent calls to the payment service. Additional calls wait up to 500ms for a slot.
Usage
@Bulkhead(name = "paymentService", fallbackMethod = "paymentFallback")
@CircuitBreaker(name = "paymentService", fallbackMethod = "paymentFallback")
fun processPayment(orderId: String, amount: Double): PaymentResult {
    // ... call external service
}
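Under the hood, the default bulkhead is semaphore-based. A minimal sketch of that idea (a hypothetical class, not the Resilience4j implementation):

```kotlin
import java.util.concurrent.Semaphore
import java.util.concurrent.TimeUnit

// Semaphore-based bulkhead: at most maxConcurrentCalls callers execute at once
// (max-concurrent-calls); a waiter gives up after maxWaitMs (max-wait-duration).
class SimpleBulkhead(maxConcurrentCalls: Int, private val maxWaitMs: Long) {
    private val permits = Semaphore(maxConcurrentCalls)

    fun <T> execute(call: () -> T): T {
        if (!permits.tryAcquire(maxWaitMs, TimeUnit.MILLISECONDS)) {
            throw IllegalStateException("Bulkhead full")
        }
        try {
            return call()
        } finally {
            permits.release()  // always free the slot, even on failure
        }
    }
}
```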
Thread pool bulkhead
For isolating calls in a separate thread pool:
resilience4j:
  thread-pool-bulkhead:
    instances:
      paymentService:
        max-thread-pool-size: 10
        core-thread-pool-size: 5
        queue-capacity: 20
This runs payment calls in a dedicated thread pool. If the payment service is slow, it ties up its own threads, not the main Tomcat request pool. Note that with the annotation approach, a thread-pool bulkhead requires the method to return a CompletionStage (such as CompletableFuture), because the call executes asynchronously.
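The settings above map directly onto a java.util.concurrent pool. A sketch of an equivalent dedicated pool (illustrative only, not Resilience4j's own executor wiring):

```kotlin
import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.ThreadPoolExecutor
import java.util.concurrent.TimeUnit

// A dedicated, bounded pool per downstream dependency. When both the pool and
// the queue are full, new submissions are rejected instead of tying up request
// threads.
fun paymentPool(): ThreadPoolExecutor = ThreadPoolExecutor(
    5,                                  // core-thread-pool-size
    10,                                 // max-thread-pool-size
    60, TimeUnit.SECONDS,               // keep-alive for idle non-core threads
    ArrayBlockingQueue<Runnable>(20),   // queue-capacity
    ThreadPoolExecutor.AbortPolicy()    // reject when saturated
)
```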
Combining patterns
Stack annotations for full resilience:
@Retry(name = "paymentService")
@CircuitBreaker(name = "paymentService", fallbackMethod = "paymentFallback")
@RateLimiter(name = "paymentGateway")
@Bulkhead(name = "paymentService")
fun processPayment(orderId: String, amount: Double): PaymentResult {
    return restClient.post()
        .uri("/payments")
        .body(PaymentRequest(orderId, amount))
        .retrieve()
        .body(PaymentResult::class.java)!!
}
Execution order (outermost to innermost):
- Retry — retries the whole guarded call on failure
- Circuit breaker — tracks failures, opens when threshold exceeded
- Rate limiter — limits call rate
- Bulkhead — limits concurrent calls
Monitoring with Actuator
Resilience4j publishes metrics to Micrometer automatically:
management:
  endpoints:
    web:
      exposure:
        include: health, prometheus, circuitbreakers, ratelimiters
  health:
    circuitbreakers:
      enabled: true
    ratelimiters:
      enabled: true
Health status
GET /actuator/health
{
  "components": {
    "circuitBreakers": {
      "status": "UP",
      "details": {
        "paymentService": {
          "status": "UP",
          "details": {
            "state": "CLOSED",
            "failureRate": "0.0%"
          }
        }
      }
    }
  }
}
Prometheus metrics
# Circuit breaker state (one time series per state; value 1 for the current state)
resilience4j_circuitbreaker_state{name="paymentService",state="closed"}
# Failure rate
resilience4j_circuitbreaker_failure_rate{name="paymentService"}
# Rate limiter available permissions
resilience4j_ratelimiter_available_permissions{name="apiRateLimit"}
# Bulkhead available concurrent calls
resilience4j_bulkhead_available_concurrent_calls{name="paymentService"}
Event listeners
React to circuit breaker state changes:
package com.example.demo.config

import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry
import io.github.resilience4j.circuitbreaker.event.CircuitBreakerOnStateTransitionEvent
import jakarta.annotation.PostConstruct
import org.slf4j.LoggerFactory
import org.springframework.stereotype.Component

@Component
class CircuitBreakerEventListener(
    private val circuitBreakerRegistry: CircuitBreakerRegistry
) {
    private val log = LoggerFactory.getLogger(CircuitBreakerEventListener::class.java)

    @PostConstruct
    fun registerListeners() {
        circuitBreakerRegistry.circuitBreaker("paymentService")
            .eventPublisher
            .onStateTransition { event: CircuitBreakerOnStateTransitionEvent ->
                log.warn(
                    "Circuit breaker '${event.circuitBreakerName}' " +
                        "transitioned from ${event.stateTransition.fromState} " +
                        "to ${event.stateTransition.toState}"
                )
                // Send alert to Slack, PagerDuty, etc.
            }
    }
}
When to use what
| Pattern | Use when | Example |
|---|---|---|
| Circuit breaker | Downstream service might be down | Payment gateway, external APIs |
| Retry | Transient failures are likely | Network glitches, temporary 503s |
| Rate limiter | You need to control throughput | API rate limiting, third-party API quotas |
| Bulkhead | Slow service might consume all threads | Slow database queries, file uploads |
Common mistakes
Retrying non-idempotent operations. Only retry operations that are safe to repeat. POST requests that create resources might create duplicates.
Setting retry delays too low. If the downstream service is overloaded, rapid retries make it worse. Use exponential backoff.
Circuit breaker threshold too sensitive. Two failures in the first three calls is a 67% failure rate, but three calls is too little data to act on. Set minimum-number-of-calls high enough that a handful of cold-start failures can't open the circuit.
Not testing failure scenarios. Use WireMock or Toxiproxy to simulate downstream failures and verify your fallbacks work.
Resilience patterns prevent cascading failures. Start with circuit breakers on external service calls, add retries for transient failures, and use rate limiters to protect both your service and your dependencies.