Spring Boot 4 + MongoDB — Aggregations, Schema Evolution & Repository Patterns
Build a Spring Boot 4 application with MongoDB — document modeling in Kotlin, Spring Data repositories, MongoTemplate queries, aggregation pipelines, and schema evolution strategies.
MongoDB fits when your data is naturally hierarchical, schemas evolve frequently, or you need flexible querying over nested documents. If you’re joining six tables to render one screen, a document model might simplify things. MongoDB has supported multi-document ACID transactions since version 4.0, but if most of your writes span multiple entities and need that guarantee, a relational database such as PostgreSQL is usually the better fit.
This post covers the full Spring Boot 4 + MongoDB setup — from document modeling to aggregation pipelines.
Project setup
Add the MongoDB starter and Testcontainers for integration tests:
dependencies {
    implementation("org.springframework.boot:spring-boot-starter-data-mongodb")
    testImplementation("org.springframework.boot:spring-boot-testcontainers")
    testImplementation("org.testcontainers:mongodb")
}
application.yml:
spring:
  data:
    mongodb:
      uri: mongodb://localhost:27017/myapp
      auto-index-creation: true
For tests, Testcontainers spins up a real MongoDB instance:
@SpringBootTest
@Testcontainers
class OrderRepositoryTest {

    companion object {
        @Container
        val mongo = MongoDBContainer("mongo:7")

        @DynamicPropertySource
        @JvmStatic
        fun mongoProperties(registry: DynamicPropertyRegistry) {
            registry.add("spring.data.mongodb.uri") { mongo.replicaSetUrl }
        }
    }
}
Document modeling in Kotlin
import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.mapping.Document
import java.math.BigDecimal
import java.time.Instant

@Document(collection = "orders")
data class Order(
    @Id
    val id: String? = null,
    val customerId: String,
    val items: List<OrderItem>,
    val status: OrderStatus,
    val total: BigDecimal,
    val createdAt: Instant = Instant.now(),
    val updatedAt: Instant = Instant.now()
)

data class OrderItem(
    val productId: String,
    val name: String,
    val quantity: Int,
    val price: BigDecimal
)

enum class OrderStatus {
    PENDING, CONFIRMED, SHIPPED, DELIVERED, CANCELLED
}
Embedded objects (OrderItem) live inside the parent document — no separate collection, no joins. This works when the embedded data is bounded (an order has a finite number of items) and always accessed with the parent.
Avoid unbounded arrays. If a document could have thousands of embedded items, use a separate collection with a reference instead.
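As a rough sketch of the referencing approach, assume an order accumulates an open-ended stream of tracking events. OrderEvent, the order_events collection, and OrderEventRepository are hypothetical names, not part of the model above. Each event is its own document that points back to the order by id, and reads are paged instead of loading everything at once.

```kotlin
import org.springframework.data.annotation.Id
import org.springframework.data.domain.Pageable
import org.springframework.data.mongodb.core.index.Indexed
import org.springframework.data.mongodb.core.mapping.Document
import org.springframework.data.mongodb.repository.MongoRepository
import java.time.Instant

// Hypothetical child document: one per event, stored outside the Order document
@Document(collection = "order_events")
data class OrderEvent(
    @Id
    val id: String? = null,
    // Reference to Order.id; indexed because every lookup filters on it
    @Indexed
    val orderId: String,
    val type: String,
    val occurredAt: Instant = Instant.now()
)

interface OrderEventRepository : MongoRepository<OrderEvent, String> {
    // Derived query: newest events first, limited to the requested page
    fun findByOrderIdOrderByOccurredAtDesc(orderId: String, pageable: Pageable): List<OrderEvent>
}
```

A caller would fetch the latest events with something like findByOrderIdOrderByOccurredAtDesc(orderId, PageRequest.of(0, 50)), so the Order document itself stays small no matter how many events pile up.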
Spring Data Repository
import org.springframework.data.mongodb.repository.MongoRepository
import org.springframework.data.mongodb.repository.Query
import java.time.Instant

interface OrderRepository : MongoRepository<Order, String> {
    fun findByCustomerId(customerId: String): List<Order>
    fun findByStatus(status: OrderStatus): List<Order>
    fun findByCustomerIdAndStatus(customerId: String, status: OrderStatus): List<Order>

    @Query("{ 'createdAt': { '\$gte': ?0, '\$lte': ?1 } }")
    fun findByDateRange(start: Instant, end: Instant): List<Order>

    @Query("{ 'items.productId': ?0 }")
    fun findByProductId(productId: String): List<Order>
}
Derived queries (findByCustomerId) work the same as JPA — Spring generates the MongoDB query from the method name. For anything beyond simple filters, use @Query with MongoDB query syntax.
MongoTemplate for dynamic queries
When queries need to be built at runtime — optional filters, conditional sorting, dynamic field selection — MongoTemplate gives you full control:
import org.springframework.data.mongodb.core.MongoTemplate
import org.springframework.data.mongodb.core.query.Criteria
import org.springframework.data.mongodb.core.query.Query
import org.springframework.stereotype.Service
import java.math.BigDecimal

@Service
class OrderQueryService(
    private val mongoTemplate: MongoTemplate
) {
    fun searchOrders(
        customerId: String?,
        status: OrderStatus?,
        minTotal: BigDecimal?
    ): List<Order> {
        // Collect one condition per filter the caller actually supplied
        val conditions = mutableListOf<Criteria>()
        customerId?.let { conditions.add(Criteria.where("customerId").`is`(it)) }
        status?.let { conditions.add(Criteria.where("status").`is`(it)) }
        minTotal?.let { conditions.add(Criteria.where("total").gte(it)) }

        // With no filters, the empty Criteria matches every order
        val criteria = Criteria()
        if (conditions.isNotEmpty()) {
            criteria.andOperator(conditions)
        }
        return mongoTemplate.find(Query(criteria), Order::class.java)
    }
}
Use MongoRepository for standard CRUD and simple queries. Switch to MongoTemplate when you need dynamic query construction or aggregation pipelines.
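The service above only covers optional filters. Conditional sorting and field selection can be layered onto the same Query object. The following is a minimal sketch, not a definitive implementation: buildOrderQuery and the sortableFields allow-list are hypothetical names introduced for illustration.

```kotlin
import org.springframework.data.domain.Sort
import org.springframework.data.mongodb.core.query.Criteria
import org.springframework.data.mongodb.core.query.Query

// Hypothetical allow-list: only these fields may be used for sorting
private val sortableFields = setOf("createdAt", "total", "status")

// Builds a query with an optional filter, optional sort, optional projection, and a result cap
fun buildOrderQuery(
    customerId: String?,
    sortBy: String?,
    descending: Boolean = true,
    fields: List<String> = emptyList(),
    limit: Int = 50
): Query {
    val query = Query()
    customerId?.let { query.addCriteria(Criteria.where("customerId").`is`(it)) }

    // Ignore sort keys outside the allow-list instead of passing caller input through
    if (sortBy != null && sortBy in sortableFields) {
        val direction = if (descending) Sort.Direction.DESC else Sort.Direction.ASC
        query.with(Sort.by(direction, sortBy))
    }

    // Project only the requested fields; an empty list returns whole documents
    fields.forEach { query.fields().include(it) }

    return query.limit(limit)
}
```

Without a projection the result can go straight to mongoTemplate.find(query, Order::class.java). With a projection, some constructor parameters of Order would be missing, so mapping to org.bson.Document or a narrower class via mongoTemplate.find(query, Document::class.java, "orders") is the safer choice.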
Aggregation pipelines
Aggregation is MongoDB’s equivalent of SQL GROUP BY with much more flexibility. Each stage transforms the documents flowing through the pipeline.
import org.springframework.data.domain.Sort
import org.springframework.data.mongodb.core.MongoTemplate
import org.springframework.data.mongodb.core.aggregation.Aggregation
import org.springframework.data.mongodb.core.aggregation.AggregationResults
import org.springframework.data.mongodb.core.query.Criteria
import org.springframework.stereotype.Service
import java.math.BigDecimal

data class CustomerOrderSummary(
    val customerId: String,
    val totalOrders: Long,
    val totalSpent: BigDecimal,
    val averageOrderValue: BigDecimal
)

// The class name is illustrative; any bean with an injected MongoTemplate works
@Service
class OrderReportService(
    private val mongoTemplate: MongoTemplate
) {
    fun getCustomerSummaries(): List<CustomerOrderSummary> {
        val aggregation = Aggregation.newAggregation(
            Aggregation.match(Criteria.where("status").ne(OrderStatus.CANCELLED)),
            Aggregation.group("customerId")
                .count().`as`("totalOrders")
                .sum("total").`as`("totalSpent")
                .avg("total").`as`("averageOrderValue"),
            Aggregation.project()
                .and("_id").`as`("customerId")
                .andInclude("totalOrders", "totalSpent", "averageOrderValue"),
            Aggregation.sort(Sort.Direction.DESC, "totalSpent")
        )
        val results: AggregationResults<CustomerOrderSummary> =
            mongoTemplate.aggregate(aggregation, "orders", CustomerOrderSummary::class.java)
        return results.mappedResults
    }
}
Pipeline breakdown:
- match: filter out cancelled orders (do this early to reduce documents processed)
- group: group by customerId, calculate count, sum, and average
- project: reshape the output, rename _id to customerId
- sort: order by total spent
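The sum and avg stages only produce meaningful numbers if total is stored as a numeric BSON type. Spring Data MongoDB has historically written BigDecimal as a string unless told otherwise, so it is worth checking what your version does. One way to make the storage type explicit is sketched below on a hypothetical PricedOrder class, a reduced stand-in for the Order model rather than a replacement for it.

```kotlin
import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.mapping.Document
import org.springframework.data.mongodb.core.mapping.Field
import org.springframework.data.mongodb.core.mapping.FieldType
import java.math.BigDecimal

// Hypothetical variant of Order, reduced to the fields that matter here
@Document(collection = "orders")
data class PricedOrder(
    @Id
    val id: String? = null,
    val customerId: String,
    // Store as BSON Decimal128 so $sum, $avg and range filters compare numbers, not strings
    @Field(targetType = FieldType.DECIMAL128)
    val total: BigDecimal
)
```

Documents that were already written with string totals are not converted by this annotation; they would need a backfill before the aggregation results can be trusted.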
For joins across collections, use $lookup:
val aggregation = Aggregation.newAggregation(
    Aggregation.lookup("customers", "customerId", "_id", "customer"),
    Aggregation.unwind("customer"),
    Aggregation.project()
        .andInclude("total", "status", "createdAt")
        .and("customer.name").`as`("customerName")
)
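To run the $lookup pipeline and map its output, something along these lines should work. It is a sketch under assumptions: OrderWithCustomer and both function names are hypothetical, and the customers collection is assumed to hold documents with a name field.

```kotlin
import org.springframework.data.mongodb.core.MongoTemplate
import org.springframework.data.mongodb.core.aggregation.Aggregation
import java.math.BigDecimal
import java.time.Instant

// Hypothetical result shape for the joined documents
data class OrderWithCustomer(
    val total: BigDecimal,
    val status: String,
    val createdAt: Instant,
    val customerName: String
)

// Same stages as above, wrapped in a function so the pipeline can be reused and inspected
fun orderWithCustomerAggregation(): Aggregation =
    Aggregation.newAggregation(
        // orders.customerId must have the same BSON type as customers._id, or nothing matches
        Aggregation.lookup("customers", "customerId", "_id", "customer"),
        // $lookup yields an array; unwind turns it into one embedded customer per order
        Aggregation.unwind("customer"),
        Aggregation.project()
            .andInclude("total", "status", "createdAt")
            .and("customer.name").`as`("customerName")
    )

fun findOrdersWithCustomer(mongoTemplate: MongoTemplate): List<OrderWithCustomer> =
    mongoTemplate
        .aggregate(orderWithCustomerAggregation(), "orders", OrderWithCustomer::class.java)
        .mappedResults
```

Two things to watch: a plain unwind drops orders that have no matching customer, and the join silently returns nothing if customers._id is stored as an ObjectId while orders.customerId is a plain string.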
Schema evolution
MongoDB doesn’t enforce a schema, but your application code does. When you add a field to your Kotlin class, existing documents won’t have it.
Adding a field with a default
@Document(collection = "orders")
data class Order(
    @Id
    val id: String? = null,
    val customerId: String,
    val items: List<OrderItem>,
    val status: OrderStatus,
    val total: BigDecimal,
    val currency: String = "USD", // new field; old documents fall back to this default
    val createdAt: Instant = Instant.now(),
    val updatedAt: Instant = Instant.now()
)
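Kotlin defaults handle this: when Spring Data maps an old document that has no currency field, it falls back to the constructor default "USD". The stored document itself is not changed until it is saved again.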
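This behaviour can be checked without a running database by pushing a raw document through the converter directly. The sketch below assumes kotlin-reflect is on the classpath (Spring Data needs it for Kotlin defaults) and uses a hypothetical, cut-down LegacyOrder class instead of the full Order.

```kotlin
import org.bson.Document
import org.springframework.data.mongodb.core.convert.MappingMongoConverter
import org.springframework.data.mongodb.core.convert.NoOpDbRefResolver
import org.springframework.data.mongodb.core.mapping.MongoMappingContext

// Hypothetical, cut-down class: one original field plus one added later with a default
data class LegacyOrder(
    val customerId: String,
    val currency: String = "USD"
)

// Reads a raw BSON document through the same converter type MongoTemplate uses,
// so the mapping can be exercised without a database connection
fun readLegacyOrder(stored: Document): LegacyOrder {
    val converter = MappingMongoConverter(NoOpDbRefResolver.INSTANCE, MongoMappingContext())
    converter.afterPropertiesSet()
    return requireNotNull(converter.read(LegacyOrder::class.java, stored))
}
```

Calling readLegacyOrder(Document("customerId", "c-1")) should yield an object whose currency is "USD", while a document that carries its own currency value keeps it. According to the Spring Data reference, the default also applies when the stored value is null.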
Backfilling existing documents
For fields that need real values, run a migration:
mongoTemplate.updateMulti(
    Query(Criteria.where("currency").exists(false)),
    Update().set("currency", "USD"),
    Order::class.java
)
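Where that call lives is up to you. One option, sketched here with hypothetical names (CurrencyBackfill, missingCurrencyQuery, setDefaultCurrency), is a startup runner. Because the query only selects documents that still lack the field, running it again changes nothing.

```kotlin
import org.slf4j.LoggerFactory
import org.springframework.boot.ApplicationArguments
import org.springframework.boot.ApplicationRunner
import org.springframework.data.mongodb.core.MongoTemplate
import org.springframework.data.mongodb.core.query.Criteria
import org.springframework.data.mongodb.core.query.Query
import org.springframework.data.mongodb.core.query.Update
import org.springframework.stereotype.Component

// Selects only documents that still lack the field, so re-running is a no-op
fun missingCurrencyQuery(): Query = Query(Criteria.where("currency").exists(false))

fun setDefaultCurrency(): Update = Update().set("currency", "USD")

// Hypothetical startup hook; a dedicated migration tool may suit larger projects better
@Component
class CurrencyBackfill(private val mongoTemplate: MongoTemplate) : ApplicationRunner {
    private val log = LoggerFactory.getLogger(javaClass)

    override fun run(args: ApplicationArguments) {
        val result = mongoTemplate.updateMulti(missingCurrencyQuery(), setDefaultCurrency(), "orders")
        log.info("Backfilled currency on {} orders", result.modifiedCount)
    }
}
```

Logging the modified count gives a cheap sanity check: after the first successful run it should drop to zero.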
Removing a field
Remove it from the Kotlin class. Old documents still have it in the database, but your application ignores it. Optionally clean up:
mongoTemplate.updateMulti(
    // Only touch documents that still carry the field
    Query(Criteria.where("oldField").exists(true)),
    Update().unset("oldField"),
    "orders"
)
Common mistakes
- Treating MongoDB like a relational database: normalizing everything into separate collections and joining them defeats the purpose. Embed related data when it’s accessed together.
- Unbounded arrays: an array that grows without limit can hit the 16 MB document size limit and slows down writes as the document grows. Use a separate collection with references instead.
- Missing indexes: without an index on the fields you query, MongoDB scans the whole collection. See the MongoDB Indexes post for a deep dive on index types and explain() analysis.
- Not using Testcontainers: an embedded fake MongoDB doesn’t behave identically to the real thing. Testcontainers gives you a real instance with minimal setup.
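On the missing-indexes point, indexes can be declared on the document class, and with auto-index-creation enabled as in the configuration above, Spring Data creates them at startup. The sketch below uses a hypothetical IndexedOrder class, a reduced copy of Order that only shows the declarations; the index name is also made up.

```kotlin
import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.index.CompoundIndex
import org.springframework.data.mongodb.core.index.Indexed
import org.springframework.data.mongodb.core.mapping.Document
import java.time.Instant

// Hypothetical, reduced copy of Order that only shows index declarations
@Document(collection = "orders")
@CompoundIndex(name = "customer_status_idx", def = "{'customerId': 1, 'status': 1}")
data class IndexedOrder(
    @Id
    val id: String? = null,
    // Prefix of the compound index, so queries on customerId alone can use it too
    val customerId: String,
    val status: String,
    // Single-field index for date-range queries
    @Indexed
    val createdAt: Instant = Instant.now()
)
```

The compound index serves both findByCustomerId and findByCustomerIdAndStatus, while a query on status alone would still need its own index.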