Spring Boot 4 + MongoDB — Aggregations, Schema Evolution & Repository Patterns

Build a Spring Boot 4 application with MongoDB — document modeling in Kotlin, Spring Data repositories, MongoTemplate queries, aggregation pipelines, and schema evolution strategies.


MongoDB fits when your data is naturally hierarchical, schemas evolve frequently, or you need flexible querying over nested documents. If you’re joining six tables to render one screen, a document model might simplify things. MongoDB does support multi-document ACID transactions on replica sets and sharded clusters, but if most of your writes span multiple entities and need strict transactional guarantees, stick with PostgreSQL.

This post covers the full Spring Boot 4 + MongoDB setup — from document modeling to aggregation pipelines.

Project setup

Add the MongoDB starter and Testcontainers for integration tests:

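dependencies {
    implementation("org.springframework.boot:spring-boot-starter-data-mongodb")
    testImplementation("org.springframework.boot:spring-boot-testcontainers")
    // Spring Boot 4 manages Testcontainers 2.x, whose modules carry a "testcontainers-" prefix
    testImplementation("org.testcontainers:testcontainers-junit-jupiter") // @Testcontainers, @Container
    testImplementation("org.testcontainers:testcontainers-mongodb")
}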

application.yml:

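spring:
  # Spring Boot 4 moved the connection settings from spring.data.mongodb.* to spring.mongodb.*
  mongodb:
    uri: mongodb://localhost:27017/myapp
  data:
    mongodb:
      auto-index-creation: true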

For tests, Testcontainers spins up a real MongoDB instance:

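import org.springframework.boot.test.context.SpringBootTest
import org.springframework.test.context.DynamicPropertyRegistry
import org.springframework.test.context.DynamicPropertySource
import org.testcontainers.containers.MongoDBContainer
import org.testcontainers.junit.jupiter.Container
import org.testcontainers.junit.jupiter.Testcontainers

@SpringBootTest
@Testcontainers
class OrderRepositoryTest {

    companion object {
        // Shared by all tests in this class; started and stopped by the Testcontainers extension
        @Container
        val mongo = MongoDBContainer("mongo:7")

        @DynamicPropertySource
        @JvmStatic
        fun mongoProperties(registry: DynamicPropertyRegistry) {
            // Spring Boot 4 reads the connection string from spring.mongodb.uri
            registry.add("spring.mongodb.uri") { mongo.replicaSetUrl }
        }
    }
}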

Document modeling in Kotlin

import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.mapping.Document
import java.math.BigDecimal
import java.time.Instant

@Document(collection = "orders")
data class Order(
    @Id
    val id: String? = null,
    val customerId: String,
    val items: List<OrderItem>,
    val status: OrderStatus,
    val total: BigDecimal,
    val createdAt: Instant = Instant.now(),
    val updatedAt: Instant = Instant.now()
)

data class OrderItem(
    val productId: String,
    val name: String,
    val quantity: Int,
    val price: BigDecimal
)

enum class OrderStatus {
    PENDING, CONFIRMED, SHIPPED, DELIVERED, CANCELLED
}

Embedded objects (OrderItem) live inside the parent document — no separate collection, no joins. This works when the embedded data is bounded (an order has a finite number of items) and always accessed with the parent.

Avoid unbounded arrays. If a document could have thousands of embedded items, use a separate collection with a reference instead.
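
A minimal sketch of the reference approach, assuming a hypothetical OrderEvent type and order_events collection (neither is part of the model above): each event is its own document and points back to its order by id.

```kotlin
import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.index.Indexed
import org.springframework.data.mongodb.core.mapping.Document
import java.time.Instant

// Hypothetical type for illustration: status-change events can grow without bound,
// so they are stored in their own collection instead of an array inside the order.
@Document(collection = "order_events")
data class OrderEvent(
    @Id
    val id: String? = null,
    @Indexed
    val orderId: String, // reference to the parent order's id, indexed for lookups by order
    val type: String,
    val occurredAt: Instant = Instant.now()
)
```

The events for one order can then be loaded separately, for example through a derived repository method such as findByOrderId.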

Spring Data Repository

import org.springframework.data.mongodb.repository.MongoRepository
import org.springframework.data.mongodb.repository.Query
import java.time.Instant

interface OrderRepository : MongoRepository<Order, String> {

    fun findByCustomerId(customerId: String): List<Order>

    fun findByStatus(status: OrderStatus): List<Order>

    fun findByCustomerIdAndStatus(customerId: String, status: OrderStatus): List<Order>

    @Query("{ 'createdAt': { '\$gte': ?0, '\$lte': ?1 } }")
    fun findByDateRange(start: Instant, end: Instant): List<Order>

    @Query("{ 'items.productId': ?0 }")
    fun findByProductId(productId: String): List<Order>
}

Derived queries (findByCustomerId) work the same as JPA — Spring generates the MongoDB query from the method name. For anything beyond simple filters, use @Query with MongoDB query syntax.
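
To make the derivation concrete, here is a rough sketch of the filter that findByCustomerIdAndStatus boils down to, built by hand with the Query and Criteria classes. The helper name customerStatusQuery is made up for illustration, and the status is passed as its enum name to keep the example free of the entity classes.

```kotlin
import org.springframework.data.mongodb.core.query.Criteria
import org.springframework.data.mongodb.core.query.Query

// Hypothetical helper: hand-written equivalent of the derived method
// findByCustomerIdAndStatus(customerId, status).
fun customerStatusQuery(customerId: String, statusName: String): Query =
    Query(
        Criteria.where("customerId").`is`(customerId)
            .and("status").`is`(statusName)
    )

fun main() {
    val query = customerStatusQuery("c-42", "PENDING")
    // Prints the filter document that would be sent to MongoDB
    println(query.queryObject.toJson())
}
```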

MongoTemplate for dynamic queries

When queries need to be built at runtime — optional filters, conditional sorting, dynamic field selection — MongoTemplate gives you full control:

import org.springframework.data.mongodb.core.MongoTemplate
import org.springframework.data.mongodb.core.query.Criteria
import org.springframework.data.mongodb.core.query.Query
import org.springframework.stereotype.Service
import java.math.BigDecimal

@Service
class OrderQueryService(
    private val mongoTemplate: MongoTemplate
) {

    fun searchOrders(
        customerId: String?,
        status: OrderStatus?,
        minTotal: BigDecimal?
    ): List<Order> {
        // Collect one criterion per filter that was actually supplied
        val conditions = mutableListOf<Criteria>()

        customerId?.let { conditions.add(Criteria.where("customerId").`is`(it)) }
        status?.let { conditions.add(Criteria.where("status").`is`(it)) }
        // Compares numerically only if "total" is stored as a number (e.g. Decimal128), not as a string
        minTotal?.let { conditions.add(Criteria.where("total").gte(it)) }

        // An empty Criteria matches every document; otherwise AND the conditions together
        val criteria = Criteria()
        if (conditions.isNotEmpty()) {
            criteria.andOperator(conditions)
        }

        val query = Query(criteria)
        return mongoTemplate.find(query, Order::class.java)
    }
}

Use MongoRepository for standard CRUD and simple queries. Switch to MongoTemplate when you need dynamic query construction or aggregation pipelines.
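
The same approach extends to the other runtime choices mentioned earlier, conditional sorting and field selection. A sketch, assuming a hypothetical helper named recentOrdersQuery and the field names from the Order model:

```kotlin
import org.springframework.data.domain.Sort
import org.springframework.data.mongodb.core.query.Criteria
import org.springframework.data.mongodb.core.query.Query

// Hypothetical helper: optional status filter, configurable sort direction, capped result size,
// and a projection that only loads the fields a list view would need.
fun recentOrdersQuery(statusName: String?, newestFirst: Boolean, maxResults: Int): Query {
    val criteria = statusName
        ?.let { Criteria.where("status").`is`(it) }
        ?: Criteria() // no filter supplied: match everything

    val direction = if (newestFirst) Sort.Direction.DESC else Sort.Direction.ASC

    val query = Query(criteria)
        .with(Sort.by(direction, "createdAt"))
        .limit(maxResults)
    query.fields().include("customerId", "status", "total")
    return query
}
```

Because the projection leaves out non-null properties such as items, the result is better mapped to a smaller type or read as org.bson.Document than to the full Order class.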

Aggregation pipelines

Aggregation is MongoDB’s equivalent of SQL GROUP BY with much more flexibility. Each stage transforms the documents flowing through the pipeline.

import org.springframework.data.domain.Sort
import org.springframework.data.mongodb.core.MongoTemplate
import org.springframework.data.mongodb.core.aggregation.Aggregation
import org.springframework.data.mongodb.core.aggregation.AggregationResults
import org.springframework.data.mongodb.core.query.Criteria
import org.springframework.stereotype.Service
import java.math.BigDecimal

data class CustomerOrderSummary(
    val customerId: String,
    val totalOrders: Long,
    val totalSpent: BigDecimal,
    val averageOrderValue: BigDecimal
)

@Service
class OrderReportService(
    private val mongoTemplate: MongoTemplate
) {

    fun getCustomerSummaries(): List<CustomerOrderSummary> {
        val aggregation = Aggregation.newAggregation(
            // Drop cancelled orders before grouping
            Aggregation.match(Criteria.where("status").ne(OrderStatus.CANCELLED)),
            // One result per customer; $sum and $avg need "total" stored as a number
            Aggregation.group("customerId")
                .count().`as`("totalOrders")
                .sum("total").`as`("totalSpent")
                .avg("total").`as`("averageOrderValue"),
            // Expose the group key (_id) under the name the result class expects
            Aggregation.project()
                .and("_id").`as`("customerId")
                .andInclude("totalOrders", "totalSpent", "averageOrderValue"),
            Aggregation.sort(Sort.Direction.DESC, "totalSpent")
        )

        val results: AggregationResults<CustomerOrderSummary> =
            mongoTemplate.aggregate(aggregation, "orders", CustomerOrderSummary::class.java)

        return results.mappedResults
    }
}

Pipeline breakdown:

  1. match — filter out cancelled orders (do this early to reduce documents processed)
  2. group — group by customerId, calculate count, sum, and average
  3. project — reshape the output, rename _id to customerId
  4. sort — order by total spent
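
One assumption behind the sum and average in step 2 is that total is stored as a number. Depending on the Spring Data MongoDB version and its conversion settings, a BigDecimal may be written as a string, in which case $sum and $avg ignore it and range filters compare text. A hedged sketch of pinning the storage type explicitly with @Field; the trimmed-down PricedOrder class is a stand-in for the real Order:

```kotlin
import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.mapping.Document
import org.springframework.data.mongodb.core.mapping.Field
import org.springframework.data.mongodb.core.mapping.FieldType
import java.math.BigDecimal

// Stand-in for Order, reduced to the fields that matter here.
@Document(collection = "orders")
data class PricedOrder(
    @Id
    val id: String? = null,
    val customerId: String,
    // Store as BSON Decimal128 so $sum, $avg and $gte operate on numbers
    @Field(targetType = FieldType.DECIMAL128)
    val total: BigDecimal
)
```

Documents written before such a change keep their old representation until they are rewritten.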

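For joins across collections, use $lookup. The match is type-sensitive, so the customerId stored on an order must have the same BSON type as _id in the customers collection (a plain string will not match an ObjectId):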

val aggregation = Aggregation.newAggregation(
    Aggregation.lookup("customers", "customerId", "_id", "customer"),
    Aggregation.unwind("customer"),
    Aggregation.project()
        .andInclude("total", "status", "createdAt")
        .and("customer.name").`as`("customerName")
)
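
When a $lookup returns nothing, it helps to look at the pipeline Spring Data actually generates. A small sketch that renders the stages above to BSON documents without touching a database. The function name lookupPipeline is illustrative, and rendering with Aggregation.DEFAULT_CONTEXT skips the field-name mapping that a typed aggregation would apply.

```kotlin
import org.bson.Document
import org.springframework.data.mongodb.core.aggregation.Aggregation

// Illustrative helper: builds the $lookup pipeline and renders it to raw stage documents.
fun lookupPipeline(): List<Document> {
    val aggregation = Aggregation.newAggregation(
        Aggregation.lookup("customers", "customerId", "_id", "customer"),
        Aggregation.unwind("customer"),
        Aggregation.project()
            .andInclude("total", "status", "createdAt")
            .and("customer.name").`as`("customerName")
    )
    return aggregation.toPipeline(Aggregation.DEFAULT_CONTEXT)
}

fun main() {
    // One JSON document per stage: $lookup, $unwind, $project
    lookupPipeline().forEach { println(it.toJson()) }
}
```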

Schema evolution

MongoDB doesn’t enforce a schema, but your application code does. When you add a field to your Kotlin class, existing documents won’t have it.

Adding a field with a default

@Document(collection = "orders")
data class Order(
    @Id
    val id: String? = null,
    val customerId: String,
    val items: List<OrderItem>,
    val status: OrderStatus,
    val total: BigDecimal,
    val currency: String = "USD", // new field — old documents get default
    val createdAt: Instant = Instant.now(),
    val updatedAt: Instant = Instant.now()
)

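Kotlin defaults handle this: when Spring Data reads an old document that has no currency field, it falls back to the constructor default, so the property is "USD". The stored document itself is not changed until it is saved again.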

Backfilling existing documents

For fields that need real values, run a migration:

mongoTemplate.updateMulti(
    Query(Criteria.where("currency").exists(false)),
    Update().set("currency", "USD"),
    Order::class.java
)
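
Wrapped in a function, the same update becomes a repeatable migration step. A sketch, assuming a hypothetical backfillCurrency helper that you would call from a startup runner or a migration tool of your choice:

```kotlin
import org.springframework.data.mongodb.core.MongoOperations
import org.springframework.data.mongodb.core.query.Criteria
import org.springframework.data.mongodb.core.query.Query
import org.springframework.data.mongodb.core.query.Update

// Matches only documents that do not have the field yet, so re-running changes nothing.
fun missingCurrencyQuery(): Query = Query(Criteria.where("currency").exists(false))

// Hypothetical migration step: returns how many documents were changed.
fun backfillCurrency(mongo: MongoOperations, defaultCurrency: String = "USD"): Long {
    val result = mongo.updateMulti(
        missingCurrencyQuery(),
        Update().set("currency", defaultCurrency),
        "orders" // collection name, so the step does not depend on the entity class
    )
    return result.modifiedCount
}
```

Filtering on the missing field is what makes the step safe to run more than once.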

Removing a field

Remove it from the Kotlin class. Old documents still have it in the database, but your application ignores it. Optionally clean up:

mongoTemplate.updateMulti(
    Query(Criteria.where("oldField").exists(true)), // only documents that still carry the field
    Update().unset("oldField"),
    "orders"
)

Common mistakes

  • Treating MongoDB like a relational database: normalizing everything into separate collections and joining them defeats the purpose. Embed related data when it’s accessed together.
  • Unbounded arrays: an array that grows without limit slows down writes and will eventually hit the 16 MB document size limit. Use a separate collection with references instead.
  • Missing indexes: without an index on the queried fields, MongoDB scans the entire collection. See the MongoDB Indexes post for a deep dive on index types and explain() analysis.
  • Not using Testcontainers: an embedded fake MongoDB doesn’t behave identically to the real thing. Testcontainers gives you a real instance with minimal setup.
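
For the missing-index case, one option is to declare indexes on the document class and let the auto-index-creation setting from the project setup create them at startup. A minimal sketch: the IndexedOrder class is a trimmed stand-in for Order, and the index name and field choices are assumptions based on the repository queries shown earlier.

```kotlin
import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.index.CompoundIndex
import org.springframework.data.mongodb.core.index.Indexed
import org.springframework.data.mongodb.core.mapping.Document
import java.time.Instant

// Compound index for "orders of one customer, newest first"
@Document(collection = "orders")
@CompoundIndex(name = "customer_created_idx", def = "{'customerId': 1, 'createdAt': -1}")
data class IndexedOrder(
    @Id
    val id: String? = null,
    val customerId: String,
    @Indexed
    val status: String, // single-field index backing findByStatus
    val createdAt: Instant = Instant.now()
)
```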