
XML Parsing in Kotlin — XmlPullParser and Retrofit

Parse XML in Kotlin using Android's built-in XmlPullParser and Retrofit with SimpleXML converter. Manual vs automatic — when to use what.


XML APIs are rare in 2024 — most modern APIs return JSON. But you’ll still run into XML when dealing with RSS feeds, SOAP services, payment gateways, government APIs, or legacy systems. This post covers two approaches: manual parsing with XmlPullParser and automatic parsing with Retrofit.

The XML

We’ll parse a simplified RSS-style feed — one of the XML formats you’re most likely to encounter in the wild:

<?xml version="1.0" encoding="UTF-8"?>
<feed>
  <title>PCSalt Tech Blog</title>
  <link>https://pcsalt.com</link>
  <updated>2024-08-01T10:00:00Z</updated>
  <entries>
    <entry>
      <id>post_001</id>
      <title>JSON Parsing in Java — Manual Parsing with org.json</title>
      <author>
        <name>Navkrishna</name>
        <email>[email protected]</email>
      </author>
      <published>2024-07-01T09:00:00Z</published>
      <category>java</category>
      <tags>
        <tag>java</tag>
        <tag>json</tag>
        <tag>org-json</tag>
      </tags>
      <summary>Parse a complex, real-world JSON response manually using org.json in Java 21.</summary>
      <link>https://pcsalt.com/java/json-parsing-org-json-java/</link>
    </entry>
    <entry>
      <id>post_002</id>
      <title>CQRS with Spring Boot, Kafka &amp; MongoDB — Part 1</title>
      <author>
        <name>Navkrishna</name>
        <email>[email protected]</email>
      </author>
      <published>2024-03-09T12:00:00Z</published>
      <category>architecture</category>
      <tags>
        <tag>kotlin</tag>
        <tag>spring-boot</tag>
        <tag>kafka</tag>
      </tags>
      <summary>Understanding CQRS, when it makes sense, and how Spring Boot, Kafka, and MongoDB fit together.</summary>
      <link>https://pcsalt.com/architecture/cqrs-spring-boot-kafka-part-1/</link>
    </entry>
  </entries>
</feed>

This XML has nested elements, repeated tags, an attribute-free structure, and &amp; entity encoding — enough to exercise both parsing approaches.

Approach 1: XmlPullParser (Manual)

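XmlPullParser is built into the Android SDK — no dependency needed there. On a plain JVM the org.xmlpull API is not part of the JDK, so an implementation such as kxml2 or xpp3 has to be on the classpath. It’s a streaming parser that reads XML token by token: fast and memory-efficient, but verbose.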
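To make "token by token" concrete, here is a minimal sketch that records each event the parser reports. The helper name describeEvents is made up for illustration, and the sketch assumes an XmlPull implementation is available (always true on Android; on a plain JVM it has to be added to the classpath).

```kotlin
import org.xmlpull.v1.XmlPullParser
import org.xmlpull.v1.XmlPullParserFactory
import java.io.StringReader

// Hypothetical helper: returns one line per parser event so the token stream is visible.
fun describeEvents(xml: String): List<String> {
  val parser = XmlPullParserFactory.newInstance().newPullParser()
  parser.setInput(StringReader(xml))

  val events = mutableListOf<String>()
  var eventType = parser.eventType
  while (eventType != XmlPullParser.END_DOCUMENT) {
    when (eventType) {
      XmlPullParser.START_DOCUMENT -> events.add("START_DOCUMENT")
      XmlPullParser.START_TAG -> events.add("START_TAG ${parser.name}")
      // Whitespace between tags also arrives as TEXT; ignore it here.
      XmlPullParser.TEXT -> if (!parser.isWhitespace) events.add("TEXT ${parser.text}")
      XmlPullParser.END_TAG -> events.add("END_TAG ${parser.name}")
    }
    eventType = parser.next()
  }
  return events
}
```

For the input <title>A &amp;amp; B</title> this should yield START_DOCUMENT, START_TAG title, TEXT A & B, END_TAG title: next() expands the entity and reports the element content as a single TEXT event.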

Model Classes

data class Author(
  val name: String,
  val email: String
)

data class Entry(
  val id: String,
  val title: String,
  val author: Author,
  val published: String,
  val category: String,
  val tags: List<String>,
  val summary: String,
  val link: String
)

data class Feed(
  val title: String,
  val link: String,
  val updated: String,
  val entries: List<Entry>
)

Parser

import org.xmlpull.v1.XmlPullParser
import org.xmlpull.v1.XmlPullParserFactory
import java.io.StringReader

object FeedXmlParser {

  fun parse(xml: String): Feed {
    val parser = XmlPullParserFactory.newInstance().newPullParser()
    parser.setInput(StringReader(xml))

    var feedTitle = ""
    var feedLink = ""
    var feedUpdated = ""
    val entries = mutableListOf<Entry>()

    var eventType = parser.eventType
    while (eventType != XmlPullParser.END_DOCUMENT) {
      if (eventType == XmlPullParser.START_TAG) {
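        when (parser.name) {
          // Feed-level <title> and <link> appear before the first <entry>.
          // Entry-level ones never reach this loop because parseEntry() consumes them.
          "title" -> {
            if (entries.isEmpty()) {
              feedTitle = parser.nextText()
            }
          }
          "link" -> {
            if (entries.isEmpty()) {
              feedLink = parser.nextText()
            }
          }
          "updated" -> feedUpdated = parser.nextText()
          // parseEntry() returns with the parser positioned on </entry>.
          "entry" -> entries.add(parseEntry(parser))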
        }
      }
      eventType = parser.next()
    }

    return Feed(feedTitle, feedLink, feedUpdated, entries)
  }

  private fun parseEntry(parser: XmlPullParser): Entry {
    var id = ""
    var title = ""
    var author = Author("", "")
    var published = ""
    var category = ""
    val tags = mutableListOf<String>()
    var summary = ""
    var link = ""

    var eventType = parser.next()
    while (eventType != XmlPullParser.END_DOCUMENT) {
      if (eventType == XmlPullParser.START_TAG) {
        when (parser.name) {
          "id" -> id = parser.nextText()
          "title" -> title = parser.nextText()
          "author" -> author = parseAuthor(parser)
          "published" -> published = parser.nextText()
          "category" -> category = parser.nextText()
          "tag" -> tags.add(parser.nextText())
          "summary" -> summary = parser.nextText()
          "link" -> link = parser.nextText()
        }
      } else if (eventType == XmlPullParser.END_TAG && parser.name == "entry") {
        break
      }
      eventType = parser.next()
    }

    return Entry(id, title, author, published, category, tags, summary, link)
  }

  private fun parseAuthor(parser: XmlPullParser): Author {
    var name = ""
    var email = ""

    var eventType = parser.next()
    while (eventType != XmlPullParser.END_DOCUMENT) {
      if (eventType == XmlPullParser.START_TAG) {
        when (parser.name) {
          "name" -> name = parser.nextText()
          "email" -> email = parser.nextText()
        }
      } else if (eventType == XmlPullParser.END_TAG && parser.name == "author") {
        break
      }
      eventType = parser.next()
    }

    return Author(name, email)
  }
}

That’s ~100 lines of parsing code. Every element needs an explicit when branch, every nested element needs its own parse function, and you have to manually track START_TAG / END_TAG events. Sound familiar? It’s the same pain as org.json for JSON.
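The START_TAG / END_TAG bookkeeping is often wrapped in a small depth-counting helper, so that elements you don't care about can be skipped wholesale, including their children. The following is a sketch of that idea, not part of the parser above; skip and collectTags are illustrative names, and the same classpath assumption as before applies.

```kotlin
import org.xmlpull.v1.XmlPullParser
import org.xmlpull.v1.XmlPullParserFactory
import java.io.StringReader

// Skips the element the parser is currently positioned on, including all of its children.
fun skip(parser: XmlPullParser) {
  check(parser.eventType == XmlPullParser.START_TAG) { "skip() must be called on a START_TAG" }
  var depth = 1
  while (depth != 0) {
    when (parser.next()) {
      XmlPullParser.START_TAG -> depth++
      XmlPullParser.END_TAG -> depth--
    }
  }
}

// Hypothetical demo: collects direct <tag> children of the root and skips every other child.
fun collectTags(xml: String): List<String> {
  val parser = XmlPullParserFactory.newInstance().newPullParser()
  parser.setInput(StringReader(xml))
  parser.nextTag() // move to the root START_TAG

  val tags = mutableListOf<String>()
  while (parser.next() != XmlPullParser.END_DOCUMENT) {
    if (parser.eventType != XmlPullParser.START_TAG) continue
    if (parser.name == "tag") tags.add(parser.nextText()) else skip(parser)
  }
  return tags
}
```

With this in place, a <tag> nested inside an unrelated element is ignored rather than picked up by accident, which is the kind of mistake a flat when over parser.name makes easy.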

Usage

fun main() {
  val xml = """<?xml version="1.0" encoding="UTF-8"?>
    <feed>
      <!-- ... full XML from above ... -->
    </feed>
  """.trimIndent()

  val feed = FeedXmlParser.parse(xml)

  println("Feed: ${feed.title}")
  println("Updated: ${feed.updated}")
  println()

  for (entry in feed.entries) {
    println("--- ${entry.title} ---")
    println("  Author: ${entry.author.name} (${entry.author.email})")
    println("  Published: ${entry.published}")
    println("  Category: ${entry.category}")
    println("  Tags: ${entry.tags.joinToString(", ")}")
    println("  Summary: ${entry.summary}")
    println("  Link: ${entry.link}")
    println()
  }
}

Output

Feed: PCSalt Tech Blog
Updated: 2024-08-01T10:00:00Z

--- JSON Parsing in Java — Manual Parsing with org.json ---
  Author: Navkrishna ([email protected])
  Published: 2024-07-01T09:00:00Z
  Category: java
  Tags: java, json, org-json
  Summary: Parse a complex, real-world JSON response manually using org.json in Java 21.
  Link: https://pcsalt.com/java/json-parsing-org-json-java/

--- CQRS with Spring Boot, Kafka & MongoDB — Part 1 ---
  Author: Navkrishna ([email protected])
  Published: 2024-03-09T12:00:00Z
  Category: architecture
  Tags: kotlin, spring-boot, kafka
  Summary: Understanding CQRS, when it makes sense, and how Spring Boot, Kafka, and MongoDB fit together.
  Link: https://pcsalt.com/architecture/cqrs-spring-boot-kafka-part-1/

Approach 2: Retrofit + SimpleXML (Automatic)

If you’re consuming an XML API over HTTP, Retrofit with the SimpleXML converter handles everything — parsing, networking, and mapping to objects.

Dependencies

implementation 'com.squareup.retrofit2:retrofit:2.11.0'
implementation 'com.squareup.retrofit2:converter-simplexml:2.11.0'

Note: Retrofit has deprecated the SimpleXML converter, but it is still widely used. For new JVM projects, consider the JAXB converter (converter-jaxb). JAXB does not work on Android, so SimpleXML remains the practical choice there.

Model Classes with SimpleXML Annotations

import org.simpleframework.xml.Element
import org.simpleframework.xml.ElementList
import org.simpleframework.xml.Root

@Root(name = "author", strict = false)
data class Author(
  @field:Element(name = "name")
  var name: String = "",

  @field:Element(name = "email")
  var email: String = ""
)

@Root(name = "entry", strict = false)
data class Entry(
  @field:Element(name = "id")
  var id: String = "",

  @field:Element(name = "title")
  var title: String = "",

  @field:Element(name = "author")
  var author: Author = Author(),

  @field:Element(name = "published")
  var published: String = "",

  @field:Element(name = "category")
  var category: String = "",

  @field:ElementList(name = "tags", entry = "tag")
  var tags: List<String> = emptyList(),

  @field:Element(name = "summary")
  var summary: String = "",

  @field:Element(name = "link")
  var link: String = ""
)

@Root(name = "feed", strict = false)
data class Feed(
  @field:Element(name = "title")
  var title: String = "",

  @field:Element(name = "link")
  var link: String = "",

  @field:Element(name = "updated")
  var updated: String = "",

  @field:ElementList(name = "entries", entry = "entry")
  var entries: List<Entry> = emptyList()
)

Retrofit Interface

import retrofit2.Call
import retrofit2.http.GET

interface FeedApi {

  @GET("feed.xml")
  fun getFeed(): Call<Feed>
}

Setup and Usage

import retrofit2.Retrofit
import retrofit2.converter.simplexml.SimpleXmlConverterFactory

fun main() {
  val retrofit = Retrofit.Builder()
    .baseUrl("https://pcsalt.com/")
    .addConverterFactory(SimpleXmlConverterFactory.create())
    .build()

  val api = retrofit.create(FeedApi::class.java)
  val response = api.getFeed().execute()

  if (response.isSuccessful) {
    val feed = response.body()
    println("Feed: ${feed?.title}")
    feed?.entries?.forEach { entry ->
      println("  - ${entry.title} by ${entry.author.name}")
    }
  }
}

That’s it. No manual parsing. Retrofit handles the HTTP call, SimpleXML handles the XML → object mapping. The same pattern you’d use with Gson/Moshi for JSON APIs.
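The XML → object mapping can also be tried without any HTTP call, which is handy in unit tests. Here is a minimal sketch using SimpleXML's Persister directly. It assumes the simple-xml library is on the classpath (converter-simplexml should pull it in transitively); AuthorDto and readAuthor are illustrative names, not part of the code above.

```kotlin
import org.simpleframework.xml.Element
import org.simpleframework.xml.Root
import org.simpleframework.xml.core.Persister

// Hypothetical model mirroring the <author> element from the sample feed.
@Root(name = "author", strict = false)
data class AuthorDto(
  @field:Element(name = "name") var name: String = "",
  @field:Element(name = "email") var email: String = ""
)

// Maps an XML string straight to an object; no Retrofit or networking involved.
fun readAuthor(xml: String): AuthorDto = Persister().read(AuthorDto::class.java, xml)
```

Calling readAuthor on an <author> fragment gives back a populated AuthorDto, using the same annotations the Retrofit converter relies on.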

Gotchas with SimpleXML + Kotlin

  1. Default values required — SimpleXML creates objects via reflection and needs a no-arg constructor. Giving every constructor parameter a default value (= "", = emptyList()) makes the Kotlin compiler generate one.
  2. var not val — SimpleXML sets fields after construction, so they must be mutable.
  3. @field: prefix — In a primary constructor, Kotlin applies an annotation to the constructor parameter by default. Use @field:Element to target the backing field, which is what SimpleXML reads.
  4. strict = false — Without this, SimpleXML throws on any element that has no matching annotated field.
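The effect of strict = false can be seen by parsing the same document into two otherwise identical models. This is a sketch under the assumption that simple-xml is on the classpath; StrictItem, LenientItem and tryParse are made-up names for illustration.

```kotlin
import org.simpleframework.xml.Element
import org.simpleframework.xml.Root
import org.simpleframework.xml.core.Persister

// Hypothetical models: identical except for the strict flag (strict defaults to true).
@Root(name = "item")
data class StrictItem(@field:Element(name = "id") var id: String = "")

@Root(name = "item", strict = false)
data class LenientItem(@field:Element(name = "id") var id: String = "")

// Returns the parsed object, or null if SimpleXML rejects the document.
fun <T : Any> tryParse(type: Class<T>, xml: String): T? =
  try {
    Persister().read(type, xml)
  } catch (e: Exception) {
    null
  }
```

Given <item><id>1</id><extra>x</extra></item>, tryParse(StrictItem::class.java, ...) should return null because <extra> has no matching field, while the LenientItem variant parses and simply ignores the unknown element.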

Comparison

                      | XmlPullParser                     | Retrofit + SimpleXML
Lines of parsing code | ~100                              | 0 (annotations only)
Dependencies          | None (built into Android)         | retrofit + converter-simplexml
Use case              | Local XML files, offline parsing  | HTTP XML APIs
Memory                | Streaming, low memory             | Loads full response
Flexibility           | Full control                      | Convention-based
Learning curve        | Low (but tedious)                 | Medium (annotations)
Kotlin-friendly       | Yes                               | Requires var + defaults

When to Use What

  • XmlPullParser — parsing local XML files, offline data, or when you can’t add dependencies (SDK-only)
  • Retrofit + SimpleXML — consuming XML APIs over HTTP, especially when you’re already using Retrofit for JSON endpoints
  • Neither — if you control the API, switch to JSON. It’s simpler, has better tooling, and every modern library supports it natively
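For the local-file case, XmlPullParser can read from an InputStream instead of a String, so the file is never loaded into memory as a whole. This is a minimal sketch; readTitles is an illustrative helper, and an XmlPull implementation is assumed to be available.

```kotlin
import org.xmlpull.v1.XmlPullParser
import org.xmlpull.v1.XmlPullParserFactory
import java.io.File

// Hypothetical helper: streams a local file and returns the text of every <title> element.
fun readTitles(file: File): List<String> =
  file.inputStream().use { input ->
    val parser = XmlPullParserFactory.newInstance().newPullParser()
    // Passing null asks the parser to detect the encoding itself (e.g. from the XML declaration).
    parser.setInput(input, null)

    val titles = mutableListOf<String>()
    while (parser.next() != XmlPullParser.END_DOCUMENT) {
      if (parser.eventType == XmlPullParser.START_TAG && parser.name == "title") {
        titles.add(parser.nextText())
      }
    }
    titles
  }
```

The use { } block closes the stream even if parsing throws, which matters for long-running apps that parse many files.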

Using in Android

Both approaches work on Android:

  • XmlPullParser is part of the Android SDK — import org.xmlpull.v1.XmlPullParser directly
  • Retrofit + SimpleXML works the same as Retrofit + Gson/Moshi — just swap the converter factory
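One Android-specific caveat: the blocking execute() call used earlier must not run on the main thread. Retrofit's enqueue() is the asynchronous counterpart. Below is a sketch of a small wrapper around it; enqueueResult is a made-up extension name, not a Retrofit API.

```kotlin
import retrofit2.Call
import retrofit2.Callback
import retrofit2.Response

// Hypothetical extension: runs the call asynchronously and reports either the body or an error.
fun <T : Any> Call<T>.enqueueResult(onResult: (Result<T>) -> Unit) {
  enqueue(object : Callback<T> {
    override fun onResponse(call: Call<T>, response: Response<T>) {
      val body = response.body()
      if (response.isSuccessful && body != null) {
        onResult(Result.success(body))
      } else {
        // Non-2xx status or empty body: surface it as a failure.
        onResult(Result.failure(IllegalStateException("HTTP ${response.code()}")))
      }
    }

    override fun onFailure(call: Call<T>, t: Throwable) {
      // Network errors and XML conversion errors both end up here.
      onResult(Result.failure(t))
    }
  })
}
```

With the FeedApi interface from above, usage would look like api.getFeed().enqueueResult { result -> result.onSuccess { feed -> println(feed.title) } }.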

For Android-specific XML like layout files or SharedPreferences XML, you don’t parse manually — the framework handles that. This post is about parsing XML data (APIs, feeds, config files).

What About JSON?

If you’re choosing between XML and JSON for a new API, choose JSON. Check out the JSON Parsing in Kotlin series for a complete guide covering org.json, Gson, Moshi, Kotlin Serialization, and Jackson.