
JSON vs YAML vs XML: when to use which

These three formats overlap enough that people often pick by habit or political loyalty rather than fit. They don't actually solve the same problems. Here's a practical comparison: where each one is good, where it's a footgun, and a quick decision guide for the cases you'll actually face.

If you already know which way you're going, the JSON↔YAML, JSON↔XML, and JSON↔CSV converters on this site handle the common cases in your browser.

The same data, three ways

To compare them fairly, here's a tiny user record in each.

JSON

{
  "id": 42,
  "name": "Alice",
  "roles": ["admin", "editor"],
  "active": true
}

YAML

id: 42
name: Alice
roles:
  - admin
  - editor
active: true

XML

<user id="42" active="true">
  <name>Alice</name>
  <roles>
    <role>admin</role>
    <role>editor</role>
  </roles>
</user>

You can see the three personalities right away. JSON is compact and code-shaped. YAML is human-shaped (and quietly indentation-sensitive). XML is document-shaped, with the attribute/element distinction front and center.
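
One thing the snippets hide is typing. Here's a minimal sketch using only Python's standard library (the variable names are illustrative, not from any particular tool): JSON hands you numbers and booleans directly, while XML attribute and text values arrive as strings that the caller has to convert.

```python
import json
import xml.etree.ElementTree as ET

json_text = '{"id": 42, "name": "Alice", "roles": ["admin", "editor"], "active": true}'
xml_text = """
<user id="42" active="true">
  <name>Alice</name>
  <roles>
    <role>admin</role>
    <role>editor</role>
  </roles>
</user>
""".strip()

# JSON: types come straight out of the parser.
from_json = json.loads(json_text)

# XML: every attribute and text node is a string, so conversion is manual.
root = ET.fromstring(xml_text)
from_xml = {
    "id": int(root.get("id")),
    "name": root.findtext("name"),
    "roles": [role.text for role in root.iter("role")],
    "active": root.get("active") == "true",
}

print(type(from_json["id"]).__name__, type(root.get("id")).__name__)  # int str
print(from_json == from_xml)  # True
```

The two records only compare equal because the XML side was converted by hand; a schema (XSD, for instance) is the usual way to carry that type information in XML.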

Where JSON wins

APIs. Almost every modern HTTP API speaks JSON. It maps directly to JavaScript objects in browsers and to dicts, maps, or structs in most other languages. Parsing is fast, and most runtimes ship a parser in their standard library.
Stable interchange. JSON's grammar is small enough that parsers agree on almost all valid input; the main edge cases are duplicate keys and very large numbers. There are few ways to write each value, so output from different tools looks much alike.
Logs. One JSON object per line (JSONL/NDJSON) is the de-facto standard for structured logging. Each line is independently parseable, which makes streaming and tailing trivial.
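
To make the logging point concrete, here's a small sketch of writing and reading JSON Lines with Python's standard library. The event fields are made up for illustration, and an in-memory buffer stands in for a real log file.

```python
import io
import json

events = [
    {"level": "info", "msg": "server started", "port": 8080},
    {"level": "error", "msg": "db timeout", "retry": 3},
]

# Write: one compact JSON object per line.
# json.dumps escapes newlines inside strings, so a record never spans lines.
log = io.StringIO()
for event in events:
    log.write(json.dumps(event) + "\n")

# Read: each line parses on its own, so a reader can stream or tail the file.
log.seek(0)
errors = []
for line in log:
    record = json.loads(line)
    if record["level"] == "error":
        errors.append(record)

print(errors)
```

Because no line depends on any other, a truncated or corrupt line costs you one record, not the whole file.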

Where JSON loses

Comments. No comment syntax. For config files this hurts.
Trailing commas. Forbidden, even though every programmer has typed one. The resulting parse errors are a recurring annoyance in hand-edited files.
Big integers. JSON numbers in JavaScript are 64-bit floats. IDs past 2⁵³−1 silently lose digits. (Other languages handle this fine — but if your JSON ever passes through a JS client, send big IDs as strings.)
No schema in-band. You need JSON Schema or similar to describe shape. That's a feature for some, an extra step for others.
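
The big-integer issue is easy to reproduce. This sketch uses Python, which keeps integers exact, and emulates a JavaScript-style client by asking the parser to read every integer as a 64-bit float; that emulation is an assumption for illustration, not how any particular client is implemented.

```python
import json

payload = '{"id": 9007199254740993}'  # 2**53 + 1

# Python uses arbitrary-precision integers, so the ID survives intact.
exact = json.loads(payload)["id"]

# Emulate a client that stores all numbers as 64-bit floats.
as_double = json.loads(payload, parse_int=float)["id"]

# The workaround from the list above: send big IDs as strings.
safe = json.loads('{"id": "9007199254740993"}')["id"]

print(exact)           # 9007199254740993
print(int(as_double))  # 9007199254740992 (last digit changed)
print(safe)            # 9007199254740993
```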

Where YAML wins

Configuration files. Comments, multi-line strings, anchors for reusable blocks — features that exist precisely because humans edit configs by hand.
Kubernetes, CI/CD, Ansible. The whole cloud-native stack standardized on YAML for human-authored manifests. There's no fighting this; if that's your world, use YAML.
Readability. For dense nested data, the absence of brackets and quotes makes large files easier to skim.

Where YAML loses (often spectacularly)

The Norway problem. Under YAML 1.1 rules, which many parsers still follow, unquoted NO, OFF, and FALSE become booleans. The country code for Norway is NO, so your CSV-to-YAML pipeline now has country: false instead of country: NO. Quoting the value avoids it.
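Indentation sensitivity. A single mis-indented line can silently move a key to a different nesting level without raising an error. Re-nesting a block means re-indenting every line in it, so diffs that should be one line often touch many.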
Multiple "standards." YAML 1.1 and 1.2 differ in non-obvious ways. 010 is octal (8) in YAML 1.1 and decimal 10 in 1.2. y/yes are booleans in 1.1, strings in 1.2. Parsers differ in which version they implement.
Performance. YAML parsers are typically slower than JSON parsers, and the grammar they have to implement is considerably larger.

Translation: YAML is great for hand-written config files of modest size. It is a poor wire format.
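
To see why the Norway problem and the 1.1/1.2 split bite, here's a simplified sketch of how an unquoted scalar gets resolved. It is not a YAML parser: resolve_plain_scalar is a hypothetical helper that only models the boolean spellings from the YAML 1.1 bool type and the YAML 1.2 core schema, and real libraries differ in which spellings they accept.

```python
import re

# Boolean spellings listed by the YAML 1.1 bool type.
YAML_1_1_BOOL = re.compile(
    r"^(y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
# The YAML 1.2 core schema only accepts spellings of true and false.
YAML_1_2_BOOL = re.compile(r"^(true|True|TRUE|false|False|FALSE)$")

TRUTHY = {"y", "yes", "true", "on"}


def resolve_plain_scalar(text: str, version: str = "1.1"):
    """Hypothetical helper: resolve an unquoted scalar to a bool or leave it a str."""
    pattern = YAML_1_1_BOOL if version == "1.1" else YAML_1_2_BOOL
    if pattern.match(text):
        return text.lower() in TRUTHY
    return text


print(resolve_plain_scalar("NO", "1.1"))   # False
print(resolve_plain_scalar("NO", "1.2"))   # NO
print(resolve_plain_scalar("yes", "1.1"))  # True
print(resolve_plain_scalar("yes", "1.2"))  # yes
```

Quoting the value in the file (country: "NO") skips this resolution step entirely, which is why quoting anything that looks like a keyword is the usual defensive habit.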

Where XML wins

Mixed content. Text interleaved with markup — articles, prose, anything that's actually document-shaped. JSON can't represent <p>Hello <b>world</b>.</p> without making it ugly.
Schemas with deep history. XSD, RELAX NG, XPath, XSLT — decades of tooling for big enterprise document workflows.
Namespaces. When two vocabularies need to share a document (SVG inside HTML, SOAP envelopes), XML's namespace mechanism actually solves the problem.
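
Mixed content is easier to appreciate in code. A small sketch with Python's xml.etree.ElementTree shows how the paragraph from the first point is stored: the text before the b element, the text inside it, and the text after it all survive and can be reassembled in order.

```python
import xml.etree.ElementTree as ET

p = ET.fromstring("<p>Hello <b>world</b>.</p>")
b = p.find("b")

print(repr(p.text))           # 'Hello '  (text before the first child)
print(repr(b.text))           # 'world'   (text inside <b>)
print(repr(b.tail))           # '.'       (text after </b>, still inside <p>)
print("".join(p.itertext()))  # Hello world.
```

A JSON equivalent needs some invented convention, such as an array that alternates strings and objects, which is the ugliness the point above refers to.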

Where XML loses

Verbosity. Tags wrapping tags wrapping tags. Over a network, JSON usually transfers fewer bytes.
Attribute vs element ambiguity. Should an ID be an attribute or a child element? Different schools, both common, neither obviously right. Tools that map XML onto objects or JSON often flatten the two, so the distinction is lost in conversion.
Mental overhead. Validators, namespaces, transforms — the surface area is huge if you don't need it.
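
The attribute-versus-element question shows up as soon as a consumer has to read both styles. In this sketch, read_id is a hypothetical helper and the two documents are invented for illustration; the point is only that the same fact needs two different lookups.

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring('<user id="42"><name>Alice</name></user>')
as_element = ET.fromstring("<user><id>42</id><name>Alice</name></user>")


def read_id(user):
    """Hypothetical helper: accept either convention for the ID."""
    value = user.get("id")           # attribute style
    if value is None:
        value = user.findtext("id")  # child-element style
    return int(value)


print(read_id(as_attribute))  # 42
print(read_id(as_element))    # 42
```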

A decision guide

Skip the religious war. Pick by the task:

You're designing an HTTP API.

Use JSON. Every client knows it. The format is unambiguous. The tooling is universal.

You're writing config files humans will edit.

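YAML or TOML. YAML wins for nested data, TOML for flat tables. Avoid JSON for config, since it has no comments and no trailing commas, unless your tooling forces it (npm's package.json, for example). Some tools work around this with a JSON-with-comments dialect, as VS Code does for its settings.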

You're publishing structured documents (articles, books, scientific papers).

XML. The mixed-content model exists for this. Look at JATS, DocBook, TEI before reinventing.

You're storing structured logs.

JSON Lines (JSONL/NDJSON). One JSON object per line. Streaming-friendly, splittable, and every log aggregator understands it.

You're shipping data between two services you control.

JSON unless you have a specific reason to pick something else. Faster parsing, less ambiguity, fewer parser bugs to argue about. If you need binary efficiency, consider Protocol Buffers, MessagePack, or CBOR; they typically produce smaller payloads and parse faster than any of the three formats on this page.

Converting between them

Sometimes you don't get to choose. You receive XML and want JSON; you have JSON and need to drop it into a Kubernetes YAML manifest. The JSON to YAML, YAML to JSON, JSON to XML, and XML to JSON converters on this site handle the common cases. For round-tripping document-style XML with mixed content, expect to lose some information — that's a real tradeoff, not a converter bug.
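
Here's a sketch of why mixed content is lossy. naive_to_dict is a hypothetical and deliberately simple converter written for this example; it says nothing about how the converters on this site are implemented. It handles record-style XML fine and drops the loose text in document-style XML.

```python
import json
import xml.etree.ElementTree as ET


def naive_to_dict(elem):
    """Hypothetical naive converter: leaf text, or a dict of child elements."""
    children = list(elem)
    if not children:
        return elem.text
    # Text sitting between child elements has no slot in this mapping.
    return {child.tag: naive_to_dict(child) for child in children}


record = ET.fromstring("<user><name>Alice</name><id>42</id></user>")
mixed = ET.fromstring("<p>Hello <b>world</b>.</p>")

print(json.dumps(naive_to_dict(record)))  # {"name": "Alice", "id": "42"}
print(json.dumps(naive_to_dict(mixed)))   # {"b": "world"}
```

In the second case "Hello " and "." are gone. A converter can preserve them with extra keys or ordered arrays, but then the JSON stops looking like the data you wanted.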

One sentence

JSON is the language of APIs and logs; YAML is the language of human-edited configs; XML is the language of structured documents — and you don't get to pretend one of them is the right answer for all three.

Got a use case where the obvious choice is wrong? Tell us — we collect counterexamples.