So we're ending up with JSON with schema and query support. In a couple of years...

mantrax5 · on May 6, 2014

It's easy to argue with strawmen like you do, but query and schema support was never the problem of XML.

The problem was that XML is a markup language, and JSON is an object notation. It's right in the name. And this results in different tradeoffs for both.

The XML schema was designed to create schemas for documents. This is why different types of schemas evolved specifically for services (like SOAP), but because they had to be slapped on top of a markup language base, they couldn't be good even though they tried. JSON doesn't have that initial complexity and the goal matches the use.

Just like it'd be silly to write a manual in JSON, it'll forever remain silly to serialize generic object structures in XML.

Now I could also argue XML is a mediocre markup language, and show alternatives, but as they say, that's a whole 'nother story.

bananas · on May 6, 2014

So how do you serialise and describe a ring buffer object in JSON? How do you provide a decimal type or a complex number? How do you validate this? How do you describe this object to a foreign system? How do you query the object? How do you transform the object from one type to another? How do you transform that object to a document?

You completely misunderstand XML. It's more than an adequate markup language and more than an adequate object format.

XML has few tradeoffs other than complexity. JSON has all tradeoffs but complexity.

mantrax5 · on May 6, 2014

I see, so XML has few tradeoffs other than complexity. So I'm sure, given your insistent questions, XML has native representations for:

- A ring buffer.

- A decimal type.

- A complex number.

No, it doesn't. It's all up to the contract. And there's nothing in XML that makes it more convenient to describe a complicated contract than JSON (or any other format).
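To make the "contract" point concrete, here is a minimal Python sketch of a decimal and a complex number travelling through JSON. The `$decimal` and `$complex` keys are a hypothetical convention invented for this example; they are not part of JSON or of any standard, and both sides would have to agree on them out of band.

```python
import json
from decimal import Decimal

# Hypothetical contract (an assumption for this sketch, not a standard):
#   Decimal -> {"$decimal": "<digits as string>"}
#   complex -> {"$complex": [real, imag]}

def encode(value):
    """Called by json.dumps for values it cannot serialize natively."""
    if isinstance(value, Decimal):
        return {"$decimal": str(value)}
    if isinstance(value, complex):
        return {"$complex": [value.real, value.imag]}
    raise TypeError(f"no contract for {type(value).__name__}")

def decode(obj):
    """Called by json.loads for every decoded JSON object."""
    if "$decimal" in obj:
        return Decimal(obj["$decimal"])
    if "$complex" in obj:
        real, imag = obj["$complex"]
        return complex(real, imag)
    return obj

payload = json.dumps({"price": Decimal("19.99"), "z": complex(1, -2)}, default=encode)
restored = json.loads(payload, object_hook=decode)
```

The decimal is carried as a string so that no binary float rounding sneaks in on either side.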

So XML made a tradeoff of complexity, and gained nothing.

Oh, and this is one issue you won't see happen with JSON:

http://blog.detectify.com/post/82370846588/how-we-got-read-a...

Yes, they managed to get full blown read access to Google's servers, including "/etc/passwd" and "/etc/hosts" by passing an XML file and using a standard XML feature.

"[N]aive XML parsers that blindly interpret the DTD of the user supplied XML documents. By doing so, you risk having your parser doing a bunch of nasty things. Some issues include: local file access, SSRF and remote file includes, Denial of Service and possible remote code execution. If you want to know how to patch these issues, check out the OWASP page on how to secure XML parsers in various languages and platforms."
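One blunt way to avoid the class of problem described in that quote is to refuse any document that carries a DOCTYPE at all. The sketch below does that with Python's standard `xml.parsers.expat` module. The function and exception names are hypothetical, and this is only one possible mitigation under the assumption that your documents never need a DTD; it is not a summary of the OWASP guidance.

```python
import xml.parsers.expat

class DoctypeForbidden(ValueError):
    """Raised when an incoming document declares a DOCTYPE."""

def parse_without_dtd(xml_text):
    """Return the element names in document order, rejecting any DOCTYPE."""
    names = []
    parser = xml.parsers.expat.ParserCreate()

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Abort as soon as a DOCTYPE declaration is seen, so the entity
        # declarations inside it are never acted upon.
        raise DoctypeForbidden("DOCTYPE declarations are not accepted")

    parser.StartDoctypeDeclHandler = reject_doctype
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names
```

A plain document such as `<fruit id="10" name="orange"/>` parses normally, while a document that declares an external entity in its DOCTYPE is rejected before the entity is ever referenced.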

You might want to reevaluate your point about complexity after reading this.

JSON has only two features:

1. Simple.

2. Readable.

The first feature makes it possible for your wristwatch to parse JSON with its pin-sized CPU. The second feature makes it possible for you to parse JSON with your pin-sized... Anyway, just kidding. I'm trying to say it's easy to debug.

As for how to describe circular structures and references, and meta-types, you can see what JSON serializers like Jackson do in Java. You'll find that JSON can stretch easily to accommodate such needs.
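As a rough illustration of the general idea (not Jackson's actual wire format), a cyclic structure such as a ring of nodes can be flattened into a JSON array in which each node refers to its successor by index instead of nesting it. The `Node` class and the `dump_ring` / `load_ring` helpers below are hypothetical names for this Python sketch, and it assumes every node has a non-null `next`.

```python
import json

class Node:
    def __init__(self, name):
        self.name = name
        self.next = None

def dump_ring(start):
    """Flatten a ring of nodes into a JSON array; 'next' is an index, not a nested object."""
    nodes, index = [], {}
    node = start
    while id(node) not in index:      # stop when we come back around
        index[id(node)] = len(nodes)
        nodes.append(node)
        node = node.next
    return json.dumps([{"name": n.name, "next": index[id(n.next)]} for n in nodes])

def load_ring(text):
    """Rebuild the ring: create all nodes first, then wire up the references."""
    records = json.loads(text)
    nodes = [Node(r["name"]) for r in records]
    for node, record in zip(nodes, records):
        node.next = nodes[record["next"]]
    return nodes[0]
```

The cycle lives in the index references, so the JSON text itself stays a plain, finite array that any parser can read.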

But again, the problem was never having a format with native representation of everything under the sun.

XML's problem was that its parsers were big, heavy, complicated, and poorly understood (as the XXE vulnerability shows). You would never need 90% of what an XML parser supports.

We needed the simplest, dumbest possible format that makes no assumptions about what it is you want to describe in it (except: values and collections), with the simplest, dumbest possible parser (no surprises, no complexity), so that we can then port it everywhere, and build upon it as a reliable base.

And while JSON ain't perfect, it's a hell of a lot closer to that ideal than XML is.

WorldWideWayne · on May 6, 2014

You're right. It's so much sillier to do this:

    <fruit id="10" name="orange"/>

Than this:

    {"fruit": {"id":10, "name":"orange"}}

because the first one is called a "markup language" and the second one is called an "object notation". I'm not buying it.

slig · on May 6, 2014

It can be done like this too:

    <fruit> <id>10</id> <name>orange</name> </fruit>

Which one is better? I find it hard to decide, and I believe most people do. And that's why we see it mixed, often in the same XML document.
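One practical consequence of that mixing is that a consumer may have to look in both places. A small Python sketch using the standard `xml.etree.ElementTree` module; the `field` helper is a hypothetical name, and it assumes a value is either an attribute or the text of a direct child element:

```python
import xml.etree.ElementTree as ET

def field(element, name):
    """Read a value stored either as an attribute or as a child element's text."""
    if name in element.attrib:
        return element.attrib[name]
    child = element.find(name)
    return child.text if child is not None else None

attr_style = ET.fromstring('<fruit id="10" name="orange"/>')
elem_style = ET.fromstring('<fruit> <id>10</id> <name>orange</name> </fruit>')
```

Both `field(attr_style, "id")` and `field(elem_style, "id")` come back as the string `"10"`; neither style gives you an integer without a schema or a contract saying so.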

bananas · on May 6, 2014

The one you use is better (correct).

spopejoy · on May 6, 2014

Utterly disagree. Attributes can actually have validated contents, such as enumerated lists, etc, and are attractively terse.

Over-reliance on elements is why Maven pom files are such a verbose disaster, and probably the main reason why web developers puke when trying to stream data. Restating element names makes for illegible, bloated data. Attribute-heavy XML is attractively terse and benefits from validation (unlike JSON).

bananas · on May 7, 2014

But you can't change an attribute to a composite type in the future easily.

As for Maven POMs, I use NetBeans "add dependency" and that's about it, so it's a non-issue for me.

wtbob · on May 6, 2014

Far better than either is:

    (fruit (id 10) (name orange))

At least IMHO.

mantrax5 · on May 6, 2014

As the joke goes, you can write COBOL in any language, and you're the proof.

In JSON a more typical format would be:

    {"id":10,"name":"orange"}

Now why you need an id and a name is another smell, but let's leave that in for the sake of the example.

You don't need "node names" in JSON. Typically objects of type "fruit" will be hosted in an array whose type you're always aware of by the contract of your service.

Sure you can have polymorphism and hint the type in those exceptional cases:

    {"type":"fruit","id":10,"name":"orange"}

But at least you include that if you need it, it signifies intent, and it's not done just to appease a markup language bent out of shape as a serialization format.
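A minimal Python sketch of that pattern: the element type of a homogeneous array comes from the service contract, and a `"type"` tag is read only when the array is mixed. The `Vegetable` class, the `TYPES` registry, and `decode_item` are hypothetical names made up for this example.

```python
import json
from dataclasses import dataclass

@dataclass
class Fruit:
    id: int
    name: str

@dataclass
class Vegetable:  # hypothetical second type, only here to make the tag meaningful
    id: int
    name: str

TYPES = {"fruit": Fruit, "vegetable": Vegetable}

def decode_item(record, default_type=None):
    """Build an object from a JSON record, using its 'type' tag if present."""
    fields = dict(record)
    tag = fields.pop("type", default_type)
    return TYPES[tag](**fields)

# Contract says: this endpoint returns an array of fruit, so no tag is needed.
homogeneous = [decode_item(r, default_type="fruit")
               for r in json.loads('[{"id":10,"name":"orange"}]')]

# Mixed array: each record carries its own tag.
mixed = [decode_item(r) for r in json.loads(
    '[{"type":"fruit","id":10,"name":"orange"},'
    ' {"type":"vegetable","id":11,"name":"leek"}]')]
```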

By the way, it's curious you chose to use attributes in your XML example. Attributes aren't typically used to serialize object fields. Can you guess why?

smrtinsert · on May 6, 2014

You seem to have only one use case, which appears to be serialization; that isn't nearly the world of what XML covers. Yeah, I guess if you don't want to use a database for your blog application, JSON serialization is fine.

A lot of us need much more from our data than that. We need validation, we need a machine-readable description format, and we need APIs to leverage all this that don't change every day.

The JSON community is constantly reinventing the latter. Who cares if it saves a few bytes in transmission? I don't know anyone who grumbles "oh great, the XML is making my internet slow today."