Archive for the ‘XML’ Category
I usually keep pretty quiet on things that are controversial, and goodness knows you can start a holy war fast with any comments about REST web services. But I’ve been thinking and working on these things for years — REST for over a year, SOAP and friends since 2003 — and maybe my opinions will be interesting to someone.
I dislike SOAP. I’ve worked a lot with it, doing about a dozen major integrations with platforms from companies like Amazon, Google, and Yahoo, using several different programming languages, primarily .NET languages and Perl. I have never worked heavily on the server side with SOAP; I’ve always been a client. From the client side it’s bad enough, and I do consider myself pretty fluent with most of the standards, though some of the WS- standards never were relevant to me. Regardless, I used to be able to read a raw WSDL for a complex service like Adwords, and tell you exactly what XML you needed to send and what you’d get back (primarily because the SOAP libraries available never worked right, and I had to write my own in both C# and Perl).
What’s wrong with SOAP? Complexity. SOAP (and XML in general) is a sledgehammer to crack a nut. It does, however, provide a lot of nice features. You can in theory develop tools for every step of the process: to generate WSDL, write code for servers and clients automatically, and so on. SOAP works, and it generally works well with minor problems caused by quirks of some client libraries not working well with some server implementations (the .NET SOAP libraries never worked right with the WSDLs from Amazon’s early merchant integration services, though they were valid). SOAP is not the most efficient thing that could ever be done, both from the computing and from the human standpoint. You can theoretically do that with other technologies too, but SOAP is the most obvious success story to illustrate (Thrift, CORBA, Protocol Buffers, etc could also be mentioned — but they are not apples-to-apples in this discussion). SOAP is just too heavyweight. I always thought there needs to be something more straightforward and efficient that achieves the same goals.
I’d heard about REST but never looked into it deeply until a couple years ago, when I gained a passing familiarity through some adjacent technologies. Around a year ago I got involved with writing a set of web APIs (I’ll avoid calling them REST APIs) and worked a little more closely to REST. I quickly learned that I didn’t know much, and most of my assumptions were gently shown to be wrong by other team members. But as I learned more, I decided I don’t like REST either.
What’s wrong with REST? Naivety and idealism. No matter how much we dream the dream, things are not self-describing and infinitely fluid, and people seem to fall in love with the “elegance” of HATEOS (which is one of the least elegant acronyms ever created, on any level). When you write a client against a service, the client and all of the systems it’s connected to represent an ossification of some ideas about the interface the service exposes.
An interface of any type is a contract, a promise to the world. The client trusts the promise, and if the promise changes, the client’s code and persistence structures must change to match. The “semantic web” is a wonderful daydream for the day when computers understand “meaning,” but for the time being, very few real useful systems I’m aware of are in the habit of persisting and working with information that might or might not contain anything or nothing in any or no structured format. REST APIs are supposed to somehow be magically discoverable and traversable on demand — “want to know what’s available through the API? Browse the API and see what you find!” This is completely unrealistic. If it were realistic, then why does every REST API have documentation that tells you what you’ll find when you browse the API, and what it means? QED.
Another bit of idealism is the use of HTTP verbs exclusively. HTTP verbs aren’t sacred and somehow sufficient for everything. Designing to four primary verbs encourages a tremendously simplistic (actually, very complicated in subtle ways) worldview of what an efficient API does. Most quasi-REST services I’ve seen (there are very few that really follow the rules) would force one-at-a-time loops over operations involving a lot of data, for example. They were designed by people who’d never heard of the fallacies of distributed computing.
All of this has been said by many others, and probably more succinctly. What do I have to contribute? Only this: to point you to a good design.
About a year ago I was cooking up a hybrid of JSON over HTTP, with many design decisions influenced by Adwords’ design, and called it WORK (as opposed to REST, and because I wanted something that would work). The definitions of the operations and data types (yes, I do believe data types and operations should be predefined and agreed-upon, not discovered dynamically because that Just Doesn’t Work, as explained previously) were all written into a small JSON file. This served as an IDL that could, in theory and mostly in practice, be used to autogenerate client and server code. I didn’t see that project through to completion, but it remains an approach I’d use again in the future.
So, in conclusion I would not recommend that anyone try to develop a REST web service. I’d recommend something more like RPC-over-JSON-using-parts-of-HTTP. This is what most so-called REST services really are anyway. I don’t care if anyone calls them REST, but Roy T. Fielding does, so out of respect for him I’ll use the terms somewhat carefully.
If you’re like me, you’ve gotten tired of writing endless test cases for parsers that can understand the thousands of variations of text output by SHOW INNODB STATUS. I’ve decided to solve this issue once and for all by patching MySQL and InnoDB to output XML, the universal markup format, so tools can understand and manipulate it easily. Here’s a sample snippet:
<status><![CDATA[ ===================================== 100320 15:46:24 INNODB MONITOR OUTPUT ===================================== ... text omitted, but you get the idea ... ]]> </status>
PS: Yes, this is a late April Fool’s joke.
It has all played itself out according to Microsoft’s wishes. They have railroaded through a so-called standard for document representation, gotten it rubber-stamped by so-called standards bodies, and fought their way past all the objections of sensible people and companies. In the process, lots of developing nations have been steamrolled, too. Shame, shame, shame.