I have to thank Steve Vinoski for steering me towards Atom and telling me I should reuse what I could from that protocol. Atom is RSS done the right way and perhaps the best known example of a RESTful protocol.
It's finally ready: the second draft of the RestMS spec.
A total rewrite, but very much better than the first draft. Fewer pieces, and more rules. Lots of examples.
There are three things I particularly like in this spec:
- It defines a generic structured document format. Like XML, but simpler. Much simpler. And open to alternatives like JSON. As well as XML, of course.
- It defines a generic RESTful framework, "how HTTP methods are used to create, retrieve, modify, and delete server-side resources." Now this is implied in AtomPub, and explained informally in some places, but has never been formally expressed as far as I know.
- It solves the problem of asynchronous message delivery.
I'll explain just the last point. REST relies on URIs to access resources. It is inherently driven by client "pull". Which is how most programs work, but it's lousy for messaging. It's bad in so many ways. If a client polls once a second, it means every message has an average of 0.5 seconds extra delay. Opening and closing connections is expensive. And it produces clumsy architectures. What is the client doing when it's not checking for a message? Busy-waiting?
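The 0.5 second figure can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming messages arrive evenly spread over time and the client polls on exact interval boundaries; the function name and the sample arrival times are made up for illustration.

```python
import math


def average_polling_delay(interval, arrivals):
    """Average wait between a message arriving and the next poll tick."""
    # A message arriving at time t is picked up at the next multiple of
    # the polling interval, so it waits until that tick.
    delays = [math.ceil(t / interval) * interval - t for t in arrivals]
    return sum(delays) / len(delays)


# Assumed workload: 10,000 messages spread evenly over 10 seconds.
arrivals = [i / 1000 + 0.0005 for i in range(10_000)]

# With a one-second polling interval this comes out at roughly 0.5 seconds.
print(average_polling_delay(1.0, arrivals))
```

Halving the interval halves the average delay, but it also doubles the number of requests, which is the trade-off that makes polling so unattractive for messaging.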
The right way to do messaging is server "push". Actually, event-driven architectures work in many, many areas. Code waits for an event, handles the event, loops and waits again. It gives a snappy, responsive system. It lets you design it in clean pieces. It lets you focus on the space between pieces, rather than the pieces. A beautiful, solid way of making software systems that goes back decades.
So how to make server push work over HTTP, which is essentially a "pull" protocol?
The answer is "long polling", in which the server only responds to a GET when an event happens. One way to do this is using a new HTTP header that fetches a resource only when it's been modified after a certain time.
This assumes that timestamps are discrete, which they're not. It also assumes that we're re-fetching an existing resource. But how about fetching a message that is not actually there, yet?
The solution in RestMS is an "asynclet" which is a resource that gets a URI before it exists. If the client GETs this asynclet, that is a long poll. The server holds the connection open, and waits. If the connection drops, or the HTTP library gives a time-out, the client just reconnects and tries again.
It's a very quick - snappy - model, since the microsecond that a message arrives in a server, it can be shoved off to a waiting client with no delay. The model is robust against dropped connections, and against crashed clients. Once the message has arrived, the asynclet URI turns into a normal URI until the client explicitly deletes the message.
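From the client side, the long poll described above might look something like the following. This is a rough sketch using only Python's standard library, not code from the RestMS spec: the function names, the `opener` parameter (there so the loop can be tried without a real server) and the 30 second time-out are all assumptions.

```python
import socket
import urllib.error
import urllib.request


def long_poll(url, timeout=30.0, max_attempts=None, opener=urllib.request.urlopen):
    """GET `url`, reconnecting after time-outs or dropped connections.

    Returns the response body, or None if `max_attempts` runs out first.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            # The server is expected to hold this request open until a
            # message arrives, so a slow response is normal here.
            with opener(url, timeout=timeout) as response:
                return response.read()
        except urllib.error.HTTPError:
            # A real HTTP error status (404, 500, ...) is not a time-out.
            raise
        except (socket.timeout, TimeoutError, ConnectionError, urllib.error.URLError):
            # Time-out or dropped connection: just reconnect and try again.
            continue
    return None


def delete_message(url, opener=urllib.request.urlopen):
    """Explicitly delete the message once the client has handled it."""
    request = urllib.request.Request(url, method="DELETE")
    with opener(request) as response:
        return response.status
```

A client would call `long_poll()` on the asynclet URI it was given, handle the message, and then call `delete_message()` on the same URI. If the client crashes in between, the message is still sitting at that URI when it comes back.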
Simple, elegant, robust, fast. This is good.
Nothing is free, of course. The problem with long polling is that it ties up a server thread. Servers like Apache have a limited number of worker threads and when these run out, new clients are rejected.
So, normal web servers can't handle it.
Luckily, Zyre is not based on a normal web server. We've restarted the Xitami project and the new web server - which we're calling "X5" - is designed to handle thousands of connections without difficulty. So Zyre is perfectly ready for long polling.
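To see why an event-driven design copes where a thread-per-connection design does not, here is a toy illustration in Python's asyncio. It says nothing about how X5 or Zyre are actually built; the `Asynclet` class and `demo()` function are hypothetical stand-ins, and there is no real HTTP involved. The point is only that a thousand waiting "clients" can be parked in one thread until their message turns up.

```python
import asyncio


class Asynclet:
    """Hypothetical stand-in: a slot that can be waited on before it has content."""

    def __init__(self):
        self._message = None
        self._arrived = asyncio.Event()

    async def get(self, timeout):
        """Long poll: wait until a message arrives, or give up after `timeout`."""
        try:
            await asyncio.wait_for(self._arrived.wait(), timeout)
        except asyncio.TimeoutError:
            return None  # the client would simply reconnect and try again
        return self._message

    def deliver(self, message):
        """Store the message and wake whoever is waiting for it."""
        self._message = message
        self._arrived.set()


async def demo(n_clients=1000):
    asynclets = [Asynclet() for _ in range(n_clients)]
    # Every waiting client is a cheap task, not an operating system thread.
    waiters = [asyncio.create_task(a.get(timeout=5.0)) for a in asynclets]
    await asyncio.sleep(0)  # let all the waiters start and park
    for i, asynclet in enumerate(asynclets):
        asynclet.deliver(f"message {i}")
    return await asyncio.gather(*waiters)


results = asyncio.run(demo())
print(len(results), "clients received their message")
```

A waiting client here costs a small object and an entry in the event loop, so the limit becomes memory and file descriptors rather than the size of a worker pool.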
I love it when a plan comes together.
FYI, some thoughts, issues and resources:
http://restlet.tigris.org/issues/show_bug.cgi?id=143
Also, how does RestMS compare to REST channels implemented in Dojo? Couldn't they share 'specs', e.g. the header extensions such as:
X-Create-Client-Id: 3254325325
X-Client-Id: 3254325325
X-Subscribe: *
etc?
How 'in sync' is RestMS with the proposals on WebSocket?
How different is it from Comet/REST Channels, e.g.:
http://cometdaily.com/2008/09/02/rest-channels-http-channels-with-json-support/
http://cometdaily.com/2008/11/12/using-rest-channels-in-dojo/
http://api.dojotoolkit.org/jsdoc/dojox/HEAD/dojox.cometd.RestChannels
An elegant description:
http://www.subbu.org/blog/2006/04/dissecting-ajax-server-push
HTH
From a quick read of Dojo REST Channels, part of this is already implemented with the concept of pipes, and part with asynclets, which work pretty much like Comet, except right now we don't have specifications for multi-part messages. That is meant to be solved with a "streaming" pipe type.
We've not used any HTTP headers except the standard ones, and I like that. So the next stages are to ensure the basic semantics of RestMS make sense, and then to look at optimizing the two main cases where performance is an issue: publishers that send streams of data, and subscribers that receive streams of data. In both cases we'd do that by keeping open a connection and then sending messages with no confirmations. That calls for some efficient multipart format that lets us include a binary-encoded envelope (XML and JSON are simple but too expensive for really high-volume cases).
We already use simple binary envelope encoding in OpenAMQ (http://wiki.amqp.org/spec:6) and I'd like to see how that can be mapped to an HTTP multipart format. Any ideas?
Appreciate not wanting to use 'extension' headers. I still wonder if the WebSocket spec/discussion isn't relevant? Worth tracking/implementing?
http://www.whatwg.org/specs/web-apps/current-work/#network
In another comment I gave a heads-up on extprot; not sure if that helps indirectly. My intention had been to use it as the multipart body: if the server had several messages ready, it could dump them in one part of a multi-part document, and the client could decode them all, since extprot allows self-delimiting messages.
The WebSocket protocol looks interesting but right now it looks like we'll use RestMS/HTTP for simple-slow scenarios and AMQP for binary-fast scenarios. As soon as we diverge from HTTP and the advantages that gives us (standard clients, network support, etc.) then we pay a price for using exotic protocols, and we might as well use the one that works best, which is AMQP for this work.
I'm going to look at using extprot for multipart documents, and also for structured content that can be transcoded on the fly into XML or JSON for clients. This definitely looks useful.