Xump - first steps
1242570295|%e %b %Y, %H:%M %Z|agohover
There are two kinds of intelligence: finding special cases, and finding generic answers. Today, I explain Xump, a new project to make a generic answer to messaging.
I've written a lot about messaging recently and one theme comes back again and again. There is, lurking somewhere below the mass of different ways of connecting applications, a universal answer.
Finding it is a slow, collective thought process that is taking years and involving many people. Most people will agree on certain attributes of an ideal messaging system: it is asynchronous, so pieces can all work at their own pace. It is abstracted, so pieces talk to logical addresses, not other pieces. It is symmetrical, so that any application can act as both service and client. And so on…
There is a lot of disagreement over transports. HTTP, AMQP, Comet, HyBi, RestMS. Since different transports reflect different network realities, it seems right that there be lots of choice, and competition.
And there is absolutely no agreement over resource models - the structures that are most important to application developers. Even a single protocol like AMQP has radically different models for versions 0.9, 0.10, and 1.0.
As the architect of two messaging products - OpenAMQ and Zyre - I've had to think about what lies below the transport. The latest draft of RestMS has a concept called "profiles" that lets designers add new semantics for feeds and pipes.
So I've been thinking about this: is there a universal model that lies beneath the different transport layers, and which can implement the many apparently different messaging models we're seeing? AMQP/0.9 has exchanges, queues, and bindings. AMQP/1.0 has queues and links. RSS and AtomPub have feeds. Comet has channels. The Java Message Service (JMS) has destinations. And so on.
We learned several years ago that AMQP could implement JMS destinations. It looks like AMQP/1.0 can implement AMQP/0.9 semantics. AMQP can implement RSS, if one adds an HTTP transport layer. RestMS can implement AMQP.
So it looks like the more recent, more generic models can in fact implement the older, specific models. Logically, this suggests that there is a fully generic model which can implement any form of application messaging, both broker-based and peer-to-peer (which basically means putting a broker in every node, and defining elastic relationships between them).
What would such a model look like? To be honest, I don't exactly know, but parts of it seem clear:
- Conceptually we have a set of applications that speak to a central messaging broker, though there may be no broker, and applications may be threads, and 'speak' can be anything from a network connection to a shared memory queue.
- Applications publish messages to named shared resources, which are a form of queue. The semantics for publishing are extensible because they depend on external choices.
- Queues are stored in some fashion. The semantics for storage are extensible because there are cost/performance tradeoffs which we want to make available to the application architect.
- Queues deliver their messages to applications. The semantics for delivery are extensible because as for publishing, these depend on external choices. For example, do we deliver messages one by one (pedantic, safe, and slow), or as a stream (faster but more risky)?
- Queues may route messages into other queues according to application-specified subscription criteria that we will call selectors. Selectors say, "when a message matches these criteria, perform such-and-such operation".
- Selector criteria generally, but not only, match on the message address. There are many ways of matching: literal comparison, topic patterns, regular expressions, numeric ranges, Cartesian coordinates, XML paths, and so on.
- For O(log n) performance, selector matching is done collectively, as one operation on all selectors for a queue. This is possible (as we showed in 2006 for AMQP) for selectors that work on address patterns, where the set of different possible addresses is limited (to thousands).
- Selectors can also filter messages one by one, an operation with O(n²) cost. Here we do not need to precompute indices but can delegate the entire matching operation to an extension. This is how we do arbitrary content-based routing.
- The two main selector operations are move (where a message goes to only one application) and copy (where messages go to a set of applications).
- Ideally, we can mix selectors of any type on a queue and everything will just work. This is also the big change AMQP/1.0 makes over previous versions of AMQP.
- Messages are opaque binary contents with a textual envelope holding an address and other properties. Selectors work on the envelope, filters can work on the contents as well.
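The collective matching described in the list above could look roughly like this. It is a minimal Python sketch, not Xump code: the names `SelectorIndex`, `add`, and `match` are made up for illustration, and it assumes topic-style patterns in which `*` stands for exactly one dot-separated word. Because the set of distinct addresses is limited, the result for each address is computed once against all selectors and then cached, so a repeated address costs a single lookup.

```python
import re

class SelectorIndex:
    """Hypothetical index that matches all selectors of a queue in one step."""

    def __init__(self):
        self._selectors = []   # (compiled pattern, target) pairs
        self._cache = {}       # address -> list of targets, built lazily

    def add(self, pattern, target):
        # Assumption: '*' stands for exactly one dot-separated word.
        parts = [r"[^.]+" if p == "*" else re.escape(p) for p in pattern.split(".")]
        self._selectors.append((re.compile(r"\.".join(parts) + r"\Z"), target))
        self._cache.clear()    # cached results are stale once selectors change

    def match(self, address):
        # First sight of an address scans every selector; later sights are a lookup.
        if address not in self._cache:
            self._cache[address] = [t for rx, t in self._selectors if rx.match(address)]
        return self._cache[address]

index = SelectorIndex()
index.add("rec.pets.*", "all-pets")
index.add("rec.pets.dogs", "dogs")
```

A content-based filter would bypass such an index entirely and inspect each message, which is why it is the more expensive path.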
Here is the Parrot pattern for a set of recipients:
Sender
  -> Queue
     -> Selector address like "rec.pets.*" COPY
        -> Recipient
     -> Selector address like "rec.pets.dogs" COPY
        -> Queue
           -> Selector MOVE
              -> Recipient
     -> Selector address like "rec.pets.cats" COPY
        -> Recipient
Here is the Wolfpack pattern for a set of recipients:
Sender
  -> Queue
     -> Selector address EQ "wolf" COPY
        -> Queue
           -> Selector MOVE
              -> Recipient
           -> Selector MOVE
              -> Recipient
           -> Selector MOVE
              -> Recipient
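To make the two patterns concrete, here is a small Python sketch of queues with COPY and MOVE selectors. Everything in it is an assumption made for illustration: the class and method names are invented, MOVE is implemented as round-robin, and patterns use shell-style wildcards from `fnmatch` (where `*` matches any run of characters) rather than whatever matching Xump ends up with.

```python
from itertools import cycle
import fnmatch

class Recipient:
    def __init__(self, name):
        self.name = name
        self.inbox = []

    def publish(self, address, body):
        self.inbox.append((address, body))

class Queue:
    """Hypothetical queue: COPY selectors fan out, MOVE selectors share the load."""

    def __init__(self):
        self.copy = []       # (pattern, target): every matching target gets the message
        self.move = []       # targets: each message goes to exactly one of them
        self._next = None

    def add_copy(self, pattern, target):
        self.copy.append((pattern, target))

    def add_move(self, target):
        self.move.append(target)
        self._next = cycle(self.move)    # assumption: round-robin distribution

    def publish(self, address, body):
        for pattern, target in self.copy:
            if fnmatch.fnmatchcase(address, pattern):
                target.publish(address, body)
        if self.move:
            next(self._next).publish(address, body)

# Parrot: every matching recipient gets its own copy.
parrot = Queue()
a, b = Recipient("a"), Recipient("b")
parrot.add_copy("rec.pets.*", a)
parrot.add_copy("rec.pets.cats", b)
parrot.publish("rec.pets.cats", "meow")

# Wolfpack: one shared queue, three workers, each message goes to one worker.
pack = Queue()
wolves = [Recipient(f"wolf-{i}") for i in range(3)]
for wolf in wolves:
    pack.add_move(wolf)
entry = Queue()
entry.add_copy("wolf", pack)
for n in range(6):
    entry.publish("wolf", f"job {n}")
```

Since a queue and a recipient both accept `publish`, a selector can point at either, which is what lets the patterns nest.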
That's it, though I'll be surprised if this is a final breakdown.
Let's think about an implementation. Do we want to expose the above model to the user of the messaging system, i.e. the application programmer? I think the answer is "no", because it is (a) too complex and (b) too raw. Application programmers need simplicity, and the above model of queues, selectors, and filters is too abstract, and thus too complex.
This is the same concern I have with the AMQP/1.0 design. It is a better model for generic messaging than AMQP/0.9. But it seems unnecessary to expose that model to application developers. Lacking from AMQP/1.0 is a set of higher-level models that are easier to use. Like RestMS profiles, or like the patterns I described on the www.restms.org wiki.
So this universal model is something to be embedded in a messaging product, it is not a deliverable messaging model itself. Just as physics engines and graphics engines let game designers draw realistic explosions, so a messaging engine should solve the core problems of a messaging product.
Which brings me to Xump, which is my new project to build a messaging engine based on the above very rough model design.
If Xump is buildable, it will make it simple to add new semantics for storage and matching. It will also make it simple to build a new messaging broker like OpenAMQ/2, or Zyre, which will consist of:
- A transport layer
- A configuration layer
- An administration layer
- The embedded messaging engine
- Extensions that implement storage and matching
- Implementations for the product's specific messaging models
Zyre profiles, for example, would be little applications that use the messaging engine in specific ways. AMQP/0.9.1's exchanges and queues, similarly, are straight-forward to implement on top of Xump and I expect AMQP/1.0 will also be doable. And RSS, AtomPub, and so on.
I'm building Xump using the iMatix Base/2 tools, which are the multi-threaded framework we use for OpenAMQ, Zyre and the X5 web server. Base/2 is nice but has a steep learning curve. Sorry about that. At this stage, the goal is to prove the model design, so we can document it and make it easy to re-implement. There is no reason a Python engine could not implement the same model as one written in C, though it would probably be slower.
Follow this on github (http://github.com/pieterh/xump/tree/master), and on xump.org where I'll document the model as it starts to work.
Introduction to RestMS
1241350717|%e %b %Y, %H:%M %Z|agohover
For those new to RestMS, I've written an Introduction to RestMS that explains the background to this new protocol.
Here are the conclusions of the article:
Messaging is an important tool that is finally coming within reach of the normal software developer. However the good stuff is still too complex, and the simple stuff not good enough for real work. We think that it's time for a simple capable option, based on a free and open standard, with solid open source implementations.
RestMS is this free and open standard, based on generic routing and queuing models, and easily extensible to new semantics. RestMS was heavily influenced by AMQP, another emerging messaging protocol, and AtomPub, a content publishing protocol that uses a simple and regular HTTP-based API style called "REST".
RestMS is simple, secure, scalable. It is clean, precise and minimal, but gets the job done. It uses established HTTP security standards. And it uses HTTP scaling: servers, caches, proxies, accelerators, optimised stacks.
New RestMS Community Site
1239462366|%e %b %Y, %H:%M %Z|agohover
RestMS is growing and has a new home on www.restms.org plus its own mailing list.
It's easy to remember when the first running RestMS code hit the streets - on 25 December 2008 - since here at iMatix we pronounce it to rhyme with "Xmas". Now a few months later, Thilo Fromm has started on a Python/Django server and started to contribute ideas to the spec, such as a pipe that would pump its messages back into some feed.
RestMS is splitting off from AMQP, which starts to become a special use case. Thilo's server will be stand-alone and eventually Zyre will also be.
So, we've moved RestMS off the wiki.amqp.org site where it was incubated. We based wiki.amqp.org on frameworks built by the Digital Standards Organization and first tested in anger - well, mild pique perhaps - to develop microspecifications around AMQP: CML, DMP, and then RestMS.
At the heart of this framework is Digistan's COSS, the Consensus-Oriented Specification System, with its controversial but satisfyingly evil branch-sharealike-and-merge answer to resolving the conflicts which always emerge in such processes. The good folk at Digistan have built a new template site for specification projects like RestMS. A few clicks (an hour or so) and I was able to clone this template site into a new www.restms.org.
If this all sounds suspiciously easy, it's because Digistan uses Wikidot.com, like this blog, and Wikidot.com is one of those magical tools that turn ordinary people into semi-gods.
The new restms.org site is set up to host not a single specification but a collection of them, each with authors, contributors, critics. Of course the work will mostly be done by a few individuals, as it always is. But the important thing is that anyone who wants to get involved can. There are no barriers except the ability to contribute (which may be limited by one's professional freedom).
As we go through the next round of RestMS drafting it'll become clearer why we need a collection of specs, rather than one document. The main reasons are to split stable designs from experimental work, and to give people an easy space in which to contribute new ideas.
From Linux To Bangalore And Back
1234262215|%e %b %Y, %H:%M %Z|agohover
Brett Cameron writes to tell me that Zyre and OpenAMQ 1.4a3 are running on an OpenVMS box in Bangalore. I do a quick test using the simplest messaging client ever invented: TELNET.
Brett writes, "just to let you know that today I finally managed to find time to get OpenAMQ 1.4a3 and Zyre up and running on OpenVMS. We have an OpenVMS machine that is accessible from the Internet (hosted in Bangalore), and, assuming that nothing falls over, you should be able to run your uspeak/ulisten Perl scripts against it (TCP/IP address 15.xxx.xxx.199, port 80). I have just given it a spin from my Linux box and it all seems to be hanging together. Quite amusing really… one process on my Linux box talking RestMS with another process on my Linux box over the Internet via an OpenVMS server in Bangalore!"
So the server is in Bangalore, and the publisher and subscriber are both on a Linux box in New Zealand… Not that I doubt Brett, but I need to try for myself! So I whip out TELNET and type the HTTP request "GET /restms/domain/default" by hand, pressing Enter twice to finish it:
ph@ws200901:~/work/trunk/openamq$ telnet 15.xxx.xxx.199 80
Trying 15.xxx.xxx.199...
Connected to 15.xxx.xxx.199.
Escape character is '^]'.
GET /restms/domain/default
HTTP/1.0 200
Content-type: application/restms+xml;type=domain
Last-modified: Mon, 09 Feb 2009 19:42:49 UTC
Etag: 4628193595c88-8a
Server: X5
Date: Mon, 09 Feb 2009 22:22:33 UTC
Content-length: 379
<?xml version="1.0"?>
<restms xmlns = "http://www.imatix.com/schema/restms">
<domain name = "default" title = "Default domain">
<feed type = "direct" name = "default" title = "Default feed"
href = "http://zzz.yyy.hp.com/restms/feed/default" />
<feed type = "fanout" name = "ublog" href =
"http://zzz.yyy.hp.com/restms/feed/ublog" />
</domain>
</restms>
Connection closed by foreign host.
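For anyone who would rather script this than type it, the same exchange can be sketched in Python with a plain socket. This is only an illustration of what the TELNET session does: the function names are made up, and the parser assumes a well-formed response with CRLF line endings.

```python
import socket

def build_request(path):
    # A bare request line followed by an empty line, as typed into TELNET.
    return f"GET {path}\r\n\r\n".encode("ascii")

def parse_response(raw):
    """Split a raw HTTP response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    status = int(lines[0].split()[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body

def fetch(host, path, port=80, timeout=10):
    # Send the request, then read until the server closes the connection.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(path))
        chunks = []
        while chunk := conn.recv(4096):
            chunks.append(chunk)
    return parse_response(b"".join(chunks))
```

Calling `fetch(host, "/restms/domain/default")` against a reachable Zyre server should return the domain document shown above as the body.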
This… is… fantastic. When we started AMQP almost five years ago, the idea of being able to test a connection using nothing more than TELNET was one of our goals. Today we seem to have achieved it.
Kudos to Brett for having ported not just OpenAMQ, but Zyre, and the X5 web server it runs on, to OpenVMS. The beauty of open source…
Fosdem 2009
1234005635|%e %b %Y, %H:%M %Z|agohover
Boring, Complex and Painful - Messaging in the 21st Century. Notes on a presentation for FOSDEM 2009.
Complex and Boring
Complexity is good for the closed-source software business. It keeps competitors away, and captures clients and profits. Messaging is the last frontier, the last big domain where closed-source rules. Our objective: make it simple, make it open, change that market. It's the users and small software firms vs. the large software firms.
The Point of Messaging
Every significant software project will make its own messaging system. Badly. Messaging is an easy question: "Connect these pieces together". But the answer is difficult: addressing, queuing, formatting, transports, cost, reliability. Existing answers all seem wrong in vital areas.
AMQP and What it Means
AMQP is a general-purpose messaging architecture. This means it can be used in any place where you'd make your own messaging system. It's a protocol. This means you can choose from several products, which interoperate and compete. It's open. So all AMQP implementations are FOSS. Designed by users and small software firms. So, a reasonable answer.
The OpenAMQ project
OpenAMQ is iMatix's AMQP server and client. FOSS, of course. Project started in 2005 in parallel with AMQP. Reasonably fast (500k msg/sec), stable, handles very large volumes happily. Built with iMatix Base2 metaprogramming architecture for portable multicore servers. LOC: 937,201. LOMC (lines of metacode): 159,748 LOMMC: 68,801. http://www.openamq.org.
Making it Simpler
AMQP is OK but can be improved. Protocol is too complex. Architecture has some flaws. Community excludes small contributors. Not an open process. Large software firms dominating again. Here we go again… :-)
REST
The web's answer to SOAP. Create-Read-Update-Delete. Server-generated URIs for new resources. Scalable, cacheable, simple. "Moving to pure HTTP allowed us to throw away the code for the XML-RPC service. (I love throwing away old code ;-)"
RestMS
RestMS is a general-purpose messaging protocol for the web. It's AMQP, but RESTful. Works over plain HTTP. Cleans-up the AMQP model giving "feeds", "pipes" and "joins". Connects to AMQP servers. Free & open protocol. http://www.restms.org, http://wiki.amqp.org/spec:7.
Client:
-------------------------------------------------
POST /restms/domain/default
Content-Type: application/restms+xml
Slug: fortune
<?xml version="1.0"?>
<restms xmlns="http://www.imatix.com/schema/restms">
<feed type="service" />
</restms>
Server:
-------------------------------------------------
HTTP/1.1 201 Created
Content-Type: application/restms+xml
Location: http://host.com/restms/feed/fortune
<?xml version="1.0"?>
<restms xmlns="http://www.imatix.com/schema/restms">
<feed type="service" name="fortune" />
</restms>
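The client half of this exchange can be built with Python's standard library. The sketch below only constructs the request and does not send it; `feed_request` is a made-up helper name, and `http://host.com` is simply the host that appears in the Location header above.

```python
import urllib.request

RESTMS_NS = "http://www.imatix.com/schema/restms"

def feed_request(base_url, name, feed_type):
    """Build (but do not send) the POST that creates a feed in the default domain."""
    body = (
        '<?xml version="1.0"?>\n'
        f'<restms xmlns="{RESTMS_NS}">\n'
        f'  <feed type="{feed_type}" />\n'
        '</restms>\n'
    ).encode("utf-8")
    return urllib.request.Request(
        base_url + "/restms/domain/default",
        data=body,
        method="POST",
        headers={"Content-Type": "application/restms+xml", "Slug": name},
    )

req = feed_request("http://host.com", "fortune", "service")
# To send it: urllib.request.urlopen(req), then read the Location response header.
```

The Slug header carries the name the client would like; the server decides the final URI and reports it in Location.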
Zyre
Zyre is iMatix's RestMS server. FOSS, of course. Project started in 2008 in parallel with RestMS. Built with Base2, so scalable to multicore servers, fast, stable. Currently, part of OpenAMQ project, will be split off later. Talks to OpenAMQ at one side, and HTTP at the other. http://www.zyre.com. 3,353 LOMMC.
X5
X5 is iMatix's HTTP server and does HTTP work for Zyre. Is Xitami (1996-2002), rebuilt on Base2. Fast, scalable, secure, smart. Like, auto-blacklists offensive IP addresses. Currently, part of Base2, will be split off later. 3,457 LOMMC.
Live.zyre.com
Live.zyre.com is an open RestMS server. Runs latest build of Zyre and OpenAMQ. To use, you just need an Internet connection. This lets you test RestMS without installation.
Live Zyre
1233952834|%e %b %Y, %H:%M %Z|agohover
live.zyre.com is a new experimental service that lets you test Zyre without any installation hassle. Just point your RestMS application to live.zyre.com and run.
Problem is, you don't have a RestMS application to try. So, we'll make one. We'll make it in Perl using the RestMS.pm module. This is a microblogging application in two halves. The first half just sends one line of text to the "ublog" feed. The second half listens to that feed and prints out everything it receives.
It's simplest if you download the RestMS.pm module straight into where you write and test this application. Here is the first half, uspeak.pl:
#!/usr/bin/perl
#
# uSpeak app, writes a line of text to the ublog feed
# perl uspeak.pl "some line of text"
#
# (c) 2009 iMatix, may be freely used in any way desired
# with no conditions or restrictions.
#
use RestMS ();
my $domain = RestMS::Domain->new (hostname => "live.zyre.com");
my $feed = $domain->feed (name => "ublog", type => "fanout");
my $message = RestMS::Message->new;
$message->content (shift);
$message->headers (name => "Jeep Nine Thirst");
$feed->send ($message);
And here is the second half, ulisten.pl, which is just a little more complex since it saves and restores its pipe:
#!/usr/bin/perl
#
# uListen app, listens to the ublog feed and prints what it receives
# perl ulisten.pl
#
# (c) 2009 iMatix, may be freely used in any way desired
# with no conditions or restrictions.
#
use RestMS ();
my $domain = RestMS::Domain->new (hostname => "live.zyre.com");
$domain->verbose (0);
my $feed = $domain->feed (name => "ublog", type => "fanout");
# Cache the pipe name to reuse existing pipes (polite!)
# This also ensures we'll catch on whatever we missed...
my @config;
if (open (my $in, "<", "ulisten.cfg")) {
    @config = <$in>;
    close ($in);
}
chomp (@config);
my $pipe = $domain->pipe (name => $config [0]);
open (my $out, ">", "ulisten.cfg") or die "Cannot write ulisten.cfg: $!";
print $out $pipe->name;
close ($out);
# In any case, connect our pipe to the ublog feed
my $join = $feed->join (pipe => $pipe);
# Now listen and print whatever people say
while (1) {
$message = $pipe->recv;
$pipe->carp ($message->headers (name).":".$message->content);
}
To run, you need certain Perl modules. Sorry about this, it's the way things go. What I did was:
sudo cpan
and then after pressing Enter a lot of times,
install LWP
install Alias
install XML::Simple
It'll depend on what was already on your machine.
Since we're using live.zyre.com, no Zyre install is necessary. I've already hardcoded that domain name into the scripts. So, download the three files - RestMS.pm, ulisten.pl, and uspeak.pl - and then run two perl commands, in two windows:
perl ulisten.pl
and
perl uspeak.pl "hey, I'm running my first restms app!"
If you want to show your l33t skillz, edit uspeak.pl to put your own name.
What does this example show?
- How to create feeds and pipes
- How to wait for, and process messages
- How to make persistent pipes
Enjoy!
RestMS Draft 2
1233933355|%e %b %Y, %H:%M %Z|agohover
There's now a new release of Zyre on http://www.openamq.org. This is still a work in progress and some things don't work, like automatically getting OpenAMQ exchanges visible as feeds.
This is what's changed since the last release:
- Implements the new RestMS spec (draft 2).
- Implements the Digest-AMQP spec, which is a nice example of how to interconnect AMQP and RestMS applications.
- Many changes to the X5 web server core used in Zyre: this is now starting to feel like a real web server.
- There's a class interface in Perl which makes it easy to write RestMS apps. We'll make interfaces in other languages over time.
Since the RestMS interface changed, the code samples on Zyre.com have all gone; we'll build a new library of sample applications. There is a quick starter on Zyre.com.
Apologies for the lack of extensive documentation at this stage. Questions to the openamq-dev list!
Slashdot sees Stonehenge, blinks
1232638184|%e %b %Y, %H:%M %Z|agohover
I saw Microsoft has started to contribute to the Apache Stonehenge project. That's a new project to build a web service-oriented architecture (SOA) around the Simple Object Access Protocol (SOAP). RestMS and Zyre aim at the same space: web messaging using HTTP.
In the discussion on Slashdot, it's worth noting that not a single comment actually discussed the detail of the project. Clearly, SOAP and everything tied to it goes above most ordinary geeks' heads like Air Force One on its way to a G7 meeting.
The similarity between the two terms is coincidence but I'm sure someone will try to make SOAP mean the SOA Protocol. Anyhow, to cut a long story short, if no-one reading Slashdot can understand the importance of Stonehenge except as a basket for Microsoft donations, then the project is doomed to be an expensive "enterprise" bolt-on that will never get traction.
Ironic for Apache, which started life as a web server that everyone could download, hack, and improve.
RestMS Goes Atomic
1232468319|%e %b %Y, %H:%M %Z|agohover
I have to thank Steve Vinoski for steering me towards Atom and telling me I should reuse what I could from that protocol. Atom is RSS done the right way and perhaps the best known example of a RESTful protocol.
It's finally ready, the second draft version.
A total rewrite, but very much better than the first draft. Fewer pieces, and more rules. Lots of examples.
There are three things I particularly like in this spec:
- It defines a generic structured document format. Like XML, but simpler. Much simpler. And open to alternatives like JSON. As well as XML, of course.
- It defines a generic RESTful framework, "how HTTP methods are used to create, retrieve, modify, and delete server-side resources." Now this is implied in AtomPub, and explained informally in some places, but has never been formally expressed as far as I know.
- It solves the problem of asynchronous message delivery.
I'll explain just the last point. REST relies on URIs to access resources. It is inherently driven by client "pull". Which is how most programs work, but it's lousy for messaging. It's bad in so many ways. If a client polls once a second, it means every message has an average of 0.5 seconds extra delay. Opening and closing connections is expensive. And it produces clumsy architectures. What is the client doing when it's not checking for a message? Busy-waiting?
The right way to do messaging is server "push". Actually, event-driven architectures work in many, many areas. Code waits for an event, handles the event, loops and waits again. It gives a snappy, responsive system. It lets you design it in clean pieces. It lets you focus on the space between pieces, rather than the pieces. A beautiful, solid, way of making software systems that goes back decades.
So how to make server push work over HTTP, which is essentially a "pull" protocol?
The answer is "long polling", in which the server only responds to a GET when an event happens. One way to do this is using a new HTTP header that fetches a resource only when it's been modified after a certain time.
This assumes that timestamps are discrete, which they're not. It also assumes that we're re-fetching an existing resource. But how about fetching a message that is not actually there, yet?
The solution in RestMS is an "asynclet" which is a resource that gets a URI before it exists. If the client GETs this asynclet, that is a long poll. The server holds the connection open, and waits. If the connection drops, or the HTTP library gives a time-out, the client just reconnects and tries again.
It's a very quick - snappy - model, since the microsecond that a message arrives in a server, it can be shoved off to a waiting client with no delay. The model is robust against dropped connections, and against crashed clients. Once the message has arrived, the asynclet URI turns into a normal URI until the client explicitly deletes the message.
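From the client's side, the long poll is just a retry loop. Here is a minimal sketch in Python, under assumptions: `get` is a hypothetical callable that blocks until the server responds and raises `TimeoutError` or `ConnectionError` when the wait is cut short, and the asynclet URI in the usage lines is invented.

```python
def long_poll(get, uri, max_attempts=None):
    """Keep GETting `uri` until the server delivers a message.

    Returns the message, or None if `max_attempts` is reached first.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            return get(uri)      # the server answers only when a message exists
        except (TimeoutError, ConnectionError):
            continue             # dropped or timed out: reconnect and try again
    return None

# Usage with a fake transport: two failed waits, then a message.
outcomes = iter([TimeoutError(), ConnectionError(), "hello"])

def fake_get(uri):
    result = next(outcomes)
    if isinstance(result, Exception):
        raise result
    return result

message = long_poll(fake_get, "/hypothetical/asynclet/uri")
```

Because the message stays on the server until the client deletes it, retrying after a dropped connection cannot lose anything.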
Simple, elegant, robust, fast. This is good.
Nothing is free, of course. The problem with long polling is that it ties up a server thread. Servers like Apache have a limited number of worker threads and when these run out, new clients are rejected.
So, normal web servers can't handle it.
Luckily, Zyre is not based on a normal web server. We've restarted the Xitami project and the new web server - which we're calling "X5" - is designed to handle thousands of connections without difficulty. So Zyre is perfectly ready for long polling.
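Why an event-driven server copes where a thread-per-request server does not can be shown with a toy example. The Python sketch below uses `asyncio` purely as an illustration of the idea; it says nothing about how X5 is built internally, and `Feed` and its methods are invented names. Each parked long-poll request costs one small future rather than one worker thread.

```python
import asyncio

class Feed:
    """Hypothetical in-memory feed that parks long-poll requests until a message arrives."""

    def __init__(self):
        self._waiters = []

    async def wait_for_message(self):
        # Each parked client costs one small Future, not one thread.
        future = asyncio.get_running_loop().create_future()
        self._waiters.append(future)
        return await future

    def publish(self, message):
        # Wake every parked client at once and return how many there were.
        waiters, self._waiters = self._waiters, []
        for future in waiters:
            if not future.done():
                future.set_result(message)
        return len(waiters)

async def demo(clients=5000):
    feed = Feed()
    tasks = [asyncio.create_task(feed.wait_for_message()) for _ in range(clients)]
    await asyncio.sleep(0)       # let every client reach its wait
    feed.publish("hello")
    return await asyncio.gather(*tasks)

results = asyncio.run(demo())
```

Five thousand waiting clients here are five thousand entries in a list, all released by a single publish.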
I love it when a plan comes together.
Brand Name Necrophilia
1230934862|%e %b %Y, %H:%M %Z|agohover
This week, I resurrected a project that I first started in 1995 when I realized "this Internet thing is going to be HUGE". At the time, I figured one day every advertisement would feature a URL. I never realised the Internet would elect its own world leader only half a generation later.
The name for what Wikipedia describes as a "multi-year coding frenzy" was taken at the last minute in 1996 from an "iMatix" t-shirt viewed in the mirror. Wikipedia says that "the last Xitami release, 2.5 has been in beta since 1999". We abandoned Xitami/2 because it was too much work and did not generate any income. Peoples has to eat.
But times change, and the meta-programming tools (iCL, Base2, XNF) we use to make OpenAMQ and Zyre are really much easier to use than what we had in 1999.
Since Zyre needs a reliable embedded web server, and it's easier to write one from scratch than integrate an existing one, Xitami is alive again. This is what my friend Mato calls "brand name necrophilia" but I see it more like re-enacting a historical drama, or recreating a classic movie, but with shorter skirts and more explosions.
Xitami/3 did exist briefly, but the main reason for calling this new, written from scratch beast "Xitami/5" is the same reason one of my SoftToys video game titles from 1986 was called "Star Warp II". The higher version number makes people think it's a hit series. Clever marketing, see?
As I write this, Xitami's spanking new Digest Authentication module is telling me it still can't correctly calculate the MD5 hash for "MD5 (HA1 : nonce : nonceCount : clientNonce : qop : HA2)". I've been working since midday on the Digest authentication module (the Basic authentication was relatively easy). And it complains:
have:b646519251c16f971246593efca064c2 need:60fecb06726d88ea5f015f6765c34c50
Which is why some people classify security programming as "mindless masochism". It is definitely hard. Making Xitami work with Apache-format passwd and digest files is more like digging a ditch to precise specifications, using a spoon, than leaping from an exploding train wreck.
Still, there are pleasures in writing new code. Xitami/5 takes a cynical and distrustful view of the Internet. If a browser is not absolutely well-behaved, it believes, there is a crook, hacker, spammer, or idiot behind that keyboard. So, Xitami/5 has a policy language that lets me write gems like this (and XML shines for this kind of instant language):
<!-- Detect hostile requests, auto-ban offending IP addresses -->
<policy name = "auto-ban">
<!-- Attempt to smash the server with long requests -->
<detect limit = "255" comment = "long request line" />
<!-- Attempts at injections via the URI -->
<detect value = "%3Cscript" comment = "script injection" />
<detect value = "%3Cform" comment = "form injection" />
<detect value = "%20or" comment = "SQL injection" />
<detect value = "%20and" comment = "SQL injection" />
<detect value = "%20select" comment = "SQL injection" />
<detect value = "%20drop" comment = "SQL injection" />
<!-- Attempts to navigate outside the web root -->
<detect value = ".." comment = "path climbing" />
<detect value = "%5c" comment = "Win32 paths" />
<detect value = "~" comment = "Unix paths" />
<!-- Probe to see if we're a proxy server -->
<detect value = "http://" comment = "proxy probe" />
<default>
<echo>W: hostile request from $from ($comment), blacklisting</echo>
<echo>W: request='$request'</echo>
<ban />
<deny code = "503" text = "Server overloaded" />
</default>
</policy>
Those 25 lines - which are now in the standard Xitami config file - represent about 80% of the bad behaviour on the Internet as represented by people trying to worm their way into unguarded web servers. Note the cute '<ban/>' action which blacklists the IP address of the sender. Yes, with IP spoofing it's possible to Joe-job innocent people into being banned. Too bad. Shoot first, check for friendlies after.
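The logic of that policy is simple enough to sketch in a few lines of Python. This is an illustration only, not how Xitami evaluates policies: the rule values are a subset copied from the policy, while the case-insensitive substring match, the "longer than 255 characters" reading of `limit`, and the treatment of already-banned addresses are all assumptions.

```python
MAX_REQUEST_LINE = 255
HOSTILE_VALUES = [
    ("%3Cscript", "script injection"),
    ("%3Cform", "form injection"),
    ("%20select", "SQL injection"),
    ("%20drop", "SQL injection"),
    ("..", "path climbing"),
    ("%5c", "Win32 paths"),
    ("http://", "proxy probe"),
]

banned = set()

def check_request(address, request_line):
    """Return (status, reason); ban the sender on the first hostile sign."""
    if address in banned:
        return 503, "banned"
    reason = None
    if len(request_line) > MAX_REQUEST_LINE:
        reason = "long request line"
    else:
        lowered = request_line.lower()
        for value, comment in HOSTILE_VALUES:
            if value.lower() in lowered:
                reason = comment
                break
    if reason is None:
        return 200, "ok"
    banned.add(address)          # the <ban/> action: blacklist the sender
    return 503, reason
```

Returning 503 rather than 403 mirrors the policy's choice to tell the attacker the server is overloaded instead of admitting the ban.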
A web server is only as good as its security. And a web application is only as good as its web server. And Zyre is a web application designed for real, serious work.
So, the last few days have seen a flurry of work on Xitami, which is the key to making a secure and trustable Zyre. Something like 1,500 lines of new code, in three days. And this is meta-code, that would be maybe 30,000 lines of normal C code.
Like this, the http_nonce.icl class which generates 'nonces' (if you know what a nonce is, my sympathies):
<?xml?>
<!--
Copyright (c) 1996-2009 iMatix Corporation
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or (at
your option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
For information on alternative licensing for OEMs, please contact
iMatix Corporation.
-->
<class
name = "http_nonce"
comment = "A security token for Digest authentication"
version = "1.0"
script = "icl_gen"
>
<doc>
Nonces are held in a hash table. The nonce value is the key into the
table. This class generates the nonce value.
</doc>
<inherit class = "icl_object">
<option name = "alloc" value = "cache" />
</inherit>
<inherit class = "icl_hash_item">
<option name = "hash_type" value = "str" />
</inherit>
<import class = "http" />
<context>
int64_t
count; // Digest nonce count value
</context>
<method name = "new">
<dismiss argument = "key" value = "nonce" />
<local>
icl_shortstr_t
nonce; // Calculated nonce value
</local>
// Minimalistic algorithm for now
ipr_str_random (nonce, "Ax20");
</method>
<method name = "selftest">
http_nonce_table_t
*table;
http_nonce_t
*nonce;
table = http_nonce_table_new ();
nonce = http_nonce_new (table);
assert (strlen (nonce->key) == 20);
http_nonce_unlink (&nonce);
http_nonce_table_destroy (&table);
</method>
</class>
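For comparison, here is roughly what that class does, written as a plain Python sketch. It is an approximation made under assumptions: the alphabet is a guess (the exact meaning of the `"Ax20"` format is not spelled out here), `NonceTable` and `accept` are invented names, and the rule that a nonce count must strictly increase is a common replay check that the iCL class above does not show.

```python
import secrets
import string

NONCE_LENGTH = 20
ALPHABET = string.ascii_uppercase + string.digits    # assumption, not iCL's "Ax20"

class NonceTable:
    """Hypothetical table of issued nonces, keyed by nonce value."""

    def __init__(self):
        self._counts = {}        # nonce -> highest nonce count seen so far

    def new(self):
        nonce = "".join(secrets.choice(ALPHABET) for _ in range(NONCE_LENGTH))
        self._counts[nonce] = 0
        return nonce

    def accept(self, nonce, count):
        """Accept a request only for a known nonce with a strictly higher count."""
        last = self._counts.get(nonce)
        if last is None or count <= last:
            return False
        self._counts[nonce] = count
        return True

table = NonceTable()
nonce = table.new()
```

As in the selftest above, the first thing worth asserting is that a fresh nonce is exactly 20 characters long.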
One of the techniques I really appreciate is "test driven development". This means, mainly, writing a test case for every new function you intend to implement. Then test, show that it does not work (yet), then write it and fix it until it works. The advantage is that those 1,500 lines of code are heavily tested. One has to leave some bugs for the community to discover, but writing code that can be rapidly locked down as "working and tested" is a joy.
iCL - the class language we use - generates test programs automatically, and each rebuild re-runs every single test case. I'm working mostly on a slow EEE 1000HD netbook, and it's fast enough.
My goal with the security work is to make Digest authentication work, and then do asynchronous authentication using a back-end application over AMQP.
This is cute. This is what we designed AMQP for.
So, the web server decides to authenticate a request because the access policy says something like:
<policy name = "private messaging" uri = "/restms/">
<always>
<authenticate mechanism = "digest-amqp" realm = "Messaging network" />
</always>
<group value = "users">
<allow />
</group>
</policy>
And it looks for the "digest-amqp" authentication module (I built Xitami/5 using a plug-in modules design based on 'portals', a Base2 class for making extensible architectures).
The digest-amqp module (which I've not yet written but will soon) does not use a local digest file, but instead sends off an AMQP request to an authentication service. This is possible of course because Zyre speaks AMQP so can send messages to an AMQP server, and get back replies.
I'll write a specification for the Digest-AMQP mechanism and put that onto wiki.amqp.org. The details are important. For example, the authentication service needs to return an MD5 hash constructed in the right fashion from the username, realm, and password. The actual password never leaves the authentication service.
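The split of responsibilities can be sketched as follows. This is a Python illustration under stated assumptions, since the Digest-AMQP spec is not written yet: `AuthService` stands in for the back-end that would sit behind AMQP, the direct method call replaces the request/reply messages, and all names are invented. The point it shows is that the web server can verify a Digest response with only the hash of username, realm, and password (HA1 in RFC 2617 terms).

```python
import hashlib
import hmac

def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()

class AuthService:
    """Stand-in for the back-end service; in the design it sits behind AMQP."""

    def __init__(self, passwords):
        self._passwords = passwords      # username -> password, never sent out

    def ha1(self, username, realm):
        password = self._passwords.get(username)
        if password is None:
            return None
        return md5_hex(f"{username}:{realm}:{password}")

def verify(service, username, realm, method, uri, nonce, nc, cnonce, qop, response):
    """Web-server side check: needs only the hash returned by the service."""
    ha1 = service.ha1(username, realm)
    if ha1 is None:
        return False
    ha2 = md5_hex(f"{method}:{uri}")
    expected = md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return hmac.compare_digest(expected, response)

service = AuthService({"Mufasa": "Circle Of Life"})
```

Swapping the dictionary for an LDAP lookup would change only `AuthService`, which is the attraction of putting it behind a message queue.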
This should make it possible to plug Zyre into LDAP servers and other credential systems.
When that all works, it's time to peek into the throat of hell and get Xitami working with OpenSSL.
A web server is only as good as its security. And Zyre is only as good as Xitami.