Sunday, April 04, 2010

It's not entirely intuitive, but the same architectural style that allows Web servers to handle large-scale usage and iterate so successfully may also help constrained devices (like sensors with under $10 in material cost) handle small-scale loads and power usage and go without upgrades for many years.

A lot of discussions of using REST in constrained devices have glossed over whether it's really interoperable HTTP that's being used (e.g. compressed HTTP) and whether the low-power device would be in the role of the client or the server. My opinion at this point is that for truly constrained devices, compressing HTTP itself would be the wrong choice, and that the device must be in the role of the server. The rest of this series explores why I believe that and should lay open the assumptions that might allow somebody to correct my information and revise my opinion.

The first part of the series attempts to explain why HTTP is the wrong transfer protocol choice for some constrained devices. In theory it's a massively extensible protocol, with ways to extend the operation set, upgrade versions, add status responses, add headers, add resource content types and so on. Again in theory, almost all headers can be combined with each other and with different methods unless forbidden. In practice, the deployed base makes some HTTP extensions, and particularly the combinatorial expansion of extensions and optional features working together, quite difficult, and much special-case code must be written to handle the cases where widely-deployed software fails to implement some detail correctly. Here are a few examples of problems we've had over the years.

1. In developing WebDAV, we tried to use the standard "OPTIONS *" request (the special-case URI '*' means "give me the options or capabilities of the server"). However, we found that this is poorly implemented. Java servlets, for example, did not support the '*' URI. Other HTTP frameworks supported the '*' URI but did not allow the response to OPTIONS * to be extended by software extending the HTTP framework. Ugh.

2. HTTP has several different ways to indicate the length of a message body (for historical reasons as well as for different use cases). One is the MIME type "multipart/byteranges". Another is chunked transfer encoding, where the server (and thus the client) does not need to know the total length until the last chunk is transferred. A third is "Content-Length". A fourth is for the server to terminate the connection, although this feature opens up the possibility of truncation attacks. Some of these methods work poorly with TCP connection continuation. Both clients and servers have to support almost all of these.

3. Both absolute and relative URIs are allowed in several locations, even though they're not a good idea in all locations. The resolution of a relative URI can be tricky and complicate client software. Implementations of this make it worse; e.g. the "Location" header was defined to allow only absolute URIs, but many implementations have been found which use their generic URI parsing or generating code to allow relative URIs in that field as well.

4. Parsing headers with quoting, separator (comma and semi-colon), whitespace and continuation rules is difficult. See http://greenbytes.de/tech/httpbis/issue-14.xhtml, http://greenbytes.de/tech/httpbis/issue-30.xhtml, http://greenbytes.de/tech/httpbis/issue-62.xhtml, http://greenbytes.de/tech/httpbis/issue-77.xhtml etc. Some headers defined in other specifications besides RFC2616 even used the syntax rules incorrectly and have to be special-cased.

5. The support for the Expect: header and 100 Continue response has never been good. In theory it's required, so there's no way for a client to ask if a server supports it, thus the client could end up waiting quite a while for the server's initial response before giving up. Instead, quite a few clients ignore the specification text and start sending their request bodies right after the headers (including the "Expect" header), without waiting for the server's 100 Continue response. This kind of feature also makes it harder to integrate authentication (what happens if the client uses this header when an authentication handshake needs to be initiated?) and connection/transport features, as well as to implement intermediaries.
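To make item 2 concrete, here is a minimal Python sketch of the decision a client has to make before it can read a response body. The precedence follows my reading of RFC2616 section 4.4; the function name `body_framing` and its return labels are hypothetical, and a real client would also have to handle malformed or conflicting headers.

```python
def body_framing(status, headers, request_method="GET"):
    """Return which rule delimits the response body (RFC2616 section 4.4 order)."""
    # Header names are case-insensitive, so normalize them first.
    h = {name.lower(): value.strip() for name, value in headers.items()}

    # 1. Some responses never carry a body, whatever the headers claim.
    if request_method == "HEAD" or 100 <= status < 200 or status in (204, 304):
        return "no-body"
    # 2. Any transfer-coding other than "identity" means chunked framing.
    if h.get("transfer-encoding", "identity").lower() != "identity":
        return "chunked"
    # 3. Content-Length is only consulted if no transfer-coding applies.
    if "content-length" in h:
        return "content-length"
    # 4. multipart/byteranges is self-delimiting.
    if h.get("content-type", "").lower().startswith("multipart/byteranges"):
        return "multipart-byteranges"
    # 5. Otherwise the body ends when the server closes the connection.
    return "connection-close"
```

Even this simplified version has five branches, and the ordering matters: a response carrying both "Transfer-Encoding: chunked" and "Content-Length" must be read as chunked.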

There are many more examples in a long issues list (http://greenbytes.de/tech/httpbis/index.xhtml), and some of the discussions of these issues get quite lengthy in considering how features are actually implemented and how they work combined with other features.
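As a small illustration of the header-parsing difficulty in item 4, the sketch below splits a comma-separated header value while respecting quoted strings and backslash escapes. It is a simplified Python example under my own assumptions (the name `split_header_list` is hypothetical, and header continuation lines and comments in parentheses are ignored), not a complete RFC2616 parser.

```python
def split_header_list(value):
    """Split a header value on commas that are outside quoted strings."""
    items = []
    current = []
    in_quotes = False
    escaped = False
    for ch in value:
        if escaped:
            # Character after a backslash inside quotes is taken literally.
            current.append(ch)
            escaped = False
        elif in_quotes and ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == '"':
            current.append(ch)
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            # Only an unquoted comma separates list elements.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    # Empty elements (e.g. "a,,b") are dropped.
    return [item for item in items if item]
```

A plain `value.split(",")` would cut a quoted string such as "b,c" in half, which is exactly the kind of shortcut that causes interoperability bugs.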

Developing a very simple HTTP client is easy, particularly if the client is talking to a single known HTTP server or is making only a few kinds of requests. Developing a limited-use HTTP server can be fairly simple too, although it gets significantly more complicated if the developer actually tries to handle all RFC2616 features and all unknown headers and methods properly. What turns out to be very hard is building an HTTP client library, server library or extensible server, because these are used in unexpected ways. It's easy for a developer using one of these libraries to break things, e.g. by manually adding a header that doesn't work with the headers added by the library. The library has to work both with and without TLS, and has to support several kinds of authentication, several extensibility mechanisms and many failure modes.

The HTTP Components project at Apache talks about the many flaws and excessive simplicity of most HTTP client libraries, and states that "This is one reason why Jakarta, and other free and commercial vendors, have implemented independent HTTP clients". In other words, code re-use is seriously reduced by the way HTTP must be implemented. Some software companies are still selling HTTP client libraries to fill implementation gaps.

General-purpose HTTP servers -- ones that work for a large number of resources, a large number of requests, support TLS, content-negotiation, cache-control, redirect, authentication and other major features -- are even harder. When well implemented, HTTP server farms scale tremendously well. But much, much effort has gone into making those work.

When we look specifically at constrained devices, we see a much more limited set of use cases than the overall Web offers.
  • Documents are not large! Thus, document range and response continuation features are not desired. Only one transfer-encoding should be necessary, and conditional request features won't be worthwhile.
  • Documents from constrained devices are not intended for direct user consumption. There is no need for language and charset negotiation.
  • Even negotiating content type may be rare. A constrained device will state what it supports and not negotiate.
  • A constrained device will never act as a proxy, so it need not support Via headers or a bunch of status codes like 504. Further, if we define a non-HTTP, non-proxied transfer protocol that can be converted to HTTP at the first proxy step (or converted from HTTP at the last step), then the non-proxied transfer protocol doesn't need any proxy features at all.
  • Redirects are not necessary or could at a minimum be drastically simplified, making most of the 300 status responses unnecessary.
  • All of the 100 level informative status responses are unnecessary.
  • The authorization model for constrained devices is quite different than the authentication model assumed in HTTP (and associated logic like the 401 status code behavior is therefore unnecessary). Access control will be simpler than in current Web servers.
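To get a rough feel for what the cuts in this list do to the status code space, here is a small Python sketch that filters the standard library's `http.HTTPStatus` list. Which codes count as proxy-only, and which single redirect survives, are my own assumptions for illustration (the sets `PROXY_ONLY`, `AUTH_HANDSHAKE` and `KEPT_REDIRECTS` are hypothetical), not a proposal for an actual code list.

```python
from http import HTTPStatus

# Assumptions for illustration only, based on the bullets above.
PROXY_ONLY = {305, 407, 502, 504}   # codes that only make sense with proxies/gateways
AUTH_HANDSHAKE = {401}              # HTTP authentication challenge
KEPT_REDIRECTS = {301}              # hypothetical single, simplified redirect


def constrained_status_codes():
    """Return the status codes left after dropping the features listed above."""
    kept = []
    for status in HTTPStatus:
        code = status.value
        if 100 <= code < 200:
            continue  # informational responses are unnecessary
        if 300 <= code < 400 and code not in KEPT_REDIRECTS:
            continue  # most redirects are unnecessary
        if code in PROXY_ONLY or code in AUTH_HANDSHAKE:
            continue  # no proxy role, no HTTP-style auth handshake
        kept.append(code)
    return sorted(kept)
```

Calling `constrained_status_codes()` gives a noticeably shorter list than the full registry, and dropping range and conditional request features would shorten it further.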

Imagine instead of a messy HTTP client library or server stack, we had a protocol with about a third the features, less extreme extensibility and simplified parsing. If well-specified and interop-tested early and often, I imagine such a protocol could be implemented in framework form in a tenth the size of an HTTP server framework. In even more constrained cases, a very specific constrained transfer protocol implementation (e.g. one which supported a subset of CRUD operations and did no optional features) could be 1/100th the size of a simple HTTP Web server. That starts to be an appropriate size for burning to a chip and shipping in a $1.00 to $10.00 device.
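As a thought experiment, here is what the parsing side of such a very specific constrained protocol might look like, sketched in Python. The wire format is entirely hypothetical (a 4-byte fixed header with a method code, a path length and a payload length, followed by the path and payload), and the two-operation subset of CRUD is my own choice for illustration.

```python
import struct

# Hypothetical subset of CRUD operations, identified by a one-byte code.
METHODS = {1: "READ", 2: "UPDATE"}

# Hypothetical fixed header: method (1 byte), path length (1 byte),
# payload length (2 bytes), all in network byte order with no padding.
HEADER = struct.Struct("!BBH")


def encode_request(method_code, path, payload=b""):
    """Build a request; paths are limited to 255 ASCII bytes by the header."""
    path_bytes = path.encode("ascii")
    return HEADER.pack(method_code, len(path_bytes), len(payload)) + path_bytes + payload


def decode_request(data):
    """Parse a request, rejecting anything that is not exactly well-formed."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    method_code, path_len, payload_len = HEADER.unpack(data[:HEADER.size])
    if method_code not in METHODS:
        raise ValueError("unsupported method")
    if len(data) != HEADER.size + path_len + payload_len:
        raise ValueError("length mismatch")
    path = data[HEADER.size:HEADER.size + path_len].decode("ascii")
    payload = data[HEADER.size + path_len:]
    return METHODS[method_code], path, payload
```

For example, `decode_request(encode_request(1, "/temp"))` returns `("READ", "/temp", b"")`. There are no quoting rules, no optional headers and only one way to express the length, which is where most of the size savings would come from.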

I tried to get some sanity checking on my estimates, and it's not easy. A Web server can sometimes be a wrapper on a file system, in which case most of the logic involved comes "free" from the file system. For example, a simple implementation of one style of redirect might be a trivial number of lines of code if the underlying file system supports symlinks. Andrew McGregor pointed me to the Contiki operating system for embedded objects, which has a Web server already. So that's a proof that embedded and limited devices can sometimes do straight HTTP.

In sum, if interoperability, flexibility and executable size are all simultaneous concerns, HTTP as-is will pose problems. While it may not be necessary for all devices to use a more space-efficient REST transfer protocol than HTTP, it may be necessary for some. If it is necessary to define a new transfer protocol, it should be possible to design one an order of magnitude smaller just by cutting features and unneeded options from HTTP, and possibly smaller yet by being very careful about parsing and limiting extensibility.

More later on REST itself scaling down.

3 comments:

Mark Nottingham said...

I agree that HTTP as-is may not be appropriate for limited devices (although it's interesting to observe the mobile world, where just a few years ago HTTP+HTML was judged inappropriate; look at us now!).

However, I'm wary of limiting its semantics. E.g., you say 504 isn't useful; however, lots of my automated agents send Cache-Control: only-if-cached, where a 504 response means that the requested representation isn't in cache, allowing them to do a cheap lookup easily.

I think a better approach -- if you want to base this on HTTP -- would be to maintain fidelity to HTTP's semantics, while making its serialisation more efficient. The SPDY experiment by Google is showing some promise in this regard; compare the complexity of a HTTP parser vs. a SPDY parser by comparing http_common and spdy_common in nbhttp.

BTW, IME client implementations -- especially full-featured ones -- are much trickier and more complex than servers. YMMV, of course.

Sam said...

Are today's devices really that limited? I believe the value in terms of developer friendliness vastly outweighs the resource cost, though it does make sense for such interfaces to constrict HTTP to the bare minimum (for example by enumerating valid HTTP status codes and what they mean in context). If you're worried about overhead then SPDY is one option - and certainly a better one than rolling your own protocols or using something relatively difficult to work with like SNMP.

sam said...

I think the rigid client-server, request-response nature of HTTP is also a large problem. A simple peer-to-peer protocol allowing unsolicited msgs from either side would be better. Or at least a subscribe model. It's very hard to use HTTP to request asynchronous updates; AJAX-style webapps jump through enormous hoops to deal with this limitation, and semi-autonomous loosely connected wireless devices are going to find it even worse. I've never understood why we can't just use a (possibly chopped down) zigbee stack.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.