Monday, April 19, 2010

Should a constrained device be a RESTful server or a client? I had been assuming server, and I think I can justify this, although I'm not saying that choosing to put the low-power device in the client role is wrong, because it may depend on constraints and use cases. I do think it's a bad idea to require both client and server roles in the more constrained devices so I'm treating this as an 'either' choice, not and/or. Here is my rough analysis, as part 3 in a series on designing a REST framework for constrained devices (parts 1 and 2).

Roles: Reactive, storing, resource owner

To begin with, let's separate the three roles commonly bundled into the definition of a "server". The first role: "a server is a reactive process" [Andrews, 1991], while a client is a triggering process. The second role: the server stores authoritative versions of resources. Most client/server remote file systems give the server both of these roles: the server reacts to requests and enacts storage operations, while the client is the triggering process making requests; however, the client also controls the namespace and places constraints on resource state. These client/server file systems scale moderately well but are inflexible and hard to extend compared to HTTP. In REST, the server takes on a third role: the server manages the namespace and resource state.

It's not clear how much mixing and matching of these roles works in practice. Would it work reasonably well to keep the resource and namespace ownership together with the storage, but to make that agent the triggering process instead of the reactive process? I don't know of examples of that kind of system in practice, nor do I know how to analyze a theoretical system against fuzzy goals like "flexible" and "scalable". But in the discussion of which roles to assign to the low-power device, I try to keep the three roles separate in case that helps shed light on questions like "should the low-power device proactively send requests such as notifications".

Benefits to consider

1a. Continuity: A server in charge of its own storage, namespace and resource model can be installed and host the same resources, responding to requests at any time, for years without changes. The conditions of its use can even change within limits. A Web site that, when launched, handles a few browser requests a day for certain resources, can later have some "mashup" service querying those same resources at automated intervals and extracting the data. Automated clients don't have this ability to be used in different ways without changing their configuration or code. Applying this to our low-power example: a sensor can handle requests from COAP gateways during normal functioning and by laptop-based clients during configuration, testing or development of new applications, without needing to know why it is handling any request (modulo authorization [1]).

1b. Flexibility: The flip-side of continuity is that features can be added. A device can be upgraded without disrupting the operation of the other low-power devices around it, because an upgraded device can host the same resources as the original device, plus new resources.

2. Scaling: In previous posts, I talked about how scaling large is related to scaling small, due to the relationship between power and load. The flip-side of scaling up to a large load handled with a fixed amount of memory and processing power (the normal problem for HTTP servers), is scaling down the memory and processing power for a fixed load (the scaling problem for low-power devices). In the Web today, what scales better, HTTP servers or HTTP clients? It's well-known that HTTP servers scale well, but there's little concern for clients scaling. It is hard to write a program that load-tests HTTP servers by implementing many clients over many connections -- all too often, the load-testing client machine(s) run out of resources before the servers do.

In HTTP, the server is stateless (see where Roy's dissertation describes "Client Stateless Server" and "Layered Client Cache Stateless Server" for the fuller picture), but the client may not be. The client needs to figure out which servers to talk to, what resources to request from the server, how to interpret those resources and what to do next. In Web browsers, it's the human user who makes many of those choices and may have mental state. An automated client might well include a state machine as the client attempts to achieve a goal involving multiple resources and perhaps even multiple servers. At a minimum, the client knows "What page I'm on now" and "what context did I make a resource request in" as part of its state. In contrast to the client, the server can be stateless, reactive, and doesn't need to know who it's going to communicate with as long as they're authorized [1]. With good application design, a server ought to be implementable with very little use of dynamic memory since it is stateless.
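To make the statelessness point concrete, here is a minimal sketch (the resource names and the tiny handler interface are invented for illustration, not any real framework): the response is a function of the request and the server's own resource state alone, with no per-client session memory to allocate or track.

```python
# Sketch of a stateless reactive server: every request carries all the
# context the server needs, so the handler is a pure function of the
# request plus the server's own resources. No per-client state survives
# between requests. Names and resources are illustrative only.

def handle(method, uri, body=None):
    # The server consults only its own resource state, never a session.
    resources = {"/temp": "21.5"}   # stand-in for sensor-owned resources
    if method == "GET" and uri in resources:
        return (200, resources[uri])
    if uri not in resources:
        return (404, None)
    return (405, None)              # known resource, unsupported method

# Identical requests from different, unknown clients get identical
# answers; the server never asks "who is this?" beyond authorization.
assert handle("GET", "/temp") == (200, "21.5")
assert handle("GET", "/nope") == (404, None)
```

Because the handler holds no conversation state, it needs essentially no dynamic memory between requests, which is exactly the property a constrained device wants.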

3. Naturalness of resource model: the most natural thing seems to be to model sensor readings and the capabilities of the most constrained devices as resources. Further, if the sensor is used for many years without being upgraded, those resources can have extremely stable URIs. The most natural agent to own these resources is the sensor. This relates to the flexibility benefit because of the naturalness of extending the resource model. A v1 sensor can have a resource with a set of readings, and a new device with additional readings can simply have additional resources. A multi-purpose sensor can have all the resources that each single-purpose sensor would have.
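As a toy illustration of that extensibility (the URIs and reading values here are invented for the example), a v2 device can carry the v1 namespace forward unchanged and merely add to it:

```python
# Hypothetical illustration of the "natural resource model": a v1
# sensor owns a small, stable namespace, and a v2 device extends it by
# adding resources rather than renaming or changing existing ones.
sensor_v1 = {"/readings/temperature": lambda: 21.5}

sensor_v2 = dict(sensor_v1)                      # same stable URIs...
sensor_v2["/readings/humidity"] = lambda: 0.46   # ...plus new ones

# A client written against a v1 device keeps working against v2.
assert sensor_v2["/readings/temperature"]() == 21.5
```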

4. Naturalness of user model: User interfaces are on the side of the triggering process or client as the user initiates requests for information or changes of state, and their client sends that out. It will be very rare for a human to push a button or otherwise interact directly with a sensor and cause it to act as a client. Much more common will be human requests on a laptop or control panel, which can act as a client and send a request over the local network for sensor information.

To conclude, I think these roles and benefits fit together sensibly for a particular kind of constrained device. I am probably focusing too much on a particular idea of a constrained device and its use cases, so I'm happy to admit that this is not the answer for all devices and use cases and would be interested in hearing analysis about other cases. I'm also getting some good comments to the past two posts, so at some point I foresee a post in this series just to discuss the comments.

[1] Authorization does not necessarily break any of the assumptions of statelessness. For example, authorization decisions could be pushed down to a lower layer in the simplest cases: the authorization to communicate with a sensor is the same authorization needed to request its sensor data. In more complex cases, the authorization decisions can be explicitly made above the COAP layer but this is more code and storage on the low-power device. In any case, this investigation is starting to make me think that authorization should either be a network function or an application function, not a COAP function or at the resource transfer layer.

Thursday, April 08, 2010

REST, with proper use of hypermedia, can be very appropriate for constrained devices. In my last post, I talked about how HTTP has a lot of cruft that could be removed if one were to design a HTTP-lite for constrained devices. The REST architectural style works for constrained devices a lot better than the HTTP syntax and HTTP feature list do. Naturally I'm working from Fielding's thesis, but some different points and emphases are warranted for the context of constrained devices interacting with automated agents, as contrasted to the user deciding which HTML link to click on to interact with a Web server. In order to dive into this, we'll need to really understand how REST requires using server-controlled hypermedia to advertise and navigate the server's resources.

REST involves
  • Stateless design. That means no state machine thinking! The client can't assume server state. [1]
  • A uniform interface on resources, similar to CRUD
  • Caching support for resource representations
  • Navigation to server-named resources via server-controlled hypermedia

The last point is often misunderstood. Despite the fact that we call HTTP client messages requests, it's far more common for programmers and engineers to treat them as commands or instructions (SOAP and RPC illustrate this very well). Many APIs designed to be used over the Web assume that the client decides what should be done and tells the server to do it. This isn't RESTful, and it's not just a quibble: putting the client in control of the interactions puts the client in control of the server's performance and ability to scale.

I didn't understand this when I first read Roy's thesis or when I first worked on WebDAV. I needed examples to make sense of the principle and its utility, particularly when the server does not use HTML and the client is not driven by an active user clicking on links. Thus, this post has examples of non-RESTful navigation and state discovery, as well as RESTful examples:
  • The Twitter API: not RESTful in resource naming or discovery
  • WebDAV: somewhat RESTful but made a few mistakes in extending HTTP
  • Atom: quite RESTful application that sits very lightly and comfortably on HTTP

The Twitter API has fixed resources. Those resources have URIs which are known to the client in advance. For example, the account profile update resource is named "". This limits the Twitter API's ability to rejigger its namespace to add features or to scale by separating resources along a different axis. Next, the client must know a set of permanently-named parameters to be used in a POST request to that URI, to update the profile on the account. The programming model is clearly that the Twitter client controls the Twitter server by sending commands over HTTP POST. The idea of CRUD is vaguely there (many of the API resources can be retrieved in full with GET, others updated with POST) but diluted by the invention of special update URLs. Caching is possible, at least.

WebDAV is an IETF standard that follows the idea of a limited set of consistent methods well, but doesn't do resource discovery by server-controlled hypermedia. WebDAV set out to provide more authoring functionality, such as operations to organize resources and interact with a file system model, so that Webmasters and other Web content creators could have something better suited to Web development than FTP. The WebDAV designers created the PROPFIND request, which works much the way file queries work in most remote file systems. In other words, it puts control over querying server resources in the hands of the client. The client determines the scope of the request and the list of properties to be returned, and in the first implementations of WebDAV, the server had little choice but to comply or fail entirely. WebDAV servers had lots of code to handle all the possible variations on the PROPFIND request, parse its body, and cope with the differing performance characteristics of the properties that might be requested. All the information needed to answer any PROPFIND request a client might construct had to be available on every server that might receive one. Naturally, this approach seriously affected performance and scaling.

Atom feeds were the first standard I saw that did user-free resource discovery via server-controlled hypermedia. An Atom feed is a document offered by the server in XML format, containing links in semantic markup. So, instead of querying the server for blog entries that match a certain pattern, the client simply asks for a feed document via GET. With Atom, the server can precompute or lazily compute feed documents, can cache them, and can break them down when a feed gets large. This design leaves the scope and detail level of collection membership documents entirely in the server's hands.
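Here is a sketch of what that client side looks like, using a toy feed document: the only things the client knows in advance are the feed's URI and the Atom link markup. Every further URI comes from the server.

```python
# A client navigating by server-controlled hypermedia: instead of
# constructing query URIs itself, it GETs a feed document and follows
# the links the server chose to advertise. Feed content is a toy
# example; a real client would fetch it over the network.
import xml.etree.ElementTree as ET

FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>sensor readings</title>
  <link rel="next" href="/readings?page=2"/>
  <entry><link rel="alternate" href="/readings/42"/></entry>
</feed>"""

ATOM = "{http://www.w3.org/2005/Atom}"
root = ET.fromstring(FEED)

# Collect every href the server offered; the client's next request can
# only be one of these, so naming stays entirely in the server's hands.
hrefs = [link.get("href") for link in root.iter(ATOM + "link")]
print(hrefs)  # ['/readings?page=2', '/readings/42']
```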

More examples can be found at a site that classifies APIs along a continuum of RESTfulness. My quibble with that site is that not all APIs fall along that linear spectrum. WebDAV would have a green box for RESTful identification of resources and a green box for self-descriptive messages, but it would have mixed results for manipulating resources through representations, because collections are not manipulated through representations. It would also have mixed results for using hypermedia "as the engine of application state" because the client makes assumptions about giving resources URIs based on what collection they were put in.

Clearly I need to work on a third post in this series and perhaps a fourth, because I would still like to talk about why constrained devices should operate in the role of the server and provide documents similar to Atom feeds about their state and data. I also have thoughts about a hypothetical framework and how it could be applied in a specific type of constrained device in a way that may not need any dynamic application memory. Hope you can stand to wait.

[1] I think part of what took me so long to understand this was the amount of meaning packed into the word "state". Not only does it mean "what state is the server in, among the states in its state machine", but it also means "what are things named" and "what things exist". So the injunction that clients shouldn't presuppose state, once unpacked, also means that clients shouldn't presuppose the names or existence of resources.

Sunday, April 04, 2010

It's not entirely intuitive, but the same architectural style that allows Web servers to handle large-scale usage and iterate so successfully may also help constrained devices (like sub-$10-material-cost sensors) to handle small-scale loads and power usage and avoid being upgraded for many years.

A lot of discussions of using REST in constrained devices have glossed over whether it's really interoperable HTTP that's being used (e.g. compressed HTTP) and whether the low-power device would be in the role of the client or the server. My opinion at this point is that for truly constrained devices, compressing HTTP itself would be the wrong choice, and that the device must be in the role of the server. The rest of this series explores why I believe that and should lay open the assumptions that might allow somebody to correct my information and revise my opinion.

The first part of the series attempts to explain why HTTP is the wrong transfer protocol choice for some constrained devices. In theory it's a massively extensible protocol, with ways to extend the operation set, upgrade versions, add status responses, add headers, add resource content types and so on. Again in theory, almost all headers can be combined with each other and with different methods unless forbidden. In practice, the deployed base makes some HTTP extensions, and particularly the combinatorial expansion of extensions and optional features working together, quite difficult, and much special-case code must be written to handle the cases where widely-deployed software fails to implement some detail correctly. Here are a few examples of problems we've had over the years.

1. In developing WebDAV, we tried to use the standard "OPTIONS *" request (the special-case URI '*' means "give me the options or capabilities of the server"). However, we found that this is poorly implemented. Java servlets, for example, did not support the '*' URI. Other HTTP frameworks supported the '*' URI but did not allow the response to OPTIONS * to be extended by software extending the HTTP framework. Ugh.

2. HTTP has several different ways to indicate the content length (for historical reasons as well as for different use cases). One is the MIME type "multipart/byteranges". Another is chunked transfer encoding, where the server (and thus the client) does not need to know the total length until the last chunk is transferred. A third is "Content-Length". A fourth is for the server to terminate the connection, although this feature opens up the possibility of truncation attacks. Some of these mechanisms work poorly with persistent connections. Both clients and servers have to support almost all of these.
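As an illustration of why each side must carry several framing implementations, here is a sketch of just one of them, a chunked transfer decoder (ignoring chunk extensions and trailer headers for brevity):

```python
# Chunked transfer framing, sketched: each chunk on the wire is
# "<hex length>\r\n<bytes>\r\n", and a zero-length chunk terminates
# the body, so neither side needs the total length up front.
# This toy decoder ignores chunk extensions and trailers.
def decode_chunked(data: bytes) -> bytes:
    body, pos = b"", 0
    while True:
        crlf = data.index(b"\r\n", pos)
        size = int(data[pos:crlf], 16)     # chunk size is hexadecimal
        if size == 0:                      # last-chunk marker
            return body
        start = crlf + 2
        body += data[start:start + size]
        pos = start + size + 2             # skip the chunk's trailing CRLF

wire = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
assert decode_chunked(wire) == b"Wikipedia"
```

And this is only one of four framing mechanisms; a general-purpose library has to implement them all and decide which applies to each message.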

3. Both absolute and relative URIs are allowed in several locations, even though they're not a good idea in all of them. Resolving a relative URI can be tricky and complicates client software. Implementations make it worse; e.g. the "Location" header was defined to allow only absolute URIs, but many implementations use their generic URI parsing or generating code to allow relative URIs in that field as well.
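The corner cases in RFC 3986 reference resolution (dot-segments, scheme-relative references) are exactly what hand-rolled client code tends to get wrong; the hostname here is invented for the example:

```python
# Relative URI resolution per RFC 3986. Each of these cases follows a
# different rule, which is why ad-hoc string concatenation in clients
# so often produces wrong URIs.
from urllib.parse import urljoin

base = "http://device.example/sensors/temp/current"

# Plain relative reference: replaces the last path segment.
assert urljoin(base, "latest") == "http://device.example/sensors/temp/latest"
# Dot-segments: climb the path hierarchy.
assert urljoin(base, "../humidity") == "http://device.example/sensors/humidity"
# Scheme-relative reference: replaces the whole authority.
assert urljoin(base, "//other.example/x") == "http://other.example/x"
```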

4. Parsing headers with quoting, separator (comma and semicolon), whitespace and continuation rules is difficult. Some headers defined in specifications other than RFC2616 even used the syntax rules incorrectly and have to be special-cased.
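A sketch of just one of these wrinkles: a comma can be a list separator or ordinary data inside a quoted-string, so a naive split() corrupts values. (This toy parser ignores backslash-escaped quotes, one more rule a real implementation must handle.)

```python
# Splitting a comma-separated header value while respecting
# quoted-strings. A plain value.split(",") would break the third item.
def split_header_list(value: str) -> list[str]:
    items, buf, in_quotes = [], "", False
    for ch in value:
        if ch == '"':
            in_quotes = not in_quotes      # enter/leave a quoted-string
            buf += ch
        elif ch == "," and not in_quotes:  # separator only outside quotes
            items.append(buf.strip())
            buf = ""
        else:
            buf += ch
    items.append(buf.strip())
    return items

hdr = 'text/html;q=0.8, application/json, x/y;note="a,b"'
assert split_header_list(hdr) == [
    "text/html;q=0.8", "application/json", 'x/y;note="a,b"']
```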

5. The support for the Expect header and 100 Continue response has never been good. In theory it's required, so there's no way for a client to ask whether a server supports it; thus the client could end up waiting quite a while before giving up on the server's initial response. Instead, quite a few clients ignore the specification text and start sending their request bodies right after the headers (including the "Expect" header), without waiting for the server's 100 Continue response. This kind of feature also makes it harder to integrate authentication (what happens if the client uses this header when an authentication handshake needs to be initiated?) and connection/transport features, as well as to implement intermediaries.

There are many more examples in a long issues list, and some of the discussions of these issues get quite lengthy in considering how features are actually implemented and how they work in combination with other features.

Developing a very simple HTTP client is easy, particularly if the client is talking to a single known HTTP server or is making only a small number of kinds of requests. Developing a limited-use HTTP server can be fairly simple too, although it gets significantly more complicated if the developer actually tries to handle all RFC2616 features and all unknown headers and methods properly. What turns out to be very hard is building an HTTP client library, server library or extensible server, because these are used in unexpected ways. It's easy for a developer using one of these libraries to break things, e.g. by manually adding a header that doesn't work with the headers added by the library. The library has to support operation both with and without TLS, several kinds of authentication, several extensibility mechanisms and many failure modes.

The HTTP Components project at Apache talks about the many flaws and excessive simplicity of most HTTP client libraries, and states that "This is one reason why Jakarta, and other free and commercial vendors, have implemented independent HTTP clients". In other words, code re-use is seriously reduced by the way HTTP must be implemented. Some software companies are still selling HTTP client libraries to fill implementation gaps.

General-purpose HTTP servers -- ones that work for a large number of resources, a large number of requests, support TLS, content-negotiation, cache-control, redirect, authentication and other major features -- are even harder. When well implemented, HTTP server farms scale tremendously well. But much, much effort has gone into making those work.

When we look specifically at constrained devices, we see a much more limited set of use cases than the overall Web offers.
  • Documents are not large! Thus, document range and response continuation features are not desired. Only one transfer-encoding should be necessary, and conditional request features won't be worthwhile.
  • Documents from constrained devices are not intended for direct user consumption. There is no need for language and charset negotiation.
  • Even negotiating content type may be rare. A constrained device will state what it supports and not negotiate.
  • A constrained device will never act as a proxy, so it need not support Via headers or a batch of status codes like 504. Further, if we define a non-HTTP, non-proxied transfer protocol that can be converted to HTTP at the first proxy step (or converted from HTTP at the last step), then the non-proxied transfer protocol doesn't need any proxy features at all.
  • Redirects are not necessary or could at a minimum be drastically simplified, making most of the 300 status responses unnecessary.
  • All of the 100 level informative status responses are unnecessary.
  • The authorization model for constrained devices is quite different than the authentication model assumed in HTTP (and associated logic like the 401 status code behavior is therefore unnecessary). Access control will be simpler than in current Web servers.

Imagine that instead of a messy HTTP client library or server stack, we had a protocol with about a third of the features, less extreme extensibility and simplified parsing. If well-specified and interop-tested early and often, I imagine such a protocol could be implemented in framework form in a tenth the size of an HTTP server framework. In even more constrained cases, a very specific constrained transfer protocol implementation (e.g. one which supported a subset of CRUD operations and did no optional features) could be 1/100th the size of a simple HTTP Web server. That starts to be an appropriate size for burning to a chip and shipping in a $1.00 to $10.00 device.
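As a thought experiment (the operation names, status set and resources here are invented for illustration, not any real protocol or wire format), the entire method-dispatch logic of such a stripped-down protocol might be little more than this:

```python
# A stripped-down transfer protocol's whole dispatch: a CRUD subset,
# no negotiation, no redirects, no 1xx/3xx machinery, and a closed,
# tiny status set. Everything here is hypothetical illustration.
READINGS = {"/temp": b"21.5"}          # the device's entire namespace

def dispatch(op: str, uri: str, payload: bytes = b""):
    if uri not in READINGS:
        return ("not-found", b"")
    if op == "read":
        return ("ok", READINGS[uri])
    if op == "update":
        READINGS[uri] = payload
        return ("ok", b"")
    return ("bad-request", b"")        # unknown operation

assert dispatch("read", "/temp") == ("ok", b"21.5")
assert dispatch("read", "/x") == ("not-found", b"")
```

With no optional features to combine, there is no combinatorial explosion of cases to test, which is where the size estimate above comes from.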

I tried to get some sanity checking on my estimates, and it's not easy. A Web server can sometimes be a wrapper on a file system, in which case most of the logic involved comes "free" from the file system. For example, a simple implementation of one style of redirect might be a trivial number of lines of code if the underlying file system supports symlinks. Andrew McGregor pointed me to the Contiki operating system for embedded objects, which has a Web server already. So that's a proof that embedded and limited devices can sometimes do straight HTTP.

In sum, if interoperability, flexibility and executable size are all simultaneous concerns, HTTP as-is will pose problems. While it may not be necessary for all devices to use a more space-efficient REST transfer protocol than HTTP, it may be necessary for some. If it is necessary to do a new transfer protocol, it should be possible to design a protocol an order of magnitude smaller just by cutting features and unneeded options from HTTP; and possibly smaller yet by being very careful about parsing and limiting extensibility.

More later on REST itself scaling down.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.