Sunday, November 07, 2004

Writing protocol standards is hard work, harder than writing specifications, although they are similar tasks. One of the reasons is that you have to describe the protocol in sufficient detail that somebody who wasn't involved in the process and has different software experience (different features, different user interactions, different architecture, different platform or different programming languages) can still implement the standard and interoperate with other implementors. (Actually it's so hard to do this that no standard gets it "right". At the IETF we're well aware that we do successive approximations, first doing internet-drafts and then doing RFCs at different stages of maturity.) But we can at least try to do it right, and a proper attempt takes a lot of work, including:
  • A description of the model
  • Implementation requirements
  • Examples of protocol usage
  • Definitions/schemas
Often these will seem redundant with each other but they're all important.

The model

The model is key for first-time readers and for people who need to know something shallow about the protocol. There are different kinds of models that are important for protocols, and some of them are described (and examples given) in one of Ekr's works-in-progress:
  • The protocol messaging model. Do messages have headers and bodies, or do they have XML element containers? Does the server respond to messages in the same connection? In a fixed order? Can the server originate messages?
  • The protocol state machine. Are there different states (e.g. pre-handshake, pre-authentication, and main state)?
  • The protocol's data model. What data types are there and what relationship do they have to each other -- folders and messages and flags (IMAP), or collections, resources and properties (WebDAV)?
  • The addressing model, which is almost part of the data model. In SIMPLE you can address other people whereas in XMPP you can address not only human actors but specific software instances running on behalf of those humans. And not to be speciesist, non-humans as well.
There are probably other kinds of models, but I've seen examples where each of these could have been explained better. It took me a while to understand IMAP annotations because I didn't factor in the part of the model where each annotation might have several different values depending on the authentication id used to access the value.
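To make the "protocol state machine" item concrete, here is a minimal sketch of an explicit transition table, loosely based on IMAP's connection states. The table is a simplification that only models successful commands, and the names `State`, `TRANSITIONS` and `next_state` are made up for illustration; none of this is taken from any RFC text.

```python
from enum import Enum, auto


class State(Enum):
    NOT_AUTHENTICATED = auto()
    AUTHENTICATED = auto()
    SELECTED = auto()
    LOGOUT = auto()


# Simplified, hypothetical transition table: (current state, command) -> next state.
# Only successful commands are modeled; failures and server-initiated
# disconnects are left out on purpose.
TRANSITIONS = {
    (State.NOT_AUTHENTICATED, "LOGIN"): State.AUTHENTICATED,
    (State.AUTHENTICATED, "SELECT"): State.SELECTED,
    (State.SELECTED, "SELECT"): State.SELECTED,
    (State.SELECTED, "CLOSE"): State.AUTHENTICATED,
    (State.NOT_AUTHENTICATED, "LOGOUT"): State.LOGOUT,
    (State.AUTHENTICATED, "LOGOUT"): State.LOGOUT,
    (State.SELECTED, "LOGOUT"): State.LOGOUT,
}


def next_state(state, command):
    """Return the state after a successful command, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"{command} is not valid in state {state.name}") from None
```

A table like this, even in prose form, tells a reader at a glance which commands make sense when, which is the kind of thing a model section is for.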

The model is important not just for first-time readers and shallow users but also later on for deep users who want to extend the protocol. HTTP has been extended in many ways by people unfamiliar with the way the model is supposed to work. For example, HTTP normally uses the Content-Type to declare the type of the message body, just as one would expect from a concept borrowed from MIME and a messaging system. However, one extension to HTTP (now part of HTTP/1.1, RFC 2616) breaks that model by applying an encoding to the body, and that encoding is specified in a different header. So if that feature is used, the Content-Type no longer strictly describes the bytes in the message body. RFC 3229 moves further away from the MIME-like model as it extends HTTP -- it defines an alternative model, where the Content-Type refers to the type of the resource that is addressed. So now of course there's a schism in the HTTP community about which is the best model to proceed with, to the point of having academic papers written about the alternative models. More clarity about the model in the first place would have helped first-time readers of the HTTP spec and might also have meant fewer problems with these extensions.
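A small sketch of the first mismatch, assuming the "different header" meant here is Content-Encoding with gzip. The helper names `encode_body` and `decode_body` are hypothetical, and the decoder assumes UTF-8 rather than parsing the charset parameter.

```python
import gzip


def encode_body(text):
    """Build headers and a gzip-encoded body for a plain-text payload."""
    body = gzip.compress(text.encode("utf-8"))
    headers = {
        # Describes the payload *before* the content coding is applied.
        "Content-Type": "text/plain; charset=utf-8",
        # Describes what was done to the payload to produce the bytes on the wire.
        "Content-Encoding": "gzip",
    }
    return headers, body


def decode_body(headers, body):
    """Undo the content coding first; only then does Content-Type apply."""
    if headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
    # Assumption for this sketch: the charset is always UTF-8.
    return body.decode("utf-8")
```

The bytes actually transmitted are gzip data, not text, so a receiver that reads only Content-Type and treats the body as text/plain gets garbage. The type is only true of the body after the second header has been honored.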

Finally, a clear model helps implementors remember and understand each of the requirements. Humans have trouble fitting a bald list of requirements into some memorable pattern, so give implementors a mental model (or several) and they'll do so much faster, with less confusion and fewer mistakes.


The requirements are deeply important, as much so as the model. At the IETF we place so much importance on the wording of requirements that we have a whole standard describing the wording of requirements. Why?

First, models can be interpreted differently by different people. This can happen very easily. IMAPv4 was originally defined in RFC 1730 and there was a lot of text about the model, particularly the different states. However, a lot of people implemented the details differently and RFC 2060 had to get more specific. Finally, RFC 3501 revised RFC 2060, and most of the changes made in RFC 3501 were simply clarifying what the consequences of the model were for various cases -- because implementors made different assumptions, came to different conclusions, and argued persistently about the validity of their incompatible conclusions. Chris Newman explained this to me today when the topic of models + requirements came up, and he should know -- he authored/edited RFC 3501.

Second, a model explains how things fit together, whereas requirements explain what an implementation must do. Implementors are human and operating under different pressures, so it is easy for implementors to read a lot of flexibility into the model and the examples. Clients want to believe that servers will do things similarly (makes their logic easier) so they tend to assume that is the case. So when things are flexible, they must be explained to be so, to encourage client implementors to account for differences. E.g. RFC 3501 says

"Server implementations are permitted to "hide" otherwise accessible mailboxes from the wildcard characters, by preventing certain characters or names from matching a wildcard in certain situations."
When things aren't flexible, the document needs to say so, so that implementors aren't given any wiggle room or room for confusion. In RFC 3501 we see

The STATUS command MUST NOT be used as a "check for new messages in the selected mailbox" operation
This text is much stronger than saying that the "STATUS command requests the status of the indicated mailbox" (that sentence is also in RFC 3501). It's even stronger than saying that the STATUS command isn't intended as a way to check for new messages. (It might be even clearer to say that "client implementations MUST NOT use the STATUS command..." but this is good enough.) IETF standards-writers and implementors have learned painfully that they need to use well-defined terms in attention-getting ALL CAPS in order to keep implementors from misunderstanding, wilfully or accidentally, whether something is a requirement.
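One way to see the difference a MUST NOT makes is that it can be turned directly into a check in client code. The sketch below is a deliberately conservative reading of the STATUS requirement quoted above: it refuses any STATUS on the currently selected mailbox. `MailboxSession` is a hypothetical class, the command strings skip tags and mailbox-name quoting, and mailbox names are compared as exact strings, which ignores the case-insensitivity of INBOX.

```python
class MailboxSession:
    """Hypothetical client-side session that tracks the selected mailbox."""

    def __init__(self):
        self.selected = None

    def select(self, mailbox):
        """Remember the selected mailbox and return the command text to send."""
        self.selected = mailbox
        return f"SELECT {mailbox}"

    def status(self, mailbox, items=("MESSAGES", "UNSEEN")):
        """Build a STATUS command, refusing the currently selected mailbox."""
        if mailbox == self.selected:
            # Conservative enforcement of the MUST NOT quoted from RFC 3501.
            raise ValueError(
                f"STATUS must not be used to poll the selected mailbox {mailbox!r}"
            )
        return f"STATUS {mailbox} ({' '.join(items)})"
```

A sentence that merely describes what STATUS does gives a client author nothing to enforce; the MUST NOT gives them a line to draw in code.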

A few more reasons why requirements are needed:
  • Requirements often add more detail than the model should hold. Since the model should be high-level and readably concise, it can't be expected to define all behaviors.
  • Sometimes requirements are examples of the conclusions that somebody would draw if they fully understood the model and all its implications. These have to be complete, however, not only selected examples, because no two people have the same full understanding of the model and all its implications. The requirements help people go back to the model and understand it the same way.
  • Human readers need repetition in order to understand things. Sometimes the requirements restate the model in a different form, and that's fine. When essay writers want their audience to understand they say what they're going to say, say it, then say what they said. We can make our standards more interoperable by balancing that approach with our typical engineering love of elegance through avoiding redundancy. Humans aren't computers, so the engineering avoidance of redundancy in code isn't fully applicable to human-readable text.

Examples are, thankfully, better understood. It's pretty rare to see a protocol go to RFC without a few good examples. Readers expect and demand them (more so than the model or requirements) because we know from reading many kinds of technical documents how useful examples are. I hope I don't need to justify this too much. In fact, I find I need to do the opposite and remind people that examples do not replace requirements or models. Implementors need examples to understand the requirements and models, but they can easily draw conclusions from examples that are counter to the requirements and don't fit in the model. When a specification has an inconsistency between a requirement and an example, trust most developers to implement to match the example, not the requirement.


Definitions and schemas also tend not to need much justification in a techie crowd. We're attracted by the idea of having absolute certainty about what's valid by trusting a program to compare an example to a definition or schema and validate it. So once again, I have a caveat to offer rather than a justification: make sure that definitions or schemas are put in context very carefully. Can an implementor use the schema to validate incoming XML and reject anything that doesn't match the schema? Probably not, or else it would be impossible to extend the protocol. Early WebDAV implementors built XML schema validators into their servers and rejected client requests that extended the protocol in minor ways that should have been compatible, so I'm taking this lesson from actual experience.
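As a rough illustration of the alternative to strict validation, the sketch below parses a WebDAV-style propfind request body and sets aside child elements it does not recognize instead of rejecting the whole request. The function name `parse_propfind`, the set of known children and the extension namespace are assumptions made for this example, not a description of any particular server.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"

# Children this hypothetical server understands inside DAV:propfind.
KNOWN_PROPFIND_CHILDREN = {DAV + "allprop", DAV + "propname", DAV + "prop"}


def parse_propfind(xml_text):
    """Return (known, ignored) child tags; unknown extensions are not an error."""
    root = ET.fromstring(xml_text)
    if root.tag != DAV + "propfind":
        raise ValueError("not a DAV:propfind request")
    known = [c.tag for c in root if c.tag in KNOWN_PROPFIND_CHILDREN]
    ignored = [c.tag for c in root if c.tag not in KNOWN_PROPFIND_CHILDREN]
    return known, ignored


# Example: a request carrying one extension element from another namespace.
request = (
    '<D:propfind xmlns:D="DAV:" xmlns:X="http://example.com/ext">'
    "<D:allprop/><X:hint/>"
    "</D:propfind>"
)
known, ignored = parse_propfind(request)
```

Here the request is still processed as an allprop request, and the extension element is simply skipped. A validator that rejected it outright would behave like the early servers described above.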

I certainly can't say that when I'm a protocol author, I succeed in doing all of this. But after eight years reviewing and implementing good and bad protocol specifications, I'm beginning to see what works.

Comments welcome.


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.