Not Invented Here: 2004

Tuesday, December 14, 2004

I went off on a tangent today and wondered how many translations of Marot's poem "A une Damoyselle Malade" could be found on the Web. I thought there would be more because there are nearly a hundred in Hofstadter's book "Le Ton Beau de Marot" alone, and he encourages readers to write their own. Anyway, here's the list I found, using the first line rather than a title to refer to them because that's what I found most memorable:

Petter Hesselberg wrote Min Konfekt in Norwegian and provided an English gloss of that
Mark Irons wrote Little Mole
Craig Clark wrote Precious Girl
David Short wrote Sweet Jeannou
Melanie Mitchell wrote To my sweet (or in her more wordy version, "To be a joy forever to my sweet")
Enno de Witt wrote Teder Wicht in Dutch (gloss, anyone?)
Kat Walsh wrote Sweetest Rose.

Rather than provide my own translation, I wondered what Babelfish would provide. To get a decent approximation I had to remove line breaks (format as a paragraph), replace abbreviations with fully-spelled-out words, and modernize a couple words. My input:

Ma mignonne, je vous donne le bon jour; Le séjour, ce est prison. Guérison recouvrez, puis ouvrez votre porte et que on sorte vite, car Clément le vous commande. Va, friande de ta bouche, qui se couche en danger pour manger confitures; si tu dures trop malade couleur fade tu prendras et perdras le embonpoint. Dieu te donne santé bonne, ma mignonne.

Babelfish spit out (with me re-adding the line breaks as with the original):

My nice,
I give you
the good day;
The stay,
it is prison.
Cure
cover,
then open
your door
and that one left
quickly,
because Clément
orders you.
Goes, fond of delicacies
of your mouth,
which lies down
in danger
to eat
jams;
if you hard
too sick
insipid color
you will take
and lose
the plumpness.
God gives you
good health,
my nice.

Not too bad, really, though I had to trick Babelfish to get it to be even that good.

Sunday, December 12, 2004

It seems that to keep an eye on terrorists and other dangerous people, the government could be more creative in outsourcing the hard work of keeping an eye on citizens. Leave it to an expert -- Santa Claus. He'd just have to expand his "watch list" from kids to include adults as well. Then we could update the famous song and still keep the exact same Christmas spirit:

Don't drive like a jerk,
Or be late paying fees,
Steal Post-Its from work
Or share mp3s,
Santa Claus is coming to town.

He's making a list,
Of those who can't fly,
With guys who made jokes,
In the security line,
Santa Claus is coming to town.

He knows if you've been cheating,
On your taxes or your wife,
He knows if you've been smoking pot,
Three strikes and you're in for life.

Anyway, happy holiday season.

Wednesday, December 08, 2004

Jon Udell just posted his wishlist for a calendar protocol (comparing it to WebDAV-only and to traditional calendar-only protocols), a list that's rather close to explaining why I'm working on CalDAV and what I want it to do. (Jeffrey Harris pointed this out to me and even pinged Jon about it, thanks Jeffrey)

Monday, December 06, 2004

A couple new items in my "stuff I made" pages, both the knitting page and the general page: a crocheted purse (almost clutch size) and knitted lace pillowcase edgings.

Tuesday, November 30, 2004

I'm hiring again at OSAF -- this position is for our very first server developer. I'm looking for somebody who can lead the charge, crank out code, and make the server do us proud.

Monday, November 29, 2004

Data modeling is hard. Some loosely correlated thoughts and links:

Model-driven architecture is rigid, at least with the tools as we know them today.
RDF has a simple basic model but leads to very complex structures, as Adam Bosworth explains.
Pictures express complex relationships relatively readably, like the picture in this paper. Unfortunately we need to translate the pictures into text in order to use these in software and network protocols.
The more complex your picture is, the more unreadable your text is.
Text has to be relatively flat to be readable.
References in data formats are like "goto" jumps in programming -- you lose context.
Maybe if data modelers put a little more thought into flattening their models we'd find them easier to use? This may make the models seem less "rich" but "KISS" is good too.

I don't know how true all of this is, but I'm learning.

The "relatively flat" observation seems to hold at least some validity in data formats, programs and even books. Experienced programmers, with the help of good indenting, can see quickly that they're within an 'else' statement inside a loop inside another loop inside an 'if' statement, but even experienced programmers screw this up sometimes (and even more experienced programmers flatten out the code by delegating some reasonable piece off to another method). Books are better if there's no more than three (maybe four) layers -- chapter, section, sub-section, and even this much organization requires human-readable text to link from one section to another and summarize what a bunch of sections are going to say.
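As a small illustration of the flattening point, here is a sketch in Python. The function names and the sample data are made up for the example. The first version has an 'if'/'else' inside a loop inside another loop inside an 'if'; the second delegates the inner piece to a helper and should return the same result.

```python
def tally_nested(documents, min_length):
    """Nested version: 'if' > loop > loop > 'if'/'else'."""
    counts = {}
    if documents:
        for name, lines in documents.items():
            long_count, short_count = 0, 0
            for line in lines:
                if len(line) >= min_length:
                    long_count += 1
                else:
                    short_count += 1
            counts[name] = (long_count, short_count)
    return counts


def tally_lines(lines, min_length):
    """Helper: the inner loop, pulled out into its own function."""
    long_count = sum(1 for line in lines if len(line) >= min_length)
    return long_count, len(lines) - long_count


def tally_flat(documents, min_length):
    """Flattened version: one level of structure, the detail is delegated."""
    return {
        name: tally_lines(lines, min_length)
        for name, lines in (documents or {}).items()
    }


docs = {"a": ["short", "a much longer line"], "b": []}
```

Both `tally_nested(docs, 10)` and `tally_flat(docs, 10)` count one long and one short line for "a" and nothing for "b", but only the second can be read without tracking four levels of indentation.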

Thursday, November 25, 2004

There's a higher quality of homeless people in Palo Alto. I just saw a hand-lettered sign outside of Whole Foods, explaining that the homeless of Palo Alto need food donated in the holiday season (and a very shifty looking guy collecting food on their behalf). Among the list of requested foods was "organic turkey". No cheap turkey for the Palo Alto homeless, please!

Tuesday, November 16, 2004

Timboy has a good medium-sized post on why meetings suck yet they're indispensable.

Friday, November 12, 2004

A while back I posted on honesty in journalistic bias. This week TechCentralStation has a longer essay about why we might see more openness around bias and why that's a fine thing.

Being aware of bias is something I agree with, but I do worry about blinkered views of the world. Too many people reading some highly biased source will simply not read any opposing source, or do so with only mockery in mind. We've got plenty of polarization, thank you. So my preferred model is journalists who say "Here is my natural bias, and here is me being as unbiased as I can be in covering this topic, through rigorous reasoning and discourse with others who disagree with me."

Sunday, November 07, 2004

Writing protocol standards is hard work, harder than writing specifications, although they are similar tasks. One of the reasons is that you have to describe the protocol in sufficient detail that somebody who wasn't involved in the process and has different software experience (different features, different user interactions, different architecture, different platform or different programming languages) can still implement the standard and interoperate with other implementors. (Actually it's so hard to do this that no standard gets it "right". At the IETF we're well aware that we do successive approximations, first doing internet-drafts and then doing RFCs at different stages of maturity.) But we can at least try to do it right, and a proper attempt takes a lot of work, including:

A description of the model
Implementation requirements
Examples of protocol usage
Definitions/schemas

Often these will seem redundant with each other but they're all important.

The model

The model is key for first-time readers and for people who need to know something shallow about the protocol. There are different kinds of models that are important for protocols, and some of them are described (and examples given) in one of Ekr's works-in-progress:

The protocol messaging model. Do messages have headers and bodies, or do they have XML element containers? Does the server respond to messages in the same connection? In a fixed order? Can the server originate messages?
The protocol state machine. Are there different states (e.g. pre-handshake, pre-authentication, and main state)?
The protocol's data model. What data types are there and what relationship do they have to each other -- folders and messages and flags (IMAP), or collections, resources and properties (WebDAV)?
The addressing model, which is almost part of the data model. In SIMPLE you can address other people whereas in XMPP you can address not only human actors but specific software instances running on behalf of those humans. And not to be speciesist, non-humans as well.
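To make the state machine item concrete, here is a minimal sketch using the three example states named above. The command names (HELLO, LOGIN, FETCH, LOGOUT) and the response strings are invented for illustration and are not taken from any real protocol.

```python
from enum import Enum


class State(Enum):
    PRE_HANDSHAKE = "pre-handshake"
    PRE_AUTHENTICATION = "pre-authentication"
    MAIN = "main"


# Which (hypothetical) commands are legal in each state,
# and the state each one leads to.
TRANSITIONS = {
    (State.PRE_HANDSHAKE, "HELLO"): State.PRE_AUTHENTICATION,
    (State.PRE_AUTHENTICATION, "LOGIN"): State.MAIN,
    (State.MAIN, "FETCH"): State.MAIN,
    (State.MAIN, "LOGOUT"): State.PRE_HANDSHAKE,
}


class Session:
    """Tracks one connection's state and rejects out-of-order commands."""

    def __init__(self):
        self.state = State.PRE_HANDSHAKE

    def handle(self, command):
        key = (self.state, command)
        if key not in TRANSITIONS:
            # The state does not change when a command is rejected.
            return f"BAD {command} not allowed in {self.state.value} state"
        self.state = TRANSITIONS[key]
        return f"OK {command}"
```

Writing the table down like this is roughly what a spec's state diagram does for the reader: it says in one place which commands are valid where, instead of leaving that to be inferred from each command's description.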

There are probably other kinds of models but I've seen examples where each of these could have been explained better. It took me a while to understand IMAP annotations because I didn't factor in the part of the model where each annotation might have several different values depending on the authentication id used to access the value.

The model is important not just for first-time readers and shallow users but also later on for deep users who want to extend the protocol. HTTP has been extended in many ways by people unfamiliar with the way the model is supposed to work. For example, HTTP normally uses the Content-Type to declare the type of the message body, just as one would expect from a concept borrowed from MIME and a messaging system. However, one extension to HTTP (now part of HTTP 1.1 or RFC2616) breaks that model by applying an encoding to the body and that encoding is specified in a different header. So if that feature is used the Content-Type no longer strictly works that way. RFC 3229 moves further away from the MIME-like model as it extends HTTP -- it defines an alternative model, where the Content-Type refers to the type of the resource that is addressed. So now of course there's a schism in the HTTP community about which is the best model to proceed with, to the point of having academic papers written about the alternative models. More clarity about the model in the first place would have helped not only first-time readers of the HTTP spec but also might have helped have fewer problems with these extensions.
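A small sketch of the first break in the model, assuming the "different header" is Content-Encoding (the paragraph above does not name it, so that is an assumption here). The header dictionary and the decode_body helper are invented for illustration. The point is that the bytes on the wire are no longer of the type Content-Type declares until the encoding has been undone.

```python
import gzip


def decode_body(headers, body):
    """Return the bytes that Content-Type actually describes."""
    encoding = headers.get("Content-Encoding", "identity").lower()
    if encoding == "identity":
        return body
    if encoding == "gzip":
        return gzip.decompress(body)
    raise ValueError(f"unsupported Content-Encoding: {encoding}")


# A made-up response: the declared type is HTML, but the body is gzipped.
text = "<p>hello</p>".encode("utf-8")
headers = {
    "Content-Type": "text/html; charset=utf-8",
    "Content-Encoding": "gzip",
}
wire_body = gzip.compress(text)
```

Here `wire_body` is not HTML at all, and a reader who trusts Content-Type alone would misinterpret it; only `decode_body(headers, wire_body)` gives back the HTML bytes.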

Finally, a clear model helps implementors remember and understand each of the requirements. Humans have trouble fitting a bald list of requirements into some memorable pattern, so give implementors a mental model (or several) and they'll do so much faster, with less confusion and fewer mistakes.

Requirements

The requirements are deeply important, as much so as the model. At the IETF we place so much importance on the wording of requirements that we have a whole standard describing the wording of requirements. Why?

First, models can be interpreted differently by different people. This can happen very easily. IMAPv4 was originally defined in RFC 1730 and there was a lot of text about the model, particularly the different states. However a lot of people implemented the details differently and RFC 2060 had to get more specific. Finally, RFC 3501 revised RFC 2060, and most of the changes made in RFC 3501 were simply clarifying what the consequences of the model were for various cases -- because implementors made different assumptions, came to different conclusions, and argued persistently about the validity of their incompatible conclusions. Chris Newman explained this to me today when the topic of models + requirements came up, and he should know -- he authored/edited RFC 3501.

Second, a model explains how things fit together, whereas requirements explain what an implementation must do. Implementors are human and operating under different pressures, so it is easy for implementors to read a lot of flexibility into the model and the examples. Clients want to believe that servers will do things similarly (makes their logic easier) so they tend to assume that is the case. So when things are flexible, they must be explained to be so, to encourage client implementors to account for differences. E.g. RFC 3501 says

"Server implementations are permitted to "hide" otherwise accessible mailboxes from the wildcard characters, by preventing certain characters or names from matching a wildcard in certain situations."

When things aren't flexible, the document needs to say so, so that implementors aren't given any wiggle room or room for confusion. In RFC 3501 we see

The STATUS command MUST NOT be used as a "check for new messages in the selected mailbox" operation

This text is much stronger than saying that the "STATUS command requests the status of the indicated mailbox" (that sentence is also in RFC 3501). It's even stronger than saying that the STATUS command isn't intended as a way to check for new messages. (It might be even clearer to say that "client implementations MUST NOT use the STATUS command..." but this is good enough.) IETF standards-writers and implementors have learned painfully that they need to use well-defined terms in attention-getting ALL CAPS in order to get implementors not to misunderstand, wilfully or accidentally, whether something is a requirement.

A few more reasons why requirements are needed:

Requirements often add more detail than the model should hold. Since the model should be high-level and readably concise, it can't be expected to define all behaviors.
Sometimes requirements are examples of the conclusions that somebody would draw if they fully understood the model and all its implications. These have to be complete, however, not only selected examples, because no two people have the same full understanding of the model and all its implications. The requirements help people go back to the model and understand it the same way.
Human readers need repetition in order to understand things. Sometimes the requirements restate the model in a different form, and that's fine. When essay writers want their audience to understand they say what they're going to say, say it, then say what they said. We can make our standards more interoperable by balancing that approach with our typical engineering love of elegance through avoiding redundancy. Humans aren't computers, so the engineering avoidance of redundancy in code isn't fully applicable to human-readable text.

Examples

Examples are, thankfully, better understood. It's pretty rare to see a protocol go to RFC without a few good examples. Readers expect and demand them (more so than the model or requirements) because we know from reading many kinds of technical documents how useful examples are. I hope I don't need to justify this too much; in fact, I find I need to do the opposite and remind people that examples do not replace requirements or models. Implementors need examples to understand the requirements and models but they can easily draw conclusions from examples that are counter to the requirements and don't fit in the model. When a specification has an inconsistency between a requirement and an example, trust most developers to implement to match the example, not the requirement.

Definitions/Schemas

Definitions and schemas also tend not to need much justification in a techie crowd. We're attracted by the idea of having absolute certainty about what's valid by trusting a program to compare an example to a definition or schema and validate it. So once again, I have a caveat to offer rather than a justification: make sure that definitions or schemas are put in context very carefully. Can an implementor use the schema to validate incoming XML and reject anything that doesn't match the schema? Probably not, or else it would be impossible to extend the protocol. Early WebDAV implementors built XML schema validators into their servers and rejected client requests that extended the protocol in minor ways that should have been compatible, so I'm taking this lesson from actual experience.
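Here is a minimal sketch of the tolerant alternative to strict schema validation. The request body is a made-up, simplified PROPFIND-style document with one extension element in an example.com namespace; parse_propfind is a hypothetical helper that only handles the prop element, not a complete WebDAV parser. It reads what it understands and skips the rest instead of rejecting the request.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # ElementTree writes namespaced tags as {namespace}localname

# Hypothetical request: X:hint is an extension the server has never seen.
REQUEST = """\
<D:propfind xmlns:D="DAV:" xmlns:X="http://example.com/ext">
  <D:prop><D:displayname/></D:prop>
  <X:hint>fast</X:hint>
</D:propfind>
"""


def parse_propfind(body):
    """Return (requested property tags, skipped unknown element tags)."""
    root = ET.fromstring(body)
    if root.tag != DAV + "propfind":
        raise ValueError("not a propfind request")
    props, skipped = [], []
    for child in root:
        if child.tag == DAV + "prop":
            props.extend(grandchild.tag for grandchild in child)
        else:
            # Unknown element: note it and carry on, rather than failing.
            skipped.append(child.tag)
    return props, skipped
```

A validator built from the original schema would fail this request because of X:hint, even though everything the server needs is present and well-formed.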

I certainly can't say that when I'm a protocol author, I succeed in doing all of this. But after eight years reviewing and implementing good and bad protocol specifications, I'm beginning to see what works.

Comments welcome.

Thursday, November 04, 2004

Today I managed to explain (better than I've ever explained before) a few principles in the design of a network system. I use a client/server network system although you can generalize this to P2P easily. This is the diagram I drew on the whiteboard.

If it's hard to grok this completely abstractly, an IMAP client and server are good ones to plug in mentally. There are so many different IMAP clients and servers and they all have different APIs, storage models, and internal data models. By "data model" I mean data structures or object models, including caching and relationships between things. So if your Java code instantiates a MailMessage object which has a link to an EmailAddress instance for the 'From' field, that's all part of the internal model. The protocol's data model is similar: in IMAP there are folders and mail messages, mail messages have headers, one of which is the From header, and so on.

So I intended this diagram to convey a whole lot of stuff.

The protocol is the most inflexible part of this system. If you've got any interoperability at all with your protocol, even between unsynchronized releases of client and server, then your protocol is the most fixed element in the system. People constantly use new clients to talk to old servers, and old clients to talk to new servers, which means that even when new clients talk to new servers you're likely using an old protocol. Since your protocol is the hardest thing to change, both its syntax and its data model, it had better be extensible and basic, and support many usage models.

The internal data model is the most flexible part of this system. APIs and protocols must continue to be supported across releases. Storage formats are easier to change than APIs but often require upgrade steps. Thus, the internal model is actually the easiest thing to change. Doesn't that mean that it's less important to get that right, because it can be tweaked and refactored to do new things or benefit from new knowledge? Yet many architects focus deeply on the internal model, spending much more time getting it right than the API or the protocol.

Client, server and protocol data models and content models diverge. Many architects design a networked system that starts with the same data model on the client and server and thus naturally they want the same data model expressed in the protocol. But these diverge naturally, sometimes even before Server 1.0 and Client 1.0 ship. For example the implementors of Server 1.0 discover that they need to cache certain strings together for scalability and subtly the data model on the server begins to change. Be aware from the beginning that this will happen. It's not a bad thing. It may even be a good thing to allow the server to be fast and the client to innovate on features.

Practice information hiding at the dotted lines. These are the places to really focus on modularization. Many software developers already understand that your API shouldn't be tied directly through to your storage model and this principle can easily be extended to the protocol modules. I've written bad code that tied too much to the protocol so I'm guilty of that one myself. It seems that unless there's a good reason, the protocol implementation shouldn't be tied directly to the storage model (the implementation should instead retain the freedom to change how things are stored without rewriting everything). It might not be so bad to tie the protocol to the API, i.e., by implementing the protocol logic using only the API. That way, any internal changes that leave the API unchanged, also leave the protocol unchanged. But that isn't always the best choice -- sometimes the protocol support needs access to internals and you don't want to complicate the API too much just to make the protocol fast.
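A minimal sketch of the "implement the protocol logic using only the API" option. All the names here (MailStore, FlatMailStore, ProtocolHandler, the COUNT command) are made up for the example. The handler never touches how messages are stored, so swapping the storage leaves the protocol code untouched.

```python
class MailStore:
    """Public API. How messages are stored is private to this class."""

    def __init__(self):
        self._folders = {}  # storage detail: folder name -> list of subjects

    def add_message(self, folder, subject):
        self._folders.setdefault(folder, []).append(subject)

    def count_messages(self, folder):
        return len(self._folders.get(folder, []))


class FlatMailStore:
    """Same API, different storage: one flat list of (folder, subject) rows."""

    def __init__(self):
        self._rows = []

    def add_message(self, folder, subject):
        self._rows.append((folder, subject))

    def count_messages(self, folder):
        return sum(1 for row_folder, _ in self._rows if row_folder == folder)


class ProtocolHandler:
    """Speaks a made-up line protocol using only the store's public API."""

    def __init__(self, store):
        self._store = store

    def handle(self, line):
        command, _, argument = line.partition(" ")
        if command == "COUNT":
            return f"OK {self._store.count_messages(argument)}"
        return "BAD unknown command"
```

The cost is the one noted above: if the protocol needs something the API does not expose efficiently, either the API grows or the handler reaches past it.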

Corollary: Use the best technology and architecture choice for each component independently. Because your client model will diverge from your protocol model and that one from the server model, data model consistency is not a good reason to use the exact same table structure or even the same database software on the client and server. (There may be other good reasons like expertise). Don't try to create the same indexes; the client and the server data access patterns will also diverge if they're even the same to begin with. Don't try to recreate the same caches. Send your server and client teams to different countries to work, maybe! That way the protocol becomes one of the most important ways the client and server teams communicate and they can make fewer hidden assumptions about how the code on the other side works (but they will make some anyway which will bite you in the ass).

Standard protocols and proprietary protocols aren't much different. If the protocol data model and the client and server data models naturally diverge, then even if your system starts out with highly similar models by implementing a proprietary protocol, that advantage erodes and becomes a disadvantage, hindering extensibility. OTOH if you start out implementing a standard protocol and enforcing good separation between the data models, this is a good long-term strategy. You know from the start that there will be translation between the data models -- every protocol message that comes in will have to result in objects or data structures being instantiated in the internal format, and every protocol message that goes out is a transformation from internal objects or data structures. So that translation layer is solid from the beginning. Furthermore, if the system is using a proven protocol, the extensibility and performance features are likely to be better than one can easily design from scratch.
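A minimal sketch of such a translation layer, reusing the MailMessage and EmailAddress names from the earlier IMAP illustration. The two-header wire format and the from_wire and to_wire helpers are simplifications invented for the example, not real IMAP or a full mail parser.

```python
from dataclasses import dataclass
from email.utils import formataddr, parseaddr


@dataclass
class EmailAddress:
    display_name: str
    address: str


@dataclass
class MailMessage:
    sender: EmailAddress
    subject: str


def from_wire(text):
    """Incoming protocol text -> internal objects."""
    headers = {}
    for line in text.splitlines():
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    display_name, address = parseaddr(headers.get("from", ""))
    return MailMessage(EmailAddress(display_name, address),
                       headers.get("subject", ""))


def to_wire(message):
    """Internal objects -> outgoing protocol text."""
    sender = formataddr((message.sender.display_name, message.sender.address))
    return f"From: {sender}\nSubject: {message.subject}"
```

Everything outside these two functions works with MailMessage objects only, so the internal model can change shape without the wire format noticing, and vice versa.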

Protocol syntax isn't very important as long as it's extensible. Translating between models that are different is harder than translating between different syntaxes. It's like translating a business course from American into Chinese -- the language is the easy part, the culture and environment are so different that you can easily mean something you didn't intend to mean. So it's not the end of the world if the syntax is header fields or XML documents, as long as there's a clear way to extend either one. The extensibility is key so that as the clients and servers evolve they're not totally hamstrung by an inflexible protocol.

Whew. That's asking a lot of a l'il ol' whiteboard sketch. Comments welcome.

I have to say, I love a good Fisking, or to Canadianize that, a Frumming. On TCS, Radley Balko takes on David Frum's National Review column on obesity and taxes. It's a good read in its entirety, but I thought it would be fun to summarize anyway, to show how each argument was demolished.

Frum argues:

Canadians are less obese than Americans
Portion sizes are smaller in Canada than in US.
It's because Canadians are less wealthy that portion sizes are smaller.
Smaller portions lead to less obesity.
Obesity leads to health care costs.
Making sodas more expensive (by taxation) will cause lower consumption of sodas (conclusion: also reduce obesity, also reduce health care costs).

Without facts, one might follow that logic. Balko, however, demolishes one after another of these, showing how much of a house of cards that logic was:

Canadians are similarly obese to Americans and Frum's evidence was only anecdotal.
Portion sizes are similar and Frum's evidence was only anecdotal.
Since portion sizes aren't smaller in Canada, wealth isn't a factor in portion sizes (at least the wealth difference between CA/US doesn't matter to that). Also note that total consumption of caloric sodas has been steady for decades as Canadians have gotten significantly richer (and soda cheaper).
This one requires more data to completely demolish, but the evidence that total consumption of caloric sodas has been steady for decades does cast doubt on the idea that smaller cans of sodas will reduce consumption.
There's more evidence that poor fitness (sedentary lifestyles) has a much greater health care cost than obesity.
Such a small increase in price of soda is unlikely to change consumption, given that consumption has been steady for decades as soda production has gotten cheaper and people richer.

Balko doesn't point this out, but soda doesn't appear to be very price sensitive. Rather than buying cheap store-brand sodas the market overwhelmingly goes to the higher-priced image brands.

Thursday, October 28, 2004

I've enjoyed working at OSAF for the past eight months and I'm starting to understand why. We're nice but we're not losers.

We shipped a minor release this month - the 0.4 release of Chandler. The release date of October 26 was picked months ago, as was a candidate set of workflows and other tasks to achieve. We met our release date, partly by making some minor cuts to the feature set (though we still reached the overall goal of having something experimentally usable or demoable), partly by working professionally towards common goals. Even better, we did it without panic, without yelling at each other, without having sales presell the release (heh heh -- no sales). We didn't make developers stay all night. We didn't have unpleasant meetings. We didn't demonize anybody for their bug counts. And we still managed to release on a schedule.

I don't want to get too deep into how we did it (one could write a book), but it had a lot to do with honesty and trust. Maybe when the stakes and egos are high it's too easy to fool oneself into believing ridiculous schedules. I'd like to think our transparency (it's all out there on the wiki) helped us be honest with ourselves and communicate potential problems early.

It's nice to have confirmation that it's possible to reach high goals and still be sane.

Wednesday, October 27, 2004

My local security expert can be opinionated and frank, sometimes.

Me: "So have you ever tried getting IPSEC working?"

Him: "I'd rather have a prostate exam."

Saturday, October 23, 2004

Making wedding stuff

I had a fun time making stuff for my friend's wedding a couple weeks ago. Now the pictures have come out.

I made silk shawls for the bridesmaids. Lessons learned:

I couldn't iron a double-folded-over edging into silk organza. Pin a lot.

Sewing raw silk organza edges into a folded border is hard. I bought black silk thread so the seam wouldn't show much, and then held the organza very taut, front to back, as I fed the pinned edge through the machine. Still, the tension in the seam pulled the organza together a bit, until I ironed the *&$! out of it.

Use a ton of water to iron silk charmeuse. But do iron, because it's worth it to get the piping straight.

Silk must be carefully pinned, particularly when you're sewing 7 layers together. Even so, I caught excess organza in the seam a couple times.

Use the right sewing needle to go through those 7 layers. When I used the needle I normally use for quilting, it made a scary "thunk" sound.

Sorry, no pics of the finished shawls at this point. They ended up black silk organza bodies, with charcoal charmeuse endings (folded over the raw organza edges) and purple charmeuse piping.

I was hanging around doing nothing on the wedding day and was suddenly (but willingly) pressed into service making the ball for the flower girl to carry. The materials: a styrofoam ball, ribbon, and green "toothpicks" with wire already wrapped around the non-pointy end. The idea, as I saw it (but I Am Not A Florist) was to wrap the free end of the wire around a flower stem, then put the pointy end of the toothpick into the ball. Repeat until covered. Sounds simple, right? Lessons learned:

Plan ahead. Will you be putting leaves on the ball along with the flowers? Put them on first, dummy, or they cover up the flowers.

Pick flowers with strong stems. Flowers with weak structure will be soooo frustrating.

Wrap several flowers/leaves together onto the same toothpick. You probably don't have enough toothpicks, plus it goes faster.

Don't try to push the toothpicks in too far. Otherwise you push the wire or flower right off the toothpick just as it gets buried in styrofoam.

Attach the ribbon handle as strongly as you can -- it's going to be swung around by a 2.5 year old (cute Ava). Very firmly attach the ribbon to two toothpicks and stick them in at different angles so the tension doesn't pull them straight out.

Don't fret, there is no way to hold it without crushing some blooms or tearing off some petals now and then. After failed attempts at wrangling it with wire handles which backfired and tore off flowers, I just used the fingertips of one hand to support it.

Finally, don't worry. Everybody will love it anyway.

Actually, it turned out quite pretty. It's by no means the prettiest thing in this photo, but it worked. Many people helped out a lot and it was a beautiful wedding with a sense of cooperation. For example -- credits to Cheng for the photos I used here.

Friday, October 22, 2004

Joe and St. Peter wrote a WebDAV-related IETF Internet-Draft recently. In fact, it combines two technologies I admire, WebDAV and XMPP, in a way that uses each technology precisely for what it's good for (WebDAV to store application data, XMPP to route application-specific messages).

I talked to a bunch of people at Educause about this yesterday, and they were excited about this and other new WebDAV features because many universities are betting on WebDAV. University of Memphis reports two services that people can't live without: email, and now WebDAV. Since WebDAV is such a flexible repository technology it's hard to tell a potential customer (like Memphis 2.5 years ago, when I worked at Xythos) what it will give them. But give it to users, and they will figure it out. Looks like the Atompub people are storing blog postings on a Web server, which students increasingly can do now via their WebDAV accounts [1]. And of course the universities are also keen on storing calendar data in WebDAV accounts. More application data in fewer server repositories means lower administration costs for these universities.

With all this increased excitement, you'd think I could get more people to show up at WebDAV Working Group meetings or contribute on the mailing list. The next meeting is in Washington DC, in the second week of November. Want to join in? It's easy -- all you have to do is... join in.

[1] I realize that although giving students WebDAV accounts is a little new, giving them Web accounts, or Web space in their Unix accounts, isn't new. However, in some colleges students had to request this service, or know enough Unix to be able to author the content. Plus raw HTTP doesn't support multi-user applications very well. WebDAV extends the usefulness and usability of personal Web space.

Friday, October 15, 2004

This is why I like cats.

I was trying to figure out if anybody was thinking of implementing CalDAV in Slide, so naturally I googled 'caldav slide'. No direct answers, but I did notice a LaughingMeme blog post from last spring that I'm sorry I missed, commenting on CalDAV. I agree with the gist of his caveats, and we're trying to address those, I think.

I can't vouch for the accuracy of this description of Canadian conversational styles (link via Julie), but it rings a bell -- whether because I'm Canadian, female or both, I don't know. Surprisingly, the "weaving" style can sometimes be seen as more aggressive, because it involves pushing conversational pieces into pauses. My American SO sometimes thinks I'm interrupting him. Well, I am interrupting if you assume there are rules about waiting until somebody is done speechifying even in one-on-one conversation, but the interjections are meant to build the conversation, not to be rude to him.

Tuesday, October 05, 2004

Recipe for Ginger ice cream -- sugar and no-sugar options

Ingredients:

1 cup half and half cream
1/4 cup grated fresh ginger root
1 cup heavy whipping cream, whipped to soft peaks
1 cup ginger ale or diet ginger ale
1/2 to 3/4 cup sugar or Splenda
1 T. lemon juice

Soak the grated ginger root in half and half cream. Put this and the ginger ale and whipping cream in the fridge to chill for an hour or so. Use a fine sieve to take the fresh ginger out of the cream. Stir the ginger-flavored cream together with the whipped cream, ginger ale, sugar/Splenda and lemon juice. Immediately put in your ice cream maker etc.

Mmm! I imagine for those who love ginger and can also eat sugar, candied ginger would be a nice addition to the recipe.

Monday, October 04, 2004

How thrilling! After only two years, the XMPP WG has produced RFCs.

RFC3920, Core protocol (framework for applications like notifications, as well as instant messaging and presence)

RFC3921, Instant messaging and presence over the core protocol

RFC3922, how to map XMPP presence information and instant messaging to the same common interchange format that SIMPLE instant messaging can map to

RFC3923, requirements for end-to-end security of instant messages even when routed over more than one protocol

The work began earlier than two years ago, of course. Jabber was designed while Jeremie Miller was waiting for the IETF to come up with a standard IM protocol (which was taking too long). When the IETF process still appeared stalled, Jeremie and others proposed that the IETF could standardize Jabber, by creating a working group to design a revision to Jabber that met the IETF standards for internationalization, security, XML namespace usage and a few other things. A BOF meeting was held in July 2002, very eventful and entertaining as IETF meetings go. The WG was formed, and I was added as a co-chair to help Pete Resnick, on October 31, 2002. Peter Saint-Andre authored *all* of our internet drafts, doing a huge amount of difficult work in an extremely timely, professional and skilled manner. Whew! It's been fun!

Thursday, September 30, 2004

Via Who Throws a Shoe, I learn that Mel Brooks has called Toronto's bagels "mushy". I can comment on this vital issue. Although I've never lived quite in Toronto, I've lived nearby and visit frequently. Also I've sampled Montreal bagels, New York bagels and many instances of second class bagels from around the continent.

You see, only Montreal and New York bagels are really the original ethnic bagel, made by Jewish people for Jewish communities with the same unadulterated ingredients (slightly different for these two cities, I believe) and quirky bagel-making techniques (including boiling the rings before baking in hot ovens, and who knows what other black magic). That may be because Montreal and New York have long had large communities dominated by Jewish culture, and the ability to affect bagel tastes throughout the rest of the city.

In the rest of Canada/US, bagels are poor imitations made with standard baking equipment and therefore with modified recipes. Donut stores that decide to add bagels to their menu often create pseudo-bagels that can be made with their existing equipment. The result is topologically the same as a bagel but otherwise sub-par -- a mainstream "bagel" is merely a bun with a hole. Even steaming equipment produces a milquetoast substitute. It's just not the same, but most North Americans don't know the difference or don't care. Perhaps some soft-gummed softheads even prefer the less chewy mainstream bagel.

So what's Toronto's story? It's mixed. Some parts of Toronto are like the rest of the wasteland outside Montreal and New York, as epitomized by the ever-present donut stores which make such great Canadian donuts (more on this another time) but such humdrum purported bagels. However, there is a strong Jewish community in some parts of Toronto (such as North York). So in these communities, if you know where to shop, you can find real boiled-and-baked bagels, real chewy and dense and shiny on the outside.

Mel, I'm sympathetic, but you should know better than to buy a bagel just anywhere.

Thursday, September 23, 2004

Another stab (like mine last spring) at why there seems to be a disproportionate fear of economics and economists: Arnold Kling puts forth a few possible reasons in his TCS article. He mentions the math barrier, but I still think that disliking the conclusions of economic thinking is one of the biggest reasons -- and can't be dealt with simply by teaching economics without math.

Monday, September 20, 2004

Another knitting project completed: a fuzzy baby cardigan.

Also see Ekr's post describing our weekend camping trip. It was our toughest trip yet, but I feel surprisingly good only a day later.

Tuesday, September 14, 2004

I can't resist exposing unintended consequences of well-meaning social programs. Here's a new one (to me) related to the minimum wage: by having a minimum wage, a state blocks entry to the workforce to certain very low-wage and entry-level workers, and this affects the stepping-stone process by which low-wage workers move up the chain to earn higher wages. In other words, a minimum wage hurts a state's rate of moderate wage employment later on. A great explanation at Marginal Revolution.

Friday, September 10, 2004

Open source and open standards

This is part two of a rumination on the trendy adjective "open", exploring why open source and open standards don't go together as well as one might think.

The phrase "open standard" is more loosely defined than "open source", although Bruce Perens is attempting another definition. Microsoft calls a protocol an open standard if the specification is publicly readable or if Microsoft allows or licenses others to legally implement it. However, Microsoft typically retains change control to its so-called open standards. Sun also maintains change control over its major standards and may not even have non-discriminatory licensing for implementations, but claims their standards are more open than Microsoft's because you can "check out" of the standard (ref). The war between Microsoft and Sun over who has more open standards is laughable.

I think these are the main criteria to judge how open a standard is:

Is published and available at low or zero cost
Can be implemented by anybody freely or with non-discriminatory licensing
Can be modified by a group with either completely open membership (IETF) or one that does not discriminate in membership applications (W3C).
Can be implemented interoperably on a wide variety of platforms and with a wide variety of technologies. (E.g. HTTP can be implemented in C, Java or most other languages, on Windows, Unix or just about any modern operating system.)

Anyway, for now let's call standards whose development is open to public contributions "public standards".

Why would an open source effort fail to support public standards, or fail to be interoperable? A lot of possible reasons here, many of which add up to either a high cost which open source developers can find difficult to bear, or lower benefit than a commercial product would gain from doing standards work.

It can be expensive to learn how to implement the standard. Books, workshops and seminars can help the developer learn the standard, as can participation in a standards group, but participating effectively may require airfare or meeting fees. Writing a from-scratch implementation of a public standard is not usually easy.

Developers may take shortcuts reading, understanding and implementing standards, trying to get something useful working fast rather than make the full investment. There may be certain features required by the standard but which aren't necessary for the product to work in the developer's main use cases. The shortcuts lead to less-interoperable implementations. Closed-source projects may take shortcuts too, but there are pressures to fix the holes quickly in order to claim compatibility, prove interoperability, and sell to customers.

Vanity, confidence, or ego: it's fun and impressive to design a protocol from scratch. An open source developer who is doing the whole project for fun may find the entertainment value more important than a paid developer would.

Commercial products must have the credibility of being supported for the long run, and in planning for the long run, developers have other tradeoffs to make. For example, a protocol designed in-house for a specific purpose tends to have design errors which make it harder to maintain in the long run. There are a lot of subtle design points and extra effort involved in making a protocol extensible and flexible. Later on, continuing to support the protocol with newer versions of the software may come to be quite a pain. If they're wise and can afford to do so, developers of long-lived software make the decision to invest early in supporting a standard protocol rather than invent their own with unknown problems.

What if the developer can put in the effort to implement the public standard, but it's not quite ideal for the developer's purposes? It's possible for the developer to influence the public standard, but contributing to standards takes a long time and a lot of effort. If the existing standards don't quite match up to the requirements, an open source developer may not have the resources or time to help them change. Thus, the developer may choose to only partly implement the standard, or implement undocumented extensions, to the detriment of interoperability.

There's the assumption in open source that because you can read the source, the source is the specification. Why should an open source developer take the time to publish a specification for the protocol when you can just read the source to see how it works? As a result, when open source developers do extend or invent a protocol, the resulting protocol often remains undocumented, which isn't conducive to interoperability.

Interoperability testing can be expensive. It might require revenues of some kind to be able to afford interoperability testing. And if you can't even afford interoperability testing, it's harder to justify implementing the interoperable standard.

Although it seems a little rude because Subethaedit is a wonderful tool, I'll pick on Subethaedit for a minute. It's free, it's open source, it has a relatively open process, and it even supports some public standards. It uses Rendezvous (an Apple preview of Zeroconf, which is a public standard in development) and BEEP (an IETF public standard). However there is also a non-public protocol used over BEEP to communicate around the actual changes to the communally edited document. Thus, it would be a challenge for somebody to write another client that interoperated with Subethaedit.

Sadly, many open source projects (as well as many closed source) use the phrase "supports open standards" or "based on open standards" as if it were pixie dust, automatically conferring open interoperability. That's not the case. An open source project can easily fail to be interoperable with other projects, just by innovating new features without rigorously documenting and designing protocols and protocol extensions.

Some open source projects overcome all these hurdles and support public standards, which I, personally, am very grateful for. Once in a while open source developers actually design a new protocol which is interoperable, flexible and unique enough to become a widely supported public standard, and that too deserves kudos.

Tuesday, September 07, 2004

Open as in source, or Open as in process?

A long-standing terminology issue in free software is whether it's "free as in speech, or free as in beer" (ref). A related confusion arises with the adjective "open". The term "Open Source" is most authoritatively defined by the Open Source Initiative. Other phrases like Open Standard and openness (as a way of operating an organization) are more loosely used. With efforts like Open Content and Open Law (not to mention OpenCola), openness is clearly in vogue now.

These "opens" don't always go together, despite common assumptions. There's an automatic assumption that these are all meritorious and that anybody with the merit of doing open source software also has the merits of following open standards and an open process.

Open source and open process

Since "openness" is too vague, I'll take a stab instead at considering "open process". The phrase is relative: a more open process is one which is more transparent, less restricted in participation, and less obscure. For example anybody may search the Mozilla bug database and see bug status (transparent) and anybody may submit a bug (unrestricted participation). In addition the bug database is easy to find and there are instructions for use (not obscure).

Why would an open source effort fail to have an open process? Simple -- a process has a cost that its benefits must offset, and an open process can have a higher cost than a closed process. A small group of programmers (one to three, maybe four) doesn't get much benefit from formal processes, let alone formally open processes. The way to contribute to an open source project, or how submissions get accepted, or which features get added, may be decided in a very opaque way; decisions may be made by one person or made in private conversations among a small group.

Contrast that to a company where there are very likely to be customer requests and some standard process to handle them. The process may very likely be closed and private, but it's also quite possible for a company to have an open process to discuss feature inclusion. For example, at Xythos, we started a customer forum where customers could get together and discuss the merits of various feature requests. The final decision was up to us but the process was not entirely closed. Some open source projects don't even have a forum for discussing feature requests, or a public list of feature requests.

Of course, open source efforts are probably more likely overall to have open processes. Sites like Sourceforge contribute to this synergy by opening up bug tracking and discussion forums in a cheap way -- open source contributors don't have to make a huge investment to make the development process more open as well.

At OSAF we put a lot of effort into ensuring that our processes are transparent, participatory, and not obscure. It's been estimated that one part of that alone, the weekly IRC chats, consumes nearly one morning per developer per week -- that's enormous. Documenting everything we say in in-person meetings and keeping public Wiki pages up to date are other major costs. Obviously we hope that doing the "openness" thing confers benefits in the long run to the larger community, but at such a high cost, it's obviously not a cost every open source group can bear.

Part II of this post, discussing the weaker synergy between open source and open standards, will come this weekend I hope.

The CalSch Working Group is being closed. I believe this is a good thing -- IETF working groups which stick around for years and years tend to shrink and get dysfunctional. Closing the group clears the slate.

This doesn't mean that no standards work is going to get done in calendaring. In fact, it seems quite the opposite since in the past six months there's been quite a surge of interest in calendaring features and calendar interoperability. The new lists I mentioned in a previous post have seen a fair bit of traffic. The access protocol discussion list (ietf-caldav@osafoundation.org) has 96 members and 72 posts in under a month. The other list, discussing bringing the calendar format standards to a new level of RFC (ietf-calsify@osafoundation.org), has 85 members and 154 posts in the same time period. I've talked to a number of different companies, organizations and even venture capitalists about calendaring standards action, and there's an interesting roundtable coming up in Montreal.

Since I have been dragging people into discussing calendaring standards for a year now, and published a calendaring standard proposal eight months ago, I feel like my timing, for once, is right. Maybe I'm one of the butterflies furiously flapping my wings to create a bit of wind-speed out there today.

Tuesday, August 31, 2004

I read Scott Rosenberg's post today about extraordinary attempts to avoid the appearance of journalistic bias, then got to discuss it with him too. I noted that the problem seems to be a bit worse in the US. In Canada and the UK, I think newspapers may be a little happier to be known to have a bias. Not the Globe and Mail, perhaps, which takes a high-horse approach to politics and society despite clearly having biases at times, but at least the Sun. And in the UK I've heard papers are more likely to take sides and I know the Economist sometimes bluntly admits its bias. In the US, it seems, papers go to greater lengths (including forbidding attendance at benefit concerts) to avoid being seen as a mouthpiece for either the Democrats or the Republicans.

One theory for the extra attempts to appear neutral: since there's only two real US parties, any admitted bias is tantamount to admitting that the paper favours one of them. In Canada politics may not be quite so partisan or polarized, or at least haven't been for so long. A Canadian paper with a slight conservative bent could be a PC mouthpiece, or a Reform mouthpiece, or neither, so at least there's confusion about which party it's supposed to be the mouthpiece for. But I'm not sure this theory holds up because certainly Canadian politics can be partisan (and nasty), and UK politics may be polarized much like American.

Another theory is that since Americans are generally so quick to come up with conspiracy theories, Americans are therefore quick to assume that politicians are somehow controlling journalistic output. Therefore a paper must appear especially untainted to avoid being written off as government-controlled propaganda.

I'm not terribly happy with either of these theories, and maybe I'm wrong that this is uniquely American, so feel free to chime in.

Monday, August 23, 2004

Wikis suck for specs

At OSAF, unfortunately, we don't have very good specs. We have great designs, but it's really hard to go from the designs to know what exactly works and doesn't work, what's implemented and what isn't, and how the non-GUI choices were made, at any given release. It's hard for a developer to pick a feature and see how to implement it, because the developer could either see too much (the wiki pages or fragments that show the bigger context for the design, the aspects of the feature that we can't do today) or too little (the wiki pages or fragments that seemed irrelevant but were vital). Part of the problem may be that we didn't have anybody on staff whose job it was to write specifications, like program managers at Microsoft do. But part of the problem is certainly tools.

OSAF uses a Wiki for most of our current documentation, from proposals to user documentation to meeting notes to personal jottings. The wiki is very fluid and chaotic and not very authoritative. It seems a wiki can be made to be more authoritative -- the wikipedia does this by formalizing around a single kind of content -- but we can't easily do that because our wiki already contains such divergent content. Besides, a wiki isn't an ideal environment for long documents, for printing, for version control, or for template usage.

Length: Specs are ideally longer than the average wiki page. To fully describe a feature so that the different contributors actually agree (rather than just appear to agree), to document the decisions made, and to make the feature testable and documentable, might require a 30 page specification for a 1-dev-month feature. Note that for a long document readers require a table of contents -- ideally clickable if you're reading it on a computer, with page numbering if it's printed.

Printing: You really want to be able to print specs, even today. I'm tempted to bring printed specs to a meeting and ban laptops to get them reviewed properly. Reviewers can mark up paper. Implementors can glance between paper and screen (and write on the paper) as they work.

Version control: Specs should be versioned and checked in and out and maintained in a versioned repository, just like code. Some wiki software does this but it's not very good yet -- I couldn't view a version of this page from before the picture was added. HTML doesn't version very well in general partly because images are external. And if you break wiki pages into smaller-than-30-page chunks for readability, then the versioning gets worse.

Templates: For template support, I miss Word. When I worked at Microsoft, there were Word wizards who made beautiful, useful templates for many purposes, from expense reports to specifications. As a program manager, I wrote many specifications, and I found that a template somewhat like this one kick-started the process. Word also allows templates to contain "click to fill in" fields (like author, product, release, team information; spec status, implementation status) including support for drop-down boxes to limit selection to a few appropriate choices. My favourite spec template had sections for overview, administrative UI, setup concerns, test considerations, rough documentation plan, and internationalization issues. Each of these sections was a reminder that the feature being specified might have impact on other groups. When I joined a group that didn't have a template I made or modified one to start with because I found them so useful. The IETF has specification templates and they make for some consistency between vastly different protocols and authors. A familiar template vastly improves readability.

Am I missing something that would solve this problem in HTML or in another open format? Is there some easy-to-use tool or set of tools to solve this problem? I know I could write my own software to do this, and it wouldn't even have to be terribly complex -- no worse than the XML2RFC tool which produces IETF specifications matching the template. But that's harder both to write the template (and software to generate the document) and for the author who has to get the angly brackets right.
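To give a feel for how small the "write my own software" option could be, here is a minimal Python sketch that prints a plain-text spec skeleton with fill-in fields, a restricted status choice, and a table of contents. The section names come from the template described above; the function name, the status values and the output layout are all hypothetical, not an existing tool or an OSAF convention.

```python
# Hypothetical spec skeleton generator; section names follow the template
# described in the post, everything else is an assumption for illustration.
SECTIONS = [
    "Overview",
    "Administrative UI",
    "Setup concerns",
    "Test considerations",
    "Documentation plan",
    "Internationalization issues",
]

# Stand-in for a drop-down box: only these values are accepted.
SPEC_STATUSES = ("draft", "in review", "approved")


def spec_skeleton(title, author, product, release, spec_status="draft"):
    """Return a plain-text spec skeleton with header fields and a contents list."""
    if spec_status not in SPEC_STATUSES:
        raise ValueError(f"spec_status must be one of {SPEC_STATUSES}")

    lines = [
        title,
        "",
        f"Author: {author}",
        f"Product: {product}",
        f"Release: {release}",
        f"Spec status: {spec_status}",
        "",
        "Contents",
    ]
    # Table of contents first, so a long document stays navigable.
    for number, name in enumerate(SECTIONS, start=1):
        lines.append(f"  {number}. {name}")
    # Then one stub per section as a reminder to the author.
    for number, name in enumerate(SECTIONS, start=1):
        lines += ["", f"{number}. {name}", "", "(to be written)"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(spec_skeleton("Calendar sharing", "A. Author", "Example product", "0.1"))
```

This only covers the template part of the problem. Printing with page numbers and versioning the result would still have to come from whatever tools the text file is fed into.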

[Tangentially related: According to Mark's amusing post, as a spec reader, I'm a moron. I don't understand a spec merely by reading it through. After I've implemented a spec or tried to explain it to somebody else, or extend it or edit it, I begin to understand it.]

[UPDATE: Ekr and Sheila point out that a Wiki also sucks for entering text. It's not WYSIWYG, so the process of updating and viewing and correcting pages can be slow. Also you've got to be online to change a page. There's no way to synchronize parts of the Wiki for offline changes other than manual painful silliness. Ekr objected to the whole comparison in the first place. He said that evaluating a Wiki for writing specs was like evaluating a bunsen burner for baking bread.]

Friday, August 20, 2004

When I lived in Seattle I volunteered with a group that ran conferences for middle and high-school girls to learn about careers. The conferences were arranged so that each girl could pick a slate of careers (3 or 4) and go to workshops where there was supposed to be hands-on practice. For a lot of careers -- vet, doctor -- it's easy to bring in actual real tools and let the girls use them, so those make for easy workshops. It's not so easy for the software industry, however, as I learned when I tried to increase the participation from women at my company.

The obvious approach to teaching girls what it might be like to be a developer is to put them at a computer and show them code. Unfortunately that requires having access to a computer lab at the conference location and setting up all those computers with the same tools. I only saw this done once -- Shaula Massena ran a brilliant workshop where girls used Visual Basic to construct a little prank program they could install on Dad or Mom's PC at home. The program simply popped up a dialog saying "click here to dismiss", only when you clicked "here" the dialog jumped somewhere else on the screen :) Shaula sent the girls home with floppies containing their program executable, very cool.

I eventually helped program managers, testers and tech writers come up with workshop plans that didn't require a computer lab and software installation. For testers, we brought some home appliances in -- mixer, hair dryer, toaster -- explained how to write a test plan, and asked the girls to execute the test plan. They explored edge cases like "what happens when you press the toast button when you're already toasting". The girls also always had fun criticizing the design and usability of the devices, which we encouraged.

For tech writers, the workshop involved writing instructions and then watching somebody follow those instructions to see how challenging it is to write clear instructions. I brought a pile of coloured paper and asked each girl to make a paper airplane (or other origami thing) and then on another piece of paper explain how to make the same airplane. Then the girls exchanged instructions and tried to follow somebody else's instructions. At the end we compared results between the original and the one made by following instructions. Here, fun was had throwing planes and decorating and naming them as well as constructing them. Some girls decorated their instructions pages too -- budding Web designers.

Finally, for program/product managers, my favourite workshop was "Design your own cell phone". I focused the discussion of features by introducing the concept of a target audience and a limited budget. What features are so important for your target audience that you just have to have them? The girls split into teams and came up with lovely ideas. Often, of course, the target audience was "teenage girls" and one group came up with a "privacy phone" that would stop your bratty brother from hearing any of your conversations or messages. But one group targeted old people and thought about what it would take to handle a user with poor eyesight and hearing. And another group targeted young kids (or really, concerned parents of young kids) and designed a brightly-coloured egg-shaped phone, wearable on a lanyard around the neck, with only three programmed "send call" buttons so that the kid could only call appropriate authority figures to arrange pick up time, report an emergency, and so on. The girls thought that the phone should also have toy-like features so that the kid would have something to play with besides calling Mom, Dad and Emergency, so they thought there could be a couple more buttons to play ringtones or record and play back thoughts.

For six years I've thought that the kidphone would in fact make a cool product. Finally I find validation via Gizmodo: the MYMO phone for kids. I should have hired that team of girls and founded a company.

Sunday, August 15, 2004

Everybody is knitting ponchos this summer, so I had to as well.

Wednesday, August 11, 2004

If this background material is familiar to you, scroll to the bottom for the call to arms.

I've been following calendaring standards for six years now. In that time, iCalendar has become a fairly widely accepted standard for expressing events, todos and other calendar information in a MIME text document. Many calendaring applications can import iCalendar files, or export to them, or generate an email with an invitation formatted as an iCalendar document. However, there are some problems with iCalendar's complexity, particularly in expressing recurrences, and the companion standards for conveying iCalendar files (iMIP and iTIP) have their own problems.
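For readers who haven't looked inside an iCalendar file, here is a minimal Python sketch that builds one recurring event as text. It only illustrates the shape of the format: the function names, the PRODID value and the sample UID are made up, text escaping and line folding are left out, and real recurrences get far more involved than the single weekly rule shown.

```python
from datetime import datetime

CRLF = "\r\n"  # iCalendar content lines are terminated with CRLF


def ical_utc(dt: datetime) -> str:
    """Format a datetime, assumed to already be in UTC, as an iCalendar UTC value."""
    return dt.strftime("%Y%m%dT%H%M%SZ")


def weekly_event(uid, summary, stamp, start, end, byday, count):
    """Return a VCALENDAR holding one VEVENT that repeats weekly.

    Simplifications (assumptions of this sketch): no escaping of commas or
    semicolons in the summary, no folding of long lines, no time zones.
    """
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example.org//hypothetical sketch//EN",
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{ical_utc(stamp)}",
        f"DTSTART:{ical_utc(start)}",
        f"DTEND:{ical_utc(end)}",
        # The recurrence rule: every week on the given day, `count` times.
        f"RRULE:FREQ=WEEKLY;BYDAY={byday};COUNT={count}",
        f"SUMMARY:{summary}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    return CRLF.join(lines) + CRLF


if __name__ == "__main__":
    print(weekly_event(
        uid="staff-meeting-1@example.org",
        summary="Staff meeting",
        stamp=datetime(2004, 8, 11, 9, 0),
        start=datetime(2004, 8, 17, 16, 0),
        end=datetime(2004, 8, 17, 17, 0),
        byday="TU",
        count=10,
    ))
```

Even this small example hints at where trouble starts: change one instance of the series, add an exception date or a time zone, and two vendors' software have many more chances to disagree.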

Calendar interoperability testing has happened sporadically in these years. The first calendar interoperability event was held in 2000, a virtual event in 2002 and another in-person event this summer at Berkeley, organized by the newly revitalized CalConnect consortium. Still, interoperability hasn't improved as much as we'd like because at some point you need to go back and fix the standard.

Also during these six years, the CalSch working group has worked and bickered over the proposal to standardize how clients access a calendar stored on a server -- this protocol, CAP, would be to calendars what IMAP is to email folders. I've never liked the design of CAP, down to the very core model of how it works. So six months ago I made a counter-proposal, CalDAV, and threw it out there as an internet-draft. Finally I'm getting more than just nibbles on CalDAV, in fact several groups have told me their intent to implement CalDAV starting this year. And at the same time, other folks are getting invigorated about revising iCalendar and bringing it to draft standard.

This is all great news. Calendaring interoperability has languished except for that burst of productivity back in 1998. People are locked into one calendar application depending on what server technology they have available, since there's no common calendar access standard. Invitations work, kinda, but in practice the problems with recurrences mean that people must follow up the machine-readable text with human-readable text in case a mistake was made between two different vendors' software.

Good news, but nowhere near done yet -- this is just the beginning. Now we need volunteers. We need people to write specification text, review text, and manage the issues. We need people simply to pay attention to the work being done and provide their experience, or simply opinions, to bring issues to resolution.

Here's where to start -- two new mailing lists, one to take iCalendar to draft standard, one to develop and standardize CalDAV. Tell your friends! Subscribe to one, subscribe to both! We've got good work to do and good people to do it but we need your help.

Monday, August 09, 2004

Tim Bray, a veteran of many other Internet/engineering communities, is a newcomer to the IETF, and was slightly disturbed by some of what he saw. With open mind, he attended IETF meetings, both a few official in-person meetings and the meeting Saint Peter calls the "undernet" of the plenary, the badattitude jabber room (named after a tradition of corporate criticism).

Like Saint Peter, I'm not concerned about what Tim Bray calls the "severe angst" that appeared in badattitude. I think when you invite a group of people together to complain and rant, they do so, attempting to be humorous and entertain each other, at the expense of reflecting the entire reality and balance of their opinions. In fact, several of us have noticed that ever since badattitude started existing during IETF plenaries, there's much less dissent voiced at the microphone. Badattitude only has about 60 people and there can be 600 in the actual plenary room so it's hard to believe that allowing 10% of the people to let off steam can directly reduce the microphone usage by a noticeable amount (clearly more than 10%). Theories include:

Uncorrelated -- people stopped complaining during open-microphone for other reasons.
Tipping point -- the few people whose frustration was reduced through badattitude, and didn't go to the microphone, brought microphone usage below some tipping point, influencing others not to complain as well.
The Squawker Squad -- the possibility that the people I know happen to be those most likely to complain, and by drawing them into badattitude I induced a significant percentage of the bellyachers to give their opinions elsewhere.

The first theory is the most likely but I prefer the last (/me grins).

I was supposed to have a full transcript of the badattitude jabber room due to the automatic logging functionality of the jabber client, but nitro ate the chat log. Really -- it's gone, except for the first five minutes. Honest. Anyway, some folks logged in under their real names and might not like it published.

Friday, August 06, 2004

WebDAV and Wiki get compared frequently. They're often in frameworks that solve the same problem, but they play different roles within those frameworks.

A Wiki is a Web site running Web application code that allows people to browse Web pages, easily add new Web pages, and easily edit any Web page through Web forms (the Web application code powers those Web forms and handles their input). Originally Wiki described the first such application but the word has spread and similar applications that run completely different code are also called Wikis.

WebDAV is an IETF Proposed Standard, RFC2518, describing a set of extensions to the HTTP protocol. These extensions allow a client application (or other software agent) to create new HTTP resources more easily, manage them, find them without following hyperlinks, and change them in a way that multiple authors can work together simultaneously.

Wiki uses HTTP forms to transport requests.	WebDAV extends HTTP by defining new methods.
Wiki is a UI.	WebDAV is completely orthogonal to a UI.
There exist a number of different Wikis, implemented in different languages, supporting different features.	There are a lot of different WebDAV servers as well as WebDAV clients. The clients can theoretically all work against any server although in practice there are a few outliers (special purpose clients or servers) that don't interoperate so well with the main group.
Wiki is a thin-client application, where any browser can be the client because the smarts are on the server (the code running the Web forms).	WebDAV was created to allow rich clients to author Web resources. A WebDAV client is definitely the entity in control -- but you have to download or find WebDAV client software. Luckily, these clients ship now in all Windows and Mac OS's.
Wiki is a generic name that seems to be used if something is similar to the original Wiki.	WebDAV is a set of requirements, so a WebDAV implementation can be tested for compliance.
You'd probably not call something a Wiki if it didn't enable multiple users, allow easy creation of new pages, and result in a Web site collection of human-readable pages.	WebDAV can be used to synchronize with a source code repository, or to change configuration files, or to access a calendar server, or many more things that you would never call "shared authoring".
It's easy to extend (a) Wiki: just code in the new feature -- e.g. search.	It's hard to extend the WebDAV standard: it requires IETF drafts, reviews, and many revisions in the IETF doc format. It can take years -- e.g. SEARCH, begun around 1998. OTOH it's easy to extend a WebDAV server -- e.g. Xythos added anonymous permission tickets to the WFS server -- but no other clients are likely to interoperate with a custom feature like this for a while.
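
To make the first row of that comparison concrete, here's a rough Python sketch of what each kind of request might look like before it goes on the wire. The Wiki form field names and the /wiki/edit path are made up for illustration (every Wiki has its own); the method list and the PROPFIND body follow RFC 2518. Nothing is actually sent.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Methods that RFC 2518 adds on top of plain HTTP.
WEBDAV_METHODS = ("PROPFIND", "PROPPATCH", "MKCOL", "COPY", "MOVE", "LOCK", "UNLOCK")


def wiki_edit_request(page, text):
    """Wiki-style edit: an ordinary POST carrying form fields.

    The path and field names are hypothetical, not any real Wiki's.
    """
    body = urlencode({"page": page, "text": text})
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return "POST", "/wiki/edit", headers, body


def webdav_propfind_request(path, depth="1"):
    """WebDAV-style listing: a new method (PROPFIND) plus an XML body."""
    ET.register_namespace("D", "DAV:")
    root = ET.Element("{DAV:}propfind")
    ET.SubElement(root, "{DAV:}allprop")
    body = ET.tostring(root, encoding="unicode")
    headers = {"Depth": depth, "Content-Type": 'text/xml; charset="utf-8"'}
    return "PROPFIND", path, headers, body


wiki = wiki_edit_request("FrontPage", "hello world")
dav = webdav_propfind_request("/docs/")
print(wiki[0], wiki[3])
print(dav[0], dav[3])
```

The Wiki request is something any browser can already produce from a form, while the PROPFIND request needs a client that knows the new method, which is the thin-client versus rich-client split in the rows above.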

Anyway, in this case, the different roles mean that Wiki and WebDAV are not at all incompatible, and to compare their advantages and disadvantages as if they are both apples can be misleading. Brad, whom I met at a party last week, realizes this: he added WebDAV support to an existing Wiki codebase. It allows another way to add pages to a Wiki or to modify them, and can interact completely smoothly with the existing WebUI. You'd think somebody would have done this before now, but I can't see that anybody has. Way to go Brad.

[UPDATE Aug 9: clarified "extend webdav" to differentiate between extending the standard, and extending a server]

Tuesday, July 13, 2004

I remember back in the late '90s seeing a bunch of people working on mechanisms so that when you go to a Web page -- like this one -- you could see who else was visiting the same Web page, and perhaps start a chat with them. Fast Company did an article comparing four of them that happened to be trying to commercialize the concept. In addition there were papers and academic or research programs, such as the Sociable Web project.

What happened to these? There were some technical issues (performance, bandwidth, client deployment and server deployment barriers) but I don't think those were paramount. I think it was the social and infrastructure barriers combined with a simple model mismatch. First I'll explain the model mismatch.

The Web is a client-server, or request-response medium. A client requests a page, the server responds with the page, done. The connection closes. The client has the page and can display it for the user as long as needed, can cache it for reference later. In what sense is that "visiting" a Web page? How can you say that two users are "on" the same Web page at the same time? They're each viewing local copies of the same page, copies which were obtained independently and at two different times. Even if users loosely think of this as "visiting" a Web page, the protocol model doesn't support that user model very directly. So you have to extend the Web with non-Web-like logic in order to allow people to hook up on a Web page, and that's just harder due to the infrastructure.

The infrastructure around the Web makes all Web extensions more difficult. Groups like XCAP (part of the SIMPLE WG) and Atom are trying to extend the Web in rather Web-like ways and they have lots of trouble dealing with infrastructure like proxies. It's even harder to extend the Web in non-Web-like ways like shared presence on Web pages. For example, caches don't work at all with shared presence systems -- the cache returns the Web page without asking the original source server for the page, so it's impossible for the source server to know everybody looking at the page without disabling caching (which doesn't entirely work). Clients which automatically download Web pages (like syndication browsers) or download through a proxy (like anonymizers or clients behind corporate firewalls) naturally mask the user's identity and time of access.
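
Here's a toy Python sketch of the caching problem just described. The class names and the "proxy" identity are invented for illustration, and real caches are far more involved (expiry, validation and so on), but the effect on presence is the same: the origin server only learns about requests that actually reach it.

```python
class OriginServer:
    """Serves a page and records who it believes asked for it."""

    def __init__(self):
        self.seen = []  # identities the origin actually observed

    def get(self, url, client):
        self.seen.append(client)
        return f"<html>content of {url}</html>"


class CachingProxy:
    """Fetches each URL from the origin once, then answers from its store."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def get(self, url, client):
        if url not in self.store:
            # The origin sees the proxy making the request, not the end user.
            self.store[url] = self.origin.get(url, client="proxy")
        return self.store[url]


origin = OriginServer()
proxy = CachingProxy(origin)
for user in ("alice", "bob"):
    proxy.get("/page.html", user)

print(origin.seen)  # prints ['proxy']
```

Two people are "on" the page, but the origin saw a single request from neither of them, so any presence feature living on the origin has nothing to work with.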

In addition to the intermediaries, there are the end points -- clients and servers. Client software is sometimes built to allow add-ons (IE and ActiveX, Firefox and all its plugins) but most users don't use them and are trained to distrust extensions. Client-side scripting languages are implemented differently on each platform so many sites support IE only for rich interactions. Server administrators also hate installing new software or new add-ons, particularly ones that allow users to interact with each other -- partly for the scaling and social discord reasons I cover next.

So finally, if all these model, infrastructure, software and technology issues are overcome, there are the social issues.

People protect their identities and hate cases of mistaken identity or identity deception.

People like anonymity and being hidden viewers, when it comes to themselves.

People resent anonymity in others and like to have an accurate idea who else can "see" them.

People protect their own privacy -- they don't like to be seen online without warning. Many Mac users turn Rendezvous off when they're not using it.
People don't take active steps to become visible online unless they have something they want to get out of it. Many Mac users don't turn Rendezvous on unless they are reminded of it somehow *and* can think of a good reason to turn it on.

Social sites are prone to spam, and people are really sick of spam and don't tolerate many new sources of spam.

Social interactions can get nasty either intentionally or accidentally. Trolls have been around as long as there have been public newsgroups and probably longer. Long-lived chat sites may require moderators, a combination of good technical "fixes" and volunteers, or may be restricted to a small sub-community that tolerates flames, trolls and spam. Ed Krol's Hitchhiker's Guide to the Internet tried to address some of these issues back in '89.

So with all these problems and stresses, social groups online have a tendency to burn out. There have been papers and even a whole course on virtual community design, mechanics and governance.

I am guessing that in part, simple Web-page presence tools are just too lightweight to reasonably deal with (or provide the subtle cues and tools humans need to deal with) these social issues. Once you build a system that can really sustain any kind of interactive community and identity, it's gone far beyond the Web request/response architecture (even if it's loosely built on that) and involved many sophisticated features. It seems so tempting to build something simple and Web-like but human models and interactions don't simplify that well.

Monday, June 21, 2004

Lately some of us have posted lists of excellent papers and essays relating to the Internet and protocols -- here's the Stanford CS 444I reading list along the same lines, by Armando Fox.

Monday, June 07, 2004

Health care systems are incredibly hard to understand, yet people feel very strongly about them. Most Canadians will swear up and down that single-payer is better, the only socially acceptable alternative, and that it makes people healthier at lower cost. (At the same time, Canadians will gripe about government cuts in healthcare, waiting lists, administrative snafus, their doctor.) Yet most Canadians haven't experienced any other system so it must be the media that's forming their impressions.

The media, of course, is very bad at conveying complex nuanced system characteristics. I was recently pointed at an article with the headline For-Profit Hospitals Cost More. Cost more to whom? To the patient? To the tax-payer? One of the reasons that for-profit hospitals cost more, the article says, is that they pay taxes, so presumably the tax-payer carries some of the burden of non-profit hospitals. And why would one believe that this had any relevance for the Canadian debate? It's quite possible that both for-profit and non-profit hospitals are more effective than government-run hospitals. Not that we can agree on what 'effective' means, anyway.

The more I learn about this, the less I know. I have seen government inefficiencies in the Canadian system -- my grandparents don't get their choice of doctor, even if they don't like their doctor, because they live in a small community and they see the doctor they're told to see. They can't drive a little farther, change plans, or pay more (or differently) no matter how much they dislike their doctor. But I've seen lots of inefficiencies surrounding the American insurance system too, particularly the employer-mediated stuff and the requirements for documentation of prior insurance or pre-existing conditions. I don't know. It all makes me unhappy, and I suspect that healthcare is simply an intractable, money-wasting or unfair system, no matter how you slice it.

Monday, May 24, 2004

Another knit project finished: a circular spiral shawl in hand-painted wool. More details can be found on my knit stuff page.

Tuesday, May 11, 2004

I love it -- a focus on good news, including items on whether women are worse drivers, Iraqis protesting against Iraqi violence, diabetes research progress, and more. (Link via Instapundit.)

From a Jim Fawcette article (link via Scobleizer):

Blind studies show that users can't distinguish between search results from Google, Ask Jeeves, Yahoo, and Teoma. Yet when you put a logo on the page, users show a decided preference for Google. To me, that totally debunks the idea that Google's search algorithms built on the professional-journal-references model is the key to its success. As The Wall Street Journal's Lee Gomes put it: "Some say Google is the Rolls-Royce of search; maybe what it really is, is the Nike. Googlephiles may think they are exhibiting technical sophistication by their loyalty, but what they are really proving is the extent to which they have been conditioned to respond to logos and brands, just like street kids with their sneakers." (ref)

I can't get at the blind study information, unfortunately -- Fawcette's link is to a WSJ subscription article. Anybody got a pointer for me? I'd like to figure out if it was just the logo added to the page that made the difference, not any other formatting. To me, it seems the primary suckage of the MSN search engine was its interface (which is now, oddly, much like Google's). So it's not so surprising that slapping the same interface on results from different engines meant users couldn't distinguish them. There are other possible explanations too, besides pure brand loyalty.

Monday, May 10, 2004

I've been thinking for a while about why people are so often so hostile to economics (it's called the "dismal science"), and I'm not the only one. One theory has it that commerce now typically involves non-tribe members (or even non-human, as with our banks' ATMs and Internet orders). That seems to be more a hostility towards modern scalable commerce, although I can see how that would transfer sometimes to economic arguments or prescriptions relating to international trade or similar topics. Another theory is that people just don't understand it (no exposure in early schooling) therefore fear it. A third theory has it that, in part, left-leaning people fear that learning economics (or indeed, lending it any credence) turns people into conservatives!

Wednesday, April 28, 2004

As a Canadian living in the US, it often distresses me greatly how little Americans know about the proud history of their northern neighbour (not to mention our spelling). Here's a good beginning on Canadian Facts.

Tuesday, April 27, 2004

Grab the nearest book.

Open the book to page 23.

Find the fifth sentence.

Post the text of the sentence in your journal along with these instructions.

OK then:

However you should be careful about creating transports or bridges as it may violate the usage policies for foreign services.

Hard to tell what kind of book this is, right? It's "Instant Messaging in Java" by Iain Shigeoka which I happen to have at work.

I found this on Ted's blog and Ted references Danah Boyd... There's a certain etiquette in the blogosphere which is to credit where you got something like this. But neither Ted nor Ted's creditee invented this so I started following links because I wanted to find out "why". Ted Leung (apr 25) credited Danah Boyd (apr 18) credited Caterina (apr 11) credited David Chess (apr 11) credited long story short pier (apr 8) credited Elkin (apr 8) credited happy_potterer (apr 8) credited sternel (apr 7), who credited nobody in the post itself. The really bizarre thing is that after following this chain I looked at the comments for Sternel's post and somebody else posted a comment asking where Sternel got it from. So, onward: in the comments Sternel credits pegkerr (apr 8), credits kijjohnson (apr 7) credits mckitterick (apr 7) credits bobhowe (apr 7) credits both silvertide (apr 6) and curmudgeon (apr 6). Silvertide credits curmudgeon too. Curmudgeon credits kricker (apr 6) credits cynnerth (apr 6) and pbsage. PBSage also credits Cynnerth. Cynnerth credits seamusd whose journal I can't see. But now I can see the other self-acknowledged geek trying to track back this meme. Apparently it originated with some "find page 18, look at the fourth line" live journal post from who knows who.

Google has "about 15700" links for the search for "Find the fifth sentence" (in quotes) -- all of them blog entries with exactly this meme.

Friday, April 23, 2004

There's new knitted stuff up today, for those of you who follow what I make outside of work. I like browsing other people's knitting blogs (there are hundreds of these in this ring alone), which is why I try to do the minimum of taking photos of my stuff and describing the project at least once. Other knitting bloggers do nearly day-by-day progress reports which I enjoy but can't possibly do if I'm going to find time to knit too!

Thursday, April 22, 2004

As always, I love a good counter-intuitive argument. Here's one that claims that bicycle lanes are more dangerous for bikers.

Tuesday, April 20, 2004

I've been researching a lot of technology and tools lately, comparing solutions and learning principles. Most of it's reading, but I'm finding that friends have interesting stuff to say. So I thought I'd do a post on the subject(s), so that if I'm not thinking of the right friends to ask the right questions of, they can tell me so.

Peer-to-peer communication models, frameworks, toolkits: JXTA, Rendezvous, Howl and Jedi.

Python libraries for WebDAV and IMAP support: Twisted and Python standard libraries.

Replication and synchronization: basic master/slave pattern, master/master pattern, RSync, RSync over HTTP, OceanStore, How to Select a Replication Protocol, the HTTP Distribution and Replication Protocol, HARP, State Replication Protocol, MoteFS, SyncML, problems in treating distributed data as if it were local (Notes on Distributed Computing), Coda (Disconnected Operations in a Distributed FS).

How to do engineering task management: bug dbs, Project, Excel files, other...

Good IDEs that support Python: boa constrictor, pyEclipse, pydev, TruStudio, Wing.

Permissions solutions: WebDAV ACLs, IMAP ACLs, LDAP ACLs, NFS ACLs; Secure Interaction Design, Capabilities v. ACLs, WebDAV tickets (capabilities), Capability theory, Protecting Information.

Inspired by Ted and the people Ted linked to, I'm also thinking of putting together a post on influential software/CS papers. Maybe tomorrow.

Friday, April 16, 2004

I gave a rough WebDAV tutorial Tuesday at OSAF. Here are the slides. There's also something new on my knit stuff page.

Friday, April 09, 2004

I have just finished knitting a new thing - a lacy skirt knit with ribbon yarn. It's pictured at the top of my knitted stuff page.

Monday, April 05, 2004

The Subway diet got a lot of press, much of it from Subway itself, of course -- eat a low-fat 6-inch sub for lunch, and a foot-long one for dinner, and lose weight. It may have inspired the more negative McDonald's diet, in which a documentary director decided to eat only at McDonald's for a month. His rules included that if the counter staff suggested that he super-size something he would, and he would eat everything he ordered.

Spurlock had the idea for the film on Thanksgiving Day 2002, slumped on his mother's couch after eating far too much. He saw a news item about two teenage girls in New York suing McDonald's for making them obese. The company responded by saying their food was nutritious and good for people. Is that so, he wondered? To find out, he committed himself to his 30 days of Big Mac bingeing.
"If there's one thing we could accomplish with the film, it is that we make people think about what they put in their mouth," he said. "So the next time you do go into a fast-food restaurant and they say, 'Would you like to upsize that?' you think about it and say, 'Maybe I won't. Maybe I'll stick with the medium this time.'"

Does he really think that every time fast food chains offer to supersize or upsize, customers agree to it? And if so, that they eat every bite? If they did, it would be no surprise if they gained 25 pounds, as Spurlock did, and had a skyrocketing cholesterol level. Note that he also limited his exercise during this period, although I would think simply eating far beyond the point where you feel full, several meals a day, would be the root cause for the bad effects he experienced. In other words, it's not the food itself but the quantity -- he ate an average of $28 worth of food each day, which (according to the price info I could find) means at least seven Big Mac value meals a day!

A different McDonald's month diet with different rules could easily have a very different result. I tend to agree with this woman who believes she can eat only at McDonald's for a month and lose weight. Her rules are pretty flexible but they definitely don't require her to super-size or eat every bite. Or to look at it another way, I suspect if I ate $28 worth of even Fresh Choice meals every day (particularly the pasta, muffins, etc) for 30 days I'd also gain weight.

This work is licensed under a Creative Commons Attribution 3.0 Unported License.