Monday, April 27, 2015

Management of Risk in Software Testing

I read a rather old book called "Technological Risk" while working on startup software test plans, and it has me thinking about the tradeoffs in a very abstract manner.

In the same way that there's a value for life whether one admits to putting a value on it or not, there's a value for bugs.  Whether we invest in road infrastructure or cellphone usage laws to save lives at home, or send aid to save lives in a foreign country, our personal and governmental expenditures put a value on life.

Investing in finding bugs has a similar implicit value: is a given amount of testing worth the likelihood of finding a bug?  Strict testing regimes that enforce a certain coverage at certain points in the release process have an attraction, but it should be recognized that a strict full test plan is not better; it merely compensates for a lack of information.

An ideal test process, from the perspective of maximizing effectiveness, would utilize as much information as possible about what could have gone wrong and what the costs of mistakes in various areas are.  For example, in a product involving user login, making and seeing posts, and changing display timezone:

  • If timezones are a new feature, the ideal test for this release would involve a lot of work on timezones, including changing the system time and covering different cases of daylight saving boundaries.
  • If the timezone code hadn't changed in months, the ideal release acceptance test would only perform a sanity check on timezones, not go to extremes like changing system dates.
  • Every test pass would confirm user login even if nothing had changed there, because the cost of breaking login would be high.  
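
As a rough sketch of the three cases above, the depth of testing for each area could be chosen from two inputs: whether the area changed in this release, and how costly a failure there would be.  The Area fields, the 1-to-5 cost scale, and the threshold of 4 are assumptions made up for illustration, not a calibrated risk model.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    changed_recently: bool  # did this release touch the area?
    failure_cost: int       # hypothetical scale: 1 (minor) to 5 (severe)


def choose_depth(area: Area) -> str:
    """Pick a test depth from change risk and cost of failure."""
    if area.changed_recently:
        return "deep"      # new or modified code: exercise the edge cases
    if area.failure_cost >= 4:
        return "standard"  # unchanged but costly to break: always confirm
    return "sanity"        # unchanged and cheap to break: quick check only


areas = [
    Area("timezones", changed_recently=True, failure_cost=2),
    Area("login", changed_recently=False, failure_cost=5),
    Area("posts", changed_recently=False, failure_cost=3),
]
plan = {area.name: choose_depth(area) for area in areas}
```

Even a crude rule like this uses more information than a fixed plan that tests every area to the same depth on every release.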

An ideal test process with full balancing of risk and cost is not achievable in the real world, but it's very easy to do better than a strict test plan.

Tuesday, April 21, 2015

HTTP error responses in APIs

I've seen a bunch of misuse of HTTP status responses in HTTP/JSON or HTTP/XML APIs.  It's not merely a matter of taste: interoperability can suffer when status responses are acted on by intermediaries or client libraries, forcing implementors to override or work around them.  Or maybe I'm just unusually bugged by these things.

Here are a couple of gotchas I've seen more than once.  Other opinions welcome.

Gotcha:  Using 401 for 403 and vice versa

401 means that the client is NOT authenticated, and in some client libraries and browsers, this could trigger an authentication request.  HTTP requires the server to send an authentication challenge in the WWW-Authenticate header along with a 401.  If the client is authenticated but doesn't have the right permissions, use 403 instead.  If the client is unauthenticated but it doesn't matter, because no authentication would make the request succeed, 403 is also valid.
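
Here is a minimal sketch of that decision, assuming a hypothetical handler that knows the authenticated user (or None) and the set of users allowed to touch the resource.  The function name, the Bearer scheme, and the realm value are placeholders, not part of any particular framework.

```python
from typing import Dict, Optional, Set, Tuple


def access_status(user: Optional[str],
                  allowed_users: Set[str]) -> Tuple[int, Dict[str, str]]:
    """Return (status code, extra response headers) for an access check."""
    if not allowed_users:
        # No credentials could make this succeed, so 403 is valid
        # even for an unauthenticated client.
        return 403, {}
    if user is None:
        # Not authenticated: a 401 must carry a WWW-Authenticate challenge.
        # The scheme and realm here are assumed for illustration.
        return 401, {"WWW-Authenticate": 'Bearer realm="api"'}
    if user not in allowed_users:
        # Authenticated but lacking permission: re-authenticating won't help.
        return 403, {}
    return 200, {}
```

For example, access_status(None, {"alice"}) yields a 401 with a challenge header, while access_status("bob", {"alice"}) yields a plain 403.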

Gotcha:  Using specific errors as general errors 

There are three errors whose English titles make them seem broadly useful:  406 "Not Acceptable", 412 "Precondition Failed", and 417 "Expectation Failed".  If your API has conditions that need to be met before other parts of the API can be used, it seems natural to use 412, right?  The problem is that all three are tied to specific HTTP features, and are almost never appropriate for application-level conditions in the standard kind of HTTP API.

  • "Not Acceptable" is tied to the Accept header.  Only if the request failed because of the client's Accept header choice might this be the right error to use.
  • "Precondition Failed" refers to precondition headers, which include If, If-Match, If-None-Match, and If-Modified-Since.  Use it only if the request failed because of one of these preconditions sent by the client.
  • "Expectation Failed" is similarly tied to the Expect header.  Use it only if the server cannot meet an expectation the client sent in that header.
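
One way to keep these codes honest is to derive them only from the request header that actually caused the failure.  The sketch below lists one representative header per status.  The function name and the 409 fallback for application-level conditions are assumptions for illustration; the right fallback depends on the API.

```python
from typing import Optional

# One representative request header for each of the three status codes.
HEADER_STATUS = {
    "Accept": 406,    # Not Acceptable
    "If-Match": 412,  # Precondition Failed
    "Expect": 417,    # Expectation Failed
}


def status_for_failure(failed_header: Optional[str]) -> int:
    """Map the request header that caused a failure to a status code.

    failed_header is None when the failure comes from the application's own
    rules rather than from an HTTP header the client sent.
    """
    if failed_header is None:
        return 409  # assumed fallback: the state of the resource blocks the request
    # A header outside the table falls back to a generic client error.
    return HEADER_STATUS.get(failed_header, 400)
```

With this shape, 412 can only come out when the client really sent a precondition header, never because an API-level prerequisite was not met.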

In addition, I've collected a bunch of useful habits around error response bodies.

Tip:  Use a response body on certain error responses.  

It's perfectly legitimate to put a JSON body on a 4xx or 5xx response.  The specification recommends it for some responses such as 409, where the error code indicates that the state of the resource makes the request fail, and it would sure help the client to know what the state of the resource is.  

Tip:  Include human-readable text

Boy does this help client developers.

Tip:  Include machine-parsable detail codes  

For example, if the API returns a number of 403 errors of different types, or 409 errors with different meanings, the client can handle those cases differently before even consulting the user.  In the following example, the client could use the "uploadInProgress" error code to know to automatically try again in 10 seconds, if that's how timeToWait was defined in the API documentation.

  409 Conflict
  Content-Type: application/json
  Content-Length: xxx

  { "error": {
     "text": "This object cannot be updated because an upload is currently being processed. Try again in a few seconds.",
     "errorCode": "uploadInProgress",
     "timeToWait": 10
  } }

Combining tips 2 and 3 produces a JSON response with an envelope element "error" that can be extended to include more things, and extensibility is definitely good here.
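
To make the two sides concrete, here is a small sketch of a server building that envelope and a client deciding whether to retry automatically.  The helper names build_error and retry_delay are hypothetical, and treating timeToWait as seconds is the assumption stated above, not something the format itself guarantees.

```python
import json
from typing import Optional


def build_error(text: str, error_code: str, **details) -> str:
    """Serialize an error envelope: human text, a machine code, and extras."""
    return json.dumps({"error": {"text": text, "errorCode": error_code, **details}})


def retry_delay(status: int, body: str) -> Optional[int]:
    """Return seconds to wait before retrying, or None to surface the error."""
    error = json.loads(body).get("error", {})
    if status == 409 and error.get("errorCode") == "uploadInProgress":
        return error.get("timeToWait")
    return None  # unknown code: show error["text"] to the user instead


body = build_error(
    "Upload in progress. Try again in a few seconds.",
    "uploadInProgress",
    timeToWait=10,
)
delay = retry_delay(409, body)
```

Because the extra fields live inside the "error" envelope, new ones can be added later without breaking clients that only read "text" and "errorCode".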

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.