Friday, October 04, 2013

AWS Persistence for Core Data

I like DynamoDB, and I like architecture that reduces the amount of backend engineering one needs to do in a company whose product is an app.  So I was quite interested to investigate AWS Persistence for Core Data (APCD, for lack of a better short name) in practice.

APCD Overview according to me

APCD is a framework you can install in an iOS app so that, when the app wants to save an object to the cloud, APCD does so silently in the background.  Not only does it save the object to the cloud, but changes made in the cloud can be magically synched back to the app's Core Data store.  There's a parallel framework for Android, which is promising for supporting that platform with the same architecture.

On the server end, if the server needs to do some logic based on client data, the server can access the DynamoDB tables and view or modify objects created by the applications.  In theory one doesn't have to design a REST/other interface for synchronizing client data to the server or to other clients. That's a significant savings and acceleration of development, so we read up on APCD around the Web and implemented it.

While there were a bunch of minor problems that we could have overcome, the primary one was: nowhere does Amazon seem to document how to architect an app to use AWS Persistence, or explain what it is for. In the main article linked above, the sample code and objects are "Checkin" and "Location".  But where's the context?  Are these Checkin and Location objects in the same table?  Is there one giant table for all data?  Does each client have its own private table, for a total of N tables?  Or are there two tables? Or 2N?  It really helps if new technology documentation includes some fully fleshed out applications to give context.  Full source code isn't even what I'm talking about, but at least tell us what the application does, why it's architected the way it is to use the new technology, and some other examples of what the new technology is for.

What I think APCD is for

Well, we recently put together a couple of facts that suggest what APCD is for.

  • You can't have more than 256 DynamoDB tables for an account, even when using APCD.  This limitation is very relevant to architectural choices made with APCD.*
  • If an installed app has the key to access any part of a table, the app can access the whole table, all objects.  There are no object-level permissions yet, and because the app accesses the data on DynamoDB through APCD, the server can't intercede to add permissions checking.

All right, so that tells us we can't architect the application so that each app instance saves its own table separate from other apps' tables.  We run out of table space at 256 installed users, if not sooner.  It also tells us that if apps are going to share larger tables, the information in those tables has to be public information.
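
To make the table-count arithmetic concrete, here is a rough Python sketch.  The function names and the two-entity model (borrowed from the Checkin and Location example) are my own illustrative assumptions, not anything from Amazon's documentation or the APCD API.

```python
# Back-of-the-envelope check; all names here are hypothetical, not an AWS API.
TABLE_LIMIT = 256  # the per-account table cap mentioned above


def tables_needed(installs: int, entity_types: int, per_install: bool) -> int:
    """Tables required if every install gets private tables, or if all installs share."""
    return installs * entity_types if per_install else entity_types


def max_installs(entity_types: int, limit: int = TABLE_LIMIT) -> int:
    """Largest number of installs that fits under the cap with private tables."""
    return limit // entity_types


if __name__ == "__main__":
    # Two entity types, e.g. Checkin and Location.
    print(tables_needed(1000, 2, per_install=True))   # private tables: the "2N" case
    print(tables_needed(1000, 2, per_install=False))  # shared tables: just 2
    print(max_installs(2))                            # installs that fit under the cap
```

With two entity types and private tables, the cap is reached at 128 installs, which is the "if not sooner" case.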

So that suggests to me that APCD is for apps to synchronize shared public data.  For example, an application that crowd-sources information on safe, clean public bathrooms.

How my sample app would work

The crowd-sourced bathroom app could have all the bathrooms' data objects in one big table, and each instance of the application can contribute a new bathroom data object or modify an existing one.  A server can access the bathrooms data too, so Web developers could build a Web front-end that interoperates smoothly as long as the data model is stable.  

Now to use the service, even if the whole dataset is too large to download and keep, an app could query for bathrooms within a range of X kilometers or within a city limit, and even synchronize local data for offline use.  When the app boots up, it doesn't have to download local bathroom data again if it has done so before; instead, APCD is supposed to fill in new data objects matching the query and update the client with changes.
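
Here is a minimal Python sketch of what "bathrooms within X kilometers that changed since my last sync" means.  Everything in it is an assumption for illustration: the `Bathroom` record shape, the `updated_at` field, and the function names are hypothetical, and it filters an in-memory list rather than showing how APCD or DynamoDB would actually express such a query.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Bathroom:  # hypothetical record shape, not an APCD type
    id: str
    lat: float
    lon: float
    updated_at: int  # time of last change, seconds since epoch


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def changes_nearby(records, lat, lon, radius_km, last_sync):
    """Records within radius_km of (lat, lon) that changed after last_sync."""
    return [r for r in records
            if r.updated_at > last_sync
            and distance_km(lat, lon, r.lat, r.lon) <= radius_km]


if __name__ == "__main__":
    table = [
        Bathroom("a", 0.0, 0.0, updated_at=100),   # nearby, but unchanged since last sync
        Bathroom("b", 0.0, 0.01, updated_at=200),  # about 1.1 km away, changed
        Bathroom("c", 0.0, 1.0, updated_at=300),   # about 111 km away, out of range
    ]
    print([r.id for r in changes_nearby(table, 0.0, 0.0, 5.0, last_sync=150)])  # ['b']
```

The point is only that the client keeps what it already has and asks for the delta; the framework is supposed to do that bookkeeping for you.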

For security, we have to trust each app to identify users so we can identify and block simple bad actors (somebody using the app interface to insert false information), and we have to have some backup for dealing with the contingency where the app is completely hacked, its key is used to access the bathroom data, and somebody quite malicious destroys all the useful bathroom data.  

What we did

We ended up not using APCD because what we're building does not involve a shared public database. We have semi-private data objects shared among a small set of trusted users.  Doing that with APCD's limitations seemed too far off APCD's garden path of easy progress.

Is there a better way to use APCD? 


*   Yes, you can have the 256-table limit lifted, but not by much.  Not, say, to 1 million. That's not how DynamoDB is architected to work well.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.