Sunday, June 7, 2009
GQL blues, or, the bloom is off the rose
And I've built the basic spine of my application - an Android app talking to an AppEngine service, which takes a few well-specified requests and feeds back a few tightly structured (and low-bandwidth) responses. The smartphone app never talks to WikiTravel directly; instead it goes through AppEngine, which can a) do all the heavy lifting, b) log and track what's going on, c) cache data more intelligently, d) provide another layer of control and indirection.
It's actually working pretty well; I'm updating WikiTravel successfully. (Well, from the emulator. I'm going to buy a real phone next week, and I expect to unearth more problems with the app when I do...) The UI is sinfully ugly, and right now it can only show you the well-formatted kind of WikiTravel listings to update, and there are a bunch of niggling things to fix and hardcoded strings to move to R.string and kludges to klean up, but by and large I'm pretty pleased with the progress.
However. I have found the first thing about the Android-AppEngine stack about which I have serious reservations. I'm talking about AppEngine's "GQL" data storage.
Now, on the one hand, this object database is quite cool. You can create, save, and write simple SQL-like queries for objects very easily. It reminds me, pardon me while I date myself, of the good old days of working with GemStone for Smalltalk, a wonderfully elegant (but ahead of its time, and hence cripplingly slow) object database for the world's purest OO language.
But if memory serves you could do a lot more with Gemstone than GQL. Especially when it comes to queries. Now, I'm biased from years of experience with SQL databases - I already know how to make them do what I want, whereas with GQL I have to work it out from a fairly thin foundation - but even so, there is much to be desired.
For instance: you can only have one inequality operator (eg "height < 5") in a query. Which means that "height > 3 AND height < 5" is actually impossible to perform. But wait, there's more; if there's an order by clause as well, it must refer to the field that uses the inequality operator. So "height < 5 ORDER BY emailAddress" is similarly impossible. Oh yes, and fetching more than 1000 rows at a time is equally impossible. As is counting all the rows that match a given WHERE clause. The official workaround for that last is to write a Sharded Counter, which, I think you'll agree, is orders of magnitude more complex than writing "SELECT COUNT(*)".
In general, getting around these restrictions means restructuring your object model heavily ... if you can get around them at all. OK, you denormalize databases for performance, and SQL has its quirks and blind spots too, but jeez, this is way past that: it's more like getting a car and being told "Listen, the steering wheel only turns right, so you have to make three right turns every time you want to go left."
I'm sure they have their reasons. Scaling being one. And for an app like mine, where the data-storage requirements are pretty simple and straightforward, this all works fine. But if I was building something seriously data-intensive? I would think not just twice but thrice before using AppEngine, at least until it supports a more fully-featured datastore.
Regardless, your point is well-taken. In fact, I'd go further: you gain absolutely nothing from tying yourself to these ridiculous restrictions, because the datastore doesn't work at all in its intended goal: automatic scaling.
AppEngine has a rule: if you try to run a query/transaction that takes > 10s to execute, they kill it and rollback; your app gets a DatastoreTimeoutException (at which point the best thing you can do is to probably just show an error to your users).
Read the mailing list and you'll find ubiquitous reports of DatastoreTimeoutExceptions for even trivial queries. (Including my current hobby horse, where I'm getting DTEs loading a single row by key!) http://bit.ly/9V3Z5 For more complex queries, you can try to add an index, but if that fails... you'll just have to present an error message to your users. :-(
I am more than a little taken aback to learn that the datastore may time out when loading a single row. Fortunately for me, my app is insert-lots, update-nothing, select-little, so I'm unlikely to be personally affected ... but still, they better fix that soon.
Subscribe to Post Comments [Atom]
Subscribe to Posts [Atom]