Wednesday, June 17, 2009
I am the alpha dog!
Mind you, the Android app is a fairly trivial piece of programming next to the AppEngine middleware that does the actual parsing and updating of WikiTravel. Which is the way it should be. Resources are scarce on a smartphone, and both processing and battery power are relatively meagre; I expect the design pattern of choice is going to be "get as much of the work done as you can on your server farm, then communicate simple low-bandwidth stuff to the phone."
Ultimately, the phone isn't much more than a user interface. In fact, now that I think about it, my whole app follows the classic Model-View-Controller design: WikiTravel is the model, the smartphone is the view, and my AppEngine service is the controller. Huh. Plus ça change, plus c'est la même chose.
Anyway, a list of tips, tricks, and annoyances since last we met:
- Let's start off with the annoyances. Character encoding. This has never been my strong suit, and I all too often wind up trial-and-erroring a solution. What doesn't help here is that I've got URL encoding (eg "?" -> "%3F" in an HTTP request), HTML encoding (eg "&" -> "&amp;" in a web page) and actual character encoding (ASCII? Unicode? UTF-8?) and while I grok the first two, I always have a mental block with the last. I seem to have brute-forced something like a solution. We'll see. I expect it will rear its ugly head again.
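To keep those three layers apart, here is a minimal sketch that applies each of them to the same string. It assumes Python 3 (the code in this post targets Python 2, where the equivalent helpers live in different modules), and `encode_three_ways` is a made-up name for illustration.

```python
import html
from urllib.parse import quote, unquote

def encode_three_ways(text):
    """Return the URL-encoded, HTML-encoded and UTF-8-encoded forms of text."""
    url_encoded = quote(text)          # for use inside a URL: "?" -> "%3F"
    html_encoded = html.escape(text)   # for use inside a web page: "&" -> "&amp;"
    utf8_bytes = text.encode("utf-8")  # for the wire or disk: str -> bytes
    return url_encoded, html_encoded, utf8_bytes

url_form, html_form, byte_form = encode_three_ways("a&b?")

# Each layer has its own inverse, and they are not interchangeable.
assert unquote(url_form) == "a&b?"
assert html.unescape(html_form) == "a&b?"
assert byte_form.decode("utf-8") == "a&b?"
```

The first two turn text into other text; only the third changes the type, from characters to bytes, which may be why it is the one that causes the most confusion.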
Python, weirdly, for all its text-processing power, doesn't come with much in the way of built-in HTML escaping and unescaping. "cgi.escape" will do much, but, annoyingly, not everything in the way of escaping. As for unescaping, fuhgedaboudit. Fortunately, I found this neat little solution somewhere on the Internets:
#HTML unescape
#taken from the Internet
import re
from htmlentitydefs import name2codepoint

class Unescaper:
    #matches named (&amp;), decimal (&#38;) and hex (&#x26;) entities
    entity_re = re.compile(r'&(#?[A-Za-z0-9]+?);')

    def replace_entities(self, match):
        try:
            ent = match.group(1)
            if ent[0] == "#":
                if ent[1] == 'x' or ent[1] == 'X':
                    return unichr(int(ent[2:], 16))
                else:
                    return unichr(int(ent[1:], 10))
            return unichr(name2codepoint[ent])
        except:
            #unknown or malformed entity: leave it untouched
            return match.group()

    def html_unescape(self, data):
        return self.entity_re.sub(self.replace_entities, data)

- While I'm complaining, as a Java coder, I'm used to all objects having a "toString" representation, and thus, when logging or debugging, being able to put
"this is"+anObject+" and this is "+anOther
without worrying about syntax. But Python, otherwise a far superior text-processing language, won't let you do this: you have to put "str()" or "repr()" around objects you want to include in a string. For no apparent reason. Sigh.
- The good news is, the Java SDK now includes a perfectly acceptable HTML parser. I needed to parse attributes in a tag, didn't want to include a whole new JAR in my project for just that purpose, and really didn't want to write that code myself (it'd be like reinventing the wheel in Detroit.) org.xml.sax to the rescue:
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
class ListingHandler extends DefaultHandler {
    public void startElement(String namespaceUri, String localName, String qualifiedName,
            Attributes attributes) throws SAXException {
        for (int i=0; i<attributes.getLength(); i++) {
            String field = attributes.getLocalName(i);
            String value = attributes.getValue(i);
        }
    }
}
and you don't even have to fuss with XML namespaces if you don't want to (and here, I don't.)
- GAEUnit continues to be awesome. Android unit testing continues to be much clumsier. Yet another reason for AppEngine to do most of the tricky work.
- Optimization. Always a thorny issue. As a wise programmer once taught me:
- The first rule of optimization: don't do it.
- The second rule of optimization (For Experts Only): don't do it yet.
The Android SDK comes with three excellent documents - Designing for Performance, Designing for Responsiveness, Designing for Seamlessness - that describe best practices. On the one hand, I'm not following them as closely as I could; the ExpandableListView at the heart of my UI is filled with homegrown and relatively expensive ViewEntry objects, rather than arrays as they suggest. On the other, it makes the code a lot easier to read, and ... don't optimize yet. Thus far the app seems fast and responsive enough. Says me. We'll see what others think.
- HTTP GETs and POSTs. In case you'd like an example of how to do that fairly efficiently from Android, here ya go:
public static HttpResponse DoHttpPost(String target, ArrayList<NameValuePair> params)
        throws IOException
{
    try {
        HttpPost httpost = new HttpPost(target);
        UrlEncodedFormEntity entity = new UrlEncodedFormEntity
            (params, HTTP.DEFAULT_CONTENT_CHARSET);
        httpost.setEntity(entity);

        //configure our request
        HttpParams my_httpParams = new BasicHttpParams();
        HttpConnectionParams.setConnectionTimeout(my_httpParams, CONNECT_TIMEOUT);
        HttpConnectionParams.setSoTimeout(my_httpParams, SOCKET_TIMEOUT);
        HttpClient httpclient = new DefaultHttpClient(my_httpParams); //get http client with given params
        httpclient.getParams().setParameter("http.useragent", Settings.AppName);

        //upload set, let's try it
        Log.i("com.rezendi.wtw.Util", "Preparing to post "+params+" to "+target);
        HttpResponse response = httpclient.execute(httpost);
        Log.i("com.rezendi.wtw.Util", "Sent POST, got " + response.getStatusLine());

        //clear out the form data
        entity.consumeContent();
        return response;
    }
    catch (IOException ex)
    {
        Log.e("com.rezendi.wtw.Util", "Error posting "+ params +" to "+target, ex);
        throw (ex);
    }
}

public static String Encode(String toEncode) throws UnsupportedEncodingException
{
    return URLEncoder.encode(toEncode, "UTF-8");
}

public static String DoHttpGet(String queryUri) throws IOException
{
    HttpParams my_httpParams = new BasicHttpParams();
    HttpConnectionParams.setConnectionTimeout(my_httpParams, Util.CONNECT_TIMEOUT);
    HttpConnectionParams.setSoTimeout(my_httpParams, Util.SOCKET_TIMEOUT);
    HttpClient httpclient = new DefaultHttpClient(my_httpParams); //get http client with given params
    HttpGet httpget = new HttpGet(queryUri);
    try {
        Log.i("com.rezendi.wtw.Util","Opening HTTP GET connection to "+queryUri);
        HttpResponse response = httpclient.execute(httpget);
        InputStream is = response.getEntity().getContent();
        BufferedReader reader = new BufferedReader(new InputStreamReader(is), BUFFER_SIZE);
        StringBuilder sb = new StringBuilder();
        String line = null;
        while ((line = reader.readLine()) != null) {
            sb.append(line);
        }
        is.close();
        response.getEntity().consumeContent();
        String results = sb.toString();
        return results;
    }
    catch (IOException ex) {
        Log.e("com.rezendi.wtw.Util", "Error GETting from "+queryUri, ex);
        throw (ex);
    }
}

I recommend that you put these in one central Util class, as I did. Mind you, these are static methods on Util, and not (yet) synchronized ... meaning not thread-safe. At the moment I don't think that's an issue; all my HTTP connections are very-low-bandwidth and resolve in seconds, so I can't imagine two such calls realistically colliding unless the phone user has very fast fingers indeed. But famous last words, right? Live by the thread, die by the thread.
- In both the Android and AppEngine cases, I started off by working through the tutorials that come with the SDK, and expanded the app from there. In both cases I'm not sure this was such a good idea. I dunno, maybe it's just my OO background, but if I'm writing a Notepad application, I'd like to have an object model with a Note object. But maybe that's not really the Android way. Performance, responsiveness, seamlessness, and all that.
I did like the way they calved all direct DB access off to a DbHelper class, and I've expanded that. Here's a hint; have all your DbHelpers inherit from a common ancestor. Yeah, I know, favour composition over inheritance, but still, here it will save you a lot of time, and give you a central repository for database creation scripts and such. Also, I'm still not exactly sure when you're supposed to close your database objects. In "onPause()"? Immediately after accessing them? Don't bother and let the Activity superclass handle it? It seems unclear.
- I'm not sure of the best way to handle error handling in Android. I like to have a common superclass for all my UI objects and inherit error handling from it, but that ain't gonna happen here, as my Activities inherit from various different places on Android's Activity tree. I guess I'll build some sort of ErrorHandler and pass exceptions to it where I think they're most likely to occur? Kind of annoying that there's no single piece of code you can write to catch all exceptions before they hit Android's own error-handling, but I guess that's the price you pay for their heavily compository app system.
- Moving back to AppEngine: the memcache service is awesome. Speeds up performance and cuts down on resource consumption immensely. However, for your own good, create a "ClearCache" web handler, for debug purposes. I've twice now spent fifteen minutes wondering why a problem wasn't fixed before realizing that the erroneous data was still in the cache and had not been replaced.
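A debug-only cache-clearing handler can be very small. The sketch below is a plain WSGI app in Python 3 rather than anything tied to a particular App Engine handler class; `make_clear_cache_app` is a hypothetical name, and `cache` is assumed to be any object exposing a `flush_all()` method that returns a truthy value on success, as App Engine's `memcache` module does.

```python
def make_clear_cache_app(cache):
    """Build a WSGI app that flushes `cache` on every request (debug use only)."""
    def app(environ, start_response):
        # Assumption: flush_all() returns True on success, False otherwise.
        flushed = cache.flush_all()
        if flushed:
            status, body = "200 OK", b"cache cleared\n"
        else:
            status, body = "500 Internal Server Error", b"cache flush failed\n"
        headers = [("Content-Type", "text/plain"),
                   ("Content-Length", str(len(body)))]
        start_response(status, headers)
        return [body]
    return app
```

Mapped to a URL that only admins can reach, something like this would have turned those two fifteen-minute head-scratchers into a single page load.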
- Python's HTMLParser is very useful. However, it seems to have weird problems with ampersands. I think I've now end-run around those problems, in a clumsy kludgy way, but I fear they too may yet crop up again.
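For comparison, here is a small sketch of how Python 3's `html.parser.HTMLParser` treats ampersands. Python 3 is an assumption here; the parser used in this project is the Python 2 one, which may well behave differently. With `convert_charrefs` left on, entity references in text are decoded before `handle_data` sees them, and attribute values arrive already unescaped. The `ListingParser` class, `parse_listing` helper and sample markup are made up for illustration.

```python
from html.parser import HTMLParser

class ListingParser(HTMLParser):
    """Collect the attributes and text of a snippet of markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.seen_attrs = {}
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; values are already unescaped.
        self.seen_attrs.update(attrs)

    def handle_data(self, data):
        # Text may arrive in more than one piece, so collect and join later.
        self.chunks.append(data)

def parse_listing(markup):
    """Return (attributes, text) for a snippet of markup."""
    parser = ListingParser()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return parser.seen_attrs, "".join(parser.chunks)

attrs, text = parse_listing('<a href="x?a=1&amp;b=2">Fish &amp; chips</a>')
```

Here `attrs` holds the `href` with a bare `&` and `text` holds the decoded link text, so no hand-rolled unescaping step is needed on this path.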
- This project has included just enough JavaScript (maybe 100 lines) that I wonder if I should have used JQuery, and just little enough that I conclude that I probably shouldn't. Next time, maybe.
- I do believe that's all for now. See all y'all in beta, if not before.
Labels: Android, GET, HTTP, POST
So, for what it's worth, the things I needed to get through my head to break the blockage follow. Apologies if it's all "duh"; I had to have this explained to me in very simple words before I realized what I was getting wrong.
1. A 'normal' string ('dog') is a sequence of bytes; a Unicode string (u'dog') is a sequence of Unicode code points, like the letter C (code point U+0043) or 'LATIN SMALL LETTER LZ DIGRAPH' (code point U+02AB).
1a. Out in the world, some people use "Unicode" to mean "a string encoding capable of handling every Unicode character", like "UTF-8". Python never means that. Unicode is a type.
2. You can't exactly print Unicode strings or write them to files. You just call functions that require bytes but will helpfully coerce Unicode to bytestring invisibly. (Which is reasonable on the face of it, just as Python will do the implicit coercion to a float when you ask for "4.5 + 17", but...)
3. "Decode" means "convert string/bytes to Unicode", "encode" means "convert Unicode to string/bytes".
3a. "decode" and "encode" are both the sort of functions that coerce their input when needed. If you say
"dog".encode('utf-8')
Python will see you're calling "encode" on a non-Unicode string, convert it to Unicode, and then encode that as UTF-8.
3b. These implicit conversions always use the default encoding, set in site.py (and readable as sys.getdefaultencoding(), but you can't set it once the interpreter finishes startup). This default is almost always ascii, which means
'Bj\xc3\xb6rk'.decode('utf-8')
will be fine, but
'Bj\xc3\xb6rk'.encode('utf-8')
will throw a UnicodeDecodeError because, implicitly, you've said
'Bj\xc3\xb6rk'.decode(sys.getdefaultencoding()).encode('utf-8')
and the ASCII code table gives no guidance for the middle of that string.
4. This sucks, and is apparently fixed in Python 3.
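For what it's worth, here is a minimal sketch of the same example under Python 3, assuming its semantics where `str` is always Unicode and `bytes` is a separate type. There is no implicit coercion, so the mistake in 3b fails immediately instead of passing through a hidden ASCII decode. The helper name `encode_bytes_directly` is made up for illustration.

```python
raw = b'Bj\xc3\xb6rk'             # bytes: six of them
text = raw.decode('utf-8')        # str: five code points
encoded = text.encode('utf-8')    # back to the same six bytes

def encode_bytes_directly(data):
    """Mimic the mistake from 3b: calling encode() on undecoded bytes."""
    try:
        return data.encode('utf-8')
    except AttributeError:
        # Python 3 bytes objects have no encode() method at all,
        # so nothing can be silently decoded as ASCII first.
        return None

result = encode_bytes_directly(raw)
```

Decoding with the wrong codec, for example `raw.decode('ascii')`, still raises `UnicodeDecodeError`, but only because it was asked for explicitly.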