Google App Engine and Java Application: Clustering Internet search results for a person
Aleksandar Kartelj
Faculty of Mathematics, University of Belgrade
Google App Engine
  Web application hosting service
  Designed for real-time dynamic apps
  Many simultaneous users
  Scalable
  Resources are paid for monthly
  Free account (5 million page views a month)
Sandboxing
  App can only read its own files
  App cannot manipulate environment variables
  App cannot access network facilities directly
  Requests are distributed non-deterministically
  Consequence: multiple apps can safely run on the same server, or one app on multiple servers
Architecture overview
Features
  The static file servers
  The Datastore
    – Not a join-query database
    – Most closely resembles an object database
  Entities and properties
    – An entity has 1..* properties
    – Not to be confused with rows in an RDBMS
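The difference between an entity and an RDBMS row can be sketched in plain Java. This is only an illustration under stated assumptions: `SketchEntity` and its methods are hypothetical names invented here, not the App Engine Datastore API. The point it tries to show is that two entities of the same kind need not share the same set of properties, which a fixed table schema would not allow.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a Datastore entity; NOT the App Engine API.
final class SketchEntity {
    private final String kind;
    // Properties are a name -> value map, so there is no fixed schema per kind.
    private final Map<String, Object> properties = new LinkedHashMap<>();

    SketchEntity(String kind) {
        this.kind = kind;
    }

    String getKind() {
        return kind;
    }

    void setProperty(String name, Object value) {
        properties.put(name, value);
    }

    Object getProperty(String name) {
        return properties.get(name);
    }

    boolean hasProperty(String name) {
        return properties.containsKey(name);
    }
}

final class EntitySketchDemo {
    public static void main(String[] args) {
        // Two entities of the same kind with different property sets.
        SketchEntity a = new SketchEntity("Person");
        a.setProperty("name", "Ana");
        a.setProperty("age", 30L);

        SketchEntity b = new SketchEntity("Person");
        b.setProperty("name", "Marko");
        b.setProperty("homepage", "http://example.org");

        System.out.println(a.getKind() + " has homepage: " + a.hasProperty("homepage"));
        System.out.println(b.getKind() + " has homepage: " + b.hasProperty("homepage"));
    }
}
```

In this sketch the kind plays a role loosely similar to a table name, but nothing forces every `Person` to carry the same properties.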
Features
  Queries and indexes
    – An index is made for every simple query
    – Query performance is affected only by the size of the result set
  Transactions
    – Optimistic concurrency control
    – Entity groups (entities updated in one transaction)
  Services: memcache, URL fetch, Mail, …
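Optimistic concurrency control, as named on this slide, can be sketched with a small in-memory store. This is a minimal illustration, assuming a per-key version number and a bounded retry loop; `OptimisticStoreSketch`, `Versioned`, `commit` and `update` are hypothetical names made up for this example and do not reflect how the Datastore is implemented or what its API looks like.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongUnaryOperator;

// Hypothetical in-memory store illustrating optimistic concurrency; NOT the Datastore API.
final class OptimisticStoreSketch {
    // A stored value together with the version at which it was written.
    static final class Versioned {
        final long value;
        final long version;

        Versioned(long value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> data = new HashMap<>();

    // Missing keys read as value 0 at version 0.
    synchronized Versioned read(String key) {
        Versioned v = data.get(key);
        return v != null ? v : new Versioned(0L, 0L);
    }

    // The commit succeeds only if nobody has written since expectedVersion was read.
    synchronized boolean commit(String key, long expectedVersion, long newValue) {
        if (read(key).version != expectedVersion) {
            return false; // conflict: the caller has to re-read and retry
        }
        data.put(key, new Versioned(newValue, expectedVersion + 1));
        return true;
    }

    // Read, compute, try to commit; retry a bounded number of times on conflict.
    boolean update(String key, LongUnaryOperator change, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Versioned current = read(key);
            if (commit(key, current.version, change.applyAsLong(current.value))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        OptimisticStoreSketch store = new OptimisticStoreSketch();
        Versioned stale = store.read("counter");      // remembered before another writer commits
        store.update("counter", v -> v + 1, 3);       // another writer wins first
        boolean accepted = store.commit("counter", stale.version, 100L);
        System.out.println("stale commit accepted: " + accepted);
    }
}
```

No lock is held while the new value is computed; a conflicting write is detected only at commit time, and the losing writer retries. That is the trade-off the term "optimistic" refers to.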
App Engine’s request handling
Building applications
App Engine dashboard
App Engine dashboard
Clustering search results
Clustering search results
EC2 vs GAE vs Azure
EC2 vs GAE vs Azure
Thank you. Aleksandar Kartelj Faculty of Mathematics, University of Belgrade