Introduction to Google AppEngine Development in Java
Philippe Beaudoin (Track Sponsor)

Overview
What is Google AppEngine
Hit the ground running: Your first GAE app
The datastore
Task Queues
Overview of other services

What is Google AppEngine
Fly me to the cloud, baby!
PaaS (Platform as a Service)

Platform features
– Not Infrastructure as a Service (e.g. Amazon AWS)
– Sandboxed
– NoSQL distributed datastore + memcache
– Many services (task queues, image manipulation…)
APIs for…
– Java (and other JVM languages)
– Python
– Go (beta)

Drawbacks
– Not all of Java is available (sandbox)
– Only communicate via HTTP/HTTPS
– Read-only file system
– Must finish processing within 30 seconds (except for the new backends)
Advantages
– It will scale!
– Pay only for what you need
– Super-easy and fast deployment

Billing model
Free!
– 500 MB storage
– 5M page views/month
Billing enabled

Admin console

Hit the ground running: Your first GAE app

The datastore
NoSQL (roughly: a fancy hashmap)
Optimistic concurrency
– No locking: try first, fail later if things go wrong
Multitenancy
Memcache (use it!)
Default Java API: JDO or JPA
– Take cover!
Much better: Objectify (OSS project)

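The "try first, fail later" idea can be sketched in plain Java. This is an in-memory illustration only, not the datastore or Objectify API: the class `OptimisticCell`, its `update` method and the `maxAttempts` parameter are hypothetical names invented for this example.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** In-memory stand-in for one stored entity (hypothetical, NOT the datastore API). */
class OptimisticCell<T> {
    private final AtomicReference<T> stored;

    OptimisticCell(T initial) {
        stored = new AtomicReference<>(initial);
    }

    T read() {
        return stored.get();
    }

    /**
     * Applies a change without ever taking a lock. The commit succeeds only if
     * nobody else committed since our read; otherwise we retry on fresh data.
     */
    T update(UnaryOperator<T> change, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            T seen = stored.get();                   // 1. read a snapshot
            T next = change.apply(seen);             // 2. compute the new value
            if (stored.compareAndSet(seen, next)) {  // 3. commit only if unchanged
                return next;
            }
            // Another writer won the race: loop and try again.
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }
}
```

A caller would write something like `cell.update(v -> v + 1, 3)`. The cost of a conflict is a retry rather than a blocked thread, which is why heavy contention on one entity hurts.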
Transactions
Each entity has an entity group
By default, each entity is in a group of its own
Transactions are restricted to one entity group
By default, transactions are restricted to a single entity!

Everything in one Entity Group?
Bad idea!
– AppEngine limits the number of requests per second on a given entity group.
– Contention: every request wants the same group
So design your entity groups carefully
– The default (each entity in its own group) often works

Entity groups design
Decide them when designing your data model
– Address and phone have coupled transactions? Same group!
Grouping is a bit weird: choose one entity as the parent of the other.
– Not necessarily a “real” parent-child relationship
– No “cascading deletes”

The components

Persistable Entities
Basic Java types (String, int, etc.)
Date
Blob (binary), GeoPt, PhoneNumber, etc.
Key
List<>, Set<>, or arrays of the above
Any Serializable
Entity

Accessing entities
put(), get(), query(), delete()

Optimization
Get/put/delete multiple keys at once
Asynchronous fetches for concurrent queries
Fetch only keys when needed

Using the memcache
Simply annotate the entity
Optimized for read-intensive apps
– Write-through
Caches negative results too
Caches get(), put() and delete(), not query()
Can go out of sync in some situations
– Not for entities requiring rigorous data integrity

Useful trick: counters
Remember: AppEngine limits the number of requests per second on a given entity group.
A counter means many “read-increment-write” cycles!
The trick: a sharded counter
– Create many counters
– Pick one randomly, increment it
– At the end, collect them all and sum them up
Why does it work? AppEngine has very fast reads!
Same idea: sharded lists

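The three steps of the sharded counter can be sketched as follows. This is a plain-Java, in-memory illustration: each slot of the array stands in for a separate counter entity in its own entity group, and the class name `ShardedCounter` is hypothetical. On AppEngine each shard would be a stored entity updated in its own transaction.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

/** In-memory sketch of a sharded counter (hypothetical, NOT the datastore API). */
class ShardedCounter {
    // One slot per shard; on AppEngine each slot would be its own entity.
    private final AtomicLongArray shards;

    ShardedCounter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        shards = new AtomicLongArray(shardCount);
    }

    /** Picks one shard at random and increments only that shard. */
    void increment() {
        int pick = ThreadLocalRandom.current().nextInt(shards.length());
        shards.incrementAndGet(pick);
    }

    /** Reads every shard and sums them: many cheap reads, no write contention. */
    long total() {
        long sum = 0;
        for (int i = 0; i < shards.length(); i++) {
            sum += shards.get(i);
        }
        return sum;
    }
}
```

More shards spread the writes over more entity groups, at the price of more reads when computing the total.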
Queries
NoSQL: only allows efficient queries
– No table scans
– No in-memory sorts
– No joins (almost)
Queries only allowed on indexed properties
Allows “merge-join” queries
– Horn + 4 legs + Spotted = Cows!

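One way to picture a merge-join: each equality filter yields a list of entity ids already sorted by its index, and the query walks the lists in step, keeping only ids present in all of them. The sketch below is plain Java over in-memory lists, under the assumption that every list is ascending and free of duplicates. It illustrates the idea only and makes no claim about how the datastore implements it; `MergeJoin.intersect` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative merge-join over sorted id lists (hypothetical helper). */
class MergeJoin {
    /** Intersects ascending, duplicate-free id lists, one list per equality filter. */
    static List<Long> intersect(List<List<Long>> indexScans) {
        List<Long> result = new ArrayList<>();
        if (indexScans.isEmpty()) {
            return result;
        }
        int[] pos = new int[indexScans.size()];
        while (true) {
            // The largest current head is the smallest id that could still match everywhere.
            long candidate = Long.MIN_VALUE;
            for (int i = 0; i < pos.length; i++) {
                List<Long> scan = indexScans.get(i);
                if (pos[i] >= scan.size()) {
                    return result;               // one list is exhausted: no more matches
                }
                candidate = Math.max(candidate, scan.get(pos[i]));
            }
            // Skip every list forward to the candidate without revisiting earlier ids.
            boolean allMatch = true;
            for (int i = 0; i < pos.length; i++) {
                List<Long> scan = indexScans.get(i);
                while (pos[i] < scan.size() && scan.get(pos[i]) < candidate) {
                    pos[i]++;
                }
                if (pos[i] >= scan.size()) {
                    return result;
                }
                if (scan.get(pos[i]) != candidate) {
                    allMatch = false;
                }
            }
            if (allMatch) {
                result.add(candidate);
                for (int i = 0; i < pos.length; i++) {
                    pos[i]++;
                }
            }
        }
    }
}
```

With one list for "has horns", one for "4 legs" and one for "spotted", the result holds only the ids found in all three, and no list is ever scanned twice.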
Indexes
List of entities sorted by a given field
– Allows fast dichotomic (binary) search
By default all fields are indexed; annotations control that
– Use them on fields or classes
Manual indexes
– Specify more than one field to sort
– Ex: Sort by gender then by age
– Needed for some complex queries

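To see why a sorted index makes range filters cheap, here is a minimal sketch of a dichotomic search over an in-memory sorted array. The array stands in for an index on a single field such as age. `SortedIndex`, `firstAtLeast` and `countInRange` are hypothetical names, and `countInRange` assumes `minInclusive <= maxExclusive`.

```java
/** Binary search over a sorted array standing in for a single-field index. */
class SortedIndex {
    /** Returns the position of the first value >= target (or the length if none). */
    static int firstAtLeast(int[] sortedValues, int target) {
        int lo = 0;
        int hi = sortedValues.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;       // halve the search interval each step
            if (sortedValues[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Counts values v with minInclusive <= v < maxExclusive using two searches. */
    static int countInRange(int[] sortedValues, int minInclusive, int maxExclusive) {
        return firstAtLeast(sortedValues, maxExclusive) - firstAtLeast(sortedValues, minInclusive);
    }
}
```

Both ends of the range are located in a logarithmic number of steps, and everything between them matches, so no table scan is needed.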
Complex queries
income > and age < 30
– Not allowed! Inequalities on two fields
gender = f and age > 18 and age < 25
– Allowed! Two inequalities on the same field
– Needs a manual index (equality + inequality): sort by gender, then by age
all, sorted by decreasing age
– Allowed, needs a manual index (reverse sort)

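The reason "equality + inequality on one field" works with a manual index is that, once rows are sorted by gender and then by age, all the matches sit next to each other. The following plain-Java sketch models such an index with a `TreeSet`. `CompositeIndex`, `Row` and `idsFor` are hypothetical names, the id is used only as a tie-breaker, and `idsFor` assumes `olderThan < youngerThan`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** In-memory model of an index sorted by gender, then age (hypothetical). */
class CompositeIndex {
    static final class Row {
        final String gender;
        final int age;
        final long id;

        Row(String gender, int age, long id) {
            this.gender = gender;
            this.age = age;
            this.id = id;
        }
    }

    // Sort order of the index: gender first, then age, then id to break ties.
    private static final Comparator<Row> ORDER =
            Comparator.comparing((Row r) -> r.gender)
                    .thenComparingInt(r -> r.age)
                    .thenComparingLong(r -> r.id);

    private final TreeSet<Row> rows = new TreeSet<>(ORDER);

    void put(String gender, int age, long id) {
        rows.add(new Row(gender, age, id));
    }

    /** Ids with the given gender and olderThan < age < youngerThan, in age order. */
    List<Long> idsFor(String gender, int olderThan, int youngerThan) {
        // Sentinel rows mark the two ends of one contiguous slice of the index.
        Row from = new Row(gender, olderThan, Long.MAX_VALUE);
        Row to = new Row(gender, youngerThan, Long.MIN_VALUE);
        List<Long> ids = new ArrayList<>();
        for (Row r : rows.subSet(from, false, to, false)) {
            ids.add(r.id);
        }
        return ids;
    }
}
```

An inequality on a second field (income, say) would not map to a single contiguous slice of this ordering, which gives some intuition for why such queries are refused.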
Optimization
Index only the fields you need
Index only the values you need!
– boolean admin;
Try to limit manual indexes
Avoid manual indexes on many List<> fields!
– Exploding indexes!

Useful trick: full text search
Full text search not available out of the box
Build a KeywordIndex table
– A KeywordIndex entity is a [keyword, entityId] pair
– Normalize keywords (uppercase, no diacritics…)
When searching for a prefix:
– ofy().query(Customer.class).filter("keyword >=", prefix).filter("keyword <=", prefix + "\ufffd").list();

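The normalization step and the prefix range can be tried out without a datastore. In this sketch a `TreeMap` plays the role of the index on the keyword field, and the range from `prefix` to `prefix + "\ufffd"` mirrors the two filters of the query above. `KeywordIndexSketch`, `normalize`, `add` and `searchPrefix` are hypothetical names for illustration.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** In-memory sketch of a [keyword, entityId] index (hypothetical, NOT Objectify). */
class KeywordIndexSketch {
    // Sorted by keyword, like an index on the keyword field.
    private final TreeMap<String, Set<Long>> byKeyword = new TreeMap<>();

    /** Uppercases and strips diacritics so "béarnaise" and "BEARNAISE" match. */
    static String normalize(String raw) {
        String decomposed = Normalizer.normalize(raw.trim(), Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toUpperCase(Locale.ROOT);
    }

    void add(String keyword, long entityId) {
        byKeyword.computeIfAbsent(normalize(keyword), k -> new TreeSet<>()).add(entityId);
    }

    /** Returns ids of all entities having a keyword that starts with the prefix. */
    Set<Long> searchPrefix(String prefix) {
        String p = normalize(prefix);
        Set<Long> ids = new TreeSet<>();
        // keyword >= p and keyword <= p + "\ufffd": one contiguous slice of the index.
        for (Set<Long> hit : byKeyword.subMap(p, true, p + "\ufffd", true).values()) {
            ids.addAll(hit);
        }
        return ids;
    }
}
```

The search prefix must go through the same normalization as the stored keywords, otherwise the range comparison misses entries.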
Task queues
Work outside of a user request
– Still needs to be initiated by a user request
Organize work into small, discrete tasks
– Again: split it up to make it scalable
10-minute deadline (instead of 30 s)
Examples:
– notifications
– schema migration

Push queue vs Pull queue
Push queue
– Meant to be consumed by your AppEngine app
– Dispatching managed by AppEngine
Pull queue (new!)
– Meant to be consumed externally (REST) or on an AppEngine backend
– Consume the tasks when you’re ready
– Your task consumer must scale!

Configuring a Pull Queue
In queue.xml

Enqueuing a task
Or: getQueue(name)

Mapper API
The “map” part of “mapreduce”
Run a task on all the entities of a given type in the datastore
Just configure the MapReduceServlet

Configure a new mapper

The mapper class

Launching the mapper

Overview of other services
URL Fetch
Mail / XMPP (Google Talk)
OAuth, OpenID, Google Accounts
Image manipulation
Channel API
Blobstore

URL Fetch
Access other web resources (REST APIs, etc.)
– Only HTTP/HTTPS
– Maximum deadline of 10 seconds
– Synchronous or asynchronous
Use standard java.net.URLConnection
Low-level AppEngine API has more features

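A minimal sketch of a synchronous fetch through the standard `java.net` classes, with timeouts set to the 10-second deadline mentioned above. Mapping the deadline onto the connect and read timeouts is an assumption made for this example, and `UrlFetchSketch`, `open` and `fetch` are hypothetical names. The low-level AppEngine API is not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Fetching a web resource with standard java.net classes (illustrative sketch). */
class UrlFetchSketch {
    // Assumption: mirror the 10-second maximum deadline with client-side timeouts.
    static final int DEADLINE_MILLIS = 10_000;

    /** Prepares a connection; nothing is sent over the network yet. */
    static HttpURLConnection open(String address) throws IOException {
        URL url = new URL(address);
        String protocol = url.getProtocol();
        if (!protocol.equals("http") && !protocol.equals("https")) {
            throw new IllegalArgumentException("only http/https are supported: " + address);
        }
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(DEADLINE_MILLIS);
        conn.setReadTimeout(DEADLINE_MILLIS);
        return conn;
    }

    /** Performs a GET and returns the response body decoded as UTF-8 text. */
    static String fetch(String address) throws IOException {
        HttpURLConnection conn = open(address);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling `UrlFetchSketch.fetch(...)` with an http or https address blocks until the response arrives or a timeout fires, so long-running calls still count against the request deadline.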
Channel API
Enables COMET on AppEngine
– Only for JavaScript clients

Channel API

Blobstore
Objects up to 2 GB
Useful for videos or large images
Users upload to the blobstore through a form
Can serve back the blob later

Conclusion
Platform-as-a-Service
– Learn the platform
– Constraints are often opportunities
– Work with it, not against it
Still in active development
– No HTTPS on custom domains
A great way to learn what is needed to build a scalable app.
– Try your hand at the free version!

Hands on?
Google Web Toolkit and the Model View Presenter Architecture
Tomorrow, 11am, Camstasia