INFO 344 Web Tools And Development CK Wang University of Washington Spring 2014
Announcements
- Azure and AWS credits have been sent
- Azure: see today's slides on Canvas
- AWS: Google how to use the credits
Programming Assignment #2
Build a query suggestion service
- My favorite programming assignment
Rewarding!
- Does something useful
- Does something you use every day
- 10 hours of work produces what looks like 10 days of results
- Challenging but FUN
Query Suggestion
- Every modern website that takes user input has this; it's a must-have
- Super useful
- Needs to be fast
Final Product
- AJAX web service hosted on Azure, super scalable
- Written in C#, following best practices and the things taught in class
- No SDKs or external libraries, except jQuery for AJAX
Great User Experience
- Must be AJAX
- Must be fast, i.e. your web service needs to return results in < 100 ms
Deliverables
- Due on April 30, 11pm PST
- Submit on Canvas
- Please submit the following as a single zip file:
  - Readme.txt with the URL of your Azure instance and your GitHub repo
  - Screenshot of your Azure dashboard with the instance running (azure-compute.jpg)
  - C# source code (we should be able to run this locally)
  - Write-up explaining how you implemented everything; make sure to address each of the requirements (writeup.txt, ~500 words)
  - Extra credit: a short paragraph in extracredits.txt for each extra credit feature (how to see/trigger/evaluate/run it and how you implemented it)
Start Early!!! Today!!! This is the hardest programming assignment all quarter
Hint
- Google "trie"; you will need one to store your data for fast retrieval
- Your algorithm will likely be recursive (depth-first search)
- Trim results to 10 to improve performance
- If you run out of memory, store as much data as you can!
- Only store titles containing just a-z, A-Z, and spaces; ignore everything else
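The hints above can be put together in a small sketch. It is written in Python only to keep it short (the assignment itself must be in C#), and it is one possible shape, not the required solution. The names `Trie`, `TrieNode`, `suggest`, and `MAX_RESULTS` are made up for illustration, and lowercasing titles before storing them is an assumption the slides do not state.

```python
import re

MAX_RESULTS = 10  # the hint suggests trimming results to 10

# Keep only titles made of a-z, A-Z and spaces, as the hint says.
VALID_TITLE = re.compile(r"[A-Za-z ]+")


class TrieNode:
    def __init__(self):
        self.children = {}     # maps one character to the next TrieNode
        self.is_title = False  # True when a stored title ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, title):
        """Store a title; return False if it contains any other character."""
        if not VALID_TITLE.fullmatch(title):
            return False
        node = self.root
        for ch in title.lower():  # assumption: matching is case-insensitive
            node = node.children.setdefault(ch, TrieNode())
        node.is_title = True
        return True

    def suggest(self, prefix, limit=MAX_RESULTS):
        """Return up to `limit` stored titles that start with `prefix`."""
        prefix = prefix.lower()
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        self._collect(node, prefix, results, limit)
        return results

    def _collect(self, node, path, results, limit):
        # Recursive depth-first search that stops once `limit` titles are found.
        if len(results) >= limit:
            return
        if node.is_title:
            results.append(path)
        for ch in sorted(node.children):
            if len(results) >= limit:
                return
            self._collect(node.children[ch], path + ch, results, limit)


trie = Trie()
for t in ["Seattle", "Seahawks", "Sea otter", "Search engine", "C# 5.0"]:
    trie.insert(t)  # "C# 5.0" is skipped because of '#', '5', '.' and '0'
print(trie.suggest("sea"))
```

Stopping the search as soon as 10 results are collected is what keeps a short prefix such as "s" from walking the whole subtree.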
Big O notation
- Run-time complexity as 'n' increases
- n = number of elements
- Common ones are O(n), O(n^2), O(log n), O(n log n)
- Some proofs may be complicated (see the algorithms class)
- Linear search = O(n)
- Binary search = O(log n)
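To make the linear vs. binary search comparison concrete, here is a short Python sketch that counts how many elements each search inspects. The function names and the `comparisons` counter are illustrative, not part of the course material.

```python
def linear_search(items, target):
    """Return (index, comparisons); index is -1 if target is absent."""
    comparisons = 0
    for i, item in enumerate(items):
        comparisons += 1
        if item == target:
            return i, comparisons
    return -1, comparisons


def binary_search(sorted_items, target):
    """Return (index, comparisons) on a sorted list; index is -1 if absent."""
    lo, hi = 0, len(sorted_items) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1  # one element inspected per loop iteration
        if sorted_items[mid] == target:
            return mid, comparisons
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, comparisons


data = list(range(1_000_000))
lin_index, lin_count = linear_search(data, 999_999)
bin_index, bin_count = binary_search(data, 999_999)
print("linear search inspected", lin_count, "elements")
print("binary search inspected", bin_count, "elements")
```

On a million sorted elements, the linear search inspects every element in the worst case, while the binary search never needs more than 20 steps, because 2^20 is already larger than a million.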
Big O for PA2
- Data = millions of titles/phrases!
- Linear search, O(n): too slow!
- Trie search, O(log n): very fast!
- The log base is likely 26 (or 27, counting the space)
- 10 instructions in O(n) search 10 titles...
- 10 instructions in O(log n) search 26^10 titles!
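The arithmetic behind these numbers can be checked with a few lines of Python. The helper `levels_needed` is a made-up name, and the model is idealized: it assumes titles fill the 26-way tree evenly, which real titles do not. In practice a trie lookup costs roughly one step per character of the prefix typed.

```python
ALPHABET = 26  # a-z; use 27 if the space character is counted too


def levels_needed(n, branching=ALPHABET):
    """Smallest depth d such that branching**d >= n (integer arithmetic only)."""
    depth = 0
    capacity = 1
    while capacity < n:
        capacity *= branching
        depth += 1
    return depth


# 10 steps down a 26-way tree can distinguish 26^10 (about 1.4 x 10^14) keys.
print(ALPHABET ** 10)

# Under the same idealized model, a few million titles need only a few levels.
print(levels_needed(5_000_000))
```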
Always use Trie?
- 10 vs. 26^10. OMG. Always use Trie!
- Not for free! Space-time tradeoff
- Linear search, O(n): less than 500 MB
- Trie, O(log n): maybe over 10 GB to fit everything!
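A rough way to see where the extra memory goes is to count trie nodes and multiply by a per-node cost. The sketch below does that in Python. The byte figures are assumptions chosen for illustration (a hypothetical 64-bit node holding 27 child references), not measurements of any real runtime.

```python
def count_trie_nodes(titles):
    """Count the nodes a character trie needs for `titles` (root included)."""
    root = {}
    nodes = 1
    for title in titles:
        node = root
        for ch in title:
            if ch not in node:
                node[ch] = {}
                nodes += 1
            node = node[ch]
    return nodes


# Assumed costs, for illustration only:
BYTES_PER_CHAR = 2            # one UTF-16 character in a flat list of strings
BYTES_PER_NODE = 27 * 8 + 24  # 27 child references plus rough object overhead

titles = ["sea", "seat", "seattle", "search"]
chars = sum(len(t) for t in titles)
nodes = count_trie_nodes(titles)
print("flat list:", chars * BYTES_PER_CHAR, "bytes for", chars, "characters")
print("trie:     ", nodes * BYTES_PER_NODE, "bytes for", nodes, "nodes")
```

Shared prefixes mean the trie has fewer nodes than the list has characters, but each node is far heavier than a single character, which is why the trie ends up using much more memory overall.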
Challenge
- The code for this assignment is hard, so start early!
- If you can write the code in 4 hours, you will pass any coding interview
- But this will likely take 20-30 hours, especially if you've never heard of a trie
- You have 2 weeks for this assignment, so start now!
- This will be fun, and well worth the time it takes to learn
Extra Credit
- [10pts] Popularity (page view) based query suggestion*
- [10pts] Hybrid List & Trie data structure (convert to Trie after > X entries in node)
- [10pts] Handle misspelling gracefully**
- [5pts] Query suggestion based on user searches
- These are actually really, really fun
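As one possible reading of the hybrid list and trie idea, the Python sketch below keeps a plain list of remaining suffixes in each node and converts the node to child pointers only once it holds more than X entries. `HybridNode`, `THRESHOLD`, and the value 3 are all hypothetical; the slides do not say what X should be or how the structure must be laid out.

```python
THRESHOLD = 3  # hypothetical value for "X"


class HybridNode:
    def __init__(self):
        self.suffixes = []    # list mode: remaining text of each title
        self.children = None  # trie mode: dict of character -> HybridNode
        self.is_title = False

    def insert(self, suffix):
        if self.children is not None:
            self._insert_into_children(suffix)
            return
        if suffix not in self.suffixes:
            self.suffixes.append(suffix)
        if len(self.suffixes) > THRESHOLD:
            self._convert()

    def _convert(self):
        # Switch from list mode to trie mode, re-inserting the stored suffixes.
        pending, self.suffixes = self.suffixes, []
        self.children = {}
        for s in pending:
            self._insert_into_children(s)

    def _insert_into_children(self, suffix):
        if suffix == "":
            self.is_title = True
            return
        child = self.children.setdefault(suffix[0], HybridNode())
        child.insert(suffix[1:])

    def suggest(self, prefix, limit=10, path=""):
        """Return up to `limit` titles under this node that start with `prefix`."""
        if self.children is None:  # list mode: scan the short list
            found = [path + s for s in self.suffixes if s.startswith(prefix)]
            return found[:limit]
        if prefix:                 # trie mode: follow one character
            child = self.children.get(prefix[0])
            if child is None:
                return []
            return child.suggest(prefix[1:], limit, path + prefix[0])
        results = [path] if self.is_title else []
        for ch in sorted(self.children):
            if len(results) >= limit:
                break
            results += self.children[ch].suggest("", limit - len(results), path + ch)
        return results[:limit]


root = HybridNode()
for t in ["sea", "seat", "seattle"]:
    root.insert(t)
before = root.suggest("sea")  # root is still a plain list here
root.insert("search")         # fourth entry pushes the root past THRESHOLD
after = root.suggest("sea")
print(before, after)
```

Sparse branches stay as cheap lists, and only the crowded parts of the structure pay the per-node cost of a trie.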
Questions?