INFO 344 Web Tools And Development CK Wang University of Washington Spring 2014.


Announcements
Monday = no class
But I will host extra office hours
– 12 noon to 3pm
– MGH commons
– Great opportunity to get PA4 help

Programming Assignment #4

Connecting Everything

Anatomy
[Architecture diagram. Legend: Red = Storage, Blue = Compute. Assignment labels: PA1, PA2, PA3.]
– Web Role: QuerySuggest, Search.aspx, Dashboard.aspx, Admin.asmx
– Worker Role: Crawler
– Azure Blob: QuerySuggest
– Azure Queue: URLs to Crawl
– Azure Table: Web Index
– AWS RDS: Structured Data (NBA stats)
– Azure Table: Ranking
– Azure Blob: User Logs
– Other labels: User, query, query suggestions, URLs, "word, URLs", Wiki dataset, query stats
This is basically how Google works!

Final Product
On Azure & AWS
[Screenshot labels: Admin.asmx; Dashboard.aspx; Worker Role = crawler; Search.aspx; Query suggestion (web role); Results Page (web role); Results from table storage; Results from cnn.com (table storage); example query "LeBron James" with LeBron James stats; PA1, PA2, PA3]
Full Google Experience!

Connecting PA1, PA2, PA3
PA1 => change to support JSONP
– Only 1 result, only exact matches
PA2 => This will be our core front-end for PA #4
– Add code to query PA #1’s API to grab NBA players
– Add code to query Table Storage for indexed sites (from crawler)
– Add code to rank results (LINQ)
– Show results to end user
– Add query suggestion admin stats to Dashboard
PA3 =>
– Change Table storage to map words in the title to the URL, instead of the current mapping of page URL to title/date. For example, if the title is “Microsoft goes IPO” then the keys should be “microsoft”, “goes”, “ipo” and the value is the pair. This is a simplified inverted index.
– 1 word will likely map to multiple URLs
– Case insensitive!
– Still only cnn.com & Sports Illustrated
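The simplified inverted index described for PA3 can be sketched as follows. This is a language-neutral illustration in Python, not the assignment's C# / Azure Table code: `build_index`, the in-memory dictionary standing in for Table storage, the example URLs, and the choice to split titles on non-alphanumeric characters are all assumptions made for the sketch.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase title word to the set of URLs whose title contains it.

    `pages` is an iterable of (url, title) pairs. A real crawler would write
    these rows to table storage; a dict is used here purely for illustration.
    """
    index = defaultdict(set)
    for url, title in pages:
        # Lowercasing makes lookups case insensitive; splitting on
        # non-alphanumeric characters is an assumption, not a requirement.
        for word in re.findall(r"[a-z0-9]+", title.lower()):
            index[word].add(url)
    return index


# Hypothetical pages for demonstration only.
pages = [
    ("http://example.com/a", "Microsoft goes IPO"),
    ("http://example.com/b", "Microsoft earnings"),
]
index = build_index(pages)
```

Here the key "microsoft" maps to both URLs, which is the "1 word will likely map to multiple URLs" case from the slide.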

Caching, Monetization, Ranking
Caching
– Web role has cache, size = 100 rows
– Just use a dictionary
Monetization
– Google Adsense!
Ranking
– Sort by #keyword matches
– Then by date
– Only use LINQ, 1 statement!

Start Now! Less than 2 weeks! No Late Days

Deliverables
Due on Jun 2, 11pm PST – NO LATE DAYS
Submit on Canvas
Please submit the following as a single zip file:
– URL to your Azure instance hosting this website, in readme.txt
– URL to your GitHub repo (share your GitHub with me & TA), in readme.txt
– Visual Studio 2013 project & source code
– Make sure crawling is complete (or has crawled a bunch of pages)
– Write-up explaining how you implemented everything. Make sure to address each of the requirements, in writeup.txt (~500 words)
– Extra credits: a short paragraph in extracredits.txt for each extra credit (how to see/trigger/evaluate/run your extra credit feature and how you implemented it)

Hint
Probably easier to start from PA3
Worker Role = same as PA3
– except maybe the write to Azure Table part
Web Role = PA2 + PA3
Re-launch AWS
– Make sure you do Single AZ + Micro instance!!!

Hint
Start early
Ask on the discussion forum early

Extra Credit
[up to 10pts] Beautiful search results page
[10pts] Show body snippet in results page with query words bolded
[5pts] Learn ranking from user clicks on URLs
– Still 1 LINQ query for all ranking
[5pts] Google instant (AJAX, every keystroke in query box => update results page)*
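One possible way to bold query words in a body snippet is sketched below in Python. The function name `bold_query_words`, the use of `<b>` tags, and whole-word, case-insensitive matching are assumptions for illustration; the slide does not prescribe any of them.

```python
import html
import re


def bold_query_words(snippet, query):
    """Return HTML for `snippet` with each whole-word query match in <b> tags."""
    words = [re.escape(w) for w in query.split()]
    if not words:
        return html.escape(snippet)
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    # With one capture group, re.split alternates between unmatched text
    # (even positions) and matched words (odd positions).
    parts = pattern.split(snippet)
    return "".join(
        f"<b>{html.escape(part)}</b>" if i % 2 else html.escape(part)
        for i, part in enumerate(parts)
    )


highlighted = bold_query_words("LeBron James scored 30 points", "lebron james")
```

Escaping each piece separately keeps characters such as `<` in the page text from being interpreted as markup in the results page.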

What if my other PAs aren’t working?
Start ASAP!
PA1 = Not too hard
PA2 = Instead of a trie, just use a Dictionary<string, List<string>> where key = first 3 characters of the input query and the List holds the titles starting with those characters
PA3
– Focus on the URL queue and getting the sitemap into that queue, then getting page title/words into table storage. Don’t worry about the other stuff.
This will be much, much more important: PA4 is basically our last assignment + final exam + final project
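The dictionary fallback for PA2 might look like the following Python sketch (the assignment itself would use a C# dictionary). The names `build_prefix_table` and `suggest`, the `limit` parameter, lowercasing of keys, and returning nothing for queries shorter than 3 characters are all assumptions made for illustration.

```python
from collections import defaultdict

PREFIX_LEN = 3  # from the slide: key = first 3 characters of the input query


def build_prefix_table(titles):
    """Group titles by their lowercase first PREFIX_LEN characters."""
    table = defaultdict(list)
    for title in titles:
        table[title[:PREFIX_LEN].lower()].append(title)
    return table


def suggest(table, query, limit=10):
    """Return up to `limit` titles that start with `query` (case insensitive)."""
    q = query.lower()
    if len(q) < PREFIX_LEN:
        return []  # assumption: no suggestions until 3 characters are typed
    candidates = table.get(q[:PREFIX_LEN], [])
    return [t for t in candidates if t.lower().startswith(q)][:limit]


# Hypothetical titles for demonstration only.
table = build_prefix_table(["Seattle", "Seahawks", "Search engine", "Microsoft"])
```

Compared with a trie, this trades memory and flexibility for simplicity: each lookup scans only the bucket sharing the query's first three characters.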

Secret
I’ll weigh PA4 slightly more if I see a huge improvement. Don’t give up : )

Questions?