1 Web Search Interfaces. 2 Web Search Interface Web search engines of course need a web-based interface. Search page must accept a query string and submit.

Slides:



Advertisements
Similar presentations
WEB DESIGN TABLES, PAGE LAYOUT AND FORMS. Page Layout Page Layout is an important part of web design Why do you think your page layout is important?
Advertisements

1 Web Search Interfaces. 2 Web Search Interface Web search engines of course need a web-based interface. Search page must accept a query string and submit.
DT228/3 Web Development WWW and Client server model.
Copyright 2004 Monash University IMS5401 Web-based Systems Development Topic 2: Elements of the Web (g) Interactivity.
Browsers and Servers CGI Processing Model ( Common Gateway Interface ) © Norman White, 2013.
Servlets and a little bit of Web Services Russell Beale.
1 HTTP – HyperText Transfer Protocol Part 1. 2 Common Protocols In order for two remote machines to “ understand ” each other they should –‘‘ speak the.
CS320 Web and Internet Programming Handling HTTP Requests Chengyu Sun California State University, Los Angeles.
Web Servers How do our requests for resources on the Internet get handled? Can they be located anywhere? Global?
Liang, Introduction to Java Programming, Sixth Edition, (c) 2005 Pearson Education, Inc. All rights reserved Chapter 34 Servlets.
HTTP Hypertext Transfer Protocol. HTTP messages HTTP is the language that web clients and web servers use to talk to each other –HTTP is largely “under.
How the web works: HTTP and CGI explained
World Wide Web1 Applications World Wide Web. 2 Introduction What is hypertext model? Use of hypertext in World Wide Web (WWW) – HTML. WWW client-server.
Programming Project #3CS-4513, D-Term Programming Project #3 Simple Web Server CS-4513 D-Term 2007 (Slides include materials from Operating System.
Definitions, Definitions, Definitions Lead to Understanding.
HTTP Overview Vijayan Sugumaran School of Business Administration Oakland University.
1 The World Wide Web. 2  Web Fundamentals  Pages are defined by the Hypertext Markup Language (HTML) and contain text, graphics, audio, video and software.
CGI Programming: Part 1. What is CGI? CGI = Common Gateway Interface Provides a standardized way for web browsers to: –Call programs on a server. –Pass.
Common Gateway Interface
M. Taimoor Khan * Java Server Pages (JSP) is a server-side programming technology that enables the creation of dynamic,
NETWORK CENTRIC COMPUTING (With included EMBEDDED SYSTEMS)
Web technologies and programming cse hypermedia and multimedia technology Fanis Tsandilas April 3, 2007.
FALL 2005CSI 4118 – UNIVERSITY OF OTTAWA1 Part 4 Web technologies: HTTP, CGI, PHP,Java applets)
ITIS 1210 Introduction to Web-Based Information Systems Chapter 24 How Websites Work with Databases How Websites Work with Databases.
Simple Web Services. Internet Basics The Internet is based on a communication protocol named TCP (Transmission Control Protocol) TCP allows programs running.
COMP3016 Web Technologies Introduction and Discussion What is the Web?
1 HTML and CGI Scripting CSC8304 – Computing Environments for Bioinformatics - Lecture 10.
1 CS 3870/CS 5870 Static and Dynamic Web Pages ASP.NET and IIS.
Comp2513 Forms and CGI Server Applications Daniel L. Silver, Ph.D.
Server-side Scripting Powering the webs favourite services.
Simple Web Services. Internet Basics The Internet is based on a communication protocol named TCP (Transmission Control Protocol) TCP allows programs running.
CP476 Internet Computing Lecture 5 : HTTP, WWW and URL 1 Lecture 5. WWW, HTTP and URL Objective: to review the concepts of WWW to understand how HTTP works.
Robinson_CIS_285_2005 HTML FORMS CIS 285 Winter_2005 Instructor: Mary Robinson.
WebServer A Web server is a program that, using the client/server model and the World Wide Web's Hypertext Transfer Protocol (HTTP), serves the files that.
HOW WEB SERVER WORKS? By- PUSHPENDU MONDAL RAJAT CHAUHAN RAHUL YADAV RANJIT MEENA RAHUL TYAGI.
9 Chapter Nine Compiled Web Server Programs. 9 Chapter Objectives Learn about Common Gateway Interface (CGI) Create CGI programs that generate dynamic.
Chapter 5 HTTP Request Headers. Content 1.Request headers 2.Reading Request Headers 3.Making a Table of All Request Headers 4.Sending Compressed Web Pages.
Java CGI Lecture notes by Theodoros Anagnostopoulos.
10/13/2015 ©2006 Scott Miller, University of Victoria 1 Content Serving Static vs. Dynamic Content Web Servers Server Flow Control Rev. 2.0.
Active Server Pages  In this chapter, you will learn:  How browsers and servers interacted on the Internet when the Internet first became popular 
Website Development with PHP and MySQL Saving Data.
Chapter 6 Server-side Programming: Java Servlets
1 © Netskills Quality Internet Training, University of Newcastle HTML Forms © Netskills, Quality Internet Training, University of Newcastle Netskills is.
ITCS373: Internet Technology Lecture 5: More HTML.
Web Spiders Dan Reeves Bill Walsh HDIW EECS February 2000.
WWW: an Internet application Bill Chu. © Bei-Tseng Chu Aug 2000 WWW Web and HTTP WWW web is an interconnected information servers each server maintains.
1 WWW. 2 World Wide Web Major application protocol used on the Internet Simple interface Two concepts –Point –Click.
Jan 2001C.Watters1 World Wide Web and E-Commerce Client Side Processing.
2: Application Layer 1 Chapter 2: Application layer r 2.1 Principles of network applications  app architectures  app requirements r 2.2 Web and HTTP.
Fall 2000C.Watters1 World Wide Web and E-Commerce Clients & Client Side Processing.
HTTP How the Internet servers and clients communicate.
JS (Java Servlets). Internet evolution [1] The internet Internet started of as a static content dispersal and delivery mechanism, where files residing.
Fall 2000C.Watters1 World Wide Web and E-Commerce Servers & Server Side Processing.
©SoftMooreSlide 1 Introduction to HTML: Forms ©SoftMooreSlide 2 Forms Forms provide a simple mechanism for collecting user data and submitting it to.
Internet Applications (Cont’d) Basic Internet Applications – World Wide Web (WWW) Browser Architecture Static Documents Dynamic Documents Active Documents.
8 Chapter Eight Server-side Scripts. 8 Chapter Objectives Create dynamic Web pages that retrieve and display database data using Active Server Pages Process.
Summer 2007 Florida Atlantic University Department of Computer Science & Engineering COP 4814 – Web Services Dr. Roy Levow Part 1 – Introducing Ajax.
1 State and Session Management HTTP is a stateless protocol – it has no memory of prior connections and cannot distinguish one request from another. The.
Overview of Servlets and JSP
Data Communications and Computer Networks Chapter 2 CS 3830 Lecture 7 Omar Meqdadi Department of Computer Science and Software Engineering University of.
Data Communication EDA344, DIT420 Description of Lab 1 and Optional Programming HTTP Assignment Bapi Chatterjee Prajith R G.
Simple Web Services. Internet Basics The Internet is based on a communication protocol named TCP (Transmission Control Protocol) TCP allows programs running.
Fall 2000C.Watters1 World Wide Web and E-Commerce Servers & Server Side Processing.
HTML III (Forms) Robin Burke ECT 270. Outline Where we are in this class Web applications HTML Forms Break Forms lab.
Lesson 11. CGI CGI is the interface between a Web page or browser and a Web server that is running a certain program/script. The CGI (Common Gateway Interface)
Web Development Web Servers.
IS333D: MULTI-TIER APPLICATION DEVELOPMENT
Chapter 26 Servlets.
Web Search Interfaces.
Web Search Interfaces by Ray Mooney
Presentation transcript:

1 Web Search Interfaces

2 Web Search Interface Web search engines of course need a web-based interface. Search page must accept a query string and submit it within an HTML. Program on the server must process requests and generate HTML text for the top ranked documents with pointers to the original and/or cached web pages. Server program must also allow for requests for more relevant documents for a previous query.

3 Submit Forms HTML supports various types of program input in forms, including: –Text boxes –Menus –Check boxes –Radio buttons When user submits a form, string values for various parameters are sent to the server program for processing. Server program uses these values to compute an appropriate HTML response page.

4 Simple Search Submit Form Type input into the box Type input into the box

5 How To Handle Form Submissions? There are many ways of handling form submissions. Servlet (written in Java and other languages) that provides action on the server side, the opposite of Applet Apache Tomcat is an example of Java implementation jakarta.apache.org/tomcat/jakarta.apache.org/tomcat/ CGI: Common Gateway Interface We will write our own server that supports search

6 Basic Web Server Structure Server program creates a socket for connection. Server program waits for clients request for connection. Clients here typically are Web browser such as Netscape. Once the server receives a request, it examines the type of request and perform the service as requested. The server then sends the results back to the client, typically in an HTML format.

7 Code Example of a Simple Web Server See transparency for the code example Also at fall/code/javaServer/EasyWebServer.java fall/code/javaServer/EasyWebServer.java

8 Socket API in Java A socket is a communication point. Java has two types of socket, a ServerSocket that waits for clients to connect at a given port ServerSocket server = new ServerSocket(PORT); When a client (a browser) connects to a server, the server creates a socket to work with that client ( Socket sock = server.accept(); ) When the work is finished, the server closes the socket A server may work with many clients any any moment

9 Server-Client Communication When a browser connects to a server it sends a collection of information to the server. Here is an example GET / HTTP/1.0 Connection: Keep-Alive User-Agent: Mozilla/4.78 [en] (X11; U; SunOS 5.8 sun4u) Host: polaris:9999 Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */* Accept-Encoding: gzip Accept-Language: en Accept-Charset: iso ,*,utf-8

10 Server-Client Communication -- cont The first line is most important. It indicates the client requests a “ GET ” operation at the given path “/” When the server receives this request, it first checks to see if the request is a valid one. If it is, the server performs the service and returns the results to the client. If the request is a regular Web page, as the above example, the requested page is sent.

11 Server-Client Communication -- cont Code example (the method processHTTPCmd ) is on the transparency and at fall/code/javaServer/EasyWebServer.java fall/code/javaServer/EasyWebServer.java If the client is sending a form (typically a search request), the server has to process the form and extract the information from the the form. When the client sends a form, it is requesting to POST the form to the server

12 Server-Client Communication -- cont The header sent to the server looks as follows. POST /form HTTP/1.0 Referer: Connection: Keep-Alive User-Agent: Mozilla/4.78 [en] (X11; U; SunOS 5.8 sun4u) Host: polaris:9999 Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */* Accept-Encoding: gzip Accept-Language: en Accept-Charset: iso ,*,utf-8 Content-type: application/x-www-form-urlencoded Content-length: 44

13 Server-Client Communication -- cont Key differences from previous “ GET ” example: –The command is now “ POST ” –It has a “ Content-type ” and a “ Content-length ” component The server responds according to the header The request has a “ POST ” so the server knows an action is needed The request has a “ Content-type ” of form

14 Server-Client Communication -- cont The request has a “ Content-length ” so the server knows how long is the form. In our example, the length is 44 The server will read the form following the header from the client. The forms are sent in from the client in pairs of name=value separated by &. In our example, it looks as follows, 44 chars long. FirstInput=123&SecondInput=abc&Submit=Submit

15 Server-Client Communication -- cont How was this string formed? Check the HTML code for the form. Type input into the box Type input into the box

16 Server-Client Communication -- cont The server then parses out the form and act accordingly. In our sample program, we simply echo back the values filled in the form. In actual search engine, the parsed words will be used to retrieve the relevant documents. To parse the form input, we used the Java method StringTokenizer

17 Snapshots of the Sample Web Server

18 Snapshots of the Sample Web Server

19 Simple Search Interface Refinements Currently reprocesses query for “ More results ” requests. –Could store current ranked list with the user session. Could integrate relevance feedback interaction. Could provide “ Get similar pages ” request for each retrieved document (as in Google).Google –Just use given document text as a query.

20 Other Search Interface Refinements Highlight search terms in the displayed document. –Provided in cached file on Google.Google Allow for “advanced” search: –Phrasal search (“..”) –Mandatory terms (+) –Negated term (-) –Language preference –Reverse link –Date preference Machine translation of pages.

21 Clustering Results Group search results into coherent “clusters”: –“microwave dish” One group of on food recipes or cookware. Another group on satellite TV reception. –“Austin bats” One group on the local flying mammals. One group on the local hockey team. Vivisimo groups results into “folders” based on a pre-established categorization of pages (like Yahoo or DMOZ categories).Vivisimo Alternative is to dynamically cluster search results into groups of similar documents.

22 User Behavior Users tend to enter short queries. –Study in 1998 gave average length of 2.35 words. –A 2003 study result is similar Users tend not to use advance search options. Users need to be instructed on using more sophisticated queries.