To query or not to query! Review of search techniques, methods and …tricks Part of this presentation is adapted from:
Web browsers, we all know them!
Statistics…just for fun! yesterday From
What about search engines?
The Journey of a QUERY…
The most basic search example: a query made of two words will find entries containing both words. It contains an invisible (implicit) operator called AND. It could also be written with the AND operator spelled out, or using symbols.
Attention here: if we want results containing any of the words, we use the operator OR.
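The difference between the implicit AND and the OR operator can be sketched in a few lines of Python. This is only an illustration over a made-up collection of three pages; the names `PAGES`, `search_and` and `search_or` are assumptions for the example, not part of any search engine.

```python
# Toy collection: page name -> page text (made-up data for illustration).
PAGES = {
    "page1": "red car for sale",
    "page2": "blue car rental",
    "page3": "red apples and pears",
}


def words(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def search_and(query):
    """Return pages containing ALL query words (implicit AND)."""
    terms = words(query)
    return sorted(name for name, text in PAGES.items() if terms <= words(text))


def search_or(query):
    """Return pages containing ANY query word (OR)."""
    terms = words(query)
    return sorted(name for name, text in PAGES.items() if terms & words(text))


print(search_and("car red"))  # only pages that contain both words
print(search_or("car red"))   # pages that contain at least one word
```

AND narrows the result list, OR widens it.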
Search Terms Match Exactly

If you search for …    … you won't find …
cheap                  inexpensive
tv                     television
effects                influences
children               kids
car                    automobile
Calif OR CA            California
SIMILAR WORDS MATCH (word variations or automatic stemming)
The query child bicycle helmet finds pages that contain words that are similar to some or all of your search terms, e.g., “child,” “children,” or “children's”; “bicycle,” “bicycles,” “bicycle's,” “bicycling,” or “bicyclists”; and “helmet” or “helmets.”
Stemming is a technique to search on the stem or root of a word that can have multiple endings.
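To make the idea of a stem concrete, here is a deliberately naive suffix-stripping sketch in Python. It is not the algorithm any real search engine uses; the suffix list and the function name `naive_stem` are assumptions chosen only so that the bicycle and helmet examples reduce to a common root.

```python
def naive_stem(word):
    """Reduce a word to a crude stem by stripping one common ending."""
    word = word.lower()
    # Drop a possessive ending first: "bicycle's" -> "bicycle".
    if word.endswith("'s"):
        word = word[:-2]
    # Strip the first matching suffix, keeping at least 3 letters of stem.
    for suffix in ("ists", "ist", "ing", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    return word


variants = ["bicycle", "bicycles", "bicycle's", "bicycling", "bicyclists"]
print({naive_stem(w) for w in variants})  # all variants share one stem
print(naive_stem("helmets"), naive_stem("helmet"))
```

A sketch this simple does not fold irregular forms: “children” keeps its own stem and is not reduced to “child,” which is one reason real stemmers are more elaborate.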
STOP WORDS
Some common words, called STOP WORDS (such as the, on, where, how, de, la, as well as certain single digits and single letters) generally don't add meaning to a search. Stop words appear on so many pages that searching for them usually doesn't help you find relevant results.
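Stop-word filtering can be pictured as a simple pass over the query before the search runs. The sketch below is an assumption-laden illustration: the `STOP_WORDS` set holds only the words named on the slide (real lists are longer), and it drops every single letter or digit, whereas real engines drop only certain ones.

```python
# Stop words named on the slide; real engines use much longer lists.
STOP_WORDS = {"the", "on", "where", "how", "de", "la"}


def remove_stop_words(query):
    """Return the query terms that are left after dropping stop words."""
    kept = []
    for term in query.split():
        # Simplification: treat every single letter or digit as a stop word.
        is_single_char = len(term) == 1 and term.isalnum()
        if term.lower() in STOP_WORDS or is_single_char:
            continue
        kept.append(term)
    return kept


print(remove_stop_words("how the cat sat on the mat"))
```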
TERMS IN ORDER
You should enter search terms in the order in which you would expect to find them on the pages you're seeking. A search for New York library gives priority to pages about New York's libraries, while the query new library of York gives priority to pages about the new libraries in York.
NOT CASE-SENSITIVE
Ignoring case distinctions increases the number of results. A search for Red Cross finds pages containing “Red Cross,” “red cross,” or “RED CROSS.”
MOST PUNCTUATION AND SPECIAL CHARACTERS ARE IGNORED
! ? , . ; [ / #
APOSTROPHES
A term with an apostrophe (single quote, ') doesn't match the term without an apostrophe. A query with the term we're returns different results from a query with the term were.
HYPHENATED TERMS
When the search engine encounters a hyphen (-) in a query term, e.g., part-time, it searches for:
the term with the hyphen, e.g., part-time
the term without the hyphen, e.g., parttime
the term with the hyphen replaced by a space, e.g., part time
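The three hyphen variants are easy to generate mechanically. A minimal Python sketch (the function name `hyphen_variants` is an assumption for illustration):

```python
def hyphen_variants(term):
    """Return the forms searched for when a query term contains a hyphen."""
    if "-" not in term:
        return [term]
    return [
        term,                    # with the hyphen: part-time
        term.replace("-", ""),   # without the hyphen: parttime
        term.replace("-", " "),  # hyphen replaced by a space: part time
    ]


print(hyphen_variants("part-time"))
```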
QUOTED PHRASES
A query with terms in quotes finds pages containing the exact quoted phrase. For example, “Larry Page” finds pages containing the phrase “Larry Page” exactly. So this query would find pages mentioning Google’s co-founder Larry Page, but not pages containing “Larry has a home page,” “Larry E. Page,” or “Congressional page Larry Smith.”
Some teachers use quoted phrases to detect plagiarism. They copy a few unique and specific phrases into the search box, surround them with quotes, and see if any results are too similar to their student’s supposedly original work.
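Exact phrase matching means the words must appear next to each other and in the same order. The Python sketch below illustrates that rule on the Larry Page examples; the tokenizer and the name `contains_phrase` are assumptions for the example, not how a real engine is built.

```python
import re


def tokens(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_phrase(page_text, phrase):
    """True if the phrase words appear consecutively and in order."""
    page, wanted = tokens(page_text), tokens(phrase)
    n = len(wanted)
    return n > 0 and any(
        page[i:i + n] == wanted for i in range(len(page) - n + 1)
    )


print(contains_phrase("Google co-founder Larry Page spoke", "Larry Page"))
print(contains_phrase("Larry has a home page", "Larry Page"))
```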
THE + OPERATOR
To force the search to include a particular term, put a + sign operator in front of the word in the query. Note that you should not put a space between the + and the word. So, to search for the satirical newspaper The Onion, use +The Onion, not + The Onion.
Want to learn about Star Wars Episode One? “I” is a stop word and is not included in a search unless you precede it with a + sign:
USE Star Wars +I
NOT Star Wars I
THE - OPERATOR
To find pages without a particular term, put a - sign operator in front of the word in the query. The - sign indicates that you want to subtract or exclude pages that contain a specific term. Do not put a space between the - and the word.
Find pages on “salsa” but not the dance nor dance classes:
USE salsa -dance -class
NOT salsa
We can combine operators: salsa -dance -class +food
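The + and - operators can be read as sorting the query terms into “must appear” and “must not appear” groups. The Python sketch below illustrates this on the salsa example; `parse_query` and `matches` are hypothetical helper names, and matching is done on whole words only.

```python
def parse_query(query):
    """Split a query into (required, excluded, plain) lowercase terms."""
    required, excluded, plain = [], [], []
    for term in query.split():
        if term.startswith("+") and len(term) > 1:
            required.append(term[1:].lower())
        elif term.startswith("-") and len(term) > 1:
            excluded.append(term[1:].lower())
        else:
            plain.append(term.lower())
    return required, excluded, plain


def matches(page_text, query):
    """True if the page has every wanted term and no excluded term."""
    required, excluded, plain = parse_query(query)
    page_words = set(page_text.lower().split())
    if any(word in page_words for word in excluded):
        return False
    return all(word in page_words for word in required + plain)


query = "salsa -dance -class +food"
print(parse_query(query))
print(matches("spicy salsa food recipe", query))
print(matches("salsa dance class schedule", query))
```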
THE ~ OPERATOR
The tilde (~) operator takes the word immediately following it and searches both for that specific word and for the word’s synonyms. It also searches for the term with alternative endings. As with the + and - operators, put the ~ (tilde) next to the word, with no space between the ~ and its associated word. Synonyms are words with similar meaning but different spelling.
~inexpensive matches “inexpensive,” “cheap,” “affordable,” and “low cost”
~run matches “run,” “runner’s,” “running,” as well as “marathon”
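Conceptually, the tilde turns one word into an OR over that word and its synonyms. The Python sketch below assumes a tiny hand-made synonym table built from the slide's example; a real engine derives synonyms in its own way, and the name `expand_tilde` is made up for the illustration.

```python
# Hand-made synonym table taken from the slide's example (an assumption).
SYNONYMS = {
    "inexpensive": ["cheap", "affordable", "low cost"],
}


def expand_tilde(query):
    """Rewrite each ~word as an OR group of the word and its synonyms."""
    rewritten = []
    for term in query.split():
        if term.startswith("~") and len(term) > 1:
            word = term[1:]
            options = [word] + SYNONYMS.get(word.lower(), [])
            # Multi-word synonyms are quoted so they stay together.
            quoted = [f'"{o}"' if " " in o else o for o in options]
            rewritten.append("(" + " OR ".join(quoted) + ")")
        else:
            rewritten.append(term)
    return " ".join(rewritten)


print(expand_tilde("~inexpensive laptop"))
```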
THE ‘OR’ AND ‘|’ OPERATORS
The OR operator, for which you may also use | (vertical bar), applies to the search terms immediately adjacent to it.
Tahiti OR Hawaii
Tahiti | Hawaii
Both queries will find pages that include either “Tahiti” or “Hawaii” or both terms, but not pages that contain neither “Tahiti” nor “Hawaii.”
THE .. OPERATOR
Specify that results contain numbers in a range by specifying two numbers, separated by two periods, with no spaces. For example, specify that you are searching in the price range $250 to $1000 using the number range specification $250..$1000.
recumbent bicycle $250..$1000
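A number range term has a fixed shape: a number, two periods, another number, with no spaces. The Python sketch below shows how such a term could be recognized and applied. It handles whole numbers with an optional $ sign only; the names `parse_range` and `in_range` are assumptions for the example.

```python
import re

# number..number with an optional $ before each number, and no spaces.
RANGE = re.compile(r"^\$?(\d+)\.\.\$?(\d+)$")


def parse_range(term):
    """Return (low, high) for a term like $250..$1000, or None."""
    match = RANGE.match(term)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def in_range(value, term):
    """True if value falls inside the range written in term."""
    bounds = parse_range(term)
    if bounds is None:
        raise ValueError(f"not a number range: {term!r}")
    low, high = bounds
    return low <= value <= high


print(parse_range("$250..$1000"))
print(in_range(499, "$250..$1000"))
```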
THE * OPERATOR
Use *, an asterisk character, known as a wildcard, to match one or more words in a phrase (enclosed in quotes). Each * stands for one or more words; the search engine treats it as a placeholder.
For example, “Google * my life” tells Google to find pages containing a phrase that starts with “Google” followed by one or more words, followed by “my life.” Phrases that fit the bill include: “Google changed my life,” “Google runs my life,” and “Google is my life.”
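The wildcard behaves much like a small pattern. As an illustration only, the Python sketch below turns a phrase with * into a regular expression where each * stands for one or more words; `wildcard_to_regex` is a made-up helper name, and real engines do not necessarily work this way.

```python
import re


def wildcard_to_regex(phrase):
    """Compile a phrase where each * stands for one or more words."""
    pieces = []
    for word in phrase.split():
        if word == "*":
            # One word, optionally followed by more words.
            pieces.append(r"\w+(?:\s+\w+)*")
        else:
            pieces.append(re.escape(word))
    pattern = r"\b" + r"\s+".join(pieces) + r"\b"
    return re.compile(pattern, re.IGNORECASE)


pattern = wildcard_to_regex("Google * my life")
for text in ["Google changed my life", "Google is my life", "Google my life"]:
    print(text, "->", bool(pattern.search(text)))
```

The last text does not match because the * must stand for at least one word.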
USING MORE SEARCH OPERATORS (most of them used by the Google search engine)

Search Features                        Search Operators
File format                            filetype:
Occurrences in the title of the page   allintitle:
Occurrences in the text of the page    allintext:
Occurrences in the URL of the page     allinurl:
Occurrences in the links to the page   allinanchor:
Domain                                 site:
Similar                                related:
Links                                  link:
Examples (for Google):
If you include filetype:suffix in your query, Google will restrict the results to pages whose names end in suffix. For example, web page evaluation checklist filetype:pdf will return PDF files that match the terms “web,” “page,” “evaluation,” and “checklist.” You can restrict the results to pages whose names end with pdf or doc by using the OR operator: security filetype:pdf OR filetype:doc
If you start your query with allinurl: Google restricts results to those containing all the query terms you specify in the URL. For example, allinurl: google faq will return only documents that contain the words “google” and “faq” in the URL.
If you include location: in your query, only articles from the location you specify will be returned. For example, queen location:canada will show articles that match the term “queen” from sites in Canada.
If you start your query with define: Google shows definitions from pages on the web for the term that follows. This advanced search operator is useful for finding definitions of words, phrases, and acronyms. For example, define: blog will show definitions for “blog.”
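When the same restriction is needed often, the query string can be assembled with a small helper. The Python sketch below only builds the text you would type into the search box; it does not contact Google, and the function name `with_filetypes` is an assumption for the example.

```python
def with_filetypes(terms, suffixes):
    """Append filetype: restrictions, joined by OR, to the search terms."""
    restriction = " OR ".join(f"filetype:{suffix}" for suffix in suffixes)
    return f"{terms} {restriction}"


print(with_filetypes("web page evaluation checklist", ["pdf"]))
print(with_filetypes("security", ["pdf", "doc"]))
```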
An interesting website that uses most of Google’s power
Homework (2 steps)
Step one: you have to create a table with at least 10 of the operators you learned today. The table will be used for future research and will have the following headers:

Notation for operator    What it does                                            Example
AND                      finds results with terms on both sides of the search    car AND red
OR
Step 2… Coming soon, stay tuned!