Computer Science 1000 Information Searching I Permission to redistribute these slides is strictly prohibited without permission.

1 Computer Science 1000 Information Searching I Permission to redistribute these slides is strictly prohibited without permission

2 World Wide Web – The Basics: Our next topic examines how to find information on the web. We consider a few basic terms here (which you’re probably familiar with): page/web page, link/hyperlink, site/web site. Later in the semester, we will revisit web technologies in much more detail.

3 World Wide Web: A system of linked documents accessed via the internet, often simply referred to as the web. The term is sometimes used interchangeably with the internet, but this isn’t exactly correct. The internet is the global network of interconnected devices (computers, routers, etc.) that exchange data. The web refers to the documents being stored, the software that sends and receives them, and the protocols used for transmission.

4 Web Page: A document stored and accessed on the web, identified by a unique URL (Uniform Resource Locator). Often referred to simply as a page. Today’s web pages are very rich in content: text, images, hyperlinks, videos.

5 Web Site: A collection of related web pages on the internet, typically belonging to a common organization or event. Example: all pages served by the University of Lethbridge make up its website.

6 Hyperlink: A part of a web page that refers to a different location. Often just called a link. Hyperlinks can reference another place on the same page, or another web page. Hypertext: text containing hyperlinks.
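The two kinds of link targets can be told apart programmatically. The following is a minimal sketch using Python's standard `urllib.parse` module. It assumes that a same-page link is written as a URL fragment (`#...`), which the slides do not state; the `example.org` URLs and the `resolve_link` helper are hypothetical and exist only for illustration.

```python
from urllib.parse import urljoin, urldefrag


def resolve_link(page_url: str, href: str) -> tuple[str, bool]:
    """Resolve a link found on page_url and report whether it stays on the same page.

    Assumption: a link to "another place on the same page" differs from the
    current page's URL only by its fragment (the part after '#').
    """
    target = urljoin(page_url, href)
    same_page = urldefrag(target).url == urldefrag(page_url).url
    return target, same_page


# Hypothetical page and links, for illustration only.
page = "https://example.org/notes/searching.html"
print(resolve_link(page, "#index"))          # link to another place on the same page
print(resolve_link(page, "hierarchy.html"))  # link to another web page
```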

7 The Age of Information: The computer, internet, and web have changed how we interact with information. Information storage: the amount of available information is significantly greater than even a generation ago, and growing rapidly. Information transmission: large amounts of information are available with a single mouse click, and transfer almost immediately.

8 Information Age – Rapid Onset: The situation has transformed tremendously in your lifetimes. Consider the global information storage capacity: in 1986, 2.6 exabytes (< 1 CD per person); in 1993, 15.8 exabytes; in 2000, 54.5 exabytes; in 2007, 295 exabytes (61 CDs per person). How does one successfully navigate such a mountain of digital content? Source: Hilbert and López. The World’s Technological Capacity to Store, Communicate, and Compute Information. Science 332(6025), 2011.
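To get a feel for how fast these figures grew, the short sketch below computes the overall growth factor and an average yearly growth rate from the four numbers quoted above. The constant-rate (compound growth) model is an assumption made here for illustration; it is not taken from the slides or the cited paper.

```python
# Global storage capacity in exabytes, as quoted on the slide.
CAPACITY_EB = {1986: 2.6, 1993: 15.8, 2000: 54.5, 2007: 295.0}


def growth_factor(start_year: int, end_year: int) -> float:
    """How many times larger the capacity is at end_year than at start_year."""
    return CAPACITY_EB[end_year] / CAPACITY_EB[start_year]


def average_annual_growth(start_year: int, end_year: int) -> float:
    """Average yearly growth rate, assuming constant compound growth (an assumption)."""
    years = end_year - start_year
    return growth_factor(start_year, end_year) ** (1 / years) - 1


print(f"1986 -> 2007 growth factor: {growth_factor(1986, 2007):.1f}x")
print(f"average yearly growth:      {average_annual_growth(1986, 2007):.1%}")
```

Under that assumption, capacity grew roughly 113-fold between 1986 and 2007, or about 25% per year.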

9 Information Access: Even in pre-internet days, there was a wealth of information. Large-scale: library. Medium-scale: encyclopaedia set. Small-scale: newspaper. Strategies developed to manage information: categories, hierarchies, indices.

10 Classification: "Systematic arrangement in groups or categories according to established criteria" (Merriam-Webster). In other words, the information is categorized according to relevant features. Consider our course notes: terminology (4 sets of slides), information searching (2-3 sets of slides), etc.

11 Classification: Classification is not specific to digital information. Library classification: Dewey Decimal Classification, Library of Congress Classification.

12 Classification: Classification is not specific to digital information. Newspaper classification.

13 Classification: The level of detail of a classification leads to tradeoffs. Consider a coarse level of detail, e.g. the taxonomy of living organisms: classify organisms according to Domain (Archaea, Bacteria, Eukarya). Advantage: small number of groups. Disadvantage: each group is massive.

14 Classification: The level of detail of a classification leads to tradeoffs. Consider a fine level of detail, e.g. the taxonomy of living organisms: classify organisms according to Genus (Canis, Felis). Advantage: each group is reasonably small. Disadvantage: massive number of groups. Solution: hierarchy.

15 Hierarchy: A decomposition of classifications according to detail. Hierarchies contain levels. At the top (root) level, there is typically a small number of broad categories. Each category is decomposed into smaller categories. A classification group is defined by categorization at each level.

16 Hierarchy: Organism taxonomy hierarchy: each Domain is categorized into Kingdoms. [Diagram: Domain: Eukarya. Kingdom: Fungi, Plantae, Animalia, Protista.]

17 Hierarchy: Organism taxonomy hierarchy: each Kingdom is classified into Phyla, each Phylum is classified into Classes, and so on. http://ag.arizona.edu/pubs/garden/mg/entomology/intro.html

18 Hierarchy: An object is still categorized, but by multiple levels (instead of one). http://schoolworkhelper.net/scientific-taxonomy/
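One way to picture categorization by multiple levels is as a path from the root of a tree down to the object. The sketch below stores a small fragment of the organism taxonomy as nested dictionaries and looks up the path to a genus. The levels between Kingdom and Genus (Chordata, Mammalia, Carnivora, Canidae, Felidae) follow standard taxonomy but do not appear on the slides, and `find_path` is a hypothetical helper written for this illustration.

```python
from typing import Optional

# A small fragment of the taxonomy: each key is a category, each value holds
# its subcategories. Levels: Domain > Kingdom > Phylum > Class > Order > Family > Genus.
TAXONOMY = {
    "Archaea": {},
    "Bacteria": {},
    "Eukarya": {
        "Fungi": {},
        "Plantae": {},
        "Protista": {},
        "Animalia": {
            "Chordata": {
                "Mammalia": {
                    "Carnivora": {
                        "Canidae": {"Canis": {}},
                        "Felidae": {"Felis": {}},
                    }
                }
            }
        },
    },
}


def find_path(tree: dict, target: str, path: tuple = ()) -> Optional[tuple]:
    """Return the chain of categories from the root down to target, or None."""
    for name, children in tree.items():
        current = path + (name,)
        if name == target:
            return current
        found = find_path(children, target, current)
        if found is not None:
            return found
    return None


path = find_path(TAXONOMY, "Canis")
if path is not None:
    print(" > ".join(path))
```

Each element of the returned path is the category chosen at one level, which is exactly what defines the object's classification group.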

19 Hierarchy: Facilitates efficient searching through exclusion. Example (text): suppose you have a collection of a million items. These items are organized into 10 equal-sized groups. Each top-level group is also organized into 10 equal subgroups. Choosing the first category eliminates 900,000 items; choosing the second category eliminates a further 90,000 items; and so on.
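The arithmetic in this example can be checked with a few lines of code. This is a minimal sketch assuming, as in the example, that every group splits into equally sized subgroups; the function name `remaining_after_choices` is made up for illustration.

```python
def remaining_after_choices(total_items: int, groups_per_level: int, levels: int) -> list[int]:
    """Number of candidate items left after each successive category choice.

    Assumes every group is split into groups_per_level equal-sized subgroups.
    """
    remaining = total_items
    history = []
    for _ in range(levels):
        remaining //= groups_per_level  # keep one subgroup, exclude the rest
        history.append(remaining)
    return history


total = 1_000_000
before = total
for level, left in enumerate(remaining_after_choices(total, 10, 6), start=1):
    print(f"choice {level}: eliminated {before - left:>7,} items, {left:>7,} remain")
    before = left
```

With 10 groups per level, six choices are enough to narrow a million items down to a single one.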

20 Hierarchy: Hierarchies are very popular. Consider our previous examples: Library of Congress Classification.

21 Hierarchy: Hierarchies are very popular. Consider our previous examples: newspaper.

22 Index: A detailed list of words, phrases, and/or topics indicating place of occurrence. In essence, it maps keywords of interest to their location (e.g. a page number). A bottom-up approach to information organization, as opposed to the top-down structure of a hierarchy. Particularly popular in printed material: books, magazines, volumes, etc.
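The idea of mapping keywords to their locations can be sketched in a few lines. The example below builds a simple index from a hypothetical three-page "book"; the page texts and the `build_index` helper are invented for illustration. A real book index is curated by hand and lists only selected terms, whereas this sketch indexes every word.

```python
from collections import defaultdict


def build_index(pages: dict[int, str]) -> dict[str, list[int]]:
    """Map each word to the sorted list of page numbers on which it occurs."""
    index = defaultdict(set)
    for page_number, text in pages.items():
        for raw_word in text.lower().split():
            word = raw_word.strip(".,;:!?")  # drop surrounding punctuation
            if word:
                index[word].add(page_number)
    # Alphabetical order, as in a printed index.
    return {word: sorted(numbers) for word, numbers in sorted(index.items())}


# Hypothetical page contents, for illustration only.
pages = {
    1: "The web is a system of linked documents.",
    2: "An index maps keywords to pages.",
    3: "The web has an index: search engines.",
}

index = build_index(pages)
print(index["web"])    # pages on which "web" occurs
print(index["index"])  # pages on which "index" occurs
```

Looking up a keyword is then a single dictionary access, with no need to scan the pages themselves.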

23 Index - Example

24 Index: Typically used at a small scale: books and volumes vs. libraries. Made efficient through an organizational scheme; alphabetical is very common. Some overlap with hierarchies, e.g. subtopics.

25 Finding Information – The Web: As discussed, the amount of information on the web is immense. Many of the discussed techniques for information finding also apply digitally: classification/hierarchies, indexing.

26 Classification: Many commercial websites have a classification structure: navigation bars.

27 Hierarchies: Many websites, especially large ones, will also arrange their categories in hierarchical fashion.

28 Partition: A hierarchy where every object occurs only once. Organism taxonomy: every species appears only once. Some hierarchies are necessarily partitions, e.g. a particular book will only occur at one point in a library classification. However, a partition in some cases is not natural: an object might have an inherent fit in more than one classification.

29 Partitions: Digital content is often stored using overlapping hierarchies (non-partition). Potentially more intuitive. With hyperlinking, it’s easy to accomplish (two links to the same page). Example (text): Three Books for Frugal Fashionistas was stored on NPR’s website under both Home > Arts & Life > Books > Three Books for Frugal Fashionistas and Home > Listen > Latest Program > Three Books for Frugal Fashionistas.
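Whether a hierarchy is a partition can be checked mechanically: list every path from the root to an object and see whether any object appears at the end of more than one path. The sketch below does this for the NPR example. Representing the hierarchy as a list of path tuples is an assumption made for illustration, as is the helper name `is_partition`.

```python
from collections import Counter


def is_partition(paths: list[tuple[str, ...]]) -> bool:
    """True if every object (the last element of a path) occurs exactly once."""
    occurrences = Counter(path[-1] for path in paths)
    return all(count == 1 for count in occurrences.values())


# The two locations of the same article, as given in the example.
npr_paths = [
    ("Home", "Arts & Life", "Books", "Three Books for Frugal Fashionistas"),
    ("Home", "Listen", "Latest Program", "Three Books for Frugal Fashionistas"),
]

print(is_partition(npr_paths))      # the article is reachable by two paths
print(is_partition(npr_paths[:1]))  # with only one path, each object occurs once
```

The first call reports that the NPR structure is an overlapping hierarchy; keeping only one of the two paths would make it a partition again.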

30 Indexes for the Web: Indexes are much less common than hierarchies on individual websites. Site maps might be considered an index of sorts. However, there are technologies analogous to indexes that pertain to the web as a whole: search engines!

