So Much Data. www.sims.berkeley.edu/research/projects/how-much-info: 1-2 exabytes per year; 250MB/yr per person on earth.


So Much Data
www.sims.berkeley.edu/research/projects/how-much-info: 1-2 exabytes per year; 250MB/yr per person on earth. (Phrased as “everyone on earth writes something the size of Moby Dick 250 times a year,” it makes no sense; phrased as “everyone on earth makes 15 minutes of video each year,” it doesn’t sound so bad.)
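The per-person figure can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope check: the world population (about 6 billion) and the size of Moby Dick as plain text (about 1 MB) are assumptions that do not appear on the slide, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the per-person figures.
# Assumed inputs (not from the slides): world population and Moby Dick size.
EXABYTE = 10**18
MEGABYTE = 10**6

WORLD_POPULATION = 6_000_000_000  # assumption: roughly the year-2000 figure
MOBY_DICK_BYTES = 1 * MEGABYTE    # assumption: about 1 MB of plain text


def per_person_mb(total_exabytes: float, population: int = WORLD_POPULATION) -> float:
    """Return megabytes per person per year for a yearly total in exabytes."""
    return total_exabytes * EXABYTE / population / MEGABYTE


def implied_video_mbit_per_s(megabytes: float, minutes: float) -> float:
    """Return the video bitrate at which `minutes` of footage fills `megabytes`."""
    return megabytes * MEGABYTE * 8 / (minutes * 60) / 10**6


if __name__ == "__main__":
    # The slide quotes a range of 1-2 exabytes; try both ends and the midpoint.
    for total in (1.0, 1.5, 2.0):
        print(f"{total} EB/yr -> {per_person_mb(total):.0f} MB per person")
    books = 250 * MEGABYTE / MOBY_DICK_BYTES
    print(f"250 MB is about {books:.0f} copies of Moby Dick")
    rate = implied_video_mbit_per_s(250, 15)
    print(f"15 minutes of video in 250 MB implies {rate:.2f} Mbit/s")
```

Under these assumptions the 250MB figure corresponds to the midpoint of the 1-2 exabyte range, and the video comparison implies a bitrate of a little over 2 Mbit/s.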

What kind of media?
Paper: TB/yr, mostly office documents
Film: TB/yr, mostly home snapshots
Optical: TB/yr, mostly music CDs
Magnetic: TB/yr, mostly for computers (300TB of camcorder tape)
Disk drives: 2500 petabytes per year, 55% for desktop
(In 2000 they said disk was $10/GB and would reach $1/GB in 2005; I saw 76 cents/GB last week.)
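The disk price forecast ($10/GB in 2000, $1/GB in 2005) implies a steady yearly price drop, which can be worked out directly. This is a minimal sketch that assumes a constant geometric decline between the two quoted prices; the slide does not say how the forecast was made, and the function names are hypothetical.

```python
import math


def annual_price_factor(start_price: float, end_price: float, years: float) -> float:
    """Constant yearly multiplier that takes start_price to end_price in `years`."""
    return (end_price / start_price) ** (1 / years)


def years_to_reach(start_price: float, target_price: float, factor: float) -> float:
    """Years until a price shrinking by `factor` per year reaches target_price."""
    return math.log(target_price / start_price) / math.log(factor)


if __name__ == "__main__":
    # Assumption: constant geometric decline from $10/GB (2000) to $1/GB (2005).
    factor = annual_price_factor(10.0, 1.0, 5)
    print(f"implied yearly price multiplier: {factor:.3f}")
    for year in range(2000, 2006):
        print(year, f"${10.0 * factor ** (year - 2000):.2f}/GB")
    # How long the same trend would take to reach the 76 cents/GB seen in a shop.
    print(f"76 cents/GB after {years_to_reach(10.0, 0.76, factor):.1f} years")
```

Under that assumption the forecast amounts to prices falling by roughly 37% per year.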

Disk prices

How much online?
About 100M books have been published; perhaps 200K have been digitized, half available free and half for pay. (Half in French, by the way.) Very little music or video is online legally. The Web is about TB of text; images 5X that; “deep web” or “dark matter” may be 100X as much.

Strategies for finding things
1. Search engines: back-of-book indexes, now Google
2. Human guidance: once citations, now hyperlinks
3. Knowledge structures: encyclopedias; thesauri; someday we might see PRECIS, CYC, or the Semantic Web actually work
Ranking as a way of combining 1 and 2 seems useful. As for the Semantic Web, Dave Parnas once wrote that “a data base is something that works, a knowledge base is something that doesn’t work.”
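One way to read "ranking as a way of combining 1 and 2" is to score each page by how well its text matches the query and then boost pages that many other pages link to. The sketch below is a toy illustration of that idea only, not the method of any real search engine: the corpus, the scoring formula, and the `link_weight` parameter are all assumptions invented for the example.

```python
import math
from collections import Counter

# Hypothetical toy corpus: page id -> (text, ids of pages it links to).
PAGES = {
    "a": ("whale hunting in the nineteenth century", ["b"]),
    "b": ("moby dick is a novel about a whale", []),
    "c": ("whale whale whale cheap whale pictures", ["b"]),
}


def text_score(query: str, text: str) -> float:
    """Fraction of query terms that occur in the text (the 'search' signal)."""
    terms = query.lower().split()
    words = set(text.lower().split())
    if not terms:
        return 0.0
    return sum(term in words for term in terms) / len(terms)


def inlink_counts(pages: dict) -> Counter:
    """Count distinct pages linking to each page (the 'human guidance' signal)."""
    counts = Counter()
    for src, (_, links) in pages.items():
        for dst in set(links):
            if dst in pages and dst != src:
                counts[dst] += 1
    return counts


def rank(query: str, pages: dict, link_weight: float = 0.5) -> list:
    """Order matching pages by text match, boosted by the log of inbound links."""
    inlinks = inlink_counts(pages)
    scored = []
    for pid, (text, _) in pages.items():
        match = text_score(query, text)
        if match == 0:
            continue  # pages that match no query term are not returned
        score = match * (1 + link_weight * math.log1p(inlinks[pid]))
        scored.append((score, pid))
    # Highest score first; ties broken by page id for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for _, pid in scored]


if __name__ == "__main__":
    print(rank("whale", PAGES))
```

All three toy pages mention "whale", so text matching alone cannot separate them; the page that the other two link to is ranked first because of the link boost. Note that repeating a word many times, as page "c" does, earns nothing under this scoring.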

What have you looked for?
Tell us something you searched for that you couldn’t find. Was the problem that it (probably) (a) isn’t known, (b) isn’t digitized and online, (c) is restricted by legal or business rules, or (d) is out there but you couldn’t find it?

How should things be found?
For something that you wanted to find, and believe was probably known and probably available, how would you have liked to phrase the query? What prompted your interest? How can you formalize that interest? What kind of data description would you need?