
Information Science 2005
Tefko Saracevic, PhD
School of Communication, Information and Library Studies
Rutgers University
New Brunswick, New Jersey, USA
http://www.scils.rutgers.edu/~tefko
© Tefko Saracevic

Information science: a short definition
“the science dealing with the efficient collection, storage, and retrieval of information” (Webster)

Organization of presentation
1. Big picture: problems, solutions, social place
2. Structure: main areas in research & practice
3. Technology: information retrieval (largest part)
4. Information: representation; bibliometrics
5. People: users, use, seeking, context
6. Digital libraries: whose are they anyhow?
7. Paradigm shift: distancing of areas
8. Conclusions: big questions for the future

Scope
- Evolution and state of the field in the last decade of the old century and the first decade of the new one

1. The big picture
Problems addressed
- Bit of history: Vannevar Bush (1945)
  - Defined the problem as “... the massive task of making more accessible a bewildering store of knowledge.”
  - Problem still with us & growing

… solution
- Bush suggested a machine: “Memex... association of ideas... duplicate mental processes artificially.”
- Technological fix to the problem
- Still with us: technological determinant

At the base of information science: Problem
Trying to control content in
- Information explosion
  - exponential growth of information artifacts, if not of information itself
PLUS today
- Communication explosion
  - exponential growth of the means and ways by which information is communicated, transmitted, accessed, used

… technological solution, BUT …
- applying technology to solving problems of effective use of information
- BUT: from a HUMAN & SOCIAL and not only TECHNOLOGICAL perspective

or a symbolic model
[Diagram with three labeled elements: Information, Technology, People]

Problems & solutions: SOCIAL CONTEXT
- Professional practice AND scientific inquiry related to: effective communication of knowledge records (‘literature’) among humans in the context of social, organizational, & individual need for and use of information
- Taking advantage of modern information technology

or as White & McCain put it:
“modeling the world of publications with a practical goal of being able to deliver their content to inquirers [users] on demand.”

Elaboration
- Knowledge records = texts, sounds, images, multimedia, web... ‘literature’ in given domains
  - content-bearing structures, central to information science
- Communication = human-computer-literature interface
  - the study of information science is the interface between people & literatures
- Information need, seeking, and use = raison d'être
- Effectiveness = relevance, utility

General characteristics
- Interdisciplinarity: relations with a number of fields, some more or less predominant
- Technological imperative: driving force, as in many modern fields
- Information society: social context and role in evolution, shared with many fields

2. Structure
Composition of the field
- Like many fields, information science has different areas of concentration & specialization
- They change and evolve over time
  - grow closer, grow apart
  - ignore each other, less or more

most importantly, different areas…
- receive more or less in funding & emphasis
  - producing great imbalances in work & progress
  - attracting different audiences & fields
- this includes
  - vastly different levels of support for research and
  - huge commercial investments & applications

How to view structure?
By decomposing areas & efforts in research & practice, emphasizing
- Technology, or
- Information, or
- People

Part 3. Technology
- Identified with information retrieval (IR)
  - by far the biggest effort and investment
  - international & global
  - commercial interest large & growing

Information Retrieval: definition & objective
“IR:... intellectual aspects of description of information,... search,... & systems, machines...” (Calvin Mooers, 1951)
- How to provide users with relevant information effectively?
For that objective:
1. How to organize information intellectually?
2. How to specify the search & interaction intellectually?
3. What techniques & systems to use effectively?

Streams in IR research & development
1. Information science:
  - Services, users, use
  - Human-computer interaction
  - Cognitive aspects
2. Computer science:
  - Algorithms, techniques
  - Systems aspects
3. Information industry:
  - Products, services, Web
  - Market aspects
- Problem:
  - relative isolation (discussed later)

Contemporary IR research
- Now mostly done within computer science
  - e.g. the Special Interest Group on IR of the Association for Computing Machinery (ACM SIGIR)
- Spread globally
  - e.g. major IR research communities emerged in China, Korea, Singapore
- Branched outside of information science: “everybody does information retrieval”
  - data mining, machine learning, natural language processing, artificial intelligence, computer graphics …

Text REtrieval Conference (TREC)
- Started in 1992, now probably ending
  - “support research within the IR community by providing the infrastructure necessary for large-scale evaluation”
- Methods
  - provides large test beds, queries, relevance judgments, comparative analyses
  - essentially uses the Cranfield methodology of the 1960s
  - organized around tracks
    - various topics, changing over the years
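
A minimal sketch of the Cranfield-style evaluation mentioned on this slide: a system's ranked output for a query is scored against human relevance judgments. The function name, the document IDs, and the cutoff k are illustrative assumptions, not part of TREC's own tooling.

```python
def precision_recall_at_k(ranked_docs, relevant_docs, k):
    """Score one ranked result list against a set of judged-relevant documents.

    ranked_docs: document IDs in the order the system returned them.
    relevant_docs: set of document IDs judged relevant for the query.
    k: rank cutoff (hypothetical choice; evaluations use various cutoffs).
    """
    top_k = ranked_docs[:k]
    hits = sum(1 for doc in top_k if doc in relevant_docs)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant_docs) if relevant_docs else 0.0
    return precision, recall


# Hypothetical relevance judgments and one system run for a single query.
judged_relevant = {"q1": {"d1", "d4", "d7"}}
system_run = {"q1": ["d4", "d2", "d1", "d9", "d7"]}

precision, recall = precision_recall_at_k(system_run["q1"], judged_relevant["q1"], k=3)
print(f"precision@3 = {precision:.2f}, recall@3 = {recall:.2f}")
```

Averaging such per-query scores over a shared set of queries is what makes comparative analyses across systems possible.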

TREC impact
- International: big impact on creating research communities
- Annual conferences
  - report and exchange results, foster cooperation
- Results
  - mostly in reports, available at http://trec.nist.gov/
  - overviews provided as well
  - but only a fraction published in journals or books

TREC tracks 2004
103 groups from 21 countries
- Genomics, with 4 subtracks
- HARD (High Accuracy Retrieval from Documents)
- Novelty (new, nonredundant information)
- Question answering
- Robust (improving poorly performing topics)
- Terabyte (very large collections)
- Web track
- Previous tracks:
  - ad-hoc (92-99)
  - routing (92-97)
  - interactive (94-02)
  - filtering (95-02)
  - cross language (97-02)
  - speech (97-00)
  - Spanish (94-96)
  - video (00-01)
  - Chinese (96-97)
  - query (98-00)
  - and a few more that ran for two years only

Broadening of IR: ever changing, ever new areas added
- Cross language IR (CLIR)
- Natural language processing (NLP IR)
- Music IR (MIR)
- Image, video, multimedia retrieval
- Spoken language retrieval
- IR for bioinformatics and genomics
- Summarization; text extraction
- Question answering
- Many human-computer interactions
- XML IR
- Web IR; Web search engines
- DB and IR integration: structured and unstructured data

Commercial IR
- Search engines based on IR
- But added many elaborations & significant innovations
  - dealing with HUGE numbers of pages fast
  - countering spamming & page rank games (adversarial IR)
    - never ending combat of algorithms
- Spread & impact worldwide
  - about 2000 engines in over 160 countries
  - English was dominant, but not any more

Commercial IR: brave new world
- Large investments & economic sector
  - hope for big profits, as yet questionable
- Leading to proprietary, secret IR
  - also aggressive hiring of best talent
  - new commercial research centers in different countries (e.g. MS in China)
- Academic research funding is changing
  - brain drain from academe

IR successfully effected:
- Emergence & growth of the INFORMATION INDUSTRY
- Evolution of IS as a PROFESSION & SCIENCE
- Many APPLICATIONS in many fields
  - including on the Web: search engines
- Improvements in HUMAN-COMPUTER INTERACTION
- Evolution of INTERDISCIPLINARITY
IR has a long, proud history

Part 4. Information
- Several areas of investigation:
  - as basic phenomenon: not much progress
    - measures such as Shannon's not successful
    - concentrated on manifestations and effects
  - information representation
    - large area connected with IR, librarianship
    - metadata
  - bibliometrics
    - structures of literature
Covered in separate lectures

Part 5. People
- Professional services
  - in organizations: moving toward knowledge management, competitive intelligence
  - in industry: vendors, aggregators, Internet
- Research
  - user & use studies
  - interaction studies
  - broadening to information seeking studies, social context, collaboration
  - relevance studies
  - social informatics

User & use studies
- Oldest area
  - covers many topics, methods, orientations
  - many studies related to IR
    - e.g. searching, multitasking, browsing, navigation
- Branching into Web use studies
  - quantitative & qualitative studies
  - emergence of webmetrics

Interaction
- Traditional IR model concentrates on matching, not on the user side & interaction
- Several interaction models suggested
  - Ingwersen’s cognitive, Belkin’s episode, Saracevic’s stratified model
  - hard to get experiments & confirmation
- Considered key to providing
  - basis for better design
  - understanding of use of systems
- Web interactions a major new area

Information seeking
- Concentrates on the broader context, not only on IR or interaction: people as they move in life & work
- Based on the concept of social construction of information
- Most active area, particularly in Europe, with annual conferences

Information seeking
Sampling of theories, models
- Why people seek information:
  - Taylor’s stages of information need
  - Dervin’s Sense-Making (gap, bridge)
  - Belkin’s Anomalous State of Knowledge
  - Chatman’s life in the round (inf. poverty)
- How people seek information:
  - Wilson’s General Model of inf. seeking
  - Bates’ berrypicking (acts in searching)
  - Kuhlthau’s information search process
  - Chang’s browsing model
  - Benoit’s communicative action (Habermas)

Part 7. Paradigm split in technology - people
- Split from the early 80’s to date into two orientations
  - System-centered
    - algorithms, TREC
    - continue traditional IR model
  - Human-(user)-centered
    - cognitive, situational, user studies
    - interaction models, some started in TREC
- These became almost separate universes: one based in computer science, the other in information science & librarianship

Critiques, cultures
- Number of critiques (e.g. Dervin & Nilan) about the isolated systems approach
  - calls for user-centered approaches, designs & evaluation
- But user-centered studies did not deliver very useful design pointers or guides
- Very different cultures:
  - computer science has its own, more science & technology oriented
  - information science is more humanities oriented
  - C.P. Snow’s two cultures

Human vs. system
- Human (user) side:
  - often highly critical, even one-sided
  - mantra of implications for design
  - but does not deliver concretely
- System side:
  - mostly ignores user side & studies
  - ‘tell us what to do & we will’
- Issue is NOT H or S approach
  - even less H vs. S
  - but how can H AND S work together
  - major challenge for the future

Reconciliation?
- Several efforts to provide human-centered design
  - but more discussion than real application
- Integration of information seeking and information retrieval in context (Ingwersen & Järvelin)
- Research & development toward
  - using search context, improving user search experiences & search quality
  - machine learning, incorporating semantics

Funding
- Most funding goes toward the systems side & computer science
  - most (very large %) support is for system work
- In the digital age support is for digital
- True globally

6. Digital libraries
LARGE & growing area
- “Hot” area in R&D
  - a number of large grants & projects in the US, European Union, & other countries up to now
  - will it continue? It is not growing
  - but “DIGITAL” big & “libraries” small
- “Hot” area in practice
  - building digital collections, hybrid libraries
  - many projects throughout the world
  - growing at a high rate

Technical problems
- Substantial, larger & more complex than anticipated:
  - representing, storing & retrieving of library objects
    - particularly if originally designed to be printed & then digitized
  - operationally managing large collections: issues of scale
  - dealing with diverse & distributed collections
    - interoperability
  - assuring preservation & persistence
  - incorporating rights management

Digital Library Initiatives in the US (DLI)
- Research consortia under the National Science Foundation
  - DLI 1: 1994-98, 3 agencies, $24M, six large projects
  - DLI 2: 1999-2006, 8 agencies, $60+M, 77 large & small projects in various categories
- ‘digital library’ left undefined, so as to cover many topics & stretch ideas
  - not constrained by practice

European Union
- DELOS Network of Excellence on Digital Libraries
  - many projects throughout the European Union
    - heavily technological
  - many meetings, workshops
  - resembles DLIs in the US
  - well funded, long range

Research issues
- understanding objects in DL
  - representing in many formats
  - non-textual materials
- metadata, cataloging, indexing
- conversion, digitization
- organizing large collections
- federated searching over distributed (various) collections
- managing collections, scaling
- preservation, archiving
- interoperability, standardization
- accessing, using

DL projects in practice
- Heavily oriented toward a variety of institutions, primarily libraries
  - but also museums, professional societies, specific domains, etc.
- Main orientation: institutional missions, contexts, finances
  - sustainability, preservation in the real world
  - managing growth, rights, access

Agendas
- Most of the DL research agenda is set from the top down
  - from funding agencies to projects
  - imprint of the computer science community's interest & vision
- Most DL practice agendas are set from the bottom up
  - from institutions, incl. many libraries
  - imprint of institutional missions, interests & vision
    - providing access to specialized materials and collections from institution(s) that are otherwise not accessible
    - covering in an integral way a domain with a range of sources

Connection?
- DL research & DL practice presently are conducted
  - mostly independent of each other,
  - minimally informing each other,
  - & having slight, or no, connection
- Parallel universes with little connection & interaction

8. Conclusions
IS contributions
- IS effected handling of inf. in society
- Developed an organized body of knowledge & professional competencies
- Applied interdisciplinarity
- IR reached a mature stage
- IR penetrated many fields & human activities
- Stressed HUMAN in human-computer interaction

Challenges
- Adjust to the growing & changing social & organizational role of inf. & related inf. infrastructure
- Play a positive role in globalization of information
- Respond to technological imperative in human terms
- Respond to changes from inf. to communication explosion, bringing own experiences to resolutions, particularly to the INTERNET
- Join competition with quality
- Join DIGITAL with LIBRARIES

Juncture
- IS is at a critical juncture in its evolution
- Many fields, groups... moving into information
  - big competition
  - entrance of powerful players
  - fight for stakes
- To be a major player IS needs to progress in its:
  - research & development
  - professional competencies
  - educational efforts
  - interdisciplinary relations
- Reexamination necessary