Slide 1: how were digital libraries evaluated?
Tefko Saracevic, Ph.D.
School of Communication, Information and Library Studies
Rutgers University
http://www.scils.rutgers.edu/~tefko
© Tefko Saracevic, Rutgers University

Slide 2:
"Evaluating digital libraries is a bit like judging how successful is a marriage" (Marchionini, 2000)

Slide 3: digital libraries
since their emergence in the early/mid-1990s:
- large expenditures in research & practice globally
- a great many practical developments
- many research efforts & programs
- growing exponentially
everything about digital libraries is explosive except evaluation
- a relatively small, even neglected area

Slide 4: literature reports on DL evaluation
two distinct types:
- meta or "about" literature
  - suggests approaches, models, concepts
  - discusses evaluation
- object or "on" literature
  - actual evaluations; contains data
  - the data may be hard or soft
the meta literature is much larger

Slide 5: objective & corpus for this study
objective: to synthesize the object literature only
selection criteria:
1. directly addresses a DL entity or a DL process
2. contains data in whatever form
some 90 reports were selected
estimate: no more than 100 or so evaluation reports exist in total

Slide 6: difficulties with boundaries
when is a study of a DL involving
- human information behavior
- users and use
- web portals, or
- information retrieval
also a DL evaluation study?
often a matter of a judgment call

Slide 7: approach taken here
the literature on DL evaluation was decomposed; the synthesis examined the following five aspects:
1. construct for evaluation
   - what was evaluated? what elements (components, parts, processes ...) were involved in the evaluation?
2. context of evaluation – selection of a goal, framework, viewpoint or level(s) of evaluation
   - what was the basic approach or perspective?

Slide 8: approach (cont.)
3. criteria reflecting performance as related to selected objectives
   - what parameters of performance were investigated?
4. methodology for doing the evaluation
   - what measures and measuring instruments were used?
5. findings
   - except for one, these were not generalized
next: we will deal with each in turn (a small sketch of this coding scheme follows below)
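In effect, the five aspects above act as a coding scheme applied to each report in the corpus. A minimal sketch of how such a scheme could be represented, assuming nothing beyond the five aspects themselves; the field names and the example report are illustrative, not taken from the study:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EvaluationReport:
    """One coded DL evaluation report (field names are illustrative)."""
    citation: str
    construct: str                                        # what was evaluated: an entity or a process
    context: str                                          # approach/perspective, e.g. "systems-centered"
    criteria: List[str] = field(default_factory=list)     # performance parameters investigated
    methodology: List[str] = field(default_factory=list)  # measures & instruments used
    findings: str = ""                                    # summary of reported results

# hypothetical coding of a single (made-up) report
report = EvaluationReport(
    citation="hypothetical study of a sci-tech journal DL",
    construct="operational DL",
    context="usability-centered",
    criteria=["ease of use", "satisfaction"],
    methodology=["survey", "log analysis"],
    findings="users had difficulty locating full text",
)
print(report.context, report.criteria)
```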

Slide 9: 1. constructs
several types were evaluated:
- entities
  - entities developed in research projects
  - operational digital libraries in institutions
  - multiple digital libraries
- processes not restricted to a given digital library

Slide 10: constructs – examples of entities
constructed as DLs in R&D projects:
- Perseus – classics; evaluated most & first
- ADEPT – geo resources for undergraduates
- DeLIVER – sci-tech journals
- Envision – computer science literature
- Water in the Earth System – high school
- National Gallery of the Spoken Word – archive
- Making of America prototype – 19th-century journals
some are full DLs, some are components

Slide 11: constructs – entities (cont.)
operational DLs:
- New Zealand DL – computer science technical reports
- ARTEMIS – science materials for grades 6 to 12
- Internet Public Library – digital reference
- UK National Electronic Library for Health – in a large hospital
- Mann Library Gateway, Cornell – access interface
some aspect or other was evaluated

Slide 12: constructs – entities (cont.)
multiple DLs:
- Project SOUP, Cornell – 6 digital collections in libraries & museums
- Middlesex U – 6 general DLs for accessing journals & articles

Slide 13: constructs – entities missing
missing: evaluation of operational DLs
- in academic, public, national libraries, museums, ...
- a lot of statistics are collected, but as yet they are not the subject of evaluation
institutional DLs are terra incognita as to evaluation
commercial DL products are also missing from formal evaluation
most research constructs are also not evaluated

Slide 14: constructs – examples of processes
a variety of processes were evaluated without reference to a DL:
- various representations, e.g.
  - noun-phrasing, key-phrasing (see the sketch below)
- various tools
  - video searching
  - link generation
  - load balancing on servers
  - image retrieval
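To illustrate the kind of representation process evaluated here, key-phrasing can be approximated with a crude frequency baseline. This is a minimal sketch for illustration only, not any of the systems that were actually tested; an evaluation would compare its output against human-assigned phrases:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "or", "in", "on", "for", "to", "is", "are", "with"}

def key_phrases(text: str, top_n: int = 5):
    """Return the most frequent stopword-free word bigrams as candidate key phrases."""
    words = re.findall(r"[a-z]+", text.lower())
    bigrams = [
        f"{w1} {w2}"
        for w1, w2 in zip(words, words[1:])
        if w1 not in STOPWORDS and w2 not in STOPWORDS
    ]
    return Counter(bigrams).most_common(top_n)

sample = ("Digital library evaluation examines digital library use, "
          "digital library services, and evaluation criteria.")
print(key_phrases(sample))  # "digital library" should rank first
```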

Slide 15: constructs – processes (cont.)
user behavior:
- usage patterns in service logs
- perception of quality
- work patterns of experts
- user preferences
- information seeking in hypermedia DLs

Slide 16: 2. context of studies
many different approaches were used:
- systems-centered approach:
  - widely applied
  - study of performance, assessing effectiveness and/or efficiency
  - results may inform specific choices in design or operations
- human-centered approach:
  - also widely applied
  - study of behavior such as information seeking, browsing, searching, or performance in completing given tasks
  - implications for design, but indirect rather than direct

Slide 17: 2. context of studies (cont.)
- usability-centered approach:
  - assessment of different features, e.g. of portals, by users
  - a bridge between systems- and human-centered approaches
  - mixed, or self-evident, results

Slide 18: 2. context of studies (cont.)
- ethnographic approach: comprehensive observation of
  - life-ways, culture and customs in a digital library environment
  - the impact of a digital library on a given community
  - applied successfully in a few studies, with illuminating results, particularly as to impact
- anthropological approach: comprehensive observation of
  - different stakeholders or communities and their cultures in relation to a given digital library
  - applied in one study, with interesting results illuminating barriers between stakeholder communities

Slide 19: 2. context of studies (cont.)
- sociological approach: assessment of
  - situated action or user communities in the social setting of a DL
  - applied in one study, with disappointing results
- economic approach: study of
  - costs, cost-benefits, economic values and impacts
  - strangely, it was applied at the outset of digital library history (e.g. project PEAK), but the approach is now hardly present at all

Slide 20: 2. context of studies – general observations
levels of evaluation vary from
- micro level – e.g. fast forward for video surrogates
- macro level – e.g. the impact of Perseus on the field and on education in classics
temporal aspects:
- some studies become obsolete fast, e.g. those about technology
- others are longitudinal

Slide 21: 3. criteria – general observations
a criterion is a chosen standard to judge a thing by
- there is no evaluation without criteria
in IR: relevance is the basic criterion (see the sketch below)
in libraries: criteria are fairly standardized
in DLs: no basic or standardized criteria, no agreement
- DL metrics efforts have not yet been fruitful
- thus, every evaluator chooses their own criteria
as to DL evaluation criteria, there is a jungle out there
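For contrast with the DL situation, the IR criterion of relevance comes with standard, widely agreed measures. A minimal sketch of the two classic ones, precision and recall, computed over a single retrieved set; the document IDs and judgments are made up:

```python
def precision_recall(retrieved, relevant):
    """Classic set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# hypothetical relevance judgments for one query
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]
print(precision_recall(retrieved, relevant))  # (0.5, 0.666...)
```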

Slide 22: usability criteria
"extent to which a user can achieve goals with effectiveness, efficiency & satisfaction in a context of use" (International Standards Organization; a minimal measurement sketch follows below)
- widely used, but no uniform definition for DLs
- a general, meta criterion; covers a lot of ground
- an umbrella for many specific criteria used in DL evaluations
so what was meant by usability in DL evaluation?
- usability criteria as to content, process, format & overall assessment
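The ISO definition names three measurable components. A minimal sketch of how a usability test might tally them from per-task session records; the record layout and the numbers are hypothetical:

```python
from statistics import mean

# hypothetical per-task records: (completed?, time in seconds, satisfaction rating 1-7)
sessions = [
    (True, 95, 6),
    (False, 240, 2),
    (True, 130, 5),
    (True, 80, 7),
]

effectiveness = mean(1.0 if done else 0.0 for done, _, _ in sessions)  # task completion rate
efficiency = mean(t for done, t, _ in sessions if done)                # mean time on completed tasks
satisfaction = mean(s for _, _, s in sessions)                         # mean rating

print(f"effectiveness={effectiveness:.0%}, efficiency={efficiency:.0f}s, satisfaction={satisfaction:.1f}/7")
```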

Slide 23: usability criteria (cont.)
Content (of a portal or site):
- accessibility, availability
- clarity (as presented)
- complexity (organization, structure)
- informativeness
- transparency
- understanding, effort to understand
- adequacy
- coverage, overlap
- quality, accuracy
- validity, reliability
- authority
Process (carrying out tasks such as search, browse, navigate, find, evaluate, or obtain a resource):
- learnability (how easily the task is learned)
- effort/time to carry out
- convenience, ease of use
- lostness (confusion; see the sketch below)
- support for carrying out
- completion (achievement of task)
- interpretation difficulty
- sureness in results
- error rate
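One of the process criteria, lostness, has a commonly cited quantitative form (Smith, 1996) based on how far a user's navigation strays from the optimal path. The sketch below assumes that formulation rather than anything specific to the studies reviewed here, and the session numbers are invented:

```python
from math import sqrt

def lostness(total_visited: int, distinct_visited: int, minimum_needed: int) -> float:
    """Smith's (1996) lostness measure: 0 = perfectly efficient navigation, higher = more lost."""
    s, n, r = total_visited, distinct_visited, minimum_needed
    return sqrt((n / s - 1) ** 2 + (r / n - 1) ** 2)

# hypothetical session: 12 page views, 9 distinct pages, optimal path needs 4 pages
print(round(lostness(12, 9, 4), 2))  # about 0.61
```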

Slide 24: usability criteria (cont.)
Format:
- attractiveness
- consistency
- representation of labels (how well are concepts represented?)
- communicativeness of messages
Overall assessment:
- satisfaction
- success
- relevance, usefulness of results
- impact, value
- quality of experience
- barriers, irritability
- preferences
- learning

Slide 25: systems criteria
as DLs are systems, many traditional systems criteria were used
they pertain to the performance of a given
- process/algorithm
- technology
- system overall

Slide 26: systems criteria (cont.)
Process/algorithm performance:
- relevance (of obtained results)
- clustering
- similarity
- functionality
- flexibility
- comparison with human performance
- error rate
- optimization
- logical decisions
- path length
- clickthroughs
- retrieval time
Technology performance:
- response time (see the timing sketch below)
- processing time, speed
- capacity, load
Overall system:
- maintainability
- scalability
- interoperability
- sharability
- costs
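Technology-performance criteria such as response time are the most straightforward of these to operationalize. A minimal sketch that times repeated requests against a search endpoint and summarizes latency; the URL and query are placeholders, not a real DL:

```python
import statistics
import time
from urllib.request import urlopen

def measure_response_times(url: str, trials: int = 10) -> list:
    """Time repeated GET requests and return per-request latency in seconds."""
    latencies = []
    for _ in range(trials):
        start = time.perf_counter()
        with urlopen(url) as response:
            response.read()
        latencies.append(time.perf_counter() - start)
    return latencies

# placeholder endpoint; any reachable search URL would do
times = measure_response_times("https://example.org/search?q=digital+libraries")
print(f"median={statistics.median(times):.3f}s  worst={max(times):.3f}s")
```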

Slide 27: other criteria
use, usage:
- usage patterns
- use of materials
- usage statistics
- who uses what, when
- for what reasons/decisions
ethnographic, in different groups:
- conceptions, misconceptions
- practices
- language, frame of reference
- communication
- learning
- priorities
- impact

Slide 28: 4. methodologies
digital libraries are complex entities
- many methods are appropriate
- each has strengths and weaknesses
the range of methods used is wide
- there is no "best" or holistic method
- but the lack of agreement or standardization on any method makes generalizations difficult, even impossible

Slide 29: 4. methodologies used
- surveys (most prevalent)
- interviews
- observations
- think-aloud protocols
- focus groups
- task performance
- log analysis (a minimal sketch follows below)
- usage analysis
- record analysis
- experiments
- economic analysis
- case studies
- ethnographic analysis
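Of the methods listed, log analysis is the most readily automated. A minimal sketch that extracts simple usage patterns (requests per day and the most requested items) from a web-server-style transaction log; the log format and file name are assumptions:

```python
import re
from collections import Counter

# assumed common-log-style lines: host - - [day/Mon/year:time ...] "GET /path ..." status size
LINE = re.compile(r'\S+ \S+ \S+ \[(\d{2}/\w{3}/\d{4}):[^\]]*\] "GET (\S+)')

days, items = Counter(), Counter()
with open("access.log") as log:            # hypothetical log file
    for line in log:
        match = LINE.match(line)
        if match:
            day, path = match.groups()
            days[day] += 1
            items[path] += 1

print("busiest days:", days.most_common(3))
print("most requested items:", items.most_common(5))
```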

Slide 30: 5. results
- not synthesized here
- hard to synthesize anyhow
- generalizations are hard to come by
- except one!

Slide 32: 5. findings – users and digital libraries
a number of studies reported various versions of the same result: users have many difficulties with DLs
- they usually do not fully understand them
- they hold a different conception of a DL than its operators or designers do
- they lack familiarity with the range of capabilities, content and interactions
- they often engage in blind-alley interactions

Slide 33: users and digital libraries
"It's like being given a Rolls Royce and only knowing how to sound the horn"
(quote from a surgeon in a study of digital libraries in a clinical setting; Adams & Blandford, 2001)

Slide 34: analogy
the perceptions of users and the perceptions of designers and operators of a DL are generally not very close
- users are from Venus and digital libraries are from Mars
this leads to the versus hypothesis

Slide 35: is it
users AND digital library
or
users VERSUS digital library?
why VERSUS?
- users and digital libraries see each other differently

Slide 36: users AND digital library model (diagram)
- user (in context): cognitive, task, affective, competence
- digital library (in context): content, representation, organization, services
- connected by the user's model of the digital library and the digital library's model of the user

Slide 37: user VERSUS digital library model (diagram)
how close are they?
- the digital library's model of the user: what the digital library assumes about the user – behavior? needs?
- the user's model of the digital library: what the user assumes about the digital library – how does it work? what to expect?

Slide 38: the versus hypothesis
in use, more often than not, digital library users and digital libraries are in an ADVERSARIAL position
the hypothesis does not apportion blame
- it does not say that DLs are poorly designed
- or that users are poorly prepared
an adversarial relation may be the natural order of things

Slide 39: evaluation of digital libraries – conclusions I
impossible? not really. hard? very.
no generalizations yet
- no theories
- no general models embraced yet, although quite a few have been proposed
in comparison to the total work on DLs, only a fraction is devoted to evaluation: WHY?

Slide 40: why? – some speculations
Complexity: digital libraries are highly complex
- much, much more than technological systems alone
- evaluation of complex systems is very hard
- we are just learning how to do this job
- experimenting with doing it in many different ways
Premature: it may be too early in the evolution of digital libraries for evaluation on a more organized scale

Slide 41: why? (cont.)
Interest: there is no interest in evaluation
- R&D is interested in doing, building, implementing, breaking new paths, operating ...
- evaluation is of little or no interest
- plus there is no time to do it, and no payoff
Funding: inadequate or no funds for evaluation
- evaluation is time-consuming, expensive, and requires commitment
- grants have minimal or no funds for evaluation
- granting agencies are not allocating programs for evaluation
- no funds = no evaluation

Slide 42: why? (cont.)
Culture: evaluation is not a part of the research and operations culture of DLs
- below the cultural radar; a stepchild
- communities with very different cultures are involved
- language, frames of reference, priorities and understandings differ
- communication is hard, at times impossible
- evaluation means very different things to different constituencies

Slide 43: why? (cont.)
Cynical: who wants to know or demonstrate actual performance?
- the emperor's clothes?
- dangerous?
- evaluation may be subconsciously or consciously suppressed

Slide 44: ultimate evaluation
the ultimate evaluation of digital libraries:
- assessing transformation in their context and environment
- determining possible enhancing changes in institutions, learning, scholarly publishing, disciplines, small worlds ...
- and ultimately in society, due to digital libraries

Slide 45: conclusions II
1. evaluation of digital libraries is still in its formative years
2. it is not funded much, if at all
3. but it is necessary for understanding how to
- build better digital libraries & services, and
- enhance their role
THE END! – except ...

Slide 46: THANKS!
Dubrovnik – conference organized by Bozo Tezak, 1977; picture taken by Bob Hayes
Another paper at another time ...

Slide 47: evaluation of a digital library – how to do it?

Slide 48: evaluation perspective – Rockwell (image slide)
Slide 49: evaluation perspectives (image slide)
Slide 50: evaluation perspective ... (image slide)

Slide 52: sources & acknowledgements
- the paper and PowerPoint presentation: http://www.scils.rutgers.edu/~tefko/articles
- annotated bibliography: http://www.scils.rutgers.edu/~miceval
- a related version was presented at the DELOS WP7 Workshop on the Evaluation of Digital Libraries, University of Padua, Italy, 4-5 October 2004
- thanks to Ying Zhang, Ph.D. candidate at Rutgers, who helped compile the bibliographies and did the annotations at the MICeval site
- thanks also to R. Chialdi

