Building a Usable Infrastructure for e-Science: An Information Perspective. Christine L. Borgman, Professor & Presidential Chair in Information Studies.


1 Building a Usable Infrastructure for e-Science: An Information Perspective
Christine L. Borgman
Professor & Presidential Chair in Information Studies
University of California, Los Angeles
e-Science All Hands Meeting, Nottingham
20 September 2005
These slides are available under Creative Commons Non-commercial Attribution License, Christine L. Borgman, 2005 http://creativecommons.org

2 e-Science Goals
to enable new forms of science that are
–information-intensive
–data-intensive
–distributed
–collaborative
–multi-disciplinary
to use information technology to
–leverage data as a form of science capital
–manage the “data deluge”
–improve access to scientific information

3 e-Science infrastructure: Layered Model
[Diagram labels: Information & knowledge layer; Middleware services layer; ITC Infrastructure; Processors, memory, network; Content; Applications Space; Digital Libraries; Scientific DBs; User Interfaces & Tools]
Slide courtesy of Stephen Griffin, NSF, and Norman Wiseman, JISC

4 Usability
Screen displays
Work practices
Culture of science
Incentives of scientists
Economics, law, policy, and institutions of science

5 An infrastructure OF or FOR information?
–OF information:
  A framework to support any kind of information
  Bits, objects, independent of context
–FOR information:
  Fits into work practices
  Facilitates communication between groups
  Provides context for interpretation, use, re-use of information
  Reflects the incentives of scientists
  Provides permanent access to information

6 Value chain of information
Relationships
–Scientific basis
–Sources
–Methods
–History
–Provenance
Networks of
–publications
–data
–composite objects
Image: http://www.indexgeo.com.au/tech/asdd/discover.gif

7 Entire E-Science Cycle
Encompassing experimentation, analysis, publication, research, learning
[Diagram labels: Grid; E-Scientists; E-Experimentation; Institutional Archive; Local Web; Publisher Holdings; Digital Library; Graduate Students; Undergraduate Students; Virtual Learning Environment; Technical Reports; Reprints; Peer-Reviewed Journal & Conference Papers; Preprints & Metadata; Certified Experimental Results & Analyses; Data, Metadata & Ontologies]
eBank Project
Slide courtesy of Liz Lyon, UKOLN

8 Crystallographic e-Prints
Direct access to raw data from scientific papers
Raw data sets can be very large; these are stored at the National Datastore using an SRB server
Slide courtesy Jeremy Frey & Tony Hey

9 British Atmospheric Data Centre
British Oceanographic Data Centre
Simulations
Assimilation
Complexity + Volume + Remote Access = Grid Challenge
Slide courtesy Bryan Lawrence & Tony Hey

10 Roman Forum, Western End, ca. 400 AD, copyright Regents of the University of California

11 Role of publications in science
Product of research
Cumulative, historical record of science
Input to research
Value chain: Network of documents linked via citations
Image: http://www.bronxville.k12.ny.us/Library/Good_Library_person.jpg

12 Access to scientific publications
Libraries
–Paper journals via subscription
–Electronic journals via leased access
–Control via bibliographic records (metadata)
Colleagues
–Pre-prints in disciplinary repositories
–Private circulation
Image: http://siggy.chem.ucla.edu/Visit_UCLA/Visit_UCLA.html

13 Role of data in e-Science
Data-centric collaboration
Data as product of research?
Data as input to research?
Value chain
–Data to data links?
–Provenance?
–Data to publication links?
Image: http://quake.wr.usgs.gov/research/deformation/twocolor/lvnet.gif

14 What are data in science?
Ecology: weather, ground water, sensor readings, historical record
Medicine: x-rays
Chemistry: protein structures
Astronomy: spectral surveys
Biology: specimens
Physics: events, objects
Documentation: Lab and field notebooks, spreadsheets
Image: http://cdiac.ornl.gov/oceans/NAtl_map.jpg

15 When are data?
Instrument readings or scientific fact?
Events or findings?
When to trust data
Factual status
–What to release
–When to release
CENS
Image: New York Times

16 Contaminant Transport Group Multimedia, Multiscale problems (time and space) Multidisciplinary (current and as yet unknown) problems Management, visualization, exploration of massive, heterogeneous data streams Monitoring habitat with sensor networks

17 How are data documented?
Standards
–Metadata standards within fields
–Ontologies within fields
Practices
–Project-specific data models
–Instrument-specific models
–Researcher-specific models
Current data
–Born digital
Legacy data
–Born digital in other formats
–Paper, other media
–Documented by project, instrument, researcher…
Image source: http://www.medscape.com/content/2004/00/46/81/468129/art-mgm468129.fig1.jpg


19 What data are retained for re-use?
Genomics: deposit expected
Physics: shared by collaborators, not openly published
Chemistry: highly contentious
Ecology: many small, local projects, local data
Image source: http://www.bbc.co.uk/schools/gcsebitesize/img/ict04datastorage.gif

20 Under what conditions can data be shared, re-used?
Funding source
–Public: access may be mandatory
–Private: access may be limited
–Public-private partners: negotiated
Economic (resale) value of data
–Chemistry: very high
–Stock market, geospatial: time dependent
–Particle physics: low
Image: http://www.britishcouncil.org/global-common-330x220-pound-sign.jpg

21 Under what conditions can data be shared, re-used?
Privacy, confidentiality
–Sciences (e.g., atoms, molecules, genomes): low
–Sciences (e.g., endangered species): high
–Medicine (e.g., patient records): high
–Social sciences (e.g., interviews, observations): high
Security
–Authorizing access
–Security practices
Image: Christine L. Borgman, 2005

22 Who controls, who owns data?
Ownership vs control
–Own and control
–Control but not own
–Own but not control
Who can authorize release of data?
–Investigator
–University intellectual property office
–Funding agency
–Collaboration partner
What intellectual property practices, rules, laws govern?
Image source: http://www.nelsonmullins.com/legal-practice-area/Practice_Insets/Intellectual-Property-Inter.jpg

23 How to access scientific data?
Private access
–Request data from another investigator
–Train researcher in a novel method or technique
–Barter data for research funds, access to labs, corporate partnership…
Open, public access
–Data repositories: BODC, BADC, BIRN, NEON, GEON, UKDA…
–Data posted on local portal, website
Image: Christine L. Borgman, 1995


27 What data deserve to be permanently accessible?
What are the scientific criteria for preservation?
What is the equivalent of peer review for data?
Whose data do you trust?
What data will be re-used?
How much to invest?
Who will add the value?
Image: Christine L. Borgman, 2005


29 Incentives to share data
Tradition of “open science”
Replicate, compare results
Ask new questions
Form multi-disciplinary alliances
Required by funding agency or journal
Image source: www.buffaloworks.us/images/sharing%20orangs.jpg

30 Incentives not to share data
Rewards for publication, not for data management
Effort to document data
Concern for “free riders”
Risks of misinterpretation of data
Risks of losing control over data
Risks of loss of intellectual property
Image source: www.buildingsrus.co.uk/.../target1.htm

31 Content and context
Scholarly publications provide context
–Literature review, history of problem, definitions of terms
–Theory, hypotheses, goals
–Research method, discussion of results
–Cumulation of scientific knowledge
Datasets, repositories remove context
–Data elements, names of variables
–Instrument readings
–Numerical, textual data
–Images, descriptions of artifacts

32 Constructing the value chain
Links between data, documents and objects
–Based on common standards
–Robust over time
–Robust over migration of software and hardware
Metadata
–Based on common standards
–Describe what can be done with them
–Describe conditions for use
Permanent access
–Incentives
–Curatorial expertise
–Institutional models
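
The value chain on this slide can be pictured as a small graph of records joined by typed links, with each record carrying metadata about its conditions for use. The following is only an illustrative sketch, not anything taken from the talk: the class names (`Record`, `ValueChain`), the identifier scheme (`pub:1`, `data:1`), the relation name `is_supported_by`, and the condition strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    """One node in the value chain: a publication, dataset, or composite object."""
    identifier: str          # hypothetical persistent identifier
    kind: str                # "publication", "data", or "composite"
    title: str
    conditions_for_use: str  # hypothetical values such as "open" or "on request"


@dataclass
class ValueChain:
    """Records plus typed, directed links between them."""
    records: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add(self, record: Record) -> None:
        self.records[record.identifier] = record

    def link(self, source_id: str, relation: str, target_id: str) -> None:
        # Refuse dangling links: both ends must already be registered.
        for identifier in (source_id, target_id):
            if identifier not in self.records:
                raise KeyError(f"unknown record: {identifier}")
        self.links.append((source_id, relation, target_id))

    def linked_from(self, source_id: str, relation: str) -> list:
        """Return the records that source_id points to via the given relation."""
        return [self.records[target] for source, rel, target in self.links
                if source == source_id and rel == relation]


chain = ValueChain()
chain.add(Record("pub:1", "publication", "Example paper", "open"))
chain.add(Record("data:1", "data", "Example raw data set", "on request"))
chain.link("pub:1", "is_supported_by", "data:1")

supporting = chain.linked_from("pub:1", "is_supported_by")
print([(r.identifier, r.conditions_for_use) for r in supporting])
```

Keeping links as identifier triples rather than object references is one way to keep them independent of any particular software, in the spirit of the slide's "robust over migration" point; a real system would rely on community metadata standards rather than these ad hoc fields.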

33 An infrastructure for information
Support work practices
–Tools and services to capture, retain, document
Facilitate communication, exchange
–Methods to describe, cite documents and data
–Methods to represent and use composite objects
–Robust linking
Reflect scientific incentives
–Rewards for data contribution
–Rewards for data management
Image source: http://clubs.myams.org/gvca/images/Context-Meaning_web.gif

34 Involve the stakeholders
Scientists
Technologists
Social scientists
Librarians, archivists
Universities
Funding agencies
Corporate partners
Publishers
Image source: http://www.ox.ac.uk/

35 “May all your problems be technical”
Jim Gray, ACM Turing award winner

