1 CBioC: Collaborative Bio-Curation. Chitta Baral, Department of Computer Science and Engineering, Arizona State University
2 Agenda: Introduction; Using the C-BioCurator System (Overall Architecture, Installation, User Authentication, User Interaction, Text Extraction Systems, Existing Databases); System Implementation; Conclusion and Future Work
3 Introduction. Motivation: Our goal in this paper is to help extract the information nuggets in articles and abstracts and store them in a database. The challenge is that the number of articles is huge and keeps growing, and that extracting the information requires processing natural language. The two existing approaches are human curation and automatic information extraction systems. Neither meets the challenge: the first is expensive, and the second is error-prone.
4 Introduction (cont’d) Approach: We propose a solution that is inexpensive and that scales up. Our approach uses automatic information extraction methods as a starting point. It rests on the premise that if there are a lot of articles, then there must also be a lot of readers and authors of these articles. We provide a mechanism by which the readers of the articles can participate and collaborate in the curation of information. We refer to our approach as "Collaborative Curation".
5 Introduction (cont’d) Results: We report on our system CBioC (short for Collaborative Bio-Curator), which facilitates collaborative curation. Availability: A prototype of the web interaction version is currently available at http://www.cbioc.org
6 Using the C-BioCurator System. Overall Architecture: The two main components of our CBioC system are (i) the CBioC interface and (ii) the CBioC database. The user interacts with the CBioC system through the CBioC interface. The curated or extracted data (from the abstracts and texts of the articles), together with the users' interactions with these data, are stored in the CBioC database.
7 Using the C-BioCurator System (cont’d)
8 Installation and Invocation: A researcher needs to download our system and install it on her computer. Whenever the researcher opens a web page from which she can access an article or an abstract, the CBioC system wakes up and creates an interaction frame.
9 With Web Band Version
10 Without Web Band Version
11 Using the C-BioCurator System (cont’d) User authentication: Authentication is necessary because different kinds of users are allowed different levels of interaction with our system. For example, anonymous (non-registered) users may only browse; they cannot leave any lasting mark, such as adding facts or voting.
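The access levels described on this slide can be pictured as a small role-to-permission table. The following is only a sketch under stated assumptions: the slides name anonymous and registered users and the actions of browsing, adding facts, and voting, but the identifiers `Role`, `PERMISSIONS`, and `is_allowed` are hypothetical and are not taken from the CBioC code.

```python
from enum import Enum


class Role(Enum):
    """User kinds named on the slide (identifiers are illustrative)."""
    ANONYMOUS = "anonymous"
    REGISTERED = "registered"


# Hypothetical action names for the activities the slide mentions.
PERMISSIONS = {
    Role.ANONYMOUS: frozenset({"browse"}),
    Role.REGISTERED: frozenset({"browse", "add_fact", "vote"}),
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if a user with this role may perform the action."""
    return action in PERMISSIONS[role]
```

With this table, `is_allowed(Role.ANONYMOUS, "vote")` is false while `is_allowed(Role.REGISTERED, "vote")` is true, which mirrors the browse-only restriction on non-registered users.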
13 Using the C-BioCurator System (cont’d) User Interaction: After user authentication, CBioC uses the PubMed ID passed to it to search the database for any data about that article. If it finds such data, it displays them in the interaction frame, taking into account the researcher’s preferences. Registered researchers can vote on the correctness of individual data tuples.
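The lookup-and-vote flow on this slide can be illustrated with an in-memory stand-in for the database. This is a sketch, not the CBioC implementation: the `Fact` fields, the `facts_for` helper, and the rule that a user's repeated vote overwrites the earlier one are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Fact:
    """One curated or extracted data tuple (fields are illustrative)."""
    fact_id: int
    pubmed_id: str
    text: str
    votes: Dict[str, bool] = field(default_factory=dict)  # user -> "is correct"

    def vote(self, user: str, correct: bool) -> None:
        # Assumption: one vote per user; a later vote replaces the earlier one.
        self.votes[user] = correct

    def tally(self) -> Tuple[int, int]:
        """Return (votes for correct, votes for incorrect)."""
        yes = sum(1 for v in self.votes.values() if v)
        return yes, len(self.votes) - yes


def facts_for(store: List[Fact], pubmed_id: str) -> List[Fact]:
    """Return every stored fact attached to the given PubMed ID."""
    return [f for f in store if f.pubmed_id == pubmed_id]
```

An interface built this way would call `facts_for` with the PubMed ID of the page being viewed, show the returned tuples in the interaction frame, and call `vote` when a registered researcher judges a tuple.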
14 Using the C-BioCurator System (cont’d) Text extraction systems: We periodically run (off-line) the best available automated text extraction systems on the PubMed abstracts and store the results in the CBioC database. If the CBioC database holds no information about a particular abstract, the information extraction systems are run (on-line) on that abstract and the results are displayed.
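The off-line/on-line split described here amounts to a look-up with a fallback: use stored results when they exist, otherwise run extraction on demand. The sketch below assumes a dictionary in place of the CBioC database and a toy extractor in place of a real extraction system; `get_facts` and `toy_extractor` are hypothetical names, and storing the on-line results for later reuse is an assumption (the slide only says they are displayed).

```python
from typing import Callable, Dict, List

Extractor = Callable[[str], List[str]]


def toy_extractor(abstract: str) -> List[str]:
    """Stand-in for a real extraction system: keep 'interacts with' sentences."""
    return [s.strip() for s in abstract.split(".") if "interacts with" in s]


def get_facts(
    pubmed_id: str,
    abstract: str,
    database: Dict[str, List[str]],
    extractor: Extractor = toy_extractor,
) -> List[str]:
    """Return stored results if present; otherwise extract on-line."""
    if pubmed_id in database:
        # Results produced earlier, e.g. by the periodic off-line run.
        return database[pubmed_id]
    results = extractor(abstract)
    # Assumption: keep the on-line results so the next request is a hit.
    database[pubmed_id] = results
    return results
```

Passing the extractor as a parameter keeps the fallback logic independent of whichever extraction system is considered the best available at the time.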
15 Using the C-BioCurator System (cont’d) Existing databases: Protein Interaction (Extracted, Exchanged (e.g., BIND)); Reference; User account; Voting
16 Implementation
17 Conclusion and Future Work: We have presented a vision and a proposed solution for the seemingly insurmountable problem of curating information nuggets from the extremely large and fast-growing body of bio-medical literature. We have developed a prototype implementing our solution, and it will be improved continuously. We believe that our proposed solution could have a big impact on bio-medical research, and hence this paper.
18 Conclusion and Future Work (cont’d) Our approach of using mass collaboration to curate bio-medical texts can be generalized to the web as a whole (or to other document repositories), where a group of people interested in a group of documents can collaborate to extract the knowledge buried in those documents, using automated extraction systems as a first step. We refer to this as the collaborative meta-web, and we are working on expanding it to many other domains.