의료정보검색 (Information Retrieval) 2003. 9. 24. 최진욱. 2 정보검색이란 Information Retrieval 원하는 정보를 찾는 것 Data retrieval vs. Information retrieval.

Slides:



Advertisements
Similar presentations
Chapter 5: Introduction to Information Retrieval
Advertisements

Modern information retrieval Modelling. Introduction IR systems usually adopt index terms to process queries IR systems usually adopt index terms to process.
Multimedia Database Systems
Basic IR: Modeling Basic IR Task: Slightly more complex:
Modern Information Retrieval Chapter 1: Introduction
UCLA : GSE&IS : Department of Information StudiesJF : 276lec1.ppt : 5/2/2015 : 1 I N F S I N F O R M A T I O N R E T R I E V A L S Y S T E M S Week.
Modern Information Retrieval by R. Baeza-Yates and B. Ribeiro-Neto
Web Search - Summer Term 2006 II. Information Retrieval (Basics Cont.)
Web- and Multimedia-based Information Systems. Assessment Presentation Programming Assignment.
IR Models: Overview, Boolean, and Vector
T.Sharon - A.Frank 1 Internet Resources Discovery (IRD) Classic Information Retrieval (IR)
IR Models: Structural Models
Models for Information Retrieval Mainly used in science and research, (probably?) less often in real systems But: Research results have significance for.
Information Retrieval Concerned with the: Representation of Storage of Organization of, and Access to Information items.
Chapter 2Modeling 資工 4B 陳建勳. Introduction.  Traditional information retrieval systems usually adopt index terms to index and retrieve documents.
IR Models: Latent Semantic Analysis. IR Model Taxonomy Non-Overlapping Lists Proximal Nodes Structured Models U s e r T a s k Set Theoretic Fuzzy Extended.
Advance Information Retrieval Topics Hassan Bashiri.
Vector Space Model CS 652 Information Extraction and Integration.
Modern Information Retrieval Chapter 1 Introduction.
Information retrieval Finding relevant data using irrelevant keys Example: database of photographic images sorted by number, date. DBMS: Well structured.
Using The World Wide Web Information Gathering. TCP/IP Communications protocol  how computers communicate or “talk” How does it work?
IR Models: Review Vector Model and Probabilistic.
Other IR Models Non-Overlapping Lists Proximal Nodes Structured Models Retrieval: Adhoc Filtering Browsing U s e r T a s k Classic Models boolean vector.
Recuperação de Informação. IR: representation, storage, organization of, and access to information items Emphasis is on the retrieval of information (not.
Information Retrieval: Foundation to Web Search Zachary G. Ives University of Pennsylvania CIS 455 / 555 – Internet and Web Systems August 13, 2015 Some.
Internet Technology I د. محمد البرواني. Project Number 3 Computer crimes in the cybernet Computer crimes in the cybernet Privacy in the cybernet Privacy.
1 Jordan University of Science & Technology Faculty of Computer & Information Technology Department of Computer Science CIS 100Internet.
Introduction To Internet
Modern Information Retrieval Computer engineering department Fall 2005.
Information Retrieval Introduction/Overview Material for these slides obtained from: Modern Information Retrieval by Ricardo Baeza-Yates and Berthier Ribeiro-Neto.
Information Retrieval Chapter 2: Modeling 2.1, 2.2, 2.3, 2.4, 2.5.1, 2.5.2, Slides provided by the author, modified by L N Cassel September 2003.
Information Retrieval Models - 1 Boolean. Introduction IR systems usually adopt index terms to process queries Index terms:  A keyword or group of selected.
1 Information Retrieval Acknowledgements: Dr Mounia Lalmas (QMW) Dr Joemon Jose (Glasgow)
Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.
IR Models J. H. Wang Mar. 11, The Retrieval Process User Interface Text Operations Query Operations Indexing Searching Ranking Index Text quer y.
Introduction to Digital Libraries hussein suleman uct cs honours 2003.
Information Retrieval Model Aj. Khuanlux MitsophonsiriCS.426 INFORMATION RETRIEVAL.
CSCE 5300 Information Retrieval and Web Search Introduction to IR models and methods Instructor: Rada Mihalcea Class web page:
1 University of Palestine Topics In CIS ITBS 3202 Ms. Eman Alajrami 2 nd Semester
Mining the Web Ch 3. Web Search and Information Retrieval 인공지능연구실 박대원.
Information in the Digital Environment Information Seeking Models Dr. Dania Bilal IS 530 Spring 2005.
Introduction to Information Retrieval Aj. Khuanlux MitsophonsiriCS.426 INFORMATION RETRIEVAL.
1 Patrick Lambrix Department of Computer and Information Science Linköpings universitet Information Retrieval.
Introduction to Information Retrieval Example of information need in the context of the world wide web: “Find all documents containing information on computer.
Information Retrieval CSE 8337 Spring 2007 Introduction/Overview Some Material for these slides obtained from: Modern Information Retrieval by Ricardo.
Recuperação de Informação Cap. 01: Introdução 21 de Fevereiro de 1999 Berthier Ribeiro-Neto.
Information Retrieval
The Boolean Model Simple model based on set theory
Information Retrieval and Web Search IR models: Boolean model Instructor: Rada Mihalcea Class web page:
C.Watterscsci64031 Classical IR Models. C.Watterscsci64032 Goal Hit set of relevant documents Ranked set Best match Answer.
Set Theoretic Models 1. IR Models Non-Overlapping Lists Proximal Nodes Structured Models Retrieval: Adhoc Filtering Browsing U s e r T a s k Classic Models.
CSCI-235 Micro-Computers in Science The Internet and World Wide Web.
Information Retrieval and Web Search Introduction to IR models and methods Rada Mihalcea (Some of the slides in this slide set come from IR courses taught.
Introduction n IR systems usually adopt index terms to process queries n Index term: u a keyword or group of selected words u any word (more general) n.
Xiaoying Gao Computer Science Victoria University of Wellington COMP307 NLP 4 Information Retrieval.
Information Retrieval Models School of Informatics Dept. of Library and Information Studies Dr. Miguel E. Ruiz.
Web Design Terminology Unit 2 STEM. 1. Accessibility – a web page or site that address the users limitations or disabilities 2. Active server page (ASP)
Week-6 (Lecture-1) Publishing and Browsing the Web: Publishing: 1. upload the following items on the web Google documents Spreadsheets Presentations drawings.
Information Retrieval on the World Wide Web
INFORMATION RETRIEVAL TECHNIQUES BY DR. ADNAN ABID
Multimedia Information Retrieval
Information Retrieval
موضوع پروژه : بازیابی اطلاعات Information Retrieval
CSE 635 Multimedia Information Retrieval
Introduction to Information Retrieval
Recuperação de Informação B
Recuperação de Informação B
Berlin Chen Department of Computer Science & Information Engineering
Recuperação de Informação
Advanced information retrieval
Presentation transcript:

의료정보검색 (Information Retrieval) 최진욱

2 정보검색이란 Information Retrieval 원하는 정보를 찾는 것 Data retrieval vs. Information retrieval

3 What is IR? Information Retrieval is a science which deals with the knowledge representation, storage, organization and access of information items.

4 Need for Information Retrieval 얼마나 많은 정보를 활용할 수 있 는가 ? 정보는 과연 재활용할 수 있는 형 태로 기록되어 있는가 ? Index, search 등을 쉽게 할 수 있는 가 ? 다른 의료기관에서도 같은 표현 을 사용하고 있는가 ? 진료를 얼마나 지원해 줄 수 있는 가 ?

5 NLM and Medline 10 million articles 3,500 journals since 1966 PubMed, Internet Grateful Med –

6 IR 관련된 기술분야 Internet Search Engine Vocabulary System Information Modeling Filtering and Classification Natural Language Processing ….…

7 Internet TCP/IP 를 사용하는 전세계적인 network public(not free, but open to everyone) carrier of electronic mail convenient to get free SW terabytes of information dynamic rerouting

8 Telephone network

9 사용료 ? Another network

10 국방성 ARPAnet TCP/IP 통신프로토콜 사용 Network in early stage

11 Protocol – rules of behavior – 한국 : 정지 신호 준비 – 독일 : 출발 준비 TCP/IP – 2 widely used network protocols : computer network 에 접속하기 위한 100 여 가지의 규약 TCP/IP

12 Internet Address Network 주소 Host 주소 32 bit 8 ~ 24 bit

13 Internet Classes

14 Address

15 URL

16 Hypertext

17 DNS Domain Name System(DNS) – 전세계적으로 흩어져 있는 컴퓨터의 소재 정보를 관리하는 컴퓨터시스템 – 도메인 이름을 IP 주소로 변환하는 역할

18 인터넷에서 정보 찾기 Search engine 이용 News group 에 문의 Mailing list 활용

19 Internet Search Engine – – –

20 Internet Search Engine

21 검색엔진의 특징별 비교 구분 장점 장점 단점 단점 디렉토리형 ( 주제형 ) 특정한 주제별로 정보를 찾 을 때 유리하다. 특별한 주제어나 키워드를 모를 경우 사용하기 편하다. 단계별 분류를 거쳐 정보를 찾 게 되므로 잘못 들어설 경우 시간 적 낭비가 크다. 로봇 에이전트형 ( 키워드형 ) 몇 개의 키워드를 통해 원 하는 정보를 쉽게 찾을 수 있 다. 1~2 개 정도의 단어나 검색식 이 부정확할 경우 정확한 찾기가 어렵다. 메타형 한번의 키워드 입력만으로 원하는 정보를 쉽게 찾을 수 있다. 다른 검색엔진들을 참조해야 하므로 검색속도가 느리다. 특정한 검색엔진별로의 검색 에서 실패할 경우도 많다.

22 AND search Search for Monet AND Renoir Search for +Monet +Renoir Search for Monet Renoir – “All the words” option

23 OR search Search for UPS U.P.S. Search for UPS OR U.P.S. Search for UPS U.P.S – “Any of the Words” option “foreign policy” vs foreign policy

24 NOT search Search for “bugs life” -ants Search for “bugs life” NOT ants Search for “bugs life” AND NOT ants

25 Near Search Korea NEAR climate – Altavista (advanced search) – two terms within 10 words Korea NEAR climate – Lycos (advanced search) – two terms within 25 words

26 USENET - USENET system 광범위한 게시판 시스템 - news group USENET 에 개설된 토론 그룹을 말함 news server in SNU news server in Melbourne

27 Newsgroup Search Engine

28 Mailing list n automatic mailing programs lLISTSERV lMajordomo 백악관 안터넷변천사 computer & privacy travel, weather

IR Modeling

30 IR steps Text processing Indexing – inverted file – signature file Organization in DB Query processing Evaluation

31 Information-Retrieval Process Content Database Information Need Query Result IndexingQuery Formulation Retrieval Evaluation Refinement

32 Indexing Process document accent spacing stop word noun group stemming automatic or manual indexing structure recognition full text index terms

33 Classification of IR UserTaskUserTask Retrieval: Adhoc Filtering Browsing Classic Models boolean vector probabilistic Structured Models non-overlapping lists proximal nodes Browsing Flat Structure Guided Hypertext Set Theoretic Fuzzy Extended Boolean Algebraic Generalized Vector Lat. Semantic Index Neural Network Probabilistic Inference Network Belief Network

34 Boolean Model query can be written in disjunctive normal form q = k a  (k b   k c ) q dnf = (1,1,1)  (1,1,0)  (1,0,0) Ka Kc Kb

35 Vector Model and Weight function K = { 소나타, 2000cc, 자동변속, 흰색, …, k t } D 1 = {20, 20, 11, 5, …,5} D 2 = {20, 18, 12, 4, …, 5} D 30 = {0, 20, 12, 3, …,9} 현대차 삼성차 weight terms are assumed to be mutually independent !

36 Boolean vs. Vector model Petroleum Mexico Oil Texas Refinery Ship ( ) ( ) Boolean Vector

37 Retrieval Issues Indexing – inverted file Ranking – relevance 에 따른 ranking – chronology 에 따른 ranking Display Item Aspirin Attack Heart Prevention Attribute (Doc. #) 1,5,6,9 3,6,7,8 4,6,7,10 1,2,6,9 (aspirin, prevention) (prevention, …) (aspirin, attack, heat)

38 환자번호 : 판독번호 : 검사코드 : RC102 검사명 : Brain CT (Pre contrast) 주진단상병명 : Infarction Of Posterior Cerebral Artery Territory 검사일자 : 검사결과 : BRAIN MRI + MRA [Finding] 양쪽 PVWM 의 여러 개의 UBO 는 underlying SVD 의 가능성이 있을 것으로 보임. Left thalamus 와 occipital lobe 에 patchy high signal intensity 가 있고 FLAIR image 에서 마찬가지 소견임. T1WI 에서 iso signal intensity 의 portion 으로 … Radiology Results WORDDPW abort1,5 aberrant3 brain7,3 … … portion4,8 topology5,8 Inverted File Significant words only Indexing with Inverted File

39 Evaluation of IR Recall – 전체 찾아야 되는 ( 원하는 내용을 담고있는 ) 문서 중 검색되어진 문서 숫자 Precision – 검색된 문서 중 찾고자 하는 문서를 갖고 있 는 비율

40 Word based indexing 문제점 Context – meaning is affected by meaning of other words high, blood, pressure low pressure at high altitude increase red blood cell Polysemy – lead vs lead Synonymy – hypertension vs. high blood pressure Granularity – antibiotics, penicillin Focus of Content – Key word vs Plain word

41 The End