COMP3740 CR32: Knowledge Management and Adaptive Systems Overview and example KM exam questions By Eric Atwell, School of Computing, University of Leeds.

S1: Eric Atwell Office: 6.06a S2: Vania Dimitrova Office: 9.10p

Semester 1 Topics in KM
Knowledge in Knowledge Management
– the nature of knowledge, definitions and different types
– knowledge used in Knowledge Based Systems, KM systems
Knowledge and Information Retrieval / Extraction
– analysis of WWW data: Google tools, SketchEngine, BootCat
– IR: finding documents which match keywords / concepts
– IE: extracting key terms, facts (DB fields) from documents
– matching user requirements, advanced/intelligent matching
– mining the WWW as a source of data and knowledge
Knowledge Discovery
– collating data in a data warehouse; transforming and cleaning
– Cross-Industry Standard Process for Data Mining (CRISP-DM)
– OLAP, knowledge visualisation, machine learning in WEKA
– analysis of WWW-sourced data

Past Exam Papers?
One way to see what you need to learn is to look at past exam papers – this gives a bird's-eye view.
COMP3740 CR32 is a new module … BUT developed from
– COMP3410 Technologies for Knowledge Management
– COMP3640 Personalisation and User-Adaptive Systems
For example, a past COMP3410 exam paper covers some topics in CR32.

Q1a: KM for bibliographic search
Serge Sharoff is a lecturer at Leeds University who has published many research papers relating to technologies for knowledge management, for example: …
(i) Imagine you are asked to assess the impact of Dr Sharoff's research, by finding a list of papers by other researchers which cite these publications. Suggest three Information Retrieval tools you could use for this task. State an advantage and a disadvantage of each of these three IR tools for this search task, in comparison to the other tools.

A1a: KM for bibliographic search
(i) Name 3 appropriate tools, e.g. Google Scholar, CiteSeer, ISI Web of Knowledge, Google Books.
An appropriate pro and con of each, e.g.:
Google Scholar: Pro: wider coverage, all publications on the open WWW; Con: does not give full references, just URL and some details
CiteSeer: Pro: stores papers in several formats plus BibTeX references; Con: not as good coverage, especially interdisciplinary
ISI Web of Knowledge (Web of Science): Pro: good coverage of top journals, including paid-for; Con: most papers in this field are not in top journals

Q1a (ii): KM doesn't always work
Q: Suggest three reasons why citations for some papers might not be found by any of your suggested IR tools.
A:
- Two of these papers are in Russian, so citations may be too; these tools focus on English-language papers
- Papers in this field are mainly in conference/workshop proceedings, not journals, hence less likely to be indexed by IR tools (especially Web of Science)
- Older papers may not be online, so are less likely to be found and cited by others

Q1b: Info Retrieval v Info Extraction What is the difference between Information Retrieval and Information Extraction? A Knowledge Management consultancy aims to build a database of all Data Mining tools available for download via the WWW, including name, cost, implementation language, input/output format(s), and Machine Learning algorithm(s) included; should they use IR or IE for this task, and why?

A1b: Info Retrieval v Info Extraction
IR: finding whole documents which match a query.
IE: extracting data/information from a given text to populate fields in database or knowledge-base records.
Both IR and IE are appropriate: this task requires IR to find DM tool description webpages from the whole WWW; then finding the specific details in each webpage is an IE task, identifying fields in records for DB population.
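A minimal sketch of the IE step: pulling field values out of one retrieved webpage with regular expressions. The page text and the patterns here are illustrative assumptions, not taken from a real tool's site.

```python
import re

# Hypothetical snippet from a Data Mining tool's download page.
page = "WEKA is a free data mining tool implemented in Java. Input format: ARFF."

# IE: extract specific field values from the retrieved text to populate a DB record.
cost = re.search(r"\b(free|\$\d+)\b", page)
language = re.search(r"implemented in (\w+)", page)
in_format = re.search(r"Input format: (\w+)", page)

record = {
    "cost": cost.group(1) if cost else None,
    "language": language.group(1) if language else None,
    "input_format": in_format.group(1) if in_format else None,
}
print(record)  # -> {'cost': 'free', 'language': 'Java', 'input_format': 'ARFF'}
```

IR would supply many such pages; running patterns like these over each one fills the database rows.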

Q1c: using relevance feedback to adapt a query
An IR query finds matching documents. The user may say some are not relevant. Relevance feedback can guide the system to adapt the initial query – the new query finds more of the same. This may look complicated but it's just putting the numbers into the equation…

Relevance feedback example
[4 marks: 1 for correct q vector, 1 for realising each sum is a single d vector, 1 for 3 weighted vectors, 1 for answer]
q' = q + Σ di/|HR| − Σ di/|HNR|   (here each sum contains a single document, and all terms are weighted by 0.5)
= 0.5 q + 0.5 d1 − 0.5 d4
= 0.5 (1.0, 0.6, 0.0, 0.0, 0.0) + 0.5 (0.8, 0.8, 0.0, 0.0, 0.4) − 0.5 (0.6, 0.8, 0.4, 0.6, 0.0)
= (0.5, 0.3, 0.0, 0.0, 0.0) + (0.4, 0.4, 0.0, 0.0, 0.2) − (0.3, 0.4, 0.2, 0.3, 0.0)
= (0.6, 0.3, 0.2, 0.3, 0.2)
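The same update can be sketched in code. This assumes the 0.5 weights used in the worked numbers, plus the common convention of clipping negative component weights to zero; a strict evaluation of the subtraction makes the third and fourth components negative, so they come out as 0.0 here.

```python
def rocchio(q, rel_docs, nonrel_docs, alpha=0.5, beta=0.5, gamma=0.5):
    """Weighted relevance-feedback update:
    q' = alpha*q + beta*mean(HR) - gamma*mean(HNR),
    with negative components clipped to zero (a common convention)."""
    n = len(q)
    def centroid(docs):
        return [sum(d[i] for d in docs) / len(docs) for i in range(n)] if docs else [0.0] * n
    r, s = centroid(rel_docs), centroid(nonrel_docs)
    return [round(max(0.0, alpha * q[i] + beta * r[i] - gamma * s[i]), 2) for i in range(n)]

q  = [1.0, 0.6, 0.0, 0.0, 0.0]
d1 = [0.8, 0.8, 0.0, 0.0, 0.4]   # judged relevant (HR)
d4 = [0.6, 0.8, 0.4, 0.6, 0.0]   # judged non-relevant (HNR)
print(rocchio(q, [d1], [d4]))    # -> [0.6, 0.3, 0.0, 0.0, 0.2]
```

The adapted query gains weight on terms from the relevant document (the fifth component) and loses weight on terms that only occur in the non-relevant one.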

Q2: Knowledge processes
In 2008, Leeds University adopted the Blackboard Virtual Learning Environment (VLE) to be used in undergraduate taught modules in all schools and departments. In future, lectures and tutorials may become redundant at Leeds University: if we assume that student learning fits Coleman's model of Knowledge Management processes, then the Virtual Learning Environment provides technologies to deal with all stages in this model. All relevant explicit, implicit, tacit and cultural knowledge can be captured and stored in our Virtual Learning Environment, for students to access using Information Retrieval technologies. Is this claim plausible? In your answer, explain what is meant by Coleman's model of Knowledge Management processes, citing examples relating to learning and teaching at Leeds University. Define and give relevant examples of the four types of knowledge; and state whether they could be captured and stored in our VLE, and searched for via an Information Retrieval system. [20 marks]

even an essay has a marking scheme
Key points:
- Coleman process of knowledge gathering/acquisition: the big problem would be data capture and preparation
- Coleman process of knowledge storage/organisation: KM/IR could be of great benefit
- Coleman process of knowledge refining/adding value: lectures aim at more than rote learning
- Coleman process of knowledge transfer/dissemination: students prefer the human factors of lectures?
- Explicit Knowledge has been articulated - example: lecture notes, course handbooks - already captured, and already accessible via IR search
- Implicit Knowledge hasn't been articulated (but could be) - example: extra material known to the lecturer but not on the handouts - could potentially be captured, accessible if in text form, e.g. transcripts
- Tacit Knowledge can't be articulated but is done without thinking - example: how to design and implement elegant programs - tacit knowledge cannot be captured, hence cannot be searched for via IR
- Cultural Knowledge is shared norms/beliefs that enable concerted action - example: students cooperate in groupwork - written guidelines can be captured and retrieved, but not group spirit

Q3: Data Mining with WEKA
Association rules link arbitrary features; e.g. (center = 0) => (color = 0) (100% - a perfect predictor).
Classification rules predict the final feature (class), english=UK/US; e.g. (colorpercent <= 40) => (english = UK) (100% - a perfect predictor).

Simple decision tree

    (colorpercent <= 40)
         /        \
       Yes         No
        |           |
       UK          US

How to choose the root? Aim to balance the decision tree: the best attribute is the one which naturally splits the instances into homogeneous subtrees with the fewest errors. E.g. (colorpercent <= 40) splits the training set into perfectly-predictive subsets.
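A minimal sketch of this split-selection criterion, using entropy as the homogeneity measure. The (colorpercent, class) instances below are hypothetical, chosen so that the slide's split is perfectly predictive.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a list of class labels: 0.0 means perfectly homogeneous."""
    counts = Counter(labels)
    total = len(labels)
    return 0.0 - sum((c / total) * math.log2(c / total) for c in counts.values())

def split_entropy(rows, test):
    """Weighted average entropy of the two subsets produced by a boolean test;
    the best root attribute/test is the one minimising this value."""
    yes = [label for value, label in rows if test(value)]
    no  = [label for value, label in rows if not test(value)]
    n = len(rows)
    return (len(yes) / n) * entropy(yes) + (len(no) / n) * entropy(no)

# Hypothetical training instances: (colorpercent, class)
rows = [(30, "UK"), (35, "UK"), (40, "UK"), (55, "US"), (60, "US")]

# A perfectly-predictive split leaves zero entropy in both subtrees.
print(split_entropy(rows, lambda v: v <= 40))  # -> 0.0
```

A worse candidate, e.g. (colorpercent <= 35), leaves a mixed subset and so scores a higher (worse) weighted entropy.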

Confusion matrix depends on the decision-point given in (b); e.g. for (colorpercent <= 40) we get 2 wrong classifications:

=== Confusion Matrix ===
 a b   <-- classified as
 1 2 |  a = UK
 0 0 |  b = US
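The counting behind such a matrix can be sketched in a few lines; the actual/predicted label lists below are a hypothetical reconstruction matching the counts above (3 UK instances, of which 1 is classified UK and 2 are misclassified as US).

```python
def confusion_matrix(actual, predicted, classes=("UK", "US")):
    """Count (actual class, predicted class) pairs: rows = actual, columns = predicted."""
    m = {(a, p): 0 for a in classes for p in classes}
    for a, p in zip(actual, predicted):
        m[(a, p)] += 1
    return m

actual    = ["UK", "UK", "UK"]
predicted = ["UK", "US", "US"]
m = confusion_matrix(actual, predicted)
print(m[("UK", "UK")], m[("UK", "US")])  # -> 1 2
```

Off-diagonal entries are the misclassifications; here the 2 in cell (UK, US) gives the "2 wrong classifications" on the slide.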

Supervised v unsupervised ML
Supervised learning involves learning from example instances with the desired "answer" or classification, e.g. building a decision tree to predict the last attribute, english=UK/US, given the arff instances.
Unsupervised learning involves learning from example instances without being shown the desired "answer" for each, e.g. clustering instances into groups of similar documents on the basis of discriminative feature-values, not including english as the target class; this may yield another division of the documents.
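The contrast can be sketched on a single hypothetical colorpercent feature: the supervised learner sees the class labels and learns a decision threshold, while the unsupervised learner sees only the values and groups them itself (a simple 1-D 2-means clustering). Data and method choices here are illustrative assumptions, not the module's actual exercise.

```python
def supervised_threshold(instances):
    """Supervised: instances are (value, label) pairs; learn a split point
    as the midpoint between the two class means."""
    uk = [v for v, c in instances if c == "UK"]
    us = [v for v, c in instances if c == "US"]
    return (sum(uk) / len(uk) + sum(us) / len(us)) / 2

def two_means(values, iters=10):
    """Unsupervised: no labels; group values around two centroids (1-D 2-means)."""
    c1, c2 = min(values), max(values)
    for _ in range(iters):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted([c1, c2])

# Hypothetical instances: (colorpercent, class)
data = [(30, "UK"), (35, "UK"), (40, "UK"), (55, "US"), (60, "US")]
print(supervised_threshold(data))         # -> 46.25
print(two_means([v for v, _ in data]))    # -> [35.0, 57.5]
```

Here the clusters happen to line up with the UK/US classes, but on other data the unsupervised grouping may cut the instances along a quite different division, as the slide notes.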

Reminder: bird's-eye overview of KM
Knowledge in Knowledge Management
Knowledge and Information Retrieval / Extraction
Knowledge Discovery
January mock exam: Knowledge Management