Knowledge Discovery and Dissemination (KDD) Program IARPA-BAA-09-10 Question Period: 22 Dec 09 – 2 Feb 10 Proposal Due Date: 16 Feb 10.

Slides:



Advertisements
Similar presentations
E-Science Data Information and Knowledge Transformation Thoughts on Education and Training for E-Science Based on edikt project experience Dr. Denise Ecklund.
Advertisements

o Nearly all 50 states have adopted the Common Core State Standards and Essential Standards. o State-led and developed Common Core Standards for K-12.
Project Proposal.
1.Data categorization 2.Information 3.Knowledge 4.Wisdom 5.Social understanding Which of the following requires a firm to expend resources to organize.
Yvan Rooseleer – BiASC – MAY 2013
SERVING CORPORATES AND INDIVIDUALS ©2012 BUSINESS REPORTING MANAGEMENT SERVICES, INC WELCOME.
PERFORMANCE FOR ALL The Project & the System. A HE project co-ordinated by University of Bristol, open to HE internationally. Developing the requirements.
The Experience Factory May 2004 Leonardo Vaccaro.
Sensemaking and Ground Truth Ontology Development Chinua Umoja William M. Pottenger Jason Perry Christopher Janneck.
Language Arts Technology Resources How can it help you?
A Technique for Advanced Dynamic Integration of Multiple Classifiers Alexey Tsymbal*, Seppo Puuronen**, Vagan Terziyan* *Department of Artificial Intelligence.
Information retrieval Finding relevant data using irrelevant keys Example: database of photographic images sorted by number, date. DBMS: Well structured.
1 IS112 – Chapter 1 Notes Computer Organization and Programming Professor Catherine Dwyer Fall 2005.
Text Mining: Finding Nuggets in Mountains of Textual Data Jochen Dijrre, Peter Gerstl, Roland Seiffert Presented by Huimin Ye.
Text Mining: Finding Nuggets in Mountains of Textual Data Jochen Dijrre, Peter Gerstl, Roland Seiffert Presented by Drew DeHaas.
Integrating Problem-Solving and Educational Software
Microsoft Business Intelligence Gustavo Santade Business Intelligence Project Manager Improving Business Insight Building a cube using Analysis Services.
GL12 Conf. Dec. 6-7, 2010NTL, Prague, Czech Republic Extending the “Facets” concept by applying NLP tools to catalog records of scientific literature *E.
1.Database plan 2.Information systems plan 3.Technology plan 4.Business strategy plan 5.Enterprise analysis Which of the following serves as a road map.
Bettina Matysiak PIDP 3104November Introduction  What are “application cards”?  an informal assessment technique which allows instructors to.
Kansas State University Department of Computing and Information Sciences CIS 830: Advanced Topics in Artificial Intelligence From Data Mining To Knowledge.
1.Knowledge management 2.Online analytical processing 3. 4.Supply chain management 5.Data mining Which of the following is not a major application.
IARPA-BAA Question Period: 22 Dec 09 – 2 Feb 10 Proposal Due Date: 16 Feb 10.
1 Data Mining Books: 1.Data Mining, 1996 Pieter Adriaans and Dolf Zantinge Addison-Wesley 2.Discovering Data Mining, 1997 From Concept to Implementation.
Claudia Marzi Institute for Computational Linguistics, “Antonio Zampolli” – Italian National Research Council University of Pavia – Dept. of Theoretical.
Outline Terminology –Typical Expert System –Typical Decision Support System –Techniques Taken From Management Science and Artificial Intelligence Overall.
Communication & Web Presence David Eichmann, Heather Davis, Brian Finley & Jennifer Laskowski Background: Due to its inherently complex and interdisciplinary.
Problem Identification
ACE TESOL Diploma Program – London Language Institute OBJECTIVES You will understand: 1. How reading for Academic Purposes differs from other reading lessons.
2 Systems Architecture, Fifth Edition Chapter Goals Describe the activities of information systems professionals Describe the technical knowledge of computer.
Annual reports and feedback from UMLS licensees Kin Wah Fung MD, MSc, MA The UMLS Team National Library of Medicine Workshop on the Future of the UMLS.
Marko Grobelnik Jozef Stefan Institute ( Ljubljana, Slovenia.
SCSC 311 Information Systems: hardware and software.
U.S. Department of the Interior U.S. Geological Survey CDI Webinar Sept. 5, 2012 Kevin T. Gallagher and Linda C. Gundersen September 5, 2012 CDI Science.
BUSINESS INFORMATICS descriptors presentation Vladimir Radevski, PhD Associated Professor Faculty of Contemporary Sciences and Technologies (CST) Linkoping.
Learning outcomes for BUSINESS INFORMATCIS Vladimir Radevski, PhD Associated Professor Faculty of Contemporary Sciences and Technologies (CST)
Intelligence Advanced Research Projects Activity (IARPA) Dr. Edward Baranoski September 2009.
Minor Thesis A scalable schema matching framework for relational databases Student: Ahmed Saimon Adam ID: Award: MSc (Computer & Information.
Kingdom of Saudi Arabia Ministry of Higher Education Al-Imam Muhammad ibn Saud Islamic University College of Computer and Information Sciences Types of.
Lawrence R. Fine, M.Ed. Special Education Program Director Bi-County Collaborative.
Future Learning Landscapes Yvan Peter – Université Lille 1 Serge Garlatti – Telecom Bretagne.
Introduction – Addressing Business Challenges Microsoft® Business Intelligence Solutions.
Knowledge Representation of Statistic Domain For CBR Application Supervisor : Dr. Aslina Saad Dr. Mashitoh Hashim PM Dr. Nor Hasbiah Ubaidullah.
What is SBIR?  SBIR is a federal program where small businesses compete for up to $670,000 to research, develop and commercialize a new technology. 
Student Edition: Gale Info Trac Database Lesson Grades 9-12 High School Student Edition: Gale Info Trac Database Lesson Grades 9-12 High School Anita Cellucci.
ICT-enabled Agricultural Science for Development Scenarios, Opportunities, Issues by ICTs transforming agricultural science, research & technology generation.
Datalayer Notebook Allows Data Scientists to Play with Big Data, Build Innovative Models, and Share Results Easily on Microsoft Azure MICROSOFT AZURE ISV.
Digital Libraries1 David Rashty. Digital Libraries2 “A library is an arsenal of liberty” Anonymous.
Introduction to STEM Integrating Science, Technology, Engineering, and Math.
OntoSoar: Soar Finds Facts in Text Peter Lindes, Deryle Lonsdale, David Embley Brigham Young University 33 rd Soar Workshop, June 2013 pl 6/6/201333rd.
Fire Emissions Network Sept. 4, 2002 A white paper for the development of a NSF Digital Government Program proposal Stefan Falke Washington University.
OMIS 694, Big Data Analytics
MATT DIXONARCHITECT CANDACE REMALYPROJECT MANAGER SPENCER SMITHBUSINESS ANALYST BRYAN LINTHICUMDEVELOPER ADAM STERNFELDTESTER Not Even Funny [Property.
 Explore fundamental issues in computing and develop theories and models to address those issues  Help scientists and engineers solve complex computing.
Data gathering (Chapter 7 Interaction Design Text)
Axis AI Solves Challenges of Complex Data Extraction and Document Classification through Advanced Natural Language Processing and Machine Learning MICROSOFT.
High Risk 1. Ensure productive use of GRID computing through participation of biologists to shape the development of the GRID. 2. Develop user-friendly.
1 Management Information Systems M Agung Ali Fikri, SE. MM.
FNA/Spring CENG 562 – Machine Learning. FNA/Spring Contact information Instructor: Dr. Ferda N. Alpaslan
LIMS (Location Information Management System) is the Smart Claim Solution for Motor Insurers, Built on the Powerful Microsoft Azure Platform MICROSOFT.
GCSE Computer Science Content Overview
Chapter 1 Computer Technology: Your Need to Know
HACT Program Hanscom Academic Cloud Team
The Five Secrets of Project Scheduling A PMO Approach
CHAPTER 1 Introduction BIC 3337 EXPERT SYSTEM.
Management Support Systems: An Overview by Dr. S. Sridhar,Ph. D
Technology Woes?.
European Network of e-Lexicography
Ontology-Based Information Integration Using INDUS System
OU BATTLECARD: Oracle WebCenter Training
Presentation transcript:

Knowledge Discovery and Dissemination (KDD) Program IARPA-BAA Question Period: 22 Dec 09 – 2 Feb 10 Proposal Due Date: 16 Feb 10

SYNOPSIS I ntelligence analysts must gather and analyze information from a wide variety of data sets that include: general references, news, technical journals and reports, geospatial data, entity databases, internal reports and more. The different terminologies, formats, data models, and contexts make it difficult to perform advanced analytic tasks across different data sets. If there are only a small, fixed number of data sets involved in an intelligence problem, then it may be practical to map all of the data sets to a common data model and to develop specialized analytic tools tailored to the problem. However, if the problem changes over time, the data sets are large or numerous, or there are new data sets that need to be integrated with those already in use, then a new approach is required. The focus of the KDD program is to develop novel approaches that will enable the intelligence analyst to effectively derive actionable intelligence from multiple, large, disparate sources of information, to include newly available data sets previously unknown to the analyst. The ability to quickly produce actionable intelligence from unanticipated, multiple, varied data sets require research advances in two key areas: (1) alignment of data models; and (2) advanced analytic algorithms. Making advances in these two research areas, and fully characterizing the performance of the research results using real Intelligence problems, is the focus of the IARPA Knowledge Discovery and Dissemination (KDD) Program. Performers shall perform research in both areas and develop prototype systems that implement their techniques and research results. The KDD Program will provide data sets to support research and development in addition to extensive test and evaluation. KDD test and evaluation will take place on an annual cycle, with each performer applying their prototype systems to challenge problems defined by the KDD Program. KDD evaluation of prototype systems will take place at government facilities and will use realistic Intelligence problems and real Intelligence data. The research supported by KDD will generally be unclassified, but the annual KDD evaluations will involve data sets classified no higher than SECRET//NOFORN. The KDD Program requires a combination of innovative research and the capability to develop robust prototypes. Research goals should be set, and research plans should be made, to take full advantage of the length of the KDD Program. The KDD Program expects a staged approach to prototype development; each successive prototype will leverage research progress made since the previous prototype. IARPA encourages teaming between academic and commercial entities to leverage the strengths of both types of organization. KDD requires the prime contractor for each performer team to have personnel and a facility cleared at the SECRET//NOFORN level at the time their proposal is submitted. KDD is planned as a 51-month program and anticipates making multiple awards. All correspondence must be submitted to the BAA address at All proposals should be submitted in accordance with the instructions in Section 4.C. of the Problems to solve System requirements Team requirements

Our Answers to Problems to Solve Wide variety of data sets – General references – the Web? CIA World Factbook?... (OntoES, FOCIH, TISP) – Free-running text – news, technical journals (WePS, [Embley09], Ancestry.com) – Geospatial data ([Embley89b]) – Entity databases (RelDB[Embley97], XML[Al-Kamha07,Al-Kamha08], IMS=heirarchical[Mok06,Mok10], Network=graph=OSM, OWL/RDF[Ding-converter]) – Reports (Filled-in forms and semi-structured data [Tao09,Liddle99],TANGO) – And more (Attensity?) Large and numerous data sets (extension to large and additional types; performance) Alignment of data models (TANGO) – Schema mapping ([Xu03,Xu06], …) – Data integration ([Biskup03]) – Semantic enrichment (MOGO) Advanced analytic algorithms (Giraud-Carrier: knowledge-based semantic distance, record linkage, and hybrid social networks; best-effort, quick answers [Zitzelberger thesis]) Performance ([Al-Muhammed07b], IS/Liddle, Attensity?)

Our Answers to System Requirements Prototype development (IS/Liddle/EnticeLabs liaison with Attensity where actual prototype would be developed) Data set storage and usage (Dithers currently 190Gbyte, 2- 3Tbyte easily obtained, 1Pbyte for $120,000; Attensity?) Ability to develop innovative research and a robust prototype (BYU primarily innovative research; Attensity primarily robust prototype; but synergistic R&D) Staged prototype development (Heart of research plan) – Leverage previous prototype – Show improvement at each stage

Our Answers to Team Requirements Academic: BYU – Computer Science (Embley) – Information Systems Management (Liddle) – Computational Linguistics (Lonsdale) – Data Mining (Giraud-Carrier) Commercial: Attensity Attensity's technology is the culmination of decades of research in the areas of search, natural language processing (NLP,) machine learning, artificial intelligence, and semantics. This research led to breakthrough software that allows computers to understand, process and analyze free-form text, offering government and commercial organizations the opportunity to leverage the vast amounts of information contained in non-structured formats.

SECRET//NOFORN Clearance BYU? BYU Personnel? – Embley, Liddle (US) – Lonsdale, Giraud-Carrier (NOFORN Clearance?) Attensity? Attensity Personnel?