An Ontological Approach to Financial Analysis and Monitoring.

Application Architecture

Research Areas
● Data Extraction
● Data Disambiguation
● Semantic Association and Ranking
● MathML-S (MathML with Semantics)

Data Extraction
● The ontology schema is designed to provide a common terminology for a domain.
● Ontology instances represent actual data from the domain.
● Data is extracted from multiple sources and translated into instances of a single ontology.
● Original data sources include:
  ● Databases
  ● XML files
  ● HTML pages
  ● Text documents
  ● Etc.
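The translation of heterogeneous sources into instances of one ontology can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the field names, the `FIELD_MAP` table, the instance identifiers, and the choice of (subject, property, value) triples as the instance format. The slides do not describe the actual extraction code.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from source-specific field names to ontology properties.
FIELD_MAP = {
    "fname": "FirstName",
    "lname": "LastName",
    "first": "FirstName",
    "last": "LastName",
}


def row_to_triples(instance_id, row):
    """Translate one database row (a dict) into ontology triples."""
    return [(instance_id, FIELD_MAP[key], value)
            for key, value in row.items() if key in FIELD_MAP]


def xml_to_triples(instance_id, xml_text):
    """Translate the child elements of one XML record into ontology triples."""
    root = ET.fromstring(xml_text)
    return [(instance_id, FIELD_MAP[child.tag], (child.text or "").strip())
            for child in root if child.tag in FIELD_MAP]


# Two sources with different schemas end up in the same vocabulary.
db_row = {"fname": "Tim", "lname": "Robins", "internal_id": "42"}
xml_doc = "<person><first>Timothy</first><last>Robinson</last></person>"

triples = row_to_triples("person/1", db_row) + xml_to_triples("person/2", xml_doc)
```

Fields with no counterpart in the ontology (here `internal_id`) are simply dropped. Whether the two resulting instances describe the same person is a separate question, handled by disambiguation.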

Data Disambiguation
Ongoing research is developing relationship-based and attribute-based disambiguation techniques so that the ontology can be populated meaningfully.
Simple example: Is “Athens” the city in Georgia or the city in Greece?

Data Disambiguation Challenges
● Merging two or more databases, ontologies, or XML files that contain multiple references to the same logical entity
● Adding new entities to an ontology when a similar entity already exists
● Variations in database, ontology, and XML schemas
● Variations in information representation
● Incomplete information
● Use of abbreviations, misspellings, various naming conventions, format changes, etc.

Data Disambiguation
Schema: Person
● SSN, TelNumber, FirstName, MiddleName, LastName, Generation, Marital Status, Applicant, dependent of, spouse of, works for, affiliated with, foreign influence event, address
Conflicting instances
● Tim Robins: Single; PeopleSoft; event; place23
● Timothy Wallace Robinson: Married; Oracle; event; place23
Annotations from the figure
● Reconciling Oracle and PeopleSoft indicates that the two person entities work for the same organization
● Recognized as a time-sensitive attribute
● String similarity metrics
● The nature of an attribute indicates its relative importance: SSN is given a high weight in disambiguating person entities
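A minimal sketch of attribute-weighted matching, in the spirit of the cues on this slide. The weights, the alias table, and the use of `difflib.SequenceMatcher` as the string similarity metric are all assumptions. The slide only states that SSN receives a high weight and that string similarity metrics are used.

```python
from difflib import SequenceMatcher

# Assumed weights: only "SSN gets a high weight" comes from the slide.
WEIGHTS = {"ssn": 0.6, "last_name": 0.2, "first_name": 0.1, "works_for": 0.1}

# Assumed reconciliation table treating PeopleSoft and Oracle as one organization.
ORG_ALIASES = {"peoplesoft": "oracle"}


def normalize(attr, value):
    """Lowercase a value and map known organization aliases to one name."""
    cleaned = value.strip().lower()
    if attr == "works_for":
        return ORG_ALIASES.get(cleaned, cleaned)
    return cleaned


def similarity(a, b):
    """String similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def match_score(p, q):
    """Weighted average similarity over attributes present in both records."""
    total = 0.0
    weight_sum = 0.0
    for attr, weight in WEIGHTS.items():
        if p.get(attr) and q.get(attr):
            total += weight * similarity(normalize(attr, p[attr]),
                                         normalize(attr, q[attr]))
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


tim = {"first_name": "Tim", "last_name": "Robins", "works_for": "PeopleSoft"}
timothy = {"first_name": "Timothy", "last_name": "Robinson", "works_for": "Oracle"}
other = {"first_name": "Alice", "last_name": "Nguyen", "works_for": "Acme"}
```

With these assumed weights, `match_score(tim, timothy)` is well above `match_score(tim, other)`. A time-sensitive attribute such as marital status is deliberately left out of the weights, since a mismatch there does not by itself indicate different people.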

Semantic Associations and Ranking
Semantic Associations
● Semantic associations are relationships or paths between concepts in an ontology.
Ranking
● Ranking is based on multiple factors: number of links, types of links, location in the ontology, etc.
● Ranking indicates the degree of semantic “closeness”.

Semantic Associations and Ranking
Characterizing document content in terms of the ontology (“semantic annotation”)
● Correlate words and phrases from the document with entities and relationships in the ontology
● Entity identification
● Metadata added to the document (from associated ontological knowledge)
● An active area of research, but practically useful technology is now available
● Constrained to the content of the ontology

Semantic Associations and Ranking
Semantic Relationships between Documents and Ontology
● Semantic associations: relationships between document concepts and ontology concepts are discovered and ranked
● Ranking is based on multiple factors: number of links, types of links, location in the ontology, …
● Ranking indicates the degree of semantic “closeness”
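As a rough illustration of discovering and ranking semantic associations, the sketch below enumerates paths between two entities and scores each path by the product of per-link-type weights. This lets both the number of links and the types of links influence the rank. The entities, relationship names, weights, and scoring function are assumptions made for illustration. The slides do not specify the actual ranking formula.

```python
# (subject, relationship, object) triples; all names are hypothetical.
TRIPLES = [
    ("TimRobins", "works_for", "Oracle"),
    ("TimRobins", "spouse_of", "JaneRobins"),
    ("JaneRobins", "affiliated_with", "AcmeFund"),
    ("Oracle", "invests_in", "AcmeFund"),
]

# Assumed weight per link type (higher means a stronger link).
LINK_WEIGHT = {"works_for": 0.9, "spouse_of": 0.8,
               "affiliated_with": 0.5, "invests_in": 0.4}


def build_graph(triples):
    """Adjacency lists of (relationship, neighbour); links are treated as undirected."""
    graph = {}
    for subject, rel, obj in triples:
        graph.setdefault(subject, []).append((rel, obj))
        graph.setdefault(obj, []).append((rel, subject))
    return graph


def find_paths(graph, start, goal, max_links=3):
    """All simple paths from start to goal using at most max_links links."""
    paths = []

    def walk(node, visited, steps):
        if node == goal:
            paths.append(list(steps))
            return
        if len(steps) == max_links:
            return
        for rel, nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                steps.append((rel, nxt))
                walk(nxt, visited, steps)
                steps.pop()
                visited.remove(nxt)

    walk(start, {start}, [])
    return paths


def score(path):
    """Product of link weights: short paths over strong links score highest."""
    result = 1.0
    for rel, _ in path:
        result *= LINK_WEIGHT.get(rel, 0.1)
    return result


graph = build_graph(TRIPLES)
ranked = sorted(find_paths(graph, "TimRobins", "AcmeFund"), key=score, reverse=True)
```

Because every weight is below 1, each extra link lowers the score, so the product is one simple way to express that closer entities are semantically "closer".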

Semantic Associations and Ranking
Documents are ranked into the following categories:
● Highly relevant
● Closely related
● Ambiguous
● Not relevant
● Undeterminable

Semantic Associations and Ranking
Research Content
● Discovery and ranking of semantic associations
● Characterizing “need to know” in terms of ontological concepts and relationships
● Metadata annotation of data and of semi-structured and unstructured documents
● Correlation of document content with concepts in the ontology

Semantic Associations and Ranking
Research Challenges
In this project we are addressing:
● Discovery of semantic associations per entity per document
● Input, visualization, and management of the context of investigation
● Scalability with respect to the number of documents and ontology size
  ● Performs well with a thousand documents
● Ranking of documents

Semantic Associations and Ranking
Ranking of Documents: Relevance
“Closely related entities are more relevant than distant entities”
E = {e | e ∈ Document}
E_k = {f | distance(f, e ∈ E) = k}
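The sets E and E_k can be computed with a breadth-first search over the ontology graph. The sketch below assumes that distance means the number of links on the shortest path to the nearest document entity. That is one reading of the definition, not something the slide states. The tiny graph and its entity names are hypothetical.

```python
from collections import deque


def distance_layers(graph, document_entities, max_k):
    """Return {k: E_k} for k = 0..max_k, where E_0 is the set of document entities."""
    dist = {entity: 0 for entity in document_entities}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_k:
            continue  # do not expand beyond the requested distance
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    layers = {k: set() for k in range(max_k + 1)}
    for node, d in dist.items():
        layers[d].add(node)
    return layers


# Hypothetical undirected ontology graph as adjacency lists.
ONTOLOGY = {
    "TimRobins": ["Oracle"],
    "Oracle": ["TimRobins", "AcmeFund"],
    "AcmeFund": ["Oracle", "OffshoreBank"],
    "OffshoreBank": ["AcmeFund"],
}

layers = distance_layers(ONTOLOGY, ["TimRobins"], max_k=2)
```

A relevance engine could then give entities in E_1 more weight than those in E_2, matching the principle that closely related entities are more relevant than distant ones.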

Semantic Associations and Ranking
Relevance Measures for Documents
● Relevance engine input:
  ● the set of semantically annotated documents
  ● the context of investigation for the assignment
  ● the ontology schema, represented in RDFS, and the ontology instances, represented in RDF
● The relevance measure function verifies whether the entity annotations in an annotated document fit the entity classes, entity instances, and/or keywords specified in the context of investigation.

Semantic Associations and Ranking
Ranking of Documents: Relevance
Four groups of document ranking:
● Not Related Documents
  ● Unable to determine relation to context
● Ambiguously Related Documents
  ● Some relationship exists to the context
● Somehow Related Documents
  ● Entities are closely related to the context
● Highly Related Documents
  ● Entities are a direct match to the context

Semantic Associations and Ranking
Ranking of Documents: Relevance (continued)
● Cut-off values determine the grouping of documents with respect to relevance
  ● The cut-off values are customizable, giving more control and more meaningful parameters than, say, automatic classification or statistical approaches
● “Inspection” of a document is possible via (a) the original document or (b) the original document with highlighted entities
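A minimal sketch of grouping by customizable cut-off values. The numeric thresholds and the assumption that relevance is a score in [0, 1] are illustrative only. The group names follow the four groups named in the slides.

```python
# Assumed cut-offs on a relevance score in [0, 1], highest first.
DEFAULT_CUTOFFS = [
    (0.75, "Highly Related"),
    (0.50, "Somehow Related"),
    (0.25, "Ambiguously Related"),
]


def classify(score, cutoffs=DEFAULT_CUTOFFS):
    """Return the first group whose cut-off the score reaches."""
    for threshold, label in cutoffs:
        if score >= threshold:
            return label
    return "Not Related"


# An analyst can tighten the top group without touching the scoring itself.
strict = [(0.90, "Highly Related"), (0.50, "Somehow Related"),
          (0.25, "Ambiguously Related")]

groups = [classify(s) for s in (0.9, 0.5, 0.3, 0.1)]
```

Keeping the cut-offs as plain data is what makes them easy to adjust per investigation, in contrast to a trained classifier whose decision boundaries are harder to interpret.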

Semantic Associations and Ranking
Ontology-driven Thematic Association Lifecycle
Building a scalable and high-performance capability with support for:
● Task domain ontology creation and maintenance
● Ontology “knowledge” based on trusted sources, supporting document classification
● Ontology-driven semantic metadata extraction/annotation
● Utilizing semantic metadata and the ontology to associate document theme(s) with the analytical task
● Weighting process used to measure degree of relevance
Lifecycle stages (from the diagram): Task Domain Schema Creation; Ontology Population; Metadata Extraction and Annotation Enhancement; Thematic Association/Relationships Discovery; Semantic Relationship Rank Analysis
Other diagram labels: Ontology API, MB, KB

MathML-S
● An interface has been developed that allows the user to specify the set of things that need to be verified for any given individual.
● This kind of “ultimate flexibility” is possible because of the ontological approach used.
● An application of research in modeling rules for identifying financial irregularities using MathML (MathML-S)
● Traversals, formulas, rules, and profiles represent the data, calculations, and checks that need to be performed.
● These are created using the graphical interface and stored in the Component Library.

MathML-S
● A traversal is a path through the ontology ending with a data-type value.
● A formula is a computation of a value using concepts found in the ontology. It may be constrained using data found in the ontology instance data.
● A rule is a verification (with a boolean value) performed on:
  ● A computed value that comes from a formula
  ● The existence of certain types of relationships
  Other types of rules may be added based on feedback.
● A profile is a collection of rules, where rules may be given different weights. A profile value can be computed for a person represented in the ontology.
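The relationship between rules and profiles can be sketched as a small data model. The class names, the representation of a person as a dictionary, and the aggregation (the weighted fraction of rules that pass) are assumptions. The slides say only that rules carry weights and that a profile value can be computed for a person.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A boolean check on a person, with a weight used by profiles."""
    name: str
    check: Callable[[dict], bool]
    weight: float = 1.0


@dataclass
class Profile:
    """A weighted collection of rules."""
    rules: List[Rule]

    def value(self, person: dict) -> float:
        """Weighted fraction of rules the person passes (assumed aggregation)."""
        total = sum(rule.weight for rule in self.rules)
        passed = sum(rule.weight for rule in self.rules if rule.check(person))
        return passed / total if total else 0.0


profile = Profile([
    # A rule on computed values.
    Rule("debt_below_income", lambda p: p["debt"] < p["income"], weight=2.0),
    # A rule on the existence of a relationship.
    Rule("has_employer", lambda p: "works_for" in p, weight=1.0),
])

person = {"income": 50000.0, "debt": 20000.0}
profile_value = profile.value(person)
```

Here the person passes the first rule but has no `works_for` relationship, so the profile value is 2.0 / 3.0.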

MathML-S Rule Example: Solvency Ratio Check
Traversals
● Asset_T = value of Asset(n)
● Liability_T = value of Liability(n)
Formulas
● Total_Assets = Asset_T1 + Asset_T2 + … + Asset_Tn
● Total_Liabilities = Liability_T1 + Liability_T2 + … + Liability_Tn
● Solvency_Ratio = Total_Assets / Total_Liabilities
Rule
● Solvency_Ratio_Check = Solvency_Ratio > 1.1
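The solvency ratio check can be written out directly. This sketch assumes a person is represented as a dictionary whose `assets` and `liabilities` entries hold the values reached by the two traversals. It also assumes that a person with no liabilities has an infinite ratio and therefore passes. The slides do not cover that case.

```python
def traverse(person, relationship):
    """Traversal: follow one relationship from a person to its data values."""
    return person.get(relationship, [])


def total_assets(person):
    """Formula: Total_Assets = Asset_T1 + ... + Asset_Tn."""
    return sum(traverse(person, "assets"))


def total_liabilities(person):
    """Formula: Total_Liabilities = Liability_T1 + ... + Liability_Tn."""
    return sum(traverse(person, "liabilities"))


def solvency_ratio(person):
    """Formula: Solvency_Ratio = Total_Assets / Total_Liabilities."""
    liabilities = total_liabilities(person)
    if liabilities == 0:
        return float("inf")  # assumption: no liabilities counts as solvent
    return total_assets(person) / liabilities


def solvency_ratio_check(person, threshold=1.1):
    """Rule: Solvency_Ratio_Check = Solvency_Ratio > 1.1."""
    return solvency_ratio(person) > threshold


solvent = {"assets": [120000.0, 30000.0], "liabilities": [100000.0]}
borderline = {"assets": [50000.0], "liabilities": [50000.0]}
```

For `solvent` the ratio is 150000 / 100000 = 1.5, so the check passes. For `borderline` the ratio is 1.0, so it fails.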

MathML-S
● Data integrated in the ontology is queried to verify compliance with a set of customizable rules.
● These rules include math calculations, verification of conditions, calculation and verification of ratios, etc.
● The ontological approach allows such rules to be defined using concepts and relationship types of the ontology, so they also apply to sub-concepts.

MathML-S
● Semantic matching and the graph-like nature of the ontology provide flexibility in defining the rules, yet a few research issues remain to be addressed.
● Operands for a formula are data items in the ontology. Some data values are retrieved by traversing specific sequences of relationships.
● Formulas are self-contained so that they can be reused in various rules (thus providing flexibility in maintenance).
● A profile is a set of rules; verifying it implies querying the ontology while computing formula and rule values.

MathML-S