Heterogeneous Data Analysis: Tools, Methods, Applications Andrei Mogoutov | AGUIDEL www.aguidel.com.




Scientific Controversy « Spaces »: News Streams; Blog Sphere; Web Sites; E-Communication (Mailing Lists, Forums); Scientific Information Data Bases (Publications, Patents); Offline “Literature”; Surveys / Interviews; Traditional Media (TV, Radio); Specific Data Bases; etc.?

Heterogeneous Data Sets?
Analytical methods and software tools for the treatment of heterogeneous data within a unified framework.

Heterogeneity by source
– Heterogeneous data means diverse types of data from different sources: for example, databases, surveys, questionnaires with open questions and codified variables, interviews and text collections.

Heterogeneity by constitution
– Heterogeneous data not only comes from various sources; it is also varied internally. Different kinds of variables are represented within a data set: for example, geographic location, personal profiles, institutional affiliation, or semantic and lexical units. Software helps users and analysts to understand multivariate relations between entities. The analysis makes evident the ‘hidden’ relations and dependencies within the data.

Heterogeneity by structure
– Data today is highly complex and diverse. Software handles this range, from highly codified and detailed databases, through survey data with numerical variables, to ‘raw’ data.

Heterogeneity by scale
– Analytical solutions help to manage heterogeneity by source, constitution, structure and scale. From the global level of analysis and interconnection down to the institutional, specific and individual level, software makes data visible.

Traditions/Methods/Solutions: Statistics / Data Mining; Textual Analysis Tools / Text Mining; Web Cartography; Scientometrix; Tools for Qualitative Data Analysis; Social / Socio-Technical Networks; GIS; etc.?

Heterogeneous Data Sets: Back Office
“Offline” Questionnaires; Bibliographical Databases; Existing Databases; Templates; Online Data Collection Tools; Actor Location; Actor Identity; Contents Classification Schema; Parsing & Matching System; NETWORK ORIENTED DATABASE; Web Crawler

Design of Analytical Solution
Back Office
- data tables
- web crawler
- matching tools
- tools for textual analysis
- tools for data update and control
Middle Office
- a layer of analytical queries
- pre-defined queries for multilevel data aggregation and synthetic analysis and indicators
Front Office
- graphical/analytical interfaces (GIS, Relational Mappings, Statistical Charts)
- statistical tables, indicators and textual synthesis
- integrated querying tools
ON-LINE / OFF-LINE; DATA UPDATE; FEED-BACK

“Online” Data Collection Tool

Front Office “Desktop”

Front Office “Online”

Scientometrix PubMed (Medline) ISI Derwent



Scientometrix / Numbers Exploring the dynamics of biosafety research using relational data analysis Christophe Bonneuil (Centre Koyré d’Histoire des Sciences, Cnrs, Paris) Andrei Mogoutov (Aguidel Consulting) Etienne Klein (INRA, Avignon) Fabien Moll-François (Centre Koyré)

Scientometrix / Ranking-Listing

Scientometrix: Early Warning: Strategic Diagrams of Research Community Evolution: biosafety research

Scientometrix / Mapping

Scientometrix / Adds: Heterogeneous Networks, Companies & Technologies

Scientometrix / Adds: Actor/Networks
Pharma Group I: central, star-like hierarchical networks
Pharma Group II: less central, less hierarchical
Platform Tech. Companies: clique-like, complex networks
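The contrast between star-like and clique-like networks can be put on a numeric scale. The sketch below uses Freeman degree centralization, which is a standard measure but is not named on the slide and may differ from what the original analysis used. The function name and the two toy networks are hypothetical.

```python
from collections import defaultdict


def degree_centralization(edges):
    """Freeman degree centralization of an undirected graph.

    Returns 1.0 for a perfect star and 0.0 when all nodes have equal degree.
    Only nodes that appear in at least one edge are counted.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 3:
        return 0.0  # the measure is undefined for fewer than three nodes
    degrees = [len(v) for v in neighbors.values()]
    top = max(degrees)
    # Sum of gaps to the best-connected node, normalised by the star maximum.
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical toy networks: one hub with five partners, and five fully linked firms.
star = [("hub", f"partner{i}") for i in range(5)]
clique = [(f"c{i}", f"c{j}") for i in range(5) for j in range(i + 1, 5)]

print(degree_centralization(star))    # star-like network
print(degree_centralization(clique))  # clique-like network
```

A value near 1 corresponds to the central, star-like pattern, and a value near 0 to the clique-like pattern.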

Scientometrix / GIS: Space Biotech Clusters, Boston Region

Mapping of Collaborative Networks
Web Tool Box for Heterogeneous Data Analysis
Andrei Mogoutov | AGUIDEL Paris
Sources: Bibliographical Databases: PubMed (Medline), ISI, etc.
Heterogeneous Network Output Analysis & Mapping:
- Co-Authorship Networks
- Content Analysis: Keyword Mapping
- Heterogeneous Networks: Authors vs Keywords
- Statistical tables and raw data downloadable for desktop tools
- SVG Mapping Output
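As an illustration of the two network types listed here, the following sketch derives weighted co-authorship edges and author-keyword edges from bibliographic records. It is not the Aguidel toolbox's implementation. The record layout, the function names and the sample records are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records, standing in for entries exported from a bibliographic database.
records = [
    {"authors": ["Author A", "Author B"], "keywords": ["biosafety", "gene flow"]},
    {"authors": ["Author A", "Author C"], "keywords": ["biosafety"]},
    {"authors": ["Author B", "Author A"], "keywords": ["risk assessment"]},
]


def coauthorship_edges(records):
    """Count how many records each pair of authors shares (homogeneous network)."""
    edges = Counter()
    for rec in records:
        # Sorting makes (A, B) and (B, A) the same undirected edge.
        for pair in combinations(sorted(set(rec["authors"])), 2):
            edges[pair] += 1
    return edges


def author_keyword_edges(records):
    """Count how often each author appears with each keyword (heterogeneous network)."""
    edges = Counter()
    for rec in records:
        for author in set(rec["authors"]):
            for keyword in set(rec["keywords"]):
                edges[(author, keyword)] += 1
    return edges


for edge, weight in sorted(coauthorship_edges(records).items()):
    print(edge, weight)
```

The resulting edge lists are plain tables, so they can be exported for a desktop mapping tool or fed to a layout routine.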

Scientometrix / Data & Web Mining / Practice with RéseauLu Software

Relational Data Analysis

Textual Analysis

Scientometrix/ Data & Web Mining / Practice

PubMed Mapping I

PubMed Data II

Scientometrix Online Demo: Aguidel Web Toolbox

Web Cartography I: WebMap, a visual directory that maps more than 2 million web sites

Web Cartography II: Conversation Map (Warren Sack), analysis of virtual conversations

Web Cartography III IssueCrawler Project – GovCom.Org, Amsterdam (R Rogers et al)

Web Cartography Online Demo: Aguidel Web Toolbox

Network in Time

News
The North Korean English News Space, Sept. 15 – Nov. 15
Findings:
– Whitehouse.gov (press release) couches North Korea in terms of regime change and human rights. The only other outlet that does so is Frontpagemag.org, a site which, at the time of this map, exhorted surfers to sign an e-petition and help “Stop the Left’s Anti-American Agenda… Help expose terrorists in our midst”.
– Fox News, Newsweek and Asia Times online connect regime change to war; these outlets thus frame regime change in terms of military conflict.
– Regime change and reunification are essentially disconnected, so there is little talk of achieving regime change on a German model.
– The Financial Times subscription service, a strong example of the corporate angle on the issue, presented North Korea only in terms of regime change, notably isolating the issue from conflict, reunification, famine and other issues.
– Only CNN connects famine and reunification, one of the more practical and meaningful associations between the issues of import on the peninsula. This finding defies conventional wisdom, which would have CNN less informed by the stance of regionally located English media outlets.
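Findings of this kind rest on a map of which outlets link which issues. A minimal sketch of that step follows, assuming each outlet has already been coded with the set of issues it raises. The outlet names and codings are placeholders, not the data behind the map, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Placeholder coding: outlet -> issues it raises (not the original data).
coverage = {
    "outlet_1": {"regime change", "war"},
    "outlet_2": {"regime change"},
    "outlet_3": {"famine", "reunification"},
}


def issue_links(coverage):
    """Map each pair of issues to the outlets that mention both."""
    links = defaultdict(set)
    for outlet, issues in coverage.items():
        for pair in combinations(sorted(issues), 2):
            links[pair].add(outlet)
    return dict(links)


def disconnected(coverage, issue_a, issue_b):
    """True when no outlet mentions both issues together."""
    return tuple(sorted((issue_a, issue_b))) not in issue_links(coverage)


print(issue_links(coverage))
print(disconnected(coverage, "regime change", "reunification"))
```

An issue pair with a single linking outlet, or with none, is the kind of pattern the findings describe in words.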

News /offline demo (RéseauLu)

Text Mining & Web Mining Tools
Web Tool Box for Heterogeneous Data Analysis
Sources: Textual data, Web Based Data, Bibliographical Databases, Abstracts, Articles, Titles
- Data collection tools
- Lexical tables
- Visualization of Heterogeneous Networks
- Actor/Lexical/Semantic Networks
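A lexical table can be read as a count of terms per text. The sketch below builds one in that sense, under the assumption that simple lowercase word matching is enough. The function name, the length threshold and the sample texts are hypothetical, and the toolbox's own tokenisation is not documented here.

```python
import re
from collections import Counter


def lexical_table(texts, min_length=4):
    """Return {text_id: Counter of terms} for terms of at least min_length letters.

    Tokenisation is deliberately naive: lowercase ASCII letter runs only,
    so accented characters split words. A real pipeline would need more care.
    """
    table = {}
    for text_id, text in texts.items():
        words = re.findall(r"[a-z]+", text.lower())
        table[text_id] = Counter(w for w in words if len(w) >= min_length)
    return table


# Hypothetical abstracts, for illustration only.
texts = {
    "abstract_1": "Biosafety research and gene flow. Gene flow studies.",
    "abstract_2": "Risk assessment in biosafety research.",
}
table = lexical_table(texts)

for text_id, counts in table.items():
    print(text_id, dict(counts))
```

Terms shared between rows of such a table are what link texts, and through them actors, in a lexical or semantic network.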