Heterogeneous Data Analysis: Tools, Methods, Applications Andrei Mogoutov | AGUIDEL
Scientific Controversy « Spaces »
- News Streams
- Blog Sphere
- Web Sites
- E-Communication (Mailing Lists, Forums)
- Scientific Information Data Bases (Publications, Patents)
- Offline “Literature”
- Surveys / Interviews
- Traditional Media (TV, Radio)
- Specific Data Bases
- Etc.?
Heterogeneous Data Sets?
Analytical methods and software tools for the treatment of heterogeneous data within a unified framework.
- Heterogeneity by source: Heterogeneous data means diverse types of data from different sources, for example databases, surveys, questionnaires with open questions and codified variables, interviews, and text collections.
- Heterogeneity by constitution: Heterogeneous data not only comes from various sources; it is also varied internally. Different kinds of variables are represented within a data set, for example geographic location, personal profiles, institutional affiliation, or semantic and lexical units. Software helps users and analysts understand the multivariate relations between entities. The analysis makes evident the ‘hidden’ relations and dependencies within your data.
- Heterogeneity by structure: Today's data is highly complex and diverse. Software works across this range, from highly codified and detailed databases, to survey data with numerical variables, to ‘raw’ data.
- Heterogeneity by scale: Analytical solutions help you negotiate and manage heterogeneity by source, constitution, structure and, of course, scale. From the global level of analysis and interconnection down to the institutional, specific, and individual levels, software makes data visible.
Traditions / Methods / Solutions
- Statistics / Data Mining
- Textual Analysis Tools / Text Mining
- Web Cartography
- Scientometrix
- Tools for Qualitative Data Analysis
- Social / Socio-Technical Networks
- GIS
- Etc.?
Heterogeneous Data Sets: Back Office “Offline” Questionnaires Bibliographical databases Existing Databases Templates Online Data Collection Tools Actor Location Actor Identity Contents Classification Schema Parsing & Matching System NETWORK ORIENTED DATABASE Web Crawler
Design of Analytical Solution
Back Office
- data tables
- web crawler
- matching tools
- tools for textual analysis
- tools for data update and control
Middle Office
- a layer of analytical queries
- pre-defined queries for multilevel data aggregation and synthetic analysis and indicators
Front Office
- graphical/analytical interfaces (GIS, Relational Mappings, Statistical Charts)
- statistical tables, indicators and textual synthesis
- integrated querying tools
ON-LINE / OFF-LINE
DATA UPDATE / FEED-BACK
“Online” Data Collection Tool
Front Office “Desktop”
Front Office “Online”
Scientometrix PubMed (Medline) ISI Derwent
Scientometrix / Numbers
Exploring the dynamics of biosafety research using relational data analysis
- Christophe Bonneuil (Centre Koyré d’Histoire des Sciences, CNRS, Paris)
- Andrei Mogoutov (Aguidel Consulting)
- Etienne Klein (INRA, Avignon)
- Fabien Moll-François (Centre Koyré)
Scientometrix / Ranking-Listing
Scientometrix: Early Warning: Strategic Diagrams of Research Community Evolution: biosafety research
Scientometrix / Mapping
Scientometrix / Adds: Heterogeneous Networks, Companies & Technologies
Scientometrix / Adds: Actor/Networks
- Pharma Group I: central, star-like hierarchical networks
- Pharma Group II: less central, less hierarchical
- Platform Tech. Companies: clique-like, complex networks
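The contrast between star-like and clique-like networks can be made concrete with two standard graph measures. The sketch below uses Freeman degree centralization and edge density on toy graphs. The node names and both example graphs are hypothetical, and the slides do not state that these are the measures used to separate the three company groups.

```python
from itertools import combinations

def degrees(edges):
    """Degree of every node in an undirected edge list (no isolated nodes)."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

def degree_centralization(edges):
    """Freeman degree centralization: 1.0 for a pure star, 0.0 for a clique."""
    deg = degrees(edges)
    n = len(deg)
    if n < 3:
        raise ValueError("need at least three nodes")
    max_deg = max(deg.values())
    return sum(max_deg - d for d in deg.values()) / ((n - 1) * (n - 2))

def density(edges):
    """Share of all possible undirected links that are actually present."""
    n = len(degrees(edges))
    return 2 * len(edges) / (n * (n - 1))

# Hypothetical star: one hub company linked to five partners.
star = [("hub", f"partner_{i}") for i in range(5)]
# Hypothetical clique: four companies, all linked to each other.
clique = list(combinations(["a", "b", "c", "d"], 2))

print(degree_centralization(star), density(star))
print(degree_centralization(clique), density(clique))
```

A high centralization with low density points to a hierarchical, hub-dominated network. Low centralization with high density points to a clique-like one.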
Scientometrix / GIS: Space Biotech Clusters, Boston Region
Mapping of Collaborative Networks
Web Tool Box for Heterogeneous Data Analysis
Andrei Mogoutov | AGUIDEL Paris
Sources: bibliographical databases (PubMed (Medline), ISI, etc.)
Heterogeneous Network Output
Analysis & Mapping:
- Co-authorship networks
- Content analysis: keyword mapping
- Heterogeneous networks: authors vs. keywords
Statistical tables and raw data downloadable for desktop tools
SVG mapping output
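As an illustration of the two network outputs named above, the following minimal sketch derives co-authorship links and author-vs-keyword links from bibliographic records. The record layout, the author names, and the function names are assumptions made for the example. A real workflow would first parse an export from PubMed or ISI, and the toolbox's own implementation is not described in the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records; real ones would come from a parsed database export.
records = [
    {"authors": ["Author A", "Author B"], "keywords": ["biosafety", "risk"]},
    {"authors": ["Author A", "Author C"], "keywords": ["biosafety"]},
    {"authors": ["Author B", "Author C", "Author A"], "keywords": ["risk", "gene flow"]},
]

def coauthorship_edges(records):
    """Count how often each unordered pair of authors shares a record."""
    edges = Counter()
    for rec in records:
        for a, b in combinations(sorted(set(rec["authors"])), 2):
            edges[(a, b)] += 1
    return edges

def author_keyword_edges(records):
    """Count author-to-keyword links: a two-mode (heterogeneous) network."""
    edges = Counter()
    for rec in records:
        for author in set(rec["authors"]):
            for keyword in set(rec["keywords"]):
                edges[(author, keyword)] += 1
    return edges

coauthors = coauthorship_edges(records)
hetero = author_keyword_edges(records)

for (a, b), weight in sorted(coauthors.items()):
    print(a, "--", b, weight)
```

The weighted edge lists can then be passed to any mapping or layout tool. Sorting each author pair keeps the links undirected, so (A, B) and (B, A) are counted as the same edge.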
Scientometrix/ Data & Web Mining / Practice with ReseauLu Software
Relational Data Analysis
Textual Analysis
Scientometrix/ Data & Web Mining / Practice
PubMed Mapping I
PubMed Data II
Scientometrix Online Demo: Aguidel Web Toolbox
Web Cartography I: WebMap, a visual directory that maps more than 2 million web sites
Web Cartography II: Conversation Map (Warren Sack), analysis of virtual conversations
Web Cartography III: IssueCrawler Project, GovCom.Org, Amsterdam (R. Rogers et al.)
Web Cartography Online Demo: Aguidel Web Toolbox
Network in Time
News
The North Korean English News Space, Sept. 15 – Nov. 15
Findings:
- Whitehouse.gov (press release) couches North Korea in terms of regime change and human rights. The only other outlet that does so is Frontpagemag.org, a site which, at the time of this map, exhorted surfers to sign an e-petition and help “Stop the Left’s Anti-American Agenda… Help expose terrorists in our midst”.
- Fox News, Newsweek, and Asia Times Online connect regime change to war. These outlets thus frame regime change in terms of military conflict.
- Regime change and reunification are essentially disconnected. There is little talk of achieving regime change along the lines of the German model.
- The Financial Times subscription service, a strong example of the corporate angle on the issue, presented North Korea only in terms of regime change, notably isolating the issue from conflict, reunification, famine, and other issues.
- Only CNN connects famine and reunification, one of the more practical and meaningful associations between the issues that matter on the peninsula. This finding defies the conventional wisdom, which would have CNN less informed by the stance of regionally located English-language media outlets.
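Findings of this kind rest on an outlet-by-issue table: which outlets mention which issues, and which pairs of issues a single outlet connects. The sketch below shows one simple way to compute such links. The outlet labels and coverage sets are placeholders, not the data behind the map, and the function names are illustrative.

```python
from collections import defaultdict
from itertools import combinations

# Placeholder coverage table: outlet -> issues it associates with the topic.
coverage = {
    "outlet_1": {"regime change", "human rights"},
    "outlet_2": {"regime change", "war"},
    "outlet_3": {"famine", "reunification"},
    "outlet_4": {"regime change"},
}

def issue_links(coverage):
    """Map each pair of issues to the set of outlets that connect them."""
    links = defaultdict(set)
    for outlet, issues in coverage.items():
        for pair in combinations(sorted(issues), 2):
            links[pair].add(outlet)
    return dict(links)

def isolated_framings(coverage):
    """Outlets that present the topic through a single issue only."""
    return sorted(o for o, issues in coverage.items() if len(issues) == 1)

links = issue_links(coverage)
for pair, outlets in sorted(links.items()):
    print(pair, sorted(outlets))
print("single-issue outlets:", isolated_framings(coverage))
```

An issue pair that is missing from `links` is a disconnection of the kind reported above for regime change and reunification. A pair carried by a single outlet is a connection unique to that outlet.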
News /offline demo (RéseauLu)
Text Mining & Web Mining Tools
Sources: textual data, web-based data, bibliographical databases, abstracts, articles, titles
- Data collection tools
- Lexical tables
- Visualization of heterogeneous networks: actor/lexical/semantic networks
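A lexical table in its simplest form is a term-by-document count matrix. The following minimal sketch builds one with the Python standard library. The tokenizer, the small stopword list, and the two sample texts are assumptions for illustration and are not taken from the toolbox.

```python
import re
from collections import Counter

# Deliberately small, illustrative stopword list.
STOPWORDS = {"the", "of", "and", "a", "in", "to", "for"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def lexical_table(docs):
    """Return (document ids, {term: [count per document]})."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    vocab = sorted(set().union(*counts.values()))
    table = {term: [counts[d][term] for d in docs] for term in vocab}
    return list(docs), table

# Hypothetical documents, e.g. titles or abstracts.
docs = {
    "doc1": "Mapping of collaborative networks",
    "doc2": "Networks of authors and keywords",
}

doc_ids, table = lexical_table(docs)
print("term".ljust(15), *doc_ids)
for term, row in table.items():
    print(term.ljust(15), *row)
```

Each row of the table can also be read as the links from one term to the documents (or actors) that use it, which is the starting point for a lexical network.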