Published by Jacob Mosley. Modified over 9 years ago.
A CyberGIS Approach to Digital Humanities and Social Sciences: The World of Textual Geography and a Case Study of Wikipedia’s History of the World
Kalev Leetaru, Eric Shook, and Shaowen Wang
CyberInfrastructure and Geospatial Information Laboratory (CIGI)
Department of Geography and Geographic Information Science
School of Earth, Society, and Environment
National Center for Supercomputing Applications (NCSA)
University of Illinois at Urbana-Champaign
CyberGIS ’12, Urbana IL, August 8, 2012
http://www.sgi.com/go/wikipedia
Workflow CyberGIS Sentiment Mining Fulltext Geocoding
Inside the CyberGIS “black box” Security Domain Decomposition XSEDE GISolve Middleware CI Data & Viz Resource Selection Task Scheduling Clouds Workflow Management Services Open Service API OSG Emotional Heatmap
Data Input for a Topic
  A set of locations with 3 attributes
  Latitude, longitude point location
    1. Number of articles mentioning this location
    2. Number of articles mentioning both this location and topic
    3. Average tone of articles mentioning both this location and topic
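One way to picture this input is as a small record type, one instance per location. The sketch below is illustrative only: the class name, field names, and sample values are assumptions made for the example, not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationRecord:
    """One input point for a topic (all names here are hypothetical)."""
    lat: float              # latitude of the point location
    lon: float              # longitude of the point location
    n_articles: int         # 1. articles mentioning this location
    n_topic_articles: int   # 2. articles mentioning this location and the topic
    avg_tone: float         # 3. average tone of the articles counted in (2)

    def __post_init__(self):
        # Articles mentioning location AND topic are a subset of those
        # mentioning the location, so the second count cannot exceed the first.
        if not 0 <= self.n_topic_articles <= self.n_articles:
            raise ValueError("n_topic_articles must be between 0 and n_articles")


# Made-up sample values, for illustration only.
records = [
    LocationRecord(lat=40.11, lon=-88.21, n_articles=120, n_topic_articles=30, avg_tone=-2.5),
    LocationRecord(lat=33.31, lon=44.36, n_articles=400, n_topic_articles=310, avg_tone=-6.0),
]
```

The subset check in `__post_init__` follows directly from the attribute definitions on the slide; everything else is a convenience for the example.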
Spatializing Emotion
  3 important elements
    1. Importance of location
    2. Prevalence of topic
    3. Emotion toward topic
  Goal: Capture 3 elements on a single map
1) Importance of Location
  Every mention of a location increases its importance
  Generate a density map of the number of times a location is mentioned in text using Kernel Density Estimation (KDE) based on k nearest neighbor search
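The slide does not say how the k nearest neighbor search enters the KDE. A minimal sketch of one plausible reading follows: at each query point the bandwidth is the distance to the k-th nearest location, and only those k locations contribute, each weighted by its mention count. The function name `knn_kde`, the quartic kernel, the default `k`, and the use of planar distances on the coordinates are all assumptions made for the example.

```python
import math


def knn_kde(query, points, weights, k=3):
    """Weighted density at `query` from its k nearest points.

    Assumed reading of "KDE based on k nearest neighbor search": the
    bandwidth h adapts to the distance of the k-th nearest point.
    """
    if not points or len(points) != len(weights):
        raise ValueError("points and weights must be non-empty and equal in length")
    # Sort (distance, weight) pairs by distance and keep the k closest.
    nearest = sorted(
        (math.hypot(query[0] - x, query[1] - y), w)
        for (x, y), w in zip(points, weights)
    )[:k]
    h = max(nearest[-1][0], 1e-9)    # guard against a zero bandwidth
    norm = 3.0 / (math.pi * h * h)   # normalisation of the 2-D quartic kernel
    total = 0.0
    for d, w in nearest:
        u = d / h
        if u < 1.0:                  # the kernel is zero at and beyond h
            total += w * norm * (1.0 - u * u) ** 2
    return total


# Made-up points: a tight cluster near the origin and one isolated location.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
mentions = [1, 1, 1, 1]
dense = knn_kde((0.0, 0.0), pts, mentions)
sparse = knn_kde((5.0, 5.0), pts, mentions)
```

With this adaptive bandwidth, `dense` comes out larger than `sparse`, because the isolated location has to reach much farther to find its k neighbors. Evaluating `knn_kde` over a regular grid of query points would give the density map.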
2) Prevalence of Topic
  We use the term "topic intensity" for the prevalence of a topic relative to other topics, and estimate it with a method commonly used in epidemiological studies
  Relative risk is the ratio of the KDE of disease infection locations to the KDE of case control locations
Topic Intensity
$$\text{Topic Intensity} = \frac{\mathrm{KDE}(\text{articles that mention a topic})}{\mathrm{KDE}(\text{articles that do not mention the topic})}$$
Relative Risk
$$\text{Relative Risk} = \frac{\mathrm{KDE}(\text{points with disease})}{\mathrm{KDE}(\text{points without disease})}$$
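The ratio can be computed cell by cell once both density surfaces exist on the same grid. The sketch below assumes each surface is a list of rows of floats; the function name `topic_intensity` and the choice to return 0.0 where the denominator is zero are assumptions, since the slides do not say how empty cells are handled.

```python
def topic_intensity(kde_topic, kde_other, eps=1e-12):
    """Cell-wise ratio KDE(topic articles) / KDE(non-topic articles).

    Both arguments are same-shaped grids (lists of rows). Cells whose
    denominator is at most `eps` are set to 0.0, which is an assumption.
    """
    if len(kde_topic) != len(kde_other):
        raise ValueError("grids must have the same number of rows")
    result = []
    for row_t, row_o in zip(kde_topic, kde_other):
        if len(row_t) != len(row_o):
            raise ValueError("grid rows must have the same length")
        result.append([t / o if o > eps else 0.0 for t, o in zip(row_t, row_o)])
    return result


# Made-up 2x2 density grids, for illustration only.
ratio = topic_intensity([[2.0, 1.0], [0.0, 3.0]],
                        [[4.0, 1.0], [2.0, 0.0]])
```

A value above 1 marks a cell where the topic is denser than everything else, which is the same reading as relative risk in the epidemiological analogy.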
3) Emotion Toward a Topic
  Challenging question: Is the emotional measure, tone, discrete or continuous?
    – Is tone "countable" like trees, or does it exist as a continuum like air temperature?
  Tone is a continuum:
    – Cannot have "number of tones"
3) Emotion Toward a Topic
  A different method is used, because tone is continuous and not discrete
  Inverse distance weighted (IDW) interpolation is used to estimate tone across space, creating a tone map
  Tone map captures positive and negative tone toward a particular topic across space
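As a rough illustration of IDW, the sketch below estimates tone at a query point as a weighted average of the known tones, with weights that fall off as distance raised to a power. The function name `idw`, the default `power=2.0`, and the planar distance are assumptions; the slides do not state which exponent or distance measure was used.

```python
import math


def idw(query, points, values, power=2.0):
    """Inverse distance weighted estimate of a value at `query`."""
    if not points or len(points) != len(values):
        raise ValueError("points and values must be non-empty and equal in length")
    num = 0.0
    den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(query[0] - x, query[1] - y)
        if d == 0.0:
            return float(v)        # query coincides with a sample point
        w = 1.0 / d ** power       # closer samples get larger weights
        num += w * v
        den += w
    return num / den


# Made-up tones at two locations, for illustration only.
sites = [(0.0, 0.0), (1.0, 0.0)]
tones = [-4.0, 4.0]
at_site = idw((0.0, 0.0), sites, tones)
midway = idw((0.5, 0.0), sites, tones)
```

Because the result is a weighted average, it always stays between the smallest and largest sampled tone, which suits a continuous quantity that has no meaningful count.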
Overview – 3 layers
  1) Article density - Proxy: Importance of location
  2) Topic intensity - Proxy: Prevalence of topic relative to other topics
  3) Tone - Proxy: Emotion toward a topic
  First two layers represent scaling factors for tone
  Value ranges, in the order listed on the slide: 0 to 1; 0 to 100; -100 to 100
Emotional Heatmap
  Article Density * Topic Intensity * Tone = Emotional Heatmap
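The product can be sketched as a cell-by-cell multiplication of three same-shaped grids. The function name `emotional_heatmap` and the list-of-rows grid representation are assumptions for the example; any rescaling of the layers beforehand is left out.

```python
def emotional_heatmap(density, intensity, tone):
    """Cell-wise product: article density * topic intensity * tone."""
    if not (len(density) == len(intensity) == len(tone)):
        raise ValueError("all three grids must have the same number of rows")
    heatmap = []
    for row_d, row_i, row_t in zip(density, intensity, tone):
        if not (len(row_d) == len(row_i) == len(row_t)):
            raise ValueError("grid rows must have the same length")
        heatmap.append([d * i * t for d, i, t in zip(row_d, row_i, row_t)])
    return heatmap


# Made-up one-row grids, for illustration only.
heat = emotional_heatmap([[1.0, 0.5]], [[2.0, 4.0]], [[-10.0, 5.0]])
```

Since the first two layers act only as scaling factors, the sign of each cell comes entirely from tone: negative cells mark negative emotion toward the topic, positive cells the opposite.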
Emotional Heatmap of Armed Conflict in 2003 (Wikipedia)
Summary
  First steps, but started the dialogue
  Balance
    – Managing the complexity of cyberinfrastructure access
    – Simplifying the workflow of chaining spatial analytics
    – Making sense of what’s involved
  Scientific rigor
Ongoing Work
  Translate spatial knowledge to domain knowledge by answering a basic question: why is this here and not there?
  Tackle spatial aggregation issues
    – Represent locations as areas, not points
    – Areal interpolation
Acknowledgments
  Guofeng Cao, Anand Padmanabhan
  National Science Foundation
    – BCS-0846655
    – OCI-
    – OCI-1047916
    – Open Science Grid
    – XSEDE SES070004N
Thanks!