Published byΠελαγία Παπαϊωάννου Modified over 6 years ago
1
BLOGVIS
IS 247: INFOVIS
DEBBIE CHENG * LISA HANKIN * JOHN MARK JOSLING
2
PRESENTATION OUTLINE
STATE OF BLOGOSPHERE
OBJECTIVE
DATA
TOOLS
CHALLENGES
DESIGN
DISCUSSION/NEXT STEPS
COMMENTS & SUGGESTIONS
3
STATE OF THE BLOGOSPHERE
According to Technorati as of Oct 2005:
The total number of weblogs tracked (18.9M) continues to double about every 5 months.
The blogosphere is now over 30 times as big as it was 3 years ago, with no signs of slowdown.
4
STATE OF THE BLOGOSPHERE
About 70,000 new weblogs are created every day
About one new weblog is created each second
Between 700,000 and 1.3 million posts are made each day
About 33,000 posts are created per hour, or 9.2 posts per second
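The per-second figures follow from the daily and hourly counts by simple division. A minimal sketch of that arithmetic, using only the numbers quoted on this slide (the variable and function names are illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_HOUR = 60 * 60


def per_second(count, seconds):
    """Convert a count observed over `seconds` into a per-second rate."""
    return count / seconds


# Figures quoted on the slide (Technorati, Oct 2005).
new_blogs_per_second = per_second(70_000, SECONDS_PER_DAY)
posts_per_second = per_second(33_000, SECONDS_PER_HOUR)
posts_per_day = 33_000 * 24  # hourly figure scaled up to one day

print(round(new_blogs_per_second, 2))  # 0.81, i.e. roughly one per second
print(round(posts_per_second, 1))      # 9.2
print(posts_per_day)                   # 792000, inside the 700,000 to 1.3 million range
```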
5
OBJECTIVE
Purpose of BLOGVIS is to…
Explore how information spreads through the blogosphere over time
Identify how ideas grow from isolated topics into full-blown epidemics that “infect” large populations
See how topic ideas (memes) are propagated in real time
Track major influencers in blogspace
  Blogs that originate ideas are not the most popular blogs
Follow the emergence of ideas and the speed at which ideas travel

The goal of our project is to create a time-based visualization that maps the spread of ‘news’ or concepts through this ever-expanding blogosphere. Starting with a topic idea, we will track how the seed sprouts other links to create a conversation network, and represent the relationship between the seed and other blogs that post on the same topic. Animation will be used to show how new nodes, created over time, form topic neighborhoods in the blog universe.

It is difficult to conceptualize how expansive the blogspace has become and how quickly topics spread over time. We hope that our visualization gives users a way to quickly absorb this information and evaluate how different blogs connect to each other based on common memes.
6
DATA
Blogpulse Conversation Tracker
Conversation Tracker assembles snapshots of weblog “conversations”
Builds a threaded view of the conversation by performing a depth-first traversal of the conversation graph, starting from the seed post and visiting each node once
Full content of weblog posts is indexed
Provides a list of the postings, with the URL and date of each posting
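The depth-first traversal described above can be sketched as follows. This is an illustration of the general technique, not Blogpulse's actual code; the graph format (a dict mapping each post to the posts that link to it) and all names are assumptions.

```python
def threaded_view(graph, seed):
    """Return (depth, post) pairs in depth-first order, visiting each post once.

    `graph` maps a post to the list of posts that reference it
    (a hypothetical format chosen for this sketch).
    """
    visited = set()
    thread = []

    def visit(post, depth):
        if post in visited:
            return  # a post reachable by two paths is listed only once
        visited.add(post)
        thread.append((depth, post))
        for reply in graph.get(post, []):
            visit(reply, depth + 1)

    visit(seed, 0)
    return thread


# Hypothetical conversation: blog_c links to both blog_a and blog_b.
conversation = {
    "seed": ["blog_a", "blog_b"],
    "blog_a": ["blog_c"],
    "blog_b": ["blog_c"],
    "blog_c": [],
}

for depth, post in threaded_view(conversation, "seed"):
    print("  " * depth + post)
```

The indentation of each printed line mirrors the depth at which the post was first reached, which gives the threaded look.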
7
DATA
8
DATA
Longest String of Token Matches Between Blog Postings
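One way to compute the longest string of token matches between two postings is a longest-common-substring search over tokens. The sketch below assumes lowercase whitespace tokenization, which the slide does not specify; the function name and sample postings are hypothetical.

```python
def longest_token_match(text_a, text_b):
    """Return the longest run of consecutive tokens shared by both texts."""
    a = text_a.lower().split()  # assumption: tokens are whitespace-separated words
    b = text_b.lower().split()
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous token pair.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


post_a = "Technorati says the blogosphere doubles every five months"
post_b = "I read that the blogosphere doubles every year now"
print(longest_token_match(post_a, post_b))
# ['the', 'blogosphere', 'doubles', 'every']
```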
9
TOOLS
Prefuse Toolkit
Modify the Prefuse data storage structure to include time-based attributes for each edge in the tree
Custom edge and node renderers to provide more control over the ‘network growth’ animation
Swing library to augment the visualization with embedded focus + context information
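The idea of time-based edge attributes can be illustrated independently of Prefuse. The sketch below is plain Python, not the Prefuse API; `TimedEdge` and `edges_visible_at` are hypothetical names. Each edge carries the date it appeared, so an animation frame can draw only the edges that exist at that point in time.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TimedEdge:
    """A link between two posts, stamped with the date the linking post appeared."""
    source: str
    target: str
    posted_on: date


def edges_visible_at(edges, when):
    """Return the edges that already exist on the date `when`."""
    return [edge for edge in edges if edge.posted_on <= when]


# Hypothetical data for illustration.
edges = [
    TimedEdge("seed", "blog_a", date(2005, 10, 1)),
    TimedEdge("seed", "blog_b", date(2005, 10, 2)),
    TimedEdge("blog_a", "blog_c", date(2005, 10, 4)),
]

for frame_day in (date(2005, 10, 1), date(2005, 10, 3), date(2005, 10, 4)):
    print(frame_day, [edge.target for edge in edges_visible_at(edges, frame_day)])
```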
10
CHALLENGES
The originator of an idea is hard to find
  70% of blogs do not provide links back to another blog that had mentioned the idea previously
  Infer where information comes from, based on text similarity
  Difficulty in figuring out how to compute similarities
  Problem of scraping out just the posting info and not the noise that surrounds it on the blog site
Link reference tree for a given topic is shallow
  There are rarely more than 3 levels of depth in the referencing tree
“bloggers’ plagiarism scientifically proven”
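One possible way to infer where information comes from when no link exists is to pair each post with the most similar earlier post. The sketch below uses Jaccard similarity over word sets, which is a stand-in chosen for illustration and not necessarily the measure the project used; the threshold, function names, and sample posts are all assumptions.

```python
def jaccard(tokens_a, tokens_b):
    """Share of distinct tokens the two sets have in common (0.0 to 1.0)."""
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def infer_sources(posts, threshold=0.3):
    """Map each blog to the earlier blog whose post is most similar, or None.

    `posts` is a list of (blog, day, text) tuples with unique blog names.
    The threshold is an arbitrary illustrative value.
    """
    ordered = sorted(posts, key=lambda post: post[1])
    token_sets = [set(text.lower().split()) for _, _, text in ordered]
    inferred = {}
    for i, (blog, day, _) in enumerate(ordered):
        best_blog, best_score = None, threshold
        for j in range(i):
            if ordered[j][1] >= day:
                continue  # only strictly earlier posts can be a source
            score = jaccard(token_sets[i], token_sets[j])
            if score > best_score:
                best_blog, best_score = ordered[j][0], score
        inferred[blog] = best_blog
    return inferred


# Hypothetical posts for illustration.
posts = [
    ("blog_a", 0, "new gadget leaks ahead of launch"),
    ("blog_b", 1, "new gadget leaks ahead of launch says rumor"),
    ("blog_c", 2, "my cat slept all day"),
]
print(infer_sources(posts))
```

A word-set measure ignores word order, so in practice it would likely be combined with something order-sensitive, and the posting text would first need to be separated from the surrounding page noise.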
11
DESIGN
Flash Animation
Animation over time shows the topic/meme “bloom”.
Blog postings appear in concentric circles that expand outwards over time.
The larger the node, the more popular the blog.
The center topic node is not a circle.
As time increases, the hue of the nodes gets lighter and the brightness gets dimmer.
The center point is the topic; the next ring out holds blogs that post on the topic the next day, the ring after that holds blogs that post 2 days later, the third ring 3 days later, etc.
Most of the blogs/nodes would not be connected, but some would build a tree showing linking patterns.
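The ring placement described above can be sketched as a polar layout: the ring index is the number of days after the seed topic, and nodes are spread evenly around each ring. The ring spacing, the even angular spread, and the function name are assumptions made for this sketch.

```python
import math


def ring_position(days_after_seed, index_in_ring, ring_size, ring_spacing=50.0):
    """Return (x, y) for a node on the ring matching its posting day.

    Day 0 is the topic at the center; each later day sits one ring further out.
    `ring_spacing` (distance between rings) is an arbitrary illustrative value.
    """
    radius = days_after_seed * ring_spacing
    angle = 2 * math.pi * index_in_ring / ring_size
    return radius * math.cos(angle), radius * math.sin(angle)


# The topic itself, then four blogs that posted one day later.
print(ring_position(0, 0, 1))
for i in range(4):
    x, y = ring_position(1, i, 4)
    print(round(x, 1), round(y, 1))
```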
12
DISCUSSION/NEXT STEPS
Comment volume on postings
Geography: where posts are made
How to determine blog popularity?
Topic regions: group blogs around the circle by "type" (politics, personal, etc.)
How can we determine blog type?
13
COMMENTS/SUGGESTIONS
Please