Data Exploration of Wikipedia, by Tyrone McElrath and Andrew Sutton
Our Objective: We wanted to make a visual representation of the web structure of a portion of Wikipedia's site. Check it out!
Visualization Makes Data Better: Data is boring; visualization makes it interactive. Data is not consumer friendly; visualization allows for greater ease of use. Data is sometimes hard to correlate; visualization can help show relations.
Tools: D3.Charts, an all-in-one, powerful graph builder for visualization. Pyri.Wikipedia, a Python library used by the script. Python, for the web crawling script that builds the dataset. GitHub, for the demo repository.
How We Got Our Data: We used a Python-based API to collect and use the information on various Wikipedia pages. We set up keywords to start our web crawling script, and we exported the collected information to JSON. The algorithm we used: (1) pull the web page information; (2) search for the links that correspond best to the group of keywords; (3) select a link and continue forward from it.
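The three steps above can be sketched roughly as follows. This is a minimal illustration, not the script used for the demo: the `TOY_PAGES` data, the `score` rule (counting keyword matches in the linked page's text), and the `crawl` function are all assumptions, and an in-memory dictionary stands in for real Wikipedia requests.

```python
import json

# Hypothetical stand-in for fetched pages: title -> (page text, outgoing link titles).
# A real crawl would fetch these from Wikipedia; this data is made up for illustration.
TOY_PAGES = {
    "Telephone": ("device that transmits voice over distance", ["Telegraph", "Banana", "Radio"]),
    "Telegraph": ("long distance transmission of text messages by signal", ["Morse code", "Banana"]),
    "Radio": ("signal transmission by radio waves", ["Telegraph"]),
    "Morse code": ("signal code of dots and dashes", []),
    "Banana": ("an edible fruit", []),
}

def score(title, keywords):
    """Count how many keywords appear in the linked page's text (assumed scoring rule)."""
    text, _ = TOY_PAGES.get(title, ("", []))
    words = set(text.lower().split())
    return sum(1 for k in keywords if k.lower() in words)

def crawl(start, keywords, max_steps=5):
    """Follow the best-scoring unvisited link from each page, recording the path taken."""
    path, edges, visited = [start], [], {start}
    current = start
    for _ in range(max_steps):
        # Step 1: pull the page information (here, a dictionary lookup).
        _, links = TOY_PAGES.get(current, ("", []))
        candidates = [link for link in links if link not in visited]
        if not candidates:
            break
        # Step 2: find the link that corresponds best to the keywords.
        best = max(candidates, key=lambda link: score(link, keywords))
        # Step 3: select that link and continue from it.
        edges.append((current, best))
        visited.add(best)
        path.append(best)
        current = best
    return path, edges

path, edges = crawl("Telephone", ["signal", "transmission", "distance"])
print(json.dumps({"path": path, "edges": edges}, indent=2))
```

Following only the single best link keeps the crawl small; a wider search would keep several candidates per page instead of one.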
What Did We Look For? Clusters: nodes tightly linked together. Paths: the links chosen and ventured through. Variance: how much does the first node vary from the other nodes? Example: telephone.
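One possible way to put numbers on these ideas is sketched below. It rests on assumptions the slides do not state: clusters are approximated as connected components, and the "variance" from the first node is approximated as the number of links separating it from each other node. The `EDGES` data and all function names are made up for illustration.

```python
from collections import deque

# Made-up link data for illustration only; links are treated as undirected.
EDGES = [
    ("Telephone", "Telegraph"),
    ("Telegraph", "Morse code"),
    ("Telephone", "Radio"),
    ("Radio", "Telegraph"),
    ("Banana", "Fruit"),
]

def build_adjacency(edges):
    """Map each node to the set of nodes it is linked with."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def hop_distances(adj, start):
    """Breadth-first search: number of links between start and each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def clusters(adj):
    """Group nodes into connected components, a coarse notion of 'cluster'."""
    seen, groups = set(), []
    for node in adj:
        if node not in seen:
            group = set(hop_distances(adj, node))
            seen |= group
            groups.append(group)
    return groups

adj = build_adjacency(EDGES)
dist = hop_distances(adj, "Telephone")  # how far each node is from the first node
groups = clusters(adj)
print(dist)
print(groups)
```

Connected components are a coarse stand-in; a tighter definition of "tightly linked" would need a density-based measure.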
We Built a Graph: It pulls the JSON from the Python script. It's pretty and aesthetically pleasing. We have nodes, links, and clusters. We wanted to show how one node is connected to another. Display site.
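As a rough sketch of what the JSON handed to the graph might look like: the schema actually used is not stated, so the `nodes`/`links` layout with `source` and `target` fields below is an assumption, chosen because it is a shape commonly seen in D3 force-directed examples. The sample `edges` and the `to_node_link` helper are hypothetical.

```python
import json

# Hypothetical crawl output: (from_page, to_page) pairs.
edges = [("Telephone", "Telegraph"), ("Telegraph", "Morse code")]

def to_node_link(edges):
    """Convert edge pairs into a nodes/links dictionary (assumed schema)."""
    names = []
    for a, b in edges:
        for name in (a, b):
            if name not in names:  # keep first-seen order, no duplicates
                names.append(name)
    return {
        "nodes": [{"id": name} for name in names],
        "links": [{"source": a, "target": b} for a, b in edges],
    }

graph = to_node_link(edges)
text = json.dumps(graph, indent=2)
print(text)
```

The resulting text could be written to a file for the visualization page to load.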
Why? We wanted to explore new technologies in visualization, see whether things that are not normally associated with each other have a correlation, and learn. We were kind of confused about what we were going to do. Conclusion: we didn't have a better idea and we were running out of time.
Improvements: More time learning the visualization library. A bigger search span. More research.
Thank you! Questions?