Data Exploration Of Wikipedia

Data Exploration Of Wikipedia By: Tyrone McElrath, Andrew Sutton

Our Objective
We wanted to make a visual representation of the web structure of a portion of Wikipedia's site.
Check it out! https://wikinavigation.github.io

Visualization Makes Data Better
Data is boring; visualization makes it interactive.
Data is not consumer friendly; visualization allows for greater ease of use.
Data is sometimes hard to correlate; visualization can help show relations.

Tools?
D3.Charts: all-in-one power graph builder for visualization
Pyri.Wikipedia: Python library for the script
Python: web crawling script for the data set
GitHub: repository for the demo

How we got our data
We used a Python-based API to collect and use the information on various Wikipedia pages.
We set up keywords to start our web crawling script.
We exported this information to JSON.
We used an algorithm:
1. Pull the web page information.
2. Search for the links that correspond best to the group of keywords.
3. Select a link and span forward.
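The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script used for the project: the scoring rule (counting keyword matches in a link title), the `get_links` callable, and the `FAKE_LINKS` table are all hypothetical stand-ins, so the example runs without touching Wikipedia.

```python
import json
from collections.abc import Callable


def keyword_score(title: str, keywords: list[str]) -> int:
    """Count how many keywords appear in a link title (case-insensitive)."""
    lowered = title.lower()
    return sum(1 for kw in keywords if kw.lower() in lowered)


def crawl(start: str, keywords: list[str],
          get_links: Callable[[str], list[str]], max_steps: int = 5) -> dict:
    """Greedy crawl: at each step follow the best-scoring unvisited link."""
    visited = [start]
    edges = []
    current = start
    for _ in range(max_steps):
        # Step 1: pull the page information (here, just its outgoing links).
        candidates = [t for t in get_links(current) if t not in visited]
        if not candidates:
            break
        # Step 2: find the link that best matches the keywords.
        # max() keeps the first candidate when scores tie.
        best = max(candidates, key=lambda t: keyword_score(t, keywords))
        # Step 3: select that link and span forward.
        edges.append({"source": current, "target": best})
        visited.append(best)
        current = best
    return {"path": visited, "edges": edges}


# Hypothetical stand-in for a real Wikipedia lookup.
FAKE_LINKS = {
    "Telephone": ["Alexander Graham Bell", "Telephone exchange", "Copper"],
    "Telephone exchange": ["Telephone switchboard", "Electricity"],
    "Telephone switchboard": [],
}

result = crawl("Telephone", ["telephone", "switch"],
               lambda page: FAKE_LINKS.get(page, []))
print(json.dumps(result, indent=2))
```

In a real run, `get_links` would be replaced by a call into whichever Wikipedia library the crawler uses, and the JSON output would be written to a file for the visualization to load.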

What did we look for?
Clusters: nodes tightly linked together
Paths: the links chosen and ventured through
Variance: how much does the first node vary from other nodes? Ex.: telephone
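The slides do not say how clusters or variance were measured, so the sketch below is just one possible reading. It assumes a local clustering coefficient as the measure of "tightly linked" nodes and uses hop distance from the first node as a crude proxy for how far a page has drifted from the starting topic. The edge list is made up for illustration.

```python
from collections import deque


def build_adjacency(edges):
    """Build undirected adjacency sets from (source, target) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nbrs = sorted(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    linked = sum(1 for i in range(k) for j in range(i + 1, k)
                 if nbrs[j] in adj[nbrs[i]])
    return linked / (k * (k - 1) / 2)


def hops_from(adj, start):
    """Breadth-first hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist


# Made-up edges around the "telephone" example.
EDGES = [
    ("Telephone", "Telephone exchange"),
    ("Telephone", "Alexander Graham Bell"),
    ("Telephone exchange", "Alexander Graham Bell"),
    ("Telephone exchange", "Telephone switchboard"),
]

adj = build_adjacency(EDGES)
clustering = local_clustering(adj, "Telephone")
distances = hops_from(adj, "Telephone")
print(clustering, distances["Telephone switchboard"])
```

A coefficient near 1 means a node's neighbors mostly link to each other, which is what a cluster looks like in the drawn graph; larger hop counts mark pages further along a path from the first node.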

We built a graph
It pulls the JSON from the Python script.
It's pretty and aesthetically pleasing.
We have nodes, links, and clusters.
We wanted to show how one node is connected to another.
Display site: https://wikinavigation.github.io
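To make the hand-off from the Python script to the graph concrete, here is a minimal sketch of what the exported JSON might look like. The `nodes`/`links` layout with `id`, `source`, and `target` keys is a common convention for D3 force-directed graphs; the slides do not state the actual schema, so treat the field names and the helper `to_graph_json` as assumptions.

```python
import json


def to_graph_json(edges):
    """Turn (source, target) pairs into a nodes/links dictionary."""
    names = []
    for source, target in edges:
        for name in (source, target):
            if name not in names:  # keep first-seen order, no duplicates
                names.append(name)
    return {
        "nodes": [{"id": name} for name in names],
        "links": [{"source": s, "target": t} for s, t in edges],
    }


graph = to_graph_json([
    ("Telephone", "Telephone exchange"),
    ("Telephone exchange", "Telephone switchboard"),
])
graph_text = json.dumps(graph, indent=2)
print(graph_text)
```

Writing `graph_text` to a file next to the page would let the visualization fetch it and draw one circle per node and one line per link.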

Why?
We wanted to explore new technologies in visualization.
We wanted to see whether things that are not normally associated with each other have a correlation.
We wanted to learn.
We were kind of confused about what we were going to do.
Conclusion: because we didn't have a better idea and we were running out of time.

Improvements
More time learning the visualization library
Bigger search span
More research

Thank you! Questions?