Categorical Relationships in History of InfoVis Publications
CS533C Project Update by Alex Gukov
Goals
- Provide a visual overview of InfoVis publication history
  - Author collaboration network
  - Paper co-citation network
- Identify key influences
  - Major research categories
  - Influential authors and papers within a category
  - Related categories
Dataset
- Filtered 2004 InfoVis Contest data
  - InfoVis publication history from 1995 to 2002
  - Original data cleaned up by Indiana University contestants
- Medium size network
  - 614 InfoVis articles with detailed metadata
  - 8502 references with limited metadata
  - 1036 authors
- Paper metadata
  - Title, year, abstract, keywords,
Previous Work
- Indiana University contest entry
  - Node-link diagram of highly cited papers, published authors
  - Clearly identifies important papers, authors through node size
  - Does not relate authors, papers to research categories
Previous Work
- IN-SPIRE by Pacific Northwest National Laboratory
  - Scatter plot of publications
  - Plot positioning based on themes extracted from metadata
  - Clearly identifies dominant themes
  - Does not make use of citation data
Criticism
- Want to relate publication network data with corresponding category information
Proposed Visualization
- Reduce the data set by using highly cited papers, published authors
- Visualize collaboration and co-citation networks with node-link graphs
- Augment the plots with category information using background color
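The data reduction and co-citation steps could be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the input shape (a mapping from each citing paper to the papers it references), the `min_citations` threshold, and the toy identifiers are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def citation_counts(references):
    """Count how many citing papers reference each cited paper."""
    counts = Counter()
    for cited in references.values():
        counts.update(set(cited))  # count each citing paper at most once
    return counts


def cocitation_edges(references, min_citations=2):
    """Return {(a, b): weight} for highly cited papers that are cited together.

    Two papers are co-cited when they appear in the same reference list;
    the weight is the number of reference lists containing both.
    """
    counts = citation_counts(references)
    keep = {paper for paper, n in counts.items() if n >= min_citations}
    edges = Counter()
    for cited in references.values():
        kept = sorted(set(cited) & keep)
        edges.update(combinations(kept, 2))
    return dict(edges)


# Hypothetical toy data: citing paper -> papers it references.
refs = {
    "p1": ["a", "b", "c"],
    "p2": ["a", "b"],
    "p3": ["b", "c", "d"],
}
edges = cocitation_edges(refs, min_citations=2)
```

Here paper "d" is cited only once, so it falls below the threshold and is dropped before the co-citation pairs are counted.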
Graphing publication networks
- Node-link diagrams with papers, authors as graph nodes
- Node size proportional to the number of received citations
- Node color
  - Paper publication date
  - Number of papers written by an author
- Force-directed layout using topology and category information as cues
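As a rough illustration of the size mapping and the layout idea, the sketch below uses a simple spring-style force-directed loop written from scratch. It is not the Prefuse layout engine. The radius range, the force constants, and in particular the choice to make same-category edges pull twice as hard are assumptions standing in for "category information as cues".

```python
import math
import random


def node_radius(citations, max_citations, r_min=3.0, r_max=20.0):
    """Map a citation count linearly onto a radius range (assumed values)."""
    if max_citations <= 0:
        return r_min
    return r_min + (r_max - r_min) * citations / max_citations


def force_layout(nodes, edges, category, iterations=300, seed=0):
    """Spring-style layout: all pairs repel, linked nodes attract."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    k = 1.0      # ideal edge length
    step = 0.1   # maximum move per iteration, shrinks over time
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[a][0] += f * dx / d
                disp[a][1] += f * dy / d
                disp[b][0] -= f * dx / d
                disp[b][1] -= f * dy / d
        # Attraction along edges; same-category edges pull harder (assumption).
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            w = 2.0 if category[a] == category[b] else 1.0
            f = w * d * d / k
            disp[a][0] -= f * dx / d
            disp[a][1] -= f * dy / d
            disp[b][0] += f * dx / d
            disp[b][1] += f * dy / d
        # Move each node along its net force, capped by the current step.
        for n in nodes:
            fx, fy = disp[n]
            mag = math.hypot(fx, fy) or 1e-9
            move = min(mag, step)
            pos[n][0] += fx / mag * move
            pos[n][1] += fy / mag * move
        step *= 0.99
    return {n: tuple(p) for n, p in pos.items()}


# Hypothetical toy graph: two linked pairs in two categories.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("c", "d")]
category = {"a": 0, "b": 0, "c": 1, "d": 1}
layout = force_layout(nodes, edges, category)
```

The all-pairs repulsion is quadratic in the number of nodes, which is one reason to reduce the data set to highly cited papers before layout.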
Visualizing categories
- Identify a small number of categories from paper metadata
  - Process titles, abstracts, keywords
  - Reduce dimensionality, cluster
- Partition space around the graph nodes after layout is complete
- Color the background of each node with corresponding category (use light colors)
  - Author is assigned the mean category of his/her publications
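One way to picture the background coloring is a nearest-node lookup: every background point takes the category of the closest graph node, which is the same partition a Voronoi diagram describes. The sketch below rasterizes that partition onto a grid rather than computing exact cell boundaries. The grid resolution, the function names, and the 0.8 lightening factor are assumptions for illustration.

```python
import math


def nearest_node(point, positions):
    """Return the node closest to point, i.e. the owner of its Voronoi cell."""
    return min(positions, key=lambda n: math.dist(point, positions[n]))


def background_grid(positions, category, width, height, cell=1.0):
    """Assign each grid cell the category of the node nearest its center."""
    grid = []
    for row in range(height):
        y = (row + 0.5) * cell
        grid.append([
            category[nearest_node(((col + 0.5) * cell, y), positions)]
            for col in range(width)
        ])
    return grid


def lighten(rgb, amount=0.8):
    """Blend a color toward white so the background stays pale."""
    return tuple(round(c + (255 - c) * amount) for c in rgb)


# Hypothetical layout output: two nodes in two categories.
positions = {"p": (1.0, 1.0), "q": (5.0, 1.0)}
category = {"p": 0, "q": 1}
grid = background_grid(positions, category, width=6, height=2)
pale_red = lighten((255, 0, 0))
```

The boundary between the two regions falls halfway between the nodes, at x = 3, so the left three columns take category 0 and the right three take category 1.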
Mockup
Implementation
- Category identification
  - Use PCA to reduce noise
  - Use k-means on the resulting data
- Graph visualization
  - Prefuse Visualization toolkit (Java)
  - Built-in force-directed layout engine
- Background space partitioning
  - Partition using a Voronoi diagram
  - Use CGAL geometry toolkit (C++)
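The clustering step might look something like the k-means sketch below. It assumes the paper vectors have already been reduced (for example by PCA, which is not shown here) and uses a farthest-first initialization chosen only to keep the example deterministic. The toy points are hypothetical.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain k-means on already-reduced vectors; returns (labels, centroids)."""
    # Farthest-first initialization (an assumption, chosen for determinism):
    # start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))

    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical reduced paper vectors forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
```

Each label is a category index that can then drive the background color of the corresponding paper node.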
Current Progress
- Data graphing
  - Set up Prefuse and experimented with a sample social network graphing application
- Category identification
  - Access database converted to XML
  - Preprocessing data for use in MATLAB