Download presentation
Presentation is loading. Please wait.
1
Academic Visualization an insight into academia
Hello everyone, I’m Changfeng Liu. It’s my honor to give you a presentation today. The topic is academic visualization – an insight into academia. Changfeng liu Acemap group May 31, 2016
2
outline Background Work Review SchoolDrag Future work Q & A
This is the outline of my presentation. May 31, 2016 Academic Visualization
3
outline Background Work Review SchoolDrag Future work Q & A
What is visualization? Why academic visualization? Work Review SchoolDrag Future work Q & A Firstly, I will introduce the background of my work. May 31, 2016 Academic Visualization
4
What is visualization Wikipedia:
Information visualization is the study of (interactive) visual representations of abstract data to reinforce human cognition. An easy way to understand the world! Visualization The very first question is what is visualization? According to Wikipedia, Information visualization is the study of interactive visual representations of abstract data to reinforce human cognition. In my opinion, visualization is an easy way to understand the world. May 31, 2016 Academic Visualization
5
Why academic visualization
Prof. Wang: We have got massive academic data, which is like raw food materials. What we should do is to cook delicious dishes! How to be a good chef? The following question is why we need academic visualization. Professor Wang said: “We have got massive academic data, which is like raw food materials. What we should do is to cook delicious dishes!” How to cook delicious food and be a good chef? Of course we won’t go to 新东方,AceMap is the right answer! May 31, 2016 Academic Visualization
6
Why academic visualization
Crawling, data cleaning Data warehousing, DBMS In AceMap, we have the chance to see how a delicious dish is made. Firstly, crawling and data cleaning are needed to obtain available data. Then database is used to store the data. After that, some data mining techniques are used to cook the data. Finally, information visualization will present a dish to the consumer. A picture is worth thousands of words, a good visualization will help the user to understand the complex structures between papers and reveal the hidden trends in academia. Information visualization! Data mining, knowledge discovery May 31, 2016 Academic Visualization
7
Why academic visualization
Why we need to build our own academic visualization? Because traditional academic search engines provide few visualizations… Google Scholar Microsoft academic dblp Since visualization is important, can we use visualization from other websites? The answer is frustrating: because traditional academic search engines provide few visualizations… As you can see, when I type in data mining, these three academic search engines can only return a list of papers without any visual information. This can be a disaster for this baby researcher. May 31, 2016 Academic Visualization
8
outline Background Work Review SchoolDrag Future work Q & A
Now, let’s have a quick review about our previous work. May 31, 2016 Academic Visualization
9
Work review One paper accepted by WWW BigScholar workshop
Two patents submitted On winter vacation, our paper was accepted by WWW BigScholar workshop and we submitted two patents related to academic visualization. May 31, 2016 Academic Visualization
10
Work review Local network analysis and hierarchical clustering
We mainly focus on local network analysis and hierarchical clustering among different papers in that work. This is some visualization pictures from the paper. By clicking the nodes, networks of the references are expanded and collapsed. Find multiple paths to the center paper and highlight them in the graph. The cluster graph shows fields in computer science. Zhaowei Tan, Changfeng Liu, et al., "AceMap: A Novel Approach towards Displaying Relationship among Academic Literatures", WWW BigScholar Workshop, 2016 May 31, 2016 Academic Visualization
11
Beyond Paper + paper We can see more!
AceMap abcd Paper + University? Paper + Timeline? Paper + Location? As I have mentioned, the previous visualization focused on demonstrating the pictures of papers. However, beyond paper plus paper, we can see more through visualization. How about paper plus university, paper plus timeline and paper plus location. The combination of different types of entities will make some really informative and interesting visualization. May 31, 2016 Academic Visualization
12
outline Background Work Review SchoolDrag Future work Q & A Motivation
Visualization tool selection Handle the data Real-time querying Future work Q & A Therefore, we want to conduct new visualization. And I will introduce SchoolDrag to you. May 31, 2016 Academic Visualization
13
schooldrag Our team implemented three visualizations this semester
Organization Geographical location Hierarchical structure I will introduce the one about organizations -- SchoolDrag This semester, our team implemented three visualizations. They are about the academic organizations, geographical locations and hierarchical structure. I will introduce the first one, SchoolDrag to you. May 31, 2016 Academic Visualization
14
motivation Select good graduate schools in a certain field
Find out the developing trends of different universities Which university is the groundbreaker? Which university is the follower? No academic search engines provide such a function. The motivation of SchoolDrag is following: How to select good graduate schools in a certain field as most of us will face this problem several months later; how to find out the developing trends of different universities; which university is the groundbreaker and which is the follower? Unfortunately, no academic search engines can answer these questions. May 31, 2016 Academic Visualization
15
Visualization tool selection
There are many tools available D3 and vega Tableau Gephi Eigenfactor How to choose? Open-source Easy to code Community support Web-based So we have to do this work by ourselves. The first thing we care about before conducting visualization is which tool to choose. There are many tools available as I list on this slide. Since we want to design new style and make it available on the Internet, we consider four aspects. Open-source, easy to code, community support and web-based. As a result, we choose vega and D3 as our tool. May 31, 2016 Academic Visualization
16
Handle the data -- goal To know the numbers of published papers of every organization in every year. To get the values of corresponding authors of every organization in every year. To demonstrate the developing trends and academic potential of different organizations in an interactive way. What we want to get from the database and show on the Internet are three things: 1. To know the numbers of published papers of every organization in every year. 2. To get the values of corresponding authors of every organization in every year. 3. To demonstrate the developing trends and academic potential of different organizations in an interactive way. May 31, 2016 Academic Visualization
17
Handle the data -- other tools
MySQL to manage and retrieve data C++ & Python to do the computation JSON to store the data Batch files to do the repeating tasks MySQL Batch files C++ Python Of course, vega and D3 are not enough for our visualization. We use MySQL to manage and retrieve data, C++ and Python to do the computation, JSON to store the data and batch files to do the repeating tasks. JSON May 31, 2016 Academic Visualization
18
Handle the data -- results
Two fields according to CCF recommendation list Network Artificial intelligence Ten L1 or L2 fields according to MAG datasets Computer security Computer vision Embedded system …… More fields are coming Here are the up-to-date results: we have got two fields information according to CCF recommendation list, namely Network and Artificial intelligence. Recently, we have calculated ten L1 or L2 fields according to MAG dataset, for example, computer security, computer vision, embedded system and so on. More fields are under computation and will come out soon. May 31, 2016 Academic Visualization
19
Demo time Due to the limited time, I only give a brief introduction of this demo. If you are interested in it, you can access it on AceMap. 1. Each circle stands for an academic organization. 2. The more published papers an organization has in the specified year, the bigger the circle is. 3. The curve shows the developing trends of the total number of published papers and authors of an organization. May 31, 2016 Academic Visualization
20
Offline Demo schoolDrag in AI field
Each circle stands for an academic organization. The more published papers an organization has in the specified year, the bigger the circle is. The curve shows the developing trends of the total number of published papers and authors of an organization. schoolDrag in AI field May 31, 2016 Academic Visualization
21
REAL-TIME QUERYING Real-time Querying V.S. Retrieving JSON files
More storage efficient – 50k fields in MAG dataset More flexible – what if a new field joins Less time efficient – minute-level waiting is unbearable How to deal with this trade-off? Rapid sampling[1] The next question is real-time querying. Why we need real-time querying? Since there are 50k fields in MAG dataset, real time querying can save much storage of the server. Moreover, real time querying is more flexible when facing a new field. However, because it has to computing for each query, the waiting time might be too long for the user. How to deal with this trade-off? The answer is Rapid sampling. [1] Kim, Albert, et al. "Rapid sampling for visualizations with ordering guarantees." Proceedings of the VLDB Endowment 8.5 (2015): May 31, 2016 Academic Visualization
22
REAL-TIME QUERYING -- rapid sampling
Main ideas Part stands for the whole population Only focusing on what we care about Confidence intervals count I will only give the main ideas of rapid sampling here. The first one is part stands for the whole population. That means we only sample part of the data to describe the whole dataset properly. What’s more, we should only focus on what we care about and ignore useless information. For example, if we want to rank universities, we don’t need to know the exact numbers of published papers. The ordering is the key information. Then, the ordering can be guaranteed by the non-overlapping confidence intervals. Kim, Albert, et al. "Rapid sampling for visualizations with ordering guarantees." Proceedings of the VLDB Endowment 8.5 (2015): May 31, 2016 Academic Visualization
23
REAL-TIME QUERYING -- rapid sampling
For our scenario Focus on ordering rather than exact paper numbers Compute in back-end and returning JSON files Combine static JSON with rapid sampling As for our task, we will focus on ordering rather than exact paper numbers. Let the server do the computing and return JSON files to users. What’s more, it’s a good idea to combine static JSON with rapid sampling to trade off. Kim, Albert, et al. "Rapid sampling for visualizations with ordering guarantees." Proceedings of the VLDB Endowment 8.5 (2015): May 31, 2016 Academic Visualization
24
outline Background Work Review SchoolDrag Future work Q & A
And this is my future work. May 31, 2016 Academic Visualization
25
Future work Building an interactive visualization for Junxian’s paper
Focusing on clustering algorithms at theory group (how to cook food) I am Building an interactive visualization for Junxian’s paper and focusing on clustering algorithms AT theory group (how to cook food) May 31, 2016 Academic Visualization
26
Q & A May 31, 2016 Academic Visualization
27
Thank you! May 31, 2016 Academic Visualization
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.