Analysis of scientific research
Mario Sangiorgio, Giordano Tamburrelli

The origin of this work
- Carlo Ghezzi’s keynote: “Reflections on 40+ years of software engineering research and beyond: an insider’s view”
- Analysis based on papers
- Lack of tools to perform the analysis
- WHAT: research topics
- WHO: contributors
- HOW/WHEN: trends

The origin of this work
- Lack of tools to perform the analysis
- Without tools, the analysis is time-consuming, boring, and requires an expert

Automatic analysis
- Faster
- Scalable
- General method
- One-click (after training)
- Feasible with data mining techniques
- BUT still not perfect (it is not semantic-based)

Steps of the analysis
- Identification of subtopics
- Interpretation of paper content
- Trend analysis (so far)
Techniques: CLUSTERING, CLASSIFICATION, CLUSTERING, STATISTICS

Clustering

Hierarchical Expectation Maximization algorithm
- The tool used is Crossbow (thanks to Gianluca Staffiero and Gabriele Valentini)
- Abstracts of papers from both general and specific conferences and journals
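
To make the clustering idea concrete, here is a minimal sketch of Expectation Maximization on a mixture of multinomials over bag-of-words abstracts. It is an assumption-laden illustration, not the Crossbow tool: it is flat rather than hierarchical, and the names `tokenize`, `em_cluster`, `alpha`, and the toy abstracts are all hypothetical.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    """Lower-case a text and keep only alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def em_cluster(abstracts, k, iterations=25, alpha=0.1, seed=0):
    """Soft-cluster texts with EM on a mixture of multinomials.

    Returns one list of k cluster probabilities per abstract.
    This is a flat (non-hierarchical) illustration only.
    """
    rng = random.Random(seed)
    counts = [Counter(tokenize(a)) for a in abstracts]
    vocab = sorted({w for c in counts for w in c})
    n = len(counts)

    # Start from random soft assignments (responsibilities).
    resp = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])

    for _ in range(iterations):
        # M-step: re-estimate smoothed cluster priors and word distributions.
        priors = [
            (sum(resp[i][j] for i in range(n)) + alpha) / (n + k * alpha)
            for j in range(k)
        ]
        word_logp = []
        for j in range(k):
            weighted = Counter()
            for i, c in enumerate(counts):
                for w, cnt in c.items():
                    weighted[w] += resp[i][j] * cnt
            denom = sum(weighted.values()) + alpha * len(vocab)
            word_logp.append(
                {w: math.log((weighted[w] + alpha) / denom) for w in vocab}
            )

        # E-step: recompute responsibilities in log space for stability.
        for i, c in enumerate(counts):
            logs = [
                math.log(priors[j])
                + sum(cnt * word_logp[j][w] for w, cnt in c.items())
                for j in range(k)
            ]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            z = sum(exps)
            resp[i] = [e / z for e in exps]
    return resp


# Hypothetical toy abstracts, for illustration only.
abstracts = [
    "model checking of temporal logic specifications",
    "temporal logic and model checking for protocols",
    "mining bug reports from software repositories",
    "software repositories and bug reports mining study",
]
for text, row in zip(abstracts, em_cluster(abstracts, k=2)):
    print([round(p, 2) for p in row], text)
```

EM can settle in a local optimum that depends on the random start, which fits the later remark that clustering was iterated until the results were good.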

The clustering process

Classification

Bayesian classifier
- Ad hoc tool built using Mallet
- Analysis based on the abstracts of the papers
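
As a rough illustration of how a Bayesian classifier can assign a topic to an abstract, the sketch below implements multinomial naive Bayes with Laplace smoothing in plain Python. It does not use or reproduce Mallet (a Java toolkit); the class name, the training abstracts, and the topic labels are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case a text and keep only alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTopicClassifier:
    """Multinomial naive Bayes over bag-of-words abstracts (illustrative)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing constant
        self.doc_counts = Counter()             # abstracts seen per topic
        self.word_counts = defaultdict(Counter) # word frequencies per topic
        self.vocab = set()

    def fit(self, abstracts, topics):
        for text, topic in zip(abstracts, topics):
            words = tokenize(text)
            self.doc_counts[topic] += 1
            self.word_counts[topic].update(words)
            self.vocab.update(words)
        return self

    def predict(self, abstract):
        n_docs = sum(self.doc_counts.values())
        # Words never seen in training carry no evidence and are skipped.
        words = [w for w in tokenize(abstract) if w in self.vocab]
        scores = {}
        for topic, n in self.doc_counts.items():
            denom = (sum(self.word_counts[topic].values())
                     + self.alpha * len(self.vocab))
            score = math.log(n / n_docs)  # log prior of the topic
            for w in words:
                score += math.log(
                    (self.word_counts[topic][w] + self.alpha) / denom
                )
            scores[topic] = score
        return max(scores, key=scores.get)


# Hypothetical training data: abstracts already labelled with a topic.
train_abstracts = [
    "formal verification with model checking and temporal logic",
    "proof of correctness using formal specification",
    "mining software repositories and bug reports",
    "empirical mining of version history data",
]
train_topics = ["formal methods", "formal methods",
                "software mining", "software mining"]

clf = NaiveBayesTopicClassifier().fit(train_abstracts, train_topics)
print(clf.predict("model checking of temporal logic specification"))
```

In the workflow described by the slides, the topic labels would come from the clustering step rather than from hand-written examples like these.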

Result evaluation
- Clustering was iterated until the results were good
- Classification performs well:
  - high precision and recall values
  - a human expert agrees with the classifier
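
For reference, the standard per-topic definitions of precision and recall are given below. The slides do not report the actual values; the symbols $TP_t$, $FP_t$, and $FN_t$ (true positives, false positives, and false negatives for topic $t$, judged against the human expert's labels) are notation assumed here, and `\text` requires the `amsmath` package.

```latex
\[
\text{precision}_t = \frac{TP_t}{TP_t + FP_t},
\qquad
\text{recall}_t = \frac{TP_t}{TP_t + FN_t}
\]
```

High precision means that papers assigned to a topic mostly belong there; high recall means that few papers of a topic are missed.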

Outcomes
- Research analysis: trends on main conferences and journals
- Tools to support research: automatic bidding

Some trends found
- Data from IEEE Transactions on Software Engineering
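
One simple way to turn classified papers into a trend is to compute, for each year, the share of papers that fall under each topic. The sketch below assumes the classifier output is available as `(year, topic)` pairs; the function name and the sample data are hypothetical and do not come from the journal.

```python
from collections import Counter, defaultdict


def topic_share_by_year(papers):
    """Return {year: {topic: fraction of that year's papers}}.

    `papers` is an iterable of (year, topic) pairs, one per paper.
    """
    per_year = defaultdict(Counter)
    for year, topic in papers:
        per_year[year][topic] += 1
    shares = {}
    for year in sorted(per_year):
        total = sum(per_year[year].values())
        shares[year] = {t: n / total for t, n in per_year[year].items()}
    return shares


# Hypothetical classifier output, for illustration only.
papers = [
    (2000, "formal methods"), (2000, "software mining"),
    (2001, "formal methods"), (2001, "formal methods"),
]
for year, share in topic_share_by_year(papers).items():
    print(year, share)
```

Using shares rather than raw counts keeps years with different numbers of published papers comparable.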

Automatic bidding
- Builds upon the analysis methodologies and results

Bidding process
- Grouping the submissions by topic
- Creation of a profile for the reviewers
- Matching papers’ topics with reviewers’ interests
Techniques: CLASSIFICATION, SELECTION
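
The three steps can be sketched as follows, assuming every submission and every past paper of a reviewer has already been assigned a topic by the classifier. The profile rule used here (the most frequent topics among a reviewer's own papers) is an assumption, as are the names `build_profile`, `suggest_bids`, `top_n`, and the sample data.

```python
from collections import Counter


def build_profile(reviewer_paper_topics, top_n=3):
    """Profile = the reviewer's top_n most frequent topics (assumed rule)."""
    return [t for t, _ in Counter(reviewer_paper_topics).most_common(top_n)]


def suggest_bids(profile, submissions):
    """Select submissions whose topic appears in the reviewer's profile.

    `submissions` maps a paper id to its classified topic.
    """
    return sorted(pid for pid, topic in submissions.items() if topic in profile)


# Hypothetical data: topics of a reviewer's past papers and of new submissions.
past_topics = ["formal methods", "formal methods", "web services",
               "models", "web services", "formal methods"]
submissions = {"p1": "web services", "p2": "software mining",
               "p3": "formal methods"}

profile = build_profile(past_topics, top_n=2)
print(profile)
print(suggest_bids(profile, submissions))
```

Limiting the profile to a few top topics keeps the suggestions focused on the most relevant topics, at the cost of missing papers the reviewer might bid on for other reasons.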

Grouping the submissions

Creation of the reviewer profile

Matching profiles and submissions

Result evaluation: ICSM 2010

Reviewers’ profiles
Carlo Ghezzi
- Profile: web-services, formal methods, middleware for distributed systems, models, software components, education
- CONFIRMED
Harald Gall
- Profile: software mining, middleware for distributed systems, models, empirical studies
- Do you agree?

Comparison with actual bids
- Results apparently not so good: recall is about 53%
- BUT:
  - the actual bid is not an oracle
  - we are suggesting papers for the most relevant topics
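
The recall figure compares the suggested bids with the bids a reviewer actually placed. A minimal sketch of that comparison is shown below; the function name and the paper ids are hypothetical, and the numbers are not the ICSM 2010 data.

```python
def bid_recall(suggested, actual):
    """Fraction of the reviewer's actual bids that were also suggested."""
    actual = set(actual)
    if not actual:
        return 0.0  # no actual bids, so recall is undefined; report 0.0 here
    return len(set(suggested) & actual) / len(actual)


# Hypothetical paper ids, for illustration only.
suggested = {"p1", "p2", "p3"}
actual = {"p2", "p3", "p4", "p5"}
print(bid_recall(suggested, actual))
```

Because reviewers may bid for reasons unrelated to topic, an actual bid missing from the suggestions is not necessarily a wrong suggestion.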

Live testing: ICSE 2011
- Propose our bids to the reviewers
- Get feedback on our suggestions, based on reviewers’ impressions

Future work
- Improvement of the system
- Ranking of the suggested papers
- Deeper statistical analysis
- Paper assignment based on Genetic Algorithms