MATH 6380J Mini-Project 1: Realization of Recent Trends in Machine Learning Community in Recent Years by Pattern Mining of NIPS Words. Chan Lok Chun, lcchanac@ust.hk


Department of Life Science, HKUST

1. Introduction

In this study, we analyze a collection of words appearing in papers published at NIPS from 1987 to 2015. Sparse PCA and linear regression on the frequency distribution of the words are used to identify recent trends in machine-learning research.

2. Hypothesis

Given that deep learning has thrived over the past decade, the frequency of appearance of words related to this field is expected to show a notable increasing trend.

3. NIPS word dataset

The NIPS word dataset contains the counts of 11,463 words in 5,812 papers published from 1987 to 2015.

Preprocessing: Since we are interested in how each word's frequency of appearance changes over time, the counts are averaged by year to obtain the relative frequency of each word in each year. The resulting dataset is 11,463 (words) x 29 (years).

Clustering: Rather than analyzing all 11,463 words at once, we note that many words are likely correlated (e.g. jargon from the same research area), so clustering may reveal hidden structure in the dataset. Sparse PCA was chosen for its robustness in the p >> n setting. By calculating the adjusted total variance [1] explained by the PCs, the top 30 sparse PCs were found to explain the majority of the variation in the dataset (Fig. 1), and these 30 PCs were selected for further analysis.

Fig. 1: Explained (adjusted) total variance of the top 80 PCs.

4. Analysis

To identify recently rising trends, we look for the PCs that explain much of the variation in recent years. The data are first projected onto each PC and plotted against time (Fig. 2), and linear regression is then performed on each projection to extract its slope. The desired PCs are those whose slopes differ significantly from zero (tested at the 99.9% confidence level). After this screening, 1,035 of the 11,463 words are recovered from the extracted sparse PCs. Linear regression was then performed on the relative frequencies of each of these 1,035 words over the 29 time points (1987-2015). The 50 words with the largest slopes were selected as the 50 most increasingly frequent words in NIPS papers in recent years, and this list was inspected manually to determine which subfield of the machine-learning community each word belongs to.

Fig. 2: Projected data of the top 30 PCs against time.

6. Conclusion

The "top 50 rising words" were analyzed and, to our surprise, no keyword strongly related to deep learning appears in the list, while up to 36% of the words belong to Bayesian statistics and 12% to optimization. However, when the time series of certain deep-learning-related words are plotted, they do show an obvious increase in frequency over time, yet none of the top 30 sparse PCs includes these words. We conjecture that the dataset has a complicated underlying structure that cannot be clustered effectively by a linear projection method such as sparse PCA, so that the signals of certain important features are masked in low-ranked PCs.

7. References

[1] Zou, H., Hastie, T., & Tibshirani, R. (2006). Sparse Principal Component Analysis. Journal of Computational and Graphical Statistics, 15(2), 265-286. doi:10.1198/106186006X113430
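The clustering step can be sketched with scikit-learn's SparsePCA, which remains usable when features vastly outnumber samples (p >> n). The matrix below is a random stand-in for the real 11,463-word x 29-year frequency table, and the sizes and the `n_components`/`alpha` settings are illustrative assumptions, not the values used in the project.

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.default_rng(0)
# Random stand-in for the frequency table: rows = 29 years, columns = words.
# (The actual dataset has 11463 word columns; 120 keeps the sketch fast.)
X = rng.standard_normal((29, 120))

# The lasso penalty (alpha) drives most loadings to exactly zero.
spca = SparsePCA(n_components=10, alpha=1.0, random_state=0)
scores = spca.fit_transform(X)      # each year projected onto each sparse PC
loadings = spca.components_         # shape (10, 120); mostly zero entries

# Words "recovered" by the PCs are those with a nonzero loading in any PC.
recovered = np.flatnonzero(np.abs(loadings).sum(axis=0) > 0)
```

The sparsity is what makes the recovered-word list meaningful: ordinary PCA would assign every word a nonzero loading in every PC.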
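Because sparse loading vectors are generally not orthogonal, naive projected variances double-count shared variation; [1] therefore define an adjusted total variance via a QR decomposition of the score matrix. A minimal sketch, assuming the sparse loadings are stored as columns of V:

```python
import numpy as np

def adjusted_variance(X, V):
    """Adjusted variance of sparse PCs via QR decomposition (Zou et al., 2006).
    X: (n_samples, n_features) data; V: (n_features, k) sparse loadings.
    The QR step removes the overlap between correlated components."""
    Xc = X - X.mean(axis=0)          # center each feature
    _, R = np.linalg.qr(Xc @ V)      # QR of the projected (score) matrix
    return np.diag(R) ** 2           # adjusted variance of each component

# Sanity check: for ordinary (orthonormal) PCA loadings the adjustment
# changes nothing, so the values equal the squared singular values.
rng = np.random.default_rng(1)
X = rng.standard_normal((29, 8))
Xc = X - X.mean(axis=0)
_, s, Vt = np.linalg.svd(Xc, full_matrices=False)
adj = adjusted_variance(X, Vt[:3].T)
```

Summing `adjusted_variance` over the top k sparse PCs and dividing by the total variance of Xc gives the cumulative curve used to decide that 30 PCs suffice.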
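The slope screening in Section 4 amounts to a t-test on a least-squares slope, applied once per PC projection (and later per word). A sketch using scipy's linregress on a synthetic rising series; the 0.01 slope and noise level are made-up numbers for illustration:

```python
import numpy as np
from scipy import stats

def rising_trend(values, years, alpha=0.001):
    """Return (slope, significant) for a least-squares fit of values vs. years.
    linregress reports the two-sided p-value for the null hypothesis of zero
    slope; alpha=0.001 corresponds to the 99.9% confidence screen."""
    res = stats.linregress(years, values)
    return res.slope, bool(res.pvalue < alpha)

years = np.arange(1987, 2016)        # the 29 NIPS years
rng = np.random.default_rng(0)
# Synthetic relative-frequency series with a clear upward trend.
freq = 0.01 * (years - 1987) + 0.001 * rng.standard_normal(years.size)
slope, significant = rising_trend(freq, years)
```

Running this per series, keeping the significant ones, and sorting by `slope` yields a "top rising" list analogous to the top-50 words in the poster.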