
1 Feature Selection for Automated Speech Scoring
Anastassia Loukina, Klaus Zechner, Lei Chen, Michael Heilman*
Educational Testing Service
*CIVIS Analytics
Copyright © 2015 by Educational Testing Service.

2 Overview
- Motivation
- Data
- Scoring models
- Results
- Conclusion

3 Context and motivation
- Scoring of constructed responses: speech
- Features are computed with NLP and speech technology, from speech recognition and signal processing outputs
- Scores are predicted using supervised machine learning
- Educational measurement requires managing a trade-off:
  - Maximize empirical performance
  - Maximize model interpretability

4 Ideal Properties of Scoring Models
- High empirical performance
- Contains features that evaluate all relevant aspects of the test construct
- Relative contribution of each feature should be obvious
- Inter-correlations between features not too high
- Polarity of feature weights corresponds to their meaning
- Smaller and simpler is better (interpretability)

5 Linear Regression Scoring Models Built by Human Experts
- Straightforward and well-known in all disciplines
- Can address most requirements of ideal scoring models
- Disadvantage: cumbersome development, because features are selected manually and every constraint must be checked by hand

6 Proposed Model
- Explore alternative regression models, e.g., shrinkage methods
- Can do feature selection automatically while still addressing ideal model constraints

7 Data
- Spoken English proficiency test
- Spontaneous speech, ~1 minute per response
- Score scale: 1 – 4

Data Set   Speakers   Responses   H-H Correlation
Train      9,312      9,956       0.63
Eval       8,101      47,642      0.62

8 Features
- 75 features extracted for each response via SpeechRater
- Construct dimensions:
  – fluency
  – pronunciation accuracy
  – prosody
  – grammar
  – vocabulary
- Dimensions not covered: content, discourse

9 Scoring Models
1. Baseline: human expert (12 features)
2. All features using OLS regression
3. Hybrid stepwise regression
4. Non-negative least-squares regression
5. Non-negative LASSO regression (LASSO*; lambda optimized to obtain a feature set size of about 25)
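As a rough illustration of how three of the models listed above differ (all-feature OLS, non-negative least squares, non-negative LASSO), the sketch below fits each one with scikit-learn on synthetic data. The data, the feature count of 20, and the value `alpha=0.05` are assumptions made for the example; none of them come from the study, and the expert baseline and stepwise model are not shown.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for a (responses x features) matrix; not the study's data.
n_responses, n_features = 500, 20
X = rng.normal(size=(n_responses, n_features))
true_beta = np.zeros(n_features)
true_beta[:5] = [0.8, 0.6, 0.4, 0.3, 0.2]  # only five features carry signal
y = X @ true_beta + rng.normal(scale=1.0, size=n_responses)

models = {
    "All OLS": LinearRegression(),
    # positive=True constrains every coefficient to be >= 0
    "Non-neg LS": LinearRegression(positive=True),
    # alpha plays the role of lambda; 0.05 is an arbitrary illustrative value
    "Non-neg LASSO": Lasso(alpha=0.05, positive=True),
}

summary = {}
for name, model in models.items():
    model.fit(X, y)
    coefs = model.coef_
    summary[name] = {
        "n_features": int(np.sum(coefs != 0)),    # features kept by the model
        "has_negative": bool(np.any(coefs < 0)),  # any wrong-polarity weights?
    }
    print(name, summary[name])
```

The `positive=True` option is what keeps coefficient polarity fixed; the L1 penalty in `Lasso` is what additionally drops features.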

10 LASSO
- Shrinkage model: dimensionality reduction
- Penalty for larger coefficients
- Sets a subset of coefficients to zero
- Lambda parameter: if zero, yields the OLS model; if infinite, yields a model with no features
- Optimal lambda determined empirically (target number of features where performance flattens out)
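The effect of lambda can be sketched with a small coordinate-descent solver for the non-negative LASSO objective (1/2n)||y - Xb||^2 + lambda * sum(b), with b >= 0. This is a minimal illustration, not the implementation used in the study: the scaling of the objective, the synthetic data, and the simplification of picking the lambda whose feature count is closest to a fixed `target` (rather than locating where performance flattens out) are all assumptions.

```python
import numpy as np

def nonneg_lasso(X, y, lam, n_sweeps=100):
    """Coordinate descent for min (1/2n)||y - Xb||^2 + lam * sum(b), b >= 0."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()        # residual when beta = 0
    col_sq = (X ** 2).sum(axis=0) / n     # per-feature curvature
    for _ in range(n_sweeps):
        for j in range(p):
            # correlation of feature j with the residual that excludes feature j
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # shrink by lam and clip at zero (non-negativity constraint)
            new_bj = max(rho - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta

rng = np.random.default_rng(1)
n, p, target = 1000, 30, 8                # synthetic sizes, not the study's
X = rng.normal(size=(n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize features
true_beta = np.zeros(p)
true_beta[:8] = np.linspace(0.9, 0.2, 8)
y = X @ true_beta + rng.normal(size=n)
y = y - y.mean()                          # centered, so no intercept is needed

lam_max = np.max(X.T @ y) / n             # at this lambda no feature survives
path = []
for lam in lam_max * np.logspace(0, -3, 30):
    beta = nonneg_lasso(X, y, lam)
    n_selected = int(np.sum(beta > 1e-8))
    path.append((lam, n_selected, beta))

# Pick the lambda whose feature count is closest to the target size.
best_lam, best_n, best_beta = min(path, key=lambda t: abs(t[1] - target))
print(f"lambda={best_lam:.4f} keeps {best_n} features")
```

Printing the `(lam, n_selected)` pairs in `path` shows the shrinkage behaviour directly: the largest lambda keeps nothing, and smaller values admit progressively more features.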

11 Crossvalidation Results

Model             Features   Negative Coeffs   Correlation
Expert baseline   12         No                0.606
All OLS           75         Yes               0.667
Hybrid stepwise   ~40        Yes               0.667
Non-neg LS        ~35        No                0.655
LASSO*            ~25        No                0.649
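A cross-validated correlation of the kind reported above can be computed by predicting each response from a model trained on the other folds, then correlating the pooled predictions with the observed scores. The sketch below does this for plain OLS with NumPy. The use of 10 folds, the helper name `cv_correlation`, and the synthetic data are assumptions; the slides do not state the cross-validation setup.

```python
import numpy as np

def cv_correlation(X, y, n_folds=10, seed=0):
    """Pearson r between held-out OLS predictions and observed scores."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    X1 = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    preds = np.empty(len(y))
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        # fit on the training folds only
        coef, *_ = np.linalg.lstsq(X1[train_mask], y[train_mask], rcond=None)
        # predict the held-out fold
        preds[test_idx] = X1[test_idx] @ coef
    return np.corrcoef(preds, y)[0, 1]

# Synthetic example: two informative features out of five.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, 0.0]) + rng.normal(scale=0.5, size=200)
r = cv_correlation(X, y)
print(f"cross-validated r = {r:.3f}")
```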

12 Results on Evaluation Set

Model             Features   Item Corr   Speaker Corr
Expert baseline   12         0.61        0.78
All OLS           75         0.67        0.86
LASSO*            25         0.65        0.84
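Speaker-level correlations are typically higher than item-level ones because combining several responses per speaker averages out response-level noise. The sketch below illustrates this on synthetic data, assuming that the speaker-level score is the sum of a speaker's response scores; the slides do not say how the aggregation was done. The choice of six responses per speaker mirrors the evaluation set's ratio of 47,642 responses to 8,101 speakers (about 5.9).

```python
import numpy as np

rng = np.random.default_rng(2)
n_speakers, per_speaker = 500, 6  # synthetic; about 5.9 per speaker in the eval set

# Each speaker has a latent ability; every score is ability plus independent noise.
ability = rng.normal(size=n_speakers)
speaker_id = np.repeat(np.arange(n_speakers), per_speaker)
human = ability[speaker_id] + rng.normal(size=speaker_id.size)
machine = ability[speaker_id] + rng.normal(size=speaker_id.size)

# Item level: correlate scores response by response.
item_corr = np.corrcoef(human, machine)[0, 1]

# Speaker level: sum each speaker's scores first, then correlate the totals.
human_total = np.bincount(speaker_id, weights=human)
machine_total = np.bincount(speaker_id, weights=machine)
speaker_corr = np.corrcoef(human_total, machine_total)[0, 1]

print(f"item r = {item_corr:.2f}, speaker r = {speaker_corr:.2f}")
```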

13 Construct Coverage Comparison
Adding relative standardized beta-weights

Construct                Expert   LASSO*
Fluency                  0.580    0.527
Pronunciation accuracy   0.098    0.151
Prosody                  0.080    0.035
Total for Delivery       0.759    0.712
Grammar                  0.155    0.103
Vocabulary               0.086    0.183
Total for Language Use   0.241    0.286
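One plausible way to obtain per-construct figures like these is to standardize each coefficient, divide by the sum of all standardized coefficients, and add up the shares within each construct. The sketch below assumes exactly that definition, which the slides do not spell out. The feature names, coefficients, and standard deviations are invented for illustration.

```python
from collections import defaultdict

# Hypothetical features: name -> (construct, raw coefficient, feature SD).
# All values are invented; they are not SpeechRater features or study results.
features = {
    "speaking_rate": ("fluency", 0.40, 1.5),
    "pause_freq": ("fluency", 0.20, 1.0),
    "vowel_score": ("pronunciation accuracy", 0.30, 0.5),
    "type_count": ("vocabulary", 0.05, 4.0),
}
score_sd = 0.8  # hypothetical SD of the human scores

# Standardized beta: raw coefficient rescaled by SD(feature) / SD(score).
std_beta = {name: coef * sd / score_sd for name, (_, coef, sd) in features.items()}

# Relative weight: each feature's share of the total standardized weight.
total = sum(std_beta.values())
relative = {name: b / total for name, b in std_beta.items()}

# Construct coverage: add the relative weights of the features in each construct.
by_construct = defaultdict(float)
for name, (construct, _, _) in features.items():
    by_construct[construct] += relative[name]

for construct, share in by_construct.items():
    print(f"{construct}: {share:.3f}")
```

This definition is only well behaved when all coefficients are non-negative, which is one reason the non-negativity constraint helps interpretability.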

14 Summary
- Building scoring models for constructed responses in line with best practices in educational measurement is a complex task of constraint satisfaction
- Therefore, this task has typically been performed by human experts
- Our study demonstrates the viability of automated feature selection methods that can satisfy multiple requirements of ideal scoring models
- The LASSO* model is more accurate than the expert baseline, has very similar construct coverage, and is highly interpretable

