Slide 1: Sparsity Control for Robustness and Social Data Analysis
Gonzalo Mateos, ECE Department, University of Minnesota
Acknowledgments: Profs. Georgios B. Giannakis, M. Kaveh, G. Sapiro, N. Sidiropoulos, and N. Waller; MURI grant (AFOSR FA9550-10-1-0567)
Minneapolis, MN, December 9, 2011
Slide 2: Learning from "Big Data"
- "Data are widely available, what is scarce is the ability to extract wisdom from them." (Hal Varian, Google's chief economist)
- Big data are: fast, productive, revealing, ubiquitous, smart, and messy
- K. Cukier, "Harnessing the data deluge," Nov. 2011
Slide 3: Social-Computational Systems (SoCS)
- Complex systems of people and computers
- The vision: preference measurement (PM), analysis, and management, to understand and engineer SoCS
- The means: leverage the dual role of sparsity
  - Complexity control through variable selection
  - Robustness to outliers
Slide 4: Conjoint analysis
- Goal: learn a consumer's utility function from preference data
- Strategy: describe products by a set of attributes, or "parts"
- Linear utilities answer: "How much is each part worth?"
- Payoff: optimal design and positioning of new products
- Applications in marketing, healthcare, and psychology [Green-Srinivasan '78]
- Success story [Wind et al '89], with attributes such as room size, TV options, restaurant, and transportation
Slide 5: Modeling preliminaries
- Respondents (e.g., consumers) rate profiles, each comprising a vector of attributes $\mathbf{x}$
- Linear utility $u(\mathbf{x}) = \mathbf{x}^\top \mathbf{w}$: estimate the vector of partworths $\mathbf{w}$
- Conjoint data collection formats:
  - (M1) Metric ratings: $y_i = \mathbf{x}_i^\top \mathbf{w} + \epsilon_i$
  - (M2) Choice-based conjoint data: each respondent selects the preferred profile from a choice set
- Online SoCS-based preference data grow exponentially
- Outliers: inconsistent/corrupted/irrelevant data
Slide 6: Robustifying PM
- Least-trimmed squares (LTS) [Rousseeuw '87]:
  (LTS) $\hat{\mathbf{w}}_{\text{LTS}} = \arg\min_{\mathbf{w}} \sum_{i=1}^{s} r_{[i]}^2(\mathbf{w})$
  where $r_{[i]}^2(\mathbf{w})$ is the $i$-th order statistic among the squared residuals $r_1^2(\mathbf{w}), \ldots, r_n^2(\mathbf{w})$; the $n - s$ largest residuals are discarded
- Q: How should we go about minimizing the nonconvex (LTS)?
- A: Try all subsets of size $s$, solve least squares on each, and pick the best (see the sketch below)
- Simple but intractable beyond small problems
- Near-optimal solvers [Rousseeuw '06], RANSAC [Fischler-Bolles '81]
- G. Mateos, V. Kekatos, and G. B. Giannakis, "Exploiting sparsity in model residuals for robust conjoint analysis," Marketing Sci., Dec. 2011 (submitted)
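To make the combinatorial cost of the exhaustive search concrete, here is a minimal Python/NumPy sketch; the function name and setup are hypothetical, and this illustrates the brute-force search only, not the near-optimal solvers cited above.

```python
import numpy as np
from itertools import combinations

def lts_exhaustive(X, y, s):
    """Exhaustive LTS: fit least squares on every size-s subset of the n
    observations and keep the fit whose s smallest squared residuals have
    the least sum. Only feasible for tiny n."""
    n = X.shape[0]
    best_cost, best_w = np.inf, None
    for subset in combinations(range(n), s):
        idx = list(subset)
        w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        trimmed = np.sort((y - X @ w) ** 2)[:s]  # keep s smallest residuals
        if trimmed.sum() < best_cost:
            best_cost, best_w = trimmed.sum(), w
    return best_w, best_cost
```

Even for n = 20 and s = 15 this already enumerates 15,504 subsets, which is why the relaxations on the following slides matter.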
Slide 7: Modeling outliers
- Introduce outlier variables $o_i$ such that $o_i \neq 0$ if datum $i$ is an outlier, and $o_i = 0$ otherwise
- Nominal ratings obey (M1); outliers obey something else: $y_i = \mathbf{x}_i^\top \mathbf{w} + o_i + \epsilon_i$
- Both $\mathbf{w}$ and the outlier vector $\mathbf{o}$ are unknown, but $\mathbf{o}$ is typically sparse!
- Natural (but intractable) nonconvex estimator: least squares with a constraint on the number of nonzero entries of $\mathbf{o}$
- Related: $\epsilon$-contamination [Fuchs '99], Bayesian model [Jin-Rao '10]
Slide 8: LTS as sparse regression
- Lagrangian form:
  (P0) $\min_{\mathbf{w}, \mathbf{o}} \|\mathbf{y} - \mathbf{X}\mathbf{w} - \mathbf{o}\|_2^2 + \lambda_0 \|\mathbf{o}\|_0$
- The tuning parameter $\lambda_0$ controls the sparsity of $\hat{\mathbf{o}}$, i.e., the number of outliers
- Proposition 1: If $\{\hat{\mathbf{w}}, \hat{\mathbf{o}}\}$ solves (P0) with $\lambda_0$ chosen s.t. $\|\hat{\mathbf{o}}\|_0 = n - s$, then $\hat{\mathbf{w}} = \hat{\mathbf{w}}_{\text{LTS}}$ in (LTS)
- (P0) formally justifies the preference model and its estimator
- (P0) ties sparse regression with robust estimation
Slide 9: Just relax!
- (P0) is NP-hard, so relax the $\ell_0$ norm to the $\ell_1$ norm, e.g., [Tropp '06]:
  (P1) $\min_{\mathbf{w}, \mathbf{o}} \|\mathbf{y} - \mathbf{X}\mathbf{w} - \mathbf{o}\|_2^2 + \lambda_1 \|\mathbf{o}\|_1$
- (P1) is convex, and thus efficiently solved
- The role of the sparsity-controlling parameter $\lambda_1$ is central
- Q: Does (P1) yield robust estimates $\hat{\mathbf{w}}$?
- A: Yes! The Huber estimator is a special case
Slide 10: Lassoing outliers
- Proposition 2: Minimizing (P1) with respect to $\mathbf{w}$ reduces it to a Lasso problem in $\mathbf{o}$; with $\hat{\mathbf{o}}$ in hand, $\hat{\mathbf{w}}$ follows by least squares on the outlier-compensated data $\mathbf{y} - \hat{\mathbf{o}}$
- It thus suffices to solve a Lasso [Tibshirani '94]
- Data-driven methods to select $\lambda_1$
- Lasso solvers return the entire robustification path (RP): the outlier coefficients $\hat{o}_i$ shrink to zero as $\lambda_1$ increases (a simple solver sketch follows)
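A minimal sketch of solving (P1), assuming a simple block-coordinate scheme rather than the path-following Lasso solvers the slide refers to; the names `robust_lasso_pm` and `soft_threshold` are hypothetical.

```python
import numpy as np

def soft_threshold(r, tau):
    """Entry-wise soft-thresholding operator."""
    return np.sign(r) * np.maximum(np.abs(r) - tau, 0.0)

def robust_lasso_pm(X, y, lam, n_iter=200):
    """Block coordinate descent for (P1):
        min_{w,o} 0.5*||y - X w - o||^2 + lam*||o||_1.
    w-step: least squares on the outlier-compensated data y - o.
    o-step: soft-threshold the residuals y - X w at level lam."""
    o = np.zeros_like(y)
    for _ in range(n_iter):
        w, *_ = np.linalg.lstsq(X, y - o, rcond=None)
        o = soft_threshold(y - X @ w, lam)
    return w, o  # nonzero entries of o flag suspected outliers
```

Sweeping `lam` from large to small and recording the support of `o` traces out the robustification path described above.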
Slide 11: Nonconvex regularization
- Nonconvex penalty terms approximate the $\ell_0$ norm in (P0) better than $\ell_1$
- Options: SCAD [Fan-Li '01], or sum-of-logs [Candes et al '08]
- Iterative linearization-minimization of the penalty around the current iterate $\hat{\mathbf{o}}^{(k)}$: each pass solves a weighted Lasso with per-entry weights $\propto 1 / (|\hat{o}_i^{(k)}| + \delta)$
- Initialize with the (P1) solution
- Bias reduction (cf. adaptive Lasso [Zou '06])
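Building on the previous sketch, one plausible rendering of the linearization-minimization loop for the sum-of-logs penalty; the weight normalization and parameter names are assumptions, with the reweighting rule following [Candes et al '08].

```python
import numpy as np

def reweighted_outlier_lasso(X, y, lam, delta=1e-3, n_outer=5, n_inner=100):
    """Each outer pass linearizes the sum-of-logs penalty around the current
    outlier estimates, giving a weighted-L1 version of (P1); inner passes
    solve it by block coordinate descent as before. The first pass uses
    uniform weights, i.e., it is the plain (P1) initialization."""
    o = np.zeros_like(y)
    weights = lam * np.ones_like(y)           # first pass: plain (P1)
    for _ in range(n_outer):
        for _ in range(n_inner):
            w, *_ = np.linalg.lstsq(X, y - o, rcond=None)
            r = y - X @ w
            o = np.sign(r) * np.maximum(np.abs(r) - weights, 0.0)
        # re-linearize: small |o_i| gets a large weight (pushed to zero),
        # large |o_i| gets a small weight (bias reduction)
        weights = lam * delta / (np.abs(o) + delta)
    return w, o
```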
Slide 12: Comparison with RANSAC
[Figure: simulated performance comparison with RANSAC; nominal data generated per (M1) with i.i.d. errors, plus a fraction of outlying ratings.]
Slide 13: Nonparametric regression
- Preferences may be driven by complex mechanisms that are hard to model; interactions among attributes are not captured by the linear utility $\mathbf{x}^\top \mathbf{w}$
- If one trusts data more than any parametric model, go nonparametric: the utility $f$ lives in a space of "smooth" functions
- This is an ill-posed problem; the workaround is regularization [Tikhonov '77], [Wahba '90]
- Work in an RKHS $\mathcal{H}$ with kernel $k(\cdot, \cdot)$ and norm $\|\cdot\|_{\mathcal{H}}$ (see the sketch below)
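By the representer theorem, the RKHS estimate is a kernel expansion over the training inputs, so an outlier-aware fit can again alternate a ridge solve with residual soft-thresholding. A sketch assuming a Gaussian kernel; all names are hypothetical, and this is an illustration of the idea rather than the paper's exact algorithm.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row-stacked inputs."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def robust_kernel_regression(X, y, lam, mu, gamma=1.0, n_iter=100):
    """Sparsity-controlled nonparametric regression:
        min_{alpha,o} 0.5*||y - K alpha - o||^2
                      + 0.5*mu*alpha' K alpha + lam*||o||_1,
    alternating a kernel ridge solve with residual soft-thresholding."""
    K = rbf_kernel(X, X, gamma)
    n = len(y)
    o = np.zeros(n)
    for _ in range(n_iter):
        alpha = np.linalg.solve(K + mu * np.eye(n), y - o)  # ridge step
        r = y - K @ alpha
        o = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)   # outlier step
    return alpha, o  # fitted function: f(x) = sum_j alpha_j * k(x, x_j)
```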
Slide 14: Function approximation
[Figure, four panels: true function, nonrobust predictions, robust predictions, refined predictions.]
- Effectiveness in rejecting outliers is apparent
- G. Mateos and G. B. Giannakis, "Robust nonparametric regression via sparsity control with application to load curve data cleansing," IEEE Trans. Signal Process., 2012
Slide 15: Load curve data cleansing
- Load curve: electric power consumption recorded periodically
- Reliable data are key to realizing the smart grid vision [Hauser '09]
- Sources of corruption: faulty meters, communication errors, unscheduled maintenance, strikes, sport events
- B-splines for load curve prediction and denoising [Chen et al '10]
[Figure: Uruguay's power consumption (MW).]
Slide 16: NorthWrite data
- Energy consumption of a government building ('05-'10)
- Robust smoothing spline estimator
- Outliers: "building operational transition shoulder periods"
- No manual labeling of outliers (cf. [Chen et al '10])
- Data: courtesy of NorthWrite Energy Group, provided by Prof. V. Cherkassky
[Figure: fitted load curve; time axis in hours.]
Slide 17: Principal Component Analysis
- Motivation: (statistical) learning from high-dimensional data, e.g., DNA microarrays, traffic surveillance
- Principal component analysis (PCA) [Pearson 1901]:
  - Extraction of low-dimensional data structure
  - Data compression and reconstruction
- PCA is non-robust to outliers [Jolliffe '86]
- Our goal: robustify PCA by controlling outlier sparsity
Slide 18: Our work in context
- Robust PCA:
  - Robust covariance matrix estimators [Campbell '80], [Huber '81]
  - Computer vision [Xu-Yuille '95], [De la Torre-Black '03]
  - Low-rank matrix recovery from sparse errors, e.g., [Wright et al '09]
- Contemporary applications tied to SoCS:
  - Anomaly detection in IP networks [Huang et al '07], [Kim et al '09]
  - Video surveillance, e.g., [Oliver et al '99]
  - Matrix completion for collaborative filtering, e.g., [Candes et al '09]
Slide 19: PCA formulations
- Training data $\{\mathbf{y}_i\}_{i=1}^n$
- Minimum reconstruction error: compress with $\mathbf{U}^\top$, reconstruct with $\mathbf{U}$, minimizing $\sum_i \|\mathbf{y}_i - \mathbf{m} - \mathbf{U}\mathbf{U}^\top(\mathbf{y}_i - \mathbf{m})\|_2^2$
- Maximum variance: $\max_{\mathbf{U}^\top\mathbf{U} = \mathbf{I}} \sum_i \|\mathbf{U}^\top(\mathbf{y}_i - \mathbf{m})\|_2^2$
- Component analysis model: $\mathbf{y}_i = \mathbf{m} + \mathbf{U}\mathbf{s}_i + \mathbf{e}_i$
- Solution: $\mathbf{U}$ spans the $q$ dominant eigenvectors of the sample covariance matrix, obtainable via the SVD of the centered data
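All three formulations share the same SVD solution; a compact sketch (data assumed row-stacked, names hypothetical):

```python
import numpy as np

def pca(Y, q):
    """PCA via SVD of the centered data: the q dominant right singular
    vectors minimize reconstruction error / maximize captured variance."""
    m = Y.mean(axis=0)                                   # sample mean
    U, s, Vt = np.linalg.svd(Y - m, full_matrices=False)
    Uq = Vt[:q].T                                        # loadings (p x q)
    scores = (Y - m) @ Uq                                # principal components
    recon = scores @ Uq.T + m                            # rank-q reconstruction
    return Uq, scores, recon
```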
Slide 20: Robustifying PCA
- Outlier-aware component analysis model: $\mathbf{y}_i = \mathbf{m} + \mathbf{U}\mathbf{s}_i + \mathbf{o}_i + \mathbf{e}_i$
- Interpretation: a blind preference model with latent profiles
- (P2) minimizes the reconstruction error plus a row-wise $\ell_2$-norm penalty on the outlier matrix $\mathbf{O}$
- The $\ell_0$-norm counterpart of (P2) is tied to an LTS version of PCA (LTS PCA)
- (P2) subsumes an optimal (vector) Huber estimator
- $\ell_1$-norm regularization handles entry-wise outliers
- G. Mateos and G. B. Giannakis, "Robust PCA as bilinear decomposition with outlier sparsity regularization," IEEE Trans. Signal Process., Nov. 2011 (submitted)
Slide 21: Alternating minimization of (P2)
- $\{\mathbf{m}, \mathbf{U}, \mathbf{s}_i\}$ update: SVD of the outlier-compensated data $\mathbf{Y} - \mathbf{O}$
- $\mathbf{O}$ update: row-wise vector soft-thresholding of the residuals
- Proposition 3: Algorithm 1's iterates converge to a stationary point of (P2)
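A minimal sketch of the alternating scheme, assuming row-wise (datum-wise) outliers as in (P2); the function names and the plain thresholding level are placeholders, not the paper's exact algorithm.

```python
import numpy as np

def row_soft_threshold(R, tau):
    """Row-wise vector soft-thresholding: shrink each row's l2 norm by tau."""
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    return R * np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)

def robust_pca(Y, q, lam, n_iter=100):
    """Alternate (i) a rank-q SVD fit to the outlier-compensated data and
    (ii) re-estimation of the sparse outlier rows from the residuals."""
    O = np.zeros_like(Y)
    L = np.zeros_like(Y)
    for _ in range(n_iter):
        D = Y - O                                   # compensate outliers
        m = D.mean(axis=0)
        U, s, Vt = np.linalg.svd(D - m, full_matrices=False)
        L = (U[:, :q] * s[:q]) @ Vt[:q] + m         # rank-q fit plus mean
        O = row_soft_threshold(Y - L, lam)          # flag outlying rows
    return L, O  # rows of O with nonzero norm are flagged outliers
```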
Slide 22: Video surveillance
[Figure panels: original frames, PCA reconstruction, robust PCA reconstruction, flagged "outliers".]
- Data: http://www.cs.cmu.edu/~ftorre/
Slide 23: Big Five personality factors
- Five dimensions of personality traits [Goldberg '93], [Costa-McCrae '92]
- Discovered through factor analysis, largely on WEIRD (Western, educated, industrialized, rich, democratic) subjects
- The Big Five Inventory (BFI) measures the Big Five with a short questionnaire (44 items)
- Items rated 1-5, e.g., "I see myself as someone who... is talkative," "... is full of energy"
- Handbook of Personality: Theory and Research, O. P. John, R. W. Robins, and L. A. Pervin, Eds. New York, NY: Guilford Press, 2008
Slide 24: BFI data
- Eugene-Springfield community sample [Goldberg '08]: subjects, item responses, factors
- Robust PCA identifies 8 outlying subjects
- Validated via "inconsistency" scores, e.g., VRIN [Tellegen '88]
- Data: courtesy of Prof. L. Goldberg, provided by Prof. N. Waller
Slide 25: Online robust PCA
- Motivation: real-time data and memory limitations
- Exponentially-weighted robust PCA: past squared-error terms are discounted by a forgetting factor
- At time $t$, do not re-estimate the past outlier vectors $\{\hat{\mathbf{o}}_\tau\}_{\tau < t}$
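One plausible single-datum update consistent with this idea; a hedged sketch assuming a stochastic-gradient subspace update, not necessarily the slide's exact recursion, with all names hypothetical.

```python
import numpy as np

def online_robust_pca_step(U, y, lam, mu=0.1):
    """Process one new datum y: estimate its outlier part by soft-
    thresholding the residual against the current subspace U, then take a
    gradient step on U using the outlier-compensated datum."""
    s = U.T @ y                                         # subspace coordinates
    r = y - U @ s
    o = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)   # entry-wise outliers
    U = U + mu * np.outer((y - o) - U @ s, s)           # gradient step on fit
    U, _ = np.linalg.qr(U)                              # keep columns orthonormal
    return U, o
```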
Slide 26: Online PCA in action
[Figure: tracking performance on synthetic data with nominal and outlying samples.]
Slide 27: Robust kernel PCA
- Kernel (K)PCA [Scholkopf '97]
- Challenge: the feature space can be very high- (even infinite-) dimensional
- Kernel trick: inner products in feature space reduce to kernel evaluations, $\langle \phi(\mathbf{x}), \phi(\mathbf{x}') \rangle = k(\mathbf{x}, \mathbf{x}')$, mapping input space to feature space implicitly
- Related to spectral clustering
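A sketch of the kernel trick in action for plain KPCA, starting from a precomputed Gram matrix; the centering and scaling follow the standard KPCA recipe, and a robust variant would additionally carry the sparse outlier terms as in the earlier slides.

```python
import numpy as np

def kernel_pca(K, q):
    """Kernel PCA from a Gram matrix K: double-center K (implicit centering
    in feature space), then project onto its top-q eigenvectors."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                              # centered Gram matrix
    vals, vecs = np.linalg.eigh(Kc)             # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:q]            # pick the q largest
    vals, vecs = vals[idx], vecs[:, idx]
    return vecs * np.sqrt(np.maximum(vals, 0))  # training-point projections
```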
Slide 28: Unveiling communities
- Network: NCAA football teams (nodes), Fall 2000 games (edges)
- Identified exactly: Big 10, Big 12, ACC, SEC, Big East
- Outliers: independent teams
- ARI (adjusted Rand index) = 0.8967
- Data: http://www-personal.umich.edu/~mejn/netdata/
Slide 29: Spectrum cartography
- Idea: sensing radios collaborate to form a spatial map of the spectrum
- Goal: find $\Phi(\mathbf{x}, f)$ such that $\Phi(\mathbf{x}, \cdot)$ is the spectrum at position $\mathbf{x}$
- Approach: basis expansion model for $\Phi$, fit by nonparametric basis pursuit
[Figure: original vs. estimated spectrum map.]
- J. A. Bazerque, G. Mateos, and G. B. Giannakis, "Group-Lasso on splines for spectrum cartography," IEEE Trans. Signal Process., Oct. 2011
Slide 30: Distributed adaptive algorithms
- Issues and significance:
  - Fast-varying (non-)stationary processes
  - Unavailability of statistical information
  - Online incorporation of sensor data
  - Noisy communication links
  - Improved learning through cooperation
- Technical approaches:
  - Consensus-based in-network operation in ad hoc wireless sensor networks (WSNs)
  - Distributed optimization using alternating-direction methods
  - Online learning of statistics using stochastic approximation
  - Performance analysis via stochastic averaging
- G. Mateos, I. D. Schizas, and G. B. Giannakis, "Distributed recursive least-squares for consensus-based in-network adaptive estimation," IEEE Trans. Signal Process., Nov. 2009
Slide 31: Unveiling network anomalies
- Approach: flag anomalies across flows and time via sparsity and low rank
- Payoff: ensure high performance, QoS, and security in IP networks
[Figure: anomalies across flows and time; enhanced detection capabilities.]
- M. Mardani, G. Mateos, and G. B. Giannakis, "Unveiling network anomalies across flows and time via sparsity and low rank," IEEE Trans. Inf. Theory, Dec. 2011 (submitted)
Slide 32: Concluding summary
- Overarching theme: control sparsity in model residuals for robust learning (outlier-resilient estimation at the intersection of signal processing and the Lasso)
- Research issues addressed:
  - Sparsity control for robust metric and choice-based PM
  - Kernel-based nonparametric utility estimation
  - Robust (kernel) principal component analysis
  - Scalable distributed real-time implementations
- Application domains:
  - Preference measurement and conjoint analysis
  - Psychometrics, personality assessment
  - Video surveillance
  - Social and power networks
- Experimental validation with GPIPP personality ratings (~6M)
- Gosling-Potter Internet Personality Project (GPIPP): http://www.outofservice.com