Principal Components & Neural Networks: How I finished second in Mapping Dark Matter Challenge. Sergey Yurgenson, Harvard University. Pasadena, 2011.


1 Principal Components & Neural Networks: How I finished second in Mapping Dark Matter Challenge. Sergey Yurgenson, Harvard University. Pasadena, 2011.

2 Scientific view: to measure the ellipticity of 60,000 simulated galaxies (Kitching, 2011).

3 Data mining view. Training set: 40,000 training examples with known ellipticity (e.g. e1=-0.13889, e2=0.090147). Test set: 60,000 examples with e1=?, e2=?. Find a regression function g: P -> e. The regression function g does not need to be justified in any scientific way! Supervised learning is used to find g.

4 Neural Network (Matlab) => e1=-0.13889, e2=0.090147. RMSE=0.01779. Too many input parameters. Many parameters are nothing more than noise. Slow training. Result is not very good. Remedy: reduce the number of parameters; make the parameters “more meaningful”.
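
The RMSE scores quoted on these slides can be read with a sketch like the following. It assumes the error is pooled over both ellipticity components of every galaxy, which is an assumption rather than something the slides state. The function name `rmse` and the second sample pair are hypothetical; only the first target pair comes from the slides.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error pooled over all (e1, e2) components.

    Pooling both components into one score is an assumption here.
    """
    squared = [(p - a) ** 2
               for pred, act in zip(predicted, actual)
               for p, a in zip(pred, act)]
    return math.sqrt(sum(squared) / len(squared))

# First target pair is the example from the slides; the rest is made up.
actual = [(-0.13889, 0.090147), (0.04, -0.03)]
predicted = [(-0.12, 0.10), (0.05, -0.02)]
score = rmse(predicted, actual)
```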

5 Principal components are used to reduce the number of input parameters. Neural network with PCs as inputs: RMSE~0.0155.
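
A minimal sketch of the dimensionality-reduction step, assuming each image is flattened to a row vector and the principal components are taken from an SVD of the mean-centered data. The data here is random stand-in data, the sizes are hypothetical, and a linear least-squares fit stands in for the neural network purely to keep the example short; none of this is the author's Matlab code.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return (mean, components); components are unit-length rows."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Coordinates of each row of X in the principal-component basis."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
n_images, side, n_components = 200, 8, 5       # hypothetical sizes
X = rng.normal(size=(n_images, side * side))   # stand-in for flattened images
y = rng.normal(size=(n_images, 2))             # stand-in for (e1, e2) targets

mean, components = fit_pca(X, n_components)
scores = project(X, mean, components)          # shape (n_images, n_components)

# Linear least squares (with intercept) stands in for the neural network.
A = np.hstack([scores, np.ones((n_images, 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
```

The regression model now sees `n_components` inputs per image instead of one input per pixel, which is the point of the slide.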

6 Implicit use of additional information about the data set: the 2D matrices are images of objects, and objects have a meaningful center. Calculate the center of mass with a threshold. Center the pictures using spline interpolation. Recalculate the principal components. Fine-tune the center position using the amplitude of the antisymmetrical components. (Figure panels: Original, Centered.)
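
The first centering step can be sketched as below, assuming "center of mass with threshold" means that only pixels brighter than a cutoff contribute. The function names, the threshold value, and the toy image are hypothetical. The sketch stops at computing the required shift; the sub-pixel resampling itself (spline interpolation in the talk) is not shown.

```python
import numpy as np

def thresholded_center_of_mass(image, threshold):
    """Center of mass (row, col) using only pixels above the threshold."""
    weights = np.where(image > threshold, image, 0.0)
    total = weights.sum()
    if total == 0:
        raise ValueError("no pixels above threshold")
    rows, cols = np.indices(image.shape)
    return (rows * weights).sum() / total, (cols * weights).sum() / total

def offset_from_center(image, threshold):
    """Shift (d_row, d_col) that would move the centroid to the image center."""
    cy, cx = thresholded_center_of_mass(image, threshold)
    return (image.shape[0] - 1) / 2 - cy, (image.shape[1] - 1) / 2 - cx

image = np.zeros((9, 9))
image[2:5, 5:8] = 1.0     # bright 3x3 blob centered at row 3, col 6
image += 0.01             # faint uniform background, below the threshold
dy, dx = offset_from_center(image, threshold=0.5)
```

One way to apply the resulting shift with spline interpolation is `scipy.ndimage.shift(image, (dy, dx), order=3)`; whether that matches the interpolation used in the talk is not known.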

7 Principal components after center recalculation

8 Principal components: stars

9 Components #2 and #3. Linear regression using only components 2 and 3 => RMSE~0.02. Plot colors: 2theta in one panel, (a-b)/(a+b) in the other. e1 = [(a-b)/(a+b)] cos(2theta), e2 = [(a-b)/(a+b)] sin(2theta).
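
The two formulas on this slide translate directly into code. The sketch below assumes a and b are the major and minor axes of the ellipse and theta is its orientation angle in radians; the slide does not define them, so that reading is an assumption, and the function names are hypothetical.

```python
import math

def ellipticity(a, b, theta):
    """(e1, e2) from axes a >= b > 0 and orientation theta in radians."""
    magnitude = (a - b) / (a + b)
    return magnitude * math.cos(2 * theta), magnitude * math.sin(2 * theta)

def shape_from_ellipticity(e1, e2):
    """Invert the formulas: returns (axis ratio b/a, theta)."""
    magnitude = math.hypot(e1, e2)
    theta = 0.5 * math.atan2(e2, e1)
    return (1 - magnitude) / (1 + magnitude), theta

e1, e2 = ellipticity(a=2.0, b=1.0, theta=math.pi / 8)
ratio, theta = shape_from_ellipticity(e1, e2)
```

Because the angle enters as 2theta, rotating the ellipse by 180 degrees leaves (e1, e2) unchanged, and the inverse recovers theta only modulo pi.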

10 Neural network
Inputs: 38 (galaxy PCs) + 8 (star PCs).
2 hidden layers: 12 neurons (linear transfer function) and 8 neurons (sigmoid transfer function).
2 outputs: e1 and e2 as targets.
80% random training subset, 20% validation subset.
Multiple trainings, with numerous networks achieving training RMSE < 0.015.
Typical test RMSE = 0.01517-0.0152.
Small score improvement from combining the predictions of many networks (simple mean): training RMSE ~0.0149, public RMSE ~0.01505-0.01509, private RMSE ~0.01512-0.01516.
Benefit of network combination is ~0.00007-0.0001.
Best submission: mean of 35 NN predictions.
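
The layer sizes and the simple-mean combination described on this slide can be sketched as follows. This is only a shape-level illustration: the weights are random and untrained, the input data is random stand-in data, the logistic function is assumed for the "sigmoid" layer, and a linear output layer is assumed as well. All names are hypothetical and nothing here reproduces the author's Matlab training.

```python
import numpy as np

# 38 galaxy PCs + 8 star PCs -> 12 linear -> 8 sigmoid -> 2 outputs (e1, e2)
N_INPUTS, N_HIDDEN1, N_HIDDEN2, N_OUTPUTS = 46, 12, 8, 2

def init_network(rng):
    """Random (untrained) weights and zero biases for each layer."""
    sizes = [N_INPUTS, N_HIDDEN1, N_HIDDEN2, N_OUTPUTS]
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, X):
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = X @ w1 + b1              # first hidden layer: linear transfer
    h2 = sigmoid(h1 @ w2 + b2)    # second hidden layer: sigmoid (logistic assumed)
    return h2 @ w3 + b3           # output layer: linear (assumption)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, N_INPUTS))             # stand-in for PC inputs
networks = [init_network(rng) for _ in range(35)]
predictions = np.stack([forward(p, X) for p in networks])  # (35, 100, 2)
ensemble = predictions.mean(axis=0)              # simple mean over networks
```

In practice each network would be trained on its own random 80/20 split before averaging; the mean then smooths out the differences between individual training runs.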

11 Training set: std=0.01499. Test set: std=0.01518.

12
                    Training RMSE   Test RMSE
Original            0.01499         0.01518
7 bit resolution    0.01503         0.01522
6 bit resolution    0.01513         0.01532
5 bit resolution    0.01551         0.01574
4 bit resolution    0.01696         0.01718
Pix size 2          0.01546         0.01571
+0.5 noise          0.01684         0.01706
+1.0 noise          0.02120         0.02152
+1.5 noise          0.02873         0.02916
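
The slide does not say how the degraded inputs were produced. The sketch below shows one plausible reading of each row, and every choice in it is an assumption: "n bit resolution" as requantizing pixel values to 2**n levels, "Pix size 2" as averaging 2x2 pixel blocks, and "+x noise" as adding Gaussian noise with standard deviation x in unspecified units. All function names are hypothetical.

```python
import numpy as np

def quantize(image, bits):
    """Requantize to 2**bits evenly spaced levels over the image's range."""
    lo, hi = image.min(), image.max()
    if hi == lo:
        return image.copy()
    steps = 2 ** bits - 1
    return np.round((image - lo) / (hi - lo) * steps) / steps * (hi - lo) + lo

def bin_pixels(image, factor):
    """Average non-overlapping factor x factor blocks (edges trimmed)."""
    h, w = image.shape
    h2, w2 = h - h % factor, w - w % factor
    trimmed = image[:h2, :w2]
    blocks = trimmed.reshape(h2 // factor, factor, w2 // factor, factor)
    return blocks.mean(axis=(1, 3))

def add_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return image + rng.normal(scale=sigma, size=image.shape)

rng = np.random.default_rng(0)
image = np.arange(16, dtype=float).reshape(4, 4)   # toy image
coarse = quantize(image, bits=2)
binned = bin_pixels(image, factor=2)
noisy = add_noise(image, sigma=0.5, rng=rng)
```

Each degraded image set would then go through the same centering, PCA, and network pipeline to produce one row of the table.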

13 Questions: The method is strongly data dependent. How will the method perform on a more diverse data set and on real data? Is there a place for this kind of method in cosmology?

