Antoine Guitton, Geophysics Department, CSM

1 Statistical identification of faults in 3D seismic volumes using a machine learning approach
Antoine Guitton, Geophysics Department, CSM Hua Wang, Computer Science Department, CSM Whitney Trainor-Guitton, Geophysics Department, CSM

2 3D Field data example: flat cube view
[Figure: flat cube view showing a time slice, an inline section, and a crossline (xline) section]

3 3D Field data example: flat cube view
[Figure: fault locations with Hale's approach vs. fault locations with the MLA; color scale ≈P(faults), 0 to 1]

4 Why MLA for faults?
- MLAs can work well for interpretation tasks where experience/training matters
- MLAs can work in any dimension
- Codes yield reliable, reproducible, unbiased estimates
- MLAs can be constantly improved (learn from mistakes, more data, or more features)
- Fault imaging IS a classification problem
- Tons of libraries are available

5 Machine learning in a nutshell
[Diagram: matrix of feature vectors, classifier, outcome vector]

8 Features - Predictors
- Histograms of Oriented Gradients (HOG), 2D only
- Scale Invariant Feature Transform (SIFT)
- Seismic amplitudes
Software available at:

9 Features - Predictors Histogram of Oriented Gradients (HOG):
Build local histograms of gradient orientation

10 Features - Predictors Scale Invariant Feature Transform (SIFT):
Identify features independent of scale, noise, illumination, orientation

11 Machine learning in a nutshell
[Diagram: matrix of feature vectors, classifier, outcome vector]

12 Training data - Outcome vector
Labels built with Hale's Java toolkit. [Figure: fault labels; color scale ≈P(faults), 0 to 1]

15 Building outcome vector with patching
[Figure: panels (a)-(c), scattering the fault-label image into patches with per-patch labels y1=1, y3=0, y7=0, y9=0]

16 Building outcome vector with patching
[Figure: panels (a)-(c), merging patches 1-4 with labels y1=1, y3=0, y7=0, y9=0 back into a single label image]

17 Building data with patching
[Figure: panels (a)-(c), scattering the seismic data into patches; the HOG/SIFT features of patches 1-9 form the rows X1, X2, X3, ..., X9 of the feature matrix X]
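A minimal sketch of this patching step in Python (NumPy only; the patch size, stride, and toy seismic/label arrays are illustrative assumptions, not the deck's actual parameters):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Stand-ins for a 2D seismic section and its fault-label mask.
seismic = np.random.rand(64, 64)
fault_mask = np.zeros((64, 64))
fault_mask[:, 30] = 1.0  # one synthetic vertical "fault"

patch, stride = 16, 16  # non-overlapping 16x16 patches (illustrative)
patches = sliding_window_view(seismic, (patch, patch))[::stride, ::stride]
masks = sliding_window_view(fault_mask, (patch, patch))[::stride, ::stride]

# One row per patch; a patch is labeled 1 if it contains any fault pixel.
X_patches = patches.reshape(-1, patch, patch)   # later fed to HOG/SIFT
y = (masks.reshape(-1, patch, patch).max(axis=(1, 2)) > 0).astype(int)
print(X_patches.shape, int(y.sum()), "fault patches")
```

Reassembling a predicted fault image is the inverse: place each patch's predicted label back at its patch location.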

18 Fault Imaging Workflow
1. Build fault labels
2. Patch the class labels and seismic data
3. Compute HOG and SIFT features for each patch
4. Train an SVM classifier on the training set
5. Predict on the test set
6. Reassemble the labeled data from the test set to form a fault image
7. Analyze the result (confusion matrix)
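The training/prediction steps of the workflow above can be sketched with scikit-learn; everything here (array shapes, labels, parameter values) is a toy stand-in, not the study's actual data:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in feature matrix (one row of patch features) and fault labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))
y = (X[:, 0] > 0).astype(int)  # toy "fault / no fault" labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf")        # Gaussian-kernel SVM, as in the deck
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)   # per-patch fault predictions
acc = clf.score(X_test, y_test)
print(f"test accuracy: {acc:.2f}")
```

The predicted patch labels would then be reassembled into a fault image and compared against the held-out labels via a confusion matrix.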

20 Example: Synthetic test
Training with 40,000 feature vectors (10,000 held out for cross-validation of the SVM parameters); testing with 200,000 vectors
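Cross-validating the SVM parameters, as mentioned above, might look like this with scikit-learn's grid search (the grid values and toy data are assumptions):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X_cv = rng.normal(size=(200, 16))      # stand-in for the held-out vectors
y_cv = (X_cv[:, 0] > 0).astype(int)

# Search over the Gaussian-kernel SVM's two parameters, C and gamma.
search = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1.0, 10.0], "gamma": [1e-3, 1e-2, 1e-1]},
    cv=5,                              # 5-fold cross-validation
)
search.fit(X_cv, y_cv)
print(search.best_params_)
```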

21 Test data: seismic volume; fault locations estimated with Hale's library. [Figure; color scale ≈P(faults), 0 to 1]

22 Test data: "true" fault locations after patching, estimated with Hale's library. [Figure; color scale ≈P(faults), 0 to 1]

23 Building outcome vector with patching. [Figure: merging patches 1-4 with labels y1=1, y3=0, y7=0, y9=0 back into a single label image]

24 Synthetic example 4: Train the MLA (SIFT+HOG) with 768 predictors: 91.5% accurate
True -: 30137 (75.3%)   False -: 2551 (6.4%)
False +: 856 (2.1%)     True +: 6456 (16.1%)
Precision: 88.3% (error: 11.7%)
Overall fit: 91.5% (error: 8.5%)

25 Synthetic example 5&6: Apply on test data (SIFT+HOG) and reassemble. [Figure: "true" fault locations vs. predicted fault locations; color scale ≈P(faults), 0 to 1]

26 Synthetic example 7: Analyze results (SIFT+HOG): 91.5% accurate
True -: 152720 (76.4%)   False -: 10372 (5.2%)
False +: 6652 (3.3%)     True +: 30246 (15.1%)
Precision: 81.9% (error: 18.1%)
Overall fit: 91.5% (error: 8.5%)
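The summary numbers on this slide follow directly from its four confusion-matrix counts; recomputing them:

```python
# Confusion-matrix counts from the slide (test set, SIFT+HOG).
tn, fn = 152720, 10372   # true negatives, false negatives
fp, tp = 6652, 30246     # false positives, true positives

total = tn + fn + fp + tp
accuracy = (tp + tn) / total    # the slide's "overall fit"
precision = tp / (tp + fp)      # fraction of predicted faults that are real
recall = tp / (tp + fn)         # fraction of real faults that are found

print(f"accuracy  {accuracy:.1%}")   # -> 91.5%
print(f"precision {precision:.1%}")  # -> 82.0% (81.9% on the slide)
print(f"recall    {recall:.1%}")
```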

28 Synthetic example 5&6: Apply on test data (HOG only) and reassemble. [Figure: "true" fault locations vs. predicted fault locations; color scale ≈P(faults), 0 to 1]

29 Field data example: 128,000 training samples; 147,000 test samples

30 Field data example. [Figure: "true" fault locations and test data; color scale ≈P(faults), 0 to 1]

31 Field data example: mislabeling is present in both the training and the test data
- Faults are labeled as non-faults
- Non-faults are labeled as faults
- Training is done with mislabeled data
- Comparison with the labels of the test data is hard

32 Field data example 4: Train the MLA (HOG+SIFT): 86.5% accurate
True -: 100842 (78.3%)   False -: 16757 (13.0%)
False +: 584 (0.5%)      True +: 10568 (8.2%)
Precision: 94.8% (error: 5.2%)
Overall fit: 86.5% (error: 13.5%)

33 Field data example 5&6: Apply on test data (HOG+SIFT) and reassemble. [Figure: "true" fault locations vs. predicted fault locations with smoothing; color scale ≈P(faults), 0 to 1]

34 Field data example 5: Analyze results (HOG+SIFT): 77.4% accurate
True -: 108401 (73.7%)   False -: 28857 (19.6%)
False +: 4298 (2.9%)     True +: 5444 (3.7%)
Precision: 55.9% (error: 44.1%)
Overall fit: 77.4% (error: 22.6%)

35 Field data example 4: Apply on test data (HOG+SIFT). [Figure: "true" fault locations vs. predicted fault locations with smoothing; color scale ≈P(faults), 0 to 1]

36 Field data example 4: Apply on test data (HOG only). [Figure: "true" fault locations vs. predicted fault locations with smoothing; color scale ≈P(faults), 0 to 1]

39 Conclusions
First successful application of an MLA to fault imaging of field data
The supervised-learning workflow includes:
- Fault labeling
- Patching
- Building feature sets (HOG, SIFT)
- Training an SVM classifier with Gaussian kernels
Possible improvements:
- Train on more data
- Include more/better features (3D, smoothing, LASSO, etc.)
- Use other classifiers (trees, neural nets/deep learning, etc.)
- Use semi-supervised learning

40 Thank you! Questions? [Figure: fault locations with Hale's approach vs. fault locations with the MLA]

43 Synthetic example 4: Train the MLA (SIFT) with 128 predictors: 84.7% accurate
True -: 29279 (73.2%)   False -: 4420 (11.1%)
False +: 1714 (4.3%)    True +: 4587 (11.5%)
Precision: 72.8% (error: 27.2%)
Overall fit: 84.7% (error: 15.3%)

44 Synthetic example 4: Apply on test data (SIFT only). [Figure: "true" fault locations vs. predicted fault locations; color scale ≈P(faults), 0 to 1]

45 Synthetic example 5: Analyze results (SIFT): 84.3% accurate
True -: 146334 (73.2%)   False -: 18307 (9.2%)
False +: 13048 (6.5%)    True +: 22311 (11.2%)
Precision: 63.1% (error: 36.9%)
Overall fit: 84.3% (error: 15.7%)

46 Features: Histograms of Oriented Gradients (HOGs) are used for object recognition. They are computed as follows:
1. Compute gradients in the x and z directions at all pixel locations
2. Compute unsigned/signed angles at all pixel locations
3. Compute histograms of the angles within cells (1 cell = 8x8 pixels, for instance)
4. Normalize across blocks of cells (1 block = 2x2 cells)
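The four steps can be sketched directly in NumPy. This is a simplified stand-in, not the talk's implementation: it uses unsigned angles, magnitude-weighted per-cell histograms, and a single global normalization in place of true per-block normalization:

```python
import numpy as np

def hog_features(img, cell=8, bins=9):
    # 1. gradients in the x and z directions
    gz, gx = np.gradient(img.astype(float))
    # 2. unsigned angles (0-180 deg) and gradient magnitudes
    ang = np.rad2deg(np.arctan2(gz, gx)) % 180.0
    mag = np.hypot(gx, gz)
    # 3. per-cell histograms of angle, weighted by gradient magnitude
    hists = []
    for i in range(0, img.shape[0] - cell + 1, cell):
        for j in range(0, img.shape[1] - cell + 1, cell):
            h, _ = np.histogram(
                ang[i:i + cell, j:j + cell],
                bins=bins, range=(0.0, 180.0),
                weights=mag[i:i + cell, j:j + cell],
            )
            hists.append(h)
    feat = np.concatenate(hists).astype(float)
    # 4. normalization (global L2 here, instead of per 2x2-cell block)
    return feat / (np.linalg.norm(feat) + 1e-9)

feat = hog_features(np.random.rand(32, 32))
print(feat.shape)  # 16 cells x 9 bins per cell
```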

47 Synthetic example 4: Train the MLA (HOG) with 740 predictors: 99.4% accurate
True -: 30991 (77.5%)   False -: 245 (0.6%)
False +: 2 (0.0%)       True +: 8762 (21.9%)
Precision: 100.0% (error: 0.0%)
Overall fit: 99.4% (error: 0.6%)

48 Synthetic example 5&6: Apply on test data (HOG only) and reassemble. [Figure: "true" fault locations vs. predicted fault locations]

49 Synthetic example 7: Analyze results (HOG): 92.1% accurate
True -: 150583 (75.3%)   False -: 7091 (3.5%)
False +: 8799 (4.4%)     True +: 33527 (16.8%)
Precision: 79.2% (error: 20.8%)
Overall fit: 92.1% (error: 7.9%)

51 Features: Scale Invariant Feature Transform (SIFT) process:
1. Identify keypoints (involves scaling/blurring/DoG/extrema identification)
2. Around each keypoint, build a histogram of gradients (8 bins) for 16 blocks (1 block is 4x4 cells)

52 Features: Scale Invariant Feature Transform (SIFT) process:
Each keypoint has 128 features (8 bins x 16 blocks)
Problem: the number of keypoints varies from image to image
Solution: cluster the SIFT features and build a histogram of cluster assignments for each image (some histograms might be all zeros)
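The clustering fix described above is the classic bag-of-visual-words construction; a sketch with scikit-learn's KMeans (the descriptor counts and vocabulary size k are toy assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
# Each image yields a different number of 128-dim SIFT descriptors.
desc_per_image = [rng.normal(size=(n, 128)) for n in (5, 12, 8)]

k = 4  # visual-vocabulary size (toy value)
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
kmeans.fit(np.vstack(desc_per_image))  # cluster descriptors from all images

def bow_histogram(desc):
    # Fixed-length image feature: counts of descriptors per cluster.
    return np.bincount(kmeans.predict(desc), minlength=k)

hists = [bow_histogram(d) for d in desc_per_image]
print([int(h.sum()) for h in hists])  # one count per descriptor, per image
```

Each image now gets a fixed-length (k-dimensional) feature vector regardless of how many keypoints it produced.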

53 Field data example 4: Train the MLA (HOG): 100% accurate: overfit!
True -: 101426 (78.8%)   False -: 7 (0.0%)
False +: 0 (0.0%)        True +: 27318 (21.2%)
Precision: 100.0% (error: 0.0%)
Overall fit: 100.0% (error: 0.0%)

54 Field data example 5&6: Apply on test data (HOG only) and reassemble. [Figure: "true" fault locations vs. predicted fault locations with smoothing; color scale ≈P(faults), 0 to 1]

55 Field data example 7: Analyze results (HOG): 77% accurate
True -: 105801 (72.0%)   False -: 26866 (18.3%)
False +: 6898 (4.7%)     True +: 7435 (5.1%)
Precision: 51.9% (error: 48.1%)
Overall fit: 77.0% (error: 23.0%)

57 Features - Predictors

