This project aims to identify salt deposits in seismic images using a 2-stage Res-Unet model with data augmentation, cross-validation, and voting. It achieved an Intersection over Union (IoU) of 0.839 (0.858), ranking in the top 10%. Lessons learned and potential improvements are discussed.
TGS Salt Identification Challenge, by Zheming Wang. Coursework for CS 539, Fall 2018.
Introduction Salt deposits below the surface are a major obstacle for oil and gas drillers, and detecting them still requires expert human interpretation. The goal of this project is to identify the salt region in a given seismic image, as follows:
Data and method The data comes from the TGS challenge on Kaggle. Each image is 101×101 pixels, and the train and test sets contain 4,000 and 18,000 images respectively. The toolkit I use is Keras with TensorFlow as the backend, in Python. My code is mainly based on public kernels on Kaggle: I kept the metric and data-transformation parts unchanged, revised the model structure, and wrote some code to make the training resumable and automatic on Euler.
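The data augmentation mentioned in the summary can be illustrated with a minimal sketch. The exact transforms used by the referenced Kaggle kernels are not stated here; horizontal flipping is assumed as a representative example, since it is a common choice for this dataset:

```python
import numpy as np

def augment_flip(images, masks):
    """Double the training set by appending left-right mirrored copies.

    images, masks: arrays of shape (N, H, W). A minimal sketch; the
    actual kernels may apply more transforms than flipping.
    """
    flipped_images = images[:, :, ::-1]  # mirror each image horizontally
    flipped_masks = masks[:, :, ::-1]    # mirror the masks the same way
    return (np.concatenate([images, flipped_images], axis=0),
            np.concatenate([masks, flipped_masks], axis=0))
```

Flipping the mask together with the image keeps each pixel's salt/no-salt label aligned with its mirrored position.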
Basic Summary This is a typical image segmentation problem with a specific metric, Intersection over Union (IoU). In this project, I start from a baseline U-net model with an IoU of 0.675 (0.700), ranking at 81%. After trying different methods and doing a lot of fine-tuning, I arrive at a 2-stage Res-Unet model with data augmentation, cross-validation, and voting, which obtains an IoU of 0.839 (0.858), ranking in the top 10%.
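The core of the metric can be sketched as plain IoU between two binary masks (the actual competition score averages precision over several IoU thresholds, which is omitted here for brevity):

```python
import numpy as np

def iou(pred, truth):
    """Intersection over Union between two binary salt masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    inter = np.logical_and(pred, truth).sum()
    return inter / union
```

For example, a prediction covering two pixels that overlaps the ground truth in one pixel gives IoU = 1/2.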
Result 1: model choosing General settings (may differ a little per model):
• Batch size: 32
• Epochs: 200
• Early stop patience: 10
• LR schedule: 1e-2 / 8 / 0.5 / 1e-5
• Monitor: val_iou
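The compact "1e-2 / 8 / 0.5 / 1e-5" notation plausibly reads as: initial learning rate 1e-2, reduce by factor 0.5 after 8 epochs without improvement in the monitored metric, with a floor of 1e-5 (this matches the behavior of Keras's ReduceLROnPlateau callback, though the mapping of the four numbers is an assumption). A small replay of that logic in plain Python:

```python
def plateau_lr(history, lr=1e-2, factor=0.5, patience=8, min_lr=1e-5):
    """Replay a per-epoch metric history (higher is better, e.g. val_iou)
    and return the learning rate after plateau-based decay.

    Assumed mapping of the slide's "1e-2/8/0.5/1e-5": start lr, patience,
    decay factor, minimum lr.
    """
    best = float("-inf")
    wait = 0
    for metric in history:
        if metric > best:
            best, wait = metric, 0   # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:     # plateau: decay lr, but not below floor
                lr = max(lr * factor, min_lr)
                wait = 0
    return lr
```

With 8 stagnant epochs after one good one, the rate halves from 1e-2 to 5e-3; a steadily improving run keeps the initial rate.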
Lessons Learned in Project 1. Keras has some problems when resuming a saved model. After reading the source code, I wrote a patch for it: https://github.com/wzmJimmy/MyKerasPatch 2. Parallelism did not accelerate single-model training. I found this when trying Keras's multi-GPU class; instead, I switched to training multiple cross-validation models at the same time. 3. In data science, model choice is a core problem, while data preprocessing and hyper-parameter tuning can also make a great difference, but they require lots of experience and trials.
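The cross-validation models trained in parallel are combined by voting, as mentioned in the summary. The exact scheme is not specified; a common form is soft voting, i.e. averaging the per-fold probability maps and thresholding, sketched here as an assumption:

```python
import numpy as np

def vote(fold_probs, threshold=0.5):
    """Combine per-fold salt-probability maps into one binary mask.

    fold_probs: array of shape (n_folds, H, W) with values in [0, 1].
    Soft voting (a common choice, assumed here): average the fold
    outputs pixel-wise, then threshold.
    """
    mean = fold_probs.mean(axis=0)           # average over folds
    return (mean > threshold).astype(np.uint8)
```

A pixel is marked as salt only when the folds agree strongly enough, which tends to smooth out single-model mistakes.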
Further improvement • Other techniques: • Transfer learning: using another network as an encoder. • Optimizer choice: Adam, momentum, plain SGD. • Feature engineering: first classify each image as zero-salt or some-salt. • ……