Chaiwoot Boonyasiriwat Feb. 6, 2009

Multiscale Waveform Inversion and High-Performance Computing using Graphics Processing Units (GPU)

Presentation Transcript


  1. Multiscale Waveform Inversion and High-Performance Computing using Graphics Processing Units (GPU) • Chaiwoot Boonyasiriwat • Feb. 6, 2009

  2. Part I. Multiscale Waveform Inversion: A Blind Test on a Synthetic Dataset

  3. Outline • Previous Results on Marine and Land Data • Goals • Methods and Data Processing • Numerical Results • Summary

  4. Gulf of Mexico Data • 480 hydrophones • 515 shots • dt = 2 ms • Tmax = 10 s • 12.5 m

  5. Kirchhoff Migration Images

  7. Comparing CIGs

  8. Comparing CIGs • CIG from Waveform Tomogram • CIG from Traveltime Tomogram

  13. Saudi Arabia Land Survey • 1279 common shot gathers (CSGs), 240 traces/gather • 30 m station interval, max. offset = 3.6 km • Line length = 46 km • Pick 246,000 traveltimes • Traveltime tomography -> V(x,y,z) (figures: survey map, X-Coord. 0 to 50 km vs. Y-Coord. in km, with 1.6 km and 100 m labels; shot gather, Offset -3.6 to 3.6 km vs. Time 0 to 2 s)

  14. Brute Stack Section (figure: CDP 3920 to 5070 vs. Time 0 to 2.0 s)

  15. Traveltime Tomostatics + Stacking (figure: CDP 3920 to 5070 vs. Time 0 to 2.0 s)

  16. Waveform Tomostatics + Stacking (figure: CDP 3920 to 5070 vs. Time 0 to 2.0 s)

  17. Outline • Previous Results on Marine and Land Data • Goals • Methods and Data Processing • Numerical Results • Summary

  18. Goals • Blind test • Sensitivity test: unknown source wavelet, unknown forward modeling

  19. Outline • Previous Results on Marine and Land Data • Goals • Methods and Data Processing • Numerical Results • Summary

  20. Methods and Data Processing • Low-pass filtering (2 Hz, 5 Hz) • Source estimation • Waveform inversion • Traveltime tomography (time picking: Shengdong)

  21. Outline • Previous Results on Marine and Land Data • Goals • Methods and Data Processing • Numerical Results • Summary

  22. Original CSG (figure: Offset 0 to 5 km vs. Time 0 to 5 s)

  23. Numerical Results • Kirchhoff Migration Image overlaid with Traveltime Tomogram (figure: Location 0 to 10 km vs. Depth 0 to 1 km)

  24. Numerical Results • Kirchhoff Migration Image overlaid with Waveform Tomogram (figure: Location 0 to 10 km vs. Depth 0 to 1 km)

  25. Results • Common Image Gathers obtained using Waveform Tomogram (figure: Location 0 to 10 km, Offset 0 to 0.5 km, Depth 0 to 1 km)

  26. Waveform Tomogram vs. True Velocity (figure, two panels: Waveform Tomogram, True Velocity; Location 0 to 10 km vs. Depth 0 to 1 km)

  27. Investigation I (figure, two panels: True Model, Waveform Tomogram using My Data; Location 0 to 10 km vs. Depth 0 to 0.5 km; color scale 1000 to 3000 m/s)

  28. Investigation II (figure, two panels: True Velocity, Migration Image using Original Data; Location 0 to 10 km vs. Depth 0 to 1 km)

  29. Investigation III (figure, two panels: True Velocity, Migration Image using My Data; Location 0 to 10 km vs. Depth 0 to 1 km)

  30. Outline • Previous Results on Marine and Land Data • Goals • Methods and Data Processing • Numerical Results • Summary

  31. Summary • Blind test on a synthetic dataset. • Waveform inversion failed. • Need to investigate why waveform inversion failed. • Factors: source wavelet, forward modeling, velocity structure, incorrect information.

  32. Future Work • Redo the inversion with correct information. • Speed up waveform inversion.

  33. Part II. High-Performance Computing using GPUs

  34. Outline • Motivation • Introduction to Computing on GPUs • Preliminary Results • Summary

  35. Motivation: Peak Performance (chart: peak GFLOP/s, axis 0 to 1000; courtesy of NVIDIA)

  36. Motivation: Memory Bandwidth (chart: bandwidth in GB/s, axis 0 to 120; courtesy of NVIDIA)

  37. Outline • Motivation • Introduction to Computing on GPUs • Preliminary Results • Summary

  38. CPU vs. GPU • GPU devotes more transistors to data processing. (figure panels: CPU, GPU; courtesy of NVIDIA)

  39. CPU vs. GPU • Large memories are slow, fast memories are small. • Thread synchronization does not work across different thread blocks. (diagrams: Conventional Storage Hierarchy with Proc, Registers, Cache, L2 Cache, L3 Cache, Memory; Host + GPU Storage Hierarchy with Host, (Device) Grid, Block (0, 0) and Block (1, 0) each holding Shared Memory plus per-thread Registers and Local Memory for Thread (0, 0) and Thread (1, 0), and Global Memory, Constant Memory, Texture Memory. Source: Mary Hall (U of Utah), NVIDIA)

  40. General-Purpose Computation on GPUs (GPGPU) • GPUs were originally designed for graphics. • High speed: useful for a variety of applications. • Potential for very high performance at low cost. • Architecture well suited for certain kinds of parallel applications (data parallel). • Demonstrations of 20-100X speedup over CPU. (Source: Mary Hall (U of Utah), GPGPU.org)

  41. Programming Model: CUDA (Compute Unified Device Architecture) • Minimal extensions to C++. • Allows kernel functions to be executed N times in parallel by N different CUDA threads. • Each thread performs roughly the same computation on a different partition of the data. • Data-parallel interface to GPUs. (Source: Mary Hall, CUDA Programming Guide CS6963)
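The data-parallel model on this slide can be mimicked on a CPU to see how each thread picks its own piece of the data. The following C sketch is an emulation, not CUDA: the struct `thread_ctx` and the loop-based `launch_scale` are hypothetical stand-ins for CUDA's built-in `blockIdx`, `blockDim`, and `threadIdx` variables and for a real kernel launch, and the array-scaling kernel is chosen only for illustration.

```c
#include <assert.h>

/* Emulated per-thread coordinates; in CUDA these are built-in variables. */
typedef struct {
    int blockIdx;   /* index of the block within the grid   */
    int blockDim;   /* number of threads per block          */
    int threadIdx;  /* index of the thread within its block */
} thread_ctx;

/* "Kernel": every logical thread scales exactly one array element. */
static void scale_kernel(thread_ctx t, float *x, float a, int n)
{
    int i = t.blockIdx * t.blockDim + t.threadIdx;  /* global index */
    if (i < n)            /* guard: the last block may extend past n */
        x[i] *= a;
}

/* Serial stand-in for launching grid_dim blocks of block_dim threads.
 * On a GPU the iterations of both loops would run in parallel. */
static void launch_scale(int grid_dim, int block_dim,
                         float *x, float a, int n)
{
    assert(grid_dim > 0 && block_dim > 0);
    for (int b = 0; b < grid_dim; b++) {
        for (int th = 0; th < block_dim; th++) {
            thread_ctx t = { b, block_dim, th };
            scale_kernel(t, x, a, n);
        }
    }
}
```

To cover `n` elements with blocks of `block_dim` threads, the grid needs `(n + block_dim - 1) / block_dim` blocks; the `i < n` guard keeps the surplus threads of the last block from touching memory outside the array.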

  42. Outline • Motivation • Introduction to Computing on GPU • Preliminary Results • Summary

  43. Preliminary Results • Modeling Test: speedup factor of 20x using 1536 threads. • Migration Test: N/A (thread synchronization problem) • Inversion Test: N/A

  44. Forward Modeling Test • NX = 1536 = Nthreads • NZ = 373 (figure: velocity model, Horizontal Location 0 to 15 km vs. Depth 0 to 3.5 km; color scale 1500 to 4000 m/s)

  45. Forward Modeling Test (figure, two panels: CSG from CPU, CSG from GPU; Offset 0 to 15 km vs. Time 0 to 6 s)

  46. Conventional C Code for (iz=2; iz<nz-2; iz++) { for (ix=2; ix<nx-2; ix++) { indx = ix+iz*nx; P2[indx] = (2.0+2.0*C1*alpha)*P1[indx] - P0[indx] + alpha*(C2*(P1[indx-1] +P1[indx+1]+P1[indx-nx]+P1[indx+nx]) +C3*(P1[indx-2]+P1[indx+2] +P1[indx-2*nx]+P1[indx+2*nx])); } } 13

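  47. CUDA Code
      /* global column index; equals threadIdx.x when a single block is used */
      ix = blockIdx.x*blockDim.x + threadIdx.x;
      if (ix >= 2 && ix < nx-2) {   /* skip edge columns, as in the C loop */
        for (iz=2; iz<nz-2; iz++) {
          indx = ix+iz*nx;
          P2[indx] = (2.0+2.0*C1*alpha)*P1[indx] - P0[indx]
                   + alpha*(C2*(P1[indx-1]+P1[indx+1]+P1[indx-nx]+P1[indx+nx])
                           +C3*(P1[indx-2]+P1[indx+2]+P1[indx-2*nx]+P1[indx+2*nx]));
        }
      }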
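The CUDA fragment assigns one thread to each grid column, which matches NX = 1536 = Nthreads on the forward-modeling slide. The sketch below emulates that decomposition in plain C so it can be checked on a CPU; `column_kernel`, `launch_step`, and `run_steps` are hypothetical names, the stencil weights are assumed fourth-order values, and the edge-column guard mirrors the bounds of the serial loop. Because threads in different blocks cannot synchronize inside a kernel, the emulation finishes every column of one time step before the next step begins, which is what issuing one kernel launch per time step would achieve on a GPU.

```c
#include <assert.h>

/* Assumed fourth-order Laplacian weights (not stated on the slides). */
#define C1 (-5.0f / 2.0f)
#define C2 ( 4.0f / 3.0f)
#define C3 (-1.0f / 12.0f)

/* Work of one logical thread: update column ix at every interior depth. */
static void column_kernel(int ix, const float *P0, const float *P1,
                          float *P2, int nx, int nz, float alpha)
{
    if (ix < 2 || ix >= nx - 2)
        return;                          /* edge columns stay untouched */
    for (int iz = 2; iz < nz - 2; iz++) {
        int indx = ix + iz * nx;
        P2[indx] = (2.0f + 2.0f * C1 * alpha) * P1[indx] - P0[indx]
            + alpha * (C2 * (P1[indx - 1] + P1[indx + 1]
                           + P1[indx - nx] + P1[indx + nx])
                     + C3 * (P1[indx - 2] + P1[indx + 2]
                           + P1[indx - 2 * nx] + P1[indx + 2 * nx]));
    }
}

/* Serial stand-in for one kernel launch with nx threads. Columns are
 * independent within a step: each reads P0 and P1 and writes only P2. */
static void launch_step(const float *P0, const float *P1, float *P2,
                        int nx, int nz, float alpha)
{
    assert(nx > 4 && nz > 4);
    for (int ix = 0; ix < nx; ix++)
        column_kernel(ix, P0, P1, P2, nx, nz, alpha);
}

/* Advance nt time steps, rotating the three buffers after each "launch".
 * Returns a pointer to the buffer holding the newest wavefield. */
static float *run_steps(float *P0, float *P1, float *P2,
                        int nx, int nz, float alpha, int nt)
{
    for (int it = 0; it < nt; it++) {
        launch_step(P0, P1, P2, nx, nz, alpha);  /* completes before next step */
        float *tmp = P0;
        P0 = P1;
        P1 = P2;
        P2 = tmp;
    }
    return P1;
}
```

Each column reads neighbouring columns of `P1` but writes only its own entries of `P2`, so no synchronization is needed within a step; the only ordering requirement is between consecutive time steps.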

  48. Outline • Motivation • Introduction to Computing on GPU • Preliminary Results • Summary

  49. Summary • The GPU is a cheap, high-performance processor. • CUDA makes it possible to program GPUs, although the learning curve is steep. • The current timing result is very promising. • A better understanding of GPU/CUDA will improve performance in the future.

  50. Future • Develop CUDA-based codes for FD forward modeling, RTM, and waveform inversion • Release codes some time in the Fall of 2009
