
Current Target: Transcurrent Plate Boundary




Presentation Transcript


  1. 3rd ACES WG mtg. 2003/06/06 Brisbane. Current Target: Transcurrent Plate Boundary
     • Preliminary study
     • e.g. San Andreas Faults, CA.

  2. San Andreas Faults, CA. Transcurrent plate boundaries > 1,000 km (US Geological Survey)

  3. Problem Configuration
     • Double fault patches for the initial condition.
     • 150 to 1,200 km (length) × 45 km (depth).
     • 705 to 6,000 parameters.
     • Plate motion: 50 mm/yr.
     • Earth Simulator (1-64 PEs).

  4. Overview
     • Hashimoto code
     • Simulation of tectonic stress accumulation at transcurrent plate boundaries
     • Boundary integral method
     • Fault length: 150 km to 1,200 km

  5. Parallel Matrix Assembling for Linear Equations: MRQCOF

     Original

         do ip= 1, PETOT
           is= (ip-1)*gN
           if (iflagM.eq.1) then
             do j= 1, gN
               wt= dydamatP(is+j)*sig2imatP(ip)
         !CDIR NODEP
               do k= 1, gM
                 k1= gMTBL(k)
                 gA2(j,k)= gA2(j,k) + wt*dydamatP(is+k1)
               enddo
               gB2(j)= gB2(j) + dymatP(ip)*wt
             enddo
           endif
           chisq= chisq + dymatP(ip)*dymatP(ip)*sig2imatP(ip)
         enddo

     Optimized

         if (iflagM.eq.1) then
           do ip= 1, PETOT
             is= (ip-1)*gN
             k= 1
             k1= gMTBL(k)
         !CDIR NODEP
             do j= 1, gN
               wt= dydamatP(is+j)*sig2imatP(ip)
               gA2(j,k)= gA2(j,k) + wt*dydamatP(is+k1)
               gB2(j)  = gB2(j)   + wt*dymatP(ip)
             enddo
             do k= 2, gM
               k1= gMTBL(k)
         !CDIR NODEP
               do j= 1, gN
                 wt= dydamatP(is+j)*sig2imatP(ip)
                 gA2(j,k)= gA2(j,k) + wt*dydamatP(is+k1)
               enddo
             enddo
             chisq= chisq + dymatP(ip)*dymatP(ip)*sig2imatP(ip)
           enddo
         else
         !CDIR NODEP
           do ip= 1, PETOT
             is= (ip-1)*gN
             chisq= chisq + dymatP(ip)*dymatP(ip)*sig2imatP(ip)
           enddo
         endif

     • gM = gN/PETOT
     • In the optimized version wt is recomputed for every k: ×gM additional computation for wt.
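The loop interchange between the Original and Optimized MRQCOF listings can be checked on toy data. The following is a minimal Python sketch, not the slides' Fortran: the function names, the 0-based indexing, and the `if k == 0` guard (the slide peels the first `k` iteration instead) are assumptions made for illustration. It shows that moving `j` to the innermost position repeats the `wt` computation for every `k` but accumulates the same matrix, right-hand side, and chi-square.

```python
def assemble_original(dyda, dy, sig2i, table, n):
    """Loop order of the Original listing: ip, then j, then k (k innermost)."""
    m = len(table)
    a = [[0.0] * m for _ in range(n)]
    b = [0.0] * n
    chisq = 0.0
    for ip in range(len(dy)):
        base = ip * n
        for j in range(n):
            wt = dyda[base + j] * sig2i[ip]      # computed once per (ip, j)
            for k in range(m):
                a[j][k] += wt * dyda[base + table[k]]
            b[j] += dy[ip] * wt
        chisq += dy[ip] * dy[ip] * sig2i[ip]
    return a, b, chisq


def assemble_optimized(dyda, dy, sig2i, table, n):
    """Loop order of the Optimized listing: ip, then k, then j (j innermost)."""
    m = len(table)
    a = [[0.0] * m for _ in range(n)]
    b = [0.0] * n
    chisq = 0.0
    for ip in range(len(dy)):
        base = ip * n
        for k in range(m):
            col = dyda[base + table[k]]
            for j in range(n):                   # long loop over j is innermost
                wt = dyda[base + j] * sig2i[ip]  # recomputed for every k
                a[j][k] += wt * col
                if k == 0:                       # the slide peels k=1 instead
                    b[j] += wt * dy[ip]
        chisq += dy[ip] * dy[ip] * sig2i[ip]
    return a, b, chisq
```

The point of the interchange on a vector machine is that the innermost loop then runs over `gN` elements rather than `gM = gN/PETOT`, which is consistent with the slide's remark that MRQCOF gets faster despite the extra arithmetic.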

  6. Matrix Component: FUNCS, called NDATA times at every time step

     Original

         gs_d= 0.d0
         do is= 1, stepj-1
           if (dp_d.ne.0) then
             do p= 1, ma
               do it= 0, itcnt-1
                 if ((t(it)  .le.tau(stepj)-tau(is)) .and.    &
                     (t(it+1).gt.tau(stepj)-tau(is))) then
                   gst= gss(p,it)
                   goto 111
                 endif
               enddo
               gst= gss(p,itcnt)
         111   continue
               gs_d= gs_d + aaj(p,is)*gst
             enddo
           endif
         enddo

     Optimized

         if (itflag.eq.0) then
           do is= 1, stepj-1
             do it= 0, itcnt-1
               if ((t(it)  .le.tau(stepj)-tau(is)) .and.    &
                   (t(it+1).gt.tau(stepj)-tau(is))) then
                 itCUR(is)= it
                 goto 111
               endif
             enddo
             itCUR(is)= itcnt
         111 continue
           enddo
         endif
         ...
         gs_d= 0.0d0
         if (dp_d.ne.0) then
           do is= 1, stepj-1
         !CDIR NODEP
             do p= 1, ma
               gs_d= gs_d + aaj(p,is)*gss(p,ITcur(is))
             enddo
           enddo
         endif

     • "Subroutine FUNCS" is called "NDATA" times at every time step.
     • "stepj" is the current step number, so the amount of computation in this part grows as the simulation proceeds.
     • "gst" depends only on time and on the location of the parameter point.
     • An additional array "ITCUR(is)" is defined and computed just once at every time step.
     • The remaining computation of "gs_d" is very simple and easy to optimize.
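The ITCUR idea in the FUNCS listings is to hoist the time-interval search out of the loop over parameter points. A minimal Python sketch follows, assuming 0-based indices, nested lists for `aaj` and `gss`, and hypothetical function names that do not appear in the slides. It shows that precomputing one interval index per earlier step gives the same `gs_d` as searching inside the `p` loop.

```python
def find_interval(t, lag):
    """Return it with t[it] <= lag < t[it+1], or the last index if none matches."""
    itcnt = len(t) - 1
    for it in range(itcnt):
        if t[it] <= lag < t[it + 1]:
            return it
    return itcnt


def gs_original(aaj, gss, t, tau, stepj):
    """Original pattern: the interval search is repeated for every point p."""
    total = 0.0
    for step in range(stepj):
        for p in range(len(gss)):
            it = find_interval(t, tau[stepj] - tau[step])
            total += aaj[p][step] * gss[p][it]
    return total


def build_itcur(t, tau, stepj):
    """One interval index per earlier step; it does not depend on p."""
    return [find_interval(t, tau[stepj] - tau[step]) for step in range(stepj)]


def gs_optimized(aaj, gss, itcur):
    """Optimized pattern: a plain multiply-add loop over p with a fixed index."""
    total = 0.0
    for step, it in enumerate(itcur):
        for p in range(len(gss)):
            total += aaj[p][step] * gss[p][it]
    return total
```

Because the index table depends only on the time step, it can be built once and reused by every call in that step, which is what the `itflag` test in the Optimized listing appears to guard.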

  7. Results on Earth Simulator: single PE, 15 steps for 150 km length region

     Original

     | PROG.UNIT | FREQUENCY | EXCLUSIVE TIME [sec] | (%) | AVER. TIME [msec] | MOPS | MFLOPS | V.OP RATIO | AVER. V.LEN | I-CACHE MISS | O-CACHE MISS | BANK CONF |
     |---|---|---|---|---|---|---|---|---|---|---|---|
     | mrqcof | 162 | 385.226 | 64.3 | 2377.937 | 5669.4 | 1240.2 | 98.49 | 234.1 | 4.8410 | 2.8251 | 0.1084 |
     | funcs | 1125252 | 169.319 | 28.3 | 0.150 | 839.0 | 261.9 | 77.30 | 54.8 | 1.7660 | 4.5700 | 6.4117 |
     | srcinput | 1 | 22.228 | 3.7 | 22227.847 | 171.0 | 0.9 | 1.22 | 178.8 | 3.4421 | 0.1314 | 0.0000 |
     | pgauss | 108 | 12.887 | 2.2 | 119.327 | 4260.5 | 1966.9 | 98.64 | 198.0 | 0.0922 | 0.2183 | 0.0098 |
     | quasi_static | 1 | 4.495 | 0.8 | 4494.601 | 178.6 | 4.6 | 5.15 | 250.7 | 0.3769 | 0.0692 | 0.0000 |
     | consti_parameter | 1125252 | 4.445 | 0.7 | 0.004 | 203.1 | 47.8 | 0.00 | 0.0 | 0.6829 | 0.7551 | 0.0000 |
     | mrqmin | 108 | 0.117 | 0.0 | 1.085 | 3751.8 | 2.6 | 98.33 | 234.7 | 0.0041 | 0.0025 | 0.0000 |
     | total | 2250884 | 598.717 | 100.0 | 0.266 | 3986.7 | 914.8 | 97.01 | 202.0 | 11.2052 | 8.5715 | 6.5299 |

     Optimized

     | PROG.UNIT | FREQUENCY | EXCLUSIVE TIME [sec] | (%) | AVER. TIME [msec] | MOPS | MFLOPS | V.OP RATIO | AVER. V.LEN | I-CACHE MISS | O-CACHE MISS | BANK CONF |
     |---|---|---|---|---|---|---|---|---|---|---|---|
     | funcs | 1125252 | 168.392 | 46.2 | 0.150 | 843.6 | 263.3 | 77.30 | 54.8 | 0.9748 | 4.4880 | 6.4117 |
     | mrqcof | 162 | 153.722 | 42.1 | 948.903 | 7102.9 | 3494.7 | 98.99 | 233.3 | 2.4568 | 3.9489 | 0.1084 |
     | srcinput | 1 | 20.673 | 5.7 | 20672.844 | 183.8 | 1.0 | 1.22 | 178.8 | 2.1630 | 0.1307 | 0.0000 |
     | pgauss | 108 | 12.907 | 3.5 | 119.510 | 4254.0 | 1963.9 | 98.64 | 198.0 | 0.1104 | 0.2287 | 0.0104 |
     | consti_parameter | 1125252 | 4.538 | 1.2 | 0.004 | 198.9 | 46.8 | 0.00 | 0.0 | 0.9358 | 0.6159 | 0.0000 |
     | quasi_static | 1 | 4.464 | 1.2 | 4464.177 | 179.8 | 4.6 | 5.15 | 250.7 | 0.3026 | 0.1485 | 0.0000 |
     | mrqmin | 108 | 0.117 | 0.0 | 1.084 | 3752.9 | 2.6 | 98.33 | 234.7 | 0.0043 | 0.0025 | 0.0000 |
     | total | 2250884 | 364.813 | 100.0 | 0.162 | 3549.2 | 1664.3 | 96.18 | 180.2 | 6.9477 | 9.5631 | 6.5306 |

     • Total time fell from 598.717 s to 364.813 s.
     • "MRQCOF" sped up (385.226 s to 153.722 s) in spite of the larger amount of computation.
     • Bank conflicts remain in FUNCS (6.4117 s in both runs).

  8. Array Access Pattern: FUNCS, changed in order to avoid bank conflicts
     • Innermost loop changed from "it" to "p" for "gss(p,it)".
     • Source array index order changed from "gs(ipp,idX1,idX2,it)" to "gs(ipp,it,idX2,idX1)".

     Original

         idX1= idint(zz_d)
         do p= 1, ma
           ipp= idnint(dabs(kk(p)-xx_d))
           if (ipp.gt.xmax0) then
             uu(p)= 0.d0
         !CDIR NODEP
             do it= 0, itcnt
               gss(p,it)= 0.d0
             enddo
           else
             idX2= idint(ll(p)/3.d0)
             uu(p)= u(ipp, idX1, idX2)
         !CDIR NODEP
             do it= 0, itcnt
               gss(p,it)= gs(ipp, idX1, idX2, it)
             enddo
           endif
         enddo

     Optimized

         idX1= idint(zz_d)
         it= 0
         !CDIR NODEP
         do p= 1, ma
           ipp= idnint(dabs(kk(p)-xx_d))
           if (ipp.gt.xmax0) then
             uu (p)   = 0.d0
             gss(p,it)= 0.d0
           else
             idX2= idint(ll(p)/3.d0)
             uu (p)   = u (ipp, idX2, idX1)
             gss(p,it)= gs(ipp, it, idX2, idX1)
           endif
         enddo
         do it= 1, itcnt
         !CDIR NODEP
           do p= 1, ma
             ipp= idnint(dabs(kk(p)-xx_d))
             if (ipp.gt.xmax0) then
               gss(p,it)= 0.d0
             else
               idX2= idint(ll(p)/3.d0)
               gss(p,it)= gs(ipp, it, idX2, idX1)
             endif
           enddo
         enddo
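Why the index reordering helps can be illustrated with column-major address arithmetic. The Python sketch below is a simplified model under stated assumptions: the array extents, the bank count of 512, and the mapping "bank = word offset modulo number of banks" are hypothetical and are not taken from the slides or from the Earth Simulator's documentation. It also assumes consecutive `ipp` values, whereas the real code derives `ipp` from `kk(p)`.

```python
def column_major_offset(index, shape):
    """Linear word offset of a 0-based index in a Fortran-style array."""
    offset, stride = 0, 1
    for i, n in zip(index, shape):
        offset += i * stride   # the first index varies fastest in memory
        stride *= n
    return offset


def banks_touched(offsets, n_banks):
    """Number of distinct banks hit, assuming bank = offset mod n_banks."""
    return len({off % n_banks for off in offsets})


N_BANKS = 512  # hypothetical bank count, for illustration only

# Original layout gs(ipp, idX1, idX2, it) with "it" as the innermost loop:
# consecutive accesses are 64*8*8 = 4096 words apart.
original = [column_major_offset((3, 1, 2, it), (64, 8, 8, 16))
            for it in range(16)]

# Reordered layout gs(ipp, it, idX2, idX1) with "p" as the innermost loop:
# here ipp is assumed to take consecutive values, so accesses are adjacent.
reordered = [column_major_offset((ipp, 5, 2, 1), (64, 16, 8, 8))
             for ipp in range(16)]

print(banks_touched(original, N_BANKS), banks_touched(reordered, N_BANKS))
```

In this model a stride that is a multiple of the bank count sends every access to the same bank, while small strides spread accesses across banks. That is one plausible reading of the drop in the BANK CONF column for `funcs` between the Optimized and Final profiles.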

  9. Results on Earth Simulator: single PE, 15 steps for 150 km length region

     The Original and Optimized profiles are the same as on slide 7. Totals: Original 598.717 s, Optimized 364.813 s, Final 251.646 s.

     Final

     | PROG.UNIT | FREQUENCY | EXCLUSIVE TIME [sec] | (%) | AVER. TIME [msec] | MOPS | MFLOPS | V.OP RATIO | AVER. V.LEN | I-CACHE MISS | O-CACHE MISS | BANK CONF |
     |---|---|---|---|---|---|---|---|---|---|---|---|
     | mrqcof | 162 | 151.462 | 60.2 | 934.950 | 7208.9 | 3546.8 | 98.99 | 233.3 | 4.2596 | 0.6754 | 0.1108 |
     | funcs | 1125252 | 57.366 | 22.8 | 0.051 | 5323.4 | 1284.5 | 98.56 | 230.9 | 2.3050 | 1.0004 | 0.1203 |
     | srcinput | 1 | 21.015 | 8.4 | 21014.591 | 180.8 | 1.0 | 1.22 | 178.8 | 2.4005 | 0.1383 | 0.0000 |
     | pgauss | 108 | 12.832 | 5.1 | 118.812 | 4279.0 | 1975.5 | 98.64 | 198.0 | 0.0939 | 0.1742 | 0.0104 |
     | quasi_static | 1 | 4.634 | 1.8 | 4633.743 | 173.2 | 4.4 | 5.15 | 250.7 | 0.5050 | 0.1040 | 0.0000 |
     | consti_parameter | 1125252 | 4.221 | 1.7 | 0.004 | 213.9 | 50.3 | 0.00 | 0.0 | 1.0785 | 0.0555 | 0.0000 |
     | mrqmin | 108 | 0.117 | 0.0 | 1.083 | 3758.9 | 2.6 | 98.33 | 234.7 | 0.0041 | 0.0025 | 0.0000 |
     | total | 2250884 | 251.646 | 100.0 | 0.112 | 5794.3 | 2529.4 | 98.52 | 231.2 | 10.6466 | 2.1503 | 0.2415 |

  10. Results on Earth Simulator: single PE, 50 steps for 150 km length region

     | Metric | Original | Optimized | Final |
     |---|---|---|---|
     | Real Time (sec) | 1655.183718 | 987.056632 | 673.869615 |
     | User Time (sec) | 1654.318557 | 986.534791 | 673.459139 |
     | System Time (sec) | 0.345625 | 0.309119 | 0.303781 |
     | Vector Time (sec) | 1492.593044 | 852.099695 | 572.748073 |
     | Instruction Count | 225736939601 | 150316159955 | 70980664682 |
     | Vector Instruction Count | 33259024852 | 20258217490 | 18317157790 |
     | Vector Element Count | 6766311630183 | 3711121900113 | 4242193958613 |
     | FLOP Count | 1631853320220 | 1801639677230 | 1883167310684 |
     | MOPS | 4206.438666 | 3893.608088 | 6377.309649 |
     | MFLOPS | 986.420247 | 1826.230248 | 2796.260681 |
     | Average Vector Length | 203.442875 | 183.190940 | 231.596736 |
     | Vector Operation Ratio (%) | 97.234032 | 96.614115 | 98.773801 |
     | Memory size used (MB) | 129.925232 | 129.927307 | 129.923050 |
     | MIPS | 136.453127 | 152.367824 | 105.397136 |
     | Instruction Cache miss (sec) | 19.861414 | 11.255024 | 12.189514 |
     | Operand Cache miss (sec) | 28.637158 | 24.688724 | 13.090039 |
     | Bank Conflict Time (sec) | 18.185194 | 18.245958 | 0.784992 |

     SR2201, 64 PEs: 2205 sec., 1752 sec., 1921 sec.
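The 50-step figures above translate into speedups with simple arithmetic. The Python sketch below only restates numbers from the slide; the dictionary layout and the helper name `speedup` are illustrative choices, not part of the original material.

```python
# Real time and bank conflict time (seconds) from the 50-step, single-PE runs.
real_time = {"Original": 1655.183718, "Optimized": 987.056632, "Final": 673.869615}
bank_conflict = {"Original": 18.185194, "Optimized": 18.245958, "Final": 0.784992}


def speedup(baseline, version):
    """Ratio of baseline real time to the given version's real time."""
    return real_time[baseline] / real_time[version]


for name in ("Optimized", "Final"):
    share = 100.0 * bank_conflict[name] / real_time[name]
    print(f"{name}: {speedup('Original', name):.2f}x faster than Original, "
          f"bank conflicts {share:.2f}% of real time")
```

The Optimized version comes out roughly 1.7 times faster than the Original and the Final version roughly 2.5 times faster, while bank conflict time drops from about 18 s to under 1 s only in the Final version.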

  11. Results on Earth Simulator: single PE, 5 steps for 300 km length region

     Optimized

     | PROG.UNIT | FREQUENCY | EXCLUSIVE TIME [sec] | (%) | AVER. TIME [msec] | MOPS | MFLOPS | V.OP RATIO | AVER. V.LEN | I-CACHE MISS | O-CACHE MISS | BANK CONF |
     |---|---|---|---|---|---|---|---|---|---|---|---|
     | mrqcof | 127 | 661.180 | 52.1 | 5206.140 | 7668.4 | 3796.1 | 99.38 | 241.9 | 3.9525 | 7.5565 | 0.1524 |
     | funcs | 1758442 | 504.338 | 39.7 | 0.287 | 635.4 | 163.7 | 69.20 | 37.9 | 1.7746 | 11.4471 | 25.6899 |
     | pgauss | 94 | 72.447 | 5.7 | 770.714 | 5731.7 | 2669.5 | 99.22 | 224.9 | 0.2088 | 0.4403 | 0.0177 |
     | srcinput | 1 | 20.692 | 1.6 | 20692.259 | 183.7 | 1.0 | 1.23 | 179.0 | 2.1808 | 0.1350 | 0.0000 |
     | consti_parameter | 1758442 | 6.977 | 0.5 | 0.004 | 176.8 | 36.3 | 0.00 | 0.0 | 1.6187 | 1.2345 | 0.0000 |
     | quasi_static | 1 | 2.934 | 0.2 | 2934.378 | 187.3 | 4.7 | 7.71 | 253.6 | 0.2100 | 0.0393 | 0.0000 |
     | mrqmin | 94 | 0.397 | 0.0 | 4.227 | 4067.4 | 1.4 | 98.81 | 242.4 | 0.0036 | 0.0021 | 0.0014 |
     | total | 3517201 | 1268.966 | 100.0 | 0.361 | 4580.9 | 2195.6 | 97.61 | 199.0 | 9.9492 | 20.8548 | 25.8614 |

     Final

     | PROG.UNIT | FREQUENCY | EXCLUSIVE TIME [sec] | (%) | AVER. TIME [msec] | MOPS | MFLOPS | V.OP RATIO | AVER. V.LEN | I-CACHE MISS | O-CACHE MISS | BANK CONF |
     |---|---|---|---|---|---|---|---|---|---|---|---|
     | mrqcof | 127 | 656.062 | 72.8 | 5165.844 | 7728.2 | 3825.8 | 99.38 | 241.9 | 6.7885 | 2.2390 | 0.1573 |
     | funcs | 1758442 | 141.695 | 15.7 | 0.081 | 5976.3 | 1250.8 | 99.00 | 239.2 | 3.9588 | 1.5603 | 0.1976 |
     | pgauss | 94 | 72.393 | 8.0 | 770.142 | 5735.9 | 2671.5 | 99.22 | 224.9 | 0.1846 | 0.3724 | 0.0176 |
     | srcinput | 1 | 21.138 | 2.3 | 21137.964 | 179.8 | 1.0 | 1.23 | 179.0 | 2.5011 | 0.1773 | 0.0000 |
     | consti_parameter | 1758442 | 6.112 | 0.7 | 0.003 | 201.8 | 41.5 | 0.00 | 0.0 | 1.6043 | 0.3505 | 0.0000 |
     | quasi_static | 1 | 3.106 | 0.3 | 3105.903 | 176.9 | 4.4 | 7.71 | 253.6 | 0.3563 | 0.0652 | 0.0000 |
     | mrqmin | 94 | 0.402 | 0.0 | 4.276 | 4020.3 | 1.4 | 98.81 | 242.4 | 0.0034 | 0.0022 | 0.0014 |
     | total | 3517201 | 900.908 | 100.0 | 0.256 | 7036.7 | 3197.7 | 99.23 | 240.3 | 15.3971 | 4.7669 | 0.3738 |

  12. Results on Earth Simulator: 16 PEs, 20 steps for 1200 km length region (Optimized)

     Global data of 16 processes:

     | Metric | Min [U,R] | Max [U,R] | Average |
     |---|---|---|---|
     | Real Time (sec) | 8274.057 [0,4] | 8274.079 [0,8] | 8274.066 |
     | User Time (sec) | 8238.496 [0,14] | 8269.493 [0,1] | 8264.090 |
     | System Time (sec) | 0.083 [0,9] | 19.035 [0,14] | 2.704 |
     | Vector Time (sec) | 7798.273 [0,14] | 7892.069 [0,8] | 7844.972 |
     | Instruction Count | 624303959865 [0,8] | 643825666463 [0,6] | 635511514785 |
     | Vector Instruction Count | 231626728504 [0,14] | 232565807360 [0,8] | 232112589492 |
     | Vector Element Count | 55074702660332 [0,7] | 55125971671196 [0,8] | 55106179962299 |
     | FLOP Count | 27148136671701 [0,3] | 27152984585351 [0,0] | 27150576822615 |
     | MOPS | 6709.196 [0,7] | 6735.701 [0,14] | 6716.971 |
     | MFLOPS | 3282.949 [0,3] | 3295.707 [0,14] | 3285.373 |
     | Average Vector Length | 237.034 [0,8] | 237.809 [0,14] | 237.412 |
     | Vector Operation Ratio (%) | 99.258 [0,6] | 99.294 [0,8] | 99.273 |
     | Memory size used (MB) | 618.973 [0,1] | 623.520 [0,0] | 619.344 |
     | MIPS | 75.509 [0,8] | 77.919 [0,6] | 76.901 |
     | Instruction Cache miss (sec) | 18.623 [0,7] | 39.624 [0,0] | 29.463 |
     | Operand Cache miss (sec) | 54.630 [0,12] | 84.330 [0,0] | 58.905 |
     | Bank Conflict Time (sec) | 90.877 [0,15] | 93.064 [0,0] | 91.629 |

  13. Results on Earth Simulator: 16 PEs, 20 steps for 1200 km length region (Final)

     Global data of 16 processes:

     | Metric | Min [U,R] | Max [U,R] | Average |
     |---|---|---|---|
     | Real Time (sec) | 7916.622 [0,4] | 7916.638 [0,8] | 7916.628 |
     | User Time (sec) | 7884.213 [0,2] | 7911.765 [0,1] | 7906.683 |
     | System Time (sec) | 0.080 [0,5] | 18.391 [0,2] | 2.624 |
     | Vector Time (sec) | 7517.856 [0,6] | 7583.443 [0,8] | 7549.507 |
     | Instruction Count | 505404224489 [0,8] | 528906103722 [0,6] | 518064920585 |
     | Vector Instruction Count | 227641971726 [0,7] | 228417705195 [0,8] | 228070602086 |
     | Vector Element Count | 56157955391428 [0,7] | 56213438597266 [0,12] | 56197485658957 |
     | FLOP Count | 27324180912828 [0,3] | 27329028805661 [0,0] | 27326621214808 |
     | MOPS | 7135.700 [0,7] | 7164.557 [0,2] | 7144.278 |
     | MFLOPS | 3453.654 [0,4] | 3466.290 [0,2] | 3456.146 |
     | Average Vector Length | 246.086 [0,8] | 246.694 [0,7] | 246.404 |
     | Vector Operation Ratio (%) | 99.467 [0,6] | 99.510 [0,8] | 99.487 |
     | Memory size used (MB) | 618.950 [0,3] | 623.497 [0,0] | 619.366 |
     | MIPS | 63.985 [0,8] | 66.851 [0,6] | 65.522 |
     | Instruction Cache miss (sec) | 6.898 [0,4] | 25.711 [0,0] | 11.426 |
     | Operand Cache miss (sec) | 34.316 [0,3] | 57.017 [0,0] | 38.945 |
     | Bank Conflict Time (sec) | 2.566 [0,15] | 4.896 [0,0] | 3.372 |

  14. Results on Earth Simulator: Parallel Efficiency for 1st Linear Step
     [Figure: parallel efficiency plots, panels "Final" and "Original". Length of the innermost loop = m/PE. Legend: ●: L=150km, ○: L=300km, ■: L=450km, □: L=600km, ▲: L=1200km]
