
COMPUTATIONAL ELEMENTS FOR VERY LARGE-SCALE, HIGH-FIDELITY AERODYNAMIC ANALYSIS AND DESIGN



Presentation Transcript


  1. COMPUTATIONAL ELEMENTS FOR VERY LARGE-SCALE, HIGH-FIDELITY AERODYNAMIC ANALYSIS AND DESIGN November 20, 2006 김종암 Aerodynamic Simulation & Design Lab., School of Mechanical and Aerospace Engineering, Seoul National University

  2. Contents • Introduction • Aerodynamic Solvers for High Performance Computing • Characteristics of International Standard Codes • Essential Elements for Teraflops CFD • High-Fidelity Numerical Methods for Flow Analysis and Design • Parallel Efficiency Enhancement • Geometric Representation for Complex Geometry • Some Examples • Conclusion

  3. Introduction - Bio & Astrophysics • [Simulation of supernovae] ORNL (Oak Ridge National Laboratory): researchers using an ORNL supercomputer have found that the organized flow beneath the shock wave in a previous two-dimensional model of a stellar explosion persists in three dimensions. • [Molecules in motion] - 10.4 teraflops, SDSC (San Diego Supercomputer Center): understanding how molecules naturally behave inside cells and predicting how they might react to the presence of prospective drugs. • [Computationally predicting protein structures] ORNL: a protein structure predicted at ORNL (left) and the actual structure determined experimentally (right). • [Blood-flow patterns at an instant during the systolic cycle] CITI (Computer and Information Technology Institute). • [2-D Rayleigh-Taylor instability] FLASH Center / Pittsburgh Supercomputing Center.

  4. Introduction - Weather forecasting • [Global atmospheric circulation] DKRZ (Deutsches Klimarechenzentrum GmbH), the German High Performance Computing Centre for Climate and Earth System Research: animation of one month of "simulated weather" with a global atmosphere model. • [Typhoon ETAU in 2003] Earth Simulator Center: result of a non-hydrostatic, ultrahigh-resolution coupled atmosphere-ocean model; 26.58 Tflops was obtained by a global atmospheric circulation code. • [Global ocean circulation] DKRZ: 3-D particles/streamlines coloured by temperature visualize important features of the annual mean ocean circulation. • [Twin typhoons over the Philippine Sea] Earth Simulator Center.

  5. Introduction - Aerospace & Other related fields • [Numerical simulation of the hydro-aerodynamic effects around the Shosholoza boat, aimed at an optimal design] Scientific Supercomputing Center at Karlsruhe University • [Full SSLV configuration] NASA Columbia supercomputer • [Aerodynamic simulation around a SAUBER PETRONAS C23] SAUBER PETRONAS, Switzerland • [Bio-agent blast dispersion simulations] DTRA (Defense Threat Reduction Agency)

  6. Introduction - System architecture • Primary factors in computing speed • CPU clock speed • Number of instructions per clock cycle • CPU clock speed is expressed in Hz: cycles per second • 1 Tflops = a trillion floating-point operations per second • Examples • Pentium Xeon 2.4 GHz: 2.4 GHz * 2 (Hyper-Threading) = 4.8 Gflops • IA-64 (Itanium) 1.4 GHz: 1.4 GHz * 2 (Hyper-Threading) * 2 (instructions per cycle) = 5.6 Gflops
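A minimal sketch of the peak-performance arithmetic above, following the slide's own simplified model (clock speed times operations issued per cycle); the factors of 2 are taken from the slide's examples, not from a rigorous hardware model:

```fortran
! Simplified peak-performance model from the slide:
! peak [Gflops] = clock [GHz] * floating-point operations issued per cycle.
program peak_flops
  implicit none
  real :: xeon_peak, itanium_peak

  xeon_peak    = 2.4 * 2.0        ! 2.4 GHz * 2 (Hyper-Threading)          = 4.8 Gflops
  itanium_peak = 1.4 * 2.0 * 2.0  ! 1.4 GHz * 2 (HT) * 2 (instr. per cycle) = 5.6 Gflops

  print '(a,f5.1,a)', 'Pentium Xeon  : ', xeon_peak,    ' Gflops'
  print '(a,f5.1,a)', 'Itanium (IA64): ', itanium_peak, ' Gflops'
end program peak_flops
```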

  7. Computing Power Nowadays • Top500 list (June 2006) • Fastest machine: BlueGene/L by IBM (at DOE/NNSA/LLNL) • Ranked at #500: 2.026 Tflops • The era of teraflops computing has already arrived! • BlueGene/L • 100,000+ processors • Performance: 280.6 teraflops

  8. Introduction • Application Characteristics • Aerospace engineering: memory usage dominates disk usage; requires high-speed CPUs and high-speed I/O; sensitive to network speed • Mechanical engineering • Explicit problems: CPU performance and network speed are important • Implicit problems: require high-speed I/O and large memory/storage • Physical science • Monte Carlo: highly dependent on network performance • Chemical science • Molecular dynamics: CPU performance and network speed are important; low dependence on memory size and on I/O capacity and speed • Quantum dynamics: CPU performance, network speed and mass memory/storage are important • Life science • Protein folding: high-speed CPUs and memory size are moderately important • Astronomy: computing performance is sensitive to CPU and network speed (enormous influence in pre- and post-processing)

  9. Introduction • Application Characteristics

  10. Specialized High Performance Baseline Codes • Standard flow solvers at NASA (USA) • Full potential: CAPTSD • Block structured: CFL3D, TLNS3D-MB, PAB3D, GASP, LAURA, VULCAN • Overset structured: OVERFLOW • Unstructured: FUN3D, USM3D, 3D3U • Other flow solvers • MIRANDA • High-order hydrodynamics code for computing instabilities and turbulent mix • Developed by LLNL (Lawrence Livermore National Laboratory) • AVBP • A compressible flow solver running on unstructured and hybrid grids • Developed by CERFACS, France

  11. Aerodynamic Solvers for High Performance Computing (USA) • General Features of OVERFLOW • Right-hand side options: • central differencing with Jameson 4/2 dissipation • Roe upwinding • Left-hand side options: • Pulliam-Chaussee diagonalized scheme • LU-SGS scheme • Low-Mach-number preconditioning • First-order implicit time advance • Convergence acceleration options: • time-accurate mode or local timestep scaling • grid sequencing, multigrid • Performance Test • Block-structured overset grid with 126 million grid points in total, 2000 time steps • Weak scaling: about 123,000 mesh points per processor • Efficiency: about 70% with 1024 processors (compared to 64 processors)

  12. Aerodynamic Solvers for High Performance Computing (USA) • General Features of CFL3D • 2-D or 3-D grid topologies • Inviscid, laminar and/or turbulent flows • Steady or unsteady (including moving-grid) flows • Spatial discretization: van Leer's FVS, Roe's FDS • Time integration: implicit approximate factorization, dual-time stepping • High-order interpolation & limiting: TVD MUSCL (a sketch follows below) • Multiple-block options: 1-1 blocking, patching, overlapping, embedding • Convergence acceleration options: multigrid, mesh sequencing • Turbulence model options: • Baldwin-Lomax • Baldwin-Lomax with Degani-Schiff modification • Baldwin-Barth • Spalart-Allmaras (including DES option) • Wilcox k-omega • Menter's k-omega SST • Abid k-epsilon • Explicit Algebraic Stress Model (EASM) • k-enstrophy
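To illustrate the "TVD MUSCL" entry above, here is a minimal sketch, not CFL3D's actual implementation, of second-order MUSCL reconstruction with a minmod limiter on a 1-D array of cell averages:

```fortran
! Second-order MUSCL reconstruction with a minmod limiter (illustrative sketch).
module muscl_sketch
  implicit none
contains
  pure function minmod(a, b) result(m)
    real, intent(in) :: a, b
    real :: m
    if (a*b <= 0.0) then
      m = 0.0
    else
      m = sign(min(abs(a), abs(b)), a)   ! smaller slope, same sign
    end if
  end function minmod

  ! Left/right states at the interface between cells i and i+1
  ! (valid for 2 <= i <= size(q)-2).
  subroutine muscl_states(q, i, qL, qR)
    real,    intent(in)  :: q(:)         ! cell-averaged quantity
    integer, intent(in)  :: i
    real,    intent(out) :: qL, qR
    real :: sL, sR
    sL = minmod(q(i)   - q(i-1), q(i+1) - q(i))
    sR = minmod(q(i+1) - q(i),   q(i+2) - q(i+1))
    qL = q(i)   + 0.5*sL                 ! extrapolate from the left cell
    qR = q(i+1) - 0.5*sR                 ! extrapolate from the right cell
  end subroutine muscl_states
end module muscl_sketch
```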

  13. Aerodynamic Solvers for High Performance Computing (USA) • PETSc-FUN3D (NASA) • Code Features • FUN3D code attached to the PETSc framework • A tetrahedral, vertex-centered unstructured code • Spatial discretization with the Roe scheme • A Galerkin discretization for the viscous terms • Pseudo-transient Newton-Krylov-Schwarz method for time integration, with block-incomplete factorization on each subdomain of the Schwarz preconditioner • Used for design optimization of airplanes, automobiles and submarines with irregular meshes • Performance Test • Unstructured mesh with 2.7 million vertices, 18 million edges • Weak scaling • Performance: nearly scalable with O(1000) processors

  14. Aerodynamic Solvers for High Performance Computing (USA) • MIRANDA (LLNL) • Code Features • High-order hydrodynamics code for computing instabilities and turbulent mix • Conducts direct numerical simulation and large-eddy simulation • FFTs and band-diagonal matrix solvers for spectrally accurate derivatives • Studies Rayleigh-Taylor (R-T) and Richtmyer-Meshkov (R-M) instabilities • Performance Test • Weak-scaling parallel efficiency nearly 100% with 128K processors • Strong scaling shows good efficiency with 64K processors (compared to performance with 8K processors) • All-to-all communication gives good performance [Figures: turbulent flow mixing of two fluids (LES of R-T instability); efficiency with strong scaling]

  15. Aerodynamic Solvers for High Performance Computing (Europe) • AVBP (CERFACS) • Code Features • A parallel CFD code for the laminar and turbulent compressible Navier-Stokes equations on unstructured and hybrid grids • Unsteady reacting flow analysis based on the LES approach • Built upon a modular software library including integrated parallel domain partitioning and data reordering tools, a message-passing handler, supporting routines for dynamic memory allocation, and routines for parallel I/O and iterative methods • Performance • Nearly 100% parallel efficiency with 4K processors (on BlueGene/L) • Strong-scaling case • Code may run in the range of O(1000) processors

  16. Aerodynamic Solvers for High Performance Computing • Efficiency of Various Applications Including CFD • From BlueGene/L reports • Both weak scaling and strong scaling parallelism ※ Weak scaling : Same domain size in each processor ※ Strong scaling : Same domain size in total
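For reference, a small sketch of how the two efficiencies quoted in these slides are commonly computed; the formulas follow the usual definitions and the timings below are hypothetical, not figures from the BlueGene/L reports:

```fortran
! Parallel efficiency under the two scaling modes defined above:
!   strong scaling: E = T(p_ref) * p_ref / (T(p) * p)   (same total problem size)
!   weak scaling  : E = T(p_ref) / T(p)                 (same problem size per processor)
program scaling_efficiency
  implicit none
  ! Strong scaling: same total problem, timed on p1 and p2 processors (hypothetical).
  real,    parameter :: t1_strong = 120.0, t2_strong = 9.5
  integer, parameter :: p1 = 64, p2 = 1024
  ! Weak scaling: same work per processor, timed on p1 and p2 processors (hypothetical).
  real,    parameter :: t1_weak = 120.0, t2_weak = 150.0

  print '(a,f6.1,a)', 'strong-scaling efficiency: ', &
        100.0 * (t1_strong * real(p1)) / (t2_strong * real(p2)), ' %'
  print '(a,f6.1,a)', 'weak-scaling efficiency  : ', &
        100.0 * t1_weak / t2_weak, ' %'
end program scaling_efficiency
```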

  17. Essential Elements for Teraflops CFD - High-Fidelity Numerical Method - • Numerical flux scheme: accurate shock capturing • Higher-order interpolation: complex flow structure & vortex resolving, enhanced accuracy of aerodynamic coefficients • Flow analysis over a helicopter full-body configuration: a very large-scale problem • Convergence acceleration & adaptive grid techniques: reduced computational cost [Figure: N-S simulation around a helicopter fuselage with actuator disks, U.C. Davis Center for CFD]

  18. Essential Elements for Teraflops CFD - High-Fidelity Numerical Method - • RoeM Scheme • Roe's FDS: sharp capturing of shock discontinuities; unstable in expansion regions (defect); carbuncle phenomenon (defect) • RoeM = Roe with E-fix plus damping & feeding-rate control using Mach-number-based functions • RoeM properties: shock stability (no carbuncle), total enthalpy conservation, stability in expansion regions, exact capturing of contact discontinuities, accuracy comparable to Roe's FDS
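For context, a minimal sketch of the baseline that RoeM modifies: Roe's FDS for the 1-D Euler equations with a Harten-type entropy fix (the "E-fix"). This is a textbook version for illustration only; RoeM's Mach-number-based damping and feeding-rate control functions are not reproduced here:

```fortran
! Roe's flux-difference splitting for the 1-D Euler equations (illustrative sketch).
module roe_flux_sketch
  implicit none
  real, parameter :: gam = 1.4
contains
  subroutine roe_flux(rhoL, uL, pL, rhoR, uR, pR, flux)
    real, intent(in)  :: rhoL, uL, pL, rhoR, uR, pR
    real, intent(out) :: flux(3)
    real :: HL, HR, sqL, sqR, rhot, ut, Ht, at
    real :: dr, du, dp, a1, a2, a3, lam(3), r1(3), r2(3), r3(3)
    real :: fL(3), fR(3), delta
    integer :: k

    ! total enthalpy of each state
    HL = gam/(gam-1.0)*pL/rhoL + 0.5*uL*uL
    HR = gam/(gam-1.0)*pR/rhoR + 0.5*uR*uR

    ! Roe-averaged state
    sqL = sqrt(rhoL);  sqR = sqrt(rhoR)
    rhot = sqL*sqR
    ut = (sqL*uL + sqR*uR)/(sqL + sqR)
    Ht = (sqL*HL + sqR*HR)/(sqL + sqR)
    at = sqrt((gam-1.0)*(Ht - 0.5*ut*ut))

    ! wave strengths
    dr = rhoR - rhoL;  du = uR - uL;  dp = pR - pL
    a1 = (dp - rhot*at*du)/(2.0*at*at)
    a2 = dr - dp/(at*at)
    a3 = (dp + rhot*at*du)/(2.0*at*at)

    ! eigenvalues with a Harten-type entropy fix (simple fixed fraction of the sound speed)
    lam = (/ ut-at, ut, ut+at /)
    delta = 0.1*at
    do k = 1, 3
      if (abs(lam(k)) < delta) then
        lam(k) = (lam(k)**2 + delta**2)/(2.0*delta)
      else
        lam(k) = abs(lam(k))
      end if
    end do

    ! right eigenvectors
    r1 = (/ 1.0, ut-at, Ht-ut*at /)
    r2 = (/ 1.0, ut,    0.5*ut*ut /)
    r3 = (/ 1.0, ut+at, Ht+ut*at /)

    ! physical fluxes of the left and right states
    fL = (/ rhoL*uL, rhoL*uL*uL + pL, rhoL*uL*HL /)
    fR = (/ rhoR*uR, rhoR*uR*uR + pR, rhoR*uR*HR /)

    flux = 0.5*(fL + fR) - 0.5*(lam(1)*a1*r1 + lam(2)*a2*r2 + lam(3)*a3*r3)
  end subroutine roe_flux
end module roe_flux_sketch
```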

  19. Essential Elements for Teraflops CFD - High-Fidelity Numerical Method - • AUSMPW+ Scheme • AUSM+: splits the convective flux term and the pressure flux term; a hybrid form of FDS and FVS; oscillations near a wall or across a strong shock (defect) • AUSMPW+: pressure wiggles cured by introducing weighting functions based on pressure • Eliminates expansion shocks • Eliminates oscillations and overshoots • Reduced grid dependency • Improved convergence behavior
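To make the flux splitting above concrete, a minimal sketch of the Mach-number and pressure splitting polynomials of Liou's AUSM+ (with the usual constants alpha = 3/16, beta = 1/8); the pressure-based weighting functions that turn AUSM+ into AUSMPW+ are not reproduced here:

```fortran
! AUSM+ split functions for the subsonic/supersonic branches (illustrative sketch).
module ausm_split_sketch
  implicit none
  real, parameter :: alpha = 3.0/16.0, beta = 1.0/8.0
contains
  pure function m_plus(M) result(mp)    ! 4th-order Mach-number splitting, M+
    real, intent(in) :: M
    real :: mp
    if (abs(M) <= 1.0) then
      mp = 0.25*(M+1.0)**2 + beta*(M*M-1.0)**2
    else
      mp = 0.5*(M + abs(M))
    end if
  end function m_plus

  pure function p_plus(M) result(pp)    ! 5th-order pressure splitting, P+
    real, intent(in) :: M
    real :: pp
    if (abs(M) <= 1.0) then
      pp = 0.25*(M+1.0)**2*(2.0-M) + alpha*M*(M*M-1.0)**2
    else
      pp = 0.5*(1.0 + sign(1.0, M))
    end if
  end function p_plus
  ! The minus splittings follow by symmetry: M-(M) = -M+(-M) and P-(M) = P+(-M).
end module ausm_split_sketch
```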

  20. Essential Elements for Teraflops CFD - High-Fidelity Numerical Method - • M-AUSMPW+ Scheme • Proposes a criterion for accurate calculation of cell-interface fluxes • Modifies the pressure splitting function • Much more effective in computations of multi-dimensional flows • Achieves completely monotonic characteristics • Improved convergence characteristics

  21. Essential Elements for Teraflops CFD - High-Fidelity Numerical Method - • Higher-order interpolation & oscillation control scheme: MLP • TVD and ENO approaches: based on 1-D flow physics • Higher-order interpolation with effective oscillation control in multiple dimensions: Multi-dimensional Limiting Process [Figures: MLP5 + M-AUSMPW+ on a 350 * 175 * 175 grid - Feature 1: profile of the separated vortex; Feature 2: profile of swirls near the corner; Feature 3: interacting profile of the separated vortex & swirls; cutting planes at x = 0.842, x = 0.8725 and y = 0.078 (the center of the primary separated vortex)]

  22. Essential Elements for Teraflops CFD - High-Fidelity Numerical Method - • Multigrid: issues in hypersonic flows • Non-linearity in shock regions causes robustness problems in prolongation • Chemical reactions restrict the time step due to stiffness • Solutions to these problems • Modified implicit residual smoothing (a sketch of the standard form follows below) • Damped prolongation & implicit treatment of source terms • Test problem: nonequilibrium viscous flow, M∞ = 10, 60 km altitude
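As an illustration of the residual-smoothing ingredient above, a minimal 1-D sketch of standard implicit residual smoothing, (1 - eps*delta^2) Rbar = R, solved with a few Jacobi sweeps; the "modified" variant tailored to strong shocks and stiff source terms is not reproduced:

```fortran
! Standard 1-D implicit residual smoothing via Jacobi sweeps (illustrative sketch).
subroutine residual_smoothing(res, n, eps, nsweep)
  implicit none
  integer, intent(in)    :: n, nsweep
  real,    intent(in)    :: eps          ! smoothing coefficient
  real,    intent(inout) :: res(n)       ! residual, smoothed in place
  real :: rbar(n), rold(n)
  integer :: i, sweep

  rbar = res
  do sweep = 1, nsweep
    rold = rbar
    do i = 2, n-1
      ! Jacobi update of (1+2*eps)*Rbar_i - eps*(Rbar_{i-1} + Rbar_{i+1}) = R_i
      rbar(i) = (res(i) + eps*(rold(i-1) + rold(i+1))) / (1.0 + 2.0*eps)
    end do
  end do
  res = rbar
end subroutine residual_smoothing
```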

  23. Essential Elements for Teraflops CFD - Parallel Efficiency Enhancement - • Requirements for Systems • CPU - fewer but more powerful processors • Better for efficiency, resource management and fault prevention • But more power consumption and heat emission • Memory - faster access & efficient management • The most important factor for CFD applications • Network - multiple interconnection networks • Separate communication channels for inter-processor communication and global communication • Ex) IBM BlueGene/L: 5 different communication types • I/O - unpredictable data corruption • Storage servers are overloaded during data writing • Broken ASCII data are sometimes observed

  24. Essential Elements for Teraflops CFD - Parallel Efficiency Enhancement - • Requirements for Software/Programming • Memory size - different array extents among processors • Computing domains can differ in extent even with the same number of mesh points • Conventionally, the maximum array size was allocated on every processor • Remedy: variables stored in global memory (shared-memory systems); dynamic memory allocation in Fortran 90 (distributed-memory systems) - see the sketch below • I/O - writing performed by each processor • Conventional programs gathered the whole data set onto one processor, requiring large array allocations • Etc.: optimized compiler options, highly functional debuggers, minimization of serial processing [Figure: e.g. an 80×40 domain and a 40×80 domain, both conventionally declared as Dimension X(80,80), Y(80,80), ...]
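A minimal sketch of the Fortran 90 remedy mentioned above: each processor allocates its arrays to its own sub-domain extent instead of a fixed maximum dimension; the extents ni and nj are assumed to come from the domain partitioner:

```fortran
! Per-sub-domain allocation instead of a fixed maximum such as Dimension X(80,80).
program per_domain_allocation
  implicit none
  real, allocatable :: x(:,:), y(:,:)
  integer :: ni, nj

  ! e.g. one processor owns an 80x40 block, another a 40x80 block
  ni = 80;  nj = 40

  allocate(x(ni,nj), y(ni,nj))   ! only the memory this rank actually needs
  x = 0.0;  y = 0.0
  ! ... flow solution on this sub-domain ...
  deallocate(x, y)
end program per_domain_allocation
```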

  25. Essential Elements for Teraflops CFD - Parallel Efficiency Enhancement - • Requirements for Algorithms • Scalability enhancement • Reduced global communication • Global communication on top of inter-processor communication leads to synchronization problems • Residual gathering and aerodynamic-coefficient computation routines should be improved (see the sketch below) • Dynamic load balancing • Processor allocation for faster inter-processor communication • Dynamic load balancing to handle changes in processor performance during computation • Fault tolerance
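As an illustration of keeping residual gathering cheap, a minimal MPI sketch in which each rank sums its local squared residual and a single MPI_Allreduce produces the global norm; the local value here is a stand-in for the real sum over a sub-domain:

```fortran
! Residual gathering with one collective call instead of funnelling all data to one rank.
program residual_gathering
  use mpi
  implicit none
  integer :: ierr, rank
  real(8) :: local_sum, global_sum

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  local_sum = real(rank + 1, 8)        ! stand-in for sum(res**2) on this rank

  call MPI_Allreduce(local_sum, global_sum, 1, MPI_DOUBLE_PRECISION, &
                     MPI_SUM, MPI_COMM_WORLD, ierr)

  if (rank == 0) print *, 'global residual norm =', sqrt(global_sum)
  call MPI_Finalize(ierr)
end program residual_gathering
```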

  26. Essential Elements for Teraflops CFD - Geometric Representation - • Multiple-body problems and complicated geometry • Block topology is complicated for structured systems • Grid generation is time-consuming • Manual preprocessing is impossible • Three approaches • Multiblock: preprocessor for partitioning & automatic detection of block topology • Overset: preprocessor for automatic block connectivity, postprocessor, overset mesh generator • Unstructured: automatic grid generator & grid adaptation method

  27. Essential Elements for Teraflops CFD - Geometric Representation - • Multi-Block System • Modularization of the preprocessing code • Evaluation of metrics, minimum wall distance and their exchange • Automatic detection of block topology [Figure: flow analysis of a combustion chamber (N-S, 600,000 pts., ASDL)]

  28. Essential Elements for Teraflops CFD - Geometric Representation - • Overset Mesh System • Pre-processing for automatic identification of hole, fringe and donor cells arising from complicated block connectivity (overlap optimization for PEGASUS) • Post-processing for the evaluation of aerodynamic coefficients (zipper grid) [Figure: overlapping meshes A, B and C]

  29. Essential Elements for Teraflops CFD - Geometric Representation - • Unstructured System • Automatic grid generation code (Mavriplis et al., NASA Langley) • Grid adaptation methods: subdivision method, adjoint-based adaptation method

  30. Some Examples • Multi-block System • Parametric study in various flight conditions for aerospace engineering • Parametric study of a missile with a side nozzle (N-S, M = 1.75) [Figures: streamlines and iso-velocity surfaces (side nozzle, N-S, M = 1.0); jet off/on at AOA 0, 10 and 20]

  31. Some Examples • Multi-block System • Flow analysis & design of a turbulent intake flow using a multiblock system [Figures: Mach contour; static pressure contour; total pressure contour in the duct section & streamlines]

  32. Some Examples • Design Optimization based on Large-Scale Computation • Turbulent duct design with a multi-block mesh system [Figures: baseline model vs. designed model]

  33. Some Examples • Overset Mesh System [Figures: manually assigned block connectivity vs. overlap-optimized block connectivity]

  34. Some Examples • Overset Mesh System [Figures: results at spanwise stations Y/SPAN = 18.5%, 23.8%, 33.1%, 40.9%, 51.2%, 63.6% and 84.4%]

  35. Some Examples • Design Optimization based on Large-Scale Computation • Redesign of the DLR-F4 wing/body configuration with an overset mesh system [Figures: baseline vs. designed]

  36. Some Examples • Launch Vehicle Analysis with Load Balancing • Parallel computation on the Grid • 32 processors at Seoul National University & KISTI • 3.5 million mesh points

  37. Conclusion • Current Status • Many disciplines are already conducting teraflops computing • Teraflops computing in the CFD field has not been fully activated yet • Issues and Requirements • High-fidelity numerical schemes for the description of complex flowfields • Domain decomposition methods and parallel algorithms for enhanced efficiency and fault tolerance • Automatic pre- & post-processing techniques in geometric representation to handle complicated multiple-body problems • Target CFD Application Areas • Unsteady aerodynamics with massive flow separation • MDO and fluid-structure interaction • Multi-body aerodynamics with relative motion • Multi-scale flow computation
