
NGC Integration and HPC Analysis

NGC Integration and HPC Analysis. September 22, 2009 Timothy M. Shead NGC Integration Lead.


Presentation Transcript


  1. NGC Integration and HPC Analysis. September 22, 2009. Timothy M. Shead, NGC Integration Lead. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. This document is SAND Number: 2009-6318 P

  2. What is Titan? An expansion of the open source Visualization ToolKit (VTK) to support the ingestion, processing, and display of informatics data. Examples shown: post-processing, visualization, and analysis of an ASC CTH AMR simulation of material behavior during high-velocity impacts; global patterns of email exchange within Enron Corp.

  3. What is Titan? A flexible parallel pipeline architecture, written in C++.
     Diagram labels: Query, Display, Ingest, Extract, Model, Partition, Display
     • Provides a direct mapping between whiteboard and code.
     • Provides flexibility / exploration (for developers!)
     • Manages execution:
       • Manages the “flow” of data through components.
       • In what order are components executed?
       • What needs to be recomputed when parameters change?
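The two execution questions on this slide (execution order and what to recompute after a parameter change) can be illustrated with a small demand-driven pipeline. This is a minimal sketch under stated assumptions, not Titan's or VTK's actual API: the `Stage` class, its integer payload, and the timestamp scheme are all hypothetical and chosen only to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical pipeline stage (not a Titan/VTK class). Each stage caches its
// output and re-executes only when it, or something upstream, has changed.
class Stage {
public:
    // A stage computes one int from its inputs' outputs and its own parameter.
    using Compute = std::function<int(const std::vector<int>&, int)>;

    explicit Stage(Compute compute) : compute_(std::move(compute)) { Modified(); }

    // Connect an upstream stage; the caller keeps ownership of it.
    void AddInput(Stage* upstream) {
        inputs_.push_back(upstream);
        Modified();
    }

    // Changing a parameter marks this stage as out of date.
    void SetParameter(int value) {
        if (value != parameter_) {
            parameter_ = value;
            Modified();
        }
    }

    // Pull-based update: bring inputs up to date first (this fixes the
    // execution order), then re-execute only if something is newer than
    // the cached output.
    int Update() {
        std::vector<int> values;
        unsigned long newest = modified_time_;
        for (Stage* input : inputs_) {
            values.push_back(input->Update());
            newest = std::max(newest, input->output_time_);
        }
        if (newest > output_time_) {
            output_ = compute_(values, parameter_);
            output_time_ = NextTime();
            ++executions_;
        }
        return output_;
    }

    // Number of times the compute function has actually run.
    int executions() const { return executions_; }

private:
    // Global logical clock shared by all stages.
    static unsigned long NextTime() {
        static unsigned long clock = 0;
        return ++clock;
    }
    void Modified() { modified_time_ = NextTime(); }

    Compute compute_;
    std::vector<Stage*> inputs_;
    int parameter_ = 0;
    int output_ = 0;
    int executions_ = 0;
    unsigned long modified_time_ = 0;
    unsigned long output_time_ = 0;
};
```

In this sketch, calling `Update()` on the last stage pulls data through the chain. Changing a parameter on a downstream stage re-runs only that stage, while changing a parameter on the source re-runs everything that depends on it.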

  4. What is Titan? A flexible parallel pipeline architecture, written in C++ • Table • Tree • Graph • Sparse Matrix • Dense Matrix • Selection Link

  5. What is Titan?

  6. What is Titan? A proven architecture for interactive client-server analysis. Sept 2006: ‘Out of the box’ ParaView was used for post-processing of a magnetics simulation that had 274 million unstructured cells, 5000 files, and multiple terabytes on disk. The user processed and interacted with the data from their Sandia desktop; the server delivered images to the desktop client at 3 to 5 Hz.

  7. Building Applications with Titan. Applications: ParaView 3.0, ThreatView, NGC Prototype 1. Software stack: Language Bindings (TCL, Python, Java, .NET, COM); Parallel Client/Server Library (C++); TITAN Toolkit (C++); VTK (C++)

  8. Building Applications with Titan. Applications: CargioSQL, NGC Prototype 2. Software stack: Language Bindings (TCL, Python, Java, .NET, COM); Parallel Client/Server Library (C++); TITAN Toolkit (C++); VTK (C++)

  9. Titan Language Bindings: Tcl/Tk on Linux; Python on OSX; Java on Vista; C++ on Windows XP; COM and Excel on Vista; .NET on Vista

  10. Titan Programmable Filters. Each filter takes Titan Data In (Tables, Trees, Graphs, etc) and produces Titan Data Out (New / Updated Tables, Trees, Graphs, etc):
      • R Programmable Filter (R Inside): User-supplied R expression
      • Java Programmable Filter (Java™ Inside): User-supplied Java Code
      • XML Programmable Filter (XQuery / XSLT Inside): User-supplied XQuery / XSLT expression
      Not shown: Python & Matlab™ programmable filters …
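The "data in, user-supplied logic, data out" pattern shared by these filters can be sketched in plain C++. Everything here is an assumption made for illustration: the `Table` alias, the `ProgrammableFilter` class, and the `AddRowSums` callback are hypothetical names, and a C++ callable stands in for the embedded R, Java, or XQuery interpreter that the real filters host.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Titan table: named columns of doubles.
using Table = std::map<std::string, std::vector<double>>;

// Hypothetical programmable filter: the user supplies the logic, the filter
// supplies the plumbing (input table in, new or updated table out).
class ProgrammableFilter {
public:
    using Script = std::function<Table(const Table&)>;

    void SetScript(Script script) { script_ = std::move(script); }

    // With no script set, the input passes through unchanged.
    Table Execute(const Table& input) const {
        return script_ ? script_(input) : input;
    }

private:
    Script script_;
};

// Example user-supplied logic: copy the input and append a "row_sum" column.
// Assumes every column has the same number of rows as the first column;
// shorter columns simply contribute nothing for their missing rows.
Table AddRowSums(const Table& input) {
    Table output = input;
    const std::size_t rows = input.empty() ? 0 : input.begin()->second.size();
    std::vector<double> sums(rows, 0.0);
    for (const auto& column : input) {
        for (std::size_t r = 0; r < rows && r < column.second.size(); ++r) {
            sums[r] += column.second[r];
        }
    }
    output["row_sum"] = sums;
    return output;
}
```

A caller would build a `Table`, pass `AddRowSums` (or any other callable with the same signature) to `SetScript`, and read the new column from the table returned by `Execute`.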

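  11. R Programmable Filter Example

      // Run a user-supplied R script on the table produced by `source`.
      vtkRcalculatorFilter* rcf = vtkRcalculatorFilter::New();
      rcf->SetInput(source->GetOutput());
      rcf->PutTable("x");  // expose the input table to R as "x"
      rcf->SetRscript(
        "m = do.call(cbind,x)\n"            // bind the table columns into a matrix
        "cl <- kmeans(m,3)\n"               // k-means with three clusters
        "m = cbind(m,cl$cluster)\n"         // append the cluster labels as a column
        "colnames(m)[4] = \"cluster\"\n");  // name the fourth column (assumes three input columns)
      rcf->GetTable("m");  // return the R variable "m" as the output table
      sink->SetInput(rcf->GetOutput());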
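For readers without R at hand, the computation the script asks for (cluster the rows into three groups, then append the label as an extra column) can be sketched directly in C++. This is an illustration under stated assumptions, not what R's `kmeans` does internally: it uses plain Lloyd iterations with the first k rows as starting centers, whereas R picks random starting centers and uses a different algorithm by default. The names `KMeansLabels` and `AppendClusterColumn` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

using Row = std::vector<double>;

// Squared Euclidean distance; assumes both rows have the same length.
double SquaredDistance(const Row& a, const Row& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Lloyd's k-means with deterministic initialization (first k rows).
// Returns one 1-based label per row, matching R's 1-based cl$cluster.
// Returns all zeros if the input is empty or k is not in [1, rows.size()].
std::vector<int> KMeansLabels(const std::vector<Row>& rows, std::size_t k,
                              int max_iterations = 100) {
    std::vector<int> labels(rows.size(), 0);
    if (rows.empty() || k == 0 || k > rows.size()) return labels;

    std::vector<Row> centers(rows.begin(),
                             rows.begin() + static_cast<std::ptrdiff_t>(k));
    for (int iteration = 0; iteration < max_iterations; ++iteration) {
        // Assignment step: attach every row to its nearest center.
        bool changed = false;
        for (std::size_t r = 0; r < rows.size(); ++r) {
            int best = 0;
            double best_distance = std::numeric_limits<double>::max();
            for (std::size_t c = 0; c < k; ++c) {
                const double d = SquaredDistance(rows[r], centers[c]);
                if (d < best_distance) {
                    best_distance = d;
                    best = static_cast<int>(c) + 1;
                }
            }
            if (labels[r] != best) {
                labels[r] = best;
                changed = true;
            }
        }
        if (!changed) break;  // converged

        // Update step: move each center to the mean of its assigned rows.
        std::vector<Row> sums(k, Row(rows[0].size(), 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t r = 0; r < rows.size(); ++r) {
            const std::size_t c = static_cast<std::size_t>(labels[r] - 1);
            ++counts[c];
            for (std::size_t i = 0; i < rows[r].size(); ++i) sums[c][i] += rows[r][i];
        }
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0) continue;  // keep an empty cluster's old center
            for (std::size_t i = 0; i < sums[c].size(); ++i) {
                centers[c][i] = sums[c][i] / static_cast<double>(counts[c]);
            }
        }
    }
    return labels;
}

// Counterpart of m = cbind(m, cl$cluster): append the label as a last column.
std::vector<Row> AppendClusterColumn(std::vector<Row> rows,
                                     const std::vector<int>& labels) {
    for (std::size_t r = 0; r < rows.size() && r < labels.size(); ++r) {
        rows[r].push_back(static_cast<double>(labels[r]));
    }
    return rows;
}
```

The fixed initialization keeps the sketch reproducible; a production implementation would more likely use random or k-means++ seeding.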

  12. Representative Titan Modeling + Analysis Pipelines
      Common stages: Document Ingestion, Feature Extraction, Vector Space Model, SVD Model
      • Similarity Graph Analysis (P2, Red Storm Stunt): “Show me how every document relates to every other document.” Stages: Similarity Matrix, Graph Layout, Rendering
      • Unsupervised Partitioning (P2): “Assign documents to hard clusters based on similarities.” Stages: Unsupervised Partitioning, Graph Layout, Rendering
      • Text Query Analysis (Near Future): “Show me how every document relates to my query.” Stages: Query, Rendering
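The "every document against every other document" step can be made concrete with a small similarity-matrix computation. This is a sketch under stated assumptions rather than Titan's implementation: documents are taken to be already reduced to equal-length numeric vectors (for example term counts from the vector space model), cosine similarity is an assumed choice of measure, and the SVD step is omitted. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;

// Cosine of the angle between two document vectors of equal length.
// Returns 0 when either vector is all zeros.
double CosineSimilarity(const Vector& a, const Vector& b) {
    double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    if (norm_a == 0.0 || norm_b == 0.0) return 0.0;
    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}

// Symmetric n-by-n matrix of pairwise similarities. The work grows with the
// square of the number of documents, since every pair is compared.
Matrix SimilarityMatrix(const Matrix& documents) {
    const std::size_t n = documents.size();
    Matrix similarity(n, Vector(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i; j < n; ++j) {
            const double s = CosineSimilarity(documents[i], documents[j]);
            similarity[i][j] = s;
            similarity[j][i] = s;
        }
    }
    return similarity;
}
```

Entries of the resulting matrix could then be thresholded to form the edges of a similarity graph before layout and rendering.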

  13. Representative Titan Modeling + Analysis Pipelines (same three pipelines as slide 12). Annotation: “Data Reduction / Aggregation”: Terabytes, Gigabytes, Megabytes

  14. Representative Titan Modeling + Analysis Pipelines (same three pipelines as slide 12). Annotation: “Computational Complexity”

  15. Representative Titan Modeling + Analysis Pipelines (same three pipelines as slide 12). Annotation: “Latency”: Hours, Seconds, Milliseconds

  16. Representative Titan Modeling + Analysis Pipelines (same three pipelines as slide 12). Annotation: “Interaction & Feedback”

  17. Where We Are Today Local Filesystem or Database NGC P2 Serial Option: put the entire pipeline in one process (P2) Zero administration, easy to use, online computation, small data.

  18. Modest Client / Server Capability for Organizations without HPC Server Filesystem or Database Titan Web Service NGC P3 HTTPS (Based on ParaText) Client / Server Option: model generation and some analysis on server, remaining analysis in client (P3) Modest administration, slightly more complex, some computation offline, modest data sizes.

  19. Modest Client / Server Capability for Organizations without HPC NGC P3 HTTPS Server Filesystem or Database Web Server NGC P3 (Based on ParaText) Client / Server Option: model generation and some analysis on server, remaining analysis in client (P3) Modest administration, slightly more complex, some computation offline, modest data sizes.

  20. How the NGC will deliver “HPC informatics capabilities that are both usable and useful to analysts.” NGC P3 Portals, shared file system, or shared database. HTTPS HPC Filesystem or Database HPC Service Node HPC Compute Nodes NGC P3 (Based on ParaText and the Red Storm Stunt) Client X HPC Option: model generation on HPC compute nodes, some analysis on a service node, remaining analysis in client (P3) More administration, more offline computation, largest data sizes.

  21. Questions?
