Grid Computing at Texas Tech University using SAS

Ron Bremer

Jerry Perez

Phil Smith

Peter Westfall*

Director, Center for Advanced Analytics and Business Intelligence

Texas Tech University

What is Grid Computing?
  • Grid computing means using multiple computing resources, connected over a network, to perform demanding calculations.
  • Example:
Economies of High Performance Computing
  • Current fastest machine: ~40 Tflops ($300M)
  • 10 Tflops machines: ~$50M
  • Fastest cluster at TTU: 0.1 Tflops (~$0.1M)
  • Speed of a PC: 0.003 Tflops (~$0.001M)

Underused Resources
  • Computers are everywhere, mostly idle!
  • Grid computing leverages unused resources to create an effective “Supercomputer”
  • Total Tflops = (N computers) x (Tflops per computer)
  • For Free! (Almost)
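
The aggregate-speed formula can be tried out with the per-PC figures quoted on the "Economies of High Performance Computing" slide. The following is a minimal Python sketch, not part of the original talk: the function names and the 100-PC example are illustrative assumptions, and the result is an ideal upper bound that ignores communication overhead and machines that are busy or offline.

```python
import math

# Per-PC figures quoted on the "Economies of High Performance Computing" slide.
PC_TFLOPS = 0.003        # speed of one PC, in Tflops
PC_COST_MUSD = 0.001     # cost of one PC, in millions of dollars


def grid_tflops(n_computers, tflops_per_computer=PC_TFLOPS):
    """Ideal aggregate speed: (N computers) x (Tflops per computer)."""
    return n_computers * tflops_per_computer


def pcs_needed(target_tflops, tflops_per_computer=PC_TFLOPS):
    """Smallest number of PCs whose ideal combined speed reaches the target."""
    return math.ceil(target_tflops / tflops_per_computer)


if __name__ == "__main__":
    # Assumption for illustration: a lab of 100 idle PCs.
    print(f"100 PCs: {grid_tflops(100):.2f} Tflops (ideal)")
    print(f"PCs needed to match the 0.1 Tflops cluster: {pcs_needed(0.1)}")
```

Under these idealized assumptions, a few dozen idle PCs already match the quoted speed of the fastest TTU cluster.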
Grid Initiatives at TTU and in Texas
  • HipCAT – High Performance Computing Across Texas
  • TIGRE – Texas Internet Grid for Research and Education
  • SORCER – Service-ORiented Computing EnviRonment (TTU CS dept.)
  • SAS/Connect grid
HipCAT
  • Consortium of Texas institutions working together to use
    • High performance computing
    • Clusters
    • Massive data storage
    • Scientific visualization
    • Grid computing.
  • Director: Phil Smith, Texas Tech University
  • Members:
    • Baylor College of Medicine
    • Rice University
    • Texas A&M University
    • Texas Tech University
    • University of Houston
    • University of Texas
    • University of Texas at Austin
    • University of Texas at Arlington
    • University of Texas at El Paso
    • University of Texas Southwestern Medical Center
TIGRE
  • Texas Internet Grid for Research & Education
  • Two year project involving: UT, TTU, UH, Rice, and TAMU
  • Funding announced by the Governor in September
  • TIGRE will develop a grid software stack and policies and procedures to facilitate Texas grid computing efforts.
Grid Software Products Used at TTU
  • AVAKI
  • Globus
  • Jini Networking Technology
  • SAS/Connect (MPConnect), %Distribute macro
Benefits of SAS
  • Ease of Use (relative to other grid products)
  • Available and applicable for many scientists in their respective fields
  • Flexibility
    • Database (DATA step, PROC SQL)
    • Math/Optimization (SAS/IML, SAS/OR)
    • Stat (SAS/STAT, SAS/ETS)
Problems Amenable to SAS Grid
  • The job consists of replicates of a fundamental task
  • The fundamental task is time-consuming, and many replicates are needed
  • Examples
    • Simulation
    • Astrophysics
    • Bioinformatics
    • Ensembles of predictive models
Success Story
  • Financial Event Studies
    • Developed simulation tool to detect events
    • Simulated its performance
    • 25 hours of computing finished in 40 minutes
    • Published in J. Fin. Econometrics
  • Old system: “Sneaker grid”
Another Success Story: Portfolio Analysis
  • 300 portfolios of 50 securities each, formed by randomly sampling securities from the CRSP daily database (7.23 gigabytes)
  • 15 models fitted for each of the 50 securities (PROC AUTOREG of SAS/ETS), under each of 169 treatment settings.
  • 126,750 models and associated data steps per portfolio.
  • 500 days of continuous computing time reduced to two weeks.
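
The figures on the two success-story slides can be cross-checked with a few lines of arithmetic. This is a small Python sketch using only the numbers quoted above; the variable names are illustrative, and the speedups are simple ratios of the quoted times, not measured benchmarks.

```python
# Figures quoted on the two "Success Story" slides.
models_per_security = 15
securities_per_portfolio = 50
treatment_settings = 169

# 15 models x 50 securities x 169 settings, per portfolio.
models_per_portfolio = (
    models_per_security * securities_per_portfolio * treatment_settings
)


def speedup(time_before, time_after):
    """Ratio of two elapsed times expressed in the same unit."""
    return time_before / time_after


event_study_speedup = speedup(25 * 60, 40)  # 25 hours vs. 40 minutes, in minutes
portfolio_speedup = speedup(500, 14)        # 500 days vs. two weeks, in days

if __name__ == "__main__":
    print(f"models per portfolio: {models_per_portfolio:,}")
    print(f"event study speedup:  {event_study_speedup:.1f}x")
    print(f"portfolio speedup:    {portfolio_speedup:.1f}x")
```

The product reproduces the 126,750 models per portfolio stated on the slide, and both jobs show a speedup of roughly 35 to 40 times.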
Notoriety
  • Web articles appeared on SAS, Grid Today, and Next-Gen Data Forum
  • Interviewed by Database Trends and Applications
SAS Grid Structure
  • Client connects to host machines
  • Client sends replicates of fundamental task (“chunks”) to hosts
  • Hosts process chunks and send the results back to the client
  • Client combines chunks and summarizes
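
The client/host pattern above can be sketched in a few lines. This is a hedged Python illustration of the scatter/gather idea only: it is not SAS/CONNECT code, the names `run_chunk` and `run_on_grid` are hypothetical, local worker threads stand in for remote host machines, and the "fundamental task" is a placeholder (drawing one uniform random number).

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_chunk(seed, n_reps):
    """Host side: run n_reps replicates of the fundamental task.

    The task is a placeholder (draw one uniform number). Each chunk gets
    its own seed so that its result is reproducible.
    """
    rng = random.Random(seed)
    total = sum(rng.random() for _ in range(n_reps))
    return n_reps, total


def run_on_grid(n_chunks, reps_per_chunk, n_hosts=4):
    """Client side: send chunks to hosts, then combine what comes back."""
    with ThreadPoolExecutor(max_workers=n_hosts) as hosts:
        futures = [hosts.submit(run_chunk, seed, reps_per_chunk)
                   for seed in range(n_chunks)]
        pieces = [f.result() for f in futures]  # gathered in submission order
    n_total = sum(n for n, _ in pieces)
    grand_total = sum(t for _, t in pieces)
    return n_total, grand_total / n_total


if __name__ == "__main__":
    n, mean = run_on_grid(n_chunks=20, reps_per_chunk=5_000)
    print(f"{n} replicates, overall mean = {mean:.4f}")
```

Each chunk returns a small summary (a count and a sum) instead of raw data, so the client's combining step stays cheap no matter how many replicates each host ran.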
SAS Farm
  • 100 SAS machines in student lab
  • 2.66 GHz per node
  • All have SAS software installed
  • SAS “Spawner” must be started on all
  • Avaki is also installed; it helps diagnose problems
Load Balancing
  • The grid balances load automatically by farming out each independent task to the next available resource.
  • Students never noticed that their machines were being used!
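
Why "next available resource" dispatching balances load can be seen in a small deterministic simulation. This is a Python sketch under stated assumptions, not the scheduling code of any SAS product: the function `simulate_dispatch`, the task costs, and the host speeds are all hypothetical.

```python
import heapq


def simulate_dispatch(task_costs, host_speeds):
    """Give each task, in order, to the host that becomes free first.

    task_costs: work units per task; host_speeds: work units per second.
    Returns (makespan_in_seconds, tasks_done_per_host).
    """
    # Heap of (time the host becomes free, host index); all free at time 0.
    free_at = [(0.0, h) for h in range(len(host_speeds))]
    heapq.heapify(free_at)
    done = [0] * len(host_speeds)
    makespan = 0.0
    for cost in task_costs:
        t, h = heapq.heappop(free_at)        # next available host
        finish = t + cost / host_speeds[h]
        done[h] += 1
        makespan = max(makespan, finish)
        heapq.heappush(free_at, (finish, h))
    return makespan, done


if __name__ == "__main__":
    # Hypothetical farm: the third host runs at half speed,
    # for example because a student is using it.
    makespan, done = simulate_dispatch([1.0] * 30, [1.0, 1.0, 0.5])
    print(f"makespan = {makespan} s, tasks per host = {done}")
```

In this example the two full-speed hosts complete 12 tasks each, the half-speed host completes 6, and all 30 tasks finish in 12 seconds. A fixed split of 10 tasks per host would instead take 20 seconds, because the slow host would need 2 seconds for each of its 10 tasks.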
Simulation-Based Methods

  • PROC MULTTEST of SAS/STAT (first hard-coded bootstrap?)

Simulation-Based Methods, II
  • ADJUST=SIMULATE in PROC GLM and PROC MIXED
  • Posterior simulation in PROC MIXED
Toy Example – Testing Random Number Generators
  • Random number generators often fail to provide independent numbers.
  • Test case: U1 and U2 are Uniform on (0,1).
  • If they are independent, then E{6(U1 - U2)^2} = 1.00.
  • Check: generate many pairs and report the average of 6(U1 - U2)^2 (should be close to 1.000000)
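
The expected value follows from independence: E{(U1 - U2)^2} = Var(U1) + Var(U2) = 1/12 + 1/12 = 1/6, so multiplying by 6 gives 1. The check itself can be sketched as below. This is a minimal Python illustration, not the SAS program used in the talk: it tests Python's built-in generator and not a SAS one, and the function name, seed, and number of pairs are arbitrary assumptions.

```python
import random


def uniform_pair_check(n_pairs, seed=12345):
    """Average of 6*(U1 - U2)**2 over n_pairs pseudo-random pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        u1, u2 = rng.random(), rng.random()
        total += 6.0 * (u1 - u2) ** 2
    return total / n_pairs


if __name__ == "__main__":
    n = 1_000_000
    print(f"average over {n:,} pairs: {uniform_pair_check(n):.6f}")
```

Because the pairs are independent replicates, the loop splits naturally into chunks, each with its own seed, that separate hosts can run before the client averages the partial results.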
Startup (Windows)

1. Start Spawner:

C:\Program Files\SAS\SAS 9.1>spawner -i -comamid tcp

2. Activate Spawner:

3. Set batch log in permissions:

The %Distribute Macro
  • Written by Cheryl Doninger and Randy Tobias
  • File: http://support.sas.com/rnd/scalability/papers/distribute.zip
  • Supporting document: http://support.sas.com/rnd/scalability/papers/distConnect0401.pdf

Problems We Have Experienced
  • Random crashes (client as well as hosts)
  • Diagnosing errors
  • I/O problems
  • Windows Service Pack 2 Firewall
  • Social issues (grid involves people!)
Future Plans
  • Support from business and government:
    • grid-enabled bioinformatics
    • business intelligence/data mining
  • Support HPC at TTU and in Texas