
Flights of the Condor: War Stories, Challenges, and Solutions



Flights of the Condor: War Stories, Challenges, and Solutions. Jason Stowe, Condor Week 2009, April 22nd, 2009. Coming to Condor Week since 2005. Started as a User. Users hunger for features.

Presentation Transcript

Flights of the Condor: War Stories, Challenges, and Solutions

Jason Stowe

Condor Week 2009

April 22nd, 2009


Coming to Condor Week since 2005. Started as a User


  • AccountingGroups (2004/2005)
  • Configuration w/Pipes (2005/2006)
  • GroupResourcesUsed (2006/2007)
  • Condor in Cloud (2007/2008)
  • Resource Weights (2008/2009)

Based upon customer requests


Focus on software development for managing Condor at any scale, and provide services that complement the technology



Users like Condor because... It’s open, it works, flexible, no lock-in API/Operating System, and...

Businesses that use Condor (corporations)


The Community


Today, let’s talk about a few challenges, solutions


War Story #1: Compute & Data


Whenever you find or solve a computation problem, you discover a data problem.


“Dark” or Latent, Unused Storage on any OS/Device


Empty space dispersed across machines in unusable sizes



So we looked at Hadoop


New type of storage: Aggregated or “Cloud” Storage


Block Store Architecture
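
To make the aggregation idea concrete, here is a minimal sketch, not the actual Hadoop or CycleComputing design: a file is cut into fixed-size blocks, and each block goes to whichever machine still has the most room, so free space too small to hold the whole file still gets used. The machine names, sizes, block size, and the place_blocks helper are all hypothetical.

```python
from typing import Dict, List, Tuple


def place_blocks(file_size: int, block_size: int,
                 free_space: Dict[str, int]) -> List[Tuple[int, str]]:
    """Assign each block of a file to a machine with enough free space.

    Returns (block_index, machine) pairs. Placement is greedy: each block
    goes to the machine with the most remaining space. Illustrative only.
    """
    remaining = dict(free_space)  # work on a copy, leave the input untouched
    n_blocks = -(-file_size // block_size)  # ceiling division
    placement = []
    for index in range(n_blocks):
        # The last block may be shorter than block_size.
        size = min(block_size, file_size - index * block_size)
        machine = max(remaining, key=remaining.get)
        if remaining[machine] < size:
            raise RuntimeError("no machine has room for the next block")
        remaining[machine] -= size
        placement.append((index, machine))
    return placement


# Hypothetical pool: no single machine could hold a 250-unit file.
pool = {"node-a": 100, "node-b": 120, "node-c": 90}
layout = place_blocks(file_size=250, block_size=32, free_space=pool)
print(layout)
```

A real block store would also replicate blocks and track them in a metadata service; the sketch leaves both out to keep the placement step visible.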


But how do we use it?


1.5 years ago: It works well to access it in Java, but what about mounting?


So we tried WebDAV


Next up, open source FUSE driver



Mountable drivers: Linux (FUSE) / Windows (IFS)


CloudFS Architecture



Customers Asked for Surprising Features

  • HTTP/REST Protocols similar to Amazon S3

    Reasons:

    Installing mountable driver across servers/workstations prohibitive

    Want similar interface to various cloud storage providers => Internal Cloud

  • FTP Interface – Because it is simple!


Status Today


Mountable Multi-platform Drivers. Linux: SUSE 10, RHEL/CentOS 4&5, Windows 2k3+, OSX 10.3+



RESTful Storage Service & FTP interface


Management interface for controlling storage features (Integrating with CycleServer)


Looking forward to condor_hadoop!


War Story #2: Cloud Calculations


Condor users: Peak vs. Median usage Problem


Need for compute power comes up suddenly



Condor users balance “We need more servers for big runs” and “Our servers are 40% utilized”


Many ways to solve this problem using EC2


Use cases do exist for adding nodes to a local Condor pool using Amazon EC2


We favored entire pools in cloud


Data Scheduling, Performance issues



Can test CycleServer at a scale our users have and we don’t


Need 1000 node Condor Pool? Wait 15 minutes


Dynamic Resources => Pool can be sized to the jobs
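
One way to read "sized to the jobs" is as a small calculation: given the total work and a deadline, how many nodes should the pool start? The sketch below assumes independent jobs that parallelize perfectly and ignores start-up time and scheduling overhead; the nodes_needed helper and the 8-core node size are hypothetical.

```python
import math


def nodes_needed(core_hours: float, deadline_hours: float,
                 cores_per_node: int) -> int:
    """Smallest node count that finishes `core_hours` of work by the deadline.

    Assumes the work splits evenly across cores with no overhead.
    """
    return math.ceil(core_hours / (deadline_hours * cores_per_node))


# 1000 core-hours of work, due in one hour.
single_core_nodes = nodes_needed(1000, 1, cores_per_node=1)
eight_core_nodes = nodes_needed(1000, 1, cores_per_node=8)
print(single_core_nodes, eight_core_nodes)
```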


1 core x 1000 hrs = 1000 cores x 1 hr = ~$200
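
The arithmetic behind this slide can be sketched as follows. With flat per-core-hour pricing, cost depends only on the product of cores and hours, so spreading the work out buys speed at no extra charge. The rate of about $0.20 per core-hour is an assumption derived from the slide's ~$200 figure, not a quoted price, and run_cost is a hypothetical helper.

```python
def run_cost(cores: int, hours: float, rate_per_core_hour: float) -> float:
    """Cost of holding `cores` cores for `hours` hours at a flat hourly rate."""
    return cores * hours * rate_per_core_hour


# Assumption: ~$200 for 1000 core-hours implies roughly $0.20 per core-hour.
RATE = 200 / 1000

serial = run_cost(cores=1, hours=1000, rate_per_core_hour=RATE)
parallel = run_cost(cores=1000, hours=1, rate_per_core_hour=RATE)
print(serial, parallel)
```

Real billing rounds each instance up to whole hours and charges for data transfer, so the equality holds only approximately.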


Sounds good, but how do we do this for a Workflow like BLAST?


From e-Science 2008: For 64x the processors, Hadoop running BLAST: 57x; mpiBLAST: 52.4x


High-CPU Amazon EC2 nodes have best price/performance


Scalability: 2x CPUs = 1.9825x; 64 CPUs = 60.7x Speed-up
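
The speed-up figures on these slides are easier to compare as parallel efficiency, that is, speed-up divided by the factor of processors added. The sketch below only restates the slide numbers; the parallel_efficiency helper and the labels are illustrative names, not part of the talk.

```python
def parallel_efficiency(speedup: float, processors: float) -> float:
    """Fraction of ideal linear speed-up actually achieved."""
    return speedup / processors


# (speed-up, processor factor) pairs taken from the slides.
figures = {
    "Condor pool, 2 CPUs": (1.9825, 2),
    "Condor pool, 64 CPUs": (60.7, 64),
    "Hadoop running BLAST, 64x": (57.0, 64),
    "mpiBLAST, 64x": (52.4, 64),
}

efficiencies = {
    label: parallel_efficiency(speedup, processors)
    for label, (speedup, processors) in figures.items()
}
for label, value in efficiencies.items():
    print(f"{label}: {value:.1%}")
```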


Why High Throughput leads to Efficient Computing


Another User: Worked with Varian - Mass Spectrometers, Other High-Tech Lab Equipment


Problem: Coming up on a conference, needed to run a large simulation


Six Weeks on an internal Condor pool


Deployed a Condor pool in CycleCloud


Same 6-week Job


Ran < 1 Day


War Story #3: Management


Condor Tutorial mentions “Why use a personal Condor?” i.e. Condor on few nodes...


Condor on 1 computer gets you policies, fault-tolerance, etc.


Similarly, management issues come up even on small pools



Managing Configuration Files (our Config with Pipes CW2006)
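
As a rough illustration, assuming "Config with Pipes" refers to Condor's ability to run a program and read its output as configuration when a config file name ends in a vertical bar (for example LOCAL_CONFIG_FILE = /path/to/script |), a generator script might look like the sketch below. The host names, the settings table, and the render_config helper are hypothetical; this is not CycleComputing's tool.

```python
import socket
import sys

# Hypothetical per-host overrides; a real site might read these from a database.
HOST_SETTINGS = {
    "render01": {"NUM_CPUS": "4", "START": "TRUE"},
    "desktop17": {"START": "KeyboardIdle > 900"},
}
# Hypothetical default: hosts not listed above do not start jobs.
DEFAULTS = {"START": "FALSE"}


def render_config(hostname: str) -> str:
    """Return Condor-style 'NAME = value' lines for one host."""
    settings = {**DEFAULTS, **HOST_SETTINGS.get(hostname, {})}
    lines = [f"# generated for {hostname}"]
    lines += [f"{name} = {value}" for name, value in sorted(settings.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Condor would read whatever this script prints to standard output.
    short_name = socket.gethostname().split(".")[0]
    sys.stdout.write(render_config(short_name))
```

Keeping the per-host differences in one table, rather than in many hand-edited files, is the point of the approach.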


Exploring ClassAds/LogFiles becomes problematic
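
A small sketch of why tooling helps here: even a simple question such as "how many jobs does each user have?" means parsing attribute/value text before anything can be counted. The code assumes the long ClassAd text form, one "Name = value" pair per line with a blank line between ads; the sample ads and the parse_classads helper are made up for illustration, and values are kept as plain strings rather than evaluated as ClassAd expressions.

```python
from collections import Counter
from typing import Dict, List


def parse_classads(text: str) -> List[Dict[str, str]]:
    """Split blank-line-separated 'Name = value' blocks into dictionaries."""
    ads: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            # A blank line closes the ad being built, if any.
            if current:
                ads.append(current)
                current = {}
            continue
        name, separator, value = line.partition(" = ")
        if separator:
            current[name] = value.strip('"')  # crude unquoting of strings
    if current:
        ads.append(current)
    return ads


# Hypothetical job ads (JobStatus 1 = idle, 2 = running).
SAMPLE = """
Owner = "alice"
JobStatus = 2
ClusterId = 101

Owner = "bob"
JobStatus = 1
ClusterId = 102

Owner = "alice"
JobStatus = 1
ClusterId = 103
"""

ads = parse_classads(SAMPLE)
jobs_per_owner = Counter(ad["Owner"] for ad in ads)
print(jobs_per_owner)
```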



Man-decades on development of tools to assist running Condor


Have demo against Madison pool. Come see me. We’d love more use cases


Questions? Thank you

For more information go to:

http://www.cyclecomputing.com

We constantly see opportunities for talented Condor folks, so please feel free to contact us!

Jason Stowe

jstowe - cyclecomputing.com