
Cloud Computing at Amazon’s EC2

Cloud Computing at Amazon’s EC2. Joe Steele jrsteele@unomaha.edu. Grid Computing. Shared resources – many computer clusters transferring data and running jobs. Geographically distributed. Cross-grid collaboration.


Presentation Transcript


  1. Cloud Computing at Amazon’s EC2 Joe Steele jrsteele@unomaha.edu

  2. Grid Computing Shared resources: many computer clusters transferring data and running jobs. Geographically distributed. Cross-grid collaboration. The idea is analogous to the electric power grid, where generators are distributed but users draw power without caring about its source or location.

  3. LHC Computing Grid (LCG)

  4. Cloud Computing What if I don’t have my own cluster? Cloud computing refers to a cluster that invites users to send jobs (SaaS: Software as a Service). It offers computation, software, data access, and storage services that do not require user knowledge of the location or configuration of the system. The term comes from the cloud drawing once used to represent the telephone network and later the internet.

  5. Cloud Computing

  6. Cloud Computing Private companies run large data centers. In operational cost, 50k servers are 5 to 7 times cheaper per CPU than 1k servers. Amazon: • $0.085/cpu-hour • No minimum or maximum • No contract
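To make the pay-per-use pricing concrete, here is a minimal sketch of the arithmetic. The helper name `ec2_cost` is hypothetical, and the sketch assumes one instance is billed as one CPU at the $0.085/cpu-hour rate quoted on the slide; current prices differ, so pass your own rate as the third argument.

```shell
#!/bin/sh
# Hypothetical helper: estimate a bill as instances x hours x hourly rate.
# The default rate is the $0.085/cpu-hour figure from the slide (an assumption
# for illustration; check the pricing page for real numbers).
ec2_cost() {
    awk -v n="$1" -v h="$2" -v r="${3:-0.085}" \
        'BEGIN { printf "%.2f\n", n * h * r }'
}

# One instance left running for a 30-day month (720 hours).
ec2_cost 1 720     # prints 61.20

# Ten instances for a 24-hour job.
ec2_cost 10 24     # prints 20.40
```

The second case shows why there is no penalty for scaling out: ten machines for one day cost the same as one machine for ten days.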

  7. Amazon EC2 aws.amazon.com Computing cluster: create an account and provide a credit card. Let Amazon take care of the hardware.

  8. Cloud BioLinux JCVI (J. Craig Venter Institute) created a cloud version of the NERC BioLinux VM: an Ubuntu machine with over 100 NEBC software packages. The image is stored at EC2, and EC2 users can copy it at no charge.

  9. http://aws.amazon.com

  10. Create a new account

  11. Enter your information

  12. Sign up for an EC2 account

  13. Click on “Sign up for Amazon EC2”

  14. EC2 Account • Signing up for EC2 automatically signs you up for Amazon Simple Storage Service and Amazon Virtual Private Cloud. • Requires credit card information. • No charges until you start using the services. • Amazon will email you your Access Identifiers and instructions for your first login.

  15. Click on “AWS Management Console”

  16. Click the EC2 Tab

  17. Launch an Instance

  18. I recommend biolinux

  19. Click “Select”

  20. Pricing • Amazon has a variety of VM sizes available – pricing is at: http://aws.amazon.com/ec2/pricing/ • You are charged for CPU usage, for data storage, and for data transferred to or from Amazon. Charges continue until a VM is “Terminated”. • You can set up a small test VM for free – select “Micro” for the size.

  21. Kernel defaults are fine

  22. Create a Key Pair

  23. Create security group

  24. Launch

  25. Machine info

  26. “Terminate” to end charges

  27. ssh to the machine A window opens, telling you how to connect to your new VM, e.g.: “ssh -i key_pair_name.pem root@ec2-76-202-01-919.compute-1.amazonaws.com” However, for biolinux, log in as ubuntu instead of root: ssh -i key_pair_name.pem ubuntu@ec2-76-202-01-919.compute-1.amazonaws.com
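The connection step can be sketched as a small script. The helper `ec2_ssh_command` is hypothetical and only prints the command instead of running it, so the sketch is safe to try without a live VM; the key file name and host are the placeholders from the slide. The one addition is the permission fix, since ssh refuses a private key file that other users can read.

```shell
#!/bin/sh
# Hypothetical helper: print the ssh command for an EC2 host.
# Arguments: key file, login user, public DNS name.
ec2_ssh_command() {
    printf 'ssh -i %s %s@%s\n' "$1" "$2" "$3"
}

KEY=key_pair_name.pem                                # placeholder key file
HOST=ec2-76-202-01-919.compute-1.amazonaws.com       # placeholder host from the slide

# Restrict the key to its owner if the file is present in this directory.
if [ -f "$KEY" ]; then
    chmod 400 "$KEY"
fi

# biolinux expects the ubuntu login rather than root.
ec2_ssh_command "$KEY" ubuntu "$HOST"
```

Copy the printed line into the terminal, or replace the last line with the `ssh` call itself once the key and host are real.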

  28. NX Use NX for the graphical display (already built in to biolinux). Open source, can be found at http://www.nomachine.com/ You must ssh into the VM first, using the key pair, then run: > adduser <username> > groups > usermod -G <grp1>,<grp2>,ssh <username>
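The `usermod -G` step lists `<grp1>,<grp2>` because `-G` replaces the user's whole supplementary group list, so the existing groups have to be repeated alongside `ssh`. A minimal sketch of building that list follows; the helper name `groups_with_ssh` and the example group names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn a space-separated group list into the
# comma-separated form that `usermod -G` expects, with "ssh" appended.
groups_with_ssh() {
    # $1: space-separated group names, e.g. the output of `id -nG <username>`
    set -- $1 ssh                      # word-split the list and append ssh
    ( IFS=,; printf '%s\n' "$*" )      # join the words with commas
}

groups_with_ssh "adm sudo"     # prints adm,sudo,ssh
```

On the VM this could be used as `sudo usermod -G "$(groups_with_ssh "$(id -nG <username>)")" <username>`. Where the installed `usermod` supports the `-a` flag, `sudo usermod -a -G ssh <username>` appends the one group without restating the rest.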

  29. Start NX

  30. “Configure”

  31. BioLinux over NX

  32. Data Stored at Amazon There are large datasets stored at Amazon, available for use and (mostly) free of charge. You are charged for any data you copy. Go to http://aws.amazon.com/datasets to search through them.

  33. http://aws.amazon.com/datasets

  34. Datasets Human DNA sequences: • 1000 Genomes Project (7,300 GB) • Ensembl Annotated Human Genome - FASTA (115 GB) • Ensembl Annotated Human Genome - MySQL (200 GB) • GenBank (200 GB) • Human Liver Cohort (Sage Bionetworks) (0.6 GB) • Illumina - Jay Flatley's Human Genome Data Set. (350 GB) • YRI Trio Data - complete genome sequence for three individuals (700 GB) Other (might include some human data): • Ensembl - FASTA DB (100 GB) • Influenza Virus (including Swine Flu) - from NCBI (1 GB) • UniGene - from NCBI (10 GB) • PubChem Library - from NCBI (230 GB)

  35. Public Snapshots

  36. Select “Volumes”

  37. Create a Volume

  38. Instance Information

  39. Attach it to your Instance

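  40. Mount the Volume From your VM: > sudo mkdir /mnt/datasets > sudo mount -t ext3 /dev/sdf /mnt/datasets The 200 GB of GenBank data are now in /mnt/datasets. Run > sudo mkfs -t ext3 /dev/sdf before mounting only for a new, empty volume, because mkfs erases whatever is on the device, including data restored from a snapshot.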
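Before mounting, it can help to confirm that the VM actually sees the attached volume. The sketch below assumes the `/dev/sdf` device name and `/mnt/datasets` mount point from the slide; the helper `volume_attached` is hypothetical, and on some kernels the same volume may show up under a different name such as `/dev/xvdf`.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the path is a block device,
# i.e. the volume is attached and visible to the VM.
volume_attached() {
    [ -b "$1" ]
}

DEVICE=/dev/sdf              # device name chosen when attaching the volume
MOUNT_POINT=/mnt/datasets    # mount point from the slide

if volume_attached "$DEVICE"; then
    echo "$DEVICE is attached; mount it with:"
    echo "  sudo mount -t ext3 $DEVICE $MOUNT_POINT"
else
    echo "$DEVICE not found; attach the volume in the console first"
fi
```

After mounting, `df -h /mnt/datasets` shows the volume's size and how much of it is in use, which is a quick check that the dataset is really there.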
