HUMAN GENOME, June 26, 2000


Celera
  • 175 thousand reads per day
  • 4 groups

1- Transform bacteria, plate, and pick colonies

2- Mini-prep

3- Sequencing reaction and ethanol precipitation

4- Load the ABI Prism 3700 sequencers

65 people

slide3

The human body has approximately 100 trillion cells. Inside each cell is the nucleus, which contains the genome - 46 human chromosomes - that directs human development

slide4

Each chromosome is a long strand of DNA. Chromosomes are made up of millions of copies of the four letters of the genetic code - A, C, G, T - the DNA bases, in which genes and non-coding sections are arranged. Finding the order, or sequence, of these four letters is the goal of the genome project. The entire human genome comprises approximately 3.5 billion bases.

slide5

To read the DNA, the chromosomes are cut into tiny pieces, each of which is read individually. Once all the segments have been read, they are assembled in the correct order.

slide6

Two methods were used:

  • The DNA is fragmented and then assembled in the correct order (Celera)
  • The chromosomes are assembled before the sequence is decoded (Public Consortium)

Methods

slide7

BAC to BAC Sequencing

The BAC to BAC approach first creates a crude physical map of the whole genome before sequencing the DNA

Constructing a map requires cutting the chromosomes into large pieces and figuring out the order of these big chunks of DNA before taking a closer look and sequencing all the fragments

Whole Genome Shotgun Sequencing

The shotgun sequencing method goes straight to the job of decoding, bypassing the need for a physical map

Therefore, it is much faster

slide8

BAC to BAC Sequencing

Several copies of the genome are randomly cut into pieces that are about 150,000 base pairs (bp) long

Whole Genome Shotgun Sequencing

Multiple copies of the genome are randomly shredded into pieces that are 2,000 base pairs (bp) long by squeezing the DNA through a pressurized syringe. This is done a second time to generate pieces that are 10,000 bp long

slide9

BAC to BAC Sequencing

Each of these 150,000 bp fragments is inserted into a BAC, a bacterial artificial chromosome

The whole collection of BACs containing the entire human genome is called a BAC library

Whole Genome Shotgun Sequencing

Each 2,000 and 10,000 bp fragment is inserted into a plasmid

The two collections of plasmids containing 2,000 and 10,000 bp chunks of human DNA are known as plasmid libraries

slide10

BAC to BAC Sequencing

These pieces are fingerprinted to give each piece a unique identification tag that determines the order of the fragments

Fingerprinting involves cutting each BAC fragment with a single enzyme and finding common sequence landmarks in overlapping fragments that determine the location of each BAC along the chromosome

Then overlapping BACs with markers every 100,000 bp form a map of each chromosome

Whole Genome Shotgun Sequencing

This step is not needed in shotgun sequencing
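
Fingerprint comparison can be sketched in a few lines. This is only an illustration under stated assumptions, not the consortium's pipeline: it assumes fingerprints are compared as restriction-fragment length patterns, it uses the EcoRI recognition site GAATTC purely as an example enzyme, and the function names are hypothetical.

```python
from collections import Counter
from typing import List


def digest_fragment_lengths(sequence: str, site: str = "GAATTC",
                            cut_offset: int = 1) -> List[int]:
    """Cut `sequence` at every occurrence of `site` and return the
    sorted fragment lengths (the "fingerprint").

    `cut_offset` is the position inside the site where the cut falls
    (1 for EcoRI, which cuts after the leading G).
    """
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return sorted(end - begin for begin, end in zip(boundaries, boundaries[1:]))


def shared_fragment_count(lengths_a: List[int], lengths_b: List[int]) -> int:
    """Count fragment lengths present in both fingerprints (multiset intersection)."""
    common = Counter(lengths_a) & Counter(lengths_b)
    return sum(common.values())
```

Two BACs that share many fragment lengths are candidates for overlap. A real comparison would allow a tolerance on measured sizes instead of requiring exact equality.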

slide11

BAC to BAC Sequencing

Each BAC is then broken randomly into 1,500 bp pieces and placed in another artificial piece of DNA called M13

This collection is known as an M13 library

Whole Genome Shotgun Sequencing

This step is not needed in shotgun sequencing

slide12

BAC to BAC Sequencing

All the M13 libraries are sequenced

500 bp from one end of the fragment are sequenced, generating millions of sequences

Whole Genome Shotgun Sequencing

Both the 2,000 and the 10,000 bp plasmid libraries are sequenced

500 bp from each end of each fragment are decoded, generating millions of sequences

Sequencing both ends of each insert is critical for assembling the entire chromosome
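
Reading both ends of an insert can be sketched as below. The sketch assumes the second read comes from the opposite strand (hence the reverse complement) and ignores sequencing errors; the function names are hypothetical. It also shows why the two reads of a pair end up a known distance apart: the unread middle of the insert is the insert length minus the two reads.

```python
from typing import Tuple

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence: str) -> str:
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def end_reads(insert: str, read_length: int = 500) -> Tuple[str, str]:
    """Return the pair of reads taken from the two ends of one insert."""
    if len(insert) < 2 * read_length:
        raise ValueError("insert is too short for two non-overlapping reads")
    forward_read = insert[:read_length]
    reverse_read = reverse_complement(insert[-read_length:])
    return forward_read, reverse_read


def unread_gap(insert_length: int, read_length: int = 500) -> int:
    """Number of bases between the two end reads of an insert."""
    return insert_length - 2 * read_length
```

With 500 bp reads, `unread_gap(2000)` is 1,000 and `unread_gap(10000)` is 9,000.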

slide13

BAC to BAC Sequencing

These sequences are fed into a computer program called PHRAP that looks for common sequences that join two fragments together

Whole Genome Shotgun Sequencing

Computer algorithms assemble the millions of sequenced fragments into a continuous stretch resembling each chromosome (Assembler)

slide14

INFORMATICS

1- Check the sequence quality

Average accuracy of 99.5% (1 error in 200), with a target of 99.99%

2- Vector removal

3- BLAST to remove mitochondrial sequences (2114) and non-human sequences - vector and E. coli genome (713)
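
The accuracy figures in step 1 can be converted to error rates with a short calculation. The Phred-style score Q = -10 log10(error probability) is added here as an assumption for comparison; the slides do not say which quality scale was used, and the function names are hypothetical.

```python
import math


def bases_per_error(accuracy_percent: float) -> float:
    """Expected number of bases read per erroneous base."""
    error_probability = 1.0 - accuracy_percent / 100.0
    return 1.0 / error_probability


def phred_quality(accuracy_percent: float) -> float:
    """Phred-style score for a given accuracy (assumed scale, see lead-in)."""
    error_probability = 1.0 - accuracy_percent / 100.0
    return -10.0 * math.log10(error_probability)
```

On this scale 99.5% accuracy is roughly Q23 and the 99.99% target is Q40, that is, one error in 10,000 bases instead of one in 200.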

Assembler

slide15

The Assembler compares the millions of fragments against each other, finding all common segments between two fragments that are at least 40 letters long. These overlaps could not have occurred by chance, and they become the foundation of assembly

Of these overlaps, some are "true" and some are "repeat-induced"
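
The overlap test can be illustrated with a brute-force sketch. It assumes exact matching between the end of one fragment and the start of another, whereas a production assembler would index the fragments and tolerate sequencing errors; the function names are hypothetical. The second function shows why a 40-letter match is not expected by chance, assuming the four bases are equally likely and independent.

```python
def longest_overlap(left: str, right: str, min_length: int = 40) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no overlap of at least `min_length` letters exists.
    """
    longest_possible = min(len(left), len(right))
    for length in range(longest_possible, min_length - 1, -1):
        if left.endswith(right[:length]):
            return length
    return 0


def chance_match_probability(length: int) -> float:
    """Probability that two random stretches of `length` letters are identical,
    assuming uniform, independent bases (a simplification of real genomes)."""
    return 0.25 ** length
```

Under that simplification `chance_match_probability(40)` is below 1e-23. Real genomes are not random, which is why some overlaps are repeat-induced rather than true.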

slide16

The assembler now searches for groups of overlapping fragments that (1) together spell a common sequence, and (2) do not overlap fragments with sequences that dispute, or contest, the common sequence

Such uncontested groups of fragments are assembled into what are called “unitigs”

Each unitig contains on average about 30 fragments

slide17

The assembler identifies incorrectly assembled unitigs that spell repeats by looking at the "depth" of the total number of fragments in the unitig

A statistic called the Discriminator is used to find stacks of fragments that are suspiciously high

Correctly assembled unitigs are called U-unitigs ("U" for unique), and all other unitigs are set aside
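
The slides do not define the Discriminator. One simple way to score depth, offered here as an assumption and not as Celera's formula, is a Poisson log-odds comparing "this unitig covers one copy of the genome" with "this unitig is two collapsed copies". If fragments start at rate F/G per base (F fragments, genome size G), a unitig of length L holding k fragments gives ln(P_one / P_two) = (F/G)·L - k·ln 2. All names below are hypothetical.

```python
import math


def unique_log_odds(unitig_length: int, fragment_count: int,
                    total_fragments: int, genome_size: int) -> float:
    """Natural-log odds that a unitig is single-copy rather than a
    two-copy repeat, under a Poisson model of fragment start positions.

    Positive values favor a unique unitig; negative values indicate a
    stack of fragments deeper than one copy would explain.
    """
    arrival_rate = total_fragments / genome_size  # fragment starts per base
    expected_fragments = arrival_rate * unitig_length
    return expected_fragments - fragment_count * math.log(2)
```

With an expected depth of 30 fragments, a unitig holding 30 fragments scores about +9.2, while one holding 60 scores about -11.6. The cutoff above which a unitig counts as a U-unitig would be a tuning choice.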

slide18

The Scaffolding stage begins

Critical to this stage is the fact that most of the fragments were grabbed from the genome in pairs during sequencing. Known as mate pairs, these fragments are always separated by the same number of letters, either about 1,000 or about 9,000

A contiguous sequence of ordered unitigs is a contig. During scaffolding, the assembler orients contigs using mates

Mate pairs stick together and remain the same distance apart. If mates from the same pair lie on different contigs, for instance, the contigs are neighbors about 99% of the time
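
Counting mate links between contigs can be sketched as follows. The data layout (a mapping from read name to contig name, plus a list of mate pairs) and the function name are assumptions made for illustration; orientation and distance checks are left out.

```python
from collections import Counter
from typing import Dict, List, Tuple


def count_contig_links(read_to_contig: Dict[str, str],
                       mate_pairs: List[Tuple[str, str]]) -> Counter:
    """Count, for every pair of contigs, how many mate pairs span them."""
    links: Counter = Counter()
    for first_read, second_read in mate_pairs:
        first_contig = read_to_contig.get(first_read)
        second_contig = read_to_contig.get(second_read)
        # Skip pairs with an unplaced read and pairs inside a single contig.
        if first_contig is None or second_contig is None:
            continue
        if first_contig == second_contig:
            continue
        links[tuple(sorted((first_contig, second_contig)))] += 1
    return links
```

If a single link is right about 99% of the time and links are treated as independent (a simplification), two links agreeing on the same pair of contigs would both be wrong only about once in 10,000 cases.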

slide19

As the assembler compares more and more mates, the contig geography becomes apparent. Sets of contigs that are ordered and oriented using enforcing pairs are called scaffolds. At this point, the scaffolding is continuous except for gaps

Some of these gaps are due to missing sequence; this is unavoidable. Other gaps contain repetitive sequence that can now be closed using the unitigs that were set aside earlier by the Discriminator

slide20

The assembler classifies repeat sequences by size and reliability, calling the largest and most reliable repeats "rocks"

Rocks are tossed into the gaps first, to be followed by the lesser "stones," and finally the smallest and least reliable pieces, "pebbles"

Rocks must be linked to the contigs on either side of a gap by two or more mates

slide21

Stones are linked to the contigs by only one mate. Their position in a gap is confirmed by overlaps

Pebbles are placed in a gap based on the quality of the overlaps between each other and the adjoining contigs
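
The three tiers can be summarized as a small decision rule. This sketch uses only the criteria stated on the slides (number of mate links and overlap support); the "unplaced" label, the boolean overlap flag, and the function name are hypothetical simplifications.

```python
FILL_ORDER = ["rock", "stone", "pebble"]  # most reliable first


def classify_gap_filler(mate_links: int, has_confirming_overlap: bool) -> str:
    """Classify a set-aside unitig that might fill a scaffold gap."""
    if mate_links >= 2:
        return "rock"      # linked to the flanking contigs by two or more mates
    if mate_links == 1 and has_confirming_overlap:
        return "stone"     # one mate link, position confirmed by overlaps
    if mate_links == 0 and has_confirming_overlap:
        return "pebble"    # placed on overlap quality alone
    return "unplaced"      # hypothetical label: not enough evidence


def fill_priority(label: str) -> int:
    """Position of a label in the fill order (lower is placed earlier)."""
    return FILL_ORDER.index(label)
```

Sorting candidates by `fill_priority` reproduces the order described above: rocks first, then stones, then pebbles.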

slide22

ROCKVILLE, MD, June 26, 2000

  • CELERA GENOMICS COMPLETES THE FIRST ASSEMBLY OF THE HUMAN GENOME
        • Assembled Genome Has 3.12 Billion Base Pairs

Article

slide23

Celera's paired end-sequencing strategy has produced paired sequence reads that cover the human genome 35.6 times

The calculation to perform the assembly involved 500 million trillion base to base comparisons requiring over 20,000 CPU hours on Celera's supercomputer

The method used by Celera has determined the genetic code of five individuals: three females and two males who have identified themselves as Hispanic, Asian, Caucasian, or African American

Celera

27.27 million reads

Mean of 543 base pairs per read

16 libraries from 5 donors

Assuming a 2.9 Gbp genome, coverage was 5.1-fold in terms of sequence and 38.7-fold in terms of clones.
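
The sequence coverage figure follows directly from the numbers above: total bases read divided by genome size. The function name is hypothetical. The clone coverage is not reproduced here because it depends on the insert sizes of each library, which the slide does not list.

```python
def sequence_coverage(read_count: int, mean_read_length: float,
                      genome_size: int) -> float:
    """Average number of times each base of the genome is read."""
    return read_count * mean_read_length / genome_size


coverage = sequence_coverage(
    read_count=27_270_000,        # 27.27 million reads
    mean_read_length=543,         # base pairs per read
    genome_size=2_900_000_000,    # assumed 2.9 Gbp genome
)
print(f"{coverage:.1f}x")  # prints 5.1x
```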

Public

4.44 Gbp of sequence

vector sequence at the ends

2.6 Mbp phase 3

61 Mbp phase 1 and 2 Junk

16 Mbp phase 0

20% finished

Total 75% Draft

4.36 Gbp 5% unique sequences

Celera vs. Public
slide25

Phases of the Public Genome

Phase 0 Read (cuts the BAC and covers it 1X)

Phase 1-2 Read to read and BAC to BAC

(1) BAC contigs go to GenBank

(2) BACs ordered into larger files

Phase 3 BACs ordered and complete

slide27

Celera's Genome

Good for Dedel?