
So you think you can model the genome…

Jeltje van Baren, April 27 2005. Overview: there’s more to the genome than genes; the formation of repetitive elements (LINEs); genome rearrangements and evolution (the Hox genes); genes without function (pseudogenes).


Presentation Transcript


  1. So you think you can model the genome… Jeltje van Baren – April 27 2005

  2. Overview • Introduction: there’s more to the genome than genes • The formation of repetitive elements: LINEs • Genome rearrangements and evolution: the Hox genes • Genes without function: pseudogenes • Gene prediction and pseudogenes • Processed pseudogenes and gene prediction • Ways to get rid of pseudogenes in gene predictions: • intron alignment method • conservation method

  3. There’s more to the genome… • 1-3% of the human genome is coding • 50% of homology between mouse and human is coding • Rest…? Regulatory elements? (Figure labels: Shh, C7orf2)

  4. There’s more to the genome… Beta globin cluster: the locus control region (LCR). Wijgerde et al. 1995. Nature 377:209-213. Insulators affect transcription factor ability.

  5. Junkyard 1-3% of the human genome is coding – is the rest ‘junk’?

  6. Junkyard. 1-3% of the human genome is coding – is the rest ‘junk’? A lot of it is repeat: 17% LINEs, 15% SINEs, 8% retrovirus/retroposon, 3% DNA transposon.

  7. Repetitive elements - LINEs. LINE stands for long interspersed element. Diagram labels: pol II; ORF1; ORF2; AAAAA; internal promoter; ?; reverse transcriptase/endonuclease. Steps: transcription; translation; ribonucleoprotein (RNP); migrates back into nucleus.

  8. Retrotransposition

  9. Rearrangement shapes the genome In the course of evolution, genomes are constantly ‘shuffled’: Duplication of regions by unequal crossing over during meiosis

  10. Example of duplication: the Hox cluster. Mouse HoxA: 1 2 3 4 5 6 7 (8) 9 10 11 (12) 13. HoxA2 homeobox: the homeobox codes for a 60 amino acid protein domain that is a DNA binding motif. DNA binding motifs are often present in transcription factors.

  11. Example of duplication: the Hox cluster. In evolution, first duplication of genes, then duplication of region...? In Fugu, HoxC1 and HoxC3 are pseudogenes.

  12. Hox genes are important in development. Mouse HoxA (proximal to distal): 1 2 3 4 5 6 7 (8) 9 10 11 (12) 13. Figure: mouse developing limb.

  13. Hox genes are important in development. Mouse HoxD (anterior to posterior): 1 (2) 3 4 (5) (6) (7) 8 9 10 11 12 13. Figure: mouse developing limb.

  14. Pseudogenes in Hox cluster. Fugu HoxC: 1 (2) 3 4 5 6 (7) 8 9 10 11 12 13. Pseudogenes • How do we detect pseudogenes? • Mutations lead to putative frameshifts or stop codons • No transcripts of the gene can be detected • Some gene features (promoter, intron/exon boundaries, termination signal) may have been lost • Ka/Ks ratios
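The first detection criterion on this slide (frameshifts or stop codons) can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the talk: the function names are hypothetical, the input is assumed to be a candidate coding sequence already in frame from its start codon, and only the three standard stop codons are checked.

```python
# Hypothetical helper names; assumes the sequence starts in frame.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def premature_stops(cds):
    """Return 0-based codon indices of in-frame stops before the final codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]

def looks_disrupted(cds):
    """Flag a length that is not a multiple of three, or an early stop codon."""
    return len(cds) % 3 != 0 or bool(premature_stops(cds))

print(premature_stops("ATGGCCAAATGA"))   # intact toy sequence
print(premature_stops("ATGTAAAAATGA"))   # stop codon at the second codon
print(looks_disrupted("ATGGCCAATGA"))    # one base deleted: putative frameshift
```

A check like this only sees disruptions in the reading frame it is given, so it says nothing about the other criteria on the slide (missing transcripts, lost gene features, Ka/Ks ratios).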

  15. Pseudogenes and gene prediction. Segmental duplications may contain complete genes. Mutations result in deterioration or generation of gene family members.

  16. Prediction of pseudogenes. Pseudogene will be predicted as real gene if no stop codons or intron/exon boundary mutations are present. In longer genes (more exons), stop-containing exons may be skipped.

  17. Repetitive elements - LINEs. LINE stands for long interspersed element. Diagram labels: pol II; ORF1; ORF2; AAAAA; internal promoter; ?; reverse transcriptase/endonuclease. Steps: transcription; translation; ribonucleoprotein (RNP); migrates back into nucleus.

  18. SINEs. Diagram labels: pol III; AAAAAAAAAA; no ORF! SINEs (e.g. Alu) use the RT/EN function of LINEs. Ribonucleoprotein (RNP) migrates back into nucleus.

  19. Processed pseudogenes. Processed pseudogenes are generated from intronless RNA using the same mechanism: ribonucleoprotein (RNP) migrates back into nucleus.

  20. Pseudogenes and gene prediction. Case 1: pseudogene treated as single exon gene. Case 2: pseudogene treated as exons.

  21. Finding nonprocessed pgenes How?

  22. Finding nonprocessed pgenes • How can’t we do it? • Use frameshifts or stop codons in exon predictions • Use polyA tails next to exon prediction • Things we can do: • Identify parent gene • Look for non-conservation

  23. Using known genes for pgene finding. Predicted gene → BLAST → known gene. Align prediction to genomic region of known gene and match intron locations. If the intron positions do not line up, the exon is a putative pseudogene.
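The intron-matching idea can be sketched as follows. This is an illustration under strong simplifying assumptions, not the talk's implementation: exon structures are given as lists of coding lengths, the alignment between prediction and known gene is assumed to be ungapped, and `offset` (where the prediction starts within the known gene's coding sequence) is a hypothetical parameter that a real pipeline would derive from the alignment.

```python
from itertools import accumulate

def intron_positions(exon_lengths):
    """Coding-sequence coordinates at which introns interrupt the transcript."""
    return list(accumulate(exon_lengths))[:-1]

def intron_mismatches(pred_exons, parent_exons, offset=0):
    """Positions where exactly one of prediction and known gene has an intron.

    Only known-gene introns that fall inside the span covered by the
    prediction are considered. Assumes an ungapped alignment.
    """
    pred = {p + offset for p in intron_positions(pred_exons)}
    span_end = offset + sum(pred_exons)
    parent = {p for p in intron_positions(parent_exons) if offset < p < span_end}
    return sorted(pred ^ parent)

known_gene = [120, 90, 150]                            # toy exon lengths
print(intron_mismatches([120, 90, 150], known_gene))   # introns line up
print(intron_mismatches([200, 160], known_gene))       # introns do not line up
print(intron_mismatches([360], known_gene))            # single exon spanning both introns
```

A non-empty result marks the prediction as a putative pseudogene in the sense of the slide. The single-exon case corresponds to a pseudogene predicted as one exon, and the shifted-intron case to a pseudogene predicted as several exons.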

  24. Limitations – intron method • Only works if the parent gene is known • Will not detect small parent exons • Some ‘known genes’ are really undetected pseudogenes

  25. Finding processed pseudogenes Method 2: conserved synteny

  26. What is synteny? • Synteny: the occurrence of two or more genes on the same chromosome within one species • Conserved synteny: the occurrence of synteny of orthologous genes in two different organisms. Figure: human chr7 and mouse chr5, conserved synteny.

  27. Conserved synteny. Figure: human vs. mouse.

  28. Using conserved synteny in pseudogene finding • Take gene model. • BLAST to human genes. • Compare gene location. • If there is a better hit elsewhere in the human genome than in the mouse conserved syntenic region: possible pseudogene. Figure: human vs. mouse.
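The decision rule in the last bullet can be sketched as below. The `Hit` record and its fields are hypothetical: it is assumed that the BLAST step and the synteny map have already been run, so each hit carries a score and a flag saying whether it lies in the conserved syntenic region of the gene model's locus.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    gene: str                 # name of the gene that was hit (illustrative)
    score: float              # similarity score, higher is better
    in_syntenic_region: bool  # True if the hit lies in the conserved syntenic region

def possible_pseudogene(hits):
    """Flag the model when its best hit lies outside the syntenic region."""
    if not hits:
        return False          # nothing to compare against
    best = max(hits, key=lambda h: h.score)
    return not best.in_syntenic_region

hits = [Hit("geneA", 900.0, False), Hit("geneB", 400.0, True)]
print(possible_pseudogene(hits))
```

The sketch deliberately returns only a flag, since the slide speaks of a possible pseudogene; a gene without a counterpart in the other species would also be flagged by this rule.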

  29. Example: pseudogene exons Exon not conserved in mouse

  30. Example: pseudogene exons. Parent gene orthologous with different mouse chromosome: hit in mouse… Solution: remove all ‘second’ and ‘third’ orthology hits.

  31. FBN3: Fibrillin3 has no mouse homolog but is a real gene in human.

  32. So pseudogenes have no function?

  33. It all started with transgene insertion… Sex-lethal (Drosophila). S. Hirotsune et al., Nature 423:91-96 (2003)

  34. …that resulted in really unhappy mice. ~80% of +/- mice die within 2 days of birth; the rest have bone deformities, renal and liver problems, and an incomplete epithelial eye cover at birth.

  35. The transgene insertion site

  36. Transcription is disrupted (figure label: UTR)

  37. Pseudogene protects parent

  38. So what happens? Mkrn1 vs. Mkrn1-p: competition for a ‘destabilizing factor’?
