Ch2. Genome Organization and Evolution (continue)

阮雪芬, NTUST



Jan 02, 2003


Pick out Genes in Genomes

  • Open reading frames (ORFs)

    • Start codon ------------------ stop codon

    • A potential protein-coding region

  • Approaches to identify protein-coding regions

    • Detection of regions similar to known coding regions from other organisms

    • Ab initio methods

      • Ab initio gene identification is more complete and accurate for bacteria than for eukaryotes
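The start codon to stop codon definition of an ORF can be sketched in a few lines. This is a minimal illustration only, assuming the forward strand, ATG as the sole start codon, and the three standard stop codons; the function name `find_orfs`, the `min_codons` parameter, and the example sequence are hypothetical and not from the slides.

```python
# Minimal ORF scan: an ORF here is an ATG followed by an in-frame stop codon.
# Assumptions: forward strand only, standard stop codons, toy input sequence.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) tuples; end is exclusive and includes the stop codon."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop in this frame
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count the codons before the stop, including the ATG itself.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


example = "CCATGAAATTTGGGTAACC"  # hypothetical sequence
for start, end, frame in find_orfs(example):
    print(frame, start, end, example[start:end])
```

Real gene finders also scan the reverse strand and, in eukaryotes, must model introns, which is one reason the approach works better for bacteria.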

Pick out Genes in Genomes

  • A framework for ab initio gene identification in eukaryotic genomes


Genomes of Prokaryotes

  • Most prokaryotic cells contain

    • A large single circular piece of double-stranded DNA (< 5 Mb)

    • Plasmids

  • In E. coli, only ~11% of the DNA is non-coding.

The Genome of the Bacterium E. coli


  • Strain K-12 contains 4639221 bp in a single circular DNA molecule, with no plasmids.

  • An inventory reveals

    • 4285 protein-coding genes

    • 122 structural RNA genes

    • Non-coding repeat sequences

    • Regulatory elements

    • Transcription/translation guides

    • Transposases

    • Prophage remnants

    • Insertion sequence elements

    • Patches of unusual composition

The Genome of the Bacterium E. coli

  • The average size of an ORF is 317 amino acids.

  • There are 630-700 operons. Operons vary in size, although few contain more than five genes. Genes within an operon tend to have related functions.
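As a rough consistency check, the figures quoted for E. coli K-12 (4639221 bp, 4285 protein-coding genes, average ORF of 317 amino acids) can be combined to estimate the protein-coding fraction of the genome. The sketch below is back-of-the-envelope arithmetic, assuming three bases per amino acid and ignoring stop codons and the 122 structural RNA genes.

```python
# Back-of-the-envelope estimate of the protein-coding fraction of E. coli K-12.
# Inputs are the figures quoted in the slides; the method itself is an assumption.

genome_bp = 4_639_221    # genome size in base pairs
protein_genes = 4285     # protein-coding genes
mean_orf_aa = 317        # average ORF length in amino acids

coding_bp = protein_genes * mean_orf_aa * 3   # three bases per codon
coding_fraction = coding_bp / genome_bp

print(f"protein-coding: {coding_bp} bp ({coding_fraction:.1%} of the genome)")
```

The estimate comes out near 88%, which is in line with the statement that only ~11% of the DNA is non-coding once RNA genes are also counted.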

The Genome of the Bacterium E. coli

  • Several features of E. coli

    • It can synthesize all components of proteins and nucleic acids, and cofactors.

    • It has metabolic flexibility

    • A wide range of transporters

    • Even for specific metabolic reactions there are many cases of multiple enzymes.

    • It does not possess a complete range of enzymatic capacities.

The genome of the archaeon Methanococcus jannaschii


  • Methanococcus jannaschii was collected from a hydrothermal vent 2600 m deep off the coast of Baja California, Mexico, in 1983.

  • Thermophilic organism

  • The genome was sequenced in 1996 by The Institute for Genomic Research (TIGR). It was the first archaeal genome sequenced.

The genome of the archaeon Methanococcus jannaschii

  • It contains a large chromosome containing a circular double-stranded DNA molecule 1664976 bp long.

  • 1743 predicted coding regions.

  • Some RNA genes contain introns.

  • As in other prokaryotic genomes, there is little non-coding DNA.

  • In archaea, proteins involved in transcription, translation, and regulation are more similar to those of eukaryotes.

  • Archaeal proteins involved in metabolism are more similar to those of bacteria.

The genome of one of the simplest organisms: Mycoplasma genitalium


  • An infectious bacterium.

  • Its genome was sequenced in 1995 by TIGR, The Johns Hopkins University and The University of North Carolina.

  • The gene repertoire includes genes that encode proteins involved in:

    • DNA replication

    • Transcription

    • Translation

    • Adhesins

    • Other molecules for defence against the host’s immune system.

    • Transport proteins

Genomes of Eukaryotes

  • In eukaryotic cells, the majority of DNA is in the nucleus, separated into bundles of nucleoproteins, the chromosomes.

  • Each chromosome contains a single double-stranded DNA molecule.

  • Nuclear genomes of different species vary widely in size.

  • Eukaryotic species vary in the number of chromosomes and distribution of genes among them.

    • Human chromosome 2 is a fusion of chimpanzee chromosomes 12 and 13.

Genomes of Eukaryotes

  • Saccharomyces cerevisiae (baker’s yeast)

    • Protein-protein interaction

      • Yeast two-hybrid system

Yeast Two-hybrid System

  • Useful in the study of various interactions

  • The technology was originally developed during the late 1980s in the laboratory of Dr. Stanley Fields (see Fields and Song, 1989, Nature).

Yeast Two-hybrid System

GAL4 activation domain

GAL4 DNA-binding domain

Nature, 2000

Yeast Two-hybrid System

  • Library-based yeast two-hybrid screening method

Nature, 2000

Protein-protein Interactions on the Web

  • Yeast

  • C. elegans

  • H. pylori


  • Drosophila

Yeast Protein Linkage Map Data

  • New protein-protein interactions in yeast

List of interactions with links to YPD

Stanley Fields Lab

Yeast Protein Linkage Map Data

Genomes of Eukaryotes

  • Caenorhabditis elegans

    • The genome was completed in 1998

    • The first full DNA sequence of a multicellular organism

    • XX genotype: a self-fertilizing hermaphrodite.

    • XO genotype: a male.

Genomes of Eukaryotes

  • Drosophila melanogaster

    • Its genome sequence was announced in 1999 by a collaboration between Celera Genomics and the Berkeley Drosophila Genome Project.

    • Despite the fact that insects are not very closely related to mammals, the fly genome is useful in the study of human disease.

    • It contains homologues of 289 human genes implicated in various diseases:

      • Cancer

      • Cardiovascular disease, etc.

Genomes of Eukaryotes

  • Arabidopsis thaliana

    • A flowering plant

    • ~125 Mbp DNA

Genomes of Eukaryotes-Human

  • In Feb 2001, the International Human Genome Sequencing Consortium and Celera Genomics published, separately, drafts of the human genome.

  • 22 chromosome pairs + X, Y

  • Protein-coding genes

    • ~32000 genes in all

Figure labels (functional categories): nucleic acid binding, transcription factor binding, cell cycle regulator, actin binding, defense/immunity protein, enzyme activator, enzyme inhibitor, signal transduction, storage protein, cell adhesion, structural protein, ligand binding or carrier, tumour suppressor

Genomes of Eukaryotes-Human

  • Human protein-coding genes

Genomes of Eukaryotes-Human

  • Repeat sequences

    • 50% of the genome

    • Contain

      • Transposable elements

      • Retroposed pseudogenes

      • Simple “stutters”

      • Segmental duplications

      • Blocks of tandem repeats

Genomes of Eukaryotes-Human

  • RNA

    • 497 transfer RNA genes

    • Genes for 28S and 5.8S ribosomal RNAs

    • Small nucleolar RNAs

    • Spliceosomal snRNAs


  • Single-nucleotide polymorphisms (SNPs)

    • A genetic variation between individuals, limited to a single base pair, which can be substituted, inserted or deleted.

    • Sickle-cell anaemia is an example of a disease caused by a specific SNP.

      • An A→T mutation in the beta-globin gene changes Glu→Val.


  • Single-nucleotide polymorphisms (SNPs)

    • Nearly 1.8 million SNPs

    • Occurring on the average every 2000 base pairs.

    • Not all SNPs are linked to disease

    • The A, B, and O alleles of genes for blood groups illustrate these possibilities.

      • A and B alleles differ by four SNP substitutions.

ABO Blood Groups



The human ABO blood groups illustrate the effect of glycosyl-transferases.

Evolution of Genomes

  • Synonymous nucleotide substitution

  • Non-synonymous nucleotide substitution

    Ka = the number of non-synonymous nucleotide substitutions

    Ks = the number of synonymous nucleotide substitutions

    Ka/Ks: a high ratio indicates possibly functional changes
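The distinction between synonymous and non-synonymous substitutions can be illustrated by comparing two aligned coding sequences codon by codon. This is a simplified sketch that counts raw substitutions as in the definitions above; published Ka/Ks methods additionally normalize by the number of synonymous and non-synonymous sites, which is not done here. The function name `count_substitutions` and both sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(seq_a, seq_b):
    """Return (non_synonymous, synonymous) counts for aligned in-frame sequences."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, and in frame")
    ka = ks = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        n_diff = sum(x != y for x, y in zip(codon_a, codon_b))
        if n_diff != 1:
            continue  # identical codons, or multi-hit codons that this sketch skips
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            ks += 1   # same amino acid: synonymous
        else:
            ka += 1   # different amino acid: non-synonymous
    return ka, ks


seq_a = "ATGGAGCTTAAA"  # hypothetical toy sequences
seq_b = "ATGGTGCTCAAG"
ka, ks = count_substitutions(seq_a, seq_b)
ratio = ka / ks if ks else float("inf")
print(f"Ka={ka} Ks={ks} Ka/Ks={ratio:.2f}")
```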

Databases of Aligned Gene Families

Example: The Effect of RGD Mimetic Peptide in Breast Cancer Cell Line MCF7


  • RGD has been used as an inhibitor of integrin-ligand interaction.

  • Loss of integrin-mediated signaling will induce apoptosis.


RGD (Arg-Gly-Asp) is the smallest motif that binds the integrin receptor on the cell surface, and it plays an important role in the cell cycle.


Cell Death


Our Study

Human breast cancer cell MCF-7

Genomic Study

RGD mimetic peptides



Cell Apoptosis












The Structures of RGD Mimetic Peptides









cDNA Microarray

Panels: C-RGD treatment at 6 hr, 24 hr, 48 hr, and 72 hr


  • 34 genes in total, but only 19 genes remain after filtering

  • 11 genes in total show an expression fold change > 2 (up or down)
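A fold-change cutoff of 2 in either direction is commonly applied on the log2 scale, where up- and down-regulation are symmetric. The sketch below assumes that interpretation; the gene names and expression ratios are hypothetical placeholders, not data from the study.

```python
import math

# Hypothetical treated/control expression ratios; not data from the study.
ratios = {
    "gene_a": 2.6,
    "gene_b": 0.9,
    "gene_c": 0.4,
    "gene_d": 1.8,
    "gene_e": 1.1,
}


def changed_more_than(ratios, fold=2.0):
    """Keep genes whose ratio is above `fold` (up) or below 1/`fold` (down)."""
    threshold = math.log2(fold)
    return {
        gene: ratio
        for gene, ratio in ratios.items()
        if abs(math.log2(ratio)) > threshold
    }


selected = changed_more_than(ratios)
for gene, ratio in selected.items():
    direction = "up" if ratio > 1 else "down"
    print(gene, ratio, direction)
```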

Apoptosis Regulator


Caspase Pathway in CRGD-treated MCF7 Cell

Figure labels: caspase 10, caspase 3, caspase 9, caspase 8 and FADD, caspase 4, caspase 7

Searching and Clustering of RGD-containing Protein in Swiss-Prot Database

  • In the Swiss-Prot database, there are 541 human RGD-containing proteins, including 5 caspase proteins.

  • Caspase 8 was clustered with integrin beta4.

  • Caspase 1, caspase 2, caspase 3 and caspase 7 are clustered together.
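Searching a set of protein sequences for the RGD tripeptide is a plain substring scan. The sketch below shows only that search step, assuming one-letter amino acid sequences; the protein names and sequences are hypothetical, and neither the Swiss-Prot download nor the clustering step is shown.

```python
def find_rgd(sequence):
    """Return the 0-based positions of every RGD tripeptide in a protein sequence."""
    sequence = sequence.upper()
    positions = []
    i = sequence.find("RGD")
    while i != -1:
        positions.append(i)
        i = sequence.find("RGD", i + 1)
    return positions


# Hypothetical toy sequences in one-letter code; not Swiss-Prot entries.
proteins = {
    "toy_protein_1": "MKTRGDSLLA",
    "toy_protein_2": "MAAGGKLV",
    "toy_protein_3": "RGDAAARGDK",
}

hits = {}
for name, seq in proteins.items():
    positions = find_rgd(seq)
    if positions:
        hits[name] = positions

print(hits)
```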

Please pass the genes: horizontal gene transfer

  • Horizontal gene transfer is the acquisition of genetic material by one organism from another.

    • Direct uptake

    • Via a viral carrier

Genome Databases

  • PIR

Genome Databases

  • Entrez Genomes


  • Weblem 2.1

  • Weblem 2.9

  • Weblem 3.1

Deadline: Jan 16
