Lecture 09: DNA Sequence Analysis & Gene Identification (I)
Bioinformatics 90
The Organization of a Eukaryotic Gene
[Figure: three levels of a eukaryotic gene.
GENE: enhancer, promoter, Exon 1 to Exon 4 separated by introns, poly(A) signal.
Transcription gives the mRNA transcript (5’ to 3’): 5’-untranslated region, Exon 1 to Exon 4 separated by introns, 3’-untranslated region.
Processing gives the mature mRNA (5’ to 3’): 7-mG cap, Exon 1 to Exon 4 with start and stop marked, (AAAAAA)n tail.]
Gene identification involves 4 main stages
1. Find the putative coding region(s) in the sequence
   - Open reading frame
2. Find non-coding features of interest in the sequence
   - CpG islands
   - Tandem and dispersed repeats
   - Promoter regions (TATA box, cap signal, CCAAT-box)
   - Transcription factor binding sites, poly-A sites
3. Determine the exon-intron organization
   - Branch point signal: CT(G,A)A(C,T)
   - 5’ and 3’ splice sites: AG/GUAAGU--------------PyPyPyPyPyPyPyPy-CAG/G
   - Motif, signal and pattern
4. Identify the gene
   - BLAST, FASTA
   - Functional studies
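As a rough illustration of the first stage (finding open reading frames), the sketch below scans all six reading frames for ATG...stop stretches. It is not the algorithm used by any of the programs named in this lecture; the function names and the `min_codons` threshold are assumptions made for the example.

```python
# Minimal six-frame ORF scan (illustrative sketch only).
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, frame, start, end) for every ATG...stop ORF.

    start/end are 0-based, end-exclusive, and relative to the strand that
    was scanned. min_codons counts the stop codon and is a hypothetical
    cutoff chosen for this example.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None                   # look for the next ATG
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATTTTAGCC", min_codons=3))
```

A long ORF is only a candidate coding region: in eukaryotic genomic DNA, introns interrupt the reading frame, which is why the later stages look at splice signals as well.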
GENE FINDERS
Banbury Cross   http://igs-server.cnrs-mrs.fr/igs/banbury
FGENEH          http://genomic.sanger.ac.uk/gf/gf.shtml
GeneID          http://www1.imim.es/geneid.html
GeneMachine     http://genome.nhgri.nih.gov/genemachine
GeneParser      http://beagle.colorado.edu/_eesnyder/GeneParser.htl
GENSCAN         http://genes.mit.edu/GENSCAN.html
Genotator       http://www.fruitfly.org/_nomi/genotator/
GRAIL           http://compbio.ornl.gov/tools/index.shtml
GRAIL-EXP       http://compbio.ornl.gov/grailexp/
HMMgene         http://www.cbs.dtu.dk/services/HMMgene/
MZEF            http://www.cshl.org/genefinder
PROCRUSTES      http://www-hto.usc.edu/software/procrustes
RepeatMasker    http://ftp.genome.washington.edu/RM/RepeatMasker.html
Sputnik         http://rast.abajian.com/sputnik/
Function                          Command         GCG   SeqWEB
Sequence manipulation             Reverse          +      +
ORF searching                     Frames           +      +
                                  Map              +      +
                                  Translate        +      +
Mapping (restriction sites)       Map              +      +
                                    (-minc)        +      +
                                    (-maxc)        +      +
                                  Mapsort          +      +
                                    (-exclude)     +      -
                                    (-digest)      +      -
                                  Mapplot          +      +
Mapping (transcription factors)   Map tfsites      +      -
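To make the restriction-site mapping step concrete, here is a minimal sketch of what such a map does: find every occurrence of each enzyme's recognition sequence, then keep only enzymes whose number of sites falls within a range. It is not the GCG implementation. The `min_cuts` and `max_cuts` parameters are hypothetical names, and treating them as the counterpart of the -minc and -maxc options is an assumption. The three recognition sequences (EcoRI, BamHI, HindIII) are standard; the test sequence is made up.

```python
# Toy restriction-site map (illustrative sketch, not the GCG "map" program).
# All three recognition sequences are palindromic, so scanning one strand
# is enough for this example.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def map_sites(seq, enzymes=ENZYMES, min_cuts=1, max_cuts=None):
    """Return {enzyme: [1-based start of each recognition site]}.

    Enzymes with fewer than min_cuts or more than max_cuts sites are
    dropped. Positions mark where the recognition site starts, not the
    exact bond that is cut.
    """
    seq = seq.upper()
    result = {}
    for name, site in enzymes.items():
        positions = []
        pos = seq.find(site)
        while pos != -1:
            positions.append(pos + 1)          # convert to 1-based
            pos = seq.find(site, pos + 1)      # allow overlapping matches
        n = len(positions)
        if n >= min_cuts and (max_cuts is None or n <= max_cuts):
            result[name] = positions
    return result


if __name__ == "__main__":
    demo = "AAGAATTCTTGGATCCAAGAATTC"          # made-up sequence
    print(map_sites(demo))                     # all enzymes with a site
    print(map_sites(demo, max_cuts=1))         # single cutters only
```

Restricting the output to single cutters is the typical use when looking for cloning sites.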
What to do next? The predictions made by these programs are just that: predictions. NEVER TRUST A COMPUTER!
Exercise 89-10
Programs used in this exercise:
(1) Sequence manipulation: reverse
(3) ORF searching: frames, map, translate
(4) Mapping (restriction sites): map (-minc, -maxc), mapsort (-exclude, -digest), mapplot, plasmidmap
(5) Mapping (transcription factors): map (tfsites)
Sequences used in this exercise:
gb:z18853 (C. elegans mRNA for capping protein alpha subunit), cds:10-858
gb:x03795 (Human mRNA for platelet derived growth factor A-chain, PDGF-A), cds:388-1020
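A quick sanity check on the cds coordinates before translating: the sketch below treats them as 1-based and inclusive, and assumes the range includes the stop codon (the usual GenBank convention, but an assumption here). Under that reading, 10-858 spans 849 nt (283 codons) and 388-1020 spans 633 nt (211 codons). The helper names are hypothetical, and the short test sequence is made up rather than taken from either GenBank entry.

```python
# CDS coordinate handling and translation (illustrative sketch).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_count(start, end):
    """Number of codons in a 1-based inclusive range; error if not a multiple of 3."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


def cds_slice(seq, start, end):
    """Extract the CDS from 1-based inclusive coordinates."""
    return seq[start - 1:end]


def translate(cds):
    """Translate a DNA CDS codon by codon; '*' marks a stop codon."""
    cds = cds.upper().replace("U", "T")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


if __name__ == "__main__":
    print(codon_count(10, 858), codon_count(388, 1020))
    print(translate(cds_slice("CCATGAAATTTTAGCC", 3, 14)))
```

If a coordinate range does not divide by three, the coordinates or the sequence version are worth rechecking before trusting any translation.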