Carlow IT Bioinformatics November 2006

CFTR – gene cloning and initial bioinformatic analysis. Riordan, 12 others (*) and Tsui (1989) Science 245:1066. Carlow IT Bioinformatics, November 2006. * Including Francis Collins, later leader of the Human Genome Project. Cystic fibrosis: horrible inherited disease.


  1. CFTR – gene cloning and initial bioinformatic analysis. Riordan, 12 others (*) and Tsui (1989) Science 245:1066. Carlow IT Bioinformatics, November 2006. * Including Francis Collins, later leader of the Human Genome Project

  2. Cystic fibrosis • Horrible inherited disease • Affects lung, pancreas and sweat glands • Abnormally high trans-membrane electrical potential • Decreased Cl- ion transport across the membrane • Often associated with failure to respond to ATP-dependent kinase • No phosphorylation: no function

  3. More symptoms etc. • Difficult breathing • Early death (life expectancy 6 months in 1959, 38 years in 2006) • More prone to infections (thicker mucus) • Can do pre-natal diagnosis or sweat test • “Woe is the child who tastes salty from a kiss on the brow, for he is cursed, and soon must die” (German proverb, 1700s) • We modify AMPs (defensins): can we make one that is effective in a high-salt environment?

  4. Genetics & epidemiology • Located on chromosome 7q31.2; 180 kb gene • 1 in 25 Europeans carries a CFTR mutation, so 1 in 2500 live births has the disease • Males and females equally affected • Life expectancy higher in males – nobody knows why • Why so common? • Cholera toxin requires normal CFTR (so carriers may be partly protected) • Also possible connection with typhoid
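The step from a carrier frequency of 1 in 25 to a disease incidence of 1 in 2500 on the slide above can be checked with simple arithmetic. This is a minimal sketch, assuming random mating and a recessive disease where both parents must be carriers and each passes on the mutant allele with probability 1/2. The function name is illustrative only.

```python
from fractions import Fraction

def affected_birth_rate(carrier_freq: Fraction) -> Fraction:
    """Chance that a child of two random parents inherits two mutant alleles.

    Assumes random mating, a recessive disease, and that affected adults
    are rare enough to ignore as parents.
    """
    both_parents_carriers = carrier_freq * carrier_freq
    child_gets_both_mutant_alleles = Fraction(1, 4)
    return both_parents_carriers * child_gets_both_mutant_alleles

rate = affected_birth_rate(Fraction(1, 25))
print(f"1 in {rate.denominator // rate.numerator}")  # 1 in 2500
```

The 1/4 factor comes from each carrier parent transmitting the mutant allele half the time.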

  5. Mapping • Genetic association with markers pinpoints chromosome 7 • Chromosome walking to zero in • NO genome sequence in those days

  6. Clone and sequence • Why bother? • Because we can! • Can predict features/functions? • Can compare CF v normal to identify the mutation? • Working with cDNA, not genomic DNA • Generate cDNA libraries from cells & cell-lines • Screen for cDNAs that hybridise with a known CFTR fragment • Eventually (after much hard work) got 19 overlapping cDNA clones

  7. Fig 1: 19 normal clones, 2 CF clones

  8. Fig 3: where expressed. Patchy expression profile

  9. Gene sequence • Clones span 6.1 kb of RNA • ORF encodes a protein of 1480 amino acids • So much bigger than the 300 AA average • In 1989, far fewer than 1000 human genes had been sequenced • Bioinformatic analysis possible then: • Start codon: consensus sequence for translation start + AUG • Secondary structure prediction • Hydropathy plot • Homology searches (pre-BLAST) • Glycosylation and Ser/Thr kinase sites
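As a quick sanity check on the sizes quoted on the slide above, the following sketch works out how much of the 6.1 kb message the 1480-residue ORF occupies. It assumes 6.1 kb means roughly 6100 nucleotides and counts one stop codon. The function name is hypothetical.

```python
def coding_fraction(protein_len_aa: int, mrna_len_nt: int) -> float:
    """Fraction of an mRNA occupied by the ORF, counting the stop codon."""
    orf_len_nt = (protein_len_aa + 1) * 3  # +1 codon for the stop
    return orf_len_nt / mrna_len_nt

orf_nt = (1480 + 1) * 3
print(orf_nt)                                  # 4443
print(round(coding_fraction(1480, 6100), 2))   # 0.73
```

So under these assumptions roughly a quarter of the cloned RNA is untranslated sequence.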

  10. Start of ORF • 5’-AGACCAUGCA-3’ in CFTR • 5’-(CC)[A/G]CCAUGG(G)-3’ consensus • Convinced? • I’m not
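The comparison on the slide above can be made explicit position by position. This is a minimal sketch that aligns the two sequences on the AUG and treats the bracketed, optional consensus positions as if they were required. That alignment is one reading of the consensus as written on the slide, not a published scoring method, and the function name is hypothetical.

```python
# Consensus as written on the slide; "AG" means either purine is allowed.
CONSENSUS = ["C", "C", "AG", "C", "C", "A", "U", "G", "G", "G"]
CFTR = "AGACCAUGCA"

def compare_to_consensus(seq, consensus):
    """Return (position relative to the A of AUG, base, allowed, match) per base."""
    start = seq.index("AUG")
    rows = []
    for i, (base, allowed) in enumerate(zip(seq, consensus)):
        rel = i - start
        pos = rel + 1 if rel >= 0 else rel  # no position 0: the A of AUG is +1
        rows.append((pos, base, allowed, base in allowed))
    return rows

rows = compare_to_consensus(CFTR, CONSENSUS)
for pos, base, allowed, ok in rows:
    print(f"{pos:+d} {base} vs {allowed}: {'match' if ok else 'mismatch'}")
```

Under this reading the CFTR context matches at -3, -2 and -1 but not at +4 or at the optional flanking positions, which is consistent with the slide's scepticism.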

  11. The sequence 1. Figure annotations: exon splice sites, transcription start, AA count, RNA count, 2 TM domains, predicted kinase sites

  12. The sequence 2. First ATP-binding fold is underlined; ΔF508 is circled
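To illustrate what an in-frame deletion such as ΔF508 (Delta F 508) does to the protein, here is a small sketch. The nucleotide context and the exact three bases removed are assumptions based on the commonly cited description of the mutation (a CTT deletion spanning codons 507 and 508). They are not taken from the slide and should be checked against the published sequence.

```python
# Minimal codon table: only the codons needed for this example.
CODONS = {"ATC": "I", "ATT": "I", "TTT": "F", "GGT": "G", "GTT": "V"}

def translate(dna):
    """Translate an in-frame DNA string using the minimal table above."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))

# Assumed wild-type context, codons 506-510: Ile Ile Phe Gly Val
wild_type = "ATCATCTTTGGTGTT"

# Assumed deletion: the three bases CTT spanning codons 507 and 508
start = wild_type.index("CTT")
mutant = wild_type[:start] + wild_type[start + 3:]

print(translate(wild_type))  # IIFGV
print(translate(mutant))     # IIGV
```

Because exactly three bases are lost, the reading frame is preserved and only the phenylalanine disappears. The isoleucine at 507 survives because ATC and ATT both encode Ile.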

  13. Protein analysis. The whole protein is two similar halves, each with 6 membrane-spanning domains (hydropathic peaks) and one nucleotide-binding fold (NBF, hydrophilic region), giving two NBFs in total, plus a charged R region
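The "hydropathic peaks" mentioned above come from a sliding-window hydropathy plot. The sketch below uses the Kyte-Doolittle scale with a 19-residue window, a common choice for spotting membrane-spanning segments. It is not necessarily the exact scale or window used in the paper, and the test sequence is a made-up toy, not CFTR.

```python
# Kyte-Doolittle hydropathy values per amino acid.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=19):
    """Mean Kyte-Doolittle hydropathy for each full window along seq."""
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

# Hypothetical toy sequence: charged flank, hydrophobic core, charged flank.
toy = "KDERKDERKD" + "LIVALLIVFAILLVIAGLL" + "KDERKDERKD"
profile = hydropathy_profile(toy)
peak = max(range(len(profile)), key=profile.__getitem__)
print(f"peak window starts at residue {peak + 1}, mean = {profile[peak]:.2f}")
```

Windows averaging above about 1.6 on this scale are conventionally flagged as candidate transmembrane helices. On a real sequence one would count such peaks, as the slide does for the 12 membrane-spanning domains.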

  14. ΔF508. Fig 6 – homology/similarity. Conserved, hydrophobic, aromatic position at 508. Comparing two conserved regions in CFTR and other proteins (multidrug resistance proteins, transporters etc.): some have two similar regions, some have one.

  15. Structure of the fold • The two halves have similar structure but low AA conservation (best is only 27/66 identities, about 41%) • Others in the family have much tighter conservation • No signal peptide, which suggests that the first TM domain is oriented inside-to-outside (i – o) • External loops very short • …except between TM7 and TM8, where there is an N-glycosylation site

  16. More… • R domain is one exon; 69/241 residues are polar, in alternating +ve and –ve charged regions • It also contains most of the kinase phosphorylation sites • All family members secrete something: • Chloride (CFTR) • Pigment (Drosophila white gene) • Lytic peptide (E. coli hemolysin) • …so what about the “function unknown” mbpX gene in liverwort chloroplasts?

  17. More… • Hypothesise that CFTR is the ion channel • 10/12 of TM domains have >1 +ve AA • i.e. amphipathic helix • cf. brain Na+ channel & GABA-R Cl- channel • Contrast P-glycoprotein • Closely related but no +ve TM AAs • Big protein – maybe also other functions

  18. Fig 7: a composite model (glycosylation marked)

  19. In colour, from Wikipedia

  20. Conclude • From very little data, and a very small DB to compare with, we can make predictions about structure and function that have stood the test of time • DB size by year:

      Year   N = bases         N = seqs
      1988       23,800,000       20,579
      1989       34,762,585       28,791
      1990       49,179,285       39,533
      2000   11,101,066,288   10,106,023
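The growth implied by the database figures on the slide above can be quantified directly. This is a small sketch using only the numbers quoted there. The dictionary layout and function name are illustrative.

```python
# Database size by year, as quoted on the slide: (bases, sequences).
db_size = {
    1988: (23_800_000, 20_579),
    1989: (34_762_585, 28_791),
    1990: (49_179_285, 39_533),
    2000: (11_101_066_288, 10_106_023),
}

def fold_change(year_from, year_to, column):
    """Ratio of one column (0 = bases, 1 = sequences) between two years."""
    return db_size[year_to][column] / db_size[year_from][column]

print(round(fold_change(1989, 2000, 0)))  # bases: 319
print(round(fold_change(1989, 2000, 1)))  # sequences: 351
```

In other words, the homology searches done in 1989 had a database more than 300 times smaller than the one available in 2000.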

  21. Postscript • ΔF508 may be about delivery of the protein to the membrane • The protein functions fine if you trick cells into delivering it! • By 1995, 300 different mutations had been identified in the gene • Last month: 1531 different mutations listed at http://www.genet.sickkids.on.ca/cftr/StatisticsPage.html • With the human genome, SNPs and ESTs, it is much easier to interpret sequence information
