Genome Annotation

Presentation Transcript


  1. Genome Annotation

  2. GTTGCAATCTGAGACACATATTTTTGATATTCCAGTTGTTGCAATCGAATGTAAAACATATTTAGATCTTTAAATGTATGGTACATTCAAGATCCAACCTTCATTCTAGTGTTTAAAGAGAACGTTGCAAGTTGCAATCTGAGACACATATTTTTGATATTCCAGTTGTTGCAATCGAATGTAAAACATATTTAGATCTTTAAATGTATGGTACATTCAAGATCCAACCTTCATTCTAGTGTTTAAAGAGAACGTTGCAA TTAGGTTTTGTGATTTGTTTGCAGGGGCAGGAGGCTTTGGTTTAGGTT AAATGGCAGGCTTCTCTGTACCTTTATCTGTTGAAATTGATACCTGGGCTTGTGATACACTACGCTACAACCGCCCTGATTCAACAGTTATTCAAAATGATATCGGTAACTTTAGTACAGAAAATGACGTTAAGAACACTACGCTACAACCGCCCATATCTGCAACTTTAAACCTGATATTATTATTGGCGGGCCTCCATGCCAGGGATTTAGAGATCCTAGAAATGGTATTGCTGGGCCAGCCCAAAAAGATCCTAAAGATCCTAGAAATG GTTTAAATCCTCATAAATTATCAAACAAATCATATGATCAGAATAATCGCCGTTTAAATCCTCA TTTATTCATCAACTTTGCACAATGGATAAAATTTCTTGAACCTAAAGCGTTTGTCATGGAAAACGAAGGTTTTAAAGTTATAGATAAAAGGATTGCTATCAAGGAAAAATGCAGAAGGTTTTAAAGTTATAG TTATGCAGAAAAATTTGACTTCTCACTAAATATAAAGATTTTTTAGATCAGCAGCATTATGCAGAAAA ACCCACCGTTTGGGCAAAAAGACGACGGTACTGGTTTAACCAGCCAAATGTTCTTTCTACTACCCACCGTTT TTATTAAGAAAACATTTGAAGAACTTGGTTATTTTGTCGAAGTATGGGTTTTAAATGCTGCGGAATATGGCATTCCGCAAATTAGAGAACGTATTTTTATTGTTGGCAATAAAAAAGGTAAAGTACTAGGTATGAGTATTATACCTGCACTAACTTTGTGGGACGCAATATCAGACTTACCAGAACTTAATGCGCGTGAAGGAAGTGAAGAGCAACCCTATCATTTAAAACCTCAAAATACTTATCAGACTTGGGCTAGAAATGGTAGTGCTACGCTTTACAATCATGTTGCAATGGAACATTCTGACCGTTTAGTAGAACGTTTCCGGCATATAAAATGGGGTGAATCCAGTTCGGATGTATCTAAAGAACATGGAGCTAGACGACGTAGTGGTAATGGTGAATTATCAAACAAATCATATGATCAGAATAATCGCCGTTTAAATCCTCATAAACCGTCTCACACTATTGCTGCGTCATTCTATGCTAATTTTGTCCATCCTTTTCAACATCGAAATTTAACAGCCCGTGAAGGAGCTAGAATCCAATCTTTTCCAGATAACTATAGATTTTTTGGAAAAAAAACTGTCGTATCTCATAAACTATTGCATCGAGAAGAAAGATTTGATGAAAAATTTCTTTGTCAATATAATCAAATCGGTAATGCTGTACCCCCTCTTCTCGCTAAAGTAATTGCACATCATCTTCTAGAGAAATTAGAGTTATGCCAACAACTGATAGAAATCCTCTAGTGCATGGATCAAATCTTGAACAAAAAGAGAATCATCGTACAAAATACAGAGATACTGAAAGCAGGACTTTCCTTAGAGAAATCAGAACTGAATATGACAAATGGCATAAAGCAAATATGAACCTGGTTGGACCAAAATCAGAAATTACTGACCAAGATGATTCAATTATTACTCAAAGAGTGGAACTTCTCACTAAATATAAAGATTTTTTAGATCAGCAGCATTATGCAGAAAAATTTGATTCAAGATCCAACCTTCATTCTAGTGTTTTAGAGACCATTTATAAAGTAAATCTTTAGACGACTAGACGACGTAGCATA
ATACGAGTCATAACGGCATATATGGCAGCCTCACTCATTTCTGGGAGACGCTCATAATCCTTACTGAGACGACGGTACTGGTTTAACCAGCCAAATGTTCTTTCTACTACCCACCGTTTGGGCAAAACCTGAAATTCTTGATTAGTACGCCGGATTACCTCAACATGAGCTTGAATCATCAGCCAAACAGAGAGCGCAAATTTATCACCGTCATAGCCGGAATCAACCCAGATGACTTCAACTTTTTCCAGTAATTCTGGACGCTCTTCTAACAGTTCCATCAAAGTATAGGCGGCAAGTAATCTTTCTCCAGCATTTGCTTCACTTACAACCACTTTTAACAAAAGTCCCAGACTATCAACCAAAGTTTGCCGCTTTCGTCCTTTTACCTTCTTGCCACCATCAAAACCGTACACATCCCCCTTTTTTCAGTCGTTTTTACCGACTGGCTGTCTGCCGCGATCGCCGTGGGTTGAGTTGACTTCCCCATTTTTTGACGAACTTGATCGCGCAAAGTATGATTCATTTCAGTTGAAACCTGTTTTCAGATGGTAGTAGATAGCGTTGCATACTTCTCGCATATCAGTTGTTCGGGGATGCCCACCGCATTTAGCGGGTGGAATCAAAGGAGCTAAAATTGCCCATTCTGAGTCATTAAGGTCTGTAGAATAAGACTTTCGTCTCATTGTTTCCTATGTAAATACACTCTACAAACAGTATCTTATCGCTGCCTTTTTATCTTAGCTCTCCTTTAGATTTACTTTATAAATAGCCTCTTAGAAGAATTTCTTTATTATTTATTTAAAGATTTAGTACAAGATTTCGGGCAGAACGCTCTTATTGGTAAGTCACACACGTTCAAAGATATTTTCTTCGTACCACCAAAATATTCTGAAATGCTCAAGCGACCTTATGCGCGAATTGAGAGAAAAGATCATGATAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAATTCGTAATTGGTGCAACTGTTCAAGCATCGCTTAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAGAAGCAGCATGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAATGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGA
ATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAAT TGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAATTGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAATTGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAATAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAAT GenomeAnnotation GenomeAnnotation

  3. Genome Annotation (the wall of DNA sequence from slide 2, shown again)

  4. A Walk in the Forest * Photo courtesy of www.webshots.com

  5. Observation * Photos courtesy of www.webshots.com and Peter Smallwood

  6. Observation * Photos courtesy of www.webshots.com and Peter Smallwood

  7. Observation * Photos courtesy of www.webshots.com and Peter Smallwood

  8. Observation * Photos courtesy of www.webshots.com and Peter Smallwood

  9. Experiment * Photos courtesy of www.webshots.com and Peter Smallwood

  10. Filters: Information reducers. English: Red, Orange, Yellow, Green, Blue, Purple. Mayan: Red, Orange, Yellow, Grue, Purple.

  11. Filters: Information reducers. Squirrel filter

  12. Filters: Information reducers. Molecular filter

  13. TCTACTTATA TTCAATCCAC AGGGCTACAC CTAGTTCTTG AAGAGTCTGT TGAATGAACA CATACATGGT TTATCTGTTT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC CACTAGTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC TTAGATAAAC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCACGCCC CTCCGTAAAC CTCTAACATG ATGTCAGCAA ATATTAAAAA TGAATAAACT TTGTTAAAGG TACAAATGAA AATTAGCAAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT CATTCTAGGG AAACCTGTAT GGTTACATGA ACTGCCTAAA AAACAAGCTA TTATATATTT TAAGAAATTA ATTGCAATTA ATTTCCTGGG CCCCAGCTGT CATTAAAAAG AGGCAAATAC AGCCAAGGAC GACAGCACTG ACCCTCAAGA AGGCACCGGC TGACAGACAG GCTGAAATTC CGCTGAGAGC AGAGTGGTAC ATTGAACCCT CCCTGCACCA GGTCTTTCCT GTGGGCACTG AGTGCAGACA ATGAATGACT GAACGAACGA TTGAATGAAA AGAAATGAGA TATGAGGCAA TCACAGCATC AGGTGACCTT AGTATCTATT CTCGGGAGCG CACGGCTCTA AAGAGGCCCA TATCCAGGCA CCTTTAGATG CAAGAAGGAG GAAACAGCTC GAAATCCCTG AGGCCGGAGG GTCAAGAACT CTCCACCGGC GGCAGCGGCC CCCCGGCCTA AGGCTGCCTG TGCTATAAAT ACGCGGCCCA TTCCCTGGGC TCGGCGGGAC AGATAACATG AATGTGCCCT CTCCGTAAAC CTCTAAC... Filters: Information reducers. Sequence filter

  14. Display of gene context by TIGR/CMR (or Kazusa) Nostoc sp. PCC 7120: Position Search and Segment Retrieval >Nostoc sp. PCC 7120 3455501-3456500 ttagtagatgggcttgatgtacagtactttgaaatcaattccctccgccgcaaaatggct gtagttagtcaagatacatttattttcaacacttctattagagacaatatcgcctacggt acatctggggcgagtgaagcggaaattagagaagtagcgcggctagcaaatgcgttgcaa tttatcgaagaaatgcccgaagggtttgatactaagttaggcgatcgcggtgtccgttta tctggaggacagagacaacggattgcgatcgctcgtgcattactccgagatcccgaaatc ctcattcttgacgaagccaccagcgccctagattcagtctccgagcgattaattcaggag tctatagaaaaactttccgtgggtagaacagtaattgcgatcgctcacagactctccaca attgccaaagcagataaggttgtggtgatggaacaagggcgaattgttgagcagggaaat tatcaagaacttctagaacaacgcggaaagctctggaaatatcaccagatgcaacacgaa tcaggacagactaattcgtaatatcaattcaaaattcaaaattcaaaattcaaaattagg gaagccgagcagaatcatggttttggggtatgtatctgtcccattcttttttcaaatcgg tataactccccaatccccaatccccaatctccagtccccaatccccaatccccaatcccc aatccccagtccccaatccccaatcccatgaaaatttccgtcatcatctcgaattacaac tatgctcgttatctttctagagcaatcaactctgttctcgctcaaactcactcagacatt gaaatcgttatcgtagatgatggttctacagataacagccgtgatgttattacccaactg caagaacaagcaccggataaaatcaagcccatctttcaagcaaatcaaggacagggaggc gctttcaatgcggggtttgcggcggcgactggcgaagtcg
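
The position search on slide 14 amounts to slicing a chromosome by 1-based, inclusive coordinates. A minimal Python sketch, assuming the sequence is already held in memory as a string; `retrieve_segment`, `to_fasta`, and the toy sequence are hypothetical names for illustration and are not part of TIGR/CMR or Kazusa:

```python
def retrieve_segment(sequence: str, start: int, end: int) -> str:
    """Return the bases from start to end, counted 1-based and inclusive."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"coordinates {start}-{end} fall outside 1-{len(sequence)}")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


def to_fasta(header: str, segment: str, width: int = 60) -> str:
    """Format a segment as a FASTA record with fixed-width sequence lines."""
    lines = [segment[i:i + width] for i in range(0, len(segment), width)]
    return "\n".join([f">{header}"] + lines)


# Toy stand-in for a chromosome: the first 20 bases of the slide 14 segment.
toy = "ttagtagatgggcttgatgt"
print(to_fasta("toy 4-9", retrieve_segment(toy, 4, 9)))
```

With a full chromosome loaded, the same call with 3455501 and 3456500 would return the 1000-base segment shown on the slide.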

  15. Display of gene context by Artemis

  16. Anabaena Chromosome (6413771 bp): 3455501-3456500 .........|.........|.........|.........|.........| alr2835hepA: ABC transport3454238 -> 3456061 alr2836 glycosyltransferase 3456248 -> 3457216 TTAGTAGATGGGCTTGATGTACAGTACTTTGAAATCAATTCCCTCCGCCG CAAAATGGCTGTAGTTAGTCAAGATACATTTATTTTCAACACTTCTATTA GAGACAATATCGCCTACGGTACATCTGGGGCGAGTGAAGCGGAAATTAGA GAAGTAGCGCGGCTAGCAAATGCGTTGCAATTTATCGAAGAAATGCCCGA AGGGTTTGATACTAAGTTAGGCGATCGCGGTGTCCGTTTATCTGGAGGAC AGAGACAACGGATTGCGATCGCTCGTGCATTACTCCGAGATCCCGAAATC CTCATTCTTGACGAAGCCACCAGCGCCCTAGATTCAGTCTCCGAGCGATT AATTCAGGAGTCTATAGAAAAACTTTCCGTGGGTAGAACAGTAATTGCGA TCGCTCACAGACTCTCCACAATTGCCAAAGCAGATAAGGTTGTGGTGATG GAACAAGGGCGAATTGTTGAGCAGGGAAATTATCAAGAACTTCTAGAACA ACGCGGAAAGCTCTGGAAATATCACCAGATGCAACACGAATCAGGACAGA CTAATTCGTAATATCAATTCAAAATTCAAAATTCAAAATTCAAAATTAGG GAAGCCGAGCAGAATCATGGTTTTGGGGTATGTATCTGTCCCATTCTTTT TTCAAATCGGTATAACTCCCCAATCCCCAATCCCCAATCTCCAGTCCCCA ATCCCCAATCCCCAATCCCCAATCCCCAGTCCCCAATCCCCAATCCCATG AAAATTTCCGTCATCATCTCGAATTACAACTATGCTCGTTATCTTTCTAG AGCAATCAACTCTGTTCTCGCTCAAACTCACTCAGACATTGAAATCGTTA TCGTAGATGATGGTTCTACAGATAACAGCCGTGATGTTATTACCCAACTG CAAGAACAAGCACCGGATAAAATCAAGCCCATCTTTCAAGCAAATCAAGG ACAGGGAGGCGCTTTCAATGCGGGGTTTGCGGCGGCGACTGGCGAAGTCG 3455501 3455551 3455601 3455651 3455701 3455751 3455801 3455851 3455901 3455951 3456001 3456051 3456101 3456151 3456201 3456251 3456301 3456351 3456401 3456451 Contig GoTo Block Find Display PgUp/PgDnHelp Quit

  17. Anabaena Chromosome (6413771 bp): 3455501-3456500 .........|.........|.........|.........|.........| alr2835hepA: ABC transport3454238 -> 3456061 alr2836 glycosyltransferase 3456248 -> 3457216 TTAGTAGATGGGCTTGATGTACAGTACTTTGAAATCAATTCCCTCCGCCG CAAAATGGCTGTAGTTAGTCAAGATACATTTATTTTCAACACTTCTATTA GAGACAATATCGCCTACGGTACATCTGGGGCGAGTGAAGCGGAAATTAGA GAAGTAGCGCGGCTAGCAAATGCGTTGCAATTTATCGAAGAAATGCCCGA AGGGTTTGATACTAAGTTAGGCGATCGCGGTGTCCGTTTATCTGGAGGAC AGAGACAACGGATTGCGATCGCTCGTGCATTACTCCGAGATCCCGAAATC CTCATTCTTGACGAAGCCACCAGCGCCCTAGATTCAGTCTCCGAGCGATT AATTCAGGAGTCTATAGAAAAACTTTCCGTGGGTAGAACAGTAATTGCGA TCGCTCACAGACTCTCCACAATTGCCAAAGCAGATAAGGTTGTGGTGATG GAACAAGGGCGAATTGTTGAGCAGGGAAATTATCAAGAACTTCTAGAACA ACGCGGAAAGCTCTGGAAATATCACCAGATGCAACACGAATCAGGACAGA CTAATTCG■□□■□■■□□■■■□□□□■■■□□□□■■■□□□□■■■□□□□■■□□□ □□□□■■□□□■□□□□■■□■□□■■■■□□□□■□■□■□■■■□■■■■□■■■■■■■ ■■■□□□■■□□■□■□□■■■■■■□□■■■■■□□■■■■■□□■■■■■□□■■■■■□ □■■■■■□□■■■■■□□■■■■■□□■■■■■□□■■■■■□□■■■■■□□■■■■ATG AAAATTTCCGTCATCATCTCGAATTACAACTATGCTCGTTATCTTTCTAG AGCAATCAACTCTGTTCTCGCTCAAACTCACTCAGACATTGAAATCGTTA TCGTAGATGATGGTTCTACAGATAACAGCCGTGATGTTATTACCCAACTG CAAGAACAAGCACCGGATAAAATCAAGCCCATCTTTCAAGCAAATCAAGG ACAGGGAGGCGCTTTCAATGCGGGGTTTGCGGCGGCGACTGGCGAAGTCG 3455501 3455551 3455601 3455651 3455701 3455751 3455801 3455851 3455901 3455951 3456001 3456051 3456101 3456151 3456201 3456251 3456301 3456351 3456401 3456451 Contig GoTo Block Find Display PgUp/PgDnHelp Quit
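
Slides 16 and 17 place two annotated genes on the displayed window; the stretch between them is the intergenic region. A small Python sketch of that bookkeeping, using the gene coordinates printed on the slides. The `Gene` type and both helper functions are hypothetical names introduced here, and the sketch assumes both genes are given in ascending forward-strand coordinates:

```python
from typing import NamedTuple, Optional, Tuple


class Gene(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def intergenic_gap(upstream: Gene, downstream: Gene) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the gap between two genes, or None if they touch or overlap."""
    gap_start, gap_end = upstream.end + 1, downstream.start - 1
    if gap_start > gap_end:
        return None
    return gap_start, gap_end


def window_offsets(region: Tuple[int, int], window_start: int) -> Tuple[int, int]:
    """Convert 1-based inclusive coordinates to a 0-based, end-exclusive slice of a window."""
    start, end = region
    return start - window_start, end - window_start + 1


# Coordinates as printed on slides 16 and 17.
hep_a = Gene("alr2835", 3454238, 3456061)
alr2836 = Gene("alr2836", 3456248, 3457216)

gap = intergenic_gap(hep_a, alr2836)
print(gap, gap[1] - gap[0] + 1)      # gap coordinates and length in bases
print(window_offsets(gap, 3455501))  # slice into the 3455501-3456500 window
```

The gap computed this way lies roughly where slide 17 blanks out the sequence between the end of alr2835 and the ATG that opens alr2836.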

  18. Old World: We ran the process. New World: We get the information: 1 aagctttgaa agcactacag gatttacctt 61 aacaactaag tcgctctcaa gttacttctc 121 aatctgaatc tatcgcgggt gtggcaaaag 181 tctcaatgga agacttatta actcaaattc 241 gagtggcgag gattagcgtc aatctatagg 301 taaaatgctt atactgtcat ggcttgagtc 361 cgcctgaacc tttgctagag tatctttttc 421 ccaatggctg aaaagctacc ttagtttcag 481 acagtaagcc ttctagactc aggcagtttt

  19. The New World TGAGACACATATTTTTGATATTCCAGTTGTTGCAATC GAATGTAAAACATATTTAGATCTTTAAATGTATGGTAC ATTCAAGATCCAACCTTCATTCTAGTGTTTAAAGAGAAC TGATTTGTTTGCAGGGGCAGGAGGCTTTGGTTTAGGTTTTG AAATGGCAGGCTTCTCTGTACCTTTATCTGTTGAAATTGATACCTGGGCTTGTGATACACTACGCTACAACCGCCCTGATTCAACAGTTATTCAAAATGATATCGGTAACTTTAGTACAGAAAATGACGTTAAGAATATCTGCAACTTTAAACCTGATATTATTATTGGCGGGCCTCCATGCCAGGGATTTAGTATTGCTGGGCCAGCCCAAAAAGATCCTAAAGATCCTAGAAATGGTTTATTCATCAACTTTGCACAATGGATAAAATTTCTTGAACCTAAAGCGTTTGTCATGGAAAACGTAAAAGGATTGCTATCAAGGAAAAATGCAGAAGGTTTTAAAGTTATAGATATTATTAAGAAAACATTTGAAGAACTTGGTTATTTTGTCGAAGTATGGGTTTTAAATGCTGCGGAATATGGCATTCCGCAAATTAGAGAACGTATTTTTATTGTTGGCAATAAAAAAGGTAAAGTACTAGGTATTCCTAAAAAAACACATTCTCTGCAATTTTTAAATTTAAATAGGTCTCAATTATCGATCTTCGATGATATGAGTATTATACCTGCACTAACTTTGTGGGACGCAATATCAGACTTACCAGAACTTAATGCGCGTGAAGGAAGTGAAGAGCAACCCTATCATTTAAAACCTCAAAATACTTATCAGACTTGGGCTAGAAATGGTAGTGCTACGCTTTACAATCATGTTGCAATGGAACATTCTGACCGTTTAGTAGAACGTTTCCGGCATATAAAATGGGGTGAATCCAGTTCGGATGTATCTAAAGAACATGGAGCTAGACGACGTAGTGGTAATGGTGAATTATCAAACAAATCATATGATCAGAATAATCGCCGTTTAAATCCTCATAAACCGTCTCACACTATTGCTGCGTCATTCTATGCTAATTTTGTCCATCCTTTTCAACATCGAAATTTAACAGCCCGTGAAGGAGCTAGAATCCAATCTTTTCCAGATAACTATAGATTTTTTGGAAAAAAAACTGTCGTATCTCATAAACTATTGCATCGAGAAGAAAGATTTGATGAAAAATTTCTTTGTCAATATAATCAAATCGGTAATGCTGTACCCCCTCTTCTCGCTAAAGTAATTGCACATCATCTTCTAGAGAAATTAGAGTTATGCCAACAACTGATAGAAATCCTCTAGTGCATGGATCAAATCTTGAACAAAAAGAGAATCATCGTACAAAATACAGAGATACTGAAAGCAGGACTTTCCTTAGAGAAATCAGAACTGAATATGACAAATGGCATAAAGCAAATATGAACCTGGTTGGACCAAAATCAGAAATTACTGACCAAGATGATTCAATTATTACTCAAAGAGTGGAACTTCTCACTAAATATAAAGATTTTTTAGATCAGCAGCATTATGCAGAAAAATTTGATTCAAGATCCAACCTTCATTCTAGTGTTTTAGAGACCATTTATAAAGTAAATCTTTAGACGACTAGACGACGTAGCATAATACGAGTCATAACGGCATATATGGCAGCCTCACTCATTTCTGGGAGACGCTCATAATCCTTACTGAGACGACGGTACTGGTTTAACCAGCCAAATGTTCTTTCTACTACCCACCGTTTGGGCAAAACCTGAAATTCTTGATTAGTACGCCGGATTACCTCAACATGAGCTTGAATCATCAGCCAAACAGAGAGCGCAAATTTATCACCGTCATAGCCGGAATCAACCCAGATGACTTCAACTTTTTCCAGTAATTCTGGACGCTCTTCTAACAGTTCCATCAAAGTATAGGCGGCAAGTAATCTTTCTCCAGCATTTGCTTCACT
TACAACCACTTTTAACAAAAGTCCCAGACTATCAACCAAAGTTTGCCGCTTTCGTCCTTTTACCTTCTTGCCACCATCAAAACCGTACACATCCCCCTTTTTTCAGTCGTTTTTACCGACTGGCTGTCTGCCGCGATCGCCGTGGGTTGAGTTGACTTCCCCATTTTTTGACGAACTTGATCGCGCAAAGTATGATTCATTTCAGTTGAACTAGGAGGAAAATCCCCTGGAAGCATATCCCACTGACAACCTGTTTTCAGATGGTAGTAGATAGCGTTGCATACTTCTCGCATATCAGTTGTTCGGGGATGCCCACCGCATTTAGCGGGTGGAATCAAAGGAGCTAAAATTGCCCATTCTGAGTCATTAAGGTCTGTAGAATAAGACTTTCGTCTCATTGTTTCCTATGTAAATACACTCTACAAACAGTATCTTATCGCTGCCTTTTTATCTTAGCTCTCCTTTAGATTTACTTTATAAATAGCCTCTTAGAAGAATTTCTTTATTATTTATTTAAAGATTTAGTACAAGATTTCGGGCAGAACGCTCTTATTGGTAAGTCACACACGTTCAAAGATATTTTCTTCGTACCACCAAAATATTCTGAAATGCTCAAGCGACCTTATGCGCGAATTGAGAGAAAAGATCATGATTTCGTAATTGGTGCAACTGTTCAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAACCATGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAAT

  20. AATAAAGCTTTACAAACCAAACTCTGGCTTCAATTGTGTAACCCAAGCTTTGATTCTTTCCTCTGTTAAATCGGATTGATTATCTTCATCAAGGGCAAGACCTACAAATTTACCATCACGAACAGCTTTAGACTCACTGAATTCATAACCTTCTGTAGGCCAATAGCCAACTGTTTCACCACCATTTTCTGAAATTTTTTCCTCTAGAATACCGAGGGCATCTTGAAATGTATCAGGATAACCAACCTGGTCTCCAGGAGCAAAATAAGCAACTTTTTTGCCGATGAAGTCAATGTTATCTAACTCATCATAAAAATTTTCCCAATCACTTTGCAATTCTCCAACATTCCAGGTAGGACAACCAACAACGATATAATCGTAGTTATTGAAATCACTTGGTTCAGCTTGTGAAATATCATATAAAGTTACAACACTATCACCACCAAACTCCTTCTGAATTATTTCTGATTCAGTTTGGGTATTGCCTGTTTGAGTACCAAAAAATAAACCAATATTAGACATTTTTACTCCTTTTATGTATTTGCAAAATTATTTCAATTAAAATATTTAGTAATAATTAATTGTTAGCTAGCTAATAATTAAATTTTTATTACAATCATTGTAAAAGGCATTGAAAAAGTAAATAAAAATTTTTATTCTACGTTATTTCAAAAATATTTACTTACATATACTTAACCTTTATAGTGATGTAATATACTCTAATTCCTATTTTACTTATAAATACCATCTCAGCTTAATGTAACGAATTTTTCTGTTTATCTTTAAATACAAAAAATTCAACAAAACTACAGAAAATTAATCTTAATAACACAAAACAAGTATCAATCTGTAATACAACTAAGCTTAAATAAATTAATAGAAAGCTTCATCTATCTAATAGGTTGAGAATAGTTTATGTCTAATGACATAAATTCATTCGTGTTGATTTCATTTGGGTATATTCATCTGATTTAGGATTTACTCCATTAAGTTTGTACTCATCAATGCCCGCCTGTTGGTATCCACAATTCTCATACAGTGCGCGAGCAAAGTAATCAATCGTTCGTCGCCATATCTAACTTTGAGTCAAACAAACCAGTTGGATTACCAACCCTCAACTAATCGCTTCTTTAAGGCGAGCGATCGCACATTTAACTGTTGGTTGTCACAAGAGAACTAATACTACAGCAGTATATTTAACAACTAAGGGTGGTTCAACTTTCGCTGCGACTCCTCCAACGCGCTGAAATACACAGGACTGATGCGATCGCAAACTCTTTGACTAAATTCCATACATTATCATGACCATCTCCCAAACAAACAAGTGGGTTAACCAGATGCTGACTATTAACATCCCCTGAGTTCGGAGTTGTAGGTCTATTTGACTGGTTCAAAGCGATGATGGAACGGCTTTGTTGCATGAATTAAAAAAAGACACACCATCACCTACTTCTAGGATAGACACATCAAACGTCCCACCGCCTAAGTCAAATACCAAGATAATTTCGTTAGTTTTCTTGTCAAGTCCGTAAGCGAGGGCCGCCGCCGTGGGCTAGTTGATAATTCGCAGAACTTTAATCCCGGCAATTCTACTGGCATCTTTGGTAGCCTGCCGTTGAGAGTCATTGAAATAGGCAGGGGTGGTAATTACCGCTTGCCTCACTGGTTCCCCCAGATATGTGCTGGCATCATCTATCAGCTTGCGGACTACCTCATACCATTTCACGAAAAACCTGATACACATGTAAACTCTGAAACCCTTGCTGTATCAAAGTTTTGTAATTACGAATTACGAATTACGAATTGATATCAGCCGAGATTTCTTCGGGTGAAAATTCCTTGTTCAGAGCGGGACAGTGTAGCTTGACATTGCCATTACTGTCACGTACCACTTTGTAAGTAACTTGTTTTGCCTCTTGCGTAACTTCATCATACCTGCGCCCGATGAACCGCTTCACAGAATAAAAAGTGTTTTCTGGGTTCATTACACCCTG
GCGCTTAATAAAGCTTTACAAACCAAACTCTGGCTTCAATTGTGTAACCCAAGCTTTGATTCTTTCCTCTGTTAAATCGGATTGATTATCTTCATCAAGGGCAAGACCTACAAATTTACCATCACGAACAGCTTTAGACTCACTGAATTCATAACCTTCTGTAGGCCAATAGCCAACTGTTTCACCACCATTTTCTGAAATTTTTTCCTCTAGAATACCGAGGGCATCTTGAAATGTATCAGGATAACCAACCTGGTCTCCAGGAGCAAAATAAGCAACTTTTTTGCCGATGAAGTCAATGTTATCTAACTCATCATAAAAATTTTCCCAATCACTTTGCAATTCTCCAACATTCCAGGTAGGACAACCAACAACGATATAATCGTAGTTATTGAAATCACTTGGTTCAGCTTGTGAAATATCATATAAAGTTACAACACTATCACCACCAAACTCCTTCTGAATTATTTCTGATTCAGTTTGGGTATTGCCTGTTTGAGTACCAAAAAATAAACCAATATTAGACATTTTTACTCCTTTTATGTATTTGCAAAATTATTTCAATTAAAATATTTAGTAATAATTAATTGTTAGCTAGCTAATAATTAAATTTTTATTACAATCATTGTAAAAGGCATTGAAAAAGTAAATAAAAATTTTTATTCTACGTTATTTCAAAAATATTTACTTACATATACTTAACCTTTATAGTGATGTAATATACTCTAATTCCTATTTTACTTATAAATACCATCTCAGCTTAATGTAACGAATTTTTCTGTTTATCTTTAAATACAAAAAATTCAACAAAACTACAGAAAATTAATCTTAATAACACAAAACAAGTATCAATCTGTAATACAACTAAGCTTAAATAAATTAATAGAAAGCTTCATCTATCTAATAGGTTGAGAATAGTTTATGTCTAATGACATAAATTCATTCGTGTTGATTTCATTTGGGTATATTCATCTGATTTAGGATTTACTCCATTAAGTTTGTACTCATCAATGCCCGCCTGTTGGTATCCACAATTCTCATACAGTGCGCGAGCAAAGTAATCAATCGTTCGTCGCCATATCTAACTTTGAGTCAAACAAACCAGTTGGATTACCAACCCTCAACTAATCGCTTCTTTAAGGCGAGCGATCGCACATTTAACTGTTGGTTGTCACAAGAGAACTAATACTACAGCAGTATATTTAACAACTAAGGGTGGTTCAACTTTCGCTGCGACTCCTCCAACGCGCTGAAATACACAGGACTGATGCGATCGCAAACTCTTTGACTAAATTCCATACATTATCATGACCATCTCCCAAACAAACAAGTGGGTTAACCAGATGCTGACTATTAACATCCCCTGAGTTCGGAGTTGTAGGTCTATTTGACTGGTTCAAAGCGATGATGGAACGGCTTTGTTGCATGAATTAAAAAAAGACACACCATCACCTACTTCTAGGATAGACACATCAAACGTCCCACCGCCTAAGTCAAATACCAAGATAATTTCGTTAGTTTTCTTGTCAAGTCCGTAAGCGAGGGCCGCCGCCGTGGGCTAGTTGATAATTCGCAGAACTTTAATCCCGGCAATTCTACTGGCATCTTTGGTAGCCTGCCGTTGAGAGTCATTGAAATAGGCAGGGGTGGTAATTACCGCTTGCCTCACTGGTTCCCCCAGATATGTGCTGGCATCATCTATCAGCTTGCGGACTACCTCATACCATTTCACGAAAAACCTGATACACATGTAAACTCTGAAACCCTTGCTGTATCAAAGTTTTGTAATTACGAATTACGAATTACGAATTGATATCAGCCGAGATTTCTTCGGGTGAAAATTCCTTGTTCAGAGCGGGACAGTGTAGCTTGACATTGCCATTACTGTCACGTACCACTTTGTAAGTAACTTGTTTTGCCTCTTGCGTAACTTCATCATACCTGCGCCCGATGAACCGCTTCACAGAATAAAAAGTGTTTTCTGGGTTCATTACACCCTG
GCGCTT The New World What have we lost?

  21. The New World: What have we lost?

  22. ? Met-Thr-Tyr-Asp-Gln-Arg-Thr-Gly-Leu... Genetic code TCTACTTATA TTCAATCCAC AGGGCTACAC CTAGTTCTTG AAGAGTCTGT TGAATGAACA CATACATGGT TTATCTGTTT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC CACTAGTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC TTAGATAAAC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCACGCCC CTCCGTAAAC CTCTAACATG ATGTCAGCAA ATATTAAAAA TGAATAAACT TTGTTAAAGG TACAAATGAA AATTAGCAAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT CATTCTAGGG AAACCTGTAT GGTTACATGA ACTGCCTAAA AAACAAGCTA TTATATATTT TAAGAAATTA ATTGCAATTA ATTTCCTGGG CCCCAGCTGT CATTAAAAAG AGGCAAATAC AGCCAAGGAC GACAGCACTG ACCCTCAAGA AGGCACCGGC TGACAGACAG GCTGAAATTC CGCTGAGAGC AGAGTGGTAC ATTGAACCCT CCCTGCACCA GGTCTTTCCT GTGGGCACTG AGTGCAGACA ATGAATGACT GAACGAACGA TTGAATGAAA AGAAATGAGA 3% ATGACTTATGATCAACGCACAGGGCTA From Sequence to Organism: How does Nature do it? ATGACTTATGATCAACGCACAGGGCTA • Begin transcription • End transcription • Splice transcript • Begin translation Rules of transcriptional and post-transcriptional control
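
The "Genetic code" step on slide 22 can be made concrete: reading the slide's coding sequence three bases at a time under the standard genetic code gives the peptide shown there. A minimal Python sketch, assuming the standard code and an input already in the correct reading frame; `translate` and the table names are illustrative, not taken from the presentation:

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    dna = dna.upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]  # KeyError on non-ACGT input
        if amino_acid == "*":
            break
        residues.append(THREE_LETTER[amino_acid])
    return "-".join(residues)


print(translate("ATGACTTATGATCAACGCACAGGGCTA"))
# Met-Thr-Tyr-Asp-Gln-Arg-Thr-Gly-Leu
```

The translation itself is the easy part. The four bullets on the slide (where transcription begins and ends, how the transcript is spliced, where translation begins) are what determine which bases to feed into it.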

  23. TCTACTTATA TTCAATCCAC AGGGCTACAC AAGAGTCTGT TGAATGAACA CATACATGGT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA What genes are in my organism? Interpolated Markov model Candidate genes Predicted genes How do Biologists use Bioinformation? Gene finder

  24. TCTACTTATA TTCAATCCAC AGGGCTACAC AAGAGTCTGT TGAATGAACA CATACATGGT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA What genes are in my organism? How do Biologists use Bioinformation? Gene finder Interpolated Markov model Conform to standard model Challenge accepted beliefs Candidate genes Predicted genes
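
Slides 23 and 24 show a gene finder turning raw sequence into candidate genes and then predicted genes. The Python sketch below covers only the first stage, under a deliberately simple assumption: a candidate gene is any open reading frame that runs from an ATG to the next in-frame stop codon. It is a plain ORF scan, not the interpolated Markov model named on the slides, and `find_orfs` and its `min_codons` threshold are hypothetical:

```python
import re

START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase ACGT string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna: str, min_codons: int = 30):
    """List candidate genes as (strand, start, end) tuples.

    Coordinates are 1-based, inclusive, on the forward strand, and include
    the stop codon. min_codons counts coding codons, excluding the stop.
    """
    dna = re.sub(r"\s+", "", dna).upper()
    n = len(dna)
    candidates = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, n - 2, 3):
                codon = seq[i:i + 3]
                if orf_start is None:
                    if codon == START:
                        orf_start = i  # first ATG opens the ORF
                elif codon in STOPS:
                    orf_end = i + 3  # end-exclusive, stop codon included
                    if (orf_end - orf_start) // 3 - 1 >= min_codons:
                        if strand == "+":
                            candidates.append(("+", orf_start + 1, orf_end))
                        else:
                            # Map reverse-strand offsets back to forward coordinates.
                            candidates.append(("-", n - orf_end + 1, n - orf_start))
                    orf_start = None
    return sorted(candidates, key=lambda c: (c[1], c[2]))


print(find_orfs("CCATGAAATTTTAGCC", min_codons=2))
```

A real gene finder would then score each candidate, for example with a Markov model trained on known genes, to separate predicted genes from chance ORFs. That scoring stage is where the filter's assumptions about what a gene looks like enter.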

  25. globin • Highly filtered output • Easy to grasp • High-level insights Filters are powerful

  26. globin • Highly filtered output • Easy to grasp • High-level insights • Unfiltered output • Confusing • Basic insights Filters Constrain New Discovery

  27. Filters are tempting

  28. Filters are tempting Globin

  29. Current State of Affairs 1. Need high-level filters

  30. Current State of Affairs 1. Need high-level filters 2. Need access to raw phenomena AATAAAGCTTTACAAACCAAACTCTGGCTTCAATTGTGTAACCCAAGCTTTGATTCTTTCCTCTGTTAAATCGGATTGATTATCTTCATCAAGGGCAAGACCTACAAATTTACCATCACGAACAGCTTTAGACTCACTGAATTCATAACCTTCTGTAGGCCAATAGCCAACTGTTTCACCACCATTTTCTGAAATTTTTTCCTCTAGAATACCGCAACACTATCACCACCAAACTCCTTCTGAATTATTTCTGATTCAGTTTGGGTATTGCCTGTTTGAGTACCAAAAAATAAACCAATATTAGAC

  31. Current State of Affairs 1. Need high-level filters 2. Need access to raw phenomena 3. Need ability to build new tools

        ASSIGN K12-set FROM Gene-finder (K12-DNA)
        ASSIGN O157-set FROM Gene-finder (O157-DNA)
        CONSIDER EACH protein IN O157-set
          WHEN Constituent-of (K12-set, protein) = FALSE
          COLLECT protein
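
The pseudo-program on slide 31 collects every protein predicted in the O157 genome that is absent from the K12 set. One way this might look in Python, assuming a gene finder is available as a function that returns protein identifiers; `proteins_unique_to`, `toy_gene_finder`, and the protein names are placeholders invented for this sketch:

```python
from typing import Callable, Iterable, List


def proteins_unique_to(
    target_dna: str,
    reference_dna: str,
    gene_finder: Callable[[str], Iterable[str]],
) -> List[str]:
    """Collect proteins predicted in the target genome but not in the reference."""
    reference_set = set(gene_finder(reference_dna))  # ASSIGN K12-set
    target_set = gene_finder(target_dna)             # ASSIGN O157-set
    # CONSIDER EACH protein ... WHEN not a constituent of the reference ... COLLECT
    return [protein for protein in target_set if protein not in reference_set]


# Toy stand-in for a gene finder: a lookup table keyed by genome label.
TOY_PREDICTIONS = {
    "K12-DNA": ["protA", "protB"],
    "O157-DNA": ["protA", "protB", "protX"],
}


def toy_gene_finder(dna: str) -> List[str]:
    return TOY_PREDICTIONS[dna]


print(proteins_unique_to("O157-DNA", "K12-DNA", toy_gene_finder))
```

The sketch compares identifiers for exact equality. Deciding when two proteins from different strains count as the same one is itself a filter that a biologist would want to control.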

  32. We need… Biologists . . . . . . and Programmers

  33. The Death of Creative Science

  34. The Death of Creative Science

  35. Current State of Affairs 1. Need high-level filters 2. Need access to raw phenomena 3. Need ability to build new tools Need biologist programmers

  36. Why hasn’t this happened? Part of bioinformatic program written in C

        /* Read from stdin when no input file name was given. */
        if (pcInFile == NULL)
            pfInFile = stdin;
        else
            pfInFile = fopen(pcInFile, "r");
        pfOutFile = fopen(pcOutFile, "w");

        if (pfInFile == NULL) {
            fprintf(stderr, "ERROR opening %s\n", pcInFile);
            exit(1);
        }
        if (pfOutFile == NULL) {
            fprintf(stderr, "ERROR opening %s\n", pcOutFile);
            exit(1);
        }

        fputc(fgetc(pfInFile), pfOutFile); /* deal with first '>' in file */

        /* Alternate identifier and sequence until either step returns zero. */
        for ( ; ; ) {
            if (!processIdentifier(pfInFile, pfOutFile))
                break;
            if (!processSequence(pfInFile, pfOutFile))
                break;
        }

        fclose(pfInFile);
        fclose(pfOutFile);

  37. Why hasn’t this happened? Part of bioinformatic program written in Perl

        sub match_positions {
            my $pattern;
            local $_;
            ($pattern, $_) = @_;
            my @results;
            local $matchStart;
            # Record where each match attempt begins.
            my $instrumentedPattern = qr/(?{ $matchStart = pos() })$pattern/;
            while (/$instrumentedPattern/g) {
                my $nextStart = pos();
                push @results, "[$matchStart..$nextStart)";
                pos() = $matchStart+1;  # resume one past the start to catch overlaps
            }
            return @results;
        }
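
For comparison, the Perl routine on slide 37 reports every position where a pattern matches, overlapping matches included, as half-open intervals. A rough Python counterpart, offered as a sketch of the same idea rather than an exact port (anchors such as `^` may behave differently when the search restarts mid-string):

```python
import re
from typing import List


def match_positions(pattern: str, text: str) -> List[str]:
    """Return '[start..end)' for every match of pattern, overlapping ones included."""
    regex = re.compile(pattern)
    results = []
    pos = 0
    while pos <= len(text):
        match = regex.search(text, pos)
        if match is None:
            break
        results.append(f"[{match.start()}..{match.end()})")
        # Restart one character past the match start so overlaps are found.
        pos = match.start() + 1
    return results


print(match_positions("ATA", "ATATATA"))
```

The Python version is shorter, but it still asks the reader to think about regular-expression engines and string offsets rather than genes.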

  38. Why hasn’t this happened? Biologists will not come to programming. Programming must come to biologists.

  39. BioLingua • Provides knowledge in accessible form • Provides tools accessed in common way • Provides results that can be manipulated • Provides a programming language that speaks to biologists

  40. Jeff Elhai Center for the Study of Biological Complexity Virginia Commonwealth University Phone: 828-0794 E-mail: ElhaiJ@VCU.Edu BioLingua http://ramsites.net/~biolingua/help
