
RNA Assembly Using an Extending Method

Wei Xueliang, 2010-04-07.



  1. RNA Assembly Using an Extending Method. Wei Xueliang, 2010-04-07

  2. Overview • Why abandon de Bruijn. • Why abandon extended de Bruijn. • Introduction to the current method. • Handle the old problem. • The new problem. • To do.

  3. Why abandon de Bruijn. • De Bruijn graph's (dis)advantages: • Very fast. • Results depend strongly on the coverage distribution and the K value. • Key: coverage is not uniformly distributed in RNA assembly. • No single best K value.

  4. Why abandon de Bruijn. • The red part has length 27.

  5. De Bruijn graph for K = 28

  6. De Bruijn graph for K = 29

  7. De Bruijn graph for K = 30
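
The graphs from these slides are not reproduced in the transcript. As a minimal sketch of why a one-step change in K can matter (not the author's code; the function names and the randomly generated reads are illustrative assumptions), the snippet below counts the k-mers shared by two reads that overlap by exactly 27 bases, using the length quoted on slide 4 purely as an example value. An exact overlap of length L yields L - K + 1 shared k-mers when K <= L and none otherwise, so the two reads stop sharing any k-mer as soon as K exceeds 27.

```python
import random

def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_kmer_count(read_a, read_b, k):
    """Number of distinct k-mers present in both reads."""
    return len(kmers(read_a, k) & kmers(read_b, k))

# Hypothetical reads: read_a ends with, and read_b starts with, the same 27 bases.
rng = random.Random(0)

def rand_seq(n):
    return "".join(rng.choices("ACGT", k=n))

overlap = rand_seq(27)
read_a = rand_seq(20) + overlap
read_b = overlap + rand_seq(20)

for k in range(25, 31):
    print(k, shared_kmer_count(read_a, read_b, k))
```

How a real assembler defines nodes and edges in terms of K varies between tools, so the exact K at which a connection disappears may be shifted by one relative to this count.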

  8. Why abandon de Bruijn. • Key: coverage is not uniformly distributed in RNA assembly. • No single best K value. • Can we run the program many times with different K values? • This is not the de novo assembler's job. • Time. • Provide highly accurate contigs within limited time. • Scaffolding programs.

  9. Why abandon extended de Bruijn. • My extended de Bruijn method: • Use two or more K values at the same time.

  10. Why abandon extended de Bruijn. • Coverage changes faster than I expected, so many K values are needed. • Converting between different K values is difficult. • Memory problem for big K: when K > 32, each K-index needs > 50G (with a 10G data set). • Throw K away.
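
The slides do not say how k-mers are stored in the K-index. Assuming the common packing of 2 bits per base (an assumption, as are the function names below), a k-mer of up to 32 bases fits in one 64-bit word, while K > 32 needs at least two words per key. The arithmetic can be sketched as:

```python
import math

BITS_PER_BASE = 2  # assumed encoding: A, C, G, T -> 2 bits each

def packed_kmer_bits(k):
    """Bits needed to store one k-mer under the assumed 2-bit encoding."""
    return k * BITS_PER_BASE

def words_needed(k, word_bits=64):
    """Number of machine words needed to hold one packed k-mer."""
    return math.ceil(packed_kmer_bits(k) / word_bits)

for k in (24, 32, 33, 64):
    print(k, packed_kmer_bits(k), words_needed(k))
```

This only shows that key size steps up past K = 32 under that encoding; it does not reproduce the 50G figure, which depends on the index layout and the data set.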

  11. Introduction to the new method • Based on Pramila's genome assembly method. • Start from any tag and do a correction. • If the correction succeeds, continue.

  12. Introduction to the new method • Find all tags that overlap by at least 24 bp (magic number). • Use these overlapping tags to extend the base, and continue adding more tags.
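
The extension step on this slide can be sketched as a simple greedy loop. This is a simplified illustration, not the author's implementation: it uses exact suffix/prefix matching and extends with one tag at a time (the longest overlap), whereas the slides describe using all overlapping tags and tolerating mismatches. The names `suffix_prefix_overlap` and `extend` are hypothetical.

```python
MIN_OVERLAP = 24  # the "magic number" from the slide

def suffix_prefix_overlap(base, tag, min_overlap=MIN_OVERLAP):
    """Longest suffix of base that equals a prefix of tag, or 0 if shorter than min_overlap."""
    max_len = min(len(base), len(tag) - 1)  # the tag must contribute at least one new base
    for length in range(max_len, min_overlap - 1, -1):
        if base.endswith(tag[:length]):
            return length
    return 0

def extend(base, tags, min_overlap=MIN_OVERLAP):
    """Greedily extend base to the right with the tag that overlaps it most."""
    unused = list(tags)
    while True:
        best_tag, best_len = None, 0
        for tag in unused:
            length = suffix_prefix_overlap(base, tag, min_overlap)
            if length > best_len:
                best_tag, best_len = tag, length
        if best_tag is None:
            return base  # no tag overlaps by at least min_overlap
        base += best_tag[best_len:]  # append only the non-overlapping part
        unused.remove(best_tag)
```

Called as `extend(seed_tag, other_tags)`, the loop stops when no remaining tag overlaps the current end of the base by at least 24 bases.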

  13. Introduction to the new method • How to find the overlapping tags quickly and with mismatches? • Index and union: {Tag3}, {Tag2, Tag3}, {Tag3, Tag4}; union => {Tag1, Tag2, Tag3, Tag4}
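
The slide does not spell out how the index is built. One standard way to get an "index and union" lookup that tolerates mismatches is the pigeonhole principle: split the 24-base query window into 3 pieces, look up each piece exactly, take the union of the hits, and verify each candidate. With at most 2 mismatches, at least one piece must match exactly. The sketch below assumes that scheme; the piece count, the 8-base seed length, and all names are assumptions, not taken from the slides.

```python
from collections import defaultdict

WINDOW = 24               # overlap length from the slides
PIECES = 3                # assumed number of index lookups per query
SEED = WINDOW // PIECES   # 8 bases per piece

def build_index(tags):
    """Map every SEED-length substring to the (tag_id, position) pairs where it occurs."""
    index = defaultdict(set)
    for tag_id, tag in enumerate(tags):
        for pos in range(len(tag) - SEED + 1):
            index[tag[pos:pos + SEED]].add((tag_id, pos))
    return index

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_overlapping(window, tags, index, max_mismatches=PIECES - 1):
    """Return sorted (tag_id, start) pairs where the tag matches window within max_mismatches."""
    candidates = set()
    for piece in range(PIECES):
        seed = window[piece * SEED:(piece + 1) * SEED]
        for tag_id, pos in index.get(seed, ()):
            start = pos - piece * SEED  # where the whole window would begin in this tag
            if 0 <= start <= len(tags[tag_id]) - WINDOW:
                candidates.add((tag_id, start))  # union over the per-piece hit sets
    return sorted(
        (tag_id, start)
        for tag_id, start in candidates
        if hamming(window, tags[tag_id][start:start + WINDOW]) <= max_mismatches
    )
```

Here `window` would be the last 24 bases of the current base; for each reported tag, the bases after `start + WINDOW` are the candidate extension.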

  14. Introduction to the new method • How to find the next overlapping tags quickly and with mismatches? • V1 <= U3 • V2 <= (U1 << 1) + 0 • V3 <= (U2 << 1) + 0

  15. Handle the old problem. • What if the overlapping part is shorter than 24?

  16. Handle the old problem. • Check the tags one by one, in descending order of overlap length.
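
A minimal sketch of that ordering follows. It is an illustration under stated assumptions, not the author's code: the minimum overlap of 12, the 10% mismatch tolerance, and the function names are hypothetical. Each candidate tag gets its longest acceptable suffix/prefix overlap with the base, and the candidates are then returned longest-first so a caller can check them one by one.

```python
def mismatches(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def best_overlap(base, tag, min_len, max_mismatch_rate):
    """Longest suffix/prefix overlap of at least min_len within the mismatch tolerance, else 0."""
    for length in range(min(len(base), len(tag) - 1), min_len - 1, -1):
        if mismatches(base[-length:], tag[:length]) <= max_mismatch_rate * length:
            return length
    return 0

def rank_candidates(base, tags, min_len=12, max_mismatch_rate=0.1):
    """Return (overlap_length, tag) pairs, longest overlap first, dropping non-overlapping tags."""
    scored = [(best_overlap(base, tag, min_len, max_mismatch_rate), tag) for tag in tags]
    scored.sort(key=lambda pair: -pair[0])
    return [(length, tag) for length, tag in scored if length > 0]
```

Trying the longest overlaps first means the most strongly supported extension is examined before weaker, shorter ones.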

  17. Handle the old problem.

  18. Handle the old problem.

  19. Handle the old problem. • Degree of approximation.

  20. Handle the old problem. • Fewer tips. • No bubbles. • Because we compute overlaps with mismatches. • Use whole tags.

  21. The new problem. • Speed. • The tail of a tag often has more errors. • Reverse extending problem.

  22. To do • Handle the reverse extending problem. • Speed. • Finish the comparison between the de Bruijn method (Velvet) and my method. • Paired-end tags.

  23. Thank you very much for your attention.
