1 / 26

Jumbled Matching with SIMD

Jumbled Matching with SIMD. Sukhpal Singh Ghuman and Jorma Tarhio. Outline. Definition Motivation Previous algorithms SIMD Jumbled matching with SIMD Experimental results Concluding remarks. Jumbled matching.

lcummings
Download Presentation

Jumbled Matching with SIMD

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Jumbled Matching with SIMD Sukhpal Singh Ghuman and Jorma Tarhio

  2. Outline • Definition • Motivation • Previous algorithms • SIMD • Jumbled matching with SIMD • Experimental results • Concluding remarks

  3. Jumbled matching • A substring u of T is a jumbled equivalent to P if the count of each character in P is equal to its count in u and |P| = |u| holds. • To find substrings of T which are permutations of P. • For example: P = edcba in T = aabecdcddee.

  4. Jumbled matching • Jumbled patterns can be described as Parikh vector. • Vector of multiplicities of the characters. • p(S) is (1,2,1) for S = abcb.

  5. Motivation • Alignment of strings • SNP discovery • Discovery of repeated patterns • Interpretation of mass spectrometry data

  6. Previous Algorithms: Count • Key Idea - scan the text forward while maintaining counts of characters. • Work in linear time. • These algorithms were developed as filtration methods for online approximate string matching.

  7. Previous Algorithms: BAM • Cantone and Faro (Proc. PSC 2014) presented the BAM algorithm (Bit-parallel Abelian Matcher). • Associate a counter (bin) to each distinct character in P. • A single 1-bit counter for the remaining characters of the alphabet.

  8. Bit Parallel simulation P = abbccc cbaother characters

  9. Previous Algorithms • Chhabra et al. (Proc. PSC 2015) presented: • BAM2 - A variation of BAM that handles a 2 - gram at a time. • EBL (Exact Backward for Large alphabets) - Based on the SBNDM2 algorithm.

  10. SIMD • The SIMD architecture allows the execution of multiple data on single instruction. • Sixteen 128-bit registers known as XMM0 to XMM15. • Weusespecializedstringmatching SIMD instructions in addition to standard SIMD instructions

  11. Jumbled Matching + SIMD • SIMD comprise of several aggregation operations: • Equal each • Equal any • Ranges

  12. Equal Any Approach • Equal any: • Handles 16 bytes at one time. • Two operands as input. • Set of characters in the pattern. • Text window. • Example: • Operand1: aeiou • Operand2: You drive me mad • Output: 0110001010010010

  13. Equal Any Approach • The equal any SIMD command returns a bitvector of 16 bits showing the positions in the test window which hold any character of the pattern. • A match candidate is found if the m bits of the vector are ones.

  14. SIMD Instructions • simd-equal-any(x, y) • _mm_extract_epi16( _mm_cmpistrm(x, y, SIDD_CMP_EQUAL_ANY), 0). • simd-cmpeq(x, y) • mm_movemask_epi8( _m128i_mm_cmpeq_epi8(_m128i x, _m128i y)).

  15. Least Frequent Character Approach • Based on the least frequent character of the text in the pattern. • Frequency of the character is based on the text or on the language. • We use SIMD instructions to analyze whether a test window of 16 bytes holds the least frequent character of the pattern.

  16. Least Frequent Character Approach • R is an array containing 16 bytes, each of which holding the least frequent character of the pattern. • The SIMD register x holds R and the SIMD register y holds a test window of 16 bytes of the text. • The registers x and y are compared by the simd-cmpeq operation. • Our previous algorithms used as checking ( or local search) routine.

  17. Experimental Results • The performance of SIMD instructions depends on the architecture of the processor. • The performance of a single instruction is measured by latency and throughput.

  18. Latency and throughput of SIMD instructions for Nehalem and Haswell

  19. Experimental Results: Execution times of algorithms for English data on Nehalem.

  20. Experimental Results: Execution times of algorithms for Protein data on Nehalem.

  21. Experimental Results: Execution times of algorithms for English data on Haswell.

  22. Execution times of algorithms for English data on Nehalem.

  23. Execution times of algorithms for Protein data on Nehalem.

  24. Execution times of algorithms for English data on Haswell.

  25. Concluding remarks • We introduced improved solutions for exact jumbled pattern matching based on the SIMD architecture. • If the latency of the used SIMD instructions would improve in future processors, the running times of the algorithms will respectively change.

  26. THANK YOU!!

More Related