An Empirical Comparison of Monitoring Algorithms for Access Anomaly Detection

An Empirical Comparison of Monitoring Algorithms for Access Anomaly Detection Guy Martin GNU-OSLab Spring-10

Agenda
• Background
• Access Anomaly
• Partial Order Execution Graph (POEG)
• Access History
• English-Hebrew Labeling
• Task Recycling Algorithm
• Implementation
• Empirical Results
• Recapitulation
[DiSc90] - In Proceedings of the second ACM SIGPLAN symposium on Principles & Practice of Parallel Programming (PPoPP’90)

Background - Access Anomaly
• An access anomaly occurs when two concurrent threads access a shared memory location without explicit coordination and at least one of the accesses is a write.
• Task Recycling vs. English-Hebrew algorithms
Example program:
    Y = 1
    Doall i = 1 to 2
        X = Y + i
    Endall
    Z = X + Y
[Figure: execution graph of the example program. After Y = 1, the iterations i = 1 and i = 2 run concurrently (X = Y + 1 and X = Y + 2) and then join at Z = X + Y.]
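To make the anomaly in the example program concrete, the following minimal Python sketch enumerates the orders in which the two Doall iterations could write X and collects the resulting values of Z. The function names `run_doall` and `possible_results` are illustrative only, and the sketch assumes each iteration's assignment executes atomically.

```python
from itertools import permutations


def run_doall(order):
    """Run the example program with the Doall iterations in the given order."""
    y = 1
    x = None
    for i in order:      # each iteration performs X = Y + i
        x = y + i        # both iterations write the same shared X
    return x + y         # Z = X + Y, executed after Endall


def possible_results():
    """Collect every value Z can take over all iteration orders."""
    return {run_doall(order) for order in permutations((1, 2))}


if __name__ == "__main__":
    print(sorted(possible_results()))
```

Because the last write to X wins, Z depends on the scheduling of the two iterations, which is exactly the kind of nondeterminism the monitoring algorithms are meant to flag.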

Background - POEG
• A block is an instruction sequence, executed by a single thread, that contains no concurrency primitives or coordination operations.
• A POEG (Partial Order Execution Graph) captures Lamport's happens-before relation and imposes a partial order on the set of blocks that make up an execution instance.
• A block a is an ancestor of a block b if there is a path from a to b (b is then a descendant of a). If neither block is an ancestor of the other, a and b are concurrent.
[Figure: a POEG with blocks b0 through b13.]
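The ancestor and concurrency definitions can be sketched as plain graph reachability. This is only an illustration: the adjacency-dictionary representation, the helper names, and the four-block graph are assumptions, not the POEG from the slide.

```python
def is_ancestor(graph, a, b):
    """Return True if a path of one or more edges leads from block a to block b."""
    stack = list(graph.get(a, ()))
    seen = set()
    while stack:
        node = stack.pop()
        if node == b:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return False


def are_concurrent(graph, a, b):
    """Two distinct blocks are concurrent if neither is an ancestor of the other."""
    return a != b and not is_ancestor(graph, a, b) and not is_ancestor(graph, b, a)


# Hypothetical POEG: b0 forks b1 and b2, which join in b3.
POEG = {"b0": ["b1", "b2"], "b1": ["b3"], "b2": ["b3"], "b3": []}

if __name__ == "__main__":
    print(are_concurrent(POEG, "b1", "b2"))  # siblings of the fork
    print(are_concurrent(POEG, "b0", "b3"))  # b0 is an ancestor of b3
```

A search per query like this is costly at run time, which is why the algorithms compared in this talk encode the same relation in tags or parent vectors instead.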

Background – Access History
• The access history is examined at each access to variable X.
• The reader set of an access history contains two blocks only if they are concurrent.
[Figure: a POEG with access anomalies, annotated with the reading order of X and the access history (read and write entries) of X. The write of X in b9 conflicts with the reads of X in b2 and b6.]
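A rough sketch of how an access history might be checked on each access is given below. The `AccessHistory` class, its single-writer field, and the pruning rule are assumptions made for illustration; the concurrency relation is supplied as a callback, and the pairs in `CONCURRENT_PAIRS` are hypothetical values chosen to mirror the conflict named on the slide.

```python
class AccessHistory:
    """Per-variable history: a set of reader blocks and the last writer block."""

    def __init__(self, concurrent):
        self.concurrent = concurrent  # callable(block_a, block_b) -> bool
        self.readers = set()
        self.writer = None

    def _prune(self, block):
        # Keep only readers concurrent with the new block, so any two
        # blocks left in the reader set are concurrent with each other.
        self.readers = {r for r in self.readers if self.concurrent(r, block)}

    def read(self, block):
        """Record a read; return the conflicting (earlier, current) pairs."""
        conflicts = []
        if self.writer is not None and self.concurrent(self.writer, block):
            conflicts.append((self.writer, block))
        self._prune(block)
        self.readers.add(block)
        return conflicts

    def write(self, block):
        """Record a write; return the conflicting (earlier, current) pairs."""
        conflicts = [(r, block) for r in sorted(self.readers)
                     if self.concurrent(r, block)]
        if self.writer is not None and self.concurrent(self.writer, block):
            conflicts.append((self.writer, block))
        self._prune(block)
        self.writer = block
        return conflicts


# Hypothetical concurrency relation (assumed, not derived from the figure).
CONCURRENT_PAIRS = {frozenset(p) for p in
                    [("b2", "b6"), ("b2", "b9"), ("b6", "b9")]}


def concurrent(a, b):
    return frozenset((a, b)) in CONCURRENT_PAIRS


if __name__ == "__main__":
    history = AccessHistory(concurrent)
    history.read("b2")
    history.read("b6")
    print(history.write("b9"))
```

In a real monitor the `concurrent` callback would be answered by English-Hebrew tags or Task Recycling parent vectors rather than a lookup table.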

English-Hebrew Labeling
• Each block is associated with a tag consisting of a pair of labels.
  • English label E: produced by a left-to-right pre-order numbering.
  • Hebrew label H: created symmetrically, from right to left.
• Labels are strings of numbers, ordered lexicographically.
• English label creation (c is the new block, p its parent):
  • Doall: E(tag_ci) ← E(tag_p)|i
  • Coordination: E(tag_c) ← E(tag_p)|1
  • Endall: E(tag_c) ← max(E(tag_pi))
• Concurrency condition: blocks i and j are concurrent if either
  • E(tag_i) < E(tag_j) and H(tag_i) > H(tag_j), or
  • E(tag_i) > E(tag_j) and H(tag_i) < H(tag_j)
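The Doall labeling rule and the concurrency condition can be sketched with Python tuples, whose built-in comparison is lexicographic. Two assumptions are made here: the Hebrew label of the i-th of n Doall children is taken to be H(parent) followed by n - i + 1 (one reading of "symmetrically from right to left"), and coordination lists are omitted, so the test only covers pure Doall nesting. The function names are illustrative.

```python
def doall_child_tags(parent_tag, n):
    """Return the (E, H) tags of the n children of a Doall.

    English labels append i (left to right); Hebrew labels are assumed
    to append n - i + 1 (right to left).
    """
    e, h = parent_tag
    return [(e + (i,), h + (n - i + 1,)) for i in range(1, n + 1)]


def concurrent(tag_a, tag_b):
    """Label-only concurrency test; coordination lists are not modeled."""
    (ea, ha), (eb, hb) = tag_a, tag_b
    # Tuples compare lexicographically, like the label strings.
    return (ea < eb and ha > hb) or (ea > eb and ha < hb)


if __name__ == "__main__":
    root = ((1,), (1,))                      # tag 1,1
    kids = doall_child_tags(root, 2)         # tags 11,12 and 12,11
    grandkids = doall_child_tags(kids[0], 3)
    print(concurrent(grandkids[0], kids[1]))
    print(concurrent(root, kids[0]))
```

The two labels order concurrent blocks in opposite directions, while an ancestor precedes its descendant in both orders, so a disagreement between the English and Hebrew orders signals concurrency.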

English-Hebrew Labeling
• 111,123 and 12,11 are concurrent.
• 1131,1231 and 12,11 are not concurrent.
• Coordination list: consists of the tags of the ancestors of each block such that all tags are unordered.
[Figure: a POEG whose blocks are labeled with English,Hebrew tags (1,1; 11,12; 12,11; 111,123; 112,122; 113,121; 121,113; 122,112; 123,111; 1131,1231; 1211,1131) and annotated with coordination lists.]

Task Recycling Algorithm
• Each block has a unique task identifier consisting of a task and a version number.
• Tasks can be recycled: more than one block can be assigned to the same task at different times during execution.
• The version number distinguishes the different blocks assigned to the same task; it is incremented each time a block is assigned to the task.
• Concurrency information is maintained in a parent vector associated with each executing block.

Task Recycling Algorithm
• Parent vector initialization for a new block c with parent blocks p1, …, pm that have task identifiers t1v1, …, tmvm:
    for i = 1 to T do
        if ∃ tj ∈ {t1, …, tm} : i = tj
            then parent_c[i] ← vj
            else parent_c[i] ← max(parent_p1[i], …, parent_pm[i])
    endfor
• Concurrency test: a block b is concurrent with another block with task identifier tv iff parent_b[t] < v.
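The parent vector initialization and the concurrency test could be sketched in Python as follows. The function names, the use of version 0 to mean "no ancestor on this task", and the small fork-join example are assumptions for illustration. The sketch also assumes the test is asked from the currently executing block about an earlier recorded block.

```python
def init_parent_vector(parents, num_tasks):
    """Build the parent vector of a new block.

    parents: list of ((task, version), parent_vector), one per parent block.
    Index 0 is unused so that tasks can be numbered 1..T as on the slide.
    """
    direct = {task: version for (task, version), _ in parents}
    vector = [0] * (num_tasks + 1)  # 0 is assumed to mean "no ancestor"
    for i in range(1, num_tasks + 1):
        if i in direct:
            # Task i ran one of the parents: record that parent's version.
            vector[i] = direct[i]
        else:
            # Otherwise inherit the latest version known to any parent.
            vector[i] = max((vec[i] for _, vec in parents), default=0)
    return vector


def is_concurrent(parent_vector, task_id):
    """Block b is concurrent with block (t, v) iff parent_b[t] < v."""
    task, version = task_id
    return parent_vector[task] < version


if __name__ == "__main__":
    T = 2
    root_id, root_vec = (1, 1), init_parent_vector([], T)
    a_id, a_vec = (1, 2), init_parent_vector([(root_id, root_vec)], T)
    b_id, b_vec = (2, 1), init_parent_vector([(root_id, root_vec)], T)
    join_vec = init_parent_vector([(a_id, a_vec), (b_id, b_vec)], T)
    print(is_concurrent(a_vec, b_id))     # forked siblings
    print(is_concurrent(join_vec, b_id))  # join block vs. its parent
```

The test is a single array lookup and comparison, which fits the low per-access cost reported for Task Recycling, while the O(T) loop at block creation is the bookkeeping cost paid up front.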

Task Recycling Algorithm
[Figure: a POEG whose blocks are labeled with task identifiers (task number followed by version number): 11, 12, 13, 14, 15, 16, 21, 22, 23, 24, 31, 41, 51, 61.]
• Block 13 is concurrent with block 21, but block 15 is not concurrent with block 21.

Implementation
• The algorithms are implemented for the Ultracomputer parallel Fortran compiler.
• Several optimizations were applied to both algorithms during implementation.
• Some libraries were written in C and in assembler for efficiency.
• Tests were run on four scientific parallel programs.

Empirical Results
• Programs with little coordination and frequent fine-grained Doall
  • English-Hebrew performs better than Task Recycling because of the relatively high cost of maintaining parent vectors and assigning tasks.
• Cost per variable access
  • Task Recycling is more efficient than English-Hebrew.
• Programs with frequent coordination
  • The cost of maintaining coordination lists approaches the cost of maintaining parent vectors.

Empirical Results
• Space requirements are bounded by O(T × V), where
  • T is the maximum number of threads that may potentially execute in parallel, and
  • V is the number of monitored variables.
• Space requirements for typical programs are on average O(V).
• Task Recycling is more efficient in terms of space requirements and often in performance.

Recapitulation
• English-Hebrew and Task Recycling both apply to programs with nested parallelism and coordination.
• English-Hebrew
  • Maintains a shared-memory access history
  • Uses tags to determine the concurrency relationship among blocks
• Task Recycling
  • Maintains parent vectors
  • Assigns task identifiers to POEG blocks to determine the concurrency relationship
