This presentation describes stricter requirements for breakpoint (b-point) analysis: tighter criteria for AGE alignments and for the contigs used by both AGE and Crossmatch. It shows that the stringent b-points give much better agreement between AGE and CROSSMATCH results. It then examines the inconsistent false discovery rate (FDR) estimates from SAV validation and the biases that may cause them. Finally, it proposes combining GenomeStrip calls with assembly support, and possibly SAV validation, to improve call accuracy while accounting for the inherent biases of SAV validation.
Stringent breakpoints Alexej & Ken, August 17, 2011
Stringent requirements for b-points
• More stringent requirements for AGE alignment (remove ~8% of b-points)
• Increase stringency on contigs used by both AGE and Crossmatch (strong effect on the number of resulting breakpoints)
Phase1 deletion breakpoint assembly
First line – initial b-points
Second line – stringent b-points
Much better agreement between AGE and CROSSMATCH
First line – initial b-points
Second line – stringent b-points
All calls from 5 merged calls
Inconsistent FDR estimation: (2108 – 519)/2108 = 75% < 86%
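The arithmetic behind the "75% < 86%" comparison can be reproduced with a short sketch. What the counts 2108 and 519 denote comes from the slide's table, which is not reproduced in the text, so the function below only computes the fraction (total – subset)/total; the names `fraction_outside_subset`, `total`, and `subset` are illustrative assumptions, not terms from the slides.

```python
def fraction_outside_subset(total: int, subset: int) -> float:
    """Return (total - subset) / total, the quantity computed on the slide."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= subset <= total:
        raise ValueError("subset must lie between 0 and total")
    return (total - subset) / total


# Counts and the 86% figure are taken from the slide.
implied_rate = fraction_outside_subset(2108, 519)
reported_rate = 0.86

print(f"implied: {implied_rate:.1%}")                  # implied: 75.4%
print(f"below reported 86%: {implied_rate < reported_rate}")
```

The implied value of about 75% falls below the 86% estimate, which is the inconsistency the slide points out.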
Calls with SAV p-val <> 0.5
Require 50% reciprocal overlap
Second line – p-val > 0.5
• Calls with p-val < 0.5 and p-val > 0.5 are not dramatically different
• SAV validation seems to systematically overestimate FDR
• Possible reasons:
  • Input call properties (sample under-assignment)
  • Bias against smaller regions
  • Bias in repetitive regions
  • Reference sample bias when using aCGH probes
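The 50% reciprocal overlap requirement can be illustrated with a minimal sketch, assuming the usual definition: the shared region must cover at least half of each of the two intervals. The slides do not say how the overlap was computed, so the half-open `(start, end)` coordinates and the function names here are assumptions made for illustration.

```python
from typing import Tuple

Interval = Tuple[int, int]  # assumed half-open (start, end), with end > start


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    a_start, a_end = a
    b_start, b_end = b
    if a_end <= a_start or b_end <= b_start:
        raise ValueError("intervals must have positive length")
    shared = min(a_end, b_end) - max(a_start, b_start)
    if shared <= 0:
        return 0.0
    # The overlap must be judged against both intervals, hence the minimum.
    return min(shared / (a_end - a_start), shared / (b_end - b_start))


def passes_reciprocal_overlap(a: Interval, b: Interval,
                              threshold: float = 0.5) -> bool:
    """True if the shared region covers at least `threshold` of each interval."""
    return reciprocal_overlap(a, b) >= threshold


# Two 100 bp calls sharing 50 bp meet the 50% requirement.
print(passes_reciprocal_overlap((100, 200), (150, 250)))  # True
# A 100 bp call covering only 20% of a 250 bp call does not.
print(passes_reciprocal_overlap((100, 200), (150, 400)))  # False
```

Taking the minimum of the two fractions is what makes the criterion reciprocal: a small call fully contained in a much larger one fails, even though the small call itself is 100% covered.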
Proposal
• Use GenomeStrip + assembly support (all or only with both AGE and CROSSMATCH support) + SAV validated (possibly)
• SAV validation has inherent biases
• Assembly validation is orthogonal to SAV validation
• There is evidence that SAV overestimates FDR
• GenomeStrip + assembly support by both AGE and CROSSMATCH would have 29,281 calls with likely overestimated SAV FDR of 11%
• Use only consistent b-points by AGE and CROSSMATCH for genotyping