Maximum Test Coverage Minimum Test Cases

Using Equivalence Classes to your advantage

Introduction
  • John Arrowwood – john@irie-inc.com
  • Been doing QA in the Portland area since 1996
  • Certified ScrumMaster, working on Agile teams since 2007
  • Specialize in test automation
  • Have worked as a consultant for a plethora of companies: Intel, Regence, TransUnion, Dynamics Research Corp (Amdocs), RuleSpace, Merant (Serena), CenturyTel, Iovation, Windstream Communications, and Kronos. Current contract is with the IVR team at Comcast
The Problem
  • More and more, applications are database-driven
  • The size of those databases keeps getting bigger
  • The behavior of the application depends on the characteristics of the data
  • How much testing is “enough”?
How Much is Enough?
  • Test as much as you can before ship date
  • Beat on it until the incoming defect rate falls below some arbitrary threshold
  • Gut Feel – can we ship it?
  • Test Plan 100% executed
  • The open questions for all of these are:
    • Have we tested enough to find all significant defects?
    • What haven’t we tested?
Example
  • Service Function
    • Regence – Out of Pocket Calculator
  • Output depends on Input
  • Countless permutations of inputs are possible
    • Customer – location, plan details, deductible met?
    • Provider – in network, out of network, agreements
    • Service – specific procedure being provided
  • Which customers do I need to test?
  • Which Providers should I test with?
A Simpler Example
  • Database of people
    • TransUnion – Consumer Lookup – Credit Report
    • Regence – Provider Search
  • Testing functionality: lookup by last name
  • Requirement: All records are findable
  • Example searches: smith, zeta-jones, de la joya, o’conner
  • What else do I need to test?
  • Are there other names I should try?
Option 1: Test Everything
  • If time allowed, this would be the ideal solution, as no defect would be left undiscovered
  • But for any realistic data set size, this is infeasible
    • 375 million records at TransUnion
    • At an average of one record per second, it would take almost 12 years to test them all
    • And that is without testing any variations and permutations of the search criteria
  • And because of the redundancy in the data, many of those tests would be a waste of effort, anyway
Data Redundancy
  • There is a lot of redundancy in real-world data sets
    • In the 1990 US Census, over 1% of the population has a last name of Smith.
    • The ten most popular last names make up over 5.5% of the population
    • Over 2.5% of women are named Mary
    • The top 10 names make up almost 11% of women
    • 3.3% of men are named James, another 3.2% are John, and the top ten names make up 23% of all men!
    • Just how many guys are there named “John Smith” ???
  • Time spent testing redundant test cases is wasted
Option 2: Random Sample Set
  • Common for very large data sets, like at TransUnion
  • Gives a (supposedly) reasonable approximation of correctness and predictability
  • With a 2-week test cycle, at one record per second you can test at best about 1.2 million records
    • That is 0.32%
    • Business wanted any characteristic that appears in at least 5% of data to be tested
    • A lot of characteristics above that threshold would most likely never get tested!
  • Data redundancy
    • If 1% of the whole set is Smith, then about 1% of the random subset probably is, too
    • You spend around 3-1/2 hours of your limited 2 weeks testing Smith records, when you really only needed to spend 1 second testing one of them
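For concreteness, the arithmetic behind those figures: a 2-week cycle is 14 × 86,400 ≈ 1.21 million seconds, so at one record per second you cover roughly 1.2 million of 375 million records, or about 0.32%; and if roughly 1% of the sample is Smith, that is about 12,000 records, or close to 3½ hours spent on a single equivalence class.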
Option 3: Hand-Selection
  • A domain or subject-matter expert (SME) - maybe you, maybe not - can select your test cases
  • You are unlikely to select redundant test cases, so you can be more efficient
  • Are you certain you have tested everything that needs to be tested?
    • 375 million records condensed to 64 million distinct surnames
    • Never would have guessed that the data set included strange things like “surname(nickname)” or “a.surname”
    • Manual review of the 64 million names never revealed them
Option 4: Equivalence Classes
  • Analyze the production data, and iteratively and programmatically assign records to buckets, where every record in the bucket is interchangeable for any other
  • When you are done, you test one record per bucket
  • If properly defined, you can test exactly enough to prove correctness, no more, no less
  • Think of it as a SME on steroids – it has the patience and tenacity to analyze billions of records
  • But it is only as correct as the logic put into it
Equivalence Classes
  • Mathematically, an equivalence class is the set of all values x for which f(x) evaluates to the same output value y
    • See http://en.wikipedia.org/wiki/Equivalence_class
  • In Quality Assurance, an Equivalence Class is the set of all inputs that are interchangeable with one another without affecting the outcome of the test, so that only one of them needs to be tested in order to have confidence in the results
    • See http://en.wikipedia.org/wiki/Equivalence_Partitioning
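As a toy illustration of the mathematical definition (the function here is mine, not from the talk): for f(x) = x mod 3 over the integers there are exactly three equivalence classes, {…, 0, 3, 6, …}, {…, 1, 4, 7, …}, and {…, 2, 5, 8, …}, and testing one representative from each class exercises every distinct output of f.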
Example Partitions
  • Valid vs. Invalid input values
  • Upper-bound, lower-bound, in-bound, out-of-bounds
  • Empty-set vs. one, two, or three or more elements
  • Empty input vs. normal, vs. max-size, vs. oversize
  • Can you name others?
Pre-Requisites
  • Equivalence Partitioning doesn’t require coding ability
  • Figuring out how best to partition millions of records does
  • It’s worth the investment to obtain those skills, or hire someone who has them, or conscript a developer into doing this work
  • The production data must be available for programmatic analysis – preferably in a readily accessible format, e.g. .csv or .xml
    • Barring that, maybe extract data from production logs
  • You must have the computational resources to do the analysis
Example: Surnames
  • Database with 375 million distinct people in it
    • 64 million distinct last names
    • The top 90 percent are equivalent
  • Built a histogram of surnames, most common first
    • Manually reviewed this list
    • Quickly determined that it was still too many to go through
Data Patterns
  • Converted every name to a pattern:
    • Converted every upper-case letter to ‘A’
    • Every lower-case letter to ‘a’
    • And every digit to ‘9’
  • Generated a new histogram – count, pattern, example
  • List was smaller, but still too big
  • And there were still a lot of clearly equivalent test cases
  • But, now I started to see some of the oddball things showing up in the list
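A minimal sketch of that pattern function in Perl (the sub name ec_pattern and the sample names are illustrative, not taken from the talk):

    use strict;
    use warnings;

    # Collapse a surname into a case/digit pattern: upper-case letters become 'A',
    # lower-case letters 'a', digits '9'; punctuation and spaces pass through untouched.
    sub ec_pattern {
        my ($name) = @_;
        (my $pattern = $name) =~ s/[A-Z]/A/g;
        $pattern =~ s/[a-z]/a/g;
        $pattern =~ s/[0-9]/9/g;
        return $pattern;
    }

    print ec_pattern('Miller'),     "\n";   # Aaaaaa
    print ec_pattern('McDonald'),   "\n";   # AaAaaaaa
    print ec_pattern("O'Brien"),    "\n";   # A'Aaaaa
    print ec_pattern('de la Joya'), "\n";   # aa aa Aaaa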
Visual

Raw Histogram (count, surname):

      2069309  Smith
      1629150  Johnson
      1357432  Williams
      1216138  Brown
      1212358  Jones
       955441  Miller
       948214  Davis
       685084  Wilson
       670050  Anderson
       631897  Taylor
       625351  Thomas
       615815  Garcia
       613769  Moore
       594073  Jackson
       577366  Rodriguez

Pattern Histogram (count, pattern, example):

     55016653  Aaaaaa         Miller
     44526736  Aaaaa          Smith
     44043929  Aaaaaaa        Johnson
     28721375  Aaaaaaaa       Williams
     20727790  Aaaa           Hall
     14237549  Aaaaaaaaa      Rodriguez
      6060133  Aaaaaaaaaa     Richardson
      3531179  Aaa            Lee
      2216343  Aaaaaaaaaaa    Christensen
       686906  Aaaaaaaaaaaa   Christiansen
       519600  Aa             Le
       448141  AaAaaaaa       McDonald
       360436  AaAaaaa        McGuire
       273971  AaAaaa         McLean
       226265  Aaaaaaaaaaaaa  Hollingsworth
       212366  AaAaaaaaa      McCormick
       179730  Aa Aaaaa       De Jesus
       137740  Aa Aaaaaa      Mc Donald

Incremental Improvement

Previous (count, pattern, example):

     55016653  Aaaaaa         Miller
     44526736  Aaaaa          Smith
     44043929  Aaaaaaa        Johnson
     28721375  Aaaaaaaa       Williams
     20727790  Aaaa           Hall
     14237549  Aaaaaaaaa      Rodriguez
      6060133  Aaaaaaaaaa     Richardson
      3531179  Aaa            Lee
      2216343  Aaaaaaaaaaa    Christensen
       686906  Aaaaaaaaaaaa   Christiansen
       519600  Aa             Le
       448141  AaAaaaaa       McDonald
       360436  AaAaaaa        McGuire
       273971  AaAaaa         McLean
       226265  Aaaaaaaaaaaaa  Hollingsworth
       212366  AaAaaaaaa      McCormick
       179730  Aa Aaaaa       De Jesus
       137740  Aa Aaaaaa      Mc Donald

Longer than 4 = equivalent (count, pattern, example; the 90% and 99% marks show cumulative coverage at that row):

    195839605  Aaaaa          Smith
     20727790  Aaaa           Hall            90%
      3531179  Aaa            Lee
      1132856  AaAaaaa        McDonald
       972866  Aaaaa Aaaaa    Rivera Rivera
       519600  Aa             Le
       436146  Aa Aaaaa       De Jesus
       339588  Aaaaa-Aaaaa    Pierre-Louis
       286533  A'Aaaaa        O'Brien
       273971  AaAaaa         McLean
       240705  Aaaaa-aaaa     Pierre-louis    99%
       215607  A'aaaa         O'brien
       186674  Aaa Aaaaa      Del Valle
       133223  Aa Aaaa        De Leon
       116553  Aaaaa Aaaa     Santa Cruz
       116168  AaAaa          McCoy
       113526  Aaaa Aaaaa     Diaz Rivera
       103296  A A            A M

Iterative Process
  • Made the pattern function more and more complicated
    • Downloaded lists of names from census data, USPS, etc.
    • Tokenized the name, and for each token, if it matched one of the names in one of the lists, I replaced it with {list-name}, e.g. {surname}, {male}, {female}, {ambig}, {city}, {state}, etc.
    • If a token was not recognized, then the old pattern was the fall-back
    • Punctuation was never transformed, so it could be taken into account
  • After several iterative improvements, got the histogram down to about 38k patterns
    • Still too much to test manually, but perfectly reasonable for use in an automated test suite, if I were so inclined – testing could complete in under 11 hours
    • Equates to 0.01% of full data set, or 0.05% of full list of surnames
    • The top 95% were encompassed in the first 23 patterns in the list!
    • Upon inspection there were still clearly common elements farther down in the list, but I was able to justify where to stop testing
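A sketch of that token-level refinement in Perl; the word-list file names, the priority order, and any tag names beyond those mentioned above are assumptions for illustration:

    use strict;
    use warnings;

    # Load word lists (e.g. census surnames, first names, USPS cities) into a lookup
    # table; the first list to claim a word wins, which gives a defined priority.
    my %word_class;
    for my $list (qw(surname male female city state)) {
        open my $fh, '<', "$list.txt" or die "Cannot open $list.txt: $!";
        while (my $word = <$fh>) {
            chomp $word;
            $word_class{ lc $word } //= "{$list}";
        }
        close $fh;
    }

    # Recognized tokens become {list-name}; unrecognized tokens fall back to the
    # simple case/digit pattern; punctuation and whitespace are never transformed.
    sub ec_name_pattern {
        my ($name) = @_;
        my @out;
        for my $token ( split /([\s\-.'()]+)/, $name ) {
            next if $token eq '';
            if ( exists $word_class{ lc $token } ) {
                push @out, $word_class{ lc $token };
            }
            else {
                (my $p = $token) =~ s/[A-Z]/A/g;
                $p =~ s/[a-z]/a/g;
                $p =~ s/[0-9]/9/g;
                push @out, $p;
            }
        }
        return join '', @out;
    }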
Generalized Process
  • Get access to an extract of the raw data
  • Create f(x)
    • Always start simple, for example: s/[A-Z]/A/g; s/[a-z]/a/g; s/[0-9]/9/g;
  • Build a histogram showing the count, the pattern, and the most common input data that produced that pattern
  • Review this histogram (perhaps with the developers), decide which elements in the top few hundred are equivalent and which are not
  • Make intelligent changes to your transformation function to classify more equivalent values into the same bucket
  • Repeat until you are satisfied with the results
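Tying those steps together, a minimal end-to-end sketch in Perl (the input file surnames.txt, with one value per line, is an assumption):

    use strict;
    use warnings;

    # f(x): the transformation function; start simple and refine iteratively.
    sub f {
        my ($x) = @_;
        $x =~ s/[A-Z]/A/g;
        $x =~ s/[a-z]/a/g;
        $x =~ s/[0-9]/9/g;
        return $x;
    }

    my (%count, %raw_count, %example);
    open my $fh, '<', 'surnames.txt' or die "Cannot open surnames.txt: $!";
    while (my $value = <$fh>) {
        chomp $value;
        my $pattern = f($value);
        $count{$pattern}++;
        $raw_count{$value}++;
        # keep the most frequently seen raw value as the example for this pattern
        if ( !defined $example{$pattern}
            || $raw_count{$value} > $raw_count{ $example{$pattern} } ) {
            $example{$pattern} = $value;
        }
    }
    close $fh;

    # Histogram: count, pattern, most common example, biggest buckets first
    for my $pattern ( sort { $count{$b} <=> $count{$a} } keys %count ) {
        printf "%10d  %-20s  %s\n", $count{$pattern}, $pattern, $example{$pattern};
    }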
Example: Phonetic Encoding
  • SoundEx is one example of a phonetic algorithm that aims to put surnames that sound the same in the same bucket, ignoring variants of spelling
    • Smith, Smyth, Smythe would all fall in the same bucket
    • Bare, Bear, Bahr, and maybe even Beer, too
    • http://en.wikipedia.org/wiki/SoundEx
  • Other examples which aim to increase accuracy are
    • http://en.wikipedia.org/wiki/Metaphone
    • http://en.wikipedia.org/wiki/Match_Rating_Approach
    • http://en.wikipedia.org/wiki/New_York_State_Identification_and_Intelligence_System
    • TransUnion had their own, unpublished algorithm
  • All of these are excellent examples of the concept: a data transformation function which maps input values that are conceptually the same into a common output value
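As a quick illustration (using CPAN's Text::Soundex module, which is my choice and not something the talk prescribes), spelling variants of the same-sounding surname collapse to a single code, i.e. a single bucket:

    use strict;
    use warnings;
    use Text::Soundex;   # classic SoundEx implementation from CPAN

    print soundex($_), "  $_\n" for qw(Smith Smyth Smythe);    # all S530
    print soundex($_), "  $_\n" for qw(Bare Bear Bahr Beer);   # all B600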
Example: Address
  • Given addresses: line1, line2, city, state, zip, plus4
  • Due to the nature of the application, you should test at least one record per zip code
  • Because you can search by city+state without zip, you really need one per city+state+zip
Example: Address
  • If the application has custom parsing for the line1, you need an equivalence class function for it
    • Different cities have different address formats
      • 123 SW Main St
      • 123 Main St SW
      • 123 S Main St W
    • Some addresses will have apartment/suite information in line1, others in line2
  • If the parsing of those fields is handled by a third party library, and you have the data in separate fields in your database, then you can probably ignore that field
Example: line1 Pattern
  • 123 SW Main St
  • 19754 North East Broadway Boulevard
  • My Equivalence Class function
    • Tokenize the string
    • Using a defined priority, map recognized words into token class identifiers
    • For some classes, merge repeated adjacent token classes into one (e.g. “North East” becomes a single {dir})
  • Output for both:
    • {digits} {dir} {token} {tfare}
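A sketch of such a line1 function in Perl; the word lists are tiny stand-ins (a real version would load full USPS directional and street-suffix tables), and only the class names {digits}, {dir}, {token}, {tfare} come from the slide:

    use strict;
    use warnings;

    # Map recognized words to token class identifiers.
    my %class_of;
    $class_of{$_} = '{dir}'   for qw(n s e w ne nw se sw north south east west);
    $class_of{$_} = '{tfare}' for qw(st street ave avenue blvd boulevard rd road dr drive);

    sub ec_line1_pattern {
        my ($line1) = @_;
        my @classes;
        for my $token ( split /\s+/, $line1 ) {
            my $class = $token =~ /^\d+$/             ? '{digits}'
                      : exists $class_of{ lc $token } ? $class_of{ lc $token }
                      :                                 '{token}';
            # merge repeated adjacent classes, e.g. "North East" -> one {dir}
            push @classes, $class unless @classes && $classes[-1] eq $class;
        }
        return join ' ', @classes;
    }

    print ec_line1_pattern('123 SW Main St'), "\n";
    print ec_line1_pattern('19754 North East Broadway Boulevard'), "\n";
    # both print: {digits} {dir} {token} {tfare}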
The Whole Record
  • For every field, you need to be able to have a custom function f(x) that handles that type of data
    • Some of those fields may use the same function, e.g. ec_enum(x) which returns the input value converted to lower case
  • You can choose to ignore some fields (e.g. plus4)
    • Always ignore volatile fields, like auto-increment primary keys, full phone numbers, etc.
  • Other fields are only relevant in combination with others (e.g. city, state, zip)
  • You want
    • an example of each distinct pattern for line1+line2
    • an example of each city, state, zip combination
    • You do NOT want every pattern of line1 for each city, state, zip, as that would be hugely wasteful
    • But you would like your line1 patterns to be spread out over every city, state, zip combination
Getting What You Want
  • The process is the same as with the Surnames – the only difference is you don’t generate a single pattern
    • You generate one pattern for the line1 and line2 fields
    • You generate another for the city, state, zip
  • Associate the record with both patterns
    • Rather than having a single pattern that represents the data, you have two
    • The pattern string includes the context:
      $pattern_addr = 'addr:' . ec_address_pattern( line1, line2 );
      $pattern_csz  = 'csz:'  . ec_city_state_zip( city, state, zip );
    • Save both patterns in your histogram hash
    • Save a reference to the record for both pattern strings in your “example” mapping
The Magic Filter
  • In general, you process records as a filter
    • Keep a master hash table
      • key = pattern, value = reference to record
    • If frequency matters to you, keep another hash table
      • key = pattern, value = count of occurrences
    • Process records one at a time
    • Pass the record to your f(x) transformation function
    • For every pattern returned by your transformation function, update your hash tables (or update them directly in the function)
    • When you get to the end, you output all of the unique records still being pointed to by the master hash table
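A condensed sketch of the filter in Perl. The record layout (pipe-delimited lines on STDIN) and the helper pattern sub are assumptions; the hash-table bookkeeping is the part the slide describes:

    use strict;
    use warnings;

    # Simple case/digit transformation, standing in for your real f(x) functions.
    sub ec_pattern {
        my ($x) = @_;
        $x =~ s/[A-Z]/A/g;  $x =~ s/[a-z]/a/g;  $x =~ s/[0-9]/9/g;
        return $x;
    }

    my %example;   # master hash: pattern -> record currently representing that pattern
    my %count;     # optional:    pattern -> how many records produced it

    while (my $record = <STDIN>) {
        chomp $record;
        my ($last, $line1, $city, $state, $zip) = split /\|/, $record;

        # one or more context-tagged patterns per record
        my @patterns = (
            'name:' . ec_pattern($last),
            'addr:' . ec_pattern($line1),
            'csz:'  . lc( join ',', $city, $state, $zip ),
        );

        for my $pattern (@patterns) {
            $count{$pattern}++;
            $example{$pattern} = $record;   # newest match replaces the previous one
        }
    }

    # Output the unique records still held by the master hash: at most one per pattern
    my %emitted;
    print "$_\n" for grep { !$emitted{$_}++ } values %example;

Because the master hash only ever holds one record per pattern, anything no longer referenced is free to be reclaimed, which is the point the next slide makes.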
The Magic Filter
  • The part that does the actual filtering can be abstracted into reusable code
  • How it works:
    • In a modern language, objects/variables no longer being referenced are garbage collected
    • Your code stores a reference in the hash
    • As long as that reference remains, the record remains
    • When the last reference to that record is replaced by a reference to some other record that matches the same pattern, the redundant record is forgotten, and the memory it used is reclaimed for future records
Memory Constrained
  • This algorithm assumes that you have enough memory to store all of your selected test records
  • If that is not the case
    • Just keep a hash of all patterns that have been seen
    • If the record returned any “new” patterns, output it
    • Then add the “new” pattern to the hash for next time
  • This will be less than optimal, but it will allow you to filter to significantly fewer records without having to use as much memory to do it
  • NOTE: It is better to just get more memory, and use the memory you have wisely!
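A sketch of that streaming variant, under the same assumed record format; only the set of already-seen patterns is kept in memory:

    use strict;
    use warnings;

    # Stand-in transformation: real code would emit all of the record's patterns.
    sub record_patterns {
        my ($record) = @_;
        (my $p = ( split /\|/, $record )[0]) =~ s/[A-Z]/A/g;
        $p =~ s/[a-z]/a/g;
        $p =~ s/[0-9]/9/g;
        return ("name:$p");
    }

    my %seen;   # pattern -> true once some emitted record already covers it
    while (my $record = <STDIN>) {
        chomp $record;
        my @new = grep { !$seen{$_} } record_patterns($record);
        if (@new) {
            print "$record\n";        # this record contributes something new
            $seen{$_} = 1 for @new;   # remember the patterns it just covered
        }
    }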
Normalized Data
  • What if the data is normalized?
    • Zero or more phone numbers?
    • Zero or more addresses?
  • Need to find a way to export it so that your filter can read it, e.g. XML
  • Add additional patterns, one or two per address, one per phone number, plus one to indicate how many addresses and another for how many phone numbers the record had
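A small sketch of that idea, assuming the exported record has already been parsed into a Perl structure with address and phone lists (the field names are illustrative):

    use strict;
    use warnings;

    sub ec_pattern {
        my ($x) = @_;
        $x =~ s/[A-Z]/A/g;  $x =~ s/[a-z]/a/g;  $x =~ s/[0-9]/9/g;
        return $x;
    }

    # One or more patterns per nested element, plus one pattern for each element count
    sub patterns_for_person {
        my ($person) = @_;
        my @patterns;
        push @patterns, 'addr:'  . ec_pattern($_) for @{ $person->{addresses} };
        push @patterns, 'phone:' . ec_pattern($_) for @{ $person->{phones} };
        push @patterns, 'addr_count:'  . scalar @{ $person->{addresses} };
        push @patterns, 'phone_count:' . scalar @{ $person->{phones} };
        return @patterns;
    }

    print "$_\n" for patterns_for_person({
        addresses => [ '123 SW Main St', 'PO Box 42' ],
        phones    => [ '503-555-0100' ],
    });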
Statistical Skew
  • When using a randomly sampled subset of the data of sufficient size, you can usually infer or generalize something about the behavior of the data set as a whole
  • EC selection de-emphasizes common data elements, and over-emphasizes uncommon ones, completely destroying any ability to generalize based on your testing results
  • This technique is great for finding defects, not for characterizing the behavior of a system as a whole
  • Be aware, and be prepared to prevent anyone from drawing fallacious conclusions about the whole data set based on your testing of the subset
Determinism
  • Because of how the algorithm works, it will tend to select the last records seen, and only output older records that are not covered by more recent ones
  • This can create a statistical skew to the sample set (above and beyond the skew inherent to the algorithm)
  • It is best to process your records in random order
    • If you re-randomize, you then get different records
    • Otherwise, you will get the same set of records every time
  • This is added insurance against your EC functions being inadequate
Questions?
  • FAQ:
    • Yes! You can email me if you have a question that we don’t get to here, and if time allows, I will answer it, no charge
    • Yes! Your organization can hire me for short-term engagements to help you implement these principles
    • Yes! It will not be cheap
  • Anything else?