



Stratification: Case study to illustrate alternative methods to stratify a sampling frame

Dr. Will Yancey, CPA

This material is the property of the presenter and cannot be reproduced or used without the expressed written consent of the presenter.


Outline

  • Why stratify?

  • Coefficient of Variation (CV)

  • High and Low Thresholds

  • Number of strata

  • Strata Boundary Determination

Case study data for this presentation: 185,083 rows of purchase invoice line items.


A. Why stratify?

Parable of the Footballs and the Fish

  • You are asked to determine the weight of 1,000 footballs. You know they are identical in weight. You can weigh only one football at a time. How many must you weigh?

  • You are asked to determine the weight of 1,000 different fish taken from a lake. They are highly variable in weight. You can weigh only one fish at a time. How many must you weigh?


Parable continued

  • How could we organize the fish so we could get a reasonable estimate without weighing them all?

  • What feature would we use to organize the fish?

  • What features would probably not be useful for estimating total weight?

  • How many piles should we have?


Effective Stratification

Effective stratification: If possible, what we are measuring is similar within each stratum and different between strata.

Stratifying (grouping, categorization, segmenting, etc.)

  • Grouping by account, type, division, or other attribute.

  • Stratifying by dollar amount within group.

A sales and use tax audit goal is to estimate total tax dollar error.

Correlation of invoice line amounts with taxability or errors:

  • If an error occurs, its dollar size is proportional to the invoice line amount.

  • The relative frequency of error occurrence might or might not be correlated with invoice amount.


Accounts Payable Case Study Data

185,083 rows of invoice line items

Range $0.01 to $26,763,476

$493 million total population base

4% of items with amount ≥ $10,000


A/P Case Study: Distribution of $

The 4% of items with amount ≥ $10,000 contain $376 million of the $493 million population base (76%).


B. Coefficient of Variation (CV)

CV is a relative measure of the dispersion around the mean: the standard deviation divided by the mean.

Dollar stratification results in lower CV within each stratum than in the combined unstratified sampling frame.

Caution: When the mean is close to zero, CV is very sensitive to small changes.
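As a rough illustration of the two points above, the sketch below computes CV for a small made-up list of invoice amounts, first unstratified and then within three dollar strata. The amounts, the boundaries of 1,000 and 10,000, and the function name are assumptions for illustration only and are not the case study data. The population form of the standard deviation is also an assumption.

```python
import statistics

def coefficient_of_variation(amounts):
    """Standard deviation divided by mean (population form, an assumption)."""
    mean = statistics.fmean(amounts)
    if mean == 0:
        # CV is undefined at a zero mean and unstable near it.
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.pstdev(amounts) / mean

# Made-up amounts, not the case study data.
amounts = [120, 150, 300, 800, 1200, 2500, 9000, 15000, 40000, 90000]

# Illustrative dollar strata with hypothetical boundaries.
strata = [
    [a for a in amounts if a < 1000],
    [a for a in amounts if 1000 <= a < 10000],
    [a for a in amounts if a >= 10000],
]

cv_all = coefficient_of_variation(amounts)
cv_strata = [coefficient_of_variation(s) for s in strata]

print(f"unstratified CV: {cv_all:.2f}")
for i, cv in enumerate(cv_strata, start=1):
    print(f"stratum {i} CV: {cv:.2f}")
```

For this toy list, each within-stratum CV comes out below the unstratified CV, which is the effect the slide describes.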


CV, stratification, and precision

Reducing CV usually improves precision.

(Remember Parable of Footballs and Fish.)

For each stratum compute the CV of the items’ invoice line amounts.

For a given total sample size under stratified random sampling, the best precision usually occurs when the CVs are relatively constant across the strata.

  • Consider adjusting strata boundaries or adding more strata to adjust CV across the strata.


Case Study: Coefficient of Variation


C. High and Low Thresholds

All items with a dollar amount greater than the High Threshold (H) will be examined in detail (actual-basis examination) rather than sampled.

“This removal of the extremes from the main body of the population reduces the skewness and improves the normal approximation.” Cochran, Sampling Techniques, 3rd Edition, p. 44.


Setting High Threshold (H), also known as ceiling or detail threshold

Possible criteria for setting a value for H:

  • Approximately the top 0.1% to 0.2% of items (or some other %).

  • More than 3 standard deviations above the unstratified population mean.

As H decreases, the number of items in the detail stratum increases.

Items above H are from relatively few major vendors or major projects.
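The two criteria above can be sketched as follows. This is a minimal illustration on made-up amounts. The function names, the default `top_share=0.002`, and the use of the population standard deviation are assumptions, not part of the presentation.

```python
import statistics

def high_threshold_by_top_share(amounts, top_share=0.002):
    """Return H so that roughly the top `top_share` of items exceed H.

    H is set to the largest amount outside the top k items, so that
    (absent ties) exactly k items are strictly greater than H.
    """
    ordered = sorted(amounts, reverse=True)
    k = max(1, round(len(ordered) * top_share))
    if k >= len(ordered):
        raise ValueError("top_share leaves no items to sample")
    return ordered[k]

def high_threshold_by_sigma(amounts, n_sigma=3.0):
    """Return the unstratified mean plus n_sigma standard deviations."""
    return statistics.fmean(amounts) + n_sigma * statistics.pstdev(amounts)

# Made-up frame of 1,000 items, not the case study data.
amounts = [100.0] * 990 + [1000.0] * 8 + [50000.0, 250000.0]

h_share = high_threshold_by_top_share(amounts)
h_sigma = high_threshold_by_sigma(amounts)

detail_share = [a for a in amounts if a > h_share]
detail_sigma = [a for a in amounts if a > h_sigma]
print(h_share, len(detail_share))
print(round(h_sigma, 2), len(detail_sigma))
```

The two rules generally give different values of H. Comparing the resulting detail-stratum counts is one way to see how a lower H increases the number of items examined in detail.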


Case Study: High Threshold

Population Size = 185,083. Population Base = $492,953,742.

Exhibits in this presentation: H = $100,000.


Low Threshold (L), also known as Floor or Basement

Accounting transaction data files have many small dollar items – particularly for purchases with invoice line items.

  • Delivery charges, processing fees, etc.

Some sampling plans set a Low Threshold (L) such that the items below L are handled in one of three ways:

  • Excluded (no change), or

  • Sampled at a minimum sample size, or

  • Estimated by projecting results from the other sampled strata onto the stratum below L.


Low Threshold (L) - criteria

Policy for setting L depends on what will be done with items below L.

Possible criteria for setting a value for L

  • Less than 1% or 2% of population dollars are below L (or some other %).

  • More than 3 standard deviations below the unstratified population mean.

  • Divide H by 1,000.
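Two of the criteria above, the dollar-share rule and dividing H by 1,000, can be sketched as below. The function names, the 1% default, and the sample amounts are assumptions for illustration. The dollar-share function returns a conservative L when several items tie at the candidate value.

```python
def low_threshold_by_dollar_share(amounts, max_share=0.01):
    """Return L such that items strictly below L hold at most
    `max_share` of the total population dollars."""
    total = sum(amounts)
    running = 0.0
    for x in sorted(amounts):
        if running + x > max_share * total:
            return x  # adding this item would exceed the allowed share
        running += x
    return None  # every item fits under the share

def low_threshold_from_high(high, divisor=1000):
    """Return H divided by a fixed divisor (1,000 on the slide)."""
    return high / divisor

# Made-up amounts, not the case study data.
amounts = [5, 10, 20, 50, 200, 1000, 5000, 20000, 100000]

l_share = low_threshold_by_dollar_share(amounts)
below = sum(a for a in amounts if a < l_share)
print(l_share, below / sum(amounts))

# With the case study's H = 100,000 this gives L = 100.
print(low_threshold_from_high(100_000))
```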


Case Study: Low Threshold

Exhibits in this presentation: L = $100.


D. Number of Sampled Strata

  • Adding more strata

    • Reduces CV within stratum.

    • Minimum sample size per stratum may result in total sample that exceeds budget.

    • More than 6 strata probably does not improve precision [Neter and Loebbecke, Behavior of Major Statistical Estimators in Sampling Accounting Populations, (AICPA, 1975)].

  • Pragmatic approach: Start with 3 strata and then add or delete strata as needed to achieve desired precision, CV, or other criteria.


E. Strata Boundary Determination

  • Precision is a function of strata boundaries combined with other attributes in population and the sampling plan.

  • Unless otherwise stated, the following case study shows:

    Five strata = 3 sampled strata + Low + High

    Low Threshold (L) = 100

    High Threshold (H) = 100,000


Equal Population Size: nearly equal population size in sampled strata 2, 3, and 4

Observe: CV varies greatly across strata 2, 3, and 4.
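One way to obtain boundaries that give nearly equal item counts is to cut the sorted amounts at evenly spaced positions. The sketch below assumes the list has already been restricted to items between L and H. The function name and the data are hypothetical.

```python
def equal_count_boundaries(amounts, n_strata):
    """Return n_strata - 1 boundaries that split the sorted amounts
    into strata with nearly equal numbers of items.

    Each boundary is the smallest amount of the next stratum.
    """
    ordered = sorted(amounts)
    n = len(ordered)
    return [ordered[(n * k) // n_strata] for k in range(1, n_strata)]

# Made-up amounts between L and H, not the case study data.
amounts = [110, 150, 200, 400, 900, 2000, 5000, 20000, 80000]

boundaries = equal_count_boundaries(amounts, 3)
print(boundaries)  # three items fall in each of the three strata
```

Because the upper stratum spans a much wider dollar range than the lower ones, its CV tends to differ from theirs, which is consistent with the observation on this slide.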


Equal Population Base $: nearly equal population base $ in sampled strata 2, 3, and 4

Observe: CV varies greatly across strata 2, 3, and 4.
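Boundaries that give nearly equal dollars per stratum can be found by accumulating the sorted amounts and starting a new stratum each time the running total reaches the next equal share. This is a minimal sketch on made-up data with a hypothetical function name. It assumes the amounts already lie between L and H, and ties at a boundary value would need extra handling.

```python
def equal_base_boundaries(amounts, n_strata):
    """Return n_strata - 1 boundaries that split the sorted amounts
    into strata holding nearly equal total dollars.

    Each boundary is the smallest amount of the next stratum.
    """
    ordered = sorted(amounts)
    total = sum(ordered)
    boundaries = []
    running = 0.0
    for x in ordered:
        target = total * (len(boundaries) + 1) / n_strata
        # Start a new stratum once the dollars below x reach the target.
        if len(boundaries) < n_strata - 1 and running >= target:
            boundaries.append(x)
        running += x
    return boundaries

# Made-up amounts between L and H, not the case study data.
amounts = [100] * 6 + [200] * 3 + [600]

boundaries = equal_base_boundaries(amounts, 3)
print(boundaries)  # each stratum holds 600 of the 1,800 total
```

In this toy frame the lowest stratum has six items and the highest has one, which shows how equal dollars can mean very unequal item counts.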


Cumulative Square Root (CSR) Method

  • Developed by Tore Dalenius, a Swedish statistician, in the 1950s, with the warning that it will not do well with all distributions.

  • See numerical example in New York State CAA Manual, Publication 132, www.tax.state.ny.us/pdf/publications/sales/pub132_1001.pdf , pages 17-19.

  • The cumulative square root method can be distorted when the classes begin from zero and there are many small-dollar items (such as under $10).

    • Mitigate by setting L threshold greater than zero.
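A common textbook form of the cumulative square root rule bins the amounts into equal-width classes, accumulates the square root of each class frequency, and places boundaries where the cumulative value is closest to equal shares of its total. The sketch below follows that form. It does not reproduce the numerical example in Publication 132, and the function name, class count, and data are assumptions.

```python
import math

def csr_boundaries(amounts, low, high, n_strata, n_classes=10):
    """Cumulative-square-root-of-frequency boundaries between low and high."""
    width = (high - low) / n_classes
    freq = [0] * n_classes
    for x in amounts:
        if low <= x < high:
            freq[min(int((x - low) / width), n_classes - 1)] += 1

    # Running total of sqrt(frequency) across the classes.
    cum = []
    running = 0.0
    for f in freq:
        running += math.sqrt(f)
        cum.append(running)

    boundaries = []
    for k in range(1, n_strata):
        target = cum[-1] * k / n_strata
        # Pick the class whose cumulative value is closest to the target.
        idx = min(range(n_classes), key=lambda i: abs(cum[i] - target))
        boundaries.append(low + (idx + 1) * width)
    return boundaries

# Made-up amounts, not the case study data.
amounts = [5] * 16 + [15] * 9 + [25] * 4 + [35, 45, 55, 95]

boundaries = csr_boundaries(amounts, low=0, high=100, n_strata=3)
print(boundaries)
```

Passing a `low` greater than zero drops the smallest items from the classes before the boundaries are computed, which corresponds to the mitigation in the last bullet.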


Cumulative Square Root with Zero Low Threshold: L = zero. 4 sampled strata. 1 detail stratum.

Observe: CV varies greatly across strata 1, 2, 3, and 4.


Cumulative Square Root with $100 Low Threshold: L = 100. 3 sampled strata between L and H.

Observe: CV is closer across strata 2, 3, and 4.

Setting an appropriate L has improved the stratification.


Geometric Ratio Method

  • Developed by Will Yancey with co-authors Jane Horgan and Patricia Gunning at Dublin City University in Ireland in 2003.

  • Assumes population distribution declines at a relatively constant rate.

  • Requires setting thresholds L and H.

    R = H / L = 100,000 / 100 = 1,000

    For J=3 strata: r = R ^ (1/J) = 1,000 ^ (1/3) = 10.0

    For J=4 strata: r = R ^ (1/J) = 1,000 ^ (1/4) = 5.623
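The arithmetic above extends directly to the boundaries themselves: each boundary is the previous one multiplied by r, starting at L and ending at H. The sketch below uses the case study thresholds from the slide. The function name is hypothetical.

```python
def geometric_boundaries(low, high, n_strata):
    """Return (r, boundaries) where boundaries[k] = low * r**k
    for k = 0..n_strata, so boundaries run from low to high."""
    r = (high / low) ** (1.0 / n_strata)
    return r, [low * r ** k for k in range(n_strata + 1)]

# Thresholds from the slide: L = 100, H = 100,000.
r3, bounds3 = geometric_boundaries(100, 100_000, 3)
r4, bounds4 = geometric_boundaries(100, 100_000, 4)

print(round(r3, 3), [round(b) for b in bounds3])
print(round(r4, 3), [round(b) for b in bounds4])
```

For J = 3 the boundaries are approximately 100, 1,000, 10,000, and 100,000. For J = 4 they are approximately 100, 562, 3,162, 17,783, and 100,000.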


Geometric Ratio with 3 sampled strata: ratio of upper to lower boundary is r = 10 in strata 2, 3, and 4.

Observe: CV is relatively similar across strata 2, 3, and 4.


Geometric Ratio with 4 sampled strata: ratio of upper to lower boundary is r = 5.623 in strata 2, 3, 4, and 5.

Observe: Adding more strata lowers the CV.


Summary of Stratification Procedures

  • Set a High Threshold (H).

  • Set a Low Threshold (L).

  • Choose number of strata.

  • Set boundaries with a method.

  • Compute CV in each stratum.

  • Adjust by changing L, H, boundaries, adding or deleting strata.
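The steps above can be tied together in a small per-stratum summary of count, dollars, and CV, which is the information needed to decide whether to adjust L, H, or the boundaries. This is a sketch on made-up amounts. The function name and row layout are hypothetical, and an amount equal to a boundary is assigned to the upper stratum, which is an assumption.

```python
import bisect
import statistics

def summarize_strata(amounts, low, high, inner_boundaries):
    """Return one (label, count, dollars, cv) row per stratum:
    below-L stratum, sampled strata, then the detail stratum."""
    edges = [low] + list(inner_boundaries) + [high]
    groups = [[] for _ in range(len(edges) + 1)]
    for x in amounts:
        # Amounts equal to an edge go to the upper stratum (assumption).
        groups[bisect.bisect_right(edges, x)].append(x)

    rows = []
    for i, g in enumerate(groups):
        if i == 0:
            label = "low"
        elif i == len(groups) - 1:
            label = "detail"
        else:
            label = f"sampled {i}"
        mean = statistics.fmean(g) if g else 0.0
        cv = statistics.pstdev(g) / mean if len(g) > 1 and mean else None
        rows.append((label, len(g), sum(g), cv))
    return rows

# Made-up amounts, not the case study data.
amounts = [5, 40, 150, 600, 900, 2000, 7000, 15000, 60000, 250000]

rows = summarize_strata(amounts, low=100, high=100_000,
                        inner_boundaries=[1000, 10_000])
for label, count, dollars, cv in rows:
    print(label, count, dollars, None if cv is None else round(cv, 2))
```

Rerunning the summary after changing `low`, `high`, or `inner_boundaries` gives a quick view of how the CVs move across strata.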
