Ministry of Finance and Ministry of Samurdhi
Welfare Benefits Board
The Targeting Formula: Analysis Using Pilot Data
Welfare Workshop, Colombo, 16-17 November 2003
Developing criteria for identifying welfare beneficiaries
• The idea of a proxy means test formula (PMTF): to find a weighted combination of “proxy” variables/indicators that together identify or predict whether a household is poor or not
• The PMTF assigns a “score” to every household, based on information collected from the household for all variables that are included in the formula
• Choice of proxy variables must balance 3 criteria: able to identify the poor with some accuracy; easily observable and measurable; cannot be manipulated easily by the household
• PMTF developed by regression analysis using household surveys
• Analysis is used to identify the appropriate set of variables and weights
• Alternative models from the Sri Lanka Integrated Survey (SLIS, 1999-2001) and the Consumer Finance Socio-Economic Survey (1996-97)
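A minimal sketch of how such weights could be derived by regression. The variables, the data, and the "true" coefficients below are hypothetical and chosen only for illustration; the scaling (coefficient multiplied by 100 and rounded) follows the notes to the SLIS formula. The toy data are noise-free so that the recovered weights are easy to check by hand; real survey data would not fit exactly.

```python
import numpy as np

# Hypothetical toy data, one row per household.
# Columns: household size (continuous), owns TV (0/1), brick walls (0/1).
X = np.array([
    [6, 0, 0],
    [5, 0, 1],
    [4, 1, 0],
    [3, 1, 1],
    [2, 1, 1],
    [7, 0, 0],
], dtype=float)

# Hypothetical log per capita consumption, generated without noise from
# assumed coefficients (7.5, -0.10, 0.15, 0.20) purely for illustration.
log_pce = 7.5 - 0.10 * X[:, 0] + 0.15 * X[:, 1] + 0.20 * X[:, 2]

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, log_pce, rcond=None)

# Weight for each variable: coefficient x 100, rounded to the nearest integer.
weights = np.rint(coef * 100).astype(int)
print(weights.tolist())  # constant first, then one weight per variable
```

In this toy case a larger household lowers the score and asset ownership raises it; the actual variables and weights are those estimated from the survey data.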
The Proxy Means Test Formula using SLIS
Notes:
1) All scores are derived from regressions of (log of) per capita consumption expenditure on a set of variables
2) The score for each variable is its coefficient in the regression, multiplied by 100 and rounded to the nearest integer
3) The lower the score, the poorer the household is considered to be
4) The aggregate score for each household is calculated as the constant plus or minus the weighted value of each variable. For each 0-1 variable, multiply the score by 1 if true for the household, and by 0 if not true. For each continuous variable (denoted by *), multiply the score by the value of the variable for the household
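The scoring rule in the notes can be sketched as follows. The constant, variable names, and weights are hypothetical placeholders, not the SLIS formula. The cutoff of 709 is the example cutoff used elsewhere in this presentation, and treating a score at or below the cutoff as eligible is an assumption.

```python
# Hypothetical constant and weights, for illustration only.
CONSTANT = 750
WEIGHTS = {
    "household_size": -10,  # continuous (*): weight times the value
    "owns_tv": 15,          # 0-1 variable: weight times 1 if true, else 0
    "brick_walls": 20,      # 0-1 variable
}
CUTOFF = 709  # example cutoff; lower scores indicate poorer households


def pmtf_score(household, constant=CONSTANT, weights=WEIGHTS):
    """Aggregate score: constant plus the weight times the value of each variable."""
    return constant + sum(w * household[name] for name, w in weights.items())


household = {"household_size": 5, "owns_tv": 0, "brick_walls": 1}
score = pmtf_score(household)   # 750 - 10*5 + 15*0 + 20*1 = 720
eligible = score <= CUTOFF      # assumption: at or below the cutoff qualifies
print(score, eligible)
```

A household missing any variable cannot be scored this way, which is why complete application forms matter for the formula.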
Validating the formula through a pilot
• Pilot targeting exercise conducted over the last few months
• Information collected on variables in the 2 formulas through application forms in selected GN divisions
• Analysis of data collected to test how the predictions from the formula measure up against real data from an application process
Key questions analyzed using pilot data
• What was the rate of response to the pilot application, and how well has the application form worked?
• How does the coverage (% of population identified as beneficiaries) of the formula for different cutoff scores measure up to what was simulated from survey data?
• What are the implications of coverage figures for the choice of cutoff points for eligibility?
• What can we say about coverage of groups that are likely to be especially vulnerable?
• How do the results compare across different regions of the country?
• What implications can be drawn for field work or data collection efforts?
Response to pilot targeting exercise and implications
• 58% response rate (in terms of population) for all pilot areas: 55% in non-NE areas, 69% in the 2 NE districts
• What we expect: people not interested in the program would not fill in the form
• Concern: did everyone who would have been interested fill in the form?
• Special concern: did those who have been “missed” by the program so far fill in the form?
• What about those who filled in the form, but with incomplete information?
• Because of non-response, all population coverage estimates have to be calculated assuming that none of those who did not apply would have qualified for the program
* denotes estimates using average household size from census information (for non-Northeast districts), since population figures for Northeast districts are not available
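The non-response assumption amounts to simple arithmetic: if no non-applicant would have qualified, population coverage is coverage among applicants scaled by the response rate. A minimal sketch, in which the function name and the applicant counts are hypothetical and only the 58% response rate comes from the pilot:

```python
def population_coverage(n_selected, n_applicants, response_rate):
    """Share of the total population covered, assuming that none of the
    people who did not apply would have qualified."""
    applicant_coverage = n_selected / n_applicants
    return applicant_coverage * response_rate


# Hypothetical counts: 400 of 1,000 applicants fall below the cutoff.
# With the pilot's 58% response rate, population coverage is about 23%.
coverage = population_coverage(n_selected=400, n_applicants=1000, response_rate=0.58)
print(round(coverage, 3))
```

If some poor households did not apply, true coverage under full participation would be higher than this figure, so it is best read as a lower bound.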
How serious is the problem of incomplete information for the purpose of analysis?
• Age of household head is the most frequently missing variable
• No evidence that missing information differentially affects certain types of households, which would have biased the results
• Some variables were imperfectly measured: cultivable land, type of toilet, homelessness
Cumulative distribution of pilot scores from SLIS model: comparison with results from SLIS sample
[Figure: cumulative distributions of three series: SLIS actual, SLIS predicted, Pilot sample]
• The pilot sample is poorer than the SLIS national sample
• This is expected, since the pilot sample contains only those who are interested in the program
Note: SLIS_actual is log per capita expenditure times 100; SLIS predicted is the predicted score from SLIS data; Pilot score is calculated using the SLIS model on all pilot sample cases for which the model score could be calculated
Applying SLIS model to pilot: program coverage
• % covered in the pilot sample for every cutoff point is higher than that predicted from SLIS, since pilot applicants are poorer
• Important: % covered among the total population in pilot districts for every cutoff point is close to that predicted from SLIS
• The difference between predicted and pilot population coverage grows as the cutoff point is raised
• This difference also suggests that probably not all poor households applied for the pilot program
Cutoff points and coverage
• Coverage is higher for the two NE districts: a higher proportion of applicants in the total population brings the coverage closer to what is predicted
• Choice of a cutoff point: recognize that the proportion of applicants in the population is expected to rise in the actual program, which will raise the % of population covered for a given cutoff
• E.g. with the cutoff at a score of 709, pilot population coverage of 24.6% is lower than predicted coverage of 26.3%
• But coverage will increase if strong efforts are made to have all the poor apply to the program
• Thus 24.6% is a lower bound on expected coverage; actual coverage should be expected to be 2-3 percentage points higher
• Last but not least: recall the inherent tradeoff in the choice of cutoff, due to the formula being unable to identify the poor “perfectly”
• The higher the cutoff, the higher the rate of “leakage” to the non-poor
• The lower the cutoff, the lower the rate of coverage among the poor
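The tradeoff can be illustrated with a small sketch. Undercoverage is taken as the proportion of the poor missed by the formula; leakage is assumed here to mean the share of selected households that are non-poor. All scores below are hypothetical toy values on the same scale as the formula (log per capita expenditure times 100).

```python
def targeting_errors(actual, predicted, poverty_line, cutoff):
    """Return (undercoverage, leakage) for a given eligibility cutoff.

    undercoverage: share of poor households not selected by the formula
    leakage: share of selected households that are non-poor (assumed definition)
    """
    poor = [a <= poverty_line for a in actual]
    selected = [p <= cutoff for p in predicted]
    missed = sum(1 for is_poor, sel in zip(poor, selected) if is_poor and not sel)
    leaked = sum(1 for is_poor, sel in zip(poor, selected) if sel and not is_poor)
    n_poor, n_selected = sum(poor), sum(selected)
    undercoverage = missed / n_poor if n_poor else 0.0
    leakage = leaked / n_selected if n_selected else 0.0
    return undercoverage, leakage


# Hypothetical actual and formula-predicted scores for eight households.
actual = [650, 680, 700, 720, 750, 800, 690, 760]
predicted = [660, 715, 705, 700, 740, 790, 730, 718]

low = targeting_errors(actual, predicted, poverty_line=709, cutoff=709)
high = targeting_errors(actual, predicted, poverty_line=709, cutoff=720)
print(low, high)
```

In this toy data, raising the cutoff from 709 to 720 halves undercoverage but raises leakage, which is the pattern described above.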
Coverage among specially “vulnerable” groups on applying the formula: high in pilot sample
[Figure: coverage plotted against cutoff scores]
Coverage among other vulnerable groups is also high in the pilot sample, as is coverage among large households and children
[Figure: coverage plotted against cutoff scores]
Coverage of the poor in specially vulnerable groups
• Results from the household survey (SLIS) show little reason for concern
• Undercoverage (proportion of poor “missed” by the formula) is lower for all vulnerable groups than among the poor in general, for all cutoff points
Coverage of vulnerable households who are in the poorest 30% of the population
Cutoff = 709 (30th percentile of actual consumption); results using SLIS data
Comparison between North & East districts and the rest
• Significantly higher response rate to the pilot application in NE districts
• This leads to a higher % of population covered (for any cutoff) in NE districts than in non-NE districts
• Most household characteristics in NE pilot districts, as defined by the variables in the formula, are not significantly different from those in the other pilot districts
• 3.1% of households in NE districts have a disabled head, compared to 4.8% of households in non-NE districts; 14.0% of households in NE districts have a single female head, compared to 12.6% of those in non-NE districts
Comparison with groups of existing program beneficiaries
• Distribution of Samurdhi beneficiaries in the pilot is somewhat correlated with SLIS formula scores in NE districts; there is little correlation between the two in non-NE districts
• If the cutoff were set at the 30th percentile, benefiting about 25% of the pilot population:
• 63% of beneficiaries would be current Samurdhi recipients, 6% would be beneficiaries of other welfare programs (but not Samurdhi), and 31% would be households that currently receive no benefits
• Eligible households would include 47% of current Samurdhi recipients, 45% of those who receive other welfare benefits only, and 35% of those with no welfare benefits
The homeless households
• All homeless households should be in the program automatically
• The formula is not applicable to them; thus all results so far have been derived after taking the homeless out of the sample
• Proportion of homeless in the pilot sample:
• Does not differ between NE and non-NE districts
• Is significantly higher in rural areas
• Q: Have all the homeless been able to apply?
• Average household size is much lower for homeless households
• 24% of homeless households consist of a single member, compared to 1% of those living in homes
• 5% of homeless households consist of 6 or more members, compared to 37% of those living in homes
Lessons for field work: 1) formula is meant to replace subjective judgment
• A single variable is, in most cases, not a good predictor of poverty
• With the cutoff set at the 30th percentile (score = 709), 26% of selected households have a TV/VCR, 41% have electric light, and 50% have brick/cement walls
• Thus it is critical to collect all information, and not to pre-judge a household’s poverty based on inspection: let the formula do its job!
Lessons for field work: 2) all variables are necessary, but some are especially critical to measure accurately
• Ten most “sensitive” variables for identifying the poor using the SLIS formula
• Sensitivity depends on the variable’s weight and its distribution
• Coverage is most affected by household size, rooms per member, and whether the household head is female and single
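One way to see why both weight and distribution matter is to count how many households change eligibility status when a variable is mis-recorded. The sketch below is an illustration only, not the sensitivity analysis used for the SLIS formula: the weights and households are hypothetical, and "mis-recorded" is simplified to mean the variable is left blank and treated as 0.

```python
CUTOFF = 709
CONSTANT = 750  # hypothetical
WEIGHTS = {"household_size": -10, "owns_tv": 15, "brick_walls": 20}  # hypothetical

# Hypothetical pilot households.
households = [
    {"household_size": 5, "owns_tv": 0, "brick_walls": 1},
    {"household_size": 6, "owns_tv": 1, "brick_walls": 0},
    {"household_size": 3, "owns_tv": 1, "brick_walls": 1},
    {"household_size": 7, "owns_tv": 0, "brick_walls": 1},
]


def score(household):
    """Aggregate score: constant plus weight times value for each variable."""
    return CONSTANT + sum(w * household[name] for name, w in WEIGHTS.items())


def flip_share(variable):
    """Share of households whose eligibility changes if `variable` is recorded as 0."""
    flips = 0
    for h in households:
        wrong = dict(h, **{variable: 0})
        if (score(h) <= CUTOFF) != (score(wrong) <= CUTOFF):
            flips += 1
    return flips / len(households)


sensitivity = {name: flip_share(name) for name in WEIGHTS}
print(sensitivity)
```

In this toy data the variable with the smallest weight per unit (household size) causes the most changes in eligibility, because its values are large for every household, while the TV indicator causes none.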
Implications for a broader safety net system
• There remain certain categories of poor and vulnerable households that are less likely to be picked up by the formula
• Poor households who are small in size
• ...who have to support disabled members (not necessarily the head of the household)
• ...who have suffered a recent reversal of fortune (e.g. death/disability of a primary earning member, loss of crops)
• ...who are affected by civil strife/violence/dislocation, unless the effect is directly in terms of loss of assets or disability of the household head
• ...who are affected by illness of the head or other household members
• Since the above are relatively rare events or shocks, the formula cannot incorporate them as variables that explain poverty for the general population
• Such cases underscore the need for a broader safety net system, incorporating programs that address specific needs and vulnerabilities