Data Weighting Issues

Data Weighting Issues: Adult Equivalent Scales, Stratified Sampling and General Population Weighting Issues

Introduction • Sometimes what we observe is not what our theories, estimators and tests were developed to address. • Welfare concerns the aggregate wellbeing of individuals; however, their actions are often observed in the context of a collective of agents, for example households or countries. • Sometimes the agents of interest are observed under sampling schemes that are not purely random, i.e. stratified or cluster sampling. • Here we address how to deal with some of these issues.

The Issue of Adult Equivalence. • Typically economic data report expenditures by households whereas welfare measurement is individualistic. • Some methodology is required to “equivalize” the incomes (not so difficult) or utilities (really difficult) of households with different compositions. • The “equivalized” measures can then be apportioned to the individuals in the households and welfare measurement can proceed on the basis of the collection of individual measures. • Browning (1992) provides an excellent review of the issues.

Utility Based Approach. • Consider a household indirect utility function defined by $V(p,x,z)=\max_q \{U(q,z) \ \text{s.t.}\ \sum_i p_i q_i = x\}$, where p is a vector of prices corresponding to a vector of commodities q, x is total expenditure and z is a vector of household characteristics (e.g. numbers and ages of children). • The welfare of two families a and b is equal if and only if $V(p,x_a,z_a) = V(p,x_b,z_b)$, so identifying V() permits interpersonal comparisons of utility in a simple fashion (this will raise objections amongst those who do not approve of such comparisons). • Estimate an integrable demand system and integrate back to the indirect utility function. The number of adult equivalents $d(p,x,z)$ relative to a reference household $z_R$ is then defined implicitly by $V(p,x,z)=V(p,x/d(p,x,z),z_R)$.

Utility Based Approach Continued • Generally equivalence scales appear to be functions of incomes and prices. • Estimates have varied widely and appear to be very imprecise. • Reason for this: any utility function $V(p,x,z)$ can be renormalized to $F(V(p,x,z),z)$, where F() is strictly increasing in V, so that F and V represent the same ordinal preferences over goods. Unfortunately they will give different values for the adult equivalent scale. • Essentially we have an identification problem, see Blundell and Lewbel (1991). More information is needed, for example restricting scales to be independent of income, which restricts utility functions to the form $V(p,x/a(p,z))$. Such restrictions have been generalized (Donaldson and Pendakur).
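A small worked example may make the identification problem concrete. It is only a sketch under assumed functional forms: the log-linear utility, the deflator $a(p,z)$ normalized to 1 for the reference household, and the function $g(z)$ are all hypothetical choices, and the scale is taken to follow the convention that d deflates the expenditure of household z to the reference household's level, $V(p,x,z)=V(p,x/d,z_R)$.

```latex
\begin{align*}
V(p,x,z) &= \ln x - \ln a(p,z), \qquad a(p,z_R) = 1,\\
V(p,x,z) = V(p,x/d,z_R)
  &\;\Longrightarrow\; \ln x - \ln a(p,z) = \ln x - \ln d
  \;\Longrightarrow\; d = a(p,z),\\
\tilde V(p,x,z) &= F(V(p,x,z),z) = V(p,x,z) + \ln g(z),
  \qquad g(z_R) = 1,\; g(z) > 0,\\
\tilde V(p,x,z) = \tilde V(p,x/\tilde d,z_R)
  &\;\Longrightarrow\; \ln x - \ln a(p,z) + \ln g(z) = \ln x - \ln \tilde d
  \;\Longrightarrow\; \tilde d = \frac{a(p,z)}{g(z)}.
\end{align*}
```

Since $g(z)$ does not depend on p or x, Roy's identity gives the same demands under $V$ and $\tilde V$, so in this example demand data alone cannot distinguish the scale $d$ from $\tilde d$.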

Utility Based Approach Continued • Equivalence scales still end up depending upon prices (which are frequently not observed) and tend to be commodity specific (the equivalence scale associated with alcohol and tobacco consumption is likely to be quite different from that for clothing, etc.). • Income-independent scales also appear to be at odds with the evidence (economies of scale in sharing out the wealth within the family differ across goods). • Equivalizing household incomes then depends upon patterns of consumption at different income levels.

What has been done in practice? • Utility theory based approaches have been largely ignored. • Household income Y is deflated by some scaling factor $\alpha(z)n^{\beta(z)}$, where n is the household size, and the resulting value is then repeated in the data n times, once for each household member. (Most popular appears to be α=1, β=0.5.) • Akin to the data reweighting problem. • In essence equivalized household income has an elasticity of -β with respect to household size n, and $n^{\beta}$ corresponds to the equivalent number of single persons. • In the literature values of β in the interval [0,1] have been employed, where 0 may be interpreted as infinite returns to family size (Y is the welfare enjoyed by each of the individual family members no matter how many of them there are) and 1 is interpreted as constant returns to family size (with Y being shared equally amongst the family members).
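The deflate-and-replicate step described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes α and β are constants rather than functions of z, and the function names and the three example households are hypothetical.

```python
import numpy as np


def equivalized_income(income, size, alpha=1.0, beta=0.5):
    """Deflate household income by alpha * n**beta (alpha, beta assumed constant)."""
    income = np.asarray(income, dtype=float)
    size = np.asarray(size, dtype=float)
    return income / (alpha * size ** beta)


def to_individuals(income, size, alpha=1.0, beta=0.5):
    """Repeat each household's equivalized income once per household member."""
    size = np.asarray(size, dtype=int)
    return np.repeat(equivalized_income(income, size, alpha, beta), size)


# Hypothetical households: incomes and sizes are made up for illustration.
household_income = np.array([40000.0, 30000.0, 18000.0])
household_size = np.array([4, 1, 2])

# One entry per person, each carrying the household's equivalized income.
individual_income = to_individuals(household_income, household_size)
```

Setting `beta=0` leaves household income unchanged for every member, while `beta=1` gives per capita income, matching the two polar cases in the text.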

Practice and Debates in the Literature. (Continued) • Generally something in between is favoured (see for example Karoly and Burtless (1995) and Lanjouw and Ravallion (1995)). The official United States Bureau of the Census poverty scale has an implicit family size elasticity of 0.56 (similar to the McClements scale used in the U.K.), and though it has been the subject of criticism (Citro and Michael (1995)) it is the scale most frequently used by researchers. • The problem is that different parameter values appear to make a big difference (see inter alia Buhmann et al. (1988), Coulter et al. (1992), Banks and Johnson (1994) and Jenkins and Cowell (1994) and references cited therein with regard to the McClements scale in the U.K.). • See Anderson (2003) Poverty in America.

Population Weighting • When per capita GNP is used unweighted in international comparisons, Ireland gets the same weight as China. • Re-weight the data to take account of the “representative agent” issue. • Really a stratified sampling problem. • Let $p_k$ be the population of country k and $p^*$ be the average population across all countries; then the population weight for country k is $w_k = p_k/p^*$. • In some cases it makes a huge difference whether or not weighting is done (see Anderson (2004a)).
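The weights $w_k = p_k/p^*$ can be computed and applied as in the sketch below. The function name and all figures are hypothetical and chosen only to show the mechanics; they are not real country data.

```python
import numpy as np


def population_weights(population):
    """Return w_k = p_k / p*, where p* is the mean population across countries."""
    population = np.asarray(population, dtype=float)
    return population / population.mean()


# Made-up figures for three hypothetical countries (not real data).
population = np.array([5.0, 1400.0, 60.0])      # millions of people
gnp_per_capita = np.array([80.0, 12.0, 45.0])   # thousands of currency units

weights = population_weights(population)

# The unweighted mean treats every country as one observation; the weighted
# mean treats every person as one observation.
unweighted_mean = gnp_per_capita.mean()
weighted_mean = np.average(gnp_per_capita, weights=weights)
```

Because the weights are normalized by the average population, they sum to the number of countries, so the effective sample size is unchanged while large countries count for more.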
