## De la Garza Phenomenon


**De la Garza Phenomenon**

Bikas K Sinha, ISI, Kolkata. RU Workshop: April 18, 2012. Collaborators: N K Mandal & M Pal, Calcutta University.

**Nomenclature**

- Liski-Mandal-Shah-Sinha (2002): Topics in Optimal Design, Springer-Verlag monograph.
- Pukelsheim (2006): Optimal Design of Experiments. Refers to it as the "property of admissibility".
- Khuri-Mukherjee-Sinha-Ghosh (2006): Statistical Science. Refers to it as the "de la Garza phenomenon".
- Min Yang (2010): Annals of Statistics. The title of the paper is "On the de la Garza Phenomenon".

**Motivating Example: First Course in Regression**

- X: -3.2, -2.7, -1.8, 0.2, 4.7, 6.3, 8.2
- Y: …
- Fit a linear regression equation of Y on X under the usual model assumptions.
- X is transformed to U.
- U: -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
- Motivating question: if we believe in the linear regression model, what good are so many u-values? Why can't we work with exactly two u-values and, at that, possibly with $\pm 1$?

**Linear Regression Model**

- Mean model $E[Y_x] = \alpha + \beta x$ with homoscedastic errors.
- Given $D_N = [(x_1, n_1); (x_2, n_2); \dots; (x_k, n_k)]$, with $N = \sum n_i$.
- $\chi$ = space of the regressor X = $[a, b]$, $a < b$.
- WLOG: $a \le x_1 < x_2 < \dots < x_k \le b$; the x's are all distinct.
- For each $i$, $n_i \ge 1$ such that $\sum n_i = N$ (given).
- Estimability of $\alpha$ and $\beta$ is ensured iff $k \ge 2$.
- Fitting of the linear regression model: $\hat\beta = b_{yx} = SP_{yx} / SS_{xx}$; $\hat\alpha = \bar y - b \bar x$.
- Inference rests on normality of the errors, etc.

**Motivating Theory: Undergraduate Level**

- X: $a \le x_1 < x_2 < \dots < x_k \le b$ ($k > 1$, all x's distinct).
- Y: $y_1, y_2, y_3, \dots, y_k$, the responses on Y.
- Assume linear regression of Y on X: $E[Y_x] = \alpha + \beta x$, with the usual conditions on the errors.
- Find the BLUE of the regression coefficient $\beta$.
- Smart student's thought: pairwise unbiased estimators $\hat\beta_{(i,j)} = b_{(i,j)} = (y_i - y_j) / (x_i - x_j)$, $1 \le i < j \le k$.
- So the BLUE can be based on the $\{b_{(i,j)}\}$: $\binom{k}{2}$ pairs. All distinct? Correlated or uncorrelated?
- Basis: $b_{(1,2)}, b_{(1,3)}, \dots, b_{(1,k)}$. Each is unbiased, but the estimates are jointly correlated, since $y_1$ is involved in every one of them.

**Formation of the BLUE**

- Work out the means, variances and covariances of the estimators and start from there to arrive at the BLUE.
- Define $\eta$ as the $(k-1) \times 1$ column vector of the "difference estimators", i.e., $\eta = (b_{(1,2)}, b_{(1,3)}, \dots, b_{(1,k)})'$, so that
- $E[\eta] = \beta \mathbf{1}$ and $\mathrm{Disp}(\eta) = \sigma^2 W$, $W$ being a pd matrix.
- Then the BLUE of $\beta$ is $\eta' W^{-1} \mathbf{1} / \mathbf{1}' W^{-1} \mathbf{1}$.
- Show that the above indeed simplifies to $\hat\beta = b = \sum (y_i - \bar y)(x_i - \bar x) / \sum (x_i - \bar x)^2$.

**Smarter move**

- $V_1 = \dfrac{[y_1 - y_2]/\sqrt{2}}{[x_1 - x_2]/\sqrt{2}}$
- $V_2 = \dfrac{[y_1 + y_2 - 2y_3]/\sqrt{6}}{[x_1 + x_2 - 2x_3]/\sqrt{6}}$
- …
- $V_{n-1} = \dfrac{[y_1 + y_2 + \dots - (n-1)y_n]/\sqrt{n(n-1)}}{[x_1 + x_2 + \dots - (n-1)x_n]/\sqrt{n(n-1)}}$ (here $n = k$, one response per x-value)
- Then these V's are uncorrelated.
- Hence $W(V)$ is a diagonal matrix.
- The derivation of $\hat\beta$ is much easier.
- Claim: same result, by a novel derivation that uses Helmert's orthogonal transformation.

**Motivating Theory: Master Level**

- Regression design on X: $(x_1, n_1); (x_2, n_2); \dots; (x_k, n_k)$ ($k > 1$); all x's distinct.
- Y: $\{(y_{1j}); (y_{2j}); \dots; (y_{kj})\}$, altogether $n = \sum n_i$ observations.
- Assume linear regression of Y on X: $E[Y_x] = \alpha + \beta x$, with the usual conditions on the errors.
- Find the BLUE of the regression coefficient $\beta$.
- Smart student's thought: pairwise unbiased estimators $\hat\beta_{(i,j)} = b_{(i,j)} = (\bar y_i - \bar y_j) / (x_i - x_j)$, $1 \le i < j \le k$.
- So the BLUE can be based on the $\{b_{(i,j)}\}$. How many? Correlated or uncorrelated?
- Basis: $b_{(1,2)}, b_{(1,3)}, \dots, b_{(1,k)}$. Each is unbiased, but the estimates are jointly correlated, since $\bar y_1$ is involved in every one of them.

**Motivating Theory: Master Level & Beyond**

- Work out the means, variances and covariances of the estimators and start from there to arrive at the BLUE.
- Define $\eta$ as the vector of these "difference estimators", so that
- $E[\eta] = \beta \mathbf{1}$ and $\mathrm{Disp}(\eta) = \sigma^2 W$. Complicated?
- Then the BLUE of $\beta$ is $\eta' W^{-1} \mathbf{1} / \mathbf{1}' W^{-1} \mathbf{1}$.
- Show that the above indeed simplifies to $\hat\beta = b = \sum n_i (\bar y_i - \bar{\bar y})(x_i - \bar x) / \sum n_i (x_i - \bar x)^2$.

**Smarter move**

- $V_1 = [\sqrt{n_1}\, \bar y_1 - \sqrt{n_2}\, \bar y_2] / [\dots]$
- $V_2 = [\sqrt{n_1}\, \bar y_1 + \sqrt{n_2}\, \bar y_2 - 2\sqrt{n_3}\, \bar y_3] / [\dots]$
- etc.
- This time the W-matrix becomes a diagonal matrix.
- Tremendous simplification in the formation of $\hat\beta$.

**Turn back to the basic question**

- X: -3.2, -2.7, -1.8, 0.2, 4.7, 6.3, 8.2
- U: -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
- Motivating question: if you believe in the linear regression model $E[Y_x] = \alpha + \beta x = \delta + \gamma u = E[Y_u]$, what good are so many u-values? Why can't you work with exactly two u-values and, at that, possibly with $\pm 1$?

**Fisher Information Matrix**

- $I(\theta; D_N) = X'X$, a $2 \times 2$ matrix with elements $\begin{pmatrix} N & T_1 \\ T_1 & T_2 \end{pmatrix}$, where $T_1 = \sum n_i x_i$ and $T_2 = \sum n_i x_i^2$.
- $X_{N \times 2} = [\mathbf{1}_{N \times 1},\ \text{column vector of the } x_i\text{'s, each with } n_i \text{ repeats}]$.
- Averaged information matrix per observation: $\bar I = (1/N)\, I(\theta) = \begin{pmatrix} 1 & \mu'_1 \\ \mu'_1 & \mu'_2 \end{pmatrix}$, where $\mu'_1 = \sum n_i x_i / N$ and $\mu'_2 = \sum n_i x_i^2 / N$.
- $I(\theta)$ is a pd matrix iff $k \ge 2$ distinct x's are considered.

**de la Garza Phenomenon [de la Garza, A. (1954): AMS]**

- Research paper [Annals of Statistics]: 2010
- Research paper [Annals of Statistics]: 2009
- Springer-Verlag monograph on optimal designs: 2002
- Wiley book on optimal designs: 2006
- A continuous flow of papers involving linear and non-linear models, with both qualitative and quantitative responses: the de la Garza phenomenon has had an enormous impact on optimality studies.

**Continuous Design Theory**

- Context: linear regression model.
- Space of the regressor: $\chi = [a, b]$, $a < b$.
- $k \ge 2$ distinct x-values in $\chi$ with positive weights $w_1, w_2, \dots, w_k$ such that $\sum w_i = 1$.
- In applications, we think in terms of $N$ observations, with $N w_i = N_i$ observations taken at $x = x_i$, $i = 1, 2, \dots, k$.
- The choice of $N$ ensures integral values of the $N_i$'s.
- Version of $\bar I = \begin{pmatrix} 1 & \mu'_1 \\ \mu'_1 & \mu'_2 \end{pmatrix}$, where $\mu'_1 = \sum w_i x_i$ and $\mu'_2 = \sum w_i x_i^2$. This is known as the information matrix arising out of a continuous design, in terms of $\{(x_i, w_i);\ i = 1, 2, \dots, k\}$.

**De la Garza Phenomenon: Continuous Design Theory**

- Context: linear regression model with homoscedastic errors.
- Claim 1: Given any continuous regression design $D_{(k, x, w)}$ with $k$ support points in $\chi = [a, b]$, $a \le x_1 < x_2 < \dots < x_k \le b$, the x's all distinct and with positive weights $w_1, w_2, \dots, w_k$ (such that $\sum w_i = 1$), whenever $k > 2$ we can find exactly 2 points $x^*$ and $x^{**}$ with suitable weights $p^*$ and $p^{**}$ such that (i) $x_1 \le x^* < x^{**} \le x_k$; (ii) $p^* + p^{**} = 1$; and (iii) $\bar I$ based on $D^*_{[(x^*, p^*); (x^{**}, p^{**})]}$ is identical to $\bar I$ based on $D_{(k, x, w)}$. [Information Equivalence]

**Proof of Claim 1**

- Recall $\mu'_1 = \sum w_i x_i$ (1st moment) and $\mu'_2 = \sum w_i x_i^2$ (2nd moment).
- Start with $\bar I = \begin{pmatrix} 1 & \mu'_1 \\ \mu'_1 & \mu'_2 \end{pmatrix}$.
- Set $\bar I = \bar I^*$ and derive the defining equations:
- $p^* x^* + p^{**} x^{**} = \mu'_1$ …… (1)
- $p^* x^{*2} + p^{**} x^{**2} = \mu'_2$ …… (2)
- Claim: there is an acceptable solution for $[(x^*, p^*); (x^{**}, p^{**})]$ satisfying (1) and (2).

**Proof, contd.**

- WLOG: $x_1 = -1$ and $x_k = +1$.
- Solution set: define $\mu_2 = \mu'_2 - \mu'^2_1 > 0$. Then
- $x^* = \mu'_1 \pm \sqrt{p^{**} \mu_2 / p^*}$
- $x^{**} = \mu'_1 \mp \sqrt{p^* \mu_2 / p^{**}}$
- Further, for $x^* < x^{**}$, we readily verify that $-1 < x^* = \mu'_1 - \sqrt{p^{**} \mu_2 / p^*}$ and $x^* < x^{**} = \mu'_1 + \sqrt{p^* \mu_2 / p^{**}} < 1$ whenever $\dfrac{\mu_2}{\mu_2 + (1 + \mu'_1)^2} < p^* < \dfrac{(1 - \mu'_1)^2}{\mu_2 + (1 - \mu'_1)^2}$.
- Note: it has been verified that LHS < RHS.

**Statement of Information Equivalence: Polynomial Regression**

Therefore: the existence of $[(x^*, p^*); (x^{**}, p^{**})]$ with $-1 < x^* < x^{**} < 1$ and $0 < p^* < 1$ such that $\bar I = \bar I^*$ is guaranteed.

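
The construction in the proof of Claim 1 can be sketched numerically. The Python snippet below is an illustration only, not part of the original slides: the function name `two_point_equivalent` and the default choice of $p^*$ (the midpoint of the admissible range) are assumptions, and the bounds on $p^*$ are the ones from the proof with $-1$ and $+1$ replaced by the smallest and largest support points.

```python
import math

def two_point_equivalent(xs, ws, p_star=None):
    """Two-point design with the same first two moments as {(x_i, w_i)}.

    xs: distinct support points; ws: positive weights summing to 1.
    p_star: weight of the lower point x*; if None, the midpoint of the
    admissible range is used (an arbitrary choice made for this sketch).
    """
    m1 = sum(w * x for x, w in zip(xs, ws))        # mu'_1
    m2 = sum(w * x * x for x, w in zip(xs, ws))    # mu'_2
    var = m2 - m1 ** 2                             # mu_2 > 0
    lo, hi = min(xs), max(xs)
    # Range of p* that keeps lo <= x* < x** <= hi.
    p_min = var / (var + (m1 - lo) ** 2)
    p_max = (hi - m1) ** 2 / (var + (hi - m1) ** 2)
    if p_star is None:
        p_star = 0.5 * (p_min + p_max)
    if not (p_min <= p_star <= p_max):
        raise ValueError("p_star is outside the admissible range")
    p_2star = 1.0 - p_star
    x_star = m1 - math.sqrt(p_2star * var / p_star)
    x_2star = m1 + math.sqrt(p_star * var / p_2star)
    return (x_star, p_star), (x_2star, p_2star)

# The seven u-values of the motivating example, one observation each.
u = [-1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00]
w = [1 / 7] * 7
(a, pa), (b, pb) = two_point_equivalent(u, w, p_star=4 / 7)
print(round(a, 4), round(b, 4))
```

With $p^* = 4/7$ this gives roughly $-0.798$ and $0.731$, in line with the values $a = -0.7982$ and $b = 0.7309$ quoted later in the slides for the 4-and-3 split of the seven observations.
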
The de la Garza phenomenon applies to the $p$th degree polynomial regression model, in terms of Information Equivalence of any continuous design supported on $k$ ($> p+1$) points with a suitably chosen continuous design supported on exactly $p+1$ points.

**Caratheodory's Theorem**

- If $p+1$ is the number of parameters in a model, one can restrict attention to designs with at most $(p+1)(p+2)/2$ support points.
- Strength: the model specification is the most general.
- Weakness: for the $p$th degree polynomial regression model, de la Garza provides a much better result.
- [$p+1 \ll (p+1)(p+2)/2$, in general terms]

**Higher Degree Polynomial Regression**

- Yes, the de la Garza phenomenon holds for higher degree polynomial regressions as well. The proof is a marvellous exercise in matrix theory.
- Equate the given pd matrix $I(D)$ to $I(D^*)$, where $I(D^*) = X^{*\prime} W^* X^*$, with $X^*$ being a square matrix and $W^*$ being a diagonal matrix. The claim is that such $X^*$ and $W^*$ matrices exist with the minimum number of support points. This is the spirit of the de la Garza phenomenon in terms of Information Equivalence. Information Dominance came much later.

**Back to de la Garza Phenomenon: Exact Design Theory [EDT]**

- This aspect has somehow been bypassed in the literature. It is difficult to provide a general theory as to the exact sample size for Information Equivalence to work.
- Motivating example: linear regression with 3 points to start with, $[-1, 0, 1]$, so that $k = 3 > 2$. According to the de la Garza phenomenon, under continuous design theory, there are weights $0 < w_{-1}, w_0, w_{+1} < 1$, with sum 1, assigned to these points, and then we can find …

**De la Garza Phenomenon: EDT**

… one 2-point design, say $[(a, p); (b, q)]$, such that $-1 \le a < b \le 1$, $0 < p < 1$, and there is Information Equivalence between the two designs. What if we are in an exact design scenario with a given total number of observations $N$ and its decomposition into $n(-)$, $n(0)$ and $n(+)$, assigned to $-1$, $0$ and $1$ respectively?
Can we now find a solution $[(a, n_a); (b, n_b)]$ satisfying the following?

**EDT**

- (i) $-1 \le a < b \le 1$;
- (ii) $n_a + n_b = N$, both being integers;
- (iii) Information Equivalence.
- Do we need a condition on $N$ at all?
- Crucial observation: not all values of $N$ are amenable to supporting the equivalence of the information matrices. A minimum value is needed; only then does it work.

**EDT: Choice of N**

Examples, written as point(number of observations):

| Case | Design | N | Remark |
|---|---|---|---|
| (i) | -1(1), 0(1), +1(1) | 3 | Not possible |
| (ii) | -1(2), 0(2), +1(2) | 6 | Possible |
| (iii) | -1(1), 0(2), +1(1) | 4 | Possible |
| (iv) | -1(2), 0(1), +1(1) | 4 | Not possible |
| (v) | -1(4), 0(2), +1(2) | 8 | Possible |
| (vi) | -1(1), 0(3), +1(1) | 5 | Possible |
| (vii) | -1(1), 0(2), +1(4) | 7 | Possible |
| (viii) | -1(1), a(1), +1(1) | 3 | Not possible |
| (ix) | -1(2), a(2), +1(2) | 6 | Possible iff $3 - 2\sqrt{3} < a < 2\sqrt{3} - 3$ |

**EDT: General Theory for 3 points with point symmetry**

- Consider a general allocation design $-1\,(n_-)$, $0\,(n_0)$ and $1\,(n_+)$, where each of $n_-$, $n_0$ and $n_+$ is a positive integer and $n_- + n_0 + n_+ = N \ge 3$.
- Once more, we want to replace the above 3-point point-symmetric design by a two-point design of the form $(x, n_x)$ and $(y, n_y)$ so that $n_x + n_y = N$ and, moreover, Information Equivalence holds. That suggests the following equations.

**EDT**

- $x\, n_x + y\, n_y = n_+ - n_-$ …… (3)
- $x^2 n_x + y^2 n_y = n_+ + n_-$ …… (4)
- Set $a = n_x$, $b = n_y$, $T_1 = n_+ - n_-$ and $T_2 = n_+ + n_-$ …… (5)
- From (3) and (4), in terms of (5), we obtain
- $x = \dfrac{T_1}{a+b} \pm \sqrt{\dfrac{b\,[(a+b)T_2 - T_1^2]}{a\,(a+b)^2}}$
- $y = \dfrac{T_1}{a+b} \mp \sqrt{\dfrac{a\,[(a+b)T_2 - T_1^2]}{b\,(a+b)^2}}$
- It can be readily verified that $(a+b)\,T_2 > T_1^2$.

**EDT**

- Let us choose $x = \dfrac{T_1}{a+b} + \sqrt{\dfrac{b\,[(a+b)T_2 - T_1^2]}{a\,(a+b)^2}}$ and $y = \dfrac{T_1}{a+b} - \sqrt{\dfrac{a\,[(a+b)T_2 - T_1^2]}{b\,(a+b)^2}}$, so that $y < x$.
- Note that $T_1$ and $T_2$ are both known. We will now sort out values of $n_x$ and $n_y$, subject to $n_x + n_y = N$, so as to satisfy the requirement that $-1 \le y < x \le 1$.

**EDT**

- First, note that
- (i) $a + b = N$;
- (ii) the expressions for $x$ and $y$ depend on $a$ and $b$ only through $a/b$ or $b/a$.
- Set $n(-)/N = P_-$, $n(0)/N = P_0$, $n(+)/N = P_+$.
- Conditions: $-1 \le y$ and $x \le 1$.
- Equivalent to: $1 + \dfrac{T_1}{a+b} \ge \sqrt{\dfrac{a\,[(a+b)T_2 - T_1^2]}{b\,(a+b)^2}}$ and $1 - \dfrac{T_1}{a+b} \ge \sqrt{\dfrac{b\,[(a+b)T_2 - T_1^2]}{a\,(a+b)^2}}$.

**EDT**

- Equivalent to: $\dfrac{P_0(1-P_0) + 4P_+P_-}{[2P_- + P_0]^2} \le \dfrac{n_x}{n_y} \le \dfrac{[2P_+ + P_0]^2}{P_0(1-P_0) + 4P_+P_-}$.
- Equivalent to: $L = \dfrac{P_0(1-P_0) + 4P_+P_-}{P_0(1-P_0) + 4P_+P_- + [2P_- + P_0]^2} \le \dfrac{n_x}{N} \le \dfrac{[2P_+ + P_0]^2}{P_0(1-P_0) + 4P_+P_- + [2P_+ + P_0]^2} = U$.
- Written alternatively as: $N L \le n_x \le N U$.

**EDT**

- Implication: the choice of $N$ must be such that the interval $[NL, NU]$ includes at least one integer, which can serve as the value of $n_x$. A sufficient condition for this to happen is, of course, that the length of the interval, viz. $N(U - L)$, is at least 1. Even when it is not, a choice of $n_x$ may still exist. Note: so far, a general treatment of this case (length less than unity) has been eluding us.

**EDT**

- (i) $P_0 = P_+ = P_- = 1/3$ (point- and mass-symmetric design).
- Here we find $L = 2/5$ and $U = 3/5$.
- So, for $N = 3$, $NL = 6/5$ and $NU = 9/5$, and the interval between them does not include any integer. So a 3-point design with point and mass symmetry cannot be replaced by a 2-point design whenever $N = 3$.
- Again, for $N = 6$, we have $NL = 12/5$ and $NU = 18/5$, and the interval includes the integer 3. So there is a solution and we have $\pm\sqrt{2/3}$, each with 3 observations, as was mentioned before.

**EDT**

- For $N = 9$, we have $NL = 18/5$ and $NU = 27/5$. The interval includes 2 integers: 4 and 5. So we have two solutions:
- $[(-5/\sqrt{30},\ 4);\ (4/\sqrt{30},\ 5)]$
- and
- $[(-4/\sqrt{30},\ 5);\ (5/\sqrt{30},\ 4)]$.

**EDT**

- (ii) $P_0 = 2/7$, $P_+ = 4/7$ and $P_- = 1/7$, i.e., the initial design has a size which is a multiple of 7, say $N = 7k$. This design is point-symmetric but mass-asymmetric.
- Explicitly it is $[(-1, k); (0, 2k); (1, 4k)]$, where $k$ is an integer.
- Note that $L$ and $U$ are independent of $k$. Computations yield $L = 13/21$ ($= 39/63$) and $U = 50/63$.
- (a) $k = 1$: $N = 7$; $NL = 13/3 < NU = 50/9$: one solution.
- $n_x = 5$, $x = 3/7 + \sqrt{1040}/70$; $n_y = 2$, $y = 3/7 - 5\sqrt{1040}/140$.

**EDT**

- (b) $k = 2$: $N = 14$, three solutions:
- $n_x = 9$, $x = 3/7 + \sqrt{520}/42$; $n_y = 5$, $y = 3/7 - 3\sqrt{2080}/140$.
- $n_x = 10$, $x = 3/7 + \sqrt{1040}/70$; $n_y = 4$, $y = 3/7 - \sqrt{260}/14$.
- $n_x = 11$, $x = 3/7 + \sqrt{3432}/154$; $n_y = 3$, $y = 3/7 - \sqrt{3432}/42$.

**EDT**

- (iii) $P_0 = 3/5$, $P_+ = P_- = 1/5$, i.e., the initial design has a size which is a multiple of 5, say $N = 5k$, and explicitly it is $[(-1, k); (0, 3k); (1, k)]$, where $k$ is an integer.
- This design is point- and mass-symmetric.
- Note that $L$ and $U$ are independent of $k$. Computations yield $L = 2/7$ and $U = 5/7$.
- $k = 1$: $N = 5$, $10/7 \le n_x \le 25/7$, so $(n_x, n_y) = (2, 3)$ or $(3, 2)$.
- Solutions: $x = 2/\sqrt{15}$ and $y = -3/\sqrt{15}$ with $n_x = 3$ and $n_y = 2$; $x = 3/\sqrt{15}$ and $y = -2/\sqrt{15}$ with $n_x = 2$ and $n_y = 3$.

**EDT**

- $k = 2$: $N = 10$, $20/7 \le n_x \le 50/7$, so $n_x = 3, 4, 5, 6, 7$.
- Solutions:
- $x = 6/\sqrt{210}$ and $y = -14/\sqrt{210}$ for $(n_x, n_y) = (7, 3)$;
- $x = 14/\sqrt{210}$ and $y = -6/\sqrt{210}$ for $(n_x, n_y) = (3, 7)$;
- $x = 4/\sqrt{60}$ and $y = -6/\sqrt{60}$ for $(n_x, n_y) = (6, 4)$;
- $x = 6/\sqrt{60}$ and $y = -4/\sqrt{60}$ for $(n_x, n_y) = (4, 6)$;
- $x = 2/\sqrt{10}$ and $y = -2/\sqrt{10}$ for $(n_x, n_y) = (5, 5)$.

**EDT**

- Example of a 3-point asymmetric design: $N = 3$.
- Consider an asymmetric design $[(-1, 1), (a, 1), (1, 1)]$ with $a \ne 0$. WLOG, we take $a > 0$.
- Consider Information Equivalence with $[(x, 2), (y, 1)]$.
- Then
- $a = 2x + y$ …… (6)
- $2 + a^2 = 2x^2 + y^2$ …… (7)
- This yields $x = a/3 \mp (1/3)\sqrt{a^2 + 3}$ and hence, for the singly weighted point, $y = a/3 \pm (2/3)\sqrt{a^2 + 3}$.
- For $0 < a < 1$, it turns out that $a/3 - (2/3)\sqrt{a^2 + 3} < -1$ and $1 < a/3 + (2/3)\sqrt{a^2 + 3}$.
- Hence, $N = 3$ does not work.

**EDT**

- For $N = 6$, naturally, equal allocation of 2 at each of the 3 points will yield the same negative result when we opt for $[(x, 4), (y, 2)]$. It follows that $[(x, 5), (y, 1)]$ also fails to yield any affirmative result.
- For $[(x, 3), (y, 3)]$, we require $2a = 3(x + y)$ and $4 + 2a^2 = 3(x^2 + y^2)$.
- We obtain $x, y = a/3 \pm (1/3)\sqrt{6 + 2a^2}$.

**EDT**

Note: for $a = 0$, this leads to $x, y = \pm\sqrt{2/3}$. This was discussed earlier.

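
The integer check $NL \le n_x \le NU$ used in the point-symmetric examples (i) to (iii) above can be sketched as follows. This is a hedged illustration rather than anything from the slides: the function name `edt_two_point_designs` and its return format are assumptions. It rewrites $L$ and $U$ in terms of $T_1$, $T_2$ and $N$, using $N(2P_- + P_0) = N - T_1$, $N(2P_+ + P_0) = N + T_1$ and $N^2[P_0(1-P_0) + 4P_+P_-] = N T_2 - T_1^2$, and keeps the interval endpoints in exact arithmetic.

```python
from fractions import Fraction
import math

def edt_two_point_designs(n_minus, n_zero, n_plus):
    """List (nx, x, ny, y) with nx + ny = N and -1 <= y < x <= 1 that are
    information-equivalent to the exact design -1(n_minus), 0(n_zero), +1(n_plus).

    All three counts are assumed to be positive integers.
    """
    N = n_minus + n_zero + n_plus
    T1 = n_plus - n_minus
    T2 = n_plus + n_minus
    D = N * T2 - T1 ** 2                      # (a + b) T2 - T1^2 > 0
    # L and U as exact fractions, so integer endpoints are not lost to rounding.
    L = Fraction(D, D + (N - T1) ** 2)
    U = Fraction((N + T1) ** 2, D + (N + T1) ** 2)
    designs = []
    for nx in range(math.ceil(N * L), math.floor(N * U) + 1):
        ny = N - nx
        x = T1 / N + math.sqrt(ny * D / (nx * N ** 2))
        y = T1 / N - math.sqrt(nx * D / (ny * N ** 2))
        designs.append((nx, x, ny, y))
    return designs

for counts in [(1, 1, 1), (2, 2, 2), (1, 2, 4), (2, 4, 8)]:
    print(counts, [(nx, ny) for nx, _, ny, _ in edt_two_point_designs(*counts)])
```

An empty list means that no integer $n_x$ falls in $[NL, NU]$, which is what happens for $-1(1), 0(1), +1(1)$ with $N = 3$.
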
Condition: $-1 < x < 1$ leads to $0 < a < 2\sqrt{3} - 3$, if $a > 0$. This was stated earlier.

**EDT**

- More examples:
- $[(-1, 1); (0, 2); (1, 1)]$ is equivalent to $[(-1/\sqrt{2},\ 2); (1/\sqrt{2},\ 2)]$.
- $[(-1, 2); (0, 1); (1, 1)]$: impossible.
- $[(-1, 4); (0, 2); (1, 2)]$ is equivalent to $[(-1/4 - \sqrt{165}/20,\ 5); (-1/4 + \sqrt{165}/12,\ 3)]$.

**Turning back to the example**

- U: -1.00, -0.91, -0.75, -0.40, 0.39, 0.67, 1.00
- Under linear regression, does there exist a 2-point information-equivalent design?
- Computations yield $n = 7$, $\mu'_1 = -1/7 = -0.142857$ and $\mu'_2 = 4.1516/7$.
- Alternative choice: two points $-1 < a < 0 < b < 1$, with 4 observations at $a$ and 3 at $b$, for the 7 observations.
- $4a + 3b = -1$ and $4a^2 + 3b^2 = 4.1516$.
- $a = -0.7982$ and $b = 0.7309$ is the required solution.

**Quadratic Regression: Information Equivalence**

- Context: quadratic regression model with homoscedastic errors [mean model $E[Y_x] = \alpha + \beta x + \gamma x^2$].
- Claim: Given any continuous regression design $D_{(k, x, w)}$ with $k$ support points in $\chi = [a, b]$, $a \le x_1 < x_2 < \dots < x_k \le b$, the x's all distinct and with positive weights $w_1, w_2, \dots, w_k$ (such that $\sum w_i = 1$), whenever $k > 3$ we can find exactly 3 points $x^*$, $x^{**}$ and $x^{***}$ with suitable weights $p^*$, $p^{**}$ and $p^{***}$ such that (i) $x_1 \le x^* < x^{**} < x^{***} \le x_k$; (ii) $p^* + p^{**} + p^{***} = 1$; and (iii) $\bar I$ based on $D^*_{[(x^*, p^*); (x^{**}, p^{**}); (x^{***}, p^{***})]}$ is identical to $\bar I$ based on $D_{(k, x, w)}$. [Information Equivalence]

**Quadratic Regression: EDT**

- Problem 1: Given $D_4$: $[(-1, 1); (-a, 1); (a, 1); (1, 1)]$, can we find $[(x, 2); (y, 1); (z, 1)]$ for Information Equivalence with $-1 \le x \ne y \ne z \le 1$?
- Answer: impossible.
- Problem 2: Given $D_6$: $[(-1, 1); (-0.5, 2); (0.5, 2); (1, 1)]$, can we find $[(-x, f); (0, 6 - 2f); (x, f)]$ for Information Equivalence with $0 < x < 1$?
- Answer: yes, with the unique solution $x = \sqrt{3}/2$ and $f = 2$.

**More on Quadratic Regression: EDT**

Problem 3: What about $D_{2k+2}$: $[(-1, 1); (-0.5, k); (0.5, k); (1, 1)]$? Is there a solution $[(-x, f); (0, 2k+2-2f); (x, f)]$ for some $x$ and $f$? The answer is "no" for $k = 3$ to $7$. For $k = 8$: $f = 6$ and $x = 1/\sqrt{2}$.

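
The search over $k$ in Problem 3 can be reproduced with a short exact computation. The sketch below is illustrative only: the function name `symmetric_three_point` and its two-argument form (a common count at $\pm 1$ and a common count at $\pm 0.5$) are assumptions. Since both designs are symmetric about 0, the odd moments vanish, so matching the sums of $x^2$ and $x^4$ gives $x^2 = S_4 / S_2$ and $f = S_2^2 / (2 S_4)$, and the design is admissible only when $f$ is an integer.

```python
from fractions import Fraction

def symmetric_three_point(end_count, inner_count):
    """Try to replace [(-1, e); (-0.5, k); (0.5, k); (1, e)] by
    [(-x, f); (0, N - 2f); (x, f)] with the same moments up to order 4.

    Returns (f, x_squared) when f is a positive integer that leaves at
    least one observation at 0, and None otherwise.
    """
    e, k = end_count, inner_count
    N = 2 * e + 2 * k
    s2 = 2 * e + Fraction(k, 2)     # sum of x^2 over all N observations
    s4 = 2 * e + Fraction(k, 8)     # sum of x^4 over all N observations
    x_sq = s4 / s2                  # from 2 f x^2 = s2 and 2 f x^4 = s4
    f = s2 * s2 / (2 * s4)
    if f.denominator != 1 or 2 * f >= N:
        return None
    return int(f), x_sq

for k in range(2, 9):
    print(k, symmetric_three_point(1, k))
```

Within $k = 2, \dots, 8$ only $k = 2$ (Problem 2, $f = 2$, $x^2 = 3/4$) and $k = 8$ ($f = 6$, $x^2 = 1/2$) return a design, matching the answers on the slides.
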
More affirmative cases:

- (i) $D_{36}$: $[(-1, 2); (-0.5, 16); (0.5, 16); (1, 2)]$ = $D_{36}$: $[(-1/\sqrt{2},\ 12); (0,\ 12); (1/\sqrt{2},\ 12)]$
- (ii) $D_{68}$: $[(-1, 2); (-0.5, 32); (0.5, 32); (1, 2)]$ = $D_{68}$: $[(-\sqrt{2/5},\ 25); (0,\ 18); (\sqrt{2/5},\ 25)]$

**Information Domination**

- De la Garza phenomenon: Information Equivalence.
- There is more to it in terms of Information Domination.
- WLOG, $\chi = [-1, 1]$.
- Claim 2: Given $D^* = [(x^*, p^*); (x^{**}, p^{**})]$ with $(x^*, x^{**}) \ne (-1, 1)$, there exists $0 < c < 1$ so that $D_c = [(-1, c); (+1, 1-c)]$ produces an information matrix $I(D_c)$ which dominates $I(D^*)$ in the sense of matrix domination. That is, $I(D_c) - I(D^*)$ is nnd. In a way, $I(D_c)$ dominates $I(D^*)$ in every sense.
- This is the best result one can think of in terms of improving over $I(D^*)$.

**Information Domination**

- Proof of Claim 2:
- Set $1 - 2c = \mu'_1$ and solve for $c = [1 - \mu'_1]/2$.
- Note that $(x^*, x^{**}) \ne (-1, 1)$, so that $-1 < \mu'_1 < 1$ and so $0 < c < 1$.
- Next, note that $\mu'_2 < 1$.
- Therefore, $I(D_c) - I(D^*) = \begin{pmatrix} 0 & 0 \\ 0 & 1 - \mu'_2 \end{pmatrix}$, which is nnd.
- Message: push the points to the boundaries.

**Quadratic Regression: Information Dominance**

- Context: quadratic regression model with homoscedastic errors [mean model $E[Y_x] = \alpha + \beta x + \gamma x^2$].
- Claim: set $\chi = [-1, 1]$ WLOG.
- Given any continuous regression design $D^*_{[(x^*, p^*); (x^{**}, p^{**}); (x^{***}, p^{***})]}$ with $-1 < x^* < x^{**} < x^{***} < 1$, there exist proportions $p$, $q$ and $r$ and a constant $c$, $-1 < c < 1$, such that the design $D_{[(-1, p); (c, r); (+1, q)]}$ provides Information Dominance over the design $D^*$.

**Sketch of the Proof**

- $I = \begin{pmatrix} 1 & \mu'_1 & \mu'_2 \\ \mu'_1 & \mu'_2 & \mu'_3 \\ \mu'_2 & \mu'_3 & \mu'_4 \end{pmatrix}$
- $I^*$ = etc.
- Equate $\mu'_1$, $\mu'_2$ and $\mu'_3$ to those of $I^*$ and solve for $p$, $q$, $r$ and $c$. Then show that $\mu'_4 < \mu^{*\prime}_4$.
- For details, see Pukelsheim's book.
- Also see the Liski et al. monograph [2002], Topics in Optimal Design.

**Binary Response Models**

- Impressive literature on optimality issues.
- De la Garza phenomenon and Information Dominance: recent advances.
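- Optimal designs for binary data under logistic regression. Mathew-Sinha (2001), Jour. Stat. Plan. & Inf., 93, 295-307.

**Binary Response Model**

- $P[Y_x = 1] = 1 / [1 + \exp\{-(\alpha + \beta x)\}]$
- $\{(x_i, n_i)\}$, $i = 1, 2, \dots, k$: the given data.
- Binomial model, log likelihood, differentiation, etc., leading to the information matrix.
- Approximate theory: $\{(x_i, p_i)\}$ with $\sum p_i = 1$.
- Set $a_i = \alpha + \beta x_i$ for each $i$. Then

$$I(\alpha, \beta) = \begin{pmatrix} \sum p_i \dfrac{\exp(-a_i)}{[1 + \exp(-a_i)]^2} & \sum p_i x_i \dfrac{\exp(-a_i)}{[1 + \exp(-a_i)]^2} \\ \sum p_i x_i \dfrac{\exp(-a_i)}{[1 + \exp(-a_i)]^2} & \sum p_i x_i^2 \dfrac{\exp(-a_i)}{[1 + \exp(-a_i)]^2} \end{pmatrix}$$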
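
The information matrix of the logistic model under an approximate design can be evaluated directly from the formula above. The following Python sketch is illustrative only: the function name `logistic_information`, the two-point design and the parameter values $\alpha = 0$, $\beta = 1$ are assumptions chosen for the example, not values taken from the slides.

```python
import math

def logistic_information(design, alpha, beta):
    """Per-observation information matrix for (alpha, beta) under the
    approximate design {(x_i, p_i)} and the logistic model
    P[Y_x = 1] = 1 / (1 + exp(-(alpha + beta * x)))."""
    i00 = i01 = i11 = 0.0
    for x, p in design:
        a = alpha + beta * x
        # Weight exp(-a_i) / [1 + exp(-a_i)]^2 attached to the point x_i.
        w = math.exp(-a) / (1.0 + math.exp(-a)) ** 2
        i00 += p * w
        i01 += p * x * w
        i11 += p * x * x * w
    return [[i00, i01], [i01, i11]]

# Hypothetical two-point design with equal weights at -1 and +1.
info = logistic_information([(-1.0, 0.5), (1.0, 0.5)], alpha=0.0, beta=1.0)
print(info)
```

Unlike the linear model, the matrix depends on the unknown $(\alpha, \beta)$ through the $a_i$, so any design comparison based on it is local to the parameter values plugged in.
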