##### Kakutani’s interval splitting scheme


**Kakutani’s interval splitting scheme**

Willem R. van Zwet, University of Leiden
Bahadur lectures, Chicago 2005

**Kakutani’s interval splitting scheme**

Random variables X_1, X_2, … :

- X_1 has a uniform distribution on (0,1);
- given X_1, X_2, …, X_{k-1}, the conditional distribution of X_k is uniform on the longest of the k subintervals created by 0, 1, X_1, X_2, …, X_{k-1}.

    0———x1——————————1
    0———x1———————x2——1
    0———x1——x3————x2——1
    0———x1——x3——x4—x2——1

Kakutani (1975): as n → ∞, do these points become evenly (i.e. uniformly) distributed in (0,1)?

**Empirical distribution function of X_1, X_2, …, X_n**

F_n(x) = n^{-1} Σ_{1≤i≤n} 1_{(0,x]}(X_i).

Uniform d.f. on (0,1): F(x) = P(X_1 ≤ x) = x, x ∈ (0,1).

Formal statement of Kakutani's question:

(*) lim_{n→∞} sup_{0<x<1} |F_n(x) − x| = 0 with probability 1?

**The i.i.d. case**

We know: if X_1, X_2, …, X_n are independent and each is uniformly distributed on (0,1), then (*) is true (Glivenko–Cantelli). So (*) is "obviously" true in this case too!! However, the joint distribution of even the first five points is already utterly hopeless to compute!!

**Stopping rule**

For 0 < t < 1, let N_t be the first n for which all subintervals have length ≤ t; for t ≥ 1, set N_t = 0. The stopped sequence X_1, X_2, …, X_{N_t} has the property that any subinterval receives another random point before we stop iff it is longer than t.

**Change the order and blow up**

Given that X_1 = x:

    0———x—————————1

L(N_t | X_1 = x) = L(N_{t/x} + N*_{t/(1−x)} + 1), 0 < t < 1,

where N_{t/x} and N*_{t/(1−x)} are independent copies and L(Z) denotes the distribution (law) of Z: conditionally on X_1 = x, the two subintervals are split independently, and blowing (0, x) and (x, 1) up to unit length turns the threshold t into t/x and t/(1−x).

**Mean and variance of N_t**

Taking expectations,

μ(t) = E N_t = ∫_0^1 {μ(t/x) + μ(t/(1−x)) + 1} dx,

which solves to μ(t) = (2/t) − 1, 0 < t < 1. Similarly,

σ²(t) = E(N_t − μ(t))² = c/t, 0 < t ≤ ½.

**We have built a clock**

E{N_t / (2/t)} = 1 − t/2 → 1 as t → 0, and σ²(N_t / (2/t)) = ct/4 → 0 as t → 0, so

lim_{t→0} N_t / (2/t) = 1 w.p. 1.

We have built a clock! As t → 0, the longest interval (length ≤ t) tells the time: n ~ 2/t.

**Counting the points in (0, x]**

N_t ~ 2/t as t → 0 w.p. 1. Define

N_t(x) = Σ_{i=1}^{N_t} 1_{(0,x]}(X_i), x ∈ (0,1).
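The scheme and the "clock" E N_t = 2/t − 1 are easy to check by simulation. Below is a minimal sketch (the function name `kakutani_points` is mine, not from the lectures): a max-heap keeps the intervals ordered by length, each new point is drawn uniformly on the current longest interval, and the loop stops once every subinterval has length ≤ t.

```python
import heapq
import random

def kakutani_points(t, rng):
    """Split (0,1) by Kakutani's scheme -- each new point uniform on the
    current longest subinterval -- until every subinterval has length <= t.
    Returns the points placed, so len(result) is the stopping variable N_t."""
    heap = [(-1.0, 0.0, 1.0)]          # max-heap via negated lengths
    points = []
    while -heap[0][0] > t:             # longest interval still exceeds t
        _, lo, hi = heapq.heappop(heap)
        x = rng.uniform(lo, hi)        # uniform on the longest interval
        points.append(x)
        heapq.heappush(heap, (lo - x, lo, x))   # left piece, length x - lo
        heapq.heappush(heap, (x - hi, x, hi))   # right piece, length hi - x
    return points

rng = random.Random(0)
t = 0.01
mean_N = sum(len(kakutani_points(t, rng)) for _ in range(300)) / 300
# mean_N should be close to E N_t = 2/t - 1 = 199
```

Averaging len(kakutani_points(0.01, rng)) over a few hundred runs lands near 199, in line with μ(t) = 2/t − 1.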
Restricting the scheme to (0, x) and blowing up as before, N_t(x) ~ N_{t/x} ~ 2x/t as t → 0 w.p. 1:

    N_t(x):   0—.—.—x—.——.—.——.—.——1
    N_{t/x}:  0—.—.—x—.——.—.——.—.——1

Hence

F_{N_t}(x) = N_t(x) / N_t → x as t → 0 w.p. 1.

**Kakutani was right**

Since F_{N_t}(x) → x as t → 0 w.p. 1, and N_t → ∞ as t → 0,

F_n(x) → x as n → ∞ w.p. 1, and hence sup_{0<x<1} |F_n(x) − x| → 0 w.p. 1.

Kakutani was right (vZ, Ann. Prob. 1978).

**How fast?**

We want to show that F_n(x) → x faster than in the i.i.d. uniform case, e.g. by considering the stochastic processes

B_n(x) = n^{1/2} (F_n(x) − x), 0 ≤ x ≤ 1.

If X_1, X_2, …, X_n are independent and uniformly distributed on (0,1), then B_n →_D B^0 as n → ∞, where →_D denotes convergence in distribution and B^0 the Brownian bridge.

**Refresher course 1**

W: Wiener process on [0,1], i.e. W = {W(t): 0 ≤ t ≤ 1} with

- W(0) = 0;
- W(t) has a normal (Gaussian) distribution with mean zero and variance E W²(t) = t;
- W has independent increments.

B^0: Brownian bridge on [0,1], i.e. B^0 = {B^0(t): 0 ≤ t ≤ 1} is distributed as W conditioned on W(1) = 0.

Fact: {W(t) − t W(1): 0 ≤ t ≤ 1} is distributed as B^0.

**Half a Brownian bridge**

So: if X_1, X_2, …, X_n are independent and uniformly distributed on (0,1), then B_n →_D B^0 as n → ∞. If X_1, X_2, …, X_n are generated by Kakutani's scheme, then (Pyke & vZ, Ann. Prob. 2004)

B_n →_D a·B^0 as n → ∞, with a = ½ σ(N_{1/2}) = (4 log 2 − 5/2)^{1/2} = 0.5221… .

Half a Brownian bridge! Converges twice as fast!

**Refresher course 2**

Y: random variable with finite k-th moment μ_k = E Y^k = ∫ Y^k dP < ∞ and characteristic function

ψ(t) = E e^{itY} = 1 + Σ_{1≤j≤k} μ_j (it)^j / j! + o(t^k).

Then

log ψ(t) = Σ_{1≤j≤k} κ_j (it)^j / j! + o(t^k),

where κ_j is the j-th cumulant:

κ_1 = μ_1 = E Y;  κ_2 = σ² = E(Y − μ_1)²;  κ_3 = E(Y − μ_1)³;  κ_4 = E(Y − μ_1)⁴ − 3σ⁴;  etc.

κ_j = 0 for all j ≥ 3 iff Y is normal.

**Cumulants add up**

If Y_1 and Y_2 are independent, then characteristic functions multiply and hence cumulants add up:

κ_j(Y_1 + Y_2) = κ_j(Y_1) + κ_j(Y_2).

Let Y_1, Y_2, … be i.i.d. with mean μ = E Y_1 = 0 and all moments finite. Define

S_n = n^{−1/2} (Y_1 + Y_2 + … + Y_n).

Then for j ≥ 3,

κ_j(S_n) = n^{−j/2} κ_j(Σ Y_i) = n^{1−j/2} κ_j(Y_1) → 0,

so S_n is asymptotically normal by a standard moment convergence argument.
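The constant a can be checked numerically in two independent ways: from the closed form (4 log 2 − 5/2)^{1/2}, and as ½ σ(N_{1/2}) by simulating the stopping variable N_{1/2}, whose mean should be 2/(1/2) − 1 = 3. A rough sketch, with helper names of my own choosing:

```python
import heapq
import math
import random

def stopped_n(t, rng):
    """The stopping variable N_t: number of points placed by Kakutani's
    scheme before every subinterval has length <= t."""
    heap = [(-1.0, 0.0, 1.0)]          # max-heap of intervals via negated lengths
    n = 0
    while -heap[0][0] > t:
        _, lo, hi = heapq.heappop(heap)
        x = rng.uniform(lo, hi)
        n += 1
        heapq.heappush(heap, (lo - x, lo, x))
        heapq.heappush(heap, (x - hi, x, hi))
    return n

# Closed form from the lectures: a = (4 log 2 - 5/2)^(1/2) = 0.5221...
a_closed = math.sqrt(4 * math.log(2) - 2.5)

# Monte Carlo: a = (1/2) * sigma(N_{1/2}), using sigma^2(t) = c/t at t = 1/2
rng = random.Random(1)
samples = [stopped_n(0.5, rng) for _ in range(20000)]
m = sum(samples) / len(samples)                         # should be near 3
var = sum((s - m) ** 2 for s in samples) / (len(samples) - 1)
a_mc = 0.5 * math.sqrt(var)                             # should be near a_closed
```

The two estimates agree to about two decimals with 20,000 replications, which is as much as this crude Monte Carlo can resolve.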
A poor man’s CLT, but sometimes very powerful.

**Cumulants of N_t**

We know

E N_t = (2/t) − 1, 0 < t < 1, and σ²(N_t) = c/t, 0 < t ≤ ½.

Similarly,

κ_j(N_t) = c_j / t, 0 < t ≤ 1/j, j = 3, 4, … .

Define I_s = N_{1/s} + 1, i.e. the number of intervals at the first time when all intervals are ≤ 1/s. Then

κ_j(I_s) = c_j s, s > j, j = 1, 2, …, with c_1 = 2 and c_2 = c.

**An almost-independent-increments process**

Since κ_j(I_s) = c_j s for s > j — every cumulant linear in s — for growing s the process I_s behaves more and more like an independent increments process!! Define

W_t(x) = (t/c)^{1/2} (N_t(x) − 2x/t), 0 ≤ x ≤ 1.

Then for s = 1/t and s → ∞ (i.e. t → 0),

W_t(x) ≈_D (t/c)^{1/2} (N_{t/x} − 2x/t) ≈_D (cs)^{−1/2} (I_{xs} − 2xs) →_D W(x)

because of the cumulant argument. (The proof of tightness is very unpleasant!)

**From W back to F_n**

Now

W_t(x) − x W_t(1) = (t/c)^{1/2} {N_t(x) − x N_t(1)} ≈ 2(ct)^{−1/2} {N_t(x) − x N_t(1)} / N_t = 2(ct)^{−1/2} (F_{N_t}(x) − x),

using N_t ≈ 2/t. Hence for t = 2/n and M = N_{2/n} ≈ n,

n^{1/2} (F_M(x) − x) →_D (c/2)^{1/2} B^0(x) = a B^0(x),

but the randomness of M is a major obstacle!

**The last step**

Now M = N_{2/n} ≈ n, but it is really nasty to show that

n^{1/2} sup_x |F_M(x) − F_n(x)| →_P 0.

But then we have

n^{1/2} (F_n(x) − x) →_D a·B^0(x), with a = (c/2)^{1/2}.
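The conclusion — Kakutani's points are roughly twice as evenly spread as i.i.d. uniforms — shows up in a small Monte Carlo experiment comparing n^{1/2} sup_x |F_n(x) − x| under the two schemes: the ratio of the averages should be near a ≈ 0.52. A sketch under these assumptions (all names are mine):

```python
import heapq
import math
import random

def ks_stat(xs):
    """sup_x |F_n(x) - x| for a sample from (0,1), via the sorted order."""
    xs = sorted(xs)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

def kakutani_sample(n, rng):
    """First n points of Kakutani's scheme: always split the longest interval."""
    heap = [(-1.0, 0.0, 1.0)]          # max-heap via negated lengths
    pts = []
    for _ in range(n):
        _, lo, hi = heapq.heappop(heap)
        x = rng.uniform(lo, hi)
        pts.append(x)
        heapq.heappush(heap, (lo - x, lo, x))
        heapq.heappush(heap, (x - hi, x, hi))
    return pts

rng = random.Random(2)
n, reps = 2000, 100
kak = [math.sqrt(n) * ks_stat(kakutani_sample(n, rng)) for _ in range(reps)]
iid = [math.sqrt(n) * ks_stat([rng.random() for _ in range(n)]) for _ in range(reps)]
ratio = (sum(kak) / reps) / (sum(iid) / reps)
# ratio should be near a = 0.522: Kakutani points fluctuate about half as much
```

This only probes the scaling limit at one finite n, of course; it illustrates the theorem rather than proving anything.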