This research delves into the concept of strong completeness and its relevance in random thinning of random processes. It explores identifiable sampling schemes and Markov sampling methods, with a focus on the gamma family. The application of strong completeness in spatial sampling and risk analysis is examined, particularly in negative binomial sampling of stochastic processes. Various observational schemes are discussed, highlighting the importance of identifiable sampling for inference validity. The paper also covers identifiable sampling schemes in continuous and discrete settings, elucidating the criteria for uniquely determining the stochastic structure of a process.
Strong Completeness and its Application to Random Thinning of Random Processes
Shailaja Deshmukh, University of Pune, Pune, India; Visiting Professor, University of Michigan, Ann Arbor
Outline
• Random thinning of stochastic processes
• Identifiable sampling schemes
• Markov sampling of a continuous parameter stochastic process
• Strong completeness of the gamma family
• Applications in spatial sampling
• Negative binomial sampling of a discrete parameter stochastic process and its applications in risk analysis
{X(t), t ∈ T}, T: discrete or continuous
Various observational schemes
• Complete observation over a fixed time interval: Keiding (1974, 1975), Athreya (1975, 1978)
• Observing the process until a fixed number of events occur (inverse sampling): Moran (1951), Keiding (1974, 1975)
• Observing the process at specified deterministic epochs t1, t2, …, tn: Prakasa Rao (1988), Su & Cambanis (1993)
Kingman (1963, Ann. Math. Statist.)
• Fixed-epoch sampling suffers from non-identifiability: the observed data may come from different processes
• Kingman (1963) advocated selecting the epochs t1, t2, …, tn randomly
• Criterion: a process derived from the original process randomly should determine the stochastic structure of the original process uniquely
• The process used for sampling should be identifiable
{X(t), t ∈ T}: original process under study
Zn = X(Tn), where the Tn are random variables
{Zn, n ≥ 1}: derived process, or randomly thinned process
Identifiability of a sampling scheme:
{Z(1)(n)} =d {Z(2)(n)} implies {X(1)(t)} =d {X(2)(t)},
where =d denotes equality of finite dimensional distributions.
• The derived process determines the original process uniquely
• Identifiability is essential for the justification of inference based on a randomly derived process
• Basawa (1974), Baba (1982)
Identifiable sampling schemes
Continuous parameter process:
• Poisson sampling: Tn is the n-th event in a Poisson process. Kingman (1963), Ann. Math. Statist.
• Markov process sampling: Tn is the n-th visit to state 1 in a two-state Markov process; uses strong completeness of the gamma family. Deshmukh (2005), Stochastic Modelling and Applications.
Discrete parameter process:
• Bernoulli sampling: Tn is the n-th success in independent Bernoulli trials. Deshmukh (1991), Austral. J. Statist.
• Markov sampling: Tn is the n-th visit to state 1 in a two-state Markov chain. Deshmukh (2000), Aust. N. Z. J. Statist.
• Negative binomial sampling: Tn is the n-th epoch of the k-th success in Bernoulli trials; uses strong completeness of the negative binomial family; extension of PASTA.
Markov sampling of a continuous parameter stochastic process
• {X(t), t ≥ 0}: continuous parameter stochastic process
• {Y(t), t ≥ 0}: Markov process with state space {0, 1} and Y(0) = 1
• {Y(t)} is independent of {X(t)}
• Observe {X(t)} at the epochs of visits to state 1 of {Y(t)}
• Tk = S1 + … + Sk: epoch of the k-th visit to state 1 of {Y(t)}
• {X(t)} is observed at Tk, k ≥ 1
• {Z(k) = X(Tk), k ≥ 1} is derived from the original process by Markov sampling (MS)
Aim: to determine whether {Z(k)} determines the stochastic structure of the original process uniquely.
Waiting time Tk for the k-th visit to state 1 of the Markov process:
• Each cycle is a sojourn W1 in state 1 followed by a sojourn W0 in state 0 (state diagram omitted)
• Waiting time for the first visit to state 1: S1 = W0 + W1
• W0 and W1 are independent random variables having exponential distributions with means λ0^{-1} and λ1^{-1} respectively
• Tk = S1 + … + Sk = V0 + V1, where Vi ~ G(λi, k); λi is the scale parameter and k the shape parameter, i = 0, 1
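The distributional claim above is easy to check by simulation: summing the k exponential sojourn pairs reproduces a sum of two gamma variables. A minimal sketch (the rates lam0, lam1 and the visit index k are illustrative values, not taken from the paper):

```python
import numpy as np

# Sketch: T_k = S_1 + ... + S_k, each cycle S = W0 + W1 with
# W0 ~ Exp(lam0), W1 ~ Exp(lam1), so T_k = V0 + V1 where
# Vi ~ Gamma(scale parameter lam_i, shape parameter k).
# lam0, lam1, k are illustrative, not from the paper.
rng = np.random.default_rng(0)
lam0, lam1, k, n = 2.0, 0.5, 3, 200_000

# simulate T_k directly from the k exponential sojourn pairs
Tk = (rng.exponential(1 / lam0, (n, k)).sum(axis=1)
      + rng.exponential(1 / lam1, (n, k)).sum(axis=1))

# Gamma(lam, k) has mean k/lam and variance k/lam^2, so the sum has
mean_theory = k / lam0 + k / lam1       # 1.5 + 6.0 = 7.5
var_theory = k / lam0**2 + k / lam1**2  # 0.75 + 12.0 = 12.75
```

Matching the first two moments of `Tk` against `mean_theory` and `var_theory` is consistent with Tk being the sum of the two gamma variables V0 and V1.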
Sampling scheme is identifiable:
{Z(1)(k)} =d {Z(2)(k)} under Markov sampling implies {X(1)(t)} =d {X(2)(t)}.
Family of finite dimensional distribution functions of {X(i)(t)}:
Fi ≡ Fi(t1, t2, …, tn) = Fi(t1, t2, …, tn; x1, …, xn) = P[X(i)(tj) ≤ xj, j = 1, …, n],
x1, x2, …, xn real numbers; t1, t2, …, tn positive real numbers with t1 < t2 < … < tn.
Family of finite dimensional distribution functions of {Z(i)(k)}:
Gi = P[Z(i)(K(j)) ≤ xj, j = 1, 2, …, n], where K(j) = k1 + k2 + … + kj.
(Timeline diagram omitted: the sampling indices 0 < K(1) < K(2) < … < K(n) correspond to the epochs TK(1), …, TK(n), with gaps of k1, k2, …, kn visits between consecutive indices.)
Let TK(j) = Uj and Lj = Uj - Uj-1, so that Uj = L1 + L2 + … + Lj. Then
Gi ≡ P[Z(i)(K(j)) ≤ xj, j = 1, …, n]
= P[X(i)(TK(j)) ≤ xj, j = 1, …, n]
= P[X(i)(Uj) ≤ xj, j = 1, …, n]
= E{ P[X(i)(Uj) ≤ xj, j = 1, …, n | U1, U2, …, Un] }
= E{Fi(U1, U2, …, Un)}, by independence of {X(t)} and {Y(t)},
= E{Fi(L1, L1+L2, …, L1+L2+…+Ln)}
G1 = G2 implies
E{F1(L1, L1+L2, …, L1+L2+…+Ln)} = E{F2(L1, L1+L2, …, L1+L2+…+Ln)},
i.e. E{F1(L1, L1+L2, …, L1+…+Ln) - F2(L1, L1+L2, …, L1+…+Ln)} = 0.
The expectation is with respect to the joint distribution of (L1, L2, …, Ln).
L1, L2, …, Ln are independent, and Lj = V0kj + V1kj, where V0kj ~ G(λ0, kj) and V1kj ~ G(λ1, kj); λ0 and λ1 are known, and kj, j = 1, …, n, are the only unknown parameters.
Equivalently, the expectation is with respect to the joint distribution of (V0kj, V1kj, j = 1, …, n).
If the joint distribution of (V0kj, V1kj, j = 1, …, n) is complete, then G1 = G2 implies F1 = F2.
Strong completeness of the family of V0kj (and of V1kj) for each j implies completeness of the joint distributions.
X ~ G(α, k), α: scale parameter, k: shape parameter
f(x) = α^k e^{-αx} x^{k-1} / Γ(k), x > 0
α known, k ∈ I+
Not a one-parameter exponential family; the parameter space is not an open set
Complete family
Ek(h(X)) = 0 for all k ∈ I+
⇔ ∫ h(x) α^k e^{-αx} x^{k-1}/Γ(k) dx = 0 for all k ∈ I+
⇔ g(k) = 0 for all k ∈ I+
⇒ Σ z^k g(k) = 0, 0 < z < 1
⇔ ∫ h(x) e^{-αx} ( Σ (αzx)^{k-1}/(k-1)! ) dx = 0
⇔ ∫ h(x) e^{-θx} dx = 0, θ = α(1 - z), 0 < θ < α
⇔ ∫ h(x) e^{-θx} dx = 0 for all θ > 0, by analytic continuation
⇔ h(x) = 0 a.s. Pk for all k ∈ I+
Hence {G(α, k), k ∈ I+} is a complete family.
It is, in fact, strongly complete.
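The series step above rests on the identity that a geometric mixture of Gamma(α, k) laws, k = 1, 2, …, with weights (1 - z) z^{k-1}, is exactly Exponential(θ) with θ = α(1 - z). A Monte Carlo sketch of that identity (α and z are illustrative values):

```python
import numpy as np

# Sketch: mixing Gamma(alpha, k) over k with geometric weights
# (1 - z) z^(k-1) collapses the series sum_k (alpha*z*x)^(k-1)/(k-1)!
# to an exponential, so the mixture is Exponential(theta),
# theta = alpha * (1 - z).  alpha, z are illustrative values.
rng = np.random.default_rng(0)
alpha, z, n = 2.0, 0.5, 200_000
theta = alpha * (1 - z)                    # = 1.0 here

K = rng.geometric(1 - z, size=n)           # P(K = k) = (1 - z) z^(k-1)
X = rng.gamma(shape=K, scale=1 / alpha)    # X | K = k ~ Gamma(alpha, k)

# Exponential(theta) has mean 1/theta and variance 1/theta^2
```

The sample mean and variance of `X` match 1/θ and 1/θ², as the mixture identity predicts; this geometric mixing measure is exactly the μ used in the strong-completeness argument below.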
Definition: A family of distributions {Fθ, θ ∈ Θ} is called strongly complete if there exists a measure μ on (Θ, 𝒜) such that for every subset Θ* of Θ with μ(Θ - Θ*) = 0, ∫ h(x) Fθ(dx) = 0 for all θ ∈ Θ* implies h(x) = 0 a.s. Pθ for every θ ∈ Θ. (Zacks, 1971)
• Strong completeness implies completeness (take Θ* = Θ).
• Suppose T1 and T2 are independent random variables. If {F^{T1}_θ, θ ∈ Θ} is complete and {F^{T2}_{θ'}, θ' ∈ Θ'} is strongly complete, then the family of joint distributions {F^{T1,T2}_{θ,θ'}, θ ∈ Θ, θ' ∈ Θ'} is complete. (Zacks, 1971)
The gamma family is strongly complete:
• Parameter space I+, 𝒜 its sigma-field, μ the measure induced by a geometric distribution
• For A ∈ 𝒜, μ(A) = Σ δ(1 - δ)^{k-1}, the sum being taken over k ∈ A
Suppose Θ* is a subset of I+ such that μ(I+ - Θ*) = 0.
∫ h(x) α^k e^{-αx} x^{k-1}/Γ(k) dx = 0 for all k ∈ Θ* ⇔ g(k) = 0 for all k ∈ Θ*.
Since μ({k}) = δ(1 - δ)^{k-1} > 0 for every k ∈ I+, μ(I+ - Θ*) = 0 forces Θ* = I+, so g(k) = 0 for all k ∈ I+.
Hence, for every δ ∈ (0, 1), Σ δ(1 - δ)^{k-1} ( ∫ h(x) α^k e^{-αx} x^{k-1}/Γ(k) dx ) = 0, the sum being taken over I+.
Using Fubini's theorem, summation and integration can be interchanged, giving
∫ h(x) e^{-θx} dx = 0 for all θ ∈ (0, α), and hence for all θ > 0 by analytic continuation,
so h(x) = 0 a.s. Pk for all k ∈ I+.
The gamma family is strongly complete.
Thus, the joint distribution of (V0kj, V1kj, j = 1, …, n) is complete. Further, using the continuity of F, we get: G1 = G2 implies F1 = F2.
Markov sampling is an identifiable sampling scheme.
• {Z(k), k ≥ 1} is a Markov process iff {X(t)} is a Markov process.
• {Z(k), k ≥ 1} is a stationary process iff {X(t)} is a stationary process.
• lim_{t→∞} P[X(t) ∈ B] = lim_{n→∞} P[Zn ∈ B]
• The fraction of time the process {X(t)} spends in a set B (a measurable subset of the state space of {X(t)}) is the same as the fraction of time {X(t)} is in B when observed at the epochs of visits to state 1 of {Y(t)}.
• This is parallel to the Poisson Arrivals See Time Averages (PASTA) property.
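The PASTA-like property on this slide can be illustrated numerically: take {X(t)} to be a simple two-state process, sample it at visit epochs of an independent two-state Markov process, and compare the time-average occupancy with the sampled occupancy. A sketch under those assumptions (all rates are illustrative, and the two-state choice for X is ours, not the paper's):

```python
import numpy as np

# Sketch: long-run fraction of time a two-state process {X(t)} spends
# in state 1 vs. the fraction of Markov-sampled values Z(k) = X(T_k)
# equal to 1.  All rates are illustrative, not from the paper.
rng = np.random.default_rng(1)

def ctmc_path(q01, q10, horizon):
    """Jump times and states of a {0,1}-valued Markov process from state 0."""
    times, states = [0.0], [0]
    t, s = 0.0, 0
    while t < horizon:
        t += rng.exponential(1.0 / (q01 if s == 0 else q10))
        s = 1 - s
        times.append(t)
        states.append(s)
    return np.array(times), np.array(states)

jt, js = ctmc_path(1.0, 2.0, horizon=100_000.0)
# occupation fraction of state 1; stationary value q01/(q01+q10) = 1/3
time_in_1 = np.diff(jt)[js[:-1] == 1].sum() / jt[-1]

# sampling epochs: visits to state 1 of an independent Markov process,
# inter-visit times distributed as Exp(lam0) + Exp(lam1)
lam0, lam1 = 3.0, 1.5
gaps = rng.exponential(1 / lam0, 150_000) + rng.exponential(1 / lam1, 150_000)
Tk = np.cumsum(gaps)
Tk = Tk[Tk < jt[-1]]
Z = js[np.searchsorted(jt, Tk, side="right") - 1]  # X evaluated at T_k
```

In this run `Z.mean()` agrees with `time_in_1`, mirroring lim P[X(t) ∈ B] = lim P[Zn ∈ B] for B = {1}.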
Application: identifiable sampling designs in spatial processes to select the locations
• {Z(s), s ∈ D}: spatial process
• s: locations; D: study region
• Aim: to select the locations at which the characteristic under study is to be measured, e.g. thickness or smoothness of a powder coating, nests of birds
• Most common scheme: regular sampling (Cressie, 1993), which suffers from non-identifiability
Study region continuous:
• Aim: selection of locations (s1, s2)
• If both coordinates are selected by Poisson sampling, the scheme generates a CSR (complete spatial randomness) pattern; if both coordinates are selected by Markov process sampling, it generates an aggregated pattern
• The spatial process observed at these locations determines the original process uniquely
Study region discrete:
• Adopt Bernoulli sampling or Markov sampling
• Deshmukh (2003), JISA (Adke special volume)
• Prayag & Deshmukh (2000), Environmetrics: test for CSR against an aggregated pattern
Suppose X has a negative binomial distribution:
Pk[X = x] = C(x + k - 1, k - 1) p^k q^x, x = 0, 1, …,
with p known and k ∈ I+.
• Not a one-parameter exponential family
• Complete
• Strongly complete
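As a quick sanity check on this parameterization, the pmf with support {0, 1, …} sums to one and has mean kq/p. A minimal sketch (the values of p and k are illustrative):

```python
from math import comb

# Sketch: negative binomial pmf on {0, 1, ...},
# P_k[X = x] = C(x+k-1, k-1) p^k q^x.  p, k are illustrative values.
p, k = 0.4, 3
q = 1 - p

# truncate the support at 200; the geometric tail beyond is negligible
pmf = [comb(x + k - 1, k - 1) * p**k * q**x for x in range(200)]
total = sum(pmf)                               # should be ~ 1
mean = sum(x * px for x, px in enumerate(pmf)) # should be ~ k*q/p = 4.5
```

The strong-completeness argument for this family parallels the gamma case: the geometric mixing measure again turns the sum over k into a single generating-function identity.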
Risk models in insurance
• U(t): reserve / value of the fund / insurer's surplus at time t
• U(t) = initial capital + input via premiums by time t - output due to claims by time t
• S(t) = output due to claim payments by time t = ∫0^t X(u) du, the random part
• Probability of ruin: P[U(t) < 0]
• Distribution of {S(t)}, or its discrete version Sn = Σ Xi, i running from 1 to n
• Observed data are the claim amounts in various time periods (weeks or months)
• Uk = Σ Xi, i running from 1 to Nk, where Nk is the number of claims in a fixed time period and Xi denotes the claim amount; Nk and Xi are random
• If Nk = 0, then Uk = 0
• Nk: Poisson or negative binomial
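The compound sum Uk can be simulated directly and checked against Wald's identity E[Uk] = E[Nk] E[Xi]. A sketch with a negative binomial claim count and exponential claim sizes (the exponential claim-size choice and all parameter values are our illustrative assumptions):

```python
import numpy as np

# Sketch: aggregate claims U_k = X_1 + ... + X_{N_k}, with U_k = 0
# when N_k = 0.  N_k ~ negative binomial, X_i ~ exponential claim
# sizes; the claim-size law and parameters are illustrative.
rng = np.random.default_rng(2)
n, k, p, claim_mean = 200_000, 3, 0.4, 10.0

N = rng.negative_binomial(k, p, size=n)      # claim counts on {0, 1, ...}
# sum of N iid exponentials (mean claim_mean) is Gamma(shape N);
# the max/mask trick handles N = 0 by forcing U = 0 there
U = rng.gamma(shape=np.maximum(N, 1), scale=claim_mean) * (N > 0)

# Wald's identity: E[U] = E[N] * E[X] = (k*q/p) * claim_mean
expected = k * (1 - p) / p * claim_mean      # 4.5 * 10 = 45
```

The sample mean of `U` matches `expected`, which is the kind of moment check that precedes fitting such a model to observed period-wise claim totals.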
{Tk, k ≥ 1}, with Tk - Tk-1 distributed as Nk with support I+
• Uk = S(Tk) - S(Tk-1)
• The observed data are a realization of the process {S(Tn), n ≥ 1}, a process observed at random epochs
• On the basis of these data we wish to study the process {Sn, n ≥ 1}
• This requires identifiability of the random sampling scheme
• If {Sn} is modelled as a renewal process, identifiability of the random sampling scheme holds for any discrete distribution of Nn with support I+ (Teke & Deshmukh, 2008, SPL)
• If {Sn} is a general discrete parameter process, identifiability holds for the negative binomial distribution of Nn
• Strong completeness of the family of negative binomial distributions is used to prove identifiability
Inverse thinning of renewal and Cox processes:
• {Sn, n ≥ 1}: renewal process, with f(s) the L.T. of the increment distribution
• {Tn, n ≥ 1}: renewal process with support N, with p.g.f. P(s)
• Zn = S(Tn): again a renewal process, with L.T. g(s)
• g(s) = P(f(s)), so f(s) = P^{-1}(g(s)): inversion formula; {Zn} determines {Sn}
• gn(s): empirical L.T.; fn(s) = P^{-1}(gn(s))
• Renewal processes: P(s) geometric or shifted geometric
• Cox process setting: {Sn, n ≥ 1} a random walk, {S(t), t ≥ 0} a Levy process; P(s) geometric, Poisson and negative binomial, both truncated at zero; Bernstein and Stieltjes functions
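The inversion formula g(s) = P(f(s)) can be verified by Monte Carlo in the geometric case, where it reduces to the classical fact that a geometric sum of exponentials is again exponential. A sketch (the exponential increments and all parameter values are illustrative assumptions):

```python
import numpy as np

# Sketch of g(s) = P(f(s)) for geometric thinning: increments of S_n
# are Exp(mu) with L.T. f(s) = mu/(mu+s); T_n has geometric increments
# on {1, 2, ...} with p.g.f. P(z) = p*z/(1 - q*z).  Then
# Z_1 = S(T_1) has L.T. g(s) = P(f(s)) = p*mu/(p*mu + s),
# i.e. Z_1 ~ Exp(p*mu).  mu, p, s are illustrative values.
rng = np.random.default_rng(3)
mu, p, n = 2.0, 0.5, 200_000

N = rng.geometric(p, size=n)             # geometric on {1, 2, ...}
Z = rng.gamma(shape=N, scale=1 / mu)     # sum of N Exp(mu) increments

s = 1.0
g_empirical = np.exp(-s * Z).mean()      # empirical L.T. g_n(s)
g_theory = p * mu / (p * mu + s)         # = 0.5 with these values
```

Here `g_empirical` matches `g_theory`, and in the same spirit the empirical transform gn(s) plugged into P^{-1} recovers an estimate fn(s) of the increment transform of {Sn}.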
Work in progress
• {X(t), t ∈ T}: original process under study
• {Zn = X(Tn)}, with {Tn, n ≥ 1} a renewal process
G1 = G2 implies
E{F1(L1, L1+L2, …, L1+L2+…+Ln)} = E{F2(L1, L1+L2, …, L1+L2+…+Ln)},
i.e. E{F1(L1, L1+L2, …, L1+…+Ln) - F2(L1, L1+L2, …, L1+…+Ln)} = 0.
Lj: sum of kj iid random variables; if the joint distribution of (L1, L2, …, Ln) is complete, then G1 = G2 implies F1 = F2.
{f(x, k), k ∈ I+}: family of densities of L
Σ δ(1 - δ)^{k-1} ( ∫ h(x) f(x, k) dx ) = 0, the sum being taken over I+
⇔ ∫ h(x) Σ (1 - δ)^{k-1} f(x, k) dx = 0
⇔ ∫ h(x) A(x, δ) dx = 0, where A(x, δ) = Σ (1 - δ)^{k-1} f(x, k)
Can we conclude that h(x) = 0 a.s.?
References
1. Baba, Y. (1982). Maximum likelihood estimation of parameters in birth and death processes by Poisson sampling. J. Oper. Res. 15, 99-111.
2. Basawa, I.V. (1974). Maximum likelihood estimation of parameters in renewal and Markov renewal processes. Austral. J. Statist. 16, 33-43.
3. Cressie, N. A. C. (1993). Statistics for Spatial Data. Wiley, New York.
4. Deshmukh, S.R. (1991). Bernoulli sampling. Austral. J. Statist. 33, 167-176.
5. Deshmukh, S.R. (2000). Markov sampling. Aust. N. Z. J. Statist. 42(3), 337-345.
6. Deshmukh, S.R. (2003). Identifiable sampling design for spatial process. J. Ind. Statist. Assoc. 41(2), 261-274.
7. Deshmukh, S.R. (2005). Markov Arrivals See Time Averages. Stochastic Modelling and Applications 8(2), 1-20.
8. Kingman, J.F.C. (1963). Poisson counts for random sequences of events. Ann. Math. Statist. 34, 1217-1232.
9. Prakasa Rao, B.L.S. (1988). Statistical inference from sampled data for stochastic processes. Contemp. Math. 80, 249-284.
10. Prayag, V.R. & Deshmukh, S.R. (2000). Testing randomness of spatial pattern using Eberhardt's index. Environmetrics 11, 571-582.
11. Su, Y. and Cambanis, S. (1993). Sampling designs for estimation of a random process. Stochastic Process. Appl. 46, 47-89.
12. Teke, S.P. & Deshmukh, S.R. (2008). Inverse thinning of Cox and renewal processes. Statistics and Probability Letters 78, 2705-2708.