Alicia M. Sintes Universitat de les Illes Balears Pisa,1 June 2007

Data analysis searches for continuous gravitational waves

Content • Gravitational waves from spinning neutron stars. • Emission mechanisms • Signal model • Data analysis for continuous gravitational waves • Bayesian analysis • Frequentist approach • Brief overview of the searches • Directed pulsar search • Wide-parameter search (all sky) • Coherent methods • Einstein@Home • Hierarchical strategies • Semi-coherent methods • Summary of results and perspectives

Rotating neutron stars • Neutron stars can form from the remnant of stellar collapse • Typical size of 10 km, with a mass of about 1.4 solar masses • Some of these stars are observed as pulsars • Gravitational waves from neutron stars could tell us about the equation of state of dense nuclear matter • Our galaxy might contain $\sim 10^{9}$ NS, of which $\sim 10^{5}$ are expected to be active pulsars. Up to now ~1700 pulsars have been identified • Two kinds of search: search for observed neutron stars; all-sky search (computing challenge) (Image credit: NASA/CXC/SAO)

Neutron Stars Sources • Great interest in detecting radiation: physics of such stars is poorly understood. • After 40 years we still don’t know what makes pulsars pulse. • Interior properties not understood: equation of state, superfluidity, superconductivity, solid core, source of magnetic field. • May not even be neutron stars: could be made of strange matter!

“Continuous” gravitational waves from neutron stars • Various physical mechanisms could operate in a NS to produce interesting levels of GW emission. • As the signal strength is generally expected to be weak, long integration times (several days to years) are required for the signal to be detectable in the noise. This GW emission therefore has to last for a long time, which characterizes the class of ‘continuous-wave’ signals. • NS might also be interesting sources of burst-like GW emission, e.g. f-mode or p-mode oscillations excited by a glitch, or crustal torsion modes.

Emission mechanisms for continuous GW from spinning NS in the LIGO-VIRGO frequency band • Non-axisymmetric distortions • Unstable oscillation modes in the fluid part of the star • Free precession (Figure panels: Low Mass X-Ray Binaries; Wobbling Neutron Star; Bumpy Neutron Star; Magnetic mountains; R-modes in accreting stars)

1) Non-axisymmetric distortions (Figure panels: Bumpy Neutron Star; Magnetic mountains) • A non-axisymmetric neutron star at a distance $d$, rotating with frequency $\nu$ around the $I_{zz}$ axis, emits monochromatic GWs of frequency $f = 2\nu$ with an amplitude $h_0 = \frac{4\pi^2 G}{c^4}\,\frac{I_{zz} f^2}{d}\,\epsilon$, where $\epsilon = (I_{xx} - I_{yy})/I_{zz}$ is the equatorial ellipticity. • The strain amplitude $h_0$ refers to a GW from an optimally oriented source with respect to the detector. • The equatorial ellipticity is highly uncertain, $\sim 10^{-7}$; in the most speculative models it can reach up to $10^{-4}$. • Accreting neutron stars in binary systems can also have large crust deformations. • Strong internal magnetic fields could produce deformations of up to $\sim 10^{-6}$. These deformations would result in GW emission at $f = \nu$ and $f = 2\nu$.
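To get a feeling for the numbers on this slide, here is a minimal sketch that evaluates the standard quadrupole expression $h_0 = 4\pi^2 G\, I_{zz} f^2 \epsilon / (c^4 d)$ for a triaxial star. The moment of inertia $10^{38}$ kg m$^2$, the function name and the example source (ellipticity $10^{-6}$, 100 Hz, 1 kpc) are illustrative assumptions, not values taken from the talk.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
KPC = 3.0857e19    # one kiloparsec in metres

def strain_amplitude(ellipticity, f_gw, distance_m, i_zz=1e38):
    """Strain amplitude h0 of a triaxial star emitting at GW frequency f_gw.

    i_zz is an assumed canonical moment of inertia (kg m^2).
    """
    return 4 * math.pi**2 * G / C**4 * i_zz * f_gw**2 * ellipticity / distance_m

# Hypothetical source: ellipticity 1e-6, f = 100 Hz, distance 1 kpc.
h0_example = strain_amplitude(1e-6, 100.0, 1.0 * KPC)
print(f"h0 = {h0_example:.2e}")   # of order 1e-26
```

Because $h_0$ is linear in the ellipticity and quadratic in the frequency, the speculative $10^{-4}$ ellipticities would raise this example by two orders of magnitude.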

2) Non-axisymmetric instabilities (Figure panel: R-modes in accreting stars) • At birth or during accretion, rapidly rotating NS can be subject to various non-axisymmetric instabilities, which would lead to GW emission. • The r-mode instability has been proposed as a source of GWs (with frequency $f = 4\nu/3$) from newborn NS and from rapidly accreting NS. 3) Free precession (Figure panel: Wobbling Neutron Star) • Free precession results in emission at (approximately) the rotation rate $\nu$ and twice the rotation rate, i.e. $f = \nu + \nu_{\rm prec}$ and $f = 2\nu$.

Loudest expected signal • From isolated neutron stars (Blandford’s argument): although there is great uncertainty in the physics of the GW emission mechanisms and the strength of individual sources, one can argue for a statistical upper limit on the expected strongest GW signals from the galactic population of neutron stars, which is almost independent of individual source physics. $h_0 \sim 4\times10^{-24}$ is the optimistic upper limit, in the sense that there is a 50% chance that the strongest signal has at least that amplitude in the LIGO-VIRGO band, assuming NS are uniformly distributed in the galactic disc and a constant overall galactic birth-rate. • Spindown limit for known pulsars: if the GW emission is powered by the rotational energy of the NS, then one obtains $\epsilon \le \epsilon_{\rm sd}$, $h_0 \le h_{\rm sd}$.
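The spin-down limit can be made concrete with a short calculation. The sketch below assumes the usual energy-balance expressions, $h_{\rm sd} = \frac{1}{d}\sqrt{\frac{5 G I_{zz} |\dot\nu|}{2 c^3 \nu}}$ and the ellipticity that would produce that amplitude at $f = 2\nu$; the moment of inertia and the rounded Crab-like input values are illustrative assumptions, not numbers from the slides.

```python
import math

G = 6.674e-11      # m^3 kg^-1 s^-2
C = 2.998e8        # m/s
KPC = 3.0857e19    # m

def spindown_limit(nu, nu_dot, distance_m, i_zz=1e38):
    """Strain h_sd if all lost rotational energy went into GWs."""
    return math.sqrt(2.5 * G * i_zz * abs(nu_dot) / (C**3 * nu)) / distance_m

def spindown_ellipticity(nu, nu_dot, i_zz=1e38):
    """Ellipticity giving h0 = h_sd at f = 2*nu (the distance cancels)."""
    h_times_d = math.sqrt(2.5 * G * i_zz * abs(nu_dot) / (C**3 * nu))
    return h_times_d * C**4 / (4 * math.pi**2 * G * i_zz * (2 * nu)**2)

# Rounded, Crab-like values (assumed): nu ~ 29.8 Hz, nu_dot ~ -3.7e-10 Hz/s, d ~ 2 kpc.
h_sd = spindown_limit(29.8, -3.7e-10, 2.0 * KPC)
eps_sd = spindown_ellipticity(29.8, -3.7e-10)
print(f"h_sd ~ {h_sd:.1e}, eps_sd ~ {eps_sd:.1e}")
```

A search only starts to constrain the star once its upper limit on $h_0$ drops below $h_{\rm sd}$.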

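The signal from a NS • The ‘periodic’ GW signal from a neutron star is a nearly-monochromatic continuous signal, from: • spin precession at $\sim f_{\rm rot}$ • excited oscillatory modes such as the r-mode at $\tfrac{4}{3} f_{\rm rot}$ • non-axisymmetric distortion of crystalline structure, at $2 f_{\rm rot}$ • (Signal-to-noise)$^2 \sim h_0^2\,T/S_n(f)$, for observation time $T$ and detector noise power spectral density $S_n(f)$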

The expected signal at the detector A gravitational wave signal we detect from a NS will be: • Frequency modulated by relative motion of detector and source • Amplitude modulated by the motion of the non-uniform antenna sensitivity pattern of the detector

Signal received from an isolated NS • $h(t) = F_+(t,\psi)\,h_+(t) + F_\times(t,\psi)\,h_\times(t)$ • $F_+$ and $F_\times$ are the strain antenna patterns. They depend on the orientation of the detector and source and on the polarization of the waves. • The phase of the received signal depends on the initial phase, the frequency evolution of the signal and the instantaneous relative velocity between source and detector. • $T(t)$ is the time of arrival of a signal at the solar system barycenter, $t$ the time at the detector.
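The size of the frequency modulation can be estimated with the first-order Doppler relation $f_{\rm obs} \approx f_0\,(1 + \mathbf{v}\cdot\mathbf{n}/c)$. The following sketch uses approximate values for the Earth's orbital and equatorial rotation speeds and a hypothetical 1 kHz signal; the function names are illustrative.

```python
C = 299_792_458.0        # speed of light, m/s
V_ORBIT = 29.8e3         # Earth's mean orbital speed, m/s (approximate)
V_SPIN_EQUATOR = 465.0   # Earth's equatorial rotation speed, m/s (approximate)

def max_doppler_shift(f0, speed):
    """Largest first-order Doppler shift (Hz) for a detector moving at `speed`."""
    return f0 * speed / C

def bins_crossed(f0, speed, t_obs):
    """Maximum shift in units of the frequency resolution 1/t_obs."""
    return max_doppler_shift(f0, speed) * t_obs

f0 = 1000.0              # hypothetical signal frequency, Hz
t_obs = 10 * 3600.0      # ten hours of data, s
print(max_doppler_shift(f0, V_ORBIT))         # roughly 0.1 Hz
print(max_doppler_shift(f0, V_SPIN_EQUATOR))  # roughly 1.6e-3 Hz
print(bins_crossed(f0, V_ORBIT, t_obs))       # thousands of frequency bins
```

Since the maximum shift spans many frequency bins even for a ten-hour stretch, a search must track the sky-position-dependent modulation rather than look for a fixed spectral line.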

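Signal model: isolated non-precessing NS • $h_+(t) = h_0\,\frac{1+\cos^2\iota}{2}\,\cos\Phi(t)$, $h_\times(t) = h_0\,\cos\iota\,\sin\Phi(t)$ • $h_0$: amplitude of the gravitational wave signal • $\iota$: angle between the pulsar spin axis and the line of sight • In the case of an isolated tri-axial neutron star emitting at twice its rotational frequency, $h_0 = \frac{4\pi^2 G}{c^4}\,\frac{I_{zz} f^2}{d}\,\epsilon$, with $f = 2\nu$ the GW frequency, $d$ the distance and $\epsilon$ the equatorial ellipticity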

Signals in noise. The meaning of probability • The strain x(t) measured by a detector is mainly dominated by noise n(t), such that even in the presence of a signal h(t) we have x(t) = n(t) + h(t). • Data analysis methods are not just simple recipes. We want tools capable of: dealing with very faint sources; handling very large data sets; diagnosing systematic errors; avoiding unnecessary assumptions; estimating parameters and testing models. • There are different ways of proceeding depending on the paradigm of statistics used. There are three popular interpretations of the word ‘probability’: (1) probability as a measure of our degree of belief in a statement; (2) probability as a measure of the limiting relative frequency of outcome of a set of identical experiments; (3) probability as the fraction of favourable (equally likely) possibilities. We will call these the Bayesian, Frequentist and Combinatorial interpretations.

Algebra of (Bayesian) probability (Thomas Bayes, 1702 – 1761 AD) • If there are two statements X and Y, then the joint probability is $p(X,Y) = p(X|Y)\,p(Y)$, where the vertical line denotes the conditional statement “X given Y is true” – The Product Rule • …because p(X,Y) = p(Y,X), we get $p(Y|X) = \frac{p(X|Y)\,p(Y)}{p(X)}$, which is called Bayes’ Theorem. Here p(Y|X) is the Posterior, p(X|Y) the Likelihood, p(Y) the Prior and p(X) the Evidence. We can usually calculate all these terms. • Bayes’ theorem is the appropriate rule for updating our degree of belief when we have new data

Algebra of (Bayesian) probability • We can also deduce the marginal probabilities. If X and Y are propositions that can take on values drawn from $\{x_1, x_2, \dots\}$ and $\{y_1, y_2, \dots\}$, then $p(x_i) = \sum_j p(x_i, y_j)$ (using $\sum_j p(y_j|x_i) = 1$). This gives us the probability of X when we don’t care about Y. In these circumstances, Y is known as a nuisance parameter. • All these relationships can be smoothly extended from discrete probabilities to probability densities, e.g. $p(x) = \int p(x,y)\,dy$, where “p(y)dy” is the probability that y lies in the range y to y+dy.

Bayesian parameter estimation • In the Bayesian approach, we can test our model in the light of our data: $p(\theta\,|\,\mathrm{data}, I) \propto p(\mathrm{data}\,|\,\theta, I)\,p(\theta\,|\,I)$, i.e. Posterior (what we know now) ∝ Likelihood (influence of our observations) × Prior (what we knew before). • As our data improve (e.g. our sample increases), the posterior pdf narrows and becomes less sensitive to our choice of prior. • The posterior conveys our (evolving) degree of belief in different values of $\theta$, in the light of our data. • If we want to express our result as a single number we could perhaps adopt the mean, median, or mode. • We can use the variance of the posterior pdf to assign an uncertainty for $\theta$. • It is very straightforward to define confidence intervals.

Bayesian parameter estimation: Bayesian confidence intervals • We are 95% sure that $\theta$ lies between $\theta_1$ and $\theta_2$. • Note: the confidence interval is not unique, but we can define the shortest C.I. (Figure: posterior pdf $p(\theta\,|\,\mathrm{data}, I)$ with 95% of the area under the pdf lying between $\theta_1$ and $\theta_2$)
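A Bayesian interval of this kind can be read directly off a posterior evaluated on a grid. The sketch below is a toy case, not the analysis used in the searches: a single Gaussian measurement of a non-negative parameter with a uniform prior, from which a one-sided 95% bound (an upper limit) is computed. All names and numbers are illustrative assumptions.

```python
import math

def posterior_on_grid(grid, log_likelihood, prior):
    """Normalised posterior weights on an evenly spaced parameter grid."""
    weights = [math.exp(log_likelihood(t)) * prior(t) for t in grid]
    norm = sum(weights)
    return [w / norm for w in weights]

def upper_bound(grid, posterior, credibility=0.95):
    """Smallest grid value below which `credibility` of the posterior lies."""
    total = 0.0
    for theta, p in zip(grid, posterior):
        total += p
        if total >= credibility:
            return theta
    return grid[-1]

# Toy problem: one measurement x = 1.0 with unit Gaussian noise,
# parameter theta >= 0 with a uniform prior on [0, 10].
x_obs, sigma = 1.0, 1.0
grid = [i * 0.001 for i in range(10001)]
post = posterior_on_grid(
    grid,
    log_likelihood=lambda t: -0.5 * ((x_obs - t) / sigma) ** 2,
    prior=lambda t: 1.0,
)
limit_95 = upper_bound(grid, post)
print(f"95% of the posterior lies below theta = {limit_95:.2f}")
```

The same cumulative-sum idea gives a two-sided interval by locating the 2.5% and 97.5% points instead.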

Frequentist framework • Recall that in frequentist (orthodox) statistics, probability is the limiting relative frequency of outcome, so only random variables can have frequentist probabilities, as only these show variation with repeated measurement. So we can’t talk about the probability of a model parameter, or of a hypothesis. E.g., a measurement of a mass is a random variable, but the mass itself is not. • So no orthodox probabilistic statement can be interpreted as directly referring to the parameter in question! For example, orthodox confidence intervals do not indicate the range in which we are confident the parameter value lies. That’s what Bayesian intervals do.

Frequentist hypothesis testing – significance tests The method goes like this: • To test a hypothesis H1, consider another hypothesis, called the null hypothesis, H0, the truth of which would deny H1. Then argue against H0… • Use the data you have gathered to compute a test statistic $\theta_{\rm obs}$ which has a calculable pdf if H0 is true. This can be calculated analytically or by Monte Carlo methods. • Look where your observed value of the statistic lies in the pdf, and reject H0 based on how far in the wings of the distribution you have fallen. (Figure: $p(\theta|H_0)$ versus $\theta$, with the far tail marked “Reject H0 if your result lies in here”)

Frequentist hypothesis testing – significance tests • H0 is rejected at the x% level if x% of the probability lies to the right of the observed value of the statistic (or is ‘worse’ in some other sense): $p(\theta > \theta_{\rm obs}\,|\,H_0) = x\%$. This makes no reference to how improbable the value is under any alternative hypothesis (not even H1!). • We will choose a critical value $\theta^*$, so that if $\theta_{\rm obs} > \theta^*$ we reject H0 and therefore accept H1. (Figures: $p(\theta|H_0)$ with X% of the area to the right of $\theta_{\rm obs}$; $p(\theta|H_0)$ and $p(\theta|H_1)$ with the critical value $\theta^*$)

Frequentist hypothesis testing • A Type I error occurs when we reject the null hypothesis when it is true (false alarm) • A Type II error occurs when we accept the null hypothesis when it is false (false dismissal) • We should strive to minimise both. According to the Neyman-Pearson lemma, the optimal test (lowest false-dismissal probability at a fixed false-alarm probability) is the so-called likelihood ratio: $\Lambda(x; h) = \frac{P(x|h)}{P(x|0)}$ • For Gaussian noise, one finds $\ln\Lambda = (x|h) - \tfrac{1}{2}(h|h)$, where $(x|y)$ is the noise-weighted scalar product; this is the well-known expression for the matched-filtering amplitude. • If some of the parameters of the signal $h(t;\mathcal{A},\boldsymbol{\lambda})$ are unknown, one has to find the maximum of $\ln\Lambda$ as a function of the unknown parameters (Figure labels: Type II error; Type I error)
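The Gaussian-noise log-likelihood ratio $\ln\Lambda = (x|h) - \tfrac12 (h|h)$ is easy to evaluate for sampled data. The sketch below assumes white noise of known standard deviation, in which case the scalar product reduces to $(x|y) = \sum_k x_k y_k/\sigma^2$; that simplification and the toy sinusoidal template are assumptions made for illustration only.

```python
import math

def inner(x, y, sigma):
    """Scalar product (x|y) for white Gaussian noise of standard deviation sigma."""
    return sum(a * b for a, b in zip(x, y)) / sigma**2

def log_likelihood_ratio(x, h, sigma):
    """ln Lambda = (x|h) - (h|h)/2 for data x and template h."""
    return inner(x, h, sigma) - 0.5 * inner(h, h, sigma)

# Toy template: sinusoid of amplitude 0.5 at 0.1 cycles per sample, 100 samples.
template = [0.5 * math.sin(2 * math.pi * 0.1 * k) for k in range(100)]
sigma = 1.0

lnl_signal_only = log_likelihood_ratio(template, template, sigma)    # data = signal
lnl_no_data = log_likelihood_ratio([0.0] * 100, template, sigma)     # data = zeros
print(lnl_signal_only, lnl_no_data)
```

With noiseless data equal to the template the statistic is $+\tfrac12(h|h)$ and with no signal at all it is $-\tfrac12(h|h)$; real data scatter around these values with a spread set by $(h|h)$, which is why the squared signal-to-noise ratio is the relevant figure of merit.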

Calibrated output: LIGO noise history Integration times S1 - L1 5.7 days, H1 8.7 days, H2 8.9 days S2 - L1 14.3 days, H1 37.9 days, H2 28.8 days S3 - L1 13.4 days, H1 45.5 days, H2 42.1 days S4 - L1 17.1 days, H1 19.4 days, H2 22.5 days S5 (so far...) >1 year

Calibrated output: GEO noise history

Continuous wave searches • Signal parameters: position (may be known), inclination angle, [orbital parameters in case of a NS in a binary system], polarization, amplitude, frequency (may be known), frequency derivative(s) (may be known), initial phase. • Most sensitive method: coherently correlate the data with the expected signal (template), weighting inversely by the noise. If the signal were monochromatic this would be equivalent to a FT. • Templates: we assume various sets of unknown parameters and correlate the data against these different wave-forms. • Good news: we do not have to search explicitly over polarization, inclination, initial phase and amplitude. • Because of the antenna pattern, we are sensitive to all the sky. Our data stream has signals from all over the sky all at once. However, low signal-to-noise is expected, so confusion from many sources overlapping each other is not a concern. • Input data to our analyses: a calibrated data stream which, with better than 10% accuracy, is a measure of the GW excitation of the detector. The sampling rate is 16 kHz (LIGO, GEO; 20 kHz for VIRGO), but since the high sensitivity range is 40-1500 Hz we can downsample to 3 kHz.

Four neutron star populations and searches • Known pulsars • Position & frequency evolution known (including derivatives, timing noise, glitches, orbit) • Unknown neutron stars • Nothing known, search over position, frequency & its derivatives • Accreting neutron stars in low-mass x-ray binaries • Position known, sometimes orbit & frequency • Known, isolated, non-pulsing neutron stars • Position known, search over frequency & derivatives • What searches? • Targeted searches for signals from known pulsars • Blind searches of previously unknown objects • Coherent methods (require accurate prediction of the phase evolution of the signal) • Semi-coherent methods (require prediction of the frequency evolution of the signal) • What drives the choice? The computational expense of the search

Coherent detection methods There are essentially two types of coherent searches that are performed: • Frequency domain: conceived as a module in a hierarchical search. Matched filtering techniques, aimed at computing a detection statistic. These methods have been implemented in the frequency domain (although this is not necessary) and are very computationally efficient. Best suited for large parameter space searches (when signal characteristics are uncertain). Frequentist approach used to cast upper limits. • Time domain: process the signal to remove frequency variations due to the Earth’s motion around the Sun and spindown. Standard Bayesian analysis, as fast numerically but provides natural parameter estimation. Best suited to target known objects, even if the phase evolution is complicated. Efficiently handles missing data. Upper limits interpretation: Bayesian approach.

Summary of directed pulsar searches • Within the LIGO sensitive band ($\nu_{\rm gw}$ > 50 Hz) there are currently 163 known pulsars • We have rotational and positional parameter information for 97 of these • S1 (LIGO and GEO: separate analyses) • Upper limit set for GWs from J1939+2134 ($h_0 < 1.4\times10^{-22}$) • Phys. Rev. D 69, 082004 (2004) • S2 science run (LIGO: 3 interferometers coherently, TDS) • End-to-end validation with 2 hardware injections • Upper limits set for GWs from 28 known isolated pulsars • Phys. Rev. Lett. 94, 181103 (2005) • S3 & S4 science runs (LIGO and GEO: up to 4 interferometers coherently, TDS) • Additional hardware injections in both GEO and LIGO • Add known binary pulsars to targeted search • Full results with total of 93 (33 isolated, 60 binary) pulsars • S5 science run (ongoing, TDS) • 32 known isolated, 41 in binaries, 29 in globular clusters

Frequency domain method • The outcome of a target search is a number F* that represents the optimal detection statistic for this search. • 2F* is a random variable: for Gaussian stationary noise, it follows a $\chi^2$ distribution with 4 degrees of freedom and a non-centrality parameter $\lambda = (h|h)$. Fixing $\iota$, $\psi$ and $\phi_0$, for every $h_0$ we can obtain a pdf curve $p(2F|h_0)$. • The frequentist approach says the data will contain a signal with amplitude $\ge h_0$, with confidence C, if in repeated experiments some fraction C of trials would yield a value of the detection statistic $\ge F^*$. • Use signal injection Monte Carlos to measure the Probability Distribution Function (PDF) of F

Measured PDFs for the F statistic with fake injected worst-case signals at nearby frequencies (S1). Four panels, each showing the measured pdf of 2F with the observed value 2F* and the 95% region marked: • h0 = 1.9E-21; 2F* = 1.5, chance probability 83% • h0 = 2.7E-22; 2F* = 3.6, chance probability 46% • h0 = 5.4E-22; 2F* = 6.0, chance probability 20% • h0 = 4.0E-22; 2F* = 3.4, chance probability 49% Note: hundreds of thousands of injections were needed to get such nice clean statistics!
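The chance probabilities quoted with each 2F* can be checked against the noise-only distribution. In Gaussian noise without a signal, 2F follows a central $\chi^2$ distribution with 4 degrees of freedom, whose tail probability has the closed form $P(2F \ge x) = (1 + x/2)\,e^{-x/2}$. A minimal sketch (the function name is illustrative):

```python
import math

def chance_probability(two_f):
    """P(2F >= two_f) for a central chi-squared distribution with 4 dof."""
    return (1.0 + two_f / 2.0) * math.exp(-two_f / 2.0)

for two_f_star in (1.5, 3.6, 6.0, 3.4):
    percent = round(100 * chance_probability(two_f_star))
    print(f"2F* = {two_f_star}: chance probability {percent}%")
# prints 83%, 46%, 20% and 49%
```

All four observed values are therefore entirely consistent with noise, which is why the results are reported as upper limits.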

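F-statistics We can express h(t) in terms of the amplitude parameters $\mathcal{A} = \{A_+, A_\times, \psi, \phi_0\}$ and the Doppler parameters $\boldsymbol{\lambda}$: $h(t;\mathcal{A},\boldsymbol{\lambda}) = \sum_{\mu=1}^{4} \mathcal{A}^\mu\, h_\mu(t;\boldsymbol{\lambda})$, where the four amplitudes $\mathcal{A}^\mu$ are functions of $A_+$, $A_\times$, $\psi$ and $\phi_0$ only.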

F-Statistics • Analytically maximize the likelihood over A
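A sketch of the maximization step, following the standard F-statistic derivation (Jaranowski, Królak and Schutz) and assuming the signal is linear in four amplitudes, $h = \sum_\mu \mathcal{A}^\mu h_\mu$. The shorthand $x_\mu$ and $\mathcal{M}_{\mu\nu}$ is introduced here for illustration; repeated indices are summed.

```latex
\begin{align}
  x_\mu &\equiv (x|h_\mu), \qquad
  \mathcal{M}_{\mu\nu} \equiv (h_\mu|h_\nu), \\
  \ln\Lambda &= (x|h) - \tfrac{1}{2}(h|h)
             = \mathcal{A}^\mu x_\mu
               - \tfrac{1}{2}\,\mathcal{A}^\mu \mathcal{M}_{\mu\nu} \mathcal{A}^\nu, \\
  \frac{\partial \ln\Lambda}{\partial \mathcal{A}^\mu} = 0
  \;&\Rightarrow\;
  \mathcal{A}^\mu_{\rm ML} = \mathcal{M}^{\mu\nu} x_\nu,
  \qquad \mathcal{M}^{\mu\alpha}\mathcal{M}_{\alpha\nu} = \delta^\mu_\nu, \\
  2\mathcal{F} &\equiv 2\ln\Lambda\big|_{\mathcal{A}_{\rm ML}}
               = x_\mu\, \mathcal{M}^{\mu\nu} x_\nu .
\end{align}
```

Because the four amplitudes are eliminated analytically, the numerical search only has to cover the Doppler parameters (sky position, frequency and spindown).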

Time domain target search • Time-domain data are successively heterodyned to reduce the sample rate and take account of pulsar spin-down and Doppler shift: • Coarse stage (fixed frequency): 16384 → 4 samples/sec • Fine stage (Doppler & spin-down correction): → 1 sample/min, giving $B_k$ • The data are low-pass filtered at each step and down-sampled via averaging, yielding one value $B_k$ of the complex time series every 60 seconds • The noise level $\sigma_k$ is estimated from the variance of the data over each minute to account for non-stationarity • Standard Bayesian parameter fitting problem, using a time-domain model for the signal: a function of the unknown source parameters $h_0$, $\iota$, $\psi$ and $\phi_0$
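The heterodyne-and-average idea can be illustrated with a toy version of the coarse stage: multiply the data by $e^{-2\pi i f_{\rm het} t}$ so that the signal moves to near zero frequency, then average over blocks, which acts as a crude low-pass filter and reduces the sample rate. This sketch ignores the Doppler and spin-down phase corrections of the fine stage, and the sample rate, block length and function name are illustrative assumptions.

```python
import cmath
import math

def heterodyne_and_average(samples, sample_rate, f_het, block_seconds):
    """Mix real samples down by f_het and return one complex average per block."""
    n_block = int(round(sample_rate * block_seconds))
    mixed = [
        x * cmath.exp(-2j * math.pi * f_het * k / sample_rate)
        for k, x in enumerate(samples)
    ]
    return [
        sum(mixed[i:i + n_block]) / n_block
        for i in range(0, len(mixed) - n_block + 1, n_block)
    ]

# Toy data: noiseless cosine of amplitude 2 at exactly 100 Hz, 4 s at 1024 Hz.
rate, f_signal, amplitude = 1024.0, 100.0, 2.0
data = [amplitude * math.cos(2 * math.pi * f_signal * k / rate) for k in range(4096)]

b_k = heterodyne_and_average(data, rate, f_het=f_signal, block_seconds=1.0)
print(b_k)   # four values, each close to amplitude / 2 = 1
```

Each averaged value carries half the signal amplitude as a slowly varying complex number, while the image at twice the heterodyne frequency averages away; in the real pipeline the residual slow variation encodes the antenna pattern and any phase mismatch.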

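Time domain: Bayesian approach • We take a Bayesian approach and determine the joint posterior probability distribution of our unknown parameters $\mathbf{a} = (h_0, \cos\iota, \psi, \phi_0)$, using uniform priors over their accessible values, i.e. $p(\mathbf{a}\,|\,\{B_k\}) \propto p(\mathbf{a})\,p(\{B_k\}\,|\,\mathbf{a})$ (posterior ∝ prior × likelihood) • The likelihood is $\propto \exp(-\chi^2/2)$, where $\chi^2(\mathbf{a}) = \sum_k |B_k - y(t_k;\mathbf{a})|^2/\sigma_k^2$, with $y(t_k;\mathbf{a})$ the signal model and $\sigma_k$ the noise estimate for $B_k$ • To get the posterior PDF for $h_0$, we marginalize with respect to the nuisance parameters $\cos\iota$, $\psi$ and $\phi_0$ given the data $B_k$: $p(h_0\,|\,\{B_k\}) \propto \iiint e^{-\chi^2/2}\,d\cos\iota\,d\psi\,d\phi_0$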

Posterior PDFs for CW time domain analyses (Figure: posterior pdfs p for a simulated injection at $2.2\times10^{-21}$; shaded area = 95% of total area)

Progression of targeted pulsars upper limits • Amplitudes of < $10^{-25}$ and ellipticities < $10^{-6}$ for many objects • Our most stringent ellipticities ($10^{-7}$) are starting to reach into the range of neutron star structures for some neutron-proton-electron models (Figure annotation: Crab pulsar)

Preliminary S5 results (APS Meeting, Jacksonville, April 2007) • Used parameters provided by the Pulsar Group, Jodrell Bank • Joint 95% upper limits from the first ~13 months of S5 using H1, H2 and L1 (97 pulsars) • Due to pulsar glitches, the Crab pulsar result uses data up to the glitch on 23 Aug 2006, and the PSR J0537-6910 result uses only three months of data between two glitches on 5 May and 4 Aug 2006 • Lowest $h_0$ upper limit: PSR J1435-6100 ($\nu_{\rm gw}$ = 214.0 Hz, r = 3.3 kpc), $h_{0,\min} = 4.2\times10^{-26}$ • Lowest ellipticity upper limit: PSR J2124-3358 ($\nu_{\rm gw}$ = 405.6 Hz, r = 0.25 kpc), $\epsilon = 9.6\times10^{-8}$

Preliminary results (APS Meeting, Jacksonville, April 2007) • Black curve represents one full year of data for all three interferometers running at design sensitivity • Blue stars represent pulsars for which we are reasonably confident of having phase coherence with the signal model • Green stars represent pulsars for which there is uncertainty about phase coherence

Blind searches and coherent detection methods • Coherent methods are the most sensitive methods (amplitude SNR increases with sqrt of observation time) but they are the most computationally expensive, why? • Our templates are constructed based on different values of the signal parameters (e.g. position, frequency and spindown) • The parameter resolution increases with longer observations • Sensitivity also increases with longer observations • As one increases the sensitivity of the search, one also increases the number of templates one needs to use.

Number of templates The number of templates grows dramatically with the coherent integration time baseline and the computational requirements become prohibitive [Brady et al., Phys.Rev.D57 (1998)2101]

S2 run: Coherent search for unknown isolated sources and Sco-X1 (see gr-qc/0605028) • Entire sky search: fully coherent matched filtering; 160 to 728.8 Hz; df/dt < $4\times10^{-10}$ Hz/s; 10 hours of S2 data; computationally intensive; 95% confidence upper limits on the GW strain amplitude range from $6.6\times10^{-23}$ to $1.0\times10^{-21}$ across the frequency band • Scorpius X-1: fully coherent matched filtering; 464 to 484 Hz and 604 to 624 Hz; df/dt < $1\times10^{-9}$ Hz/s; 6 hours of S2 data; 95% confidence upper limits on the GW strain amplitude range from $1.7\times10^{-22}$ to $1.3\times10^{-21}$ across the two 20 Hz wide frequency bands

Computational Engine Searches run offline at: • Medusa, Nemo clusters (UWM) • Merlin, Morgane clusters (AEI) • Tsunami (B’ham) • Others

Einstein@home http://einstein.phys.uwm.edu/ • Like SETI@home, but for LIGO/GEO data • American Physical Society (APS) publicized as part of World Year of Physics (WYP) 2005 activities • Use infrastructure/help from SETI@home developers for the distributed computing parts (BOINC) • Goal: pulsar searches using ~1 million clients. Support for Windows, Mac OSX, Linux clients • From our own clusters we can get ~ thousands of CPUs. From Einstein@home hope to get order(s) of magnitude more at low cost • Currently : ~140,000 active users corresponding to about 80Tflops

Coherent searches on Einstein@home • Public distributed computing project to look for isolated pulsars in LIGO/GEO data, ~80 TFlops 24/7 • Initially making use of the coherent F-statistic method • S3: no spindown • Using segment lengths of 10 hours • No evidence of strong pulsar signals • Outliers are consistent with instrumental artifacts or bad bands. None of the remaining low-significance candidates showed up in follow-up on S4 data. • S4: one spindown parameter, up to f/fdot ~ 10,000 yr • Using segment lengths of 30 hours • Analysis took ~6 months • Post-processing complete. Currently under review. • S5 (R1): similar to S4 • Faster, more efficient application • Currently in post-processing stage

User/Credit History http://www.boincsynergy.com/stats/

Current performance http://www.boincstats.com/ Einstein@Home is currently getting 84 Tflops

All-Sky surveys for unknown gravity-wave emitting pulsars It is necessary to search for every signal template distinguishable in parameter space. Number of parameter points required for a coherent $T = 10^{7}$ s search [Brady et al., Phys.Rev.D57 (1998)2101]: the number of templates grows dramatically with the integration time. To search this much parameter space coherently, with the optimum sensitivity that can be achieved by matched filtering, is computationally prohibitive.

Coherent wide-parameter searches • The second effect of the large number of templates Np is to reduce the sensitivity compared to a targeted search with the same observation time and false-alarm probability: increasing the number of templates increases the number of expected false-alarm candidates at fixed detection threshold. Therefore the detection-threshold needs to be raised to maintain the same false-alarm rate, thereby decreasing the sensitivity. • Note that increasing the number of equal-sensitivity detectors N improves the SNR in the same way as increasing the integration time Tobs. However, increasing the number of detectors N does — contrary to the observation time Tobs — not increase the required number of templates Np, which makes this the computationally cheapest way to improve the SNR of coherent wide-parameter searches.
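The loss of sensitivity from a large template bank can be quantified in a simplified setting. The sketch below assumes $N_p$ statistically independent templates, each yielding a 2F value that follows a central $\chi^2$ distribution with 4 degrees of freedom in pure noise, and solves for the threshold that keeps the expected number of false alarms fixed. Real template banks are correlated, so this is only an illustration; the function names are hypothetical.

```python
import math

def chance_probability(two_f):
    """P(2F >= two_f) in pure Gaussian noise (central chi-squared, 4 dof)."""
    return (1.0 + two_f / 2.0) * math.exp(-two_f / 2.0)

def threshold_for(expected_false_alarms, n_templates):
    """2F threshold giving the requested expected number of false alarms.

    Solves n_templates * P(2F >= threshold) = expected_false_alarms by
    bisection, using the fact that the tail probability decreases with 2F.
    """
    target = expected_false_alarms / n_templates
    low, high = 0.0, 1000.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        if chance_probability(mid) > target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

for n_p in (1, 10**6, 10**12):
    print(f"Np = {n_p:.0e}: 2F threshold = {threshold_for(0.01, n_p):.1f}")
```

The threshold grows only logarithmically with $N_p$, but since a signal must exceed it to be detected, the smallest detectable amplitude rises with every increase in the size of the bank.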

Hierarchical strategies (Flow-chart elements: raw data GEO/LIGO; pre-processing; divide the data set in N chunks; construct set of short FT ($t_{\rm SFT}$); coherent search ($\alpha$, $\delta$, $f_i$) in a frequency band; template placing; candidates selection; peak selection in t-f plane; incoherent search: Hough transform ($\alpha$, $\delta$, $f_0$, $f_i$); candidates selection; set upper-limit)

Hierarchical method for ‘blind’ searches • The procedure involves two or more data sets belonging to one or more detectors (Flow chart, with the same chain applied to each data set: h-reconstructed data; data quality; SFDB; average spectrum estimation; peak map; Hough transform; candidates. The candidates from the data sets are then checked for coincidences, followed by a coherent step that yields events)
