## Matched Filtering and Chi Squared Veto

When the expected signal is known in advance and the noise is Gaussian and stationary, the optimal linear search algorithm is matched filtering [79]. The idea behind matched filtering is to take the signal, and data segments of the same length as the signal, and treat them as members of a vector space. As with any two vectors in a vector space, the degree to which the signal and a data vector overlap is calculated using an inner product.

Footnote 5: The detectors in Washington are somewhat misaligned with the detector in Louisiana due to the curvature of the Earth.

To be more precise, consider the detector strain $s(t)$ and a signal $h(t)$ that lasts for a duration $T$. If the signal arrives at the detector at time $t_0$, then the detector strain can be written

$$s(t) = h(t - t_0) + n(t), \qquad (10)$$

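where $n(t)$ is the detector noise. For this paragraph, we will assume for simplicity that, apart from being stationary and Gaussian, $n(t)$ is white (same average power at all frequencies). Then the matched filter output, $Z(t)$, is given by

$$Z(t) = \int_0^T h(t')\, s(t + t')\, dt', \qquad Z(t_0) = \int_0^T h^2(t')\, dt' + \int_0^T h(t')\, n(t_0 + t')\, dt'. \qquad (11)$$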

Let us denote the first and second integrals in (11) by $I_1$ and $I_2(t_0)$, respectively. Clearly, the integrand of $I_1$ is deterministic and positive everywhere. The integrand of $I_2$, however, is stochastic, and the average of $I_2$ over all noise realizations vanishes. In other words, on average $Z(t_0) = I_1$ when there is a signal starting at time $t_0$. On the other hand, when there is no signal, $Z(t) = I_2(t)$. Denoting the standard deviation of $I_2$ over all noise realizations by $\sigma$, we define the signal-to-noise ratio (SNR) for the data to be

$$\rho(t) := |Z(t)|/\sigma. \qquad (12)$$

Clearly, at time $t_0$ the expected value of the SNR is $\rho(t_0) = I_1/\sigma$. Thus, if the signal is strong enough that $I_1$ is several times larger than $\sigma$, there is a high statistical confidence that it can be detected.
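The white-noise matched filter described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code of any actual search: the integral is replaced by a discrete sum, and the toy chirp, the sample counts, and the names `matched_filter_snr`, `noise_std`, and `t0` are all hypothetical choices made here.

```python
import numpy as np

def matched_filter_snr(s, h, noise_std):
    """SNR time series |Z|/sigma for a template h in white Gaussian noise."""
    # Z[k] = sum_j h[j] * s[k + j]: slide the template along the data.
    Z = np.correlate(s, h, mode="valid")
    # With independent samples of standard deviation noise_std, the
    # noise-only output has standard deviation noise_std * sqrt(sum h^2).
    sigma = noise_std * np.sqrt(np.sum(h ** 2))
    return np.abs(Z) / sigma

# Hypothetical toy data: a short chirp injected at sample t0 into white noise.
rng = np.random.default_rng(0)
t = np.arange(256)
h = 2.0 * np.sin(2 * np.pi * (0.01 + 0.0002 * t) * t)
t0 = 1500
s = rng.normal(0.0, 1.0, 4096)
s[t0:t0 + h.size] += h

rho = matched_filter_snr(s, h, noise_std=1.0)
print("loudest sample:", int(np.argmax(rho)), "SNR:", float(rho.max()))
print("expected SNR at t0 (I1/sigma):", float(np.sqrt(np.sum(h ** 2))))
```

In this discretization the expected SNR at the arrival time reduces to the square root of the summed squared template divided by the per-sample noise standard deviation, so the loudest sample should fall at or very near `t0` whenever that quantity is well above the noise-only fluctuations.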

In practice, it is preferable to implement the matched filter in the frequency domain. Thus, rather than a stretch of data $s(t)$, one analyzes its Fourier transform

$$\tilde{s}(f) = \int_{-\infty}^{\infty} s(t)\, e^{-2\pi i f t}\, dt, \qquad (13)$$

where $f$ labels frequencies. This has several advantages. First, it makes the non-white noise spectrum of interferometers (cf. Fig. 3) easier to handle. Second, it allows the use of the stationary phase approximation to the restricted post-Newtonian waveform [1,80], which is much less computationally intensive to calculate, and accurate enough for detection [81]. Third, it allows one to easily deal with one of the search parameters, the unknown phase at which the signal enters the detector's band.

In the frequency domain, the matched filter is complex and takes the form

$$z(t) = x(t) + i\,y(t) = 4 \int_0^{\infty} \frac{\tilde{s}(f)\, \tilde{h}^*(f)}{S_n(f)}\, e^{2\pi i f t}\, df, \qquad (14)$$

where $S_n(f)$ is the one-sided noise strain power spectral density of the detector and the $*$ superscript denotes complex conjugation. It can be shown that the variance of the matched filter due to noise is

$$\sigma^2 = 4 \int_0^{\infty} \frac{|\tilde{h}(f)|^2}{S_n(f)}\, df. \qquad (15)$$

Note that $\sigma$ and $z(t)$ are both linear in their dependence on the signal template $h$. This means that the SNR is independent of an overall scaling of $h(t)$, which in turn means that a single template can be used to search for signals from the same source at any distance. Also, a difference of initial phase between the signal and the template manifests itself as a change in the complex phase of $z(t)$. Thus, the SNR, which depends only on the magnitude of the matched filter output, is insensitive to phase differences between the signal and the template.
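The frequency-domain filter and its insensitivity to the template's overall scale can be illustrated with a discretized sketch. The function name `matched_filter_fd`, the toy chirp, the sampling rate, and the made-up noise curve below are assumptions for illustration only; the normalization follows the standard one-sided convention, in which the noise scale is $\sigma^2 = 4\int_0^\infty |\tilde{h}(f)|^2 / S_n(f)\, df$.

```python
import numpy as np

def matched_filter_fd(s, h, psd, dt):
    """Discretized complex matched filter z(t) and its noise scale sigma.

    s, h : real arrays of equal, even length n (template zero-padded,
           starting at index 0)
    psd  : one-sided noise PSD sampled at np.fft.rfftfreq(n, dt)
    """
    n = s.size
    df = 1.0 / (n * dt)
    s_f = np.fft.rfft(s) * dt   # approximates the continuous Fourier transform
    h_f = np.fft.rfft(h) * dt
    k = np.arange(1, n // 2)    # positive frequencies, excluding DC and Nyquist
    integrand = np.zeros(n, dtype=complex)
    integrand[k] = s_f[k] * np.conj(h_f[k]) / psd[k]
    # np.fft.ifft carries a 1/n factor and the e^{+2 pi i f t} kernel.
    z = 4.0 * df * n * np.fft.ifft(integrand)
    sigma = np.sqrt(4.0 * df * np.sum(np.abs(h_f[k]) ** 2 / psd[k]))
    return z, sigma

# Hypothetical toy setup: 4 s of data at 1024 Hz, a 1 s linear chirp template,
# and a made-up noise curve that rises steeply below 30 Hz.
dt, n = 1.0 / 1024, 4096
t = np.arange(n) * dt
h = np.zeros(n)
h[:1024] = np.sin(2 * np.pi * (40.0 + 30.0 * t[:1024]) * t[:1024])
f = np.fft.rfftfreq(n, dt)
psd = 1e-3 * (1.0 + (30.0 / np.maximum(f, f[1])) ** 4)

# Noise-free injection at sample 2000, 2.5 times the template amplitude.
s = 2.5 * np.roll(h, 2000)
z1, sigma1 = matched_filter_fd(s, h, psd, dt)
z2, sigma2 = matched_filter_fd(s, 10.0 * h, psd, dt)   # rescaled template
rho1, rho2 = np.abs(z1) / sigma1, np.abs(z2) / sigma2
print(int(np.argmax(rho1)), float(rho1.max()), float(rho2.max()))
```

Rescaling the template by a factor of ten multiplies both $z(t)$ and $\sigma$ by ten, so the two SNR time series coincide. The injection is deliberately noise-free so that the normalization can be checked directly; adding noise would require samples drawn with the same spectrum as `psd`.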

Equations (14-16) tell us how to look for a signal if we know which signal to look for. However, in practice, we wish to look for signals from any neutron star binary in the last minutes before coalescence. Because, as mentioned above, finite-size effects are irrelevant, a single waveform covers all possible equations of state for the neutron stars. Likewise, as stated above, the spinless waveform will find binaries of neutron stars with any physically allowable spin. Further, as just discussed, a single template covers all source distances and initial signal phases. However, a single template does not cover all neutron star binaries because it does not cover all masses of neutron stars.

Population synthesis models for neutron star binaries indicate that masses may span a range as large as $\sim$1-3 $M_\odot$. Since mass is a continuous parameter, it is not possible to search at every possible mass for each of the neutron stars in the binaries. However, if a signal is "close enough" to a template, the loss of SNR will be small. Thus, by using an appropriate set of templates, called a template bank, one can cover all masses in the 1-3 $M_\odot$ range with some predetermined maximum loss in SNR [82,83]. The smaller the maximum loss in SNR, the larger the number of templates needed in the bank. Typically, searches implement a template bank with a maximum SNR loss of 3%, which leads to template banks containing on the order of a few hundred templates. The exact number depends on the noise spectrum, because both $z(t)$ and $\sigma$ do, so the number of templates can change from epoch to epoch.
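The notion of a signal being "close enough" to a template can be made concrete by computing the normalized overlap between two waveforms, maximized over relative arrival time and phase. The sketch below assumes white noise for brevity and uses a hypothetical one-parameter chirp family, `toy_chirp`, whose chirp rate merely stands in for the binary masses; it is not a post-Newtonian waveform, and it does not implement any particular bank placement algorithm.

```python
import numpy as np

def match(h1, h2):
    """Normalized overlap of two real templates, maximized over relative
    time shift and phase, assuming white noise (flat PSD)."""
    n = h1.size
    a, b = np.fft.rfft(h1), np.fft.rfft(h2)
    k = np.arange(1, n // 2)    # positive frequencies only
    integrand = np.zeros(n, dtype=complex)
    integrand[k] = a[k] * np.conj(b[k])
    z = np.fft.ifft(integrand)  # overlap at every (circular) time shift
    norm = np.sqrt(np.sum(np.abs(a[k]) ** 2) * np.sum(np.abs(b[k]) ** 2)) / n
    return float(np.abs(z).max() / norm)

def toy_chirp(rate, n=4096, dt=1.0 / 1024):
    """Hypothetical one-parameter family; `rate` (Hz/s) stands in for mass."""
    t = np.arange(n) * dt
    h = np.zeros(n)
    m = n // 4
    h[:m] = np.sin(2 * np.pi * (40.0 + 0.5 * rate * t[:m]) * t[:m])
    return h

reference = toy_chirp(60.0)
for rate in (60.0, 60.5, 61.0, 62.0, 64.0):
    overlap = match(reference, toy_chirp(rate))
    print(f"rate={rate:5.1f}  match={overlap:.4f}  SNR loss={1.0 - overlap:.4f}")
```

A bank with a 3% maximum SNR loss would space neighbouring templates so that any signal in the covered range has a match of at least 0.97 with some template in the bank.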

In terms of $z(t)$, the SNR is given by $\rho(t) = |z(t)|/\sigma$.

When the noise is stationary and Gaussian, matched filtering alone gives the best probability of detecting a signal (given a fixed false alarm rate). However, as mentioned earlier, gravitational wave interferometer noise generically contains noise bursts, or glitches, which provide a substantial noise background for the detection of binary inspirals. It is possible for strong glitches to cause substantial portions of the template bank to simultaneously yield high SNR values. It is therefore highly desirable to have some other way of distinguishing the majority of glitches from true signals.

The method which has become standard for this is the chi-squared ($\chi^2$) veto [84]. When a template exceeds the trigger threshold in SNR, it is divided into $p$ different frequency bands such that each band should yield $1/p$ of the total SNR of the data if the high-SNR event were a signal matching the template. The sum of the squares of the differences between the expected SNR and the actual SNR from each of the $p$ bands, that is, the $\chi^2$ statistic, is then calculated. The advantage of the $\chi^2$ veto is that glitches tend to produce large (low probability) $\chi^2$ values, and are therefore distinguishable from real signals. Thus, only those template matches with low enough $\chi^2$ values are considered triggers.

If the data were a matching signal in Gaussian noise, the $\chi^2$ statistic would be $\chi^2$-distributed with $2p - 2$ degrees of freedom [84]. However, it is much more likely that the template that produces the highest SNR will not be an exact match for the signal. In this case, denoting the fractional loss in SNR due to mismatch by $\mu$, the statistic is distributed as a non-central chi-squared, with non-centrality parameter $\lambda < 2\rho^2\mu$. This simply means that the $\chi^2$ threshold, $\chi^2_*$, depends quadratically on the measured SNR, $\rho$, as well as linearly on $\mu$.
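A discretized sketch of the band-splitting procedure may help. It assumes one common normalization, $\chi^2 = (p/\sigma^2)\sum_l |z_l - z/p|^2$, where $z_l$ is the matched filter output restricted to band $l$; with this choice the statistic has $2p - 2$ degrees of freedom for a matching signal in Gaussian noise. The function name `chi2_veto`, the toy chirp, the made-up noise curve, and the delta-function "glitch" are hypothetical.

```python
import numpy as np

def chi2_veto(s, h, psd, dt, p):
    """Return z(t), sigma and the p-band chi-squared time series.

    s, h : real arrays of equal, even length n (template starts at index 0)
    psd  : one-sided noise PSD sampled at np.fft.rfftfreq(n, dt)
    """
    n = s.size
    df = 1.0 / (n * dt)
    s_f = np.fft.rfft(s) * dt
    h_f = np.fft.rfft(h) * dt
    k = np.arange(1, n // 2)                       # positive frequencies
    power = 4.0 * df * np.abs(h_f[k]) ** 2 / psd[k]
    sigma2 = power.sum()
    # Assign each frequency bin to one of p bands so that every band holds
    # (approximately) the same share, sigma^2 / p, of the template power.
    share = np.cumsum(power) / sigma2
    band = np.minimum((share * p).astype(int), p - 1)
    z_bands = np.zeros((p, n), dtype=complex)
    for l in range(p):
        sel = k[band == l]
        integrand = np.zeros(n, dtype=complex)
        integrand[sel] = s_f[sel] * np.conj(h_f[sel]) / psd[sel]
        z_bands[l] = 4.0 * df * n * np.fft.ifft(integrand)
    z = z_bands.sum(axis=0)
    chi2 = (p / sigma2) * np.sum(np.abs(z_bands - z / p) ** 2, axis=0)
    return z, np.sqrt(sigma2), chi2

# Hypothetical toy setup (noise-free, to isolate the behaviour of the veto).
dt, n, p = 1.0 / 1024, 4096, 8
t = np.arange(n) * dt
h = np.zeros(n)
h[:1024] = np.sin(2 * np.pi * (40.0 + 30.0 * t[:1024]) * t[:1024])
f = np.fft.rfftfreq(n, dt)
psd = 1e-3 * (1.0 + (30.0 / np.maximum(f, f[1])) ** 4)

signal = 2.5 * np.roll(h, 2000)   # matching signal arriving at sample 2000
glitch = np.zeros(n)
glitch[2000] = 50.0               # crude stand-in for a noise burst

ratios = {}
for name, data in (("signal", signal), ("glitch", glitch)):
    z, sigma, chi2 = chi2_veto(data, h, psd, dt, p)
    i = int(np.argmax(np.abs(z)))
    rho = abs(z[i]) / sigma
    ratios[name] = (i, float(chi2[i] / rho ** 2))
    print(name, "peak sample:", i, "chi2 / rho^2 at peak:", ratios[name][1])
```

For the matching signal every band contributes nearly its expected share, so the statistic stays small relative to $\rho^2$ (it is not exactly zero because the bands can only be equalized to within one frequency bin). For the burst, the band contributions at the loudest sample are far from equal, which is the behaviour the veto exploits.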

In practice, the number of bins, $p$, the parameters which relate the $\chi^2$ threshold to the SNR, and the SNR threshold $\rho_*$ which an event must exceed to be considered a trigger are determined empirically from a subset of the data, the playground data. A typical playground data set would be $\sim$10% of the total data set, and would be chosen to be representative of the data set as a whole. Playground data are not used in the actual detection or upper limit analysis, since deriving search parameters from data which will be used in a statistical analysis can result in statistical bias. Values for these parameters for the LIGO S1 and S2 BNS analyses are given in Table 1.
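As a small illustration of how these tuned parameters act together, the sketch below applies an SNR threshold and an SNR-dependent $\chi^2$ threshold of the form used in Table 1, scale $\times\,(p + \delta\rho^2)$. The function and argument names (`is_trigger`, `scale`, `delta`) are hypothetical; the numerical values are the S1 entries for L1 from the table.

```python
def is_trigger(rho, chi2, rho_star, p, scale, delta):
    """Keep an event if its SNR exceeds rho_star and its chi-squared lies
    below the SNR-dependent threshold scale * (p + delta * rho**2)."""
    chi2_star = scale * (p + delta * rho ** 2)
    return rho > rho_star and chi2 < chi2_star

# S1 values for L1 as listed in Table 1: rho* = 6.5, p = 8, 5 (p + 0.03 rho^2).
S1_L1 = dict(rho_star=6.5, p=8, scale=5.0, delta=0.03)

# At rho = 10 the chi-squared threshold is 5 * (8 + 0.03 * 100) = 55.
print(is_trigger(rho=10.0, chi2=30.0, **S1_L1))   # loud, consistent event
print(is_trigger(rho=10.0, chi2=80.0, **S1_L1))   # loud, but fails the veto
print(is_trigger(rho=6.0, chi2=10.0, **S1_L1))    # below the SNR threshold
```

Because the threshold grows with $\rho^2$, louder events are allowed proportionally larger $\chi^2$ values, which accommodates the mismatch between a real signal and the nearest template in the bank.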

Table 1

| Data set | $\rho_*$ | $p$ | L1 $\chi^2_*$ | H1/H2 $\chi^2_*$ |
|---|---|---|---|---|
| S1 | 6.5 | 8 | $5\,(p + 0.03\rho^2)$ | $5\,(p + 0.03\rho^2)$ |
| S2 | 6.0 | 15 | $5\,(p + 0.01\rho^2)$ | $12.5\,(p + 0.01\rho^2)$ |

Finally, let us say a few words about clustering. As discussed earlier, when a glitch occurs, many templates may give a high SNR. This would also be true for a strong enough signal. It would be a misinterpretation to suppose that there might be multiple independent and simultaneous signals; rather, it is preferable to treat the simultaneous events as a cluster and then try to determine the statistical significance of that cluster as a whole. The simplest strategy, and the one used thus far, is to take the highest SNR in the cluster and perform the $\chi^2$ test using the corresponding template. Another possibility might have been to take the template with the lowest $\chi^2$ value as representative, or some function of $\rho$ and $\chi^2$. In fact, there is reason to believe that the last option may be best [23].
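The highest-SNR clustering strategy can be sketched as follows. The `Trigger` record, the function name `cluster_by_max_snr`, the 0.1 s window, and the example trigger values are all hypothetical; the text does not specify how simultaneity is defined, so a simple gap in trigger times is assumed here.

```python
from dataclasses import dataclass

@dataclass
class Trigger:
    time: float      # trigger time in seconds
    template: int    # index of the template in the bank
    snr: float
    chi2: float

def cluster_by_max_snr(triggers, window):
    """Group triggers separated by at most `window` seconds and keep the
    highest-SNR trigger of each group as its representative."""
    clusters = []
    for trig in sorted(triggers, key=lambda x: x.time):
        if clusters and trig.time - clusters[-1][-1].time <= window:
            clusters[-1].append(trig)
        else:
            clusters.append([trig])
    return [max(group, key=lambda x: x.snr) for group in clusters]

# Made-up triggers: three templates ring near t = 100 s, one near t = 250 s.
triggers = [
    Trigger(100.00, 17, 7.1, 12.0),
    Trigger(100.02, 18, 9.4, 15.0),
    Trigger(100.03, 42, 8.0, 9.0),
    Trigger(250.50, 5, 6.8, 20.0),
]
for representative in cluster_by_max_snr(triggers, window=0.1):
    print(representative)
```

Selecting by lowest $\chi^2$, or by some combined function of $\rho$ and $\chi^2$, would only require changing the `key` passed to `max` (or replacing it with `min`).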
