Automatic Burst Detection

The goal of this survey is to automatically identify a burst in a spike train. Bursts are considered as a unit of neural information since they denote a period of 'high activity' in a given spike train. A generally accepted, rough definition of a burst is "an occurrence of "many" spikes in a "small" time interval" (Palm, 1981). Most neurons can burst if stimulated or manipulated pharmacologically. Much research has been focused on the way that a neuron fires an individual spike or burst (for example see link 1). This study however will not focus on the how part of the process but will rather attempt to detect burst activity by examining the output ie the spike train. We will review some methods that are commonly used and also test them in a set of computer-generated data inspired from (Raos, 2006). The problem is set like this: given a spike train that may contain a number of bursts, find the same bursts that would be identified by an "expert" upon visual inspection.

Introduction, some measures of activity and a global approach to the problem


From a signal processing point of view, we must first describe an appropriate measure of the interesting activity. This measure should be able to cover the intuitive definition of a burst that we already stated. For example, a simple approach would be to compare the firing frequency of a period of spontaneous neuron activity with another, candidate period of bursting activity. Another simple approach would be to use the distribution of the inter spike intervals (ISIs) during a time period. Such measures can be used to compare two different time periods of activity. Thus, if we are provided with a spike train of a neuron's spontaneous activity we can use the selected measure to denote if another spike train displays significant activity.

Consider the spike train in figures 1&2. One can easily say that this neuron displays a period of intense activity between 2200 and 2800 ms. For this example, we arbitrarily set three different epochs of activity: the first one between 0 and 2000 ms the second between 2000 and 3000ms and the third between 3000 and 4000ms. Figure 1 shows the firing rate of the spike train. The following table displays the average firing rate of each epoch, which is simply computed by the formula $$ mean(f.rate) = \frac{N} {duration}$$, where N is the number of spikes in the epoch and duration is the epoch duration in ms. The mean firing rate of the second epoch is sufficiently higher than the mean firing rate of all the other epochs and we can conclude that epoch two is a period of higher bursting activity.
 * Example 1 :

Figure 2 shows the ISIs distribution for the different epochs. Notice the peak at (c) that denotes many short inter spike intervals and a possible burst period. A small peak at medium-sized intervals can be seen at (b) whereas (d) has no significant peaks. The histogram at (e) has two peaks: one corresponding to short inter spike intervals (coming from the second epoch) and one corresponding to medium-sized intervals (coming from the other non-bursting epochs.

Assuming that we have a suitable measure to compare two periods of neuron activity and that we are provided with a period of spontaneous activity we can proceed to the second part of our approach. The problem now is to find a period in the spike train that maximizes the measure of difference when compared with the spontaneous activity. A simple but effective approach would be to estimate our measure at all the possible time intervals and denote as burst the one with the maximum difference. If we expect to find multiple bursts, we can use a threshold to find time intervals that significantly differ from the given spontaneous period. Additional constraints can also be applied, such as that a burst must have a minimum or maximum number of spikes or that a burst can't be longer or shorter than a given time constant.

Conclusively, a general outline to burst detection would consist of two parts:


 * (One) Find an appropriate measure that will be used to compare two periods of spike activity
 * (Two) Find the period in the given spike train that differs significantly from a period of spontaneous activity by using the measure described in (One)

All the methods that will be described next share this common outline.

Description
The Poisson surprise is a somewhat widely used method for burst detection. It is based on the assumption that neuron firing follows a Poisson process or, in other words, that the inter spike intervals (ISIs) distribution is a Poisson distribution. This assumption comes from the analysis of Smith & Smith (1965). The Poisson surprise method was first described by Legendy & Salcman (1985) and consists of an appliance of the definitions of evidence and surprise described by Palm (1981). Since then, various modifications of the algorithm have been used. They all follow the same principles that we will briefly describe next.

The Poisson probability distribution function is given by $$ P [\lambda;k] = \frac{e^{-\lambda} (\lambda)^k}{k!} \qquad k= 0,1,\ldots,$$. Here, P is the probability that k events occur in a given time interval. The parameter $$ \lambda $$ is the mean of the distribution. In the case of a spike train, if the assumption is valid, $$ \lambda $$ can be set as the mean inter spike interval in the given time window. The Poisson surprise method, as described in (Legendy and Salcman, 1985), consists of two parts: Burst classification and Burst detection. These two parts are analogous to the two parts we described in the global approach.


 * Burst classification

The measure used in the Poisson surprise method is described as an evaluation of how improbable it is that a burst is a chance occurrence. It is defined by $$ S = -logP $$ where P is the probability that a given time interval of length $$\tau$$ contains k or more spikes. Since the description of P refers to " k or more spikes" we use the cumulative distribution function of the Poisson distribution. Moreover, since we want to find the probability at a predefined time interval, we replace the mean lambda with $$r\tau$$ where r is the mean firing rate of the whole spike and $$\tau$$ is the time interval. Thus, $$ P = e^{-r\tau}\sum_{i=k}^\infty (r\tau)^i/i! $$. Therefore, S is a measure analogous to the improbability of the specific burst ie S is high if the burst is considered improbable and low if the burst is considered a chance occurrence. Legendy and Salcman suggest the value of S to be above 10 in order to declare a period as a burst.


 * Burst detection

The second part of the method consists of finding periods in the spike train that show a high surprise index. First, we advance through the spike train, spike by spike, until a sequence of several closely spaced spikes is found. There are two parameters in the first step: the number of spikes in this (first) sequence and the definition of "closely spaced". Legendy & Salcman suggest using a sequence of at least 3 spikes that display a firing rate higher than the half of the average firing rate. The next step of the search is to compute the S for this sequence and then to test whether including an additional spike in the end of the sequence increases the S. This step is repeated (we add another spike in the sequence and compute the S) until a relatively long interval is encountered or inclusion of more (up to 10) additional spikes fails to increase the surprise index (S). The final step is to test whether removal of spikes from the beginning of the sequence further increases the surprise index. When this process is complete, we denote a burst if the final S is higher than the desired threshold(in this case the threshold is 10) and we proceed in the spike train, looking for additional bursts.


 * Example 2 - The Poisson method.

Figure 3 shows the Poisson method in action. In (a) the spike train is displayed along with the detected burst (denoted by the green dots above the spike train). Since we now know where the burst is we define three different epochs : Epoch 1 is from 0 to 2400 ms, Epoch 2 is from 2400 to 2800 ms and Epoch 3 is from 2800 to 4000ms (end). The epochs are separated by the red vertical lines in the figure. In (b) we show the ISI distribution for epoch 2. Notice the large number of short ISIs. In (c) we show the ISI distribution for the whole spike train and the corresponding Poisson distribution (with parameter $$ \lambda = mean(ISI)$$). Finally, in (d) we show the ISI distribution for epochs 1 and 3 combined and the corresponding Poisson distribution. In (d) the Poisson distribution closely matches the ISI distribution whereas in (c) it "misses" the peak of the short ISIs that is attributed to epoch 2. All histograms are normalized in order to represent a probability distribution ie each bin is divided with the sum of all bins.

Pros, Cons, Variations and Suggestions
The main disadvantage of the Poisson surprise method is the assumption that the ISIs follow a Poisson process. This in turn implies the assumption that the ISIs are independent events which can be considered as non-valid since the neurons in the nervous system display a refractory period after each spike (See Cateau & Reyes, 2006 for a model-based abjuration of this assumption). Moreover, common ISIs distribution display many extremes that cannot be reliably explained by the corresponding Poisson distribution. This was indeed shown in analysis of F5 neurons from Raos, 2005 (unpublished result) and also in (Sheinberg and Logothetis, 2001).

Despite the possible invalidity of the assumption, the Poisson method successfully determines periods of surprising activity and can be used to objectively detect unusual events. Furthermore, it assigns a measure of surprise (S) to each detected burst, providing a helpful tool in burst examination and validation. Lastly, it is, up to a point, a parameter-free method while the two parts (burst classification and detection) remain separated. This means that the results are generally unaffected by the parameters used and that the two parts could be separately examined and perhaps improved. All the detected bursts are assigned with a value of S which is compared with a threshold. The choice of the threshold though does not affect burst detection.

The variations of the method mainly consist of changes in the burst detection algorithm. The initiation of a burst (defined as "several closely spaced spikes") can be differently interpreted resulting in more strict or lenient burst detection. Furthermore, the maximum number of spikes to be added at a detected burst at the second step of the detection is a parameter that can affect the length of the detected bursts (a large value would generally allow longer bursts). Hanes et. al. (1995) set this parameter to the maximum possible value by searching the entire spike train after the burst. A final but crucial disadvantage of the search method is that once a burst has been detected and expanded, the algorithm proceeds to the next spike without further investigating the spikes of the last burst. This may result in "connected" bursts that may display a larger S index if combined in one burst. This hypothesis is never tested by the current search method. A different search algorithm, which will be described in the next section (rank surprise), is proposed by Gourevitch and Eggmont (2007).

To Do

 * Search in bibliography for variations in the method parameters or search algorithm (part 2 - detection).

Description
The rank surprise method is a recent method proposed by Gourevitch and Eggermont (2007) that has a structure similar to the Poisson method and produces comparable results. Instead of assuming a Poisson ISI distribution, the Rank surprise assigns a uniform probability to each ISI, based on its rank, and proposes a different statistic, the rank statistic (RS), to evaluate a burst. Moreover, Gourevitch and Eggermont introduce a more exhaustive search, the ESM algorithm, in order to search for the burst that maximizes the RS and avoid the detection problems of the Poisson burst detection algorithm. In the following description we will describe the two parts of the method, keeping up with the global approach we use in this study.


 * Burst Classification - The Rank Surprise Index

In order to define the RS statistic, we first consider the ranking of the ISIs in the spike train. Each ISI is assigned a rank according to its size. The smallest ISI gets a rank of one and the largest a rank of N (where N is the number of ISIs). These ranks can be seen as discrete uniformly distributed random variables. Intuitively, we expect a burst to have ISIs with small ranks. To evaluate a burst, we use the sum of the ranks of the burst's ISIs. Consider a burst of q spikes and q-1 ISIs. Let $$r_n$$ be the rank of the nth ISI where n=1,..,q-1. The sum of the ranks for this burst will be $$u = r_1+r_2 + .. + r_{q-1}$$. Next, let $$T_q$$ be the sum of q discrete uniformly distributed random variables. The probability that $$T_q$$ is less (or equal) than u is given by $$P(T_q\le u) = \frac{1}{N^q}\sum_{k=0}^{\frac{u-q}{N}} \frac{(-1)!(u-kN)!}{k!(q-k)!(u-kN-q)!} $$ ** where u is the sum of the burst's ISIs, N is the number of the train's ISIs and q is the number of burst's ISIs (for a proof of this formula see Uspensky (1937)).

The RS statistic can now be defined as $$ RS = -log(P(T_q\le u))$$.

** Correction: The first term in the numerator is (-1)^k, and not (-1)! as described. The rest of the formula is correct.


 * Burst Detection - The ESM Algorithm

The proposed method for burst detection is called Exhaustive Surprise Maximization algorithm. A schematic explanation of the algorithm is provided in Figure 4. We need to specify two parameters: the largest ISI allowed in a burst (limit) and the minimum RS that a burst must have in order to be considered valid. For the first parameter, the authors suggest a given percentile of the ISI distribution e.g. 75%. First, we search the spike train for all the ISIs that are smaller than limit and mark them. Then, we compute the RS value for every possible continuous sequence (time period) of marked ISIs and find the one with the largest RS. If this RS value is larger than the threshold (parameter 2), we denote a burst and remove the corresponding ISIs from the marked ones. Then we repeat the process for the remaining marked ISIs.

Pros, Cons and a Cookie
The Rank Surprise method has the main advantage of the Poisson method: it can subjectively detect a burst in any spike train assuming that the train also contains spontaneous activity. This feature though leads to a disadvantage that both methods share: If we have a spike train that is mainly composed of bursts separated by long ISIs, both methods will assign small indexes (S or RS) to these bursts thus not properly detecting them. A method that could cope with this type of problem would be one that is able to detect change in activity in a way unaffected of a whole spike train feature such as the mean firing rate (Poisson method) or the ISIs ranks (rank surprise method). Such a method will be described in a following section (significance sliding window). This is also the case for a different example: a spike train which contains a long period of "low" activity followed by a long period of "high" activity. Both methods will detect many bursts in the first period while ignoring potential bursts in the first, low-activity period.

Apart from these constrains, the Rank Surprise method offers a more robust solution in the both the burst classification and detection parts than the Poisson method. The non-parametric approach can cope with any ISI distribution without constrains. On the other hand, should we be able to safely assume a specific distribution for our data, a parametric approach would be better in theory. On the burst detection part, the ESM algorithm fairly addresses the problems we described for the poisson burst detection thus detecting the bursts that really maximize the RS statistic.

To Do
Where's the cookie?

The CZD (Cocatre-Zilgien and Delcomyn) method

 * Description
 * Pros
 * Cons
 * To Do

Simple Thresholding: Mean firing rate and cumulative spike histograms

 * Description
 * Pros
 * Cons
 * To Do

See also (Web Links)

 * Izhikevich E. M. (2006) Bursting. Scholarpedia, p.1401.

Glossary
Same terms are used without further explanation. This part will focus on defining the exact use of each term as it's assumed to be in this study.


 * Spontaneous activity : single-unit activity under natural simulation.


 * Firing rate : The firing rate of a neuron in spikes/second. Since the rate is computed by integration of the spike count over a time period it is essential to state the time window at which this integration is performed. If we want to estimate the mean firing rate over a time period we can simply divide the number of spikes with the duration of that period (eg. the rates at Table 1, Example 1. - See also (computational neuroscience book,19**)). If we want to estimate a "function" of the firing rate against time in order to see how the rate changes we can use a sliding window and compute the firing rate at each "placement" of the window (eg the rate at Figure 1, Example 1).


 * Inter spike interval (ISI) :