Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] optimal_kernel_bandwidth yields extremely large kernel widths #461

Open
essink opened this issue Mar 14, 2022 · 2 comments
Open

[Bug] optimal_kernel_bandwidth yields extremely large kernel widths #461

essink opened this issue Mar 14, 2022 · 2 comments
Labels
new functionality New modules, functions

Comments

@essink
Copy link
Collaborator

essink commented Mar 14, 2022

Describe the bug
The estimation of the instantaneous_rate with kernel="auto" of a very long spiketrain (~650 s) yielded a very smooth and flat rate. It was tracked down to the estimation of the optimal_kernel_bandwidth (the width of the Gaussian kernel, fixed for the whole duration), which yielded a bandwidth of ~22 s for spiketrain with a rather low firing rate.

After some investigation, it was noticed that the optimal bandwidth strongly depends on the duration of the spiketrain. One reason for this might be the hard-coded number of bins (1000) here:

if times is None:
time = np.max(spiketimes) - np.min(spiketimes)
isi = np.diff(spiketimes)
isi = isi[isi > 0].copy()
dt = np.min(isi)
times = np.linspace(np.min(spiketimes),
np.max(spiketimes),
min(int(time / dt + 0.5),
1000)) # The 1000 seems somewhat arbitrary
t = times
else:
time = np.max(times) - np.min(times)
spiketimes = spiketimes[(spiketimes >= np.min(times)) &
(spiketimes <= np.max(times))].copy()
isi = np.diff(spiketimes)
isi = isi[isi > 0].copy()
dt = np.min(isi)
if dt > np.min(np.diff(times)):
t = np.linspace(np.min(times), np.max(times),
min(int(time / dt + 0.5), 1000))
else:
t = times

Although, the option to give an array of times on which the optimization of the kernel is performed exists, it is not very accessible. (See original publication Shimazaki, H. & Shinomoto, S. Kernel bandwidth optimization in spike rate estimation. J Comput Neurosci 29, 171–182 (2010). for details)

Further notice: The implemented optimal_kernel_bandwidth estimates one fixed kernel bandwidth for the given spiketrain and is not adaptive over time! The originating code https://github.com/cooperlab/AdaptiveKDE also implements the adaptive case and could be included into elephant.

To Reproduce

  1. Get spiketimes
  2. Call elephant.statistics.optimal_kernel_bandwidth(spiketimes)

A minimal example is provided in the attached files:
issue_optimal_kernel_bandwidth.zip

Expected behavior
The order of magnitude should be reasonable and if this is exceeded a warning should be raise.

Environment

  • OS (e.g., Linux): Ubuntu
  • How you installed elephant (conda, pip, source): conda/pip
  • Python version: 3.9.7
  • neo python package version: 0.10.2
  • elephant python package version: 0.10.0
@Moritz-Alexander-Kern
Copy link
Member

Thank you for reporting this and your effort to contribute to elephant.

I have extended your minimal example to illustrate the issue in connection with elephant.statistics.instantaneous_rate.

import pickle
import quantities as pq
import elephant
import matplotlib.pyplot as plt

# Load example spiketrains
high_firing_rate_st, low_firing_rate_st = pickle.load(
    open('spiketrain_examples.pkl', 'rb'))
# firing rates
print(high_firing_rate_st.annotations)
print(low_firing_rate_st.annotations)
{'mean_firing_rate': array(22.00288951) * 1/s}
{'mean_firing_rate': array(3.85978253) * 1/s}
# Look at kernel bandwidth estimation

# spiketrain = high_firing_rate_st
spiketrain = low_firing_rate_st
spiketimes = spiketrain.times.rescale(pq.s).magnitude

# Choose example spiketrain
kernel_bandwidth = elephant.statistics.optimal_kernel_bandwidth(spiketimes)
print(f'The optimal kernel bandwidth for the full length spiketrain was '
      f'estimated to be {kernel_bandwidth["optw"]} s')
The optimal kernel bandwidth for the full length spiketrain was estimated to be 22.401845783799754 s

# Time slice the spiketrain to a 5 s duration
time_sliced_st = spiketrain.time_slice(t_start=1*pq.s, t_stop=6*pq.s)
spiketimes_sliced = time_sliced_st.times.rescale(pq.s).magnitude

kernel_bandwidth_sliced = elephant.statistics.optimal_kernel_bandwidth(
    spiketimes_sliced)
print(f'The optimal kernel bandwidth for the full length spiketrain was '
      f'estimated to be {kernel_bandwidth_sliced["optw"]} s')
The optimal kernel bandwidth for the full length spiketrain was estimated to be 0.5003558368563431 s
# Look at instantaneous rate


def plot_rate(inst_rate):
    plt.plot(inst_rate.times.simplified.magnitude,
             inst_rate.magnitude, color='orange',
             label='instantaneous rate')
    plt.legend()
    plt.title(f"{inst_rate.annotations['kernel']['type']}: "
              f"sigma={inst_rate.annotations['kernel']['sigma']}, "
              f"sampling_period={int(sampling_period.item())} ms")
    fig = plt.gcf()
    fig.set_size_inches(18.5, 10.5)
    plt.show()


sampling_period = 10 * pq.ms
# use kernel "auto"
rate_auto = elephant.statistics.instantaneous_rate(
    spiketrain,
    sampling_period=sampling_period,
    kernel="auto")

# plot result
plot_rate(rate_auto)

Figure 1:
Figure_1

# calculate rate using kernel bandwidth based on time slice
kernel = elephant.kernels.GaussianKernel(
    sigma=kernel_bandwidth_sliced["optw"] * pq.s)

rate_slice = elephant.statistics.instantaneous_rate(
    spiketrain,
    sampling_period=sampling_period,
    kernel=kernel)

# plot result
plot_rate(rate_slice)

Figure 2:
Figure_2

The two figures above show the difference for large and small kernel bandwidths.
If I understand correctly Figure 2 is the desired result for this case, Figure 1 shows a rate estimate, which is too smooth. (bandwidth too large)

possible solutions

  • rise a warning to the user when elephant.statistics.optimal_kernel_bandwidth(spiketimes) is used and the resulting bandwidth is too large.
    • In order to rise this warning in optimal_kernel_bandwidth, the following question comes to mind:
      Is there a way to estimate if the bandwidth is reasonable ? or would it be a fixed value ?
  • use the adaptive kernel functionality proposed in PR Enh/adaptive kernel #462.
  • change the hard coded number of bins (Lines 1576 to 1597)
    if times is None:
    time = np.max(spiketimes) - np.min(spiketimes)
    isi = np.diff(spiketimes)
    isi = isi[isi > 0].copy()
    dt = np.min(isi)
    times = np.linspace(np.min(spiketimes),
    np.max(spiketimes),
    min(int(time / dt + 0.5),
    1000)) # The 1000 seems somewhat arbitrary
    t = times
    else:
    time = np.max(times) - np.min(times)
    spiketimes = spiketimes[(spiketimes >= np.min(times)) &
    (spiketimes <= np.max(times))].copy()
    isi = np.diff(spiketimes)
    isi = isi[isi > 0].copy()
    dt = np.min(isi)
    if dt > np.min(np.diff(times)):
    t = np.linspace(np.min(times), np.max(times),
    min(int(time / dt + 0.5), 1000))
    else:
    t = times

I guess further steps should be discussed in connection to PR #462, thanks again for contributing this.

@essink
Copy link
Collaborator Author

essink commented Mar 15, 2022

Hi Moritz,

thanks for having a look at it so quickly! Just to clarify a few points:

  1. What an expected or reasonable bandwidth should be, is a very valid question and I don't have a proper answer to that.
  2. I would consider a 500 ms kernel bandwidth still as quite large.
  3. The PR Enh/adaptive kernel #462 does not solve the issue yet! It was just created while trying to solve it. (The hard-coded number of bins are definitely not ideal, but seem not to cause the main problem here)
  4. This method of estimating the instantaneous rate was, traditionally, applied to the spiketrains of sets of trials, which were short in duration. It might be a limitation of the method itself.

Having said that, I agree that we can move the discussion to PR #462 .

@Moritz-Alexander-Kern Moritz-Alexander-Kern added the new functionality New modules, functions label Mar 21, 2022
@Moritz-Alexander-Kern Moritz-Alexander-Kern linked a pull request Apr 4, 2022 that will close this issue
@Moritz-Alexander-Kern Moritz-Alexander-Kern added this to the v0.12.0 milestone Apr 14, 2022
@Moritz-Alexander-Kern Moritz-Alexander-Kern modified the milestones: v0.12.0, v0.13.0 Dec 5, 2022
@Moritz-Alexander-Kern Moritz-Alexander-Kern modified the milestones: v0.13.0, v0.14.0 Jul 5, 2023
@Moritz-Alexander-Kern Moritz-Alexander-Kern removed this from the v0.14.0 milestone Jul 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new functionality New modules, functions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants