Functions

ECG

Submodule for NeuroKit.

ecg_analyze(data, sampling_rate=1000, method='auto')[source]

Performs ECG analysis on either epochs (event-related analysis) or on longer periods of data such as resting-state data.

Parameters
  • data (Union[dict, pd.DataFrame]) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by ecg_process() or bio_process(). Can also take a dict containing sets of separate periods of data.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.

  • method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’ where the right method will be chosen based on the mean duration of the data (‘event-related’ for duration under 10s).

Returns

DataFrame – A dataframe containing the analyzed ECG features. If event-related analysis is conducted, each epoch is indicated by the Label column. See ecg_eventrelated() and ecg_intervalrelated() docstrings for details.

See also

bio_process(), ecg_process(), epochs_create(), ecg_eventrelated(), ecg_intervalrelated()

Examples

>>> import neurokit2 as nk
>>>
>>> # Example 1: Download the data for event-related analysis
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data for event-related analysis
>>> df, info = nk.bio_process(ecg=data["ECG"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral",
...                                           "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> nk.ecg_analyze(epochs, sampling_rate=100) 
  Label Condition  ...  ECG_Phase_Completion_Ventricular  ECG_Quality_Mean
1     1  Negative  ...                               ...              ...
2     2   Neutral  ...                               ...              ...
3     3   Neutral  ...                               ...              ...
4     4  Negative  ...                               ...              ...

[4 rows x 17 columns]
>>>
>>> # Example 2: Download the resting-state data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.ecg_process(data["ECG"], sampling_rate=100)
>>>
>>> # Analyze
>>> nk.ecg_analyze(df, sampling_rate=100)
   ECG_Rate_Mean  HRV_RMSSD  ...
0            ...        ...  ...

[1 rows x 37 columns]

ecg_clean(ecg_signal, sampling_rate=1000, method='neurokit')[source]

Clean an ECG signal.

Prepare a raw ECG signal for R-peak detection with the specified method.

Parameters
  • ecg_signal (Union[list, np.array, pd.Series]) – The raw ECG channel.

  • sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.

  • method (str) – The processing pipeline to apply. Can be one of ‘neurokit’ (default), ‘biosppy’, ‘pantompkins1985’, ‘hamilton2002’, ‘elgendi2010’, ‘engzeemod2012’.

Returns

array – Vector containing the cleaned ECG signal.

See also

ecg_findpeaks(), signal_rate(), ecg_process(), ecg_plot()

Examples

>>> import pandas as pd
>>> import neurokit2 as nk
>>> import matplotlib.pyplot as plt
>>>
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000)
>>> signals = pd.DataFrame({"ECG_Raw" : ecg,
...                         "ECG_NeuroKit" : nk.ecg_clean(ecg, sampling_rate=1000, method="neurokit"),
...                         "ECG_BioSPPy" : nk.ecg_clean(ecg, sampling_rate=1000, method="biosppy"),
...                         "ECG_PanTompkins" : nk.ecg_clean(ecg, sampling_rate=1000, method="pantompkins1985"),
...                         "ECG_Hamilton" : nk.ecg_clean(ecg, sampling_rate=1000, method="hamilton2002"),
...                         "ECG_Elgendi" : nk.ecg_clean(ecg, sampling_rate=1000, method="elgendi2010"),
...                         "ECG_EngZeeMod" : nk.ecg_clean(ecg, sampling_rate=1000, method="engzeemod2012")})
>>> signals.plot() 
<matplotlib.axes._subplots.AxesSubplot object at ...>

References

  • Jiapu Pan and Willis J. Tompkins. A Real-Time QRS Detection Algorithm. In: IEEE Transactions on Biomedical Engineering BME-32.3 (1985), pp. 230–236.

  • Hamilton, Open Source ECG Analysis Software Documentation, E.P.Limited, 2002.

ecg_delineate(ecg_cleaned, rpeaks=None, sampling_rate=1000, method='peak', show=False, show_type='peaks', check=False)[source]

Delineate QRS complex.

Function to delineate the QRS complex.

  • Cardiac Cycle: A typical ECG heartbeat consists of a P wave, a QRS complex and a T wave. The P wave represents the wave of depolarization that spreads from the SA-node throughout the atria. The QRS complex reflects the rapid depolarization of the right and left ventricles. Since the ventricles are the largest part of the heart, in terms of mass, the QRS complex usually has a much larger amplitude than the P wave. The T wave represents ventricular repolarization. On rare occasions, a U wave can be seen following the T wave. The U wave is believed to be related to the last remnants of ventricular repolarization.

Parameters
  • ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().

  • rpeaks (Union[list, np.array, pd.Series]) – The samples at which R-peaks occur. Accessible with the key “ECG_R_Peaks” in the info dictionary returned by ecg_findpeaks().

  • sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.

  • method (str) – Can be one of ‘peak’ (default) for a peak-based method, ‘cwt’ for continuous wavelet transform or ‘dwt’ for discrete wavelet transform.

  • show (bool) – If True, will return a plot visualizing the delineated waves information.

  • show_type (str) – The type of delineated waves information shown in the plot.

  • check (bool) – Defaults to False.

Returns

  • waves (dict) – A dictionary containing additional information. For the peak method, the dictionary contains the samples at which P-peaks, Q-peaks, S-peaks, T-peaks, P-onsets and T-offsets occur, accessible with the keys “ECG_P_Peaks”, “ECG_Q_Peaks”, “ECG_S_Peaks”, “ECG_T_Peaks”, “ECG_P_Onsets” and “ECG_T_Offsets” respectively.

    For the wavelet methods, the dictionary contains the samples at which P-peaks, T-peaks, P-onsets, P-offsets, T-onsets, T-offsets, QRS-onsets and QRS-offsets occur, accessible with the keys “ECG_P_Peaks”, “ECG_T_Peaks”, “ECG_P_Onsets”, “ECG_P_Offsets”, “ECG_T_Onsets”, “ECG_T_Offsets”, “ECG_R_Onsets” and “ECG_R_Offsets” respectively.

  • signals (DataFrame) – A DataFrame of same length as the input signal in which occurrences of peaks, onsets and offsets are marked as “1” in a list of zeros.

See also

ecg_clean(), signal_fixpeaks(), ecg_peaks(), signal_rate(), ecg_process(), ecg_plot()

Examples

>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000)
>>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000)
>>> _, rpeaks = nk.ecg_peaks(cleaned)
>>> signals, waves = nk.ecg_delineate(cleaned, rpeaks, sampling_rate=1000, method="peak")
>>> nk.events_plot(waves["ECG_P_Peaks"], cleaned) 
<Figure ...>
>>> nk.events_plot(waves["ECG_T_Peaks"], cleaned) 
<Figure ...>
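
For the wavelet-based methods, the returned dictionary also exposes QRS onsets and offsets (see the Returns section above). A brief sketch reusing cleaned and rpeaks from the example above, with the ‘dwt’ method:

>>> _, waves_dwt = nk.ecg_delineate(cleaned, rpeaks, sampling_rate=1000, method="dwt")
>>> nk.events_plot(waves_dwt["ECG_R_Onsets"], cleaned) 
<Figure ...>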

References

  • Martínez, J. P., Almeida, R., Olmos, S., Rocha, A. P., & Laguna, P. (2004). A wavelet-based ECG delineator: evaluation on standard databases. IEEE Transactions on biomedical engineering, 51(4), 570-581.

ecg_eventrelated(epochs, silent=False)[source]

Performs event-related ECG analysis on epochs.

Parameters
  • epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().

  • silent (bool) – If True, silence possible warnings.

Returns

DataFrame – A dataframe containing the analyzed ECG features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:

  • ”ECG_Rate_Max”: the maximum heart rate after stimulus onset.

  • ”ECG_Rate_Min”: the minimum heart rate after stimulus onset.

  • ”ECG_Rate_Mean”: the mean heart rate after stimulus onset.

  • ”ECG_Rate_Max_Time”: the time at which maximum heart rate occurs.

  • ”ECG_Rate_Min_Time”: the time at which minimum heart rate occurs.

  • ”ECG_Phase_Atrial”: indication of whether the onset of the event concurs with respiratory systole (1) or diastole (0).

  • ”ECG_Phase_Ventricular”: indication of whether the onset of the event concurs with respiratory systole (1) or diastole (0).

  • ”ECG_Phase_Atrial_Completion”: indication of the stage of the current cardiac (atrial) phase (0 to 1) at the onset of the event.

  • ”ECG_Phase_Ventricular_Completion”: indication of the stage of the current cardiac (ventricular) phase (0 to 1) at the onset of the event.

We also include the following experimental features related to the parameters of a quadratic model (a brief numeric sketch follows this list):

  • ”ECG_Rate_Trend_Linear”: The parameter corresponding to the linear trend.

  • ”ECG_Rate_Trend_Quadratic”: The parameter corresponding to the curvature.

  • ”ECG_Rate_Trend_R2”: the quality of the quadratic model. If too low, the parameters might not be reliable or meaningful.
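
To illustrate what these trend parameters capture, here is a minimal numpy sketch fitting a quadratic model to a toy heart rate trace (a conceptual illustration, not necessarily NeuroKit's exact internal computation):

>>> import numpy as np
>>>
>>> t = np.linspace(0, 2, 200)                        # time after stimulus onset (s)
>>> rate = 70 + 5 * t - 2 * t ** 2                    # toy heart rate trace
>>> a, b, c = np.polyfit(t, rate, 2)                  # curvature (a) and linear trend (b)
>>> predicted = np.polyval([a, b, c], t)
>>> r2 = 1 - np.var(rate - predicted) / np.var(rate)  # quality of the quadratic model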

See also

events_find(), epochs_create(), bio_process()

Examples

>>> import neurokit2 as nk
>>>
>>> # Example with simulated data
>>> ecg, info = nk.ecg_process(nk.ecg_simulate(duration=20))
>>>
>>> # Process the data
>>> epochs = nk.epochs_create(ecg, events=[5000, 10000, 15000],
...                           epochs_start=-0.1, epochs_end=1.9)
>>> nk.ecg_eventrelated(epochs) 
  Label  Event_Onset  ...  ECG_Phase_Completion_Ventricular  ECG_Quality_Mean
1     1          ...  ...                               ...               ...
2     2          ...  ...                               ...               ...
3     3          ...  ...                               ...               ...

[3 rows x 16 columns]
>>>
>>> # Example with real data
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data
>>> df, info = nk.bio_process(ecg=data["ECG"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral",
...                                           "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100,
...                           epochs_start=-0.1, epochs_end=1.9)
>>> nk.ecg_eventrelated(epochs)
  Label Condition  ...  ECG_Phase_Completion_Ventricular  ECG_Quality_Mean
1     1  Negative  ...                               ...               ...
2     2   Neutral  ...                               ...               ...
3     3   Neutral  ...                               ...               ...
4     4  Negative  ...                               ...               ...

[4 rows x 17 columns]

ecg_findpeaks(ecg_cleaned, sampling_rate=1000, method='neurokit', show=False)[source]

Find R-peaks in an ECG signal.

Low-level function used by ecg_peaks() to identify R-peaks in an ECG signal using a different set of algorithms. See ecg_peaks() for details.

Parameters
  • ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().

  • sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.

  • method (string) – The algorithm to be used for R-peak detection. Can be one of ‘neurokit’ (default), ‘pantompkins1985’, ‘hamilton2002’, ‘christov2004’, ‘gamboa2008’, ‘elgendi2010’, ‘engzeemod2012’, ‘kalidas2017’, ‘martinez2003’, ‘rodrigues2020’ or ‘promac’.

  • show (bool) – If True, will return a plot visualizing the thresholds used in the algorithm. Useful for debugging.

Returns

info (dict) – A dictionary containing additional information, in this case the samples at which R-peaks occur, accessible with the key “ECG_R_Peaks”.

See also

ecg_clean(), signal_fixpeaks(), ecg_peaks(), ecg_rate(), ecg_process(), ecg_plot()

Examples

>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000)
>>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000)
>>> info = nk.ecg_findpeaks(cleaned)
>>> nk.events_plot(info["ECG_R_Peaks"], cleaned) 
<Figure ...>
>>>
>>> # Different methods
>>> neurokit = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="neurokit"), method="neurokit")
>>> pantompkins1985 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="pantompkins1985"), method="pantompkins1985")
>>> hamilton2002 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="hamilton2002"), method="hamilton2002")
>>> martinez2003 = nk.ecg_findpeaks(cleaned, method="martinez2003")
>>> christov2004 = nk.ecg_findpeaks(cleaned, method="christov2004")
>>> gamboa2008 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="gamboa2008"), method="gamboa2008")
>>> elgendi2010 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="elgendi2010"), method="elgendi2010")
>>> engzeemod2012 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="engzeemod2012"), method="engzeemod2012")
>>> kalidas2017 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="kalidas2017"), method="kalidas2017")
>>> rodrigues2020 = nk.ecg_findpeaks(cleaned, method="rodrigues2020")
>>>
>>> # Visualize
>>> nk.events_plot([neurokit["ECG_R_Peaks"],
...                       pantompkins1985["ECG_R_Peaks"],
...                       hamilton2002["ECG_R_Peaks"],
...                       christov2004["ECG_R_Peaks"],
...                       gamboa2008["ECG_R_Peaks"],
...                       elgendi2010["ECG_R_Peaks"],
...                       engzeemod2012["ECG_R_Peaks"],
...                       kalidas2017["ECG_R_Peaks"],
...                       martinez2003["ECG_R_Peaks"],
...                       rodrigues2020["ECG_R_Peaks"]], cleaned) 
<Figure ...>
>>>
>>> # Method-agreement
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=500)
>>> ecg = nk.signal_distort(ecg,
...                         sampling_rate=500,
...                         noise_amplitude=0.2, noise_frequency=[25, 50],
...                         artifacts_amplitude=0.2, artifacts_frequency=50)
>>> nk.ecg_findpeaks(ecg, sampling_rate=500, method="promac", show=True) 
{'ECG_R_Peaks': array(...)}

References

  • Gamboa, H. (2008). Multi-modal behavioral biometrics based on HCI and electrophysiology. PhD thesis, Universidade Técnica de Lisboa.

  • Zong, W., Heldt, T., Moody, G. B., & Mark, R. G. (2003, September). An open-source algorithm to detect onset of arterial blood pressure pulses. In Computers in Cardiology, 2003 (pp. 259-262). IEEE.

  • Hamilton, Open Source ECG Analysis Software Documentation, E.P.Limited, 2002.

  • Pan, J., & Tompkins, W. J. (1985). A real-time QRS detection algorithm. IEEE transactions on biomedical engineering, (3), 230-236.

  • Engelse, W. A. H., & Zeelenberg, C. (1979). A single scan algorithm for QRS detection and feature extraction IEEE Comput Cardiol. Long Beach: IEEE Computer Society.

  • Lourenço, A., Silva, H., Leite, P., Lourenço, R., & Fred, A. L. (2012, February). Real Time Electrocardiogram Segmentation for Finger based ECG Biometrics. In Biosignals (pp. 49-54).

ecg_intervalrelated(data, sampling_rate=1000)[source]

Performs ECG analysis on longer periods of data (typically > 10 seconds), such as resting-state data.

Parameters
  • data (Union[dict, pd.DataFrame]) – A DataFrame containing the different processed signal(s) as different columns, typically generated by ecg_process() or bio_process(). Can also take a dict containing sets of separately processed DataFrames.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).

Returns

DataFrame – A dataframe containing the analyzed ECG features. The analyzed features consist of the following:

  • ”ECG_Rate_Mean”: the mean heart rate.

  • ”ECG_HRV”: the different heart rate variability metrics.

See hrv_summary() docstrings for details.

See also

bio_process(), ecg_eventrelated()

Examples

>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.ecg_process(data["ECG"], sampling_rate=100)
>>>
>>> # Single dataframe is passed
>>> nk.ecg_intervalrelated(df, sampling_rate=100) 
   ECG_Rate_Mean  HRV_RMSSD  ...
0      ...

[1 rows x 55 columns]
>>>
>>> epochs = nk.epochs_create(df, events=[0, 15000], sampling_rate=100,
...                           epochs_end=150)
>>> nk.ecg_intervalrelated(epochs)
   ECG_Rate_Mean  HRV_RMSSD  ...
1            ...        ...
2            ...        ...

[2 rows x 55 columns]

ecg_peaks(ecg_cleaned, sampling_rate=1000, method='neurokit', correct_artifacts=False)[source]

Find R-peaks in an ECG signal.

Find R-peaks in an ECG signal using the specified method.

Parameters
  • ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().

  • sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.

  • method (string) – The algorithm to be used for R-peak detection. Can be one of ‘neurokit’ (default), ‘pantompkins1985’, ‘hamilton2002’, ‘christov2004’, ‘gamboa2008’, ‘elgendi2010’, ‘engzeemod2012’ or ‘kalidas2017’.

  • correct_artifacts (bool) – Whether or not to identify artifacts as defined by Jukka A. Lipponen & Mika P. Tarvainen (2019): A robust algorithm for heart rate variability time series artefact correction using novel beat classification, Journal of Medical Engineering & Technology, DOI: 10.1080/03091902.2019.1640306.

Returns

  • signals (DataFrame) – A DataFrame of same length as the input signal in which occurrences of R-peaks are marked as “1” in a list of zeros. Accessible with the key “ECG_R_Peaks”.

  • info (dict) – A dictionary containing additional information, in this case the samples at which R-peaks occur, accessible with the key “ECG_R_Peaks”.

See also

ecg_clean(), ecg_findpeaks(), ecg_process(), ecg_plot(), signal_rate(), signal_fixpeaks()

Examples

>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000)
>>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000)
>>> signals, info = nk.ecg_peaks(cleaned, correct_artifacts=True)
>>> nk.events_plot(info["ECG_R_Peaks"], cleaned) 
<Figure ...>

References

  • Gamboa, H. (2008). Multi-modal behavioral biometrics based on HCI and electrophysiology. PhD thesis, Universidade Técnica de Lisboa.

  • W. Zong, T. Heldt, G.B. Moody, and R.G. Mark. An open-source algorithm to detect onset of arterial blood pressure pulses. In Computers in Cardiology, 2003, pages 259–262, 2003.

  • Hamilton, Open Source ECG Analysis Software Documentation, E.P.Limited, 2002.

  • Jiapu Pan and Willis J. Tompkins. A Real-Time QRS Detection Algorithm. In: IEEE Transactions on Biomedical Engineering BME-32.3 (1985), pp. 230–236.

  • Engelse, W. A. H., & Zeelenberg, C. (1979). A single scan algorithm for QRS detection and feature extraction. IEEE Comp. in Cardiology, vol. 6, pp. 37-42.

  • A. Lourenco, H. Silva, P. Leite, R. Lourenco and A. Fred, “Real Time Electrocardiogram Segmentation for Finger Based ECG Biometrics”, BIOSIGNALS 2012, pp. 49-54, 2012.

ecg_phase(ecg_cleaned, rpeaks=None, delineate_info=None, sampling_rate=None)[source]

Compute cardiac phase (for both atrial and ventricular).

Finds the cardiac phase, labelled as 1 for systole and 0 for diastole.

Parameters
  • ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().

  • rpeaks (list or array or DataFrame or Series or dict) – The samples at which the different ECG peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with ecg_findpeaks() or ecg_peaks().

  • delineate_info (dict) – A dictionary containing additional information on ECG delineation, as obtained with ecg_delineate().

  • sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to None.

Returns

signals (DataFrame) – A DataFrame of same length as ecg_signal containing the following columns:

  • ”ECG_Phase_Atrial”: cardiac phase, marked by “1” for systole and “0” for diastole.

  • ”ECG_Phase_Completion_Atrial”: cardiac phase (atrial) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.

  • ”ECG_Phase_Ventricular”: cardiac phase, marked by “1” for systole and “0” for diastole.

  • ”ECG_Phase_Completion_Ventricular”: cardiac phase (ventricular) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.

Examples

>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000)
>>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000)
>>> _, rpeaks = nk.ecg_peaks(cleaned)
>>> signals, waves = nk.ecg_delineate(cleaned, rpeaks, sampling_rate=1000)
>>>
>>> cardiac_phase = nk.ecg_phase(ecg_cleaned=cleaned, rpeaks=rpeaks,
...                              delineate_info=waves, sampling_rate=1000)
>>> nk.signal_plot([cleaned, cardiac_phase], standardize=True) 

ecg_plot(ecg_signals, rpeaks=None, sampling_rate=None, show_type='default')[source]

Visualize ECG data.

Parameters
  • ecg_signals (DataFrame) – DataFrame obtained from ecg_process().

  • rpeaks (dict) – The samples at which the R-peaks occur. Dict returned by ecg_process(). Defaults to None.

  • sampling_rate (int) – The sampling frequency of the ECG (in Hz, i.e., samples/second). Needs to be supplied if the data should be plotted over time in seconds. Otherwise the data is plotted over samples. Defaults to None. Must be specified to plot artifacts.

  • show_type (str) – Visualize the ECG data with ‘default’, visualize artifact thresholds (produced by ecg_fixpeaks()) with ‘artifacts’, or visualize both with ‘full’.

Returns

fig – Figure representing a plot of the processed ecg signals (and peak artifacts).

Examples

>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=15, sampling_rate=1000, heart_rate=80)
>>> signals, info = nk.ecg_process(ecg, sampling_rate=1000)
>>> nk.ecg_plot(signals, sampling_rate=1000, show_type='default') 
<Figure ...>

See also

ecg_process()

ecg_process(ecg_signal, sampling_rate=1000, method='neurokit')[source]

Process an ECG signal.

Convenience function that automatically processes an ECG signal.

Parameters
  • ecg_signal (Union[list, np.array, pd.Series]) – The raw ECG channel.

  • sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.

  • method (str) – The processing pipeline to apply. Defaults to “neurokit”.

Returns

  • signals (DataFrame) – A DataFrame of the same length as the ecg_signal containing the following columns:

    • ”ECG_Raw”: the raw signal.

    • ”ECG_Clean”: the cleaned signal.

    • ”ECG_R_Peaks”: the R-peaks marked as “1” in a list of zeros.

    • ”ECG_Rate”: heart rate interpolated between R-peaks.

    • ”ECG_P_Peaks”: the P-peaks marked as “1” in a list of zeros.

    • ”ECG_Q_Peaks”: the Q-peaks marked as “1” in a list of zeros.

    • ”ECG_S_Peaks”: the S-peaks marked as “1” in a list of zeros.

    • ”ECG_T_Peaks”: the T-peaks marked as “1” in a list of zeros.

    • ”ECG_P_Onsets”: the P-onsets marked as “1” in a list of zeros.

    • ”ECG_P_Offsets”: the P-offsets marked as “1” in a list of zeros (only when the method in ecg_delineate is wavelet).

    • ”ECG_T_Onsets”: the T-onsets marked as “1” in a list of zeros (only when the method in ecg_delineate is wavelet).

    • ”ECG_T_Offsets”: the T-offsets marked as “1” in a list of zeros.

    • ”ECG_R_Onsets”: the R-onsets marked as “1” in a list of zeros (only when the method in ecg_delineate is wavelet).

    • ”ECG_R_Offsets”: the R-offsets marked as “1” in a list of zeros (only when the method in ecg_delineate is wavelet).

    • ”ECG_Phase_Atrial”: cardiac phase, marked by “1” for systole and “0” for diastole.

    • ”ECG_Phase_Ventricular”: cardiac phase, marked by “1” for systole and “0” for diastole.

    • ”ECG_Phase_Completion_Atrial”: cardiac phase (atrial) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.

    • ”ECG_Phase_Completion_Ventricular”: cardiac phase (ventricular) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.

  • info (dict) – A dictionary containing the samples at which the R-peaks occur, accessible with the key “ECG_R_Peaks”.

See also

ecg_clean(), ecg_findpeaks(), ecg_plot(), signal_rate(), signal_fixpeaks()

Examples

>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=15, sampling_rate=1000, heart_rate=80)
>>> signals, info = nk.ecg_process(ecg, sampling_rate=1000)
>>> nk.ecg_plot(signals) 
<Figure ...>

ecg_quality(ecg_cleaned, rpeaks=None, sampling_rate=1000)[source]

Quality of ECG Signal.

Compute a continuous index of the quality of the ECG signal, by interpolating the distance of each QRS segment from the average QRS segment present in the data. This index is therefore relative: 1 corresponds to heartbeats that are the closest to the average sample, and 0 corresponds to the most distant heartbeat from that average sample (a simplified sketch follows the Examples below).

Returns

array – Vector containing the quality index ranging from 0 to 1.

Examples

>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=30, sampling_rate=300, noise=0.2)
>>> ecg_cleaned = nk.ecg_clean(ecg, sampling_rate=300)
>>> quality = nk.ecg_quality(ecg_cleaned, sampling_rate=300)
>>>
>>> nk.signal_plot([ecg_cleaned, quality], standardize=True)
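
The underlying idea can be sketched as follows (a simplified illustration with hypothetical toy segments, not the exact implementation): heartbeats are compared to their average, and the rescaled similarity is what the index expresses.

>>> import numpy as np
>>>
>>> beats = np.array([[0.0, 1.0, 0.1], [0.0, 0.9, 0.2], [0.5, 0.2, 0.9]])  # toy heartbeat segments
>>> template = beats.mean(axis=0)                       # average QRS segment
>>> dist = np.linalg.norm(beats - template, axis=1)     # distance of each beat from the average
>>> quality = 1 - (dist - dist.min()) / (dist.max() - dist.min())  # rescaled: 1 = closest to average
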
ecg_rate(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')

Calculate signal rate from a series of peaks.

This function can also be called via ecg_rate(), ppg_rate() or rsp_rate() (aliases provided for consistency). A minimal sketch of the underlying period-to-rate conversion follows the Examples below.

Parameters
  • peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed it is of the same length as the input signal in which occurrences of R-peaks are marked as “1”, with such containers obtained with e.g., ecg_findpeaks() or rsp_findpeaks().

  • sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.

  • desired_length (int) – If left at the default None, the returned rate will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned rate will be interpolated between peaks over desired_length samples. To interpolate the rate over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs in the signal. Defaults to None.

  • interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.

Returns

array – A vector containing the rate.

See also

signal_findpeaks(), signal_fixpeaks(), signal_plot()

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1)
>>> info = nk.signal_findpeaks(signal)
>>>
>>> rate = nk.signal_rate(peaks=info["Peaks"], desired_length=len(signal))
>>> fig = nk.signal_plot(rate)
>>> fig 
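
A minimal sketch of the underlying period-to-rate conversion, assuming peaks are given as sample indices (illustrative, not necessarily the exact implementation):

>>> import numpy as np
>>>
>>> peaks = np.array([500, 1500, 2450, 3400])  # hypothetical peak locations (samples)
>>> periods = np.diff(peaks) / 1000            # inter-peak intervals in seconds (sampling_rate=1000)
>>> rate = 60 / periods                        # instantaneous rate in beats per minute
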
ecg_rsa(ecg_signals, rsp_signals=None, rpeaks=None, sampling_rate=1000, continuous=False)[source]

Respiratory Sinus Arrhythmia (RSA)

Respiratory sinus arrhythmia (RSA), also referred to as ‘cardiac coherence’, is the naturally occurring variation in heart rate during the breathing cycle. Metrics to quantify it are often used as a measure of parasympathetic nervous system activity. Neurophysiology informs us that the functional output of the myelinated vagus originating from the nucleus ambiguus has a respiratory rhythm. Thus, there would be a temporal relation between the respiratory rhythm being expressed in the firing of these efferent pathways and the functional effect on the heart rate rhythm manifested as RSA. Importantly, several methods exist to quantify RSA:

  • The Peak-to-trough (P2T) algorithm measures the statistical range in milliseconds of the heart period oscillation associated with synchronous respiration. Operationally, subtracting the shortest heart period during inspiration from the longest heart period during a breath cycle produces an estimate of RSA during each breath (see the sketch after this list). The peak-to-trough method makes no statistical assumption or correction (e.g., adaptive filtering) regarding other sources of variance in the heart period time series that may confound, distort, or interact with the metric, such as slower periodicities and baseline trend. Although it has been proposed that the P2T method “acts as a time-domain filter dynamically centered at the exact ongoing respiratory frequency” (Grossman, 1992), the method does not transform the time series in any way, as a filtering process would. Instead, the method uses knowledge of the ongoing respiratory cycle to associate segments of the heart period time series with either inhalation or exhalation (Lewis, 2012).

  • The Porges-Bohrer (PB) algorithm assumes that heart period time series reflect the sum of several component time series. Each of these component time series may be mediated by different neural mechanisms and may have different statistical features. The Porges-Bohrer method applies an algorithm that selectively extracts RSA, even when the periodic process representing RSA is superimposed on a complex baseline that may include aperiodic and slow periodic processes. Since the method is designed to remove sources of variance in the heart period time series other than the variance within the frequency band of spontaneous breathing, it is capable of accurately quantifying RSA when the signal-to-noise ratio is low.
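
A minimal sketch of the P2T operation described above, assuming heart periods (in ms) already grouped by breath cycle (illustrative only, not NeuroKit's exact code path):

>>> # Toy heart periods (ms) for two breath cycles
>>> breaths = [[850, 900, 980, 920], [860, 940, 1005, 910]]
>>> [max(cycle) - min(cycle) for cycle in breaths]  # P2T estimate per breath
[130, 145]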

Parameters
  • ecg_signals (DataFrame) – DataFrame obtained from ecg_process(). Should contain columns ECG_Rate and ECG_R_Peaks. Can also take a DataFrame comprising both ECG and RSP signals, generated by bio_process().

  • rsp_signals (DataFrame) – DataFrame obtained from rsp_process(). Should contain columns RSP_Phase and RSP_PhaseCompletion. Has no impact when a DataFrame comprising both ECG and RSP signals is passed as ecg_signals. Defaults to None.

  • rpeaks (dict) – The samples at which the R-peaks of the ECG signal occur. Dict returned by ecg_peaks(), ecg_process(), or bio_process(). Defaults to None.

  • sampling_rate (int) – The sampling frequency of signals (in Hz, i.e., samples/second).

  • continuous (bool) – If False, will return RSA properties computed from the data (one value per index). If True, will return continuous estimations of RSA of the same length as the signal. See below for more details.

Returns

rsa (dict) – A dictionary containing the RSA features, which includes:

  • “RSA_P2T_Values”: the estimate of RSA during each breath cycle, produced by subtracting the shortest heart period (or RR interval) from the longest heart period in ms.

  • “RSA_P2T_Mean”: the mean peak-to-trough across all cycles in ms.

  • “RSA_P2T_Mean_log”: the logarithm of the mean of RSA estimates.

  • “RSA_P2T_SD”: the standard deviation of all RSA estimates.

  • “RSA_P2T_NoRSA”: the number of breath cycles from which RSA could not be calculated.

  • “RSA_PorgesBohrer”: the Porges-Bohrer estimate of RSA, optimal when the signal-to-noise ratio is low, in ln(ms^2).

Example

>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data
>>> ecg_signals, info = nk.ecg_process(data["ECG"], sampling_rate=100)
>>> rsp_signals, _ = nk.rsp_process(data["RSP"], sampling_rate=100)
>>>
>>> # Get RSA features
>>> nk.ecg_rsa(ecg_signals, rsp_signals, info, sampling_rate=100, continuous=False) 
{'RSA_P2T_Mean': ...,
 'RSA_P2T_Mean_log': ...,
 'RSA_P2T_SD': ...,
 'RSA_P2T_NoRSA': ...,
 'RSA_PorgesBohrer': ...}
>>>
>>> # Get RSA as a continuous signal
>>> rsa = nk.ecg_rsa(ecg_signals, rsp_signals, info, sampling_rate=100, continuous=True)
>>> rsa 
        RSA_P2T
0      0.09
1      0.09
2      0.09
...    ...

[15000 rows x 1 columns]

>>> nk.signal_plot([ecg_signals["ECG_Rate"], rsp_signals["RSP_Rate"], rsa], standardize=True)

References

  • Servant, D., Logier, R., Mouster, Y., & Goudemand, M. (2009). La variabilité de la fréquence cardiaque. Intérêts en psychiatrie. L’Encéphale, 35(5), 423–428. doi:10.1016/j.encep.2008.06.016

  • Lewis, G. F., Furman, S. A., McCool, M. F., & Porges, S. W. (2012). Statistical strategies to quantify respiratory sinus arrhythmia: Are commonly used metrics equivalent?. Biological psychology, 89(2), 349-364.

  • Zohar, A. H., Cloninger, C. R., & McCraty, R. (2013). Personality and heart rate variability: exploring pathways from personality to cardiac coherence and health. Open Journal of Social Sciences, 1(06), 32.

ecg_rsp(ecg_rate, sampling_rate=1000, method='vangent2019')[source]

Extract ECG Derived Respiration (EDR).

This implementation is far from being complete, as the information in the related papers prevents me from getting a full understanding of the procedure. Help is required!

Parameters
  • ecg_rate (array) – The heart rate signal as obtained via ecg_rate().

  • sampling_rate (int) – The sampling frequency of the signal that contains the R-peaks (in Hz, i.e., samples/second). Defaults to 1000Hz.

  • method (str) – Can be one of ‘vangent2019’ (default), ‘soni2019’, ‘charlton2016’ or ‘sarkar2015’.

Returns

array – A Numpy array containing the ECG-derived respiration (EDR) signal.

Examples

>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> # Get heart rate
>>> data = nk.data("bio_eventrelated_100hz")
>>> rpeaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
>>> ecg_rate = nk.signal_rate(rpeaks, sampling_rate=100, desired_length=len(rpeaks))
>>>
>>>
>>> # Get ECG Derived Respiration (EDR)
>>> edr = nk.ecg_rsp(ecg_rate, sampling_rate=100)
>>> nk.standardize(pd.DataFrame({"EDR": edr, "RSP": data["RSP"]})).plot() 
<matplotlib.axes._subplots.AxesSubplot object at ...>
>>>
>>> # Method comparison (the closer to 0 the better)
>>> nk.standardize(pd.DataFrame({"True RSP": data["RSP"],
...                              "vangent2019": nk.ecg_rsp(ecg_rate, sampling_rate=100, method="vangent2019"),
...                              "sarkar2015": nk.ecg_rsp(ecg_rate, sampling_rate=100, method="sarkar2015"),
...                              "charlton2016": nk.ecg_rsp(ecg_rate, sampling_rate=100, method="charlton2016"),
...                              "soni2019": nk.ecg_rsp(ecg_rate, sampling_rate=100,
...                                                     method="soni2019")})).plot() 
<matplotlib.axes._subplots.AxesSubplot object at ...>

References

  • van Gent, P., Farah, H., van Nes, N., & van Arem, B. (2019). HeartPy: A novel heart rate algorithm for the analysis of noisy signals. Transportation research part F: traffic psychology and behaviour, 66, 368-378.

  • Sarkar, S., Bhattacherjee, S., & Pal, S. (2015). Extraction of respiration signal from ECG for respiratory rate estimation.

  • Charlton, P. H., Bonnici, T., Tarassenko, L., Clifton, D. A., Beale, R., & Watkinson, P. J. (2016). An assessment of algorithms to estimate respiratory rate from the electrocardiogram and photoplethysmogram. Physiological measurement, 37(4), 610.

  • Soni, R., & Muniyandi, M. (2019). Breath rate variability: a novel measure to study the meditation effects. International Journal of Yoga, 12(1), 45.

ecg_segment(ecg_cleaned, rpeaks=None, sampling_rate=1000, show=False)[source]

Segment an ECG signal into single heartbeats.

Parameters
  • ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().

  • rpeaks (dict) – The samples at which the R-peaks occur. Dict returned by ecg_peaks(). Defaults to None.

  • sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.

  • show (bool) – If True, will return a plot of heartbeats. Defaults to False.

Returns

dict – A dict containing DataFrames for all segmented heartbeats.

Examples

>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=15, sampling_rate=1000, heart_rate=80)
>>> ecg_cleaned = nk.ecg_clean(ecg, sampling_rate=1000)
>>> nk.ecg_segment(ecg_cleaned, rpeaks=None, sampling_rate=1000, show=True) 
{'1':              Signal  Index Label
 ...
 '2':              Signal  Index Label
 ...
 '19':              Signal  Index Label
 ...}

ecg_simulate(duration=10, length=None, sampling_rate=1000, noise=0.01, heart_rate=70, method='ecgsyn', random_state=None)[source]

Simulate an ECG/EKG signal.

Generate an artificial (synthetic) ECG signal of a given duration and sampling rate using either the ECGSYN dynamical model (McSharry et al., 2003) or a simpler model based on Daubechies wavelets to roughly approximate cardiac cycles.

Parameters
  • duration (int) – Desired recording length in seconds.

  • sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).

  • length (int) – The desired length of the signal (in samples).

  • noise (float) – Noise level (amplitude of the Laplace noise).

  • heart_rate (int) – Desired simulated heart rate (in beats per minute).

  • method (str) – The model used to generate the signal. Can be ‘simple’ for a simulation based on Daubechies wavelets that roughly approximates a single cardiac cycle. If ‘ecgsyn’ (default), will use an advanced model described in McSharry et al. (2003).

  • random_state (int) – Seed for the random number generator.

Returns

array – Vector containing the ECG signal.

Examples

>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> ecg1 = nk.ecg_simulate(duration=10, method="simple")
>>> ecg2 = nk.ecg_simulate(duration=10, method="ecgsyn")
>>> pd.DataFrame({"ECG_Simple": ecg1,
...               "ECG_Complex": ecg2}).plot(subplots=True) 
array([<matplotlib.axes._subplots.AxesSubplot object at ...>,
   <matplotlib.axes._subplots.AxesSubplot object at ...>],
  dtype=object)

See also

rsp_simulate(), eda_simulate(), ppg_simulate(), emg_simulate()

References

  • McSharry, P. E., Clifford, G. D., Tarassenko, L., & Smith, L. A. (2003). A dynamical model for generating synthetic electrocardiogram signals. IEEE transactions on biomedical engineering, 50(3), 289-294.

  • https://github.com/diarmaidocualain/ecg_simulation

PPG

Submodule for NeuroKit.

ppg_clean(ppg_signal, sampling_rate=1000, method='elgendi')[source]

Clean a photoplethysmogram (PPG) signal.

Prepare a raw PPG signal for systolic peak detection.

Parameters
  • ppg_signal (Union[list, np.array, pd.Series]) – The raw PPG channel.

  • sampling_rate (int) – The sampling frequency of the PPG (in Hz, i.e., samples/second). The default is 1000.

  • method (str) – The processing pipeline to apply. Can be one of “elgendi”. The default is “elgendi”.

Returns

clean (array) – A vector containing the cleaned PPG.

Examples

>>> import neurokit2 as nk
>>> import matplotlib.pyplot as plt
>>>
>>> ppg = nk.ppg_simulate(heart_rate=75, duration=30)
>>> ppg_clean = nk.ppg_clean(ppg)
>>>
>>> plt.plot(ppg, label="raw PPG") 
>>> plt.plot(ppg_clean, label="clean PPG") 
>>> plt.legend() 

ppg_findpeaks(ppg_cleaned, sampling_rate=1000, method='elgendi', show=False)[source]

Find systolic peaks in a photoplethysmogram (PPG) signal.

Parameters
  • ppg_cleaned (Union[list, np.array, pd.Series]) – The cleaned PPG channel as returned by ppg_clean().

  • sampling_rate (int) – The sampling frequency of the PPG (in Hz, i.e., samples/second). The default is 1000.

  • method (str) – The processing pipeline to apply. Can be one of “elgendi”. The default is “elgendi”.

  • show (bool) – If True, returns a plot of the thresholds used during peak detection. Useful for debugging. The default is False.

Returns

info (dict) – A dictionary containing additional information, in this case the samples at which systolic peaks occur, accessible with the key “PPG_Peaks”.

Examples

>>> import neurokit2 as nk
>>> import matplotlib.pyplot as plt
>>>
>>> ppg = nk.ppg_simulate(heart_rate=75, duration=30)
>>> ppg_clean = nk.ppg_clean(ppg)
>>> info = nk.ppg_findpeaks(ppg_clean)
>>> peaks = info["PPG_Peaks"]
>>>
>>> plt.plot(ppg, label="raw PPG") 
>>> plt.plot(ppg_clean, label="clean PPG") 
>>> plt.scatter(peaks, ppg[peaks], c="r", label="systolic peaks") 
>>> plt.legend() 

References

  • Elgendi M, Norton I, Brearley M, Abbott D, Schuurmans D (2013). Systolic Peak Detection in Acceleration Photoplethysmograms Measured from Emergency Responders in Tropical Conditions. PLoS ONE 8(10): e76585. doi:10.1371/journal.pone.0076585.

ppg_plot(ppg_signals, sampling_rate=None)[source]

Visualize photoplethysmogram (PPG) data.

Parameters
  • ppg_signals (DataFrame) – DataFrame obtained from ppg_process().

  • sampling_rate (int) – The sampling frequency of the PPG (in Hz, i.e., samples/second). Needs to be supplied if the data should be plotted over time in seconds. Otherwise the data is plotted over samples. Defaults to None.

Returns

fig – Figure representing a plot of the processed PPG signals.

Examples

>>> import neurokit2 as nk
>>>
>>> # Simulate data
>>> ppg = nk.ppg_simulate(duration=10, sampling_rate=1000, heart_rate=70)
>>>
>>> # Process signal
>>> signals, info = nk.ppg_process(ppg, sampling_rate=1000)
>>>
>>> # Plot
>>> nk.ppg_plot(signals) 
<Figure ...>

See also

ppg_process()

ppg_process(ppg_signal, sampling_rate=1000, **kwargs)[source]

Process a photoplethysmogram (PPG) signal.

Convenience function that automatically processes a photoplethysmogram (PPG) signal.

Parameters
  • ppg_signal (Union[list, np.array, pd.Series]) – The raw PPG channel.

  • sampling_rate (int) – The sampling frequency of ppg_signal (in Hz, i.e., samples/second).

Returns

  • signals (DataFrame) – A DataFrame of same length as ppg_signal containing the following columns:

    • ”PPG_Raw”: the raw signal.

    • ”PPG_Clean”: the cleaned signal.

    • ”PPG_Rate”: the heart rate as measured based on PPG peaks.

    • ”PPG_Peaks”: the PPG peaks marked as “1” in a list of zeros.

  • info (dict) – A dictionary containing the information of peaks.

Examples

>>> import neurokit2 as nk
>>>
>>> ppg = nk.ppg_simulate(duration=10, sampling_rate=1000, heart_rate=70)
>>> signals, info = nk.ppg_process(ppg, sampling_rate=1000)
>>> fig = nk.ppg_plot(signals)
>>> fig 

ppg_rate(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')

Calculate signal rate from a series of peaks.

This function can also be called via ecg_rate(), ppg_rate() or rsp_rate() (aliases provided for consistency).

Parameters
  • peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed it is of the same length as the input signal in which occurrences of R-peaks are marked as “1”, with such containers obtained with e.g., ecg_findpeaks() or rsp_findpeaks().

  • sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.

  • desired_length (int) – If left at the default None, the returned rate will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned rate will be interpolated between peaks over desired_length samples. To interpolate the rate over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs in the signal. Defaults to None.

  • interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.

Returns

array – A vector containing the rate.

See also

signal_findpeaks(), signal_fixpeaks(), signal_plot()

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1)
>>> info = nk.signal_findpeaks(signal)
>>>
>>> rate = nk.signal_rate(peaks=info["Peaks"], desired_length=len(signal))
>>> fig = nk.signal_plot(rate)
>>> fig 

ppg_simulate(duration=120, sampling_rate=1000, heart_rate=70, frequency_modulation=0.3, ibi_randomness=0.1, drift=0, motion_amplitude=0.1, powerline_amplitude=0.01, burst_number=0, burst_amplitude=1, random_state=None, show=False)[source]

Simulate a photoplethysmogram (PPG) signal.

Phenomenological approximation of PPG. The PPG wave is described with four landmarks: wave onset, location of the systolic peak, location of the dicrotic notch and location of the diastolic peaks. These landmarks are defined as x and y coordinates (in a time series). These coordinates are then interpolated at the desired sampling rate to obtain the PPG signal.

Parameters
  • duration (int) – Desired recording length in seconds. The default is 120.

  • sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second). The default is 1000.

  • heart_rate (int) – Desired simulated heart rate (in beats per minute). The default is 70.

  • frequency_modulation (float) – Float between 0 and 1. Determines how pronounced respiratory sinus arrythmia (RSA) is (0 corresponds to absence of RSA). The default is 0.3.

  • ibi_randomness (float) – Float between 0 and 1. Determines how much random noise there is in the duration of each PPG wave (0 corresponds to absence of variation). The default is 0.1.

  • drift (float) – Float between 0 and 1. Determines how pronounced the baseline drift (.05 Hz) is (0 corresponds to absence of baseline drift). The default is 0.

  • motion_amplitude (float) – Float between 0 and 1. Determines how pronounced the motion artifact (0.5 Hz) is (0 corresponds to absence of motion artifact). The default is 0.1.

  • powerline_amplitude (float) – Float between 0 and 1. Determines how pronounced the powerline artifact (50 Hz) is (0 corresponds to absence of powerline artifact). Note that powerline_amplitude > 0 is only possible if ‘sampling_rate’ is >= 500. The default is 0.01.

  • burst_amplitude (float) – Float between 0 and 1. Determines how pronounced high frequency burst artifacts are (0 corresponds to absence of bursts). The default is 1.

  • burst_number (int) – Determines how many high frequency burst artifacts occur. The default is 0.

  • show (bool) – If true, returns a plot of the landmarks and interpolated PPG. Useful for debugging.

  • random_state (int) – Seed for the random number generator. Keep it fixed for reproducible results.

Returns

ppg (array) – A vector containing the PPG.

See also

ecg_simulate(), rsp_simulate(), eda_simulate(), emg_simulate()

Examples

>>> import neurokit2 as nk
>>>
>>> ppg = nk.ppg_simulate(duration=40, sampling_rate=500, heart_rate=75, random_state=42)
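
The simulated signal can be fed straight into the PPG pipeline documented above, for instance:

>>> signals, info = nk.ppg_process(ppg, sampling_rate=500)
>>> nk.ppg_plot(signals) 
<Figure ...>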

HRV

hrv(peaks, sampling_rate=1000, show=False)[source]

Computes indices of Heart Rate Variability (HRV).

Computes HRV indices in the time-, frequency-, and nonlinear domain. Note that a minimum duration of the signal containing the peaks is recommended for some HRV indices to be meaningful. For instance, 1, 2 and 5 minutes of high quality signal are the recommended minima for HF, LF and LF/HF, respectively. See references for details.

Parameters
  • peaks (dict) – Samples at which cardiac extrema (i.e., R-peaks, systolic peaks) occur. Dictionary returned by ecg_findpeaks, ecg_peaks, ppg_findpeaks, or ppg_peaks.

  • sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.

  • show (bool, optional) – If True, returns the plots that are generated for each of the domains.

Returns

DataFrame – Contains HRV metrics from three domains: frequency (see hrv_frequency()), time (see hrv_time()) and nonlinear (see hrv_nonlinear()).

See also

ecg_peaks(), ppg_peaks(), hrv_time(), hrv_frequency(), hrv_nonlinear()

Examples

>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Find peaks
>>> peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
>>>
>>> # Compute HRV indices
>>> hrv_indices = nk.hrv(peaks, sampling_rate=100, show=True)
>>> hrv_indices 

References

  • Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac electrophysiology review, 6(3), 239-244.

  • Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms. Frontiers in public health, 5, 258.

hrv_frequency(peaks, sampling_rate=1000, ulf=(0, 0.0033), vlf=(0.0033, 0.04), lf=(0.04, 0.15), hf=(0.15, 0.4), vhf=(0.4, 0.5), psd_method='welch', show=False, silent=True, **kwargs)[source]

Computes frequency-domain indices of Heart Rate Variability (HRV).

Note that a minimum duration of the signal containing the peaks is recommended for some HRV indices to be meaningful. For instance, 1, 2 and 5 minutes of high quality signal are the recommended minima for HF, LF and LF/HF, respectively. See references for details.

Parameters
  • peaks (dict) – Samples at which cardiac extrema (i.e., R-peaks, systolic peaks) occur. Dictionary returned by ecg_findpeaks, ecg_peaks, ppg_findpeaks, or ppg_peaks.

  • sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.

  • ulf (tuple, optional) – Upper and lower limit of the ultra-low frequency band. By default (0, 0.0033).

  • vlf (tuple, optional) – Upper and lower limit of the very-low frequency band. By default (0.0033, 0.04).

  • lf (tuple, optional) – Upper and lower limit of the low frequency band. By default (0.04, 0.15).

  • hf (tuple, optional) – Upper and lower limit of the high frequency band. By default (0.15, 0.4).

  • vhf (tuple, optional) – Upper and lower limit of the very-high frequency band. By default (0.4, 0.5).

  • psd_method (str) – Method used for spectral density estimation. For details see signal.signal_power. By default “welch”.

  • silent (bool) – If False, warnings will be printed. Default to True.

  • show (bool) – If True, will plot the power in the different frequency bands.

  • **kwargs (optional) – Other arguments.

Returns

DataFrame – Contains frequency domain HRV metrics:

  • ULF: The spectral power density pertaining to the ultra-low frequency band, i.e., .0 to .0033 Hz by default.

  • VLF: The spectral power density pertaining to the very-low frequency band, i.e., .0033 to .04 Hz by default.

  • LF: The spectral power density pertaining to the low frequency band, i.e., .04 to .15 Hz by default.

  • HF: The spectral power density pertaining to the high frequency band, i.e., .15 to .4 Hz by default.

  • VHF: The variability, or signal power, in the very high frequency band, i.e., .4 to .5 Hz by default.

  • LFn: The normalized low frequency, obtained by dividing the low frequency power by the total power.

  • HFn: The normalized high frequency, obtained by dividing the high frequency power by the total power.

  • LnHF: The log transformed HF.

See also

ecg_peaks(), ppg_peaks(), hrv_summary(), hrv_time(), hrv_nonlinear()

Examples

>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Find peaks
>>> peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
>>>
>>> # Compute HRV indices
>>> hrv = nk.hrv_frequency(peaks, sampling_rate=100, show=True)
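
Since the band limits are plain tuples, custom bands can be supplied directly, e.g., narrowing the high frequency band:

>>> hrv_custom = nk.hrv_frequency(peaks, sampling_rate=100, hf=(0.15, 0.3))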

References

  • Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac electrophysiology review, 6(3), 239-244.

  • Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms. Frontiers in public health, 5, 258.

hrv_nonlinear(peaks, sampling_rate=1000, show=False)[source]

Computes nonlinear indices of Heart Rate Variability (HRV).

See references for details.

Parameters
  • peaks (dict) – Samples at which cardiac extrema (i.e., R-peaks, systolic peaks) occur. Dictionary returned by ecg_findpeaks, ecg_peaks, ppg_findpeaks, or ppg_peaks.

  • sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.

  • show (bool, optional) – If True, will return a Poincaré plot, a scattergram, which plots each RR interval against the next successive one. The ellipse centers around the average RR interval. By default False.

Returns

DataFrame – Contains non-linear HRV metrics:

  • Characteristics of the Poincaré Plot Geometry (the core computations are sketched after this list):

    • SD1: SD1 is a measure of the spread of RR intervals on the Poincaré plot perpendicular to the line of identity. It is an index of short-term RR interval fluctuations, i.e., beat-to-beat variability. It is equivalent (although on another scale) to RMSSD, and therefore it is redundant to report correlations with both (Ciccone, 2017).

    • SD2: SD2 is a measure of the spread of RR intervals on the Poincaré plot along the line of identity. It is an index of long-term RR interval fluctuations.

    • SD1SD2: the ratio between short and long term fluctuations of the RR intervals (SD1 divided by SD2).

    • S: Area of the ellipse described by SD1 and SD2 (pi * SD1 * SD2). It is proportional to SD1SD2.

    • CSI: The Cardiac Sympathetic Index (Toichi, 1997), calculated by dividing the longitudinal variability of the Poincaré plot (4*SD2) by its transverse variability (4*SD1).

    • CVI: The Cardiac Vagal Index (Toichi, 1997), equal to the logarithm of the product of longitudinal (4*SD2) and transverse (4*SD1) variability.

    • CSI_Modified: The modified CSI (Jeppesen, 2014), obtained by dividing the square of the longitudinal variability by its transverse variability.

  • Indices of Heart Rate Asymmetry (HRA), i.e., asymmetry of the Poincaré plot (Yan, 2017):

    • GI: Guzik’s Index, defined as the distance of points above the line of identity (LI) to LI divided by the distance of all points in the Poincaré plot to LI, except those located on LI.

    • SI: Slope Index, defined as the phase angle of points above LI divided by the phase angle of all points in the Poincaré plot, except those located on LI.

    • AI: Area Index, defined as the cumulative area of the sectors corresponding to the points located above LI divided by the cumulative area of sectors corresponding to all points in the Poincaré plot, except those located on LI.

    • PI: Porta’s Index, defined as the number of points below LI divided by the total number of points in the Poincaré plot, except those located on LI.

    • SD1d and SD1a: short-term variance of contributions of decelerations (prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively (Piskorski, 2011).

    • C1d and C1a: the contributions of heart rate decelerations and accelerations to short-term HRV, respectively (Piskorski, 2011).

    • SD2d and SD2a: long-term variance of contributions of decelerations (prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively (Piskorski, 2011).

    • C2d and C2a: the contributions of heart rate decelerations and accelerations to long-term HRV, respectively (Piskorski, 2011).

    • SDNNd and SDNNa: total variance of contributions of decelerations (prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively (Piskorski, 2011).

    • Cd and Ca: the total contributions of heart rate decelerations and accelerations to HRV.

  • Indices of Heart Rate Fragmentation (Costa, 2017):

    • PIP: Percentage of inflection points in the RR interval series.

    • IALS: Inverse of the average length of the acceleration/deceleration segments.

    • PSS: Percentage of short segments.

    • PAS: Percentage of NN intervals in alternation segments.

  • Indices of Complexity:

    • ApEn: The approximate entropy measure of HRV, calculated by entropy_approximate().

    • SampEn: The sample entropy measure of HRV, calculated by entropy_sample().
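
For intuition, the geometric indices above can be computed directly from a vector of RR intervals. The following is a minimal sketch (the rr values are illustrative; this is not NeuroKit's internal implementation):

>>> import numpy as np
>>>
>>> rr = np.array([800., 810., 790., 805., 795., 815.])  # RR intervals in ms (illustrative)
>>> diff_rr = np.diff(rr)
>>> # SD1: spread perpendicular to the line of identity (short-term variability)
>>> sd1 = np.sqrt(np.var(diff_rr, ddof=1) / 2)
>>> # SD2: spread along the line of identity (long-term variability)
>>> sd2 = np.sqrt(2 * np.var(rr, ddof=1) - np.var(diff_rr, ddof=1) / 2)
>>> s = np.pi * sd1 * sd2           # area of the fitted ellipse
>>> csi = sd2 / sd1                 # (4*SD2) / (4*SD1) simplifies to SD2/SD1
>>> cvi = np.log10(16 * sd1 * sd2)  # log of (4*SD2) * (4*SD1)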

See also

ecg_peaks(), ppg_peaks(), hrv_frequency(), hrv_time(), hrv_summary()

Examples

>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Find peaks
>>> peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
>>>
>>> # Compute HRV indices
>>> hrv = nk.hrv_nonlinear(peaks, sampling_rate=100, show=True)
>>> hrv 

References

  • Yan, C., Li, P., Ji, L., Yao, L., Karmakar, C., & Liu, C. (2017). Area asymmetry of heart rate variability signal. Biomedical Engineering Online, 16(1), 112.

  • Ciccone, A. B., Siedlik, J. A., Wecht, J. M., Deckert, J. A., Nguyen, N. D., & Weir, J. P. (2017). Reminder: RMSSD and SD1 are identical heart rate variability metrics. Muscle & Nerve, 56(4), 674-678.

  • Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms. Frontiers in Public Health, 5, 258.

  • Costa, M. D., Davis, R. B., & Goldberger, A. L. (2017). Heart rate fragmentation: a new approach to the analysis of cardiac interbeat interval dynamics. Frontiers in Physiology, 8, 255.

  • Jeppesen, J., Beniczky, S., Johansen, P., Sidenius, P., & Fuglsang-Frederiksen, A. (2014). Using Lorenz plot and Cardiac Sympathetic Index of heart rate variability for detecting seizures for patients with epilepsy. In 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 4563-4566). IEEE.

  • Piskorski, J., & Guzik, P. (2011). Asymmetric properties of long-term and total heart rate variability. Medical & Biological Engineering & Computing, 49(11), 1289-1297.

  • Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac Electrophysiology Review, 6(3), 239-244.

  • Brennan, M. et al. (2001). Do existing measures of Poincaré plot geometry reflect nonlinear features of heart rate variability? IEEE Transactions on Biomedical Engineering, 48(11), 1342-1347.

  • Toichi, M., Sugiura, T., Murai, T., & Sengoku, A. (1997). A new method of assessing cardiac autonomic function and its comparison with spectral analysis and coefficient of variation of R–R interval. Journal of the Autonomic Nervous System, 62(1-2), 79-84.

hrv_time(peaks, sampling_rate=1000, show=False)[source]

Computes time-domain indices of Heart Rate Variability (HRV).

See references for details.

Parameters
  • peaks (dict) – Samples at which cardiac extrema (i.e., R-peaks, systolic peaks) occur. Dictionary returned by ecg_findpeaks, ecg_peaks, ppg_findpeaks, or ppg_peaks.

  • sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.

  • show (bool) – If True, will plot the distribution of R-R intervals.

Returns

DataFrame – Contains time-domain HRV metrics (the core definitions are sketched after this list):

  • RMSSD: The square root of the mean of the sum of successive differences between adjacent RR intervals. It is equivalent (although on another scale) to SD1, and therefore it is redundant to report correlations with both (Ciccone, 2017).

  • MeanNN: The mean of the RR intervals.

  • SDNN: The standard deviation of the RR intervals.

  • SDSD: The standard deviation of the successive differences between RR intervals.

  • CVNN: The standard deviation of the RR intervals (SDNN) divided by the mean of the RR intervals (MeanNN).

  • CVSD: The root mean square of the sum of successive differences (RMSSD) divided by the mean of the RR intervals (MeanNN).

  • MedianNN: The median of the RR intervals.

  • MadNN: The median absolute deviation of the RR intervals.

  • HCVNN: The median absolute deviation of the RR intervals (MadNN) divided by the median of the RR intervals (MedianNN).

  • IQRNN: The interquartile range (IQR) of the RR intervals.

  • pNN50: The proportion of successive differences between RR intervals greater than 50 ms, out of the total number of RR intervals.

  • pNN20: The proportion of successive differences between RR intervals greater than 20 ms, out of the total number of RR intervals.

  • TINN: A geometrical parameter of the HRV, or more specifically, the baseline width of the RR interval distribution obtained by triangular interpolation, where the error of least squares determines the triangle. It is an approximation of the RR interval distribution.

  • HTI: The HRV triangular index, measuring the total number of RR intervals divided by the height of the RR interval histogram.
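
As a rough illustration of these definitions, the core indices can be sketched in a few lines of NumPy (illustrative values; not NeuroKit's internal implementation):

>>> import numpy as np
>>>
>>> rr = np.array([800., 810., 790., 805., 795., 815.])  # RR intervals in ms (illustrative)
>>> diff_rr = np.diff(rr)
>>> rmssd = np.sqrt(np.mean(diff_rr ** 2))       # RMSSD
>>> sdnn = np.std(rr, ddof=1)                    # SDNN
>>> sdsd = np.std(diff_rr, ddof=1)               # SDSD
>>> cvnn = sdnn / np.mean(rr)                    # CVNN
>>> pnn50 = np.mean(np.abs(diff_rr) > 50) * 100  # pNN50, in percent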

See also

ecg_peaks(), ppg_peaks(), hrv_frequency(), hrv_summary(), hrv_nonlinear()

Examples

>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Find peaks
>>> peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
>>>
>>> # Compute HRV indices
>>> hrv = nk.hrv_time(peaks, sampling_rate=100, show=True)

References

  • Ciccone, A. B., Siedlik, J. A., Wecht, J. M., Deckert, J. A., Nguyen, N. D., & Weir, J. P.

(2017). Reminder: RMSSD and SD1 are identical heart rate variability metrics. Muscle & nerve, 56(4), 674-678.

  • Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac

electrophysiology review, 6(3), 239-244.

  • Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms.

Frontiers in public health, 5, 258.

RSP

Submodule for NeuroKit.

rsp_amplitude(rsp_cleaned, peaks, troughs=None, interpolation_method='monotone_cubic')[source]

Compute respiratory amplitude.

Compute respiratory amplitude given the raw respiration signal and its extrema.

Parameters
  • rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().

  • peaks (list or array or DataFrame or Series or dict) – The samples at which the inhalation peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().

  • troughs (list or array or DataFrame or Series or dict) – The samples at which the inhalation troughs occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().

  • interpolation_method (str) – Method used to interpolate the amplitude between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.

Returns

array – A vector containing the respiratory amplitude.
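
The rationale for the ‘monotone_cubic’ default can be illustrated with SciPy: a PCHIP interpolator never overshoots the amplitude values measured at the extrema, whereas a cubic spline may. A minimal sketch (peak positions and amplitudes are illustrative):

>>> import numpy as np
>>> from scipy.interpolate import CubicSpline, PchipInterpolator
>>>
>>> peaks = np.array([100, 350, 700, 1200])              # sample indices of inhalation peaks
>>> amplitude_at_peaks = np.array([1.0, 0.4, 1.2, 0.9])  # amplitude measured at each peak
>>> x_new = np.arange(peaks[0], peaks[-1])
>>> monotone = PchipInterpolator(peaks, amplitude_at_peaks)(x_new)  # stays within the data
>>> spline = CubicSpline(peaks, amplitude_at_peaks)(x_new)          # may over-/undershoot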

See also

rsp_clean(), rsp_peaks(), signal_rate(), rsp_process(), rsp_plot()

Examples

>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15)
>>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000)
>>> peak_signal, info = nk.rsp_peaks(cleaned)
>>>
>>> amplitude = nk.rsp_amplitude(cleaned, peak_signal)
>>> fig = nk.signal_plot(pd.DataFrame({"RSP": rsp, "Amplitude": amplitude}), subplots=True)
>>> fig 
rsp_analyze(data, sampling_rate=1000, method='auto')[source]

Performs RSP analysis on either epochs (event-related analysis) or on longer periods of data such as resting-state data.

Parameters
  • data (dict or DataFrame) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by rsp_process() or bio_process(). Can also take a dict containing sets of separate periods of data.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.

  • method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’ where the right method will be chosen based on the mean duration of the data (‘event-related’ for duration under 10s).

Returns

DataFrame – A dataframe containing the analyzed RSP features. If event-related analysis is conducted, each epoch is indicated by the Label column. See rsp_eventrelated() and rsp_intervalrelated() docstrings for details.

See also

bio_process(), rsp_process(), epochs_create(), rsp_eventrelated(), rsp_intervalrelated()

Examples

>>> import neurokit2 as nk
>>> # Example 1: Download the data for event-related analysis
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data for event-related analysis
>>> df, info = nk.bio_process(rsp=data["RSP"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> nk.rsp_analyze(epochs, sampling_rate=100) 
>>>
>>> # Example 2: Download the resting-state data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.rsp_process(data["RSP"], sampling_rate=100)
>>>
>>> # Analyze
>>> nk.rsp_analyze(df, sampling_rate=100) 
rsp_clean(rsp_signal, sampling_rate=1000, method='khodadad2018')[source]

Preprocess a respiration (RSP) signal.

Clean a respiration signal using different sets of parameters, such as ‘khodadad2018’ (linear detrending followed by a fifth order 2Hz low-pass IIR Butterworth filter) or ‘biosppy’ (second order 0.1-0.35 Hz bandpass Butterworth filter followed by a constant detrending).

Parameters
  • rsp_signal (Union[list, np.array, pd.Series]) – The raw respiration channel (as measured, for instance, by a respiration belt).

  • sampling_rate (int) – The sampling frequency of rsp_signal (in Hz, i.e., samples/second).

  • method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.

Returns

array – Vector containing the cleaned respiratory signal.

Examples

>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=30, sampling_rate=50, noise=0.01)
>>> signals = pd.DataFrame({ "RSP_Raw": rsp,
...                         "RSP_Khodadad2018": nk.rsp_clean(rsp, sampling_rate=50, method="khodadad2018"),
...                         "RSP_BioSPPy": nk.rsp_clean(rsp, sampling_rate=50, method="biosppy")})
>>> fig = signals.plot()
>>> fig 

rsp_eventrelated(epochs, silent=False)[source]

Performs event-related RSP analysis on epochs.

Parameters
  • epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().

  • silent (bool) – If True, silence possible warnings.

Returns

DataFrame – A dataframe containing the analyzed RSP features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:

  • “RSP_Rate_Max”: the maximum respiratory rate after stimulus onset.

  • “RSP_Rate_Min”: the minimum respiratory rate after stimulus onset.

  • “RSP_Rate_Mean”: the mean respiratory rate after stimulus onset.

  • “RSP_Rate_Max_Time”: the time at which maximum respiratory rate occurs.

  • “RSP_Rate_Min_Time”: the time at which minimum respiratory rate occurs.

  • “RSP_Amplitude_Max”: the maximum respiratory amplitude after stimulus onset.

  • “RSP_Amplitude_Min”: the minimum respiratory amplitude after stimulus onset.

  • “RSP_Amplitude_Mean”: the mean respiratory amplitude after stimulus onset.

  • “RSP_Phase”: indication of whether the onset of the event concurs with respiratory inspiration (1) or expiration (0).

  • “RSP_PhaseCompletion”: indication of the stage of the current respiration phase (0 to 1) at the onset of the event.

See also

events_find(), epochs_create(), bio_process()

Examples

>>> import neurokit2 as nk
>>>
>>> # Example with simulated data
>>> rsp, info = nk.rsp_process(nk.rsp_simulate(duration=120))
>>> epochs = nk.epochs_create(rsp, events=[5000, 10000, 15000], epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> rsp1 = nk.rsp_eventrelated(epochs)
>>> rsp1 
>>>
>>> # Example with real data
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data
>>> df, info = nk.bio_process(rsp=data["RSP"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=2.9)
>>>
>>> # Analyze
>>> rsp2 = nk.rsp_eventrelated(epochs)
>>> rsp2 
rsp_findpeaks(rsp_cleaned, sampling_rate=1000, method='khodadad2018', amplitude_min=0.3)[source]

Extract extrema in a respiration (RSP) signal.

Low-level function used by rsp_peaks() to identify inhalation peaks and exhalation troughs in a preprocessed respiration signal using different sets of parameters. See rsp_peaks() for details.

Parameters
  • rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().

  • sampling_rate (int) – The sampling frequency of ‘rsp_cleaned’ (in Hz, i.e., samples/second).

  • method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.

  • amplitude_min (float) – Only applies if method is “khodadad2018”. Extrema that have a vertical distance smaller than (amplitude_min * average vertical distance) to any direct neighbour are removed as false positive outliers, i.e., amplitude_min should be a float with positive sign (the default is 0.3). Larger values of amplitude_min correspond to more conservative thresholds (i.e., more extrema removed as outliers).

Returns

info (dict) – A dictionary containing additional information, in this case the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs”, respectively.

Examples

>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15)
>>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000)
>>> info = nk.rsp_findpeaks(cleaned)
>>> fig = nk.events_plot([info["RSP_Peaks"], info["RSP_Troughs"]], cleaned)
>>> fig 
rsp_fixpeaks(peaks, troughs=None)[source]

Correct RSP peaks.

Low-level function used by rsp_peaks() to correct the peaks found by rsp_findpeaks(). Doesn’t do anything for now for RSP. See rsp_peaks() for details.

Parameters
  • peaks (list or array or DataFrame or Series or dict) – The samples at which the inhalation peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().

  • troughs (list or array or DataFrame or Series or dict) – The samples at which the inhalation troughs occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().

Returns

info (dict) – A dictionary containing additional information, in this case the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs”, respectively.

Examples

>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15)
>>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000)
>>> info = nk.rsp_findpeaks(cleaned)
>>> info = nk.rsp_fixpeaks(info)
>>> fig = nk.events_plot([info["RSP_Peaks"], info["RSP_Troughs"]], cleaned)
>>> fig 
rsp_intervalrelated(data, sampling_rate=1000)[source]

Performs RSP analysis on longer periods of data (typically > 10 seconds), such as resting-state data.

Parameters
  • data (DataFrame or dict) – A DataFrame containing the different processed signal(s) as different columns, typically generated by rsp_process() or bio_process(). Can also take a dict containing sets of separately processed DataFrames.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).

Returns

DataFrame – A dataframe containing the analyzed RSP features. The analyzed features consist of the following:

  • “RSP_Rate_Mean”: the mean respiratory rate.

  • “RSP_Amplitude_Mean”: the mean respiratory amplitude.

  • “RSP_RRV”: the different respiratory rate variability metrics. See rsp_rrv() docstrings for details.

See also

bio_process(), rsp_eventrelated()

Examples

>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.rsp_process(data["RSP"], sampling_rate=100)
>>> # Single dataframe is passed
>>> nk.rsp_intervalrelated(df) 
>>>
>>> epochs = nk.epochs_create(df, events=[0, 15000], sampling_rate=100, epochs_end=150)
>>> nk.rsp_intervalrelated(epochs) 
rsp_peaks(rsp_cleaned, sampling_rate=1000, method='khodadad2018', amplitude_min=0.3)[source]

Identify extrema in a respiration (RSP) signal.

This function calls rsp_findpeaks() and rsp_fixpeaks() to identify and process inhalation peaks and exhalation troughs in a preprocessed respiration signal using different sets of parameters, such as:

  • Khodadad et al. (2018) <https://iopscience.iop.org/article/10.1088/1361-6579/aad7e6/meta>

  • BioSPPy <https://github.com/PIA-Group/BioSPPy/blob/master/biosppy/signals/resp.py>

Parameters
  • rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().

  • sampling_rate (int) – The sampling frequency of ‘rsp_cleaned’ (in Hz, i.e., samples/second).

  • method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.

  • amplitude_min (float) – Only applies if method is “khodadad2018”. Extrema that have a vertical distance smaller than (amplitude_min * average vertical distance) to any direct neighbour are removed as false positive outliers, i.e., amplitude_min should be a float with positive sign (the default is 0.3). Larger values of amplitude_min correspond to more conservative thresholds (i.e., more extrema removed as outliers).

Returns

  • info (dict) – A dictionary containing additional information, in this case the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs”, respectively.

  • peak_signal (DataFrame) – A DataFrame of same length as the input signal in which occurrences of inhalation peaks and exhalation troughs are marked as “1” in lists of zeros with the same length as rsp_cleaned. Accessible with the keys “RSP_Peaks” and “RSP_Troughs” respectively.

Examples

>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15)
>>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000)
>>> peak_signal, info = nk.rsp_peaks(cleaned, sampling_rate=1000)
>>>
>>> data = pd.concat([pd.DataFrame({"RSP": rsp}), peak_signal], axis=1)
>>> fig = nk.signal_plot(data)
>>> fig 
rsp_phase(peaks, troughs=None, desired_length=None)[source]

Compute respiratory phase (inspiration and expiration).

Finds the respiratory phase, labelled as 1 for inspiration and 0 for expiration.

Parameters
  • peaks (list or array or DataFrame or Series or dict) – The samples at which the inhalation peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().

  • troughs (list or array or DataFrame or Series or dict) – The samples at which the inhalation troughs occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().

  • desired_length (int) – By default, the returned phase vector has the same number of elements as peaks. If set to an integer, the returned phase will be interpolated between peaks over desired_length samples. Has no effect if a DataFrame is passed in as the peaks argument.

Returns

signals (DataFrame) – A DataFrame of same length as rsp_signal containing the following columns:

  • “RSP_Inspiration”: breathing phase, marked by “1” for inspiration and “0” for expiration.

  • “RSP_Phase_Completion”: breathing phase completion, expressed in percentage (from 0 to 1), representing the stage of the current respiratory phase.
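
A minimal sketch of the underlying idea, assuming each trough marks the start of an inspiration that ends at the following peak (illustrative indices; not NeuroKit's internal implementation):

>>> import numpy as np
>>>
>>> troughs = np.array([0, 400, 800])    # starts of inspiration (illustrative)
>>> peaks = np.array([200, 600, 1000])   # ends of inspiration (illustrative)
>>> n = 1001
>>> inspiration = np.zeros(n)
>>> completion = np.zeros(n)
>>> for trough, peak in zip(troughs, peaks):
...     inspiration[trough:peak] = 1                                # rising limb
...     completion[trough:peak] = np.linspace(0, 1, peak - trough)  # phase progress 0 -> 1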

Examples

>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15)
>>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000)
>>> peak_signal, info = nk.rsp_peaks(cleaned)
>>>
>>> phase = nk.rsp_phase(peak_signal, desired_length=len(cleaned))
>>> fig = nk.signal_plot([rsp, phase], standardize=True)
>>> fig 
rsp_plot(rsp_signals, sampling_rate=None)[source]

Visualize respiration (RSP) data.

Parameters
  • rsp_signals (DataFrame) – DataFrame obtained from rsp_process().

  • sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).

Returns

fig – Figure representing a plot of the processed RSP signals.

See also

rsp_process()

Examples

>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15)
>>> rsp_signals, info = nk.rsp_process(rsp, sampling_rate=1000)
>>> fig = nk.rsp_plot(rsp_signals)
>>> fig 

rsp_process(rsp_signal, sampling_rate=1000, method='khodadad2018')[source]

Process a respiration (RSP) signal.

Convenience function that automatically processes a respiration signal with one of the following methods:

  • `Khodadad et al. (2018)

<https://iopscience.iop.org/article/10.1088/1361-6579/aad7e6/meta>`_

  • `BioSPPy

<https://github.com/PIA-Group/BioSPPy/blob/master/biosppy/signals/resp.py>`_

Parameters
  • rsp_signal (Union[list, np.array, pd.Series]) – The raw respiration channel (as measured, for instance, by a respiration belt).

  • sampling_rate (int) – The sampling frequency of rsp_signal (in Hz, i.e., samples/second).

  • method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.

Returns

  • signals (DataFrame) – A DataFrame of same length as rsp_signal containing the following columns:

    • “RSP_Raw”: the raw signal.

    • “RSP_Clean”: the cleaned signal.

    • “RSP_Peaks”: the inhalation peaks marked as “1” in a list of zeros.

    • “RSP_Troughs”: the exhalation troughs marked as “1” in a list of zeros.

    • “RSP_Rate”: breathing rate interpolated between inhalation peaks.

    • “RSP_Amplitude”: breathing amplitude interpolated between inhalation peaks.

    • “RSP_Phase”: breathing phase, marked by “1” for inspiration and “0” for expiration.

    • “RSP_PhaseCompletion”: breathing phase completion, expressed in percentage (from 0 to 1), representing the stage of the current respiratory phase.

  • info (dict) – A dictionary containing the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs”, respectively.

Examples

>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15)
>>> signals, info = nk.rsp_process(rsp, sampling_rate=1000)
>>> fig = nk.rsp_plot(signals)
>>> fig 
rsp_rate(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')

Calculate signal rate from a series of peaks.

This function can also be called via ecg_rate(), ppg_rate(), or rsp_rate() (aliases provided for consistency).

Parameters
  • peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed it is of the same length as the input signal in which occurrences of R-peaks are marked as “1”, with such containers obtained with e.g., ecg_findpeaks() or rsp_findpeaks().

  • sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.

  • desired_length (int) – If left at the default None, the returned rate will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned rate will be interpolated between peaks over desired_length samples. To interpolate the rate over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs in the signal. Defaults to None.

  • interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.

Returns

array – A vector containing the rate.
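
Conceptually, the rate is 60 divided by the period between consecutive peaks, optionally interpolated over the whole signal. A minimal sketch (illustrative peak positions; not the exact internal implementation):

>>> import numpy as np
>>> from scipy.interpolate import PchipInterpolator
>>>
>>> sampling_rate = 1000
>>> peaks = np.array([1000, 2000, 3100, 4050])  # sample indices (illustrative)
>>> periods = np.diff(peaks) / sampling_rate    # seconds between successive peaks
>>> rate_at_peaks = 60 / periods                # instantaneous rate, per minute
>>> # Interpolate the rate over the signal duration (monotone, to avoid overshoots)
>>> desired_length = 5000
>>> rate = PchipInterpolator(peaks[1:], rate_at_peaks, extrapolate=True)(np.arange(desired_length))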

See also

signal_findpeaks(), signal_fixpeaks(), signal_plot()

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1)
>>> info = nk.signal_findpeaks(signal)
>>>
>>> rate = nk.signal_rate(peaks=info["Peaks"], desired_length=len(signal))
>>> fig = nk.signal_plot(rate)
>>> fig 
rsp_rrv(rsp_rate, peaks=None, sampling_rate=1000, show=False, silent=True)[source]

Computes time domain and frequency domain features for Respiratory Rate Variability (RRV) analysis.

Parameters
  • rsp_rate (array) – Array containing the respiratory rate, produced by signal_rate().

  • peaks (dict) – The samples at which the inhalation peaks occur. Dict returned by rsp_peaks(). Defaults to None.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).

  • show (bool) – If True, will return a Poincaré plot, a scattergram, which plots each breath-to-breath interval against the next successive one. The ellipse centers around the average breath-to-breath interval. Defaults to False.

  • silent (bool) – If False, warnings will be printed. Defaults to True.

Returns

DataFrame – DataFrame consisting of the computed RRV metrics, which include the following (a sketch of the spectral indices follows this list):

  • “RRV_SDBB”: the standard deviation of the breath-to-breath intervals.

  • “RRV_RMSSD”: the root mean square of successive differences of the breath-to-breath intervals.

  • “RRV_SDSD”: the standard deviation of the successive differences between adjacent breath-to-breath intervals.

  • “RRV_BBx”: the number of successive interval differences that are greater than x seconds.

  • “RRV_pBBx”: the proportion of breath-to-breath intervals that are greater than x seconds, out of the total number of intervals.

  • “RRV_VLF”: spectral power density pertaining to the very low frequency band, i.e., 0 to .04 Hz by default.

  • “RRV_LF”: spectral power density pertaining to the low frequency band, i.e., .04 to .15 Hz by default.

  • “RRV_HF”: spectral power density pertaining to the high frequency band, i.e., .15 to .4 Hz by default.

  • “RRV_LFHF”: the ratio of low frequency power to high frequency power.

  • “RRV_LFn”: the normalized low frequency, obtained by dividing the low frequency power by the total power.

  • “RRV_HFn”: the normalized high frequency, obtained by dividing the high frequency power by the total power.

  • “RRV_SD1”: SD1 is a measure of the spread of breath-to-breath intervals on the Poincaré plot perpendicular to the line of identity. It is an index of short-term variability.

  • “RRV_SD2”: SD2 is a measure of the spread of breath-to-breath intervals on the Poincaré plot along the line of identity. It is an index of long-term variability.

  • “RRV_SD2SD1”: the ratio between long and short term fluctuations of the breath-to-breath intervals (SD2 divided by SD1).

  • “RRV_ApEn”: the approximate entropy of RRV, calculated by entropy_approximate().

  • “RRV_SampEn”: the sample entropy of RRV, calculated by entropy_sample().

  • “RRV_DFA_1”: the “short-term” fluctuation value generated from Detrended Fluctuation Analysis, i.e., the root mean square deviation from the fitted trend of the breath-to-breath intervals. Will only be computed if more than 160 breath cycles are present in the signal.

  • “RRV_DFA_2”: the long-term fluctuation value. Will only be computed if more than 640 breath cycles are present in the signal.
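
The spectral indices follow the usual band-integration recipe: estimate a power spectral density of the rate series and integrate it over each band. A minimal sketch using Welch's method (band edges as listed above; the rate series and its sampling rate are illustrative):

>>> import numpy as np
>>> from scipy.signal import welch
>>>
>>> rsp_rate = np.random.normal(15, 1, 300)  # illustrative rate series, sampled at 10 Hz
>>> freq, psd = welch(rsp_rate, fs=10, nperseg=256)
>>>
>>> def band_power(freq, psd, low, high):
...     mask = (freq >= low) & (freq < high)
...     return np.trapz(psd[mask], freq[mask])
...
>>> vlf = band_power(freq, psd, 0, 0.04)
>>> lf = band_power(freq, psd, 0.04, 0.15)
>>> hf = band_power(freq, psd, 0.15, 0.4)
>>> lf_hf = lf / hf
>>> lfn, hfn = lf / (vlf + lf + hf), hf / (vlf + lf + hf)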

See also

signal_rate(), rsp_peaks(), signal_power(), entropy_sample(), entropy_approximate()

Examples

>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15)
>>> rsp, info = nk.rsp_process(rsp)
>>> rrv = nk.rsp_rrv(rsp, show=True) 

References

  • Soni, R., & Muniyandi, M. (2019). Breath rate variability: a novel measure to study the meditation effects. International Journal of Yoga, 12(1), 45.

rsp_simulate(duration=10, length=None, sampling_rate=1000, noise=0.01, respiratory_rate=15, method='breathmetrics', random_state=None)[source]

Simulate a respiratory signal.

Generate an artificial (synthetic) respiratory signal of a given duration and rate.

Parameters
  • duration (int) – Desired length of duration (s).

  • sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).

  • length (int) – The desired length of the signal (in samples).

  • noise (float) – Noise level (amplitude of the laplace noise).

  • respiratory_rate (float) – Desired number of breath cycles in one minute.

  • method (str) – The model used to generate the signal. Can be ‘sinusoidal’ for a simulation based on a trigonometric sine wave that roughly approximates a single respiratory cycle. If ‘breathmetrics’ (default), will use an advanced model described in Noto et al. (2018).

  • random_state (int) – Seed for the random number generator.

Returns

array – Vector containing the respiratory signal.

Examples

>>> import pandas as pd
>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> rsp1 = nk.rsp_simulate(duration=30, method="sinusoidal")
>>> rsp2 = nk.rsp_simulate(duration=30, method="breathmetrics")
>>> fig = pd.DataFrame({"RSP_Simple": rsp1, "RSP_Complex": rsp2}).plot(subplots=True)
>>> fig 

References

Noto, T., Zhou, G., Schuele, S., Templer, J., & Zelano, C. (2018). Automated analysis of breathing waveforms using BreathMetrics: A respiratory signal processing toolbox. Chemical Senses, 43(8), 583–597. https://doi.org/10.1093/chemse/bjy045

EDA

Submodule for NeuroKit.

eda_analyze(data, sampling_rate=1000, method='auto')[source]

Performs EDA analysis on either epochs (event-related analysis) or on longer periods of data such as resting-state data.

Parameters
  • data (Union[dict, pd.DataFrame]) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by eda_process() or bio_process(). Can also take a dict containing sets of separate periods of data.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.

  • method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’ where the right method will be chosen based on the mean duration of the data (‘event-related’ for duration under 10s).

Returns

DataFrame – A dataframe containing the analyzed EDA features. If event-related analysis is conducted, each epoch is indicated by the Label column. See eda_eventrelated() and eda_intervalrelated() docstrings for details.

See also

bio_process(), eda_process(), epochs_create(), eda_eventrelated(), eda_intervalrelated()

Examples

>>> import neurokit2 as nk
>>> # Example 1: Download the data for event-related analysis
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data for event-related analysis
>>> df, info = nk.bio_process(eda=data["EDA"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> nk.eda_analyze(epochs, sampling_rate=100) 
>>>
>>> # Example 2: Download the resting-state data
>>> data = nk.data("bio_resting_8min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.eda_process(data["EDA"], sampling_rate=100)
>>>
>>> # Analyze
>>> nk.eda_analyze(df, sampling_rate=100) 
eda_autocor(eda_cleaned, sampling_rate=1000, lag=4)[source]

Computes the autocorrelation measure of the cleaned EDA signal, i.e., the correlation between the time series data and a specified time-lagged version of itself.

Parameters
  • eda_cleaned (Union[list, np.array, pd.Series]) – The cleaned EDA signal.

  • sampling_rate (int) – The sampling frequency of raw EDA signal (in Hz, i.e., samples/second). Defaults to 1000Hz.

  • lag (int) – Time lag in seconds. Defaults to 4 seconds to avoid autoregressive correlations approaching 1, as recommended by Halem et al. (2020).

Returns

float – Autocorrelation index of the eda signal.
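
In essence, this is the Pearson correlation between the signal and a copy of itself shifted by lag seconds. A minimal sketch (illustrative signal; not necessarily NeuroKit's exact implementation):

>>> import numpy as np
>>>
>>> sampling_rate = 1000
>>> lag_samples = 4 * sampling_rate               # 4-second lag, in samples
>>> x = np.cumsum(np.random.normal(0, 1, 10000))  # illustrative slowly-varying signal
>>> autocor = np.corrcoef(x[:-lag_samples], x[lag_samples:])[0, 1]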

Examples

>>> import neurokit2 as nk
>>>
>>> # Simulate EDA signal
>>> eda_signal = nk.eda_simulate(duration=5, scr_number=5, drift=0.1)
>>> eda_cleaned = nk.eda_clean(eda_signal)
>>> cor = nk.eda_autocor(eda_cleaned)
>>> cor 

References

  • Halem, S., van Roekel, E., Kroencke, L., Kuper, N., & Denissen, J. (2020). Moments That Matter? On the Complexity of Using Triggers Based on Skin Conductance to Sample Arousing Events Within an Experience Sampling Framework. European Journal of Personality.

eda_changepoints(eda_cleaned)[source]

Calculate the number of change points of the skin conductance signal, based on its mean and variance. Defaults to an algorithm penalty of 10000, as recommended by Halem et al. (2020).

Parameters

eda_cleaned (Union[list, np.array, pd.Series]) – The cleaned EDA signal.

Returns

float – Number of changepoints in the EDA signal.
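
The general idea can be sketched with the ruptures package (the PELT algorithm with a cost function sensitive to changes in mean and variance); note that this is an assumption-laden illustration, not necessarily the routine NeuroKit uses internally:

>>> import numpy as np
>>> import ruptures as rpt  # third-party changepoint-detection package (assumed available)
>>>
>>> x = np.concatenate([np.random.normal(0, 1, 500), np.random.normal(3, 2, 500)])
>>> algo = rpt.Pelt(model="normal").fit(x)  # "normal" cost: changes in mean and variance
>>> breakpoints = algo.predict(pen=10000)   # penalty of 10000, as in the description above
>>> n_changepoints = len(breakpoints) - 1   # the last breakpoint is the end of the signal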

See also

eda_simulate()

Examples

>>> import neurokit2 as nk
>>>
>>> # Simulate EDA signal
>>> eda_signal = nk.eda_simulate(duration=5, scr_number=5, drift=0.1)
>>> eda_cleaned = nk.eda_clean(eda_signal)
>>> changepoints = nk.eda_changepoints(eda_cleaned)
>>> changepoints 

References

  • Halem, S., van Roekel, E., Kroencke, L., Kuper, N., & Denissen, J. (2020). Moments That Matter? On the Complexity of Using Triggers Based on Skin Conductance to Sample Arousing Events Within an Experience Sampling Framework. European Journal of Personality.

eda_clean(eda_signal, sampling_rate=1000, method='neurokit')[source]

Preprocess Electrodermal Activity (EDA) signal.

Parameters
  • eda_signal (Union[list, np.array, pd.Series]) – The raw EDA signal.

  • sampling_rate (int) – The sampling frequency of eda_signal (in Hz, i.e., samples/second).

  • method (str) – The processing pipeline to apply. Can be one of ‘neurokit’ (default) or ‘biosppy’.

Returns

array – Vector containing the cleaned EDA signal.

Examples

>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> eda = nk.eda_simulate(duration=30, sampling_rate=100, scr_number=10, noise=0.01, drift=0.02)
>>> signals = pd.DataFrame({"EDA_Raw": eda,
...                         "EDA_BioSPPy": nk.eda_clean(eda, sampling_rate=100,method='biosppy'),
...                         "EDA_NeuroKit": nk.eda_clean(eda, sampling_rate=100, method='neurokit')})
>>> fig = signals.plot()
>>> fig 
eda_eventrelated(epochs, silent=False)[source]

Performs event-related EDA analysis on epochs.

Parameters
  • epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().

  • silent (bool) – If True, silence possible warnings.

Returns

DataFrame – A dataframe containing the analyzed EDA features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:

  • “EDA_SCR”: indication of whether a Skin Conductance Response (SCR) occurs following the event (1 if an SCR onset is present and 0 if absent) and if so, its corresponding peak amplitude, time of peak, rise and recovery time. If there is no occurrence of SCR, NaNs are displayed for the below features.

  • “EDA_Peak_Amplitude”: the maximum amplitude of the phasic component of the signal.

  • “SCR_Peak_Amplitude”: the peak amplitude of the first SCR in each epoch.

  • “SCR_Peak_Amplitude_Time”: the timepoint of each first SCR peak amplitude.

  • “SCR_RiseTime”: the rise time of each first SCR, i.e., the time it takes for SCR to reach peak amplitude from onset.

  • “SCR_RecoveryTime”: the half-recovery time of each first SCR, i.e., the time it takes for SCR to decrease to half amplitude.

See also

events_find(), epochs_create(), bio_process()

Examples

>>> import neurokit2 as nk
>>>
>>> # Example with simulated data
>>> eda = nk.eda_simulate(duration=15, scr_number=3)
>>>
>>> # Process data
>>> eda_signals, info = nk.eda_process(eda, sampling_rate=1000)
>>> epochs = nk.epochs_create(eda_signals, events=[5000, 10000, 15000], sampling_rate=1000,
...                           epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> nk.eda_eventrelated(epochs) 
>>>
>>> # Example with real data
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data
>>> df, info = nk.bio_process(eda=data["EDA"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=6.9)
>>>
>>> # Analyze
>>> nk.eda_eventrelated(epochs) 
eda_findpeaks(eda_phasic, sampling_rate=1000, method='neurokit', amplitude_min=0.1)[source]

Identify Skin Conductance Responses (SCR) in Electrodermal Activity (EDA).

Low-level function used by eda_peaks() to identify Skin Conductance Responses (SCR) peaks in the phasic component of Electrodermal Activity (EDA) with different possible methods. See eda_peaks() for details.

Parameters
  • eda_phasic (Union[list, np.array, pd.Series]) – The phasic component of the EDA signal (from eda_phasic()).

  • sampling_rate (int) – The sampling frequency of the EDA signal (in Hz, i.e., samples/second).

  • method (str) – The processing pipeline to apply. Can be one of “neurokit” (default), “gamboa2008”, “kim2004” (the default in BioSPPy) or “vanhalem2020”.

  • amplitude_min (float) – Only used if ‘method’ is ‘neurokit’ or ‘kim2004’. Minimum threshold by which to exclude SCRs (peaks) as relative to the largest amplitude in the signal.

Returns

info (dict) – A dictionary containing additional information, in this case the amplitude of the SCR, the samples at which the SCR onset and the SCR peaks occur. Accessible with the keys “SCR_Amplitude”, “SCR_Onsets”, and “SCR_Peaks” respectively.

Examples

>>> import neurokit2 as nk
>>>
>>> # Get phasic component
>>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1, noise=0)
>>> eda_cleaned = nk.eda_clean(eda_signal)
>>> eda = nk.eda_phasic(eda_cleaned)
>>> eda_phasic = eda["EDA_Phasic"].values
>>>
>>> # Find peaks
>>> gamboa2008 = nk.eda_findpeaks(eda_phasic, method="gamboa2008")
>>> kim2004 = nk.eda_findpeaks(eda_phasic, method="kim2004")
>>> neurokit = nk.eda_findpeaks(eda_phasic, method="neurokit")
>>> vanhalem2020 = nk.eda_findpeaks(eda_phasic, method="vanhalem2020")
>>> fig = nk.events_plot([gamboa2008["SCR_Peaks"], kim2004["SCR_Peaks"], vanhalem2020["SCR_Peaks"],
...                       neurokit["SCR_Peaks"]], eda_phasic)
>>> fig 

References

  • Gamboa, H. (2008). Multi-modal behavioral biometrics based on HCI and electrophysiology. PhD thesis, Universidade Técnica de Lisboa.

  • Kim, K. H., Bang, S. W., & Kim, S. R. (2004). Emotion recognition system using short-term monitoring of physiological signals. Medical and biological engineering and computing, 42(3), 419-427.

  • van Halem, S., Van Roekel, E., Kroencke, L., Kuper, N., & Denissen, J. (2020). Moments That Matter? On the Complexity of Using Triggers Based on Skin Conductance to Sample Arousing Events Within an Experience Sampling Framework. European Journal of Personality.

eda_fixpeaks(peaks, onsets=None, height=None)[source]

Correct Skin Conductance Responses (SCR) peaks.

Low-level function used by eda_peaks() to correct the peaks found by eda_findpeaks(). Doesn’t do anything for now for EDA. See eda_peaks() for details.

Parameters
  • peaks (list or array or DataFrame or Series or dict) – The samples at which the SCR peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with eda_findpeaks().

  • onsets (list or array or DataFrame or Series or dict) – The samples at which the SCR onsets occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with eda_findpeaks(). Defaults to None.

  • height (list or array or DataFrame or Series or dict) – The samples at which the amplitude of the SCR peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with eda_findpeaks(). Defaults to None.

Returns

info (dict) – A dictionary containing additional information, in this case the amplitude of the SCR, the samples at which the SCR onset and the SCR peaks occur. Accessible with the keys “SCR_Amplitude”, “SCR_Onsets”, and “SCR_Peaks” respectively.

Examples

>>> import neurokit2 as nk
>>>
>>> # Get phasic component
>>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1, noise=0)
>>> eda_cleaned = nk.eda_clean(eda_signal)
>>> eda = nk.eda_phasic(eda_cleaned)
>>> eda_phasic = eda["EDA_Phasic"].values
>>>
>>> # Find and fix peaks
>>> info = nk.eda_findpeaks(eda_phasic)
>>> info = nk.eda_fixpeaks(info)
>>>
>>> fig = nk.events_plot(info["SCR_Peaks"], eda_phasic)
>>> fig 
eda_intervalrelated(data)[source]

Performs EDA analysis on longer periods of data (typically > 10 seconds), such as resting-state data.

Parameters

data (Union[dict, pd.DataFrame]) – A DataFrame containing the different processed signal(s) as different columns, typically generated by eda_process() or bio_process(). Can also take a dict containing sets of separately processed DataFrames.

Returns

DataFrame – A dataframe containing the analyzed EDA features. The analyzed features consist of the following:

  • “SCR_Peaks_N”: the number of occurrences of Skin Conductance Response (SCR).

  • “SCR_Peaks_Amplitude_Mean”: the mean amplitude of the SCR peak occurrences.

See also

bio_process(), eda_eventrelated()

Examples

>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_8min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.eda_process(data["EDA"], sampling_rate=100)
>>>
>>> # Single dataframe is passed
>>> nk.eda_intervalrelated(df) 
>>>
>>> epochs = nk.epochs_create(df, events=[0, 25300], sampling_rate=100, epochs_end=20)
>>> nk.eda_intervalrelated(epochs) 
eda_peaks(eda_phasic, sampling_rate=1000, method='neurokit', amplitude_min=0.1)[source]

Identify Skin Conductance Responses (SCR) in Electrodermal Activity (EDA).

Identify Skin Conductance Responses (SCR) peaks in the phasic component of Electrodermal Activity (EDA) with different possible methods, such as:

  • Gamboa, H. (2008) <http://www.lx.it.pt/~afred/pub/thesisHugoGamboa.pdf>

  • Kim et al. (2004)

Parameters
  • eda_phasic (Union[list, np.array, pd.Series]) – The phasic component of the EDA signal (from eda_phasic()).

  • sampling_rate (int) – The sampling frequency of the EDA signal (in Hz, i.e., samples/second).

  • method (str) – The processing pipeline to apply. Can be one of “neurokit” (default), “gamboa2008” or “kim2004” (the default in BioSPPy).

  • amplitude_min (float) – Only used if ‘method’ is ‘neurokit’ or ‘kim2004’. Minimum threshold by which to exclude SCRs (peaks) as relative to the largest amplitude in the signal.

Returns

  • info (dict) – A dictionary containing additional information, in this case the amplitude of the SCR, the samples at which the SCR onset and the SCR peaks occur. Accessible with the keys “SCR_Amplitude”, “SCR_Onsets”, and “SCR_Peaks” respectively.

  • signals (DataFrame) – A DataFrame of same length as the input signal in which occurrences of SCR peaks are marked as “1” in lists of zeros with the same length as eda_cleaned. Accessible with the key “SCR_Peaks”.

Examples

>>> import neurokit2 as nk
>>>
>>> # Get phasic component
>>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1, noise=0, sampling_rate=100)
>>> eda_cleaned = nk.eda_clean(eda_signal, sampling_rate=100)
>>> eda = nk.eda_phasic(eda_cleaned, sampling_rate=100)
>>> eda_phasic = eda["EDA_Phasic"].values
>>>
>>> # Find peaks
>>> _, gamboa2008 = nk.eda_peaks(eda_phasic, method="gamboa2008")
>>> _, kim2004 = nk.eda_peaks(eda_phasic, method="kim2004")
>>> _, neurokit = nk.eda_peaks(eda_phasic, method="neurokit")
>>> nk.events_plot([gamboa2008["SCR_Peaks"], kim2004["SCR_Peaks"], neurokit["SCR_Peaks"]], eda_phasic) 
<Figure ...>

References

  • Gamboa, H. (2008). Multi-modal behavioral biometrics based on HCI and electrophysiology. PhD thesis, Universidade Técnica de Lisboa.

  • Kim, K. H., Bang, S. W., & Kim, S. R. (2004). Emotion recognition system using short-term monitoring of physiological signals. Medical and biological engineering and computing, 42(3), 419-427.

eda_phasic(eda_signal, sampling_rate=1000, method='highpass')[source]

Decompose Electrodermal Activity (EDA) into Phasic and Tonic components.

Decompose the Electrodermal Activity (EDA) into two components, namely Phasic and Tonic, using different methods including cvxEDA (Greco, 2016) or Biopac’s Acqknowledge algorithms.

Parameters
  • eda_signal (Union[list, np.array, pd.Series]) – The raw EDA signal.

  • sampling_rate (int) – The sampling frequency of raw EDA signal (in Hz, i.e., samples/second). Defaults to 1000Hz.

  • method (str) – The processing pipeline to apply. Can be one of “cvxEDA”, “median”, “smoothmedian”, “highpass”, “biopac”, or “acqknowledge”.

Returns

DataFrame – DataFrame containing the ‘Tonic’ and the ‘Phasic’ components as columns.
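
As a rough illustration of the ‘highpass’ approach: everything above a low cutoff is treated as phasic activity and the remainder as the tonic level. A minimal sketch (the 0.05 Hz cutoff is an assumption for illustration, not necessarily NeuroKit's exact parameterization):

>>> import numpy as np
>>> import scipy.signal
>>>
>>> sampling_rate = 1000
>>> eda = 5 + np.cumsum(np.random.normal(0, 0.001, 30 * sampling_rate))  # illustrative signal
>>> b, a = scipy.signal.butter(2, 0.05, btype="highpass", fs=sampling_rate)  # assumed cutoff
>>> phasic = scipy.signal.filtfilt(b, a, eda)  # fast fluctuations (SCRs)
>>> tonic = eda - phasic                       # slowly varying skin conductance level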

Examples

>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> # Simulate EDA signal
>>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1)
>>>
>>> # Decompose using different algorithms
>>> cvxEDA = nk.eda_phasic(nk.standardize(eda_signal), method='cvxeda')
>>> smoothMedian = nk.eda_phasic(nk.standardize(eda_signal), method='smoothmedian')
>>> highpass = nk.eda_phasic(nk.standardize(eda_signal), method='highpass')
>>>
>>> data = pd.concat([cvxEDA.add_suffix('_cvxEDA'), smoothMedian.add_suffix('_SmoothMedian'),
...                   highpass.add_suffix('_Highpass')], axis=1)
>>> data["EDA_Raw"] = eda_signal
>>> fig = data.plot()
>>> fig 
>>>
>>> eda_signal = nk.data("bio_eventrelated_100hz")["EDA"]
>>> data = nk.eda_phasic(nk.standardize(eda_signal), sampling_rate=100)
>>> data["EDA_Raw"] = eda_signal
>>> fig = nk.signal_plot(data, standardize=True)
>>> fig 

References

  • cvxEDA: https://github.com/lciti/cvxEDA.

  • Greco, A., Valenza, G., & Scilingo, E. P. (2016). Evaluation of CDA and CvxEDA Models. In Advances in Electrodermal Activity Processing with Applications for Mental Health (pp. 35-43). Springer International Publishing.

  • Greco, A., Valenza, G., Lanata, A., Scilingo, E. P., & Citi, L. (2016). cvxEDA: A convex optimization approach to electrodermal activity processing. IEEE Transactions on Biomedical Engineering, 63(4), 797-804.

eda_plot(eda_signals, sampling_rate=None)[source]

Visualize electrodermal activity (EDA) data.

Parameters
  • eda_signals (DataFrame) – DataFrame obtained from eda_process().

  • sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second). Defaults to None.

Returns

fig – Figure representing a plot of the processed EDA signals.

Examples

>>> import neurokit2 as nk
>>>
>>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1, noise=0, sampling_rate=250)
>>> eda_signals, info = nk.eda_process(eda_signal, sampling_rate=250)
>>> fig = nk.eda_plot(eda_signals)
>>> fig 

See also

eda_process()

eda_process(eda_signal, sampling_rate=1000, method='neurokit')[source]

Process Electrodermal Activity (EDA).

Convenience function that automatically processes electrodermal activity (EDA) signal.

Parameters
  • eda_signal (Union[list, np.array, pd.Series]) – The raw EDA signal.

  • sampling_rate (int) – The sampling frequency of eda_signal (in Hz, i.e., samples/second).

  • method (str) – The processing pipeline to apply. Can be one of “biosppy” or “neurokit” (default).

Returns

  • signals (DataFrame) – A DataFrame of same length as eda_signal containing the following columns:

    • ”EDA_Raw”: the raw signal.

    • ”EDA_Clean”: the cleaned signal.

    • ”EDA_Tonic”: the tonic component of the signal, or the Tonic Skin Conductance Level (SCL).

    • ”EDA_Phasic”: the phasic component of the signal, or the Phasic Skin Conductance Response (SCR).

    • ”SCR_Onsets”: the samples at which the onsets of the peaks occur, marked as “1” in a list of zeros.

    • ”SCR_Peaks”: the samples at which the peaks occur, marked as “1” in a list of zeros.

    • ”SCR_Height”: the SCR amplitude of the signal including the Tonic component. Note that cumulative effects of close-occurring SCRs might lead to an underestimation of the amplitude.

    • ”SCR_Amplitude”: the SCR amplitude of the signal excluding the Tonic component.

    • ”SCR_RiseTime”: the time taken for SCR onset to reach peak amplitude within the SCR.

    • ”SCR_Recovery”: the samples at which SCR peaks recover (decline) to half amplitude, marked as “1” in a list of zeros.

  • info (dict) – A dictionary containing the information of each SCR peak (see eda_findpeaks()).

Examples

>>> import neurokit2 as nk
>>>
>>> eda_signal = nk.eda_simulate(duration=30, scr_number=5, drift=0.1, noise=0)
>>> signals, info = nk.eda_process(eda_signal, sampling_rate=1000)
>>> fig = nk.eda_plot(signals)
>>> fig 
eda_simulate(duration=10, length=None, sampling_rate=1000, noise=0.01, scr_number=1, drift=-0.01, random_state=None)[source]

Simulate Electrodermal Activity (EDA) signal.

Generate an artificial (synthetic) EDA signal of a given duration and sampling rate.

Parameters
  • duration (int) – Desired recording length in seconds.

  • sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second). Defaults to 1000Hz.

  • length (int) – The desired length of the signal (in samples). Defaults to None.

  • noise (float) – Noise level (amplitude of the laplace noise). Defaults to 0.01.

  • scr_number (int) – Desired number of skin conductance responses (SCRs), i.e., peaks. Defaults to 1.

  • drift (float or list) – The slope of a linear drift of the signal. Defaults to -0.01.

  • random_state (int) – Seed for the random number generator. Defaults to None.

Returns

array – Vector containing the EDA signal.

Examples

>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> eda = nk.eda_simulate(duration=10, scr_number=3)
>>> fig = nk.signal_plot(eda)
>>> fig 

See also

ecg_simulate(), rsp_simulate(), emg_simulate(), ppg_simulate()

References

  • Bach, D. R., Flandin, G., Friston, K. J., & Dolan, R. J. (2010). Modelling event-related skin conductance responses. International Journal of Psychophysiology, 75(3), 349-356.

EMG

Submodule for NeuroKit.

emg_activation(emg_amplitude=None, emg_cleaned=None, sampling_rate=1000, method='threshold', threshold='default', duration_min='default', **kwargs)[source]

Detects onset in EMG signal based on the amplitude threshold.

Parameters
  • emg_amplitude (array) – At least one EMG-related signal. The amplitude of the EMG signal, obtained from emg_amplitude(), used for methods like ‘threshold’ or ‘mixture’.

  • emg_cleaned (array) – At least one EMG-related signal. The cleaned EMG signal, obtained from emg_clean(), used for methods like ‘pelt’.

  • sampling_rate (int) – The sampling frequency of emg_signal (in Hz, i.e., samples/second).

  • method (str) – The algorithm used to discriminate between activity and baseline. Can be one of ‘threshold’ (default) or ‘mixture’. If ‘mixture’, will use a Gaussian Mixture Model to categorize between the two states. If ‘threshold’, will consider as activated all points whose amplitude is superior to the threshold (see the sketch after the Returns section).

  • threshold (float) – If method is ‘mixture’, then it corresponds to the minimum probability required to be considered as activated (default to 0.33). If method is ‘threshold’, then it corresponds to the minimum amplitude to detect as onset. Defaults to one tenth of the standard deviation of emg_amplitude.

  • duration_min (float) – The minimum duration of a period of activity or non-activity in seconds. If ‘default’, will be set to 0.05 (50 ms).

  • kwargs (optional) – Other arguments.

Returns

  • info (dict) – A dictionary containing additional information, in this case the samples at which the onsets, offsets, and periods of activation of the EMG signal occur, accessible with the keys “EMG_Onsets”, “EMG_Offsets”, and “EMG_Activity” respectively.

  • activity_signal (DataFrame) – A DataFrame of same length as the input signal in which occurrences of onsets, offsets, and activity (above the threshold) of the EMG signal are marked as “1” in lists of zeros with the same length as emg_amplitude. Accessible with the keys “EMG_Onsets”, “EMG_Offsets”, and “EMG_Activity” respectively.
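
A minimal sketch of the ‘threshold’ idea described above: mark samples whose amplitude exceeds the threshold as active, then discard activity or baseline runs shorter than duration_min (illustrative values; not NeuroKit's exact implementation):

>>> import numpy as np
>>>
>>> sampling_rate = 1000
>>> amplitude = np.abs(np.random.normal(0, 0.1, 10 * sampling_rate))
>>> amplitude[2000:4000] += 1.0              # illustrative burst of activity
>>> threshold = 0.1 * np.std(amplitude)      # one tenth of the standard deviation
>>> active = amplitude > threshold
>>> min_samples = int(0.05 * sampling_rate)  # duration_min = 50 ms
>>> # Flip any run of constant state shorter than min_samples
>>> edges = np.flatnonzero(np.diff(active.astype(int))) + 1
>>> runs = np.split(active, edges)
>>> active = np.concatenate([r if len(r) >= min_samples else ~r for r in runs])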

Examples

>>> import neurokit2 as nk
>>>
>>> # Simulate signal and obtain amplitude
>>> emg = nk.emg_simulate(duration=10, burst_number=3)
>>> emg_cleaned = nk.emg_clean(emg)
>>> emg_amplitude = nk.emg_amplitude(emg_cleaned)
>>>
>>> # Threshold method
>>> activity, info = nk.emg_activation(emg_amplitude=emg_amplitude, method="threshold")
>>> fig = nk.events_plot([info["EMG_Offsets"], info["EMG_Onsets"]], emg_cleaned)
>>> fig 
>>>
>>> # Pelt method
>>> activity, info = nk.emg_activation(emg_cleaned=emg_cleaned, method="pelt")
>>> fig = nk.signal_plot([emg_cleaned, activity["EMG_Activity"]])
>>> fig 

emg_amplitude(emg_cleaned)[source]

Compute electromyography (EMG) amplitude.

Compute electromyography amplitude given the cleaned EMG signal, done by calculating the linear envelope of the signal.

Parameters

emg_cleaned (Union[list, np.array, pd.Series]) – The cleaned electromyography channel as returned by emg_clean().

Returns

array – A vector containing the electromyography amplitude.
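
A linear envelope is commonly obtained by full-wave rectification followed by low-pass filtering. A minimal sketch (the 10 Hz cutoff is an illustrative choice, not necessarily NeuroKit's):

>>> import numpy as np
>>> import scipy.signal
>>>
>>> sampling_rate = 1000
>>> emg = np.random.normal(0, 1, 10 * sampling_rate)   # illustrative raw-like signal
>>> rectified = np.abs(emg)                            # full-wave rectification
>>> b, a = scipy.signal.butter(4, 10, btype="lowpass", fs=sampling_rate)
>>> envelope = scipy.signal.filtfilt(b, a, rectified)  # smooth linear envelope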

See also

emg_clean(), emg_rate(), emg_process(), emg_plot()

Examples

>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> emg = nk.emg_simulate(duration=10, sampling_rate=1000, burst_number=3)
>>> cleaned = nk.emg_clean(emg, sampling_rate=1000)
>>>
>>> amplitude = nk.emg_amplitude(cleaned)
>>> fig = pd.DataFrame({"EMG": emg, "Amplitude": amplitude}).plot(subplots=True)
>>> fig 
emg_analyze(data, sampling_rate=1000, method='auto')[source]

Performs EMG analysis on either epochs (event-related analysis) or on longer periods of data such as resting-state data.

Parameters
  • data (Union[dict, pd.DataFrame]) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by emg_process() or bio_process(). Can also take a dict containing sets of separate periods of data.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.

  • method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’ where the right method will be chosen based on the mean duration of the data (‘event-related’ for duration under 10s).

Returns

DataFrame – A dataframe containing the analyzed EMG features. If event-related analysis is conducted, each epoch is indicated by the Label column. See emg_eventrelated() and emg_intervalrelated() docstrings for details.

See also

bio_process(), emg_process(), epochs_create(), emg_eventrelated(), emg_intervalrelated()

Examples

>>> import neurokit2 as nk
>>> import pandas as pd
>>> # Example with simulated data
>>> emg = nk.emg_simulate(duration=20, sampling_rate=1000, burst_number=3)
>>> emg_signals, info = nk.emg_process(emg, sampling_rate=1000)
>>> epochs = nk.epochs_create(emg_signals, events=[3000, 6000, 9000], sampling_rate=1000,
...                           epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Event-related analysis
>>> nk.emg_analyze(epochs, method="event-related") 
>>>
>>> # Interval-related analysis
>>> nk.emg_analyze(emg_signals, method="interval-related") 
emg_clean(emg_signal, sampling_rate=1000)[source]

Preprocess an electromyography (EMG) signal.

Clean an EMG signal using a set of parameters such as those used in `BioSPPy <https://github.com/PIA-Group/BioSPPy/blob/e65da30f6379852ecb98f8e2e0c9b4b5175416c3/biosppy/signals/emg.py>`_: a fourth-order, 100 Hz highpass Butterworth filter followed by constant detrending.
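
As a rough illustration of what this pipeline amounts to, here is a scipy sketch (an approximation for illustration, not NeuroKit's exact code):

>>> import numpy as np
>>> import scipy.signal
>>>
>>> emg = np.random.randn(5000)  # stand-in for a raw EMG signal sampled at 1000 Hz
>>> b, a = scipy.signal.butter(4, 100, btype="highpass", fs=1000)  # 4th-order 100 Hz highpass
>>> filtered = scipy.signal.filtfilt(b, a, emg)
>>> cleaned = filtered - np.mean(filtered)  # constant detrending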

Parameters
  • emg_signal (Union[list, np.array, pd.Series]) – The raw EMG channel.

  • sampling_rate (int) – The sampling frequency of emg_signal (in Hz, i.e., samples/second). Defaults to 1000.

Returns

array – Vector containing the cleaned EMG signal.

Examples

>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> emg = nk.emg_simulate(duration=10, sampling_rate=1000)
>>> signals = pd.DataFrame({"EMG_Raw": emg, "EMG_Cleaned": nk.emg_clean(emg, sampling_rate=1000)})
>>> fig = signals.plot()
>>> fig 
emg_eventrelated(epochs, silent=False)[source]

Performs event-related EMG analysis on epochs.

Parameters
  • epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().

  • silent (bool) – If True, silence possible warnings.

Returns

DataFrame – A dataframe containing the analyzed EMG features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:

  • “EMG_Activation”: indication of whether there is muscular activation following the onset of the event (1 if present, 0 if absent) and, if so, the corresponding amplitude features and the number of activations in each epoch. If there is no activation, NaNs are displayed for the features below.

  • “EMG_Amplitude_Mean”: the mean amplitude of the activity.

  • “EMG_Amplitude_Max”: the maximum amplitude of the activity.

  • “EMG_Amplitude_Max_Time”: the time of maximum amplitude.

  • “EMG_Bursts”: the number of activations, or bursts of activity, within each epoch.

See also

emg_simulate(), emg_process(), events_find(), epochs_create()

Examples

>>> import neurokit2 as nk
>>>
>>> # Example with simulated data
>>> emg = nk.emg_simulate(duration=20, sampling_rate=1000, burst_number=3)
>>> emg_signals, info = nk.emg_process(emg, sampling_rate=1000)
>>> epochs = nk.epochs_create(emg_signals, events=[3000, 6000, 9000], sampling_rate=1000,
...                           epochs_start=-0.1, epochs_end=1.9)
>>> nk.emg_eventrelated(epochs) 
emg_intervalrelated(data)[source]

Performs EMG analysis on longer periods of data (typically > 10 seconds), such as resting-state data.

Parameters

data (Union[dict, pd.DataFrame]) – A DataFrame containing the different processed signal(s) as different columns, typically generated by emg_process() or bio_process(). Can also take a dict containing sets of separately processed DataFrames.

Returns

DataFrame – A dataframe containing the analyzed EMG features. The analyzed features consist of the following:

  • “EMG_Activation_N”: the number of bursts of muscular activity.

  • “EMG_Amplitude_Mean”: the mean amplitude of the muscular activity.

See also

bio_process(), emg_eventrelated()

Examples

>>> import neurokit2 as nk
>>>
>>> # Example with simulated data
>>> emg = nk.emg_simulate(duration=40, sampling_rate=1000, burst_number=3)
>>> emg_signals, info = nk.emg_process(emg, sampling_rate=1000)
>>>
>>> # Single dataframe is passed
>>> nk.emg_intervalrelated(emg_signals) 
>>>
>>> epochs = nk.epochs_create(emg_signals, events=[0, 20000], sampling_rate=1000, epochs_end=20)
>>> nk.emg_intervalrelated(epochs) 
emg_plot(emg_signals, sampling_rate=None)[source]

Visualize electromyography (EMG) data.

Parameters
  • emg_signals (DataFrame) – DataFrame obtained from emg_process().

  • sampling_rate (int) – The sampling frequency of the EMG (in Hz, i.e., samples/second). Needs to be supplied if the data should be plotted over time in seconds. Otherwise the data is plotted over samples. Defaults to None.

Returns

fig – Figure representing a plot of the processed EMG signals.

Examples

>>> import neurokit2 as nk
>>>
>>> emg = nk.emg_simulate(duration=10, sampling_rate=1000, burst_number=3)
>>> emg_signals, _ = nk.emg_process(emg, sampling_rate=1000)
>>> fig = nk.emg_plot(emg_signals)
>>> fig 

See also

ecg_process()

emg_process(emg_signal, sampling_rate=1000)[source]

Process an electromyography (EMG) signal.

Convenience function that automatically processes an electromyography signal.

Parameters
  • emg_signal (Union[list, np.array, pd.Series]) – The raw electromyography channel.

  • sampling_rate (int) – The sampling frequency of emg_signal (in Hz, i.e., samples/second).

Returns

  • signals (DataFrame) – A DataFrame of the same length as emg_signal containing the following columns:

    - “EMG_Raw”: the raw signal.

    - “EMG_Clean”: the cleaned signal.

    - “EMG_Amplitude”: the signal amplitude, or the activation level of the signal.

    - “EMG_Activity”: the activity of the signal for which the amplitude exceeds the specified threshold, marked as “1” in a list of zeros.

    - “EMG_Onsets”: the onsets of the amplitude, marked as “1” in a list of zeros.

    - “EMG_Offsets”: the offsets of the amplitude, marked as “1” in a list of zeros.

  • info (dict) – A dictionary containing the information of each amplitude onset, offset, and peak activity (see emg_activation()).

Examples

>>> import neurokit2 as nk
>>>
>>> emg = nk.emg_simulate(duration=10, sampling_rate=1000, burst_number=3)
>>> signals, info = nk.emg_process(emg, sampling_rate=1000)
>>> fig = nk.emg_plot(signals)
>>> fig 
emg_simulate(duration=10, length=None, sampling_rate=1000, noise=0.01, burst_number=1, burst_duration=1.0, random_state=42)[source]

Simulate an EMG signal.

Generate an artificial (synthetic) EMG signal of a given duration and sampling rate.

Parameters
  • duration (int) – Desired recording length in seconds.

  • sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).

  • length (int) – The desired length of the signal (in samples).

  • noise (float) – Noise level (gaussian noise).

  • burst_number (int) – Desired number of bursts of activity (active muscle periods).

  • burst_duration (float or list) – Duration of the bursts. Can be a float (each burst will have the same duration) or a list of durations for each burst.

  • random_state (int) – Seed for the random number generator.

Returns

array – Vector containing the EMG signal.

Examples

>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> emg = nk.emg_simulate(duration=10, burst_number=3)
>>> fig = nk.signal_plot(emg)
>>> fig 

See also

ecg_simulate(), rsp_simulate(), eda_simulate(), ppg_simulate()

References

This function is based on this script.

EEG

Submodule for NeuroKit.

mne_channel_add(raw, channel, channel_type=None, channel_name=None, sync_index_raw=0, sync_index_channel=0)[source]

Add channel as array to MNE.

Add a channel to an MNE Raw M/EEG object. The function synchronizes the channel to the EEG data following a given pair of indices, then adds it.
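
Conceptually, the synchronization amounts to shifting the channel so that the two sync indices coincide, then padding or cropping it to the length of the raw data. A minimal numpy sketch of that idea (an illustration, not NeuroKit's exact implementation):

>>> import numpy as np
>>>
>>> def align(channel, raw_length, sync_index_raw, sync_index_channel):
...     # Shift the channel so sync_index_channel lands on sync_index_raw; pad with NaNs
...     shift = sync_index_raw - sync_index_channel
...     out = np.full(raw_length, np.nan)
...     src = np.asarray(channel, dtype=float)[max(-shift, 0):]
...     start = max(shift, 0)
...     n = max(0, min(len(src), raw_length - start))
...     out[start:start + n] = src[:n]
...     return out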

Parameters
  • raw (mne.io.Raw) – Raw EEG data from MNE.

  • channel (list or array) – The signal to be added.

  • channel_type (str) – Channel type. Currently supported fields are ‘ecg’, ‘bio’, ‘stim’, ‘eog’, ‘misc’, ‘seeg’, ‘ecog’, ‘mag’, ‘eeg’, ‘ref_meg’, ‘grad’, ‘emg’, ‘hbr’ or ‘hbo’.

  • channel_name (str) – Desired channel name.

  • sync_index_raw (int or list) – An index (e.g., the onset of the same event marked in the same signal), in the raw data, by which to align the two inputs. This can be used in case the EEG data and the channel to add do not have the same onsets and must be aligned through some common event.

  • sync_index_channel (int or list) – An index (e.g., the onset of the same event marked in the same signal), in the channel to add, by which to align the two inputs. This can be used in case the EEG data and the channel to add do not have the same onsets and must be aligned through some common event.

Returns

mne.io.Raw – Raw data in FIF format.

Example

>>> import neurokit2 as nk
>>> import mne
>>>
>>> # Let's say that the 42nd sample point in the EEG corresponds to the 333rd point in the ECG
>>> event_index_in_eeg = 42
>>> event_index_in_ecg = 333
>>>
>>> raw = mne.io.read_raw_fif(mne.datasets.sample.data_path() + '/MEG/sample/sample_audvis_raw.fif',
...                           preload=True) 
>>> ecg = nk.ecg_simulate(length=170000)
>>>
>>> raw = nk.mne_channel_add(raw, ecg, sync_index_raw=event_index_in_eeg,
...                          sync_index_channel=event_index_in_ecg, channel_type="ecg") 
mne_channel_extract(raw, what, name=None)[source]

Channel array extraction from MNE.

Select one or several channels by name and return them in a DataFrame.

Parameters
  • raw (mne.io.Raw) – Raw EEG data.

  • what (str or list) – Can be ‘MEG’ (extracts all MEG channels), ‘EEG’ (all EEG channels), or ‘EOG’ (all EOG channels), provided that channel names are prefixed with their type (e.g., ‘EEG 001’, ‘EOG 061’). Otherwise, provide the exact name of a single channel or a list of channel names (e.g., [‘124’, ‘125’]).

  • name (str or list) – Useful only when extracting one channel. Can also take a list of names for renaming multiple channels. Otherwise, defaults to None.

Returns

DataFrame – A DataFrame or Series containing the channel(s).

Example

>>> import neurokit2 as nk
>>> import mne
>>>
>>> raw = mne.io.read_raw_fif(mne.datasets.sample.data_path() +
...                           '/MEG/sample/sample_audvis_raw.fif', preload=True) 
>>>
>>> raw_channel = nk.mne_channel_extract(raw, what=["EEG 060", "EEG 055"], name=['060', '055']) 
>>> eeg_channels = nk.mne_channel_extract(raw, "EEG") 
>>> eog_channels = nk.mne_channel_extract(raw, what='EOG', name='EOG') 

Signal Processing

Submodule for NeuroKit.

signal_autocor(signal, lag=None, normalize=True)[source]

Auto-correlation of a 1-dimensional sequence.

Parameters
  • signal (Union[list, np.array, pd.Series]) – Vector of values.

  • normalize (bool) – Normalize the autocorrelation output.

  • lag (int) – Time lag. If specified, the single autocorrelation value between the signal and its lagged self at that lag will be returned.

Returns

r – The correlation of the signal with itself at different time lags, from a minimum lag of 0 up to a maximum lag of the length of the signal, or a single correlation value at the specified lag if lag is not None.
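
For intuition, a normalized autocorrelation can be sketched with numpy as follows (an illustration of the concept; NeuroKit's implementation may differ in details):

>>> import numpy as np
>>>
>>> x = np.asarray([1, 2, 3, 4, 5], dtype=float)
>>> x = x - x.mean()  # demean before correlating
>>> r = np.correlate(x, x, mode="full")[len(x) - 1:]  # keep non-negative lags only
>>> r = r / r[0]  # normalize so that the lag-0 value equals 1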

Examples

>>> import neurokit2 as nk
>>>
>>> x = [1, 2, 3, 4, 5]
>>> autocor = nk.signal_autocor(x)
>>> autocor 
signal_binarize(signal, method='threshold', threshold='auto')[source]

Binarize a continuous signal.

Convert a continuous signal into zeros and ones depending on a given threshold.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • method (str) – The algorithm used to discriminate between the two states. Can be one of ‘threshold’ (default) or ‘mixture’. If ‘mixture’, will use a Gaussian Mixture Model to categorize between the two states. If ‘threshold’, will consider as activated all points whose value is above the threshold (see the sketch below).

  • threshold (float) – If method is ‘mixture’, then it corresponds to the minimum probability required to be considered as activated (if ‘auto’, then 0.5). If method is ‘threshold’, then it corresponds to the minimum amplitude to detect as onset. If ‘auto’, takes a value between the max and the min of the signal.
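
A minimal sketch of the ‘threshold’ method, assuming that ‘auto’ picks the midpoint between the signal's max and min (an assumption for illustration):

>>> import numpy as np
>>>
>>> signal = np.cos(np.linspace(start=0, stop=20, num=1000))
>>> threshold = (signal.max() + signal.min()) / 2  # assumed 'auto' behaviour: the midpoint
>>> binary = (signal > threshold).astype(int)  # 1 where above the threshold, 0 elsewhere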

Returns

list – A list or array depending on the type passed.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = np.cos(np.linspace(start=0, stop=20, num=1000))
>>> binary = nk.signal_binarize(signal)
>>> fig = pd.DataFrame({"Raw": signal, "Binary": binary}).plot()
>>> fig 
signal_changepoints(signal, change='meanvar', penalty=None, show=False)[source]

Change Point Detection.

Only the PELT method is implemented for now.

Parameters
  • signal (Union[list, np.array, pd.Series]) – Vector of values.

  • change (str) – Can be one of “meanvar” (default), “mean” or “var”.

  • penalty (float) – The algorithm's penalty. Defaults to np.log(len(signal)).

  • show (bool) – If True, plot the signal with the detected changepoints. Defaults to False.

Returns

  • Array – Values indicating the samples at which the changepoints occur.

  • Fig – Plot of the signal with the changepoints marked (if show is True).

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.emg_simulate(burst_number=3)
>>> fig = nk.signal_changepoints(signal, change="var", show=True)
>>> fig 

References

  • Killick, R., Fearnhead, P., & Eckley, I. A. (2012). Optimal detection of changepoints with a linear computational cost. Journal of the American Statistical Association, 107(500), 1590-1598.

signal_decompose(signal, method='emd', n_components=None, **kwargs)[source]

Decompose a signal.

Signal decomposition into different sources using different methods, such as Empirical Mode Decomposition (EMD) or Singular spectrum analysis (SSA)-based signal separation method.

The extracted components can then be recombined into meaningful sources using signal_recompose().

Parameters
  • signal (Union[list, np.array, pd.Series]) – Vector of values.

  • method (str) – The decomposition method. Can be one of ‘emd’ or ‘ssa’.

  • n_components (int) – Number of components to extract. Only used for ‘ssa’ method. If None, will default to 50.

  • **kwargs – Other arguments passed to other functions.

Returns

Array – Components of the decomposed signal.

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> # Create complex signal
>>> signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01)  # High freq
>>> signal += 3 * nk.signal_simulate(duration=10, frequency=3, noise=0.01)  # Higher freq
>>> signal += 3 * np.linspace(0, 2, len(signal))  # Add baseline and trend
>>> signal += 2 * nk.signal_simulate(duration=10, frequency=0.1, noise=0)
>>>
>>> nk.signal_plot(signal)
>>>
>>> # EMD method
>>> components = nk.signal_decompose(signal, method="emd")
>>> fig = nk.signal_plot(components)  # Visualize components
>>> fig  
>>>
>>> # SSA method
>>> components = nk.signal_decompose(signal, method="ssa")
>>> fig = nk.signal_plot(components)  # Visualize components
>>> fig  
signal_detrend(signal, method='polynomial', order=1, regularization=500, alpha=0.75, window=1.5, stepsize=0.02)[source]

Polynomial detrending of signal.

Apply a baseline (order = 0), linear (order = 1), or polynomial (order > 1) detrending to the signal (i.e., removing a general trend). One can also use other methods, such as smoothness priors approach described by Tarvainen (2002) or LOESS regression, but these scale badly for long signals.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • method (str) – Can be one of ‘polynomial’ (default; traditional detrending of a given order) or ‘tarvainen2002’ to use the smoothness priors approach described by Tarvainen (2002) (mostly used in HRV analyses as a lowpass filter to remove complex trends), ‘loess’ for LOESS smoothing trend removal or ‘locreg’ for local linear regression (the ‘runline’ algorithm from chronux).

  • order (int) – Only used if method is ‘polynomial’. The order of the polynomial. 0, 1 or > 1 for a baseline (‘constant detrend’, i.e., remove only the mean), linear (remove the linear trend) or polynomial detrending, respectively. Can also be ‘auto’, in which case it will attempt to find the optimal order to minimize the RMSE.

  • regularization (int) – Only used if method=’tarvainen2002’. The regularization parameter (default to 500).

  • alpha (float) – Only used if method is ‘loess’. The parameter which controls the degree of smoothing.

  • window (float) – Only used if method is ‘locreg’. The detrending ‘window’ should correspond to the desired low frequency band to remove multiplied by the sampling rate (for instance, 1.5*1000 will remove frequencies below 1.5Hz for a signal sampled at 1000Hz).

  • stepsize (float) – Only used if method is ‘locreg’. Similarly to ‘window’, ‘stepsize’ should also be multiplied by the sampling rate.

Returns

array – Vector containing the detrended signal.

See also

signal_filter(), fit_loess()

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>> import matplotlib.pyplot as plt
>>>
>>> # Simulate signal with low and high frequency
>>> signal = nk.signal_simulate(frequency=[0.1, 2], amplitude=[2, 0.5], sampling_rate=100)
>>> signal = signal + (3 + np.linspace(0, 6, num=len(signal)))  # Add baseline and linear trend
>>>
>>> # Apply detrending algorithms
>>> baseline = nk.signal_detrend(signal, order=0)  # Constant detrend (removes the mean)
>>> linear = nk.signal_detrend(signal, order=1)  # Linear detrend
>>> quadratic = nk.signal_detrend(signal, order=2)  # Quadratic detrend
>>> cubic = nk.signal_detrend(signal, order=3)  # Cubic detrend
>>> poly10 = nk.signal_detrend(signal, order=10)  # Linear detrend (10th order)
>>> tarvainen = nk.signal_detrend(signal, method='tarvainen2002')  # Tarvainen (2002) method
>>> loess = nk.signal_detrend(signal, method='loess')  # LOESS detrend (smooth removal)
>>> locreg = nk.signal_detrend(signal, method='locreg',
...                            window=1.5*100, stepsize=0.02*100)  # Local regression (100Hz)
>>>
>>> # Visualize different methods
>>> axes = pd.DataFrame({"Original signal": signal,
...                      "Baseline": baseline,
...                      "Linear": linear,
...                      "Quadratic": quadratic,
...                      "Cubic": cubic,
...                      "Polynomial (10th)": poly10,
...                      "Tarvainen": tarvainen,
...                      "LOESS": loess,
...                      "Local Regression": locreg}).plot(subplots=True)
>>> # Plot horizontal lines to better visualize the detrending
>>> for subplot in axes: 
...     subplot.axhline(y=0, color='k', linestyle='--') 

References

  • Tarvainen, M. P., Ranta-Aho, P. O., & Karjalainen, P. A. (2002). An advanced detrending method with application to HRV analysis. IEEE Transactions on Biomedical Engineering, 49(2), 172-175. https://ieeexplore.ieee.org/document/979357

signal_distort(signal, sampling_rate=1000, noise_shape='laplace', noise_amplitude=0, noise_frequency=100, powerline_amplitude=0, powerline_frequency=50, artifacts_amplitude=0, artifacts_frequency=100, artifacts_number=5, linear_drift=False, random_state=None, silent=False)[source]

Signal distortion.

Add noise of a given frequency, amplitude and shape to a signal.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).

  • noise_shape (str) – The shape of the noise. Can be one of ‘laplace’ (default) or ‘gaussian’.

  • noise_amplitude (float) – The amplitude of the noise (the scale of the random function, relative to the standard deviation of the signal).

  • noise_frequency (float) – The frequency of the noise (in Hz, i.e., samples/second).

  • powerline_amplitude (float) – The amplitude of the powerline noise (relative to the standard deviation of the signal).

  • powerline_frequency (float) – The frequency of the powerline noise (in Hz, i.e., samples/second).

  • artifacts_amplitude (float) – The amplitude of the artifacts (relative to the standard deviation of the signal).

  • artifacts_frequency (int) – The frequency of the artifacts (in Hz, i.e., samples/second).

  • artifacts_number (int) – The number of artifact bursts. The bursts have a random duration between 1 and 10% of the signal duration.

  • linear_drift (bool) – Whether or not to add linear drift to the signal.

  • random_state (int) – Seed for the random number generator. Keep it fixed for reproducible results.

  • silent (bool) – Whether or not to display warning messages.

Returns

array – Vector containing the distorted signal.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10, frequency=0.5)
>>>
>>> # Noise
>>> noise = pd.DataFrame({"Freq200": nk.signal_distort(signal, noise_frequency=200),
...                       "Freq50": nk.signal_distort(signal, noise_frequency=50),
...                       "Freq10": nk.signal_distort(signal, noise_frequency=10),
...                       "Freq5": nk.signal_distort(signal, noise_frequency=5),
...                       "Raw": signal}).plot()
>>> noise 
>>>
>>> # Artifacts
>>> artifacts = pd.DataFrame({"1Hz": nk.signal_distort(signal, noise_amplitude=0,
...                                                    artifacts_frequency=1, artifacts_amplitude=0.5),
...                           "5Hz": nk.signal_distort(signal, noise_amplitude=0,
...                                                    artifacts_frequency=5, artifacts_amplitude=0.2),
...                           "Raw": signal}).plot()
>>> artifacts 
signal_filter(signal, sampling_rate=1000, lowcut=None, highcut=None, method='butterworth', order=2, window_size='default', powerline=50)[source]

Filter a signal using ‘butterworth’, ‘fir’ or ‘savgol’ filters.

Apply a lowpass (if ‘highcut’ frequency is provided), highpass (if ‘lowcut’ frequency is provided) or bandpass (if both are provided) filter to the signal.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).

  • lowcut (float) – Lower cutoff frequency in Hz. The default is None.

  • highcut (float) – Upper cutoff frequency in Hz. The default is None.

  • method (str) – Can be one of ‘butterworth’, ‘fir’, ‘bessel’, ‘savgol’ or ‘powerline’. Note that for Butterworth, the function uses the SOS method from scipy.signal.sosfiltfilt, recommended for general purpose filtering. One can also specify ‘butterworth_ba’ for a more traditional and legacy method (often implemented in other software).

  • order (int) – Only used if method is ‘butterworth’ or ‘savgol’. Order of the filter (default is 2).

  • window_size (int) – Only used if method is ‘savgol’. The length of the filter window (i.e. the number of coefficients). Must be an odd integer. If ‘default’, will be set to the sampling rate divided by 10 (101 if the sampling rate is 1000 Hz).

  • powerline (int) – Only used if method is ‘powerline’. The powerline frequency (normally 50 Hz or 60 Hz).

Returns

array – Vector containing the filtered signal.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10, frequency=0.5) # Low freq
>>> signal += nk.signal_simulate(duration=10, frequency=5) # High freq
>>>
>>> # Lowpass
>>> fig1 = pd.DataFrame({"Raw": signal,
...                      "Butter_2": nk.signal_filter(signal, highcut=3, method='butterworth', order=2),
...                      "Butter_2_BA": nk.signal_filter(signal, highcut=3, method='butterworth_ba', order=2),
...                      "Butter_5": nk.signal_filter(signal, highcut=3, method='butterworth', order=5),
...                      "Butter_5_BA": nk.signal_filter(signal, highcut=3, method='butterworth_ba', order=5),
...                      "Bessel_2": nk.signal_filter(signal, highcut=3, method='bessel', order=2),
...                      "Bessel_5": nk.signal_filter(signal, highcut=3, method='bessel', order=5),
...                      "FIR": nk.signal_filter(signal, highcut=3, method='fir')}).plot(subplots=True)
>>> fig1 
>>> # Highpass
>>> fig2 = pd.DataFrame({"Raw": signal,
...                      "Butter_2": nk.signal_filter(signal, lowcut=2, method='butterworth', order=2),
...                      "Butter_2_ba": nk.signal_filter(signal, lowcut=2, method='butterworth_ba', order=2),
...                      "Butter_5": nk.signal_filter(signal, lowcut=2, method='butterworth', order=5),
...                      "Butter_5_BA": nk.signal_filter(signal, lowcut=2, method='butterworth_ba', order=5),
...                      "Bessel_2": nk.signal_filter(signal, lowcut=2, method='bessel', order=2),
...                      "Bessel_5": nk.signal_filter(signal, lowcut=2, method='bessel', order=5),
...                      "FIR": nk.signal_filter(signal, lowcut=2, method='fir')}).plot(subplots=True)
>>> fig2 
>>> # Bandpass in real-life scenarios
>>> original = nk.rsp_simulate(duration=30, method="breathmetrics", noise=0)
>>> signal = nk.signal_distort(original, noise_frequency=[0.1, 2, 10, 100], noise_amplitude=1,
...                            powerline_amplitude=1)
>>>
>>> # Bandpass between 10 and 30 breaths per minute (respiratory rate range)
>>> fig3 = pd.DataFrame({"Raw": signal,
...                      "Butter_2": nk.signal_filter(signal, lowcut=10/60, highcut=30/60,
...                                                   method='butterworth', order=2),
...                      "Butter_2_BA": nk.signal_filter(signal, lowcut=10/60, highcut=30/60,
...                                                      method='butterworth_ba', order=2),
...                      "Butter_5": nk.signal_filter(signal, lowcut=10/60, highcut=30/60,
...                                                   method='butterworth', order=5),
...                      "Butter_5_BA": nk.signal_filter(signal, lowcut=10/60, highcut=30/60,
...                                                      method='butterworth_ba', order=5),
...                      "Bessel_2": nk.signal_filter(signal, lowcut=10/60, highcut=30/60,
...                                                   method='bessel', order=2),
...                      "Bessel_5": nk.signal_filter(signal, lowcut=10/60, highcut=30/60,
...                                                   method='bessel', order=5),
...                      "FIR": nk.signal_filter(signal, lowcut=10/60, highcut=30/60,
...                                              method='fir'),
...                      "Savgol": nk.signal_filter(signal, method='savgol')}).plot(subplots=True)
>>> fig3 
signal_findpeaks(signal, height_min=None, height_max=None, relative_height_min=None, relative_height_max=None, relative_mean=True, relative_median=False, relative_max=False)[source]

Find peaks in a signal.

Locate peaks (local maxima) in a signal and their related characteristics, such as height (prominence), width and distance with other peaks.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • height_min (float) – The minimum height (i.e., amplitude in terms of absolute values). For example, height_min=20 will remove all peaks whose height is smaller than or equal to 20 (in the provided signal’s values).

  • height_max (float) – The maximum height (i.e., amplitude in terms of absolute values).

  • relative_height_min (float) – The minimum height (i.e., amplitude) relative to the sample (see below). For example, relative_height_min=-2.96 will remove all peaks whose height lies more than 2.96 standard deviations below the mean of the heights.

  • relative_height_max (float) – The maximum height (i.e., amplitude) relative to the sample (see below).

  • relative_mean (bool) – If a relative threshold is specified, how should it be computed (i.e., relative to what?). relative_mean=True will use Z-scores.

  • relative_median (bool) – If a relative threshold is specified, how should it be computed (i.e., relative to what?). Relative to median uses a more robust form of standardization (see standardize()).

  • relative_max (bool) – If a relative threshold is specified, how should it be computed (i.e., relative to what?). Relative to max will consider the maximum height as the reference (see the sketch after this list).
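
To illustrate what a relative threshold does, here is a sketch of z-score-based filtering (the relative_mean option), for illustration only:

>>> import numpy as np
>>>
>>> heights = np.array([0.5, 1.0, 1.2, 3.0, 0.9])  # hypothetical peak heights
>>> z = (heights - heights.mean()) / heights.std()  # standardize relative to the mean
>>> kept = heights[z >= -2.96]  # relative_height_min=-2.96 keeps peaks above that z-score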

Returns

dict – Returns a dict containing the following arrays:

  • ‘Peaks’: the peak indices (relative to the given signal). For instance, the value 3 means that the third data point of the signal is a peak.

  • ‘Distance’: for each peak, the closest distance to another peak. Note that these values will be recomputed after filtering to match the selected peaks.

  • ‘Height’: the prominence of each peak. See scipy.signal.peak_prominences().

  • ‘Width’: the width of each peak. See scipy.signal.peak_widths().

  • ‘Onsets’: the onset, or start (left trough), of each peak.

  • ‘Offsets’: the offset, or end (right trough), of each peak.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>> import scipy.misc
>>>
>>> signal = nk.signal_simulate(duration=5)
>>> info = nk.signal_findpeaks(signal)
>>> fig1 = nk.events_plot([info["Onsets"], info["Peaks"]], signal)
>>> fig1 
>>>
>>> signal = nk.signal_distort(signal)
>>> info = nk.signal_findpeaks(signal, height_min=1)
>>> fig2 = nk.events_plot(info["Peaks"], signal)
>>> fig2 
>>>
>>> # Filter peaks
>>> ecg = scipy.misc.electrocardiogram()
>>> signal = ecg[0:1000]
>>> info1 = nk.signal_findpeaks(signal, relative_height_min=0)
>>> info2 = nk.signal_findpeaks(signal, relative_height_min=1)
>>> fig3 = nk.events_plot([info1["Peaks"], info2["Peaks"]], signal)
>>> fig3 

See also

scipy.signal.find_peaks(), scipy.signal.peak_widths(), scipy.signal.peak_prominences(), eda_findpeaks(), ecg_findpeaks(), rsp_findpeaks(), signal_fixpeaks()

signal_fixpeaks(peaks, sampling_rate=1000, iterative=True, show=False, interval_min=None, interval_max=None, relative_interval_min=None, relative_interval_max=None, robust=False, method='Kubios')[source]

Correct erroneous peak placements.

Identify and correct erroneous peak placements based on outliers in peak-to-peak differences (period).

Parameters
  • peaks (list or array or DataFrame or Series or dict) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed to be obtained with ecg_findpeaks() or ppg_findpeaks() and to be of the same length as the input signal.

  • sampling_rate (int) – The sampling frequency of the signal that contains the peaks (in Hz, i.e., samples/second).

  • iterative (bool) – Whether or not to apply the artifact correction repeatedly (results in superior artifact correction).

  • show (bool) – Whether or not to visualize artifacts and artifact thresholds.

  • interval_min (float) – The minimum interval between the peaks.

  • interval_max (float) – The maximum interval between the peaks.

  • relative_interval_min (float) – The minimum interval between the peaks as relative to the sample (expressed in standard deviation from the mean).

  • relative_interval_max (float) – The maximum interval between the peaks as relative to the sample (expressed in standard deviation from the mean).

  • robust (bool) – Use a robust method of standardization (see standardize()) for the relative thresholds.

  • method (str) – Either “Kubios” or “Neurokit”. “Kubios” uses the artifact detection and correction described in Lipponen, J. A., & Tarvainen, M. P. (2019). Note that “Kubios” is only meant for peaks in ECG or PPG. “neurokit” can be used with peaks in ECG, PPG, or respiratory data.

Returns

  • peaks_clean (array) – The corrected peak locations.

  • artifacts (dict) – Only if method=”Kubios”. A dictionary containing the indices of artifacts, accessible with the keys “ectopic”, “missed”, “extra”, and “longshort”.

See also

signal_findpeaks(), ecg_findpeaks(), ecg_peaks(), ppg_findpeaks(), ppg_peaks()

Examples

>>> import neurokit2 as nk
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> # Kubios
>>> ecg = nk.ecg_simulate(duration=240, noise=0.25, heart_rate=70, random_state=42)
>>> rpeaks_uncorrected = nk.ecg_findpeaks(ecg)
>>> artifacts, rpeaks_corrected = nk.signal_fixpeaks(rpeaks_uncorrected, iterative=True,
...                                                  show=True, method="Kubios")
>>> rate_corrected = nk.signal_rate(rpeaks_corrected, desired_length=len(ecg))
>>> rate_uncorrected = nk.signal_rate(rpeaks_uncorrected, desired_length=len(ecg))
>>>
>>> fig, ax = plt.subplots()
>>> ax.plot(rate_uncorrected, label="heart rate without artifact correction") 
>>> ax.plot(rate_corrected, label="heart rate with artifact correction") 
>>> ax.legend(loc="upper right") 
>>>
>>> # NeuroKit
>>> signal = nk.signal_simulate(duration=4, sampling_rate=1000, frequency=1)
>>> peaks_true = nk.signal_findpeaks(signal)["Peaks"]
>>> peaks = np.delete(peaks_true, [1])  # create gaps
>>>
>>> signal = nk.signal_simulate(duration=20, sampling_rate=1000, frequency=1)
>>> peaks_true = nk.signal_findpeaks(signal)["Peaks"]
>>> peaks = np.delete(peaks_true, [5, 15])  # create gaps
>>> peaks = np.sort(np.append(peaks, [1350, 11350, 18350]))  # add artifacts
>>>
>>> peaks_corrected = nk.signal_fixpeaks(peaks=peaks, interval_min=0.5, interval_max=1.5, method="neurokit")
>>> # Plot and shift original peaks to the right to see the difference.
>>> fig = nk.events_plot([peaks + 50, peaks_corrected], signal)
>>> fig 

References

  • Lipponen, J. A., & Tarvainen, M. P. (2019). A robust algorithm for heart rate variability time series artefact correction using novel beat classification. Journal of Medical Engineering & Technology, 43(3), 173-181. doi:10.1080/03091902.2019.1640306

signal_formatpeaks(info, desired_length, peak_indices=None)[source]

Transforms a peak-info dict into a signal of a given length.
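
The docstring provides no example; the following is a hedged usage sketch, assuming the info dict comes from signal_findpeaks() and that peak_indices takes the peak samples (both assumptions for illustration):

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=5, sampling_rate=1000)
>>> info = nk.signal_findpeaks(signal)
>>> # Hypothetical usage: expand the peak info into a vector of the signal's length
>>> formatted = nk.signal_formatpeaks(info, desired_length=len(signal), peak_indices=info["Peaks"])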

signal_interpolate(x_values, y_values, x_new=None, method='quadratic')[source]

Interpolate a signal.

Interpolate a signal using different methods.

Parameters
  • x_values (Union[list, np.array, pd.Series]) – The samples corresponding to the values to be interpolated.

  • y_values (Union[list, np.array, pd.Series]) – The values to be interpolated.

  • x_new (Union[list, np.array, pd.Series] or int) – The samples at which to interpolate the y_values. Samples before the first value in x_values or after the last value in x_values will be extrapolated. If an integer is passed, x_new will be considered as the desired length of the interpolated signal between the first and the last values of x_values; in that case, no extrapolation will be done for values before the first or after the last value of x_values.

  • method (str or int) – Method of interpolation. Can be ‘linear’, ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, ‘next’ or ‘monotone_cubic’. ‘zero’, ‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of zeroth, first, second or third order; ‘previous’ and ‘next’ simply return the previous or next value of the point. Can also be an integer specifying the order of the spline interpolator to use. See https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.PchipInterpolator.html for details on the ‘monotone_cubic’ method.

Returns

array – Vector of interpolated samples.

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>> import matplotlib.pyplot as plt
>>>
>>> # Generate points
>>> signal = nk.signal_simulate(duration=1, sampling_rate=10)
>>>
>>> # List all interpolation methods and interpolation parameters
>>> interpolation_methods = ["zero", "linear", "quadratic", "cubic", 5, "nearest", "monotone_cubic"]
>>> x_values = np.linspace(0, 1, num=10)
>>> x_new = np.linspace(0, 1, num=1000)
>>>
>>> # Visualize all interpolations
>>> fig, ax = plt.subplots() 
>>> ax.scatter(x_values, signal, label="original datapoints", zorder=3) 
>>> for im in interpolation_methods:
...     signal_interpolated = nk.signal_interpolate(x_values, signal, x_new=x_new, method=im)
...     ax.plot(x_new, signal_interpolated, label=im) 
>>> ax.legend(loc="upper left") 
signal_merge(signal1, signal2, time1=[0, 10], time2=[0, 10])[source]

Arbitrary addition of two signals with different time ranges.

Parameters
  • signal1 (Union[list, np.array, pd.Series]) – The first signal (i.e., a time series) in the form of a vector of values.

  • signal2 (Union[list, np.array, pd.Series]) – The second signal (i.e., a time series) in the form of a vector of values.

  • time1 (list) – List containing two numeric values corresponding to the beginning and end of signal1.

  • time2 (list) – Same as above, but for signal2.

Returns

array – Vector containing the sum of the two signals.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal1 = np.cos(np.linspace(start=0, stop=10, num=100))
>>> signal2 = np.cos(np.linspace(start=0, stop=20, num=100))
>>>
>>> signal = nk.signal_merge(signal1, signal2, time1=[0, 10], time2=[-5, 5])
>>> nk.signal_plot(signal)
signal_period(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')[source]

Calculate signal period from a series of peaks.

Parameters
  • peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed it is of the same length as the input signal in which occurrences of R-peaks are marked as “1”, with such containers obtained with e.g., ecg_findpeaks() or rsp_findpeaks().

  • sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.

  • desired_length (int) – If left at the default None, the returned period will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned period will be interpolated between peaks over desired_length samples. To interpolate the period over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs in the signal. Defaults to None.

  • interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.

Returns

array – A vector containing the period.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1)
>>> info = nk.signal_findpeaks(signal)
>>>
>>> period = nk.signal_period(peaks=info["Peaks"], desired_length=len(signal))
>>> nk.signal_plot(period)
signal_phase(signal, method='radians')[source]

Compute the phase of the signal.

The real phase has the property of rotating uniformly, leading to a uniform distribution density. The prophase typically does not fulfill this property. This function applies a nonlinear transformation to the phase signal that makes its distribution exactly uniform. If a binary vector is provided (containing 2 unique values), the function will compute the phase of completion of each state, as denoted by each value.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • method (str) – The values in which the phase is expressed. Can be ‘radians’ (default), ‘degrees’ (for values between 0 and 360) or ‘percents’ (for values between 0 and 1).

Returns

array – A vector containing the phase of the signal, between 0 and 2*pi.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10)
>>> phase = nk.signal_phase(signal)
>>> nk.signal_plot([signal, phase])
>>>
>>> rsp = nk.rsp_simulate(duration=30)
>>> phase = nk.signal_phase(rsp, method="degrees")
>>> nk.signal_plot([rsp, phase])
>>>
>>> # Percentage of completion of two phases
>>> signal = nk.signal_binarize(nk.signal_simulate(duration=10))
>>> phase = nk.signal_phase(signal, method="percents")
>>> nk.signal_plot([signal, phase])
signal_plot(signal, sampling_rate=None, subplots=False, standardize=False, labels=None, **kwargs)[source]

Plot signal with events as vertical lines.

Parameters
  • signal (array or DataFrame) – Signal array (can be a dataframe with many signals).

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Needs to be supplied if the data should be plotted over time in seconds. Otherwise the data is plotted over samples. Defaults to None.

  • subplots (bool) – If True, each signal is plotted in a subplot.

  • standardize (bool) – If True, all signals will have the same scale (useful for visualisation).

  • labels (str or list) – Defaults to None.

  • **kwargs (optional) – Arguments passed to matplotlib plotting.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10, sampling_rate=1000)
>>> nk.signal_plot(signal, labels='signal1', sampling_rate=1000, color="red")
>>>
>>> data = pd.DataFrame({"Signal2": np.cos(np.linspace(start=0, stop=20, num=1000)),
...                      "Signal3": np.sin(np.linspace(start=0, stop=20, num=1000)),
...                      "Signal4": nk.signal_binarize(np.cos(np.linspace(start=0, stop=40, num=1000)))})
>>> nk.signal_plot(data, labels=['signal_1', 'signal_2', 'signal_3'], subplots=False)
>>> nk.signal_plot([signal, data], standardize=True)
signal_power(signal, frequency_band, sampling_rate=1000, continuous=False, show=False, **kwargs)[source]

Compute the power of a signal in a given frequency band.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • frequency_band (tuple or list) – Tuple or list of tuples indicating the range of frequencies to compute the power in.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).

  • continuous (bool) – If False (default), compute the overall power in the frequency band(s); if True, compute the continuous (time-varying) power.

  • show (bool) – If True, will return a plot of the power in the given frequency band(s). Defaults to False.

  • **kwargs – Keyword arguments to be passed to signal_psd().

Returns

pd.DataFrame – A DataFrame containing the Power Spectrum values and a plot if show is True.

Examples

>>> import neurokit2 as nk
>>> import numpy as np
>>>
>>> # Instant power
>>> signal = nk.signal_simulate(frequency=5) + 0.5*nk.signal_simulate(frequency=20)
>>> power_plot = nk.signal_power(signal, frequency_band=[(18, 22), (10, 14)], method="welch", show=True)
>>> power_plot 
>>>
>>> # Continuous (simulated signal)
>>> signal = np.concatenate((nk.ecg_simulate(duration=30, heart_rate=75),  nk.ecg_simulate(duration=30, heart_rate=85)))
>>> power = nk.signal_power(signal, frequency_band=[(72/60, 78/60), (82/60, 88/60)], continuous=True)
>>> processed, _ = nk.ecg_process(signal)
>>> power["ECG_Rate"] = processed["ECG_Rate"]
>>> nk.signal_plot(power, standardize=True) 
>>>
>>> # Continuous (real signal)
>>> signal = nk.data("bio_eventrelated_100hz")["ECG"]
>>> power = nk.signal_power(signal, sampling_rate=100, frequency_band=[(0.12, 0.15), (0.15, 0.4)], continuous=True)
>>> processed, _ = nk.ecg_process(signal, sampling_rate=100)
>>> power["ECG_Rate"] = processed["ECG_Rate"]
>>> nk.signal_plot(power, standardize=True)
signal_psd(signal, sampling_rate=1000, method='welch', show=True, min_frequency=0, max_frequency=inf, window=None, ar_order=15, order_criteria='KIC', order_corrected=True, burg_norm=True)[source]

Compute the Power Spectral Density (PSD).

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).

  • method (str) – Either ‘welch’ (default; requires the ‘scipy’ package), ‘multitapers’ (requires the ‘mne’ package), or ‘burg’.

  • show (bool) – If True, will return a plot. If False, will return the density values that can be plotted externally.

  • min_frequency (float) – The minimum frequency.

  • max_frequency (float) – The maximum frequency.

  • window (int) – Length of each window in seconds (for the Welch method). If None (default), the window will be automatically calculated to capture at least 2 cycles of min_frequency (e.g., with min_frequency=0.5 Hz, a 4-second window). If the length of the recording does not allow for this, the window will default to half of the length of the recording.

  • ar_order (int) – The order of autoregression (for AR methods e.g. Burg).

  • order_criteria (str) – The criteria to automatically select order in parametric PSD (for AR methods e.g. Burg).

  • order_corrected (bool) – Specify for AIC and KIC order_criteria. If unsure which method to use to choose the order, rely on the default of corrected KIC.

  • burg_norm (bool) – Normalization for the Burg method.

See also

signal_filter(), mne.time_frequency.psd_array_multitaper(), scipy.signal.welch()

Returns

pd.DataFrame – A DataFrame containing the Power Spectrum values and a plot if show is True.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(frequency=5) + 0.5*nk.signal_simulate(frequency=20)
>>>
>>> fig1 = nk.signal_psd(signal, method="multitapers")
>>> fig1 
>>> fig2 = nk.signal_psd(signal, method="welch", min_frequency=1)
>>> fig2 
>>> fig3 = nk.signal_psd(signal, method="burg", min_frequency=1)
>>>
>>> data = nk.signal_psd(signal, method="multitapers", max_frequency=30, show=False)
>>> fig4 = data.plot(x="Frequency", y="Power")
>>> fig4 
>>> data = nk.signal_psd(signal, method="welch", max_frequency=30, show=False, min_frequency=1)
>>> fig5 = data.plot(x="Frequency", y="Power")
>>> fig5 
signal_rate(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')[source]

Calculate signal rate from a series of peaks.

This function can also be called via ecg_rate(), ppg_rate() or rsp_rate() (aliases provided for consistency).

Parameters
  • peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed it is of the same length as the input signal in which occurrences of R-peaks are marked as “1”, with such containers obtained with e.g., ecg_findpeaks() or rsp_findpeaks().

  • sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.

  • desired_length (int) – If left at the default None, the returned rate will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned rate will be interpolated between peaks over desired_length samples. To interpolate the rate over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs in the signal. Defaults to None.

  • interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.

Returns

array – A vector containing the rate.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1)
>>> info = nk.signal_findpeaks(signal)
>>>
>>> rate = nk.signal_rate(peaks=info["Peaks"], desired_length=len(signal))
>>> fig = nk.signal_plot(rate)
>>> fig 
signal_recompose(components, method='wcorr', threshold=0.5, keep_sd=None, **kwargs)[source]

Combine signal sources after decomposition.

Combine and reconstruct meaningful signal sources after signal decomposition.

Parameters
  • components (array) – Array of components obtained via signal_decompose().

  • method (str) – The recomposition method. Currently, only ‘wcorr’ (weighted correlation) is supported.

  • threshold (float) – The threshold used to group components together.

  • keep_sd (float) – If a float is specified, will only keep the reconstructed components that are superior or equal to that percentage of the max standard deviation (SD) of the components. For instance, keep_sd=0.01 will remove all components whose SD is lower than 1% of the max SD. This can be used to filter out noise.

  • **kwargs – Other arguments passed to other functions, for instance metric='chebyshev'.

Returns

Array – The recomposed components.

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> # Create complex signal
>>> signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01)  # High freq
>>> signal += 3 * nk.signal_simulate(duration=10, frequency=3, noise=0.01)  # Higher freq
>>> signal += 3 * np.linspace(0, 2, len(signal))  # Add baseline and trend
>>> signal += 2 * nk.signal_simulate(duration=10, frequency=0.1, noise=0)
>>>
>>> # Decompose signal
>>> components = nk.signal_decompose(signal, method='emd')
>>>
>>> # Recompose
>>> recomposed = nk.signal_recompose(components, method='wcorr', threshold=0.90)
>>> fig = nk.signal_plot(recomposed)  # Visualize the recomposed components
>>> fig  
signal_resample(signal, desired_length=None, sampling_rate=None, desired_sampling_rate=None, method='interpolation')[source]

Resample a continuous signal to a different length or sampling rate.

Up- or down-sample a signal. The user can specify either a desired length for the vector, or input the original sampling rate and the desired sampling rate. See https://github.com/neuropsychology/NeuroKit/scripts/resampling.ipynb for a comparison of the methods.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • desired_length (int) – The desired length of the signal.

  • sampling_rate (int) – The original sampling frequency (in Hz, i.e., samples/second).

  • desired_sampling_rate (int) – The desired (output) sampling frequency (in Hz, i.e., samples/second).

  • method (str) – Can be ‘interpolation’ (see scipy.ndimage.zoom()), ‘numpy’ for numpy’s interpolation (see numpy.interp()), ‘pandas’ for Pandas’ time series resampling, ‘poly’ (see scipy.signal.resample_poly()) or ‘FFT’ (see scipy.signal.resample()) for the Fourier method. FFT is the most accurate (if the signal is periodic), but becomes exponentially slower as the signal length increases. In contrast, ‘interpolation’ is the fastest, followed by ‘numpy’, ‘poly’ and ‘pandas’.

Returns

array – Vector containing resampled signal values.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = np.cos(np.linspace(start=0, stop=20, num=100))
>>>
>>> # Downsample
>>> downsampled_interpolation = nk.signal_resample(signal, method="interpolation",
...                                                sampling_rate=1000, desired_sampling_rate=500)
>>> downsampled_fft = nk.signal_resample(signal, method="FFT",
...                                      sampling_rate=1000, desired_sampling_rate=500)
>>> downsampled_poly = nk.signal_resample(signal, method="poly",
...                                       sampling_rate=1000, desired_sampling_rate=500)
>>> downsampled_numpy = nk.signal_resample(signal, method="numpy",
...                                        sampling_rate=1000, desired_sampling_rate=500)
>>> downsampled_pandas = nk.signal_resample(signal, method="pandas",
...                                         sampling_rate=1000, desired_sampling_rate=500)
>>>
>>> # Upsample
>>> upsampled_interpolation = nk.signal_resample(downsampled_interpolation,
...                                              method="interpolation",
...                                              sampling_rate=500, desired_sampling_rate=1000)
>>> upsampled_fft = nk.signal_resample(downsampled_fft, method="FFT",
...                                    sampling_rate=500, desired_sampling_rate=1000)
>>> upsampled_poly = nk.signal_resample(downsampled_poly, method="poly",
...                                     sampling_rate=500, desired_sampling_rate=1000)
>>> upsampled_numpy = nk.signal_resample(downsampled_numpy, method="numpy",
...                                      sampling_rate=500, desired_sampling_rate=1000)
>>> upsampled_pandas = nk.signal_resample(downsampled_pandas, method="pandas",
...                                       sampling_rate=500, desired_sampling_rate=1000)
>>>
>>> # Compare with original
>>> fig = pd.DataFrame({"Original": signal,
...                     "Interpolation": upsampled_interpolation,
...                     "FFT": upsampled_fft,
...                     "Poly": upsampled_poly,
...                     "Numpy": upsampled_numpy,
...                     "Pandas": upsampled_pandas}).plot(style='.-')
>>> fig 
>>>
>>> # Timing benchmarks
>>> %timeit nk.signal_resample(signal, method="interpolation",
...                            sampling_rate=1000, desired_sampling_rate=500) 
>>> %timeit nk.signal_resample(signal, method="FFT",
...                            sampling_rate=1000, desired_sampling_rate=500) 
>>> %timeit nk.signal_resample(signal, method="poly",
...                            sampling_rate=1000, desired_sampling_rate=500) 
>>> %timeit nk.signal_resample(signal, method="numpy",
...                            sampling_rate=1000, desired_sampling_rate=500) 
>>> %timeit nk.signal_resample(signal, method="pandas",
...                            sampling_rate=1000, desired_sampling_rate=500) 

See also

scipy.signal.resample_poly(), scipy.signal.resample(), scipy.ndimage.zoom()

signal_simulate(duration=10, sampling_rate=1000, frequency=1, amplitude=0.5, noise=0, silent=False)[source]

Simulate a continuous signal.

Parameters
  • duration (float) – Desired length of duration (s).

  • sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).

  • frequency (float or list) – Oscillatory frequency of the signal (in Hz, i.e., oscillations per second).

  • amplitude (float or list) – Amplitude of the oscillations.

  • noise (float) – Noise level (amplitude of the laplace noise).

  • silent (bool) – If False (default), might print warnings if impossible frequencies are queried.

Returns

array – The simulated signal.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> fig = pd.DataFrame({"1Hz": nk.signal_simulate(duration=5, frequency=1),
...                     "2Hz": nk.signal_simulate(duration=5, frequency=2),
...                     "Multi": nk.signal_simulate(duration=5, frequency=[0.5, 3], amplitude=[0.5, 0.2])}).plot()
>>> fig 
signal_smooth(signal, method='convolution', kernel='boxzen', size=10, alpha=0.1)[source]

Signal smoothing.

Signal smoothing can be achieved using either the convolution of a filter kernel with the input signal to compute the smoothed signal (Smith, 1997) or a LOESS regression.
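
The convolution approach can be sketched in a few lines of numpy (an illustration of the principle, using a plain 'boxcar' kernel):

>>> import numpy as np
>>>
>>> signal = np.cos(np.linspace(start=0, stop=10, num=1000))
>>> kernel = np.ones(10) / 10  # a simple 'boxcar' (moving average) kernel of size 10
>>> smoothed = np.convolve(signal, kernel, mode="same")  # smoothed signal, same length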

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • method (str) – Can be one of ‘convolution’ (default) or ‘loess’.

  • kernel (Union[str, np.array]) – Only used if method is ‘convolution’. Type of kernel to use; if array, use directly as the kernel. Can be one of ‘median’, ‘boxzen’, ‘boxcar’, ‘triang’, ‘blackman’, ‘hamming’, ‘hann’, ‘bartlett’, ‘flattop’, ‘parzen’, ‘bohman’, ‘blackmanharris’, ‘nuttall’, ‘barthann’, ‘kaiser’ (needs beta), ‘gaussian’ (needs std), ‘general_gaussian’ (needs power, width), ‘slepian’ (needs width) or ‘chebwin’ (needs attenuation).

  • size (int) – Only used if method is ‘convolution’. Size of the kernel; ignored if kernel is an array.

  • alpha (float) – Only used if method is ‘loess’. The parameter which controls the degree of smoothing.

Returns

array – Smoothed signal.

See also

fit_loess()

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = np.cos(np.linspace(start=0, stop=10, num=1000))
>>> distorted = nk.signal_distort(signal, noise_amplitude=[0.3, 0.2, 0.1, 0.05], noise_frequency=[5, 10, 50, 100])
>>>
>>> size = len(signal) // 100  # Integer kernel size (10 for a 1000-sample signal)
>>> signals = pd.DataFrame({"Raw": distorted,
...                         "Median": nk.signal_smooth(distorted, kernel='median', size=size-1),
...                         "BoxZen": nk.signal_smooth(distorted, kernel='boxzen', size=size),
...                         "Triang": nk.signal_smooth(distorted, kernel='triang', size=size),
...                         "Blackman": nk.signal_smooth(distorted, kernel='blackman', size=size),
...                         "Loess_01": nk.signal_smooth(distorted, method='loess', alpha=0.1),
...                         "Loess_02": nk.signal_smooth(distorted, method='loess', alpha=0.2),
...                         "Loess_05": nk.signal_smooth(distorted, method='loess', alpha=0.5)})
>>> fig = signals.plot()
>>> fig_magnify = signals[50:150].plot()  # Magnify
>>> fig_magnify 

References

  • Smith, S. W. (1997). The scientist and engineer’s guide to digital signal processing.

signal_synchrony(signal1, signal2, method='hilbert', window_size=50)[source]

Compute the synchrony (coupling) between two signals.

Compute a continuous index of coupling between two signals either using the ‘Hilbert’ method to get the instantaneous phase synchrony, or using rolling window correlation.

The instantaneous phase synchrony measures the phase similarity between signals at each timepoint. The phase refers to the angle of the signal, calculated through the Hilbert transform, which oscillates between -pi and pi radians. When two signals line up in phase, their angular difference becomes zero.

For less clean signals, windowed correlations are widely used because of their simplicity, and can be a good and robust approximation of the synchrony between two signals. The limitation is the need to select a window size.
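
For intuition, the 'hilbert' approach can be sketched as follows (an illustrative formulation of phase synchrony; the exact index computed internally may differ):

>>> import numpy as np
>>> import scipy.signal
>>>
>>> t = np.linspace(0, 10, 1000)
>>> signal1, signal2 = np.sin(2 * np.pi * 1 * t), np.sin(2 * np.pi * 1.5 * t)
>>>
>>> # Instantaneous phases via the analytic (Hilbert-transformed) signals
>>> phase1 = np.angle(scipy.signal.hilbert(signal1))
>>> phase2 = np.angle(scipy.signal.hilbert(signal2))
>>>
>>> # One common synchrony index: 1 when the signals are in phase, 0 in antiphase
>>> synchrony = 1 - np.sin(np.abs(phase1 - phase2) / 2)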

Parameters
  • signal1 (Union[list, np.array, pd.Series]) – Time series in the form of a vector of values.

  • signal2 (Union[list, np.array, pd.Series]) – Time series in the form of a vector of values.

  • method (str) – The method to use. Can be one of ‘hilbert’ or ‘correlation’.

  • window_size (int) – Only used if method=’correlation’. The number of samples to use for rolling correlation.

Returns

array – A vector containing the continuous index of coupling (synchrony) between the two signals.

Examples

>>> import neurokit2 as nk
>>>
>>> signal1 = nk.signal_simulate(duration=10, frequency=1)
>>> signal2 = nk.signal_simulate(duration=10, frequency=1.5)
>>>
>>> coupling_h = nk.signal_synchrony(signal1, signal2, method="hilbert")
>>> coupling_c = nk.signal_synchrony(signal1, signal2, method="correlation", window_size=1000/2)
>>>
>>> fig = nk.signal_plot([signal1, signal2, coupling_h, coupling_c])
>>> fig 

signal_zerocrossings(signal, direction='both')[source]

Locate the indices where the signal crosses zero.

Note that when the signal crosses zero between two points, the first index is returned.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • direction (str) – Direction in which the signal crosses zero, can be “positive”, “negative” or “both” (default).

Returns

array – Vector containing the indices of zero crossings.

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> signal = np.cos(np.linspace(start=0, stop=15, num=1000))
>>> zeros = nk.signal_zerocrossings(signal)
>>> fig = nk.events_plot(zeros, signal)
>>> fig 
>>>
>>> # Only positive (upward) or negative (downward) zerocrossings
>>> up = nk.signal_zerocrossings(signal, direction='positive')
>>> down = nk.signal_zerocrossings(signal, direction='negative')
>>> fig = nk.events_plot([up, down], signal)
>>> fig 
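
For reference, the detection reduces to locating sign changes between consecutive samples; a minimal numpy sketch of the 'both' case (illustrative):

>>> import numpy as np
>>>
>>> signal = np.cos(np.linspace(start=0, stop=15, num=1000))
>>>
>>> # A sign change between consecutive samples marks a crossing;
>>> # the first index of each crossing pair is returned
>>> crossings = np.nonzero(np.diff(np.signbit(signal).astype(int)))[0]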

Events

Submodule for NeuroKit.

events_find(event_channel, threshold='auto', threshold_keep='above', start_at=0, end_at=None, duration_min=1, duration_max=None, inter_min=0, discard_first=0, discard_last=0, event_labels=None, event_conditions=None)[source]

Find and select events in a continuous signal (e.g., from a photosensor).

Parameters
  • event_channel (array or list) – The channel containing the events.

  • threshold (str or float) – The threshold value by which to select the events. If “auto”, takes the value between the max and the min.

  • threshold_keep (str) – “above” or “below”: defines whether events correspond to values above or below the threshold. For photosensors, a white screen usually corresponds to higher values. Therefore, if your events are signaled by a black colour, the event values are the lower ones, and you should set the cut to “below”.

  • start_at (int) – Keep only events whose onset occurs after a particular time point.

  • end_at (int) – Keep only events whose onset occurs before a particular time point.

  • duration_min (int) – The minimum duration of an event to be considered as such (in time points).

  • duration_max (int) – The maximum duration of an event to be considered as such (in time points).

  • inter_min (int) – The minimum duration after an event for the subsequent event to be considered as such (in time points). Useful when spurious consecutive events are created due to very high sampling rate.

  • discard_first (int) – Discard the first n events. Useful if the experiment starts with some spurious events. If discard_first=0, no event is removed.

  • discard_last (int) – Discard the last n events. Useful if the experiment ends with some spurious events. If discard_last=0, no event is removed.

  • event_labels (list) – A list containing unique event identifiers. If None, will use the event index number.

  • event_conditions (list) – An optional list containing, for each event, for example the trial category, group or experimental conditions.

Returns

dict – Dict containing 3 or 4 arrays, ‘onset’ for event onsets, ‘duration’ for event durations, ‘label’ for the event identifiers and the optional ‘conditions’ passed to event_conditions.

Example

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=4)
>>> events = nk.events_find(signal)
>>> events 
{'onset': array(...),
 'duration': array(...),
 'label': array(...)}
>>>
>>> nk.events_plot(events, signal) 
<Figure ...>
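
For reference, the 'auto' threshold described above can also be set manually; a sketch assuming it corresponds to the midpoint between the signal's extrema:

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> event_channel = nk.signal_simulate(duration=4)
>>>
>>> # Assumed 'auto' behaviour: threshold halfway between the min and the max
>>> threshold = (np.max(event_channel) + np.min(event_channel)) / 2
>>> events = nk.events_find(event_channel, threshold=threshold)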
events_plot(events, signal=None, show=True, color='red', linestyle='--')[source]

Plot events in signal.

Parameters
  • events (list or ndarray or dict) – Events onset location. Can also be a list of lists, in which case it will mark them with different colors. If a dict is passed (e.g., from ‘events_find()’), will select only the ‘onset’ list.

  • signal (array or DataFrame) – Signal array (can be a dataframe with many signals).

  • show (bool) – If True, will return a plot. If False, will return a DataFrame that can be plotted externally.

  • color (str) – Argument passed to matplotlib plotting.

  • linestyle (str) – Argument passed to matplotlib plotting.

Returns

fig – Figure representing a plot of the signal and the event markers.

See also

events_find()

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> fig = nk.events_plot([1, 3, 5])
>>> fig 
>>>
>>> # With signal
>>> signal = nk.signal_simulate(duration=4)
>>> events = nk.events_find(signal)
>>> fig1 = nk.events_plot(events, signal)
>>> fig1 
>>>
>>> # Different events
>>> events1 = events["onset"]
>>> events2 = np.linspace(0, len(signal), 8)
>>> fig2 = nk.events_plot([events1, events2], signal)
>>> fig2 
>>>
>>> # Conditions
>>> events = nk.events_find(signal, event_conditions=["A", "B", "A", "B"])
>>> fig3 = nk.events_plot(events, signal)
>>> fig3 
>>>
>>> # Different colors for all events
>>> signal = nk.signal_simulate(duration=20)
>>> events = nk.events_find(signal)
>>> events = [[i] for i in events['onset']]
>>> fig4 = nk.events_plot(events, signal)
>>> fig4 
events_to_mne(events, event_conditions=None)[source]

Create MNE compatible events for integration with M/EEG.

Parameters
  • events (list or ndarray or dict) – Events onset location. Can also be a dict obtained through ‘events_find()’.

  • event_conditions (list) – An optional list containing, for each event, for example the trial category, group or experimental conditions. Defaults to None.

Returns

tuple – MNE-formatted events and the event id, that can be added via ‘raw.add_events(events)’, and a dictionary with the event names.

See also

events_find()

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=4)
>>> events = nk.events_find(signal)
>>> events, event_id = nk.events_to_mne(events)
>>> events 
array([[   1,    0,    0],
       [1001,    0,    0],
       [2001,    0,    0],
       [3001,    0,    0]])
>>> event_id 
{'event': 0}
>>>
>>> # Conditions
>>> events = nk.events_find(signal, event_conditions=["A", "B", "A", "B"])
>>> events, event_id = nk.events_to_mne(events)
>>> event_id 
{'B': 0, 'A': 1}

Data

Submodule for NeuroKit.

data(dataset='bio_eventrelated_100hz')[source]

Download example datasets.

Download and load available example datasets. Note that an internet connection is necessary.

Parameters

dataset (str) – The name of the dataset. The list and description of the available datasets can be found in the NeuroKit documentation.

Returns

DataFrame – The data.

Examples

>>> import neurokit2 as nk
>>>
>>> data = nk.data("bio_eventrelated_100hz")
read_acqknowledge(filename, sampling_rate='max', resample_method='interpolation', impute_missing=True)[source]

Read and format a BIOPAC’s AcqKnowledge file into a pandas’ dataframe.

The function outputs both the dataframe and the sampling rate (encoded within the AcqKnowledge file).

Parameters
  • filename (str) – Filename (with or without the extension) of a BIOPAC’s AcqKnowledge file.

  • sampling_rate (int) – Sampling rate (in Hz, i.e., samples/second). Since an AcqKnowledge file can contain signals recorded at different rates, harmonization is necessary in order to convert it to a DataFrame. Thus, if sampling_rate is set to ‘max’ (default), will keep the maximum recorded sampling rate and upsample the channels with lower rate if necessary (using the signal_resample() function). If the sampling rate is set to a given value, will resample the signals to the desired value. Note that the value of the sampling rate is outputted along with the data.

  • resample_method (str) – Method of resampling (see signal_resample()).

  • impute_missing (bool) – Sometimes, due to connection issues, the signal has some holes (short periods without signal). If ‘impute_missing’ is True, will automatically fill the signal interruptions using padding.

Returns

  • df (DataFrame) – The AcqKnowledge file converted to a dataframe.

  • sampling rate (int) – The sampling rate (in Hz, i.e., samples/second) of the resulting dataframe.

See also

signal_resample()

Example

>>> import neurokit2 as nk
>>>
>>> data, sampling_rate = nk.read_acqknowledge('file.acq') 

Epochs

Submodule for NeuroKit.

epochs_create(data, events=None, sampling_rate=1000, epochs_start=0, epochs_end=1, event_labels=None, event_conditions=None, baseline_correction=False)[source]

Epoching a dataframe.

Parameters
  • data (DataFrame) – A DataFrame containing the different signal(s) as different columns. If a vector of values is passed, it will be transformed into a DataFrame with a single ‘Signal’ column.

  • events (list or ndarray or dict) – Events onset location. If a dict is passed (e.g., from events_find()), will select only the ‘onset’ list. If an integer is passed, will use this number to create an evenly spaced list of events. If None, will chunk the signal into successive blocks of the set duration.

  • sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).

  • epochs_start (int) – Epochs start relative to events_onsets (in seconds). The start can be negative to start epochs before a given event (to have a baseline for instance).

  • epochs_end (int) – Epochs end relative to events_onsets (in seconds).

  • event_labels (list) – A list containing unique event identifiers. If None, will use the event index number.

  • event_conditions (list) – An optional list containing, for each event, for example the trial category, group or experimental conditions.

  • baseline_correction (bool) – If True, the epochs are baseline-corrected (the mean of the period preceding the event onset is subtracted from the epoch). Defaults to False.

Returns

dict – A dict containing DataFrames for all epochs.

See also

events_find(), events_plot(), epochs_to_df(), epochs_plot()

Examples

>>> import neurokit2 as nk
>>>
>>> # Get data
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Find events
>>> events = nk.events_find(data["Photosensor"],
...                         threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> fig1 = nk.events_plot(events, data)
>>> fig1 
>>>
>>> # Create epochs
>>> epochs = nk.epochs_create(data, events, sampling_rate=100, epochs_end=3)
>>> fig2 = nk.epochs_plot(epochs)
>>> fig2 
>>>
>>> # Baseline correction
>>> epochs = nk.epochs_create(data, events, sampling_rate=100, epochs_end=3, baseline_correction=True)
>>> fig3 = nk.epochs_plot(epochs)
>>> fig3 
>>>
>>> # Chunk into n blocks of 1 second
>>> epochs = nk.epochs_create(data, sampling_rate=100, epochs_end=1)
epochs_plot(epochs, legend=True, show=True)[source]

Plot epochs.

Parameters
  • epochs (dict) – A dict containing one DataFrame per event/trial. Usually obtained via epochs_create().

  • legend (bool) – Display the legend (the key of each epoch).

  • show (bool) – If True, will return a plot. If False, will return a DataFrame that can be plotted externally.

Returns

epochs (dict) – A dict containing all epochs.

See also

events_find(), events_plot(), epochs_create(), epochs_to_df()

Examples

>>> import neurokit2 as nk
>>>
>>> # Example with data
>>> data = nk.data("bio_eventrelated_100hz")
>>> events = nk.events_find(data["Photosensor"],
...                         threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> epochs = nk.epochs_create(data, events, sampling_rate=100, epochs_end=1)
>>> fig1 = nk.epochs_plot(epochs)
>>> fig1 
>>>
>>> # Example with ECG Peaks
>>> signal = nk.ecg_simulate(duration=10)
>>> events = nk.ecg_findpeaks(signal)
>>> epochs = nk.epochs_create(signal, events=events["ECG_R_Peaks"], epochs_start=-0.5, epochs_end=0.5)
>>> fig2 = nk.epochs_plot(epochs)
>>> fig2 
epochs_to_array(epochs)[source]

Convert epochs to an array.

TODO: make it work with uneven epochs (not the same length).

Parameters

epochs (dict) – A dict containing one DataFrame per event/trial. Usually obtained via epochs_create().

Returns

array – An array containing all signals.

See also

events_find(), events_plot(), epochs_create(), epochs_plot()

Examples

>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> # Get data
>>> signal = nk.signal_simulate(sampling_rate=100)
>>>
>>> # Create epochs
>>> epochs = nk.epochs_create(signal, events=[400, 430, 460], sampling_rate=100, epochs_end=1)
>>> X = nk.epochs_to_array(epochs)
>>> nk.signal_plot(X.T)
epochs_to_df(epochs)[source]

Convert epochs to a DataFrame.

Parameters

epochs (dict) – A dict containing one DataFrame per event/trial. Usually obtained via epochs_create().

Returns

DataFrame – A DataFrame containing all epochs, identifiable by the ‘Label’ column, with the time axis stored in the ‘Time’ column.

See also

events_find(), events_plot(), epochs_create(), epochs_plot()

Examples

>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> # Get data
>>> data = pd.read_csv("https://raw.githubusercontent.com/neuropsychology/NeuroKit/dev/data/bio_eventrelated_100hz.csv")
>>>
>>> # Find events
>>> events = nk.events_find(data["Photosensor"],
...                         threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> fig = nk.events_plot(events, data)
>>> fig 
>>>
>>> # Create epochs
>>> epochs = nk.epochs_create(data, events, sampling_rate=100, epochs_end=3)
>>> data = nk.epochs_to_df(epochs)

Statistics

Submodule for NeuroKit.

cor(x, y, method='pearson', show=False)[source]

Correlation.

Computes the correlation coefficient between two variables.

Parameters
  • x (Union[list, np.array, pd.Series]) – Vectors of values.

  • y (Union[list, np.array, pd.Series]) – Vectors of values.

  • method (str) – Correlation method. Can be one of ‘pearson’, ‘spearman’, ‘kendall’.

  • show (bool) – Draw a scatterplot with a regression line.

Returns

r – The correlation coefficient.

Examples

>>> import neurokit2 as nk
>>>
>>> x = [1, 2, 3, 4, 5]
>>> y = [3, 1, 5, 6, 6]
>>> corr = nk.cor(x, y, method="pearson", show=True)
>>> corr 
density(x, desired_length=100, bandwith=1, show=False)[source]

Density estimation.

Computes kernel density estimates.

Parameters
  • x (Union[list, np.array, pd.Series]) – A vector of values.

  • desired_length (int) – The number of values in the returned density estimation.

  • bandwith (float) – The bandwidth of the kernel. The larger the value, the smoother the estimation.

  • show (bool) – Display the density plot.

Returns

  • x – The x-axis of the density estimation.

  • y – The y-axis of the density estimation.

Examples

>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = nk.ecg_simulate(duration=20)
>>> x, y = nk.density(signal, bandwith=0.5, show=True)
>>>
>>> # Bandwidth comparison
>>> x, y1 = nk.density(signal, bandwith=0.5)
>>> x, y2 = nk.density(signal, bandwith=1)
>>> x, y3 = nk.density(signal, bandwith=2)
>>> pd.DataFrame({"x": x, "y1": y1, "y2": y2, "y3": y3}).plot(x="x") 
distance(X=None, method='mahalanobis')[source]

Distance.

Compute distance using different metrics.

Parameters
  • X (array or DataFrame) – A dataframe of values.

  • method (str) – The method to use. One of ‘mahalanobis’ or ‘mean’ for the average distance from the mean.

Returns

array – Vector containing the distance values.

Examples

>>> from sklearn import datasets
>>> import neurokit2 as nk
>>>
>>> X = datasets.load_iris().data
>>> vector = nk.distance(X)
>>> vector 
fit_error(y, y_predicted, n_parameters=2)[source]

Calculate the fit error for a model.

Specific direct-access functions can also be used, such as fit_mse(), fit_rmse() and fit_r2().

Parameters
  • y (Union[list, np.array, pd.Series]) – The response variable (the y axis).

  • y_predicted (Union[list, np.array, pd.Series]) – The fitted data generated by a model.

  • n_parameters (int) – Number of model parameters (for the degrees of freedom used in R2).

Returns

dict – A dictionary containing different indices of fit error.

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> y = np.array([-1.0, -0.5, 0, 0.5, 1])
>>> y_predicted = np.array([0.0, 0, 0, 0, 0])
>>>
>>> # Master function
>>> x = nk.fit_error(y, y_predicted)
>>> x 
>>>
>>> # Direct access
>>> nk.fit_mse(y, y_predicted) 
0.5
>>>
>>> nk.fit_rmse(y, y_predicted) 
0.7071067811865476
>>>
>>> nk.fit_r2(y, y_predicted, adjusted=False) 
0.7071067811865475
>>>
>>> nk.fit_r2(y, y_predicted, adjusted=True, n_parameters=2) 
0.057190958417936755
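
These indices follow the standard definitions; for instance, the MSE and RMSE values above can be reproduced directly with numpy (a sketch for verification):

>>> import numpy as np
>>>
>>> y = np.array([-1.0, -0.5, 0, 0.5, 1])
>>> y_predicted = np.array([0.0, 0, 0, 0, 0])
>>>
>>> # MSE is the mean of the squared residuals; RMSE is its square root
>>> mse = np.mean((y - y_predicted) ** 2)  # 0.5
>>> rmse = np.sqrt(mse)  # 0.7071...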
fit_loess(y, X=None, alpha=0.75, order=2)[source]

Local Polynomial Regression (LOESS)

Performs a LOWESS (LOcally WEighted Scatter-plot Smoother) regression.

Parameters
  • y (Union[list, np.array, pd.Series]) – The response variable (the y axis).

  • X (Union[list, np.array, pd.Series]) – Explanatory variable (the x axis). If ‘None’, will treat y as a continuous signal (useful for smoothing).

  • alpha (float) – The parameter which controls the degree of smoothing, which corresponds to the proportion of the samples to include in local regression.

  • order (int) – Degree of the polynomial to fit. Can be 1 or 2 (default).

Returns

array – Prediction of the LOESS algorithm.

See also

signal_smooth(), signal_detrend(), fit_error()

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = np.cos(np.linspace(start=0, stop=10, num=1000))
>>> distorted = nk.signal_distort(signal, noise_amplitude=[0.3, 0.2, 0.1], noise_frequency=[5, 10, 50])
>>>
>>> pd.DataFrame({ "Raw": distorted, "Loess_1": nk.fit_loess(distorted, order=1),
...               "Loess_2": nk.fit_loess(distorted, order=2)}).plot() 

fit_mixture(X=None, n_clusters=2)[source]

Gaussian Mixture Model.

Fits a Gaussian Mixture Model (GMM) in order to classify the values into clusters.

Parameters
  • X (Union[list, np.array, pd.Series]) – The values to classify.

  • n_clusters (int) – Number of components to look for.

Returns

pd.DataFrame – DataFrame containing the probability of belonging to each cluster.

See also

signal_detrend(), fit_error()

Examples

>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> x = nk.signal_simulate()
>>> probs = nk.fit_mixture(x, n_clusters=2)
>>> fig = nk.signal_plot([x, probs["Cluster_0"], probs["Cluster_1"]], standardize=True)
>>> fig 
fit_mse(y, y_predicted)[source]

Compute Mean Square Error (MSE).

fit_polynomial(y, X=None, order=2)[source]

Polynomial Regression.

Performs a polynomial regression of given order.

Parameters
  • y (Union[list, np.array, pd.Series]) – The response variable (the y axis).

  • X (Union[list, np.array, pd.Series]) – Explanatory variable (the x axis). If ‘None’, will treat y as a continuous signal.

  • order (int) – The order of the polynomial. 0, 1 or > 1 for a baseline, linear or polynomial fit, respectively. Can also be ‘auto’, in which case it will attempt to find the optimal order to minimize the RMSE.

Returns

array – Prediction of the regression.

See also

signal_detrend(), fit_error(), fit_polynomial_findorder()

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> y = np.cos(np.linspace(start=0, stop=10, num=100))
>>>
>>> pd.DataFrame({"y": y, "Poly_0": nk.fit_polynomial(y, order=0),
...               "Poly_1": nk.fit_polynomial(y, order=1),
...               "Poly_2": nk.fit_polynomial(y, order=2),
...               "Poly_3": nk.fit_polynomial(y, order=3), "Poly_5": nk.fit_polynomial(y, order=5),
...               "Poly_auto": nk.fit_polynomial(y, order='auto')}).plot() 
fit_polynomial_findorder(y, X=None, max_order=6)[source]

Polynomial Regression.

Find the optimal order for polynomial fitting. Currently, the only method implemented is RMSE minimization.

Parameters
  • y (Union[list, np.array, pd.Series]) – The response variable (the y axis).

  • X (Union[list, np.array, pd.Series]) – Explanatory variable (the x axis). If ‘None’, will treat y as a continuous signal.

  • max_order (int) – The maximum order to test.

Returns

int – Optimal order.

See also

fit_polynomial()

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> y = np.cos(np.linspace(start=0, stop=10, num=100))
>>>
>>> nk.fit_polynomial_findorder(y, max_order=10)
9
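
A minimal sketch of this RMSE-minimization strategy (illustrative only; the function handles this internally):

>>> import numpy as np
>>>
>>> y = np.cos(np.linspace(start=0, stop=10, num=100))
>>> X = np.arange(len(y))
>>>
>>> # Fit polynomials of increasing order and keep the one minimizing the RMSE
>>> rmses = []
>>> for order in range(1, 11):
...     y_predicted = np.polyval(np.polyfit(X, y, order), X)
...     rmses.append(np.sqrt(np.mean((y - y_predicted) ** 2)))
>>> optimal_order = int(np.argmin(rmses)) + 1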
fit_r2(y, y_predicted, adjusted=True, n_parameters=2)[source]

Compute R2.

fit_rmse(y, y_predicted)[source]

Compute Root Mean Square Error (RMSE).

hdi(x, ci=0.95, show=False, **kwargs)[source]

Highest Density Interval (HDI)

Compute the Highest Density Interval (HDI) of a distribution. All points within this interval have a higher probability density than points outside the interval. The HDI can be used in the context of uncertainty characterisation of posterior distributions (in the Bayesian framework) as Credible Interval (CI). Unlike equal-tailed intervals that typically exclude 2.5% from each tail of the distribution and always include the median, the HDI is not equal-tailed and therefore always includes the mode(s) of posterior distributions.

Parameters
  • x (Union[list, np.array, pd.Series]) – A vector of values.

  • ci (float) – Value of probability of the (credible) interval, CI (between 0 and 1), to be estimated. Defaults to 0.95 (95%).

  • show (bool) – If True, the function will produce a figure.

  • **kwargs (Line2D properties) – Other arguments to be passed to density().

See also

density()

Returns

  • float(s) – The HDI low and high limits.

  • fig – Distribution plot.

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> x = np.random.normal(loc=0, scale=1, size=100000)
>>> ci_min, ci_high = nk.hdi(x, ci=0.95, show=True)
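
One common way to compute the HDI is to find the shortest interval containing ci% of the sorted samples; a numpy sketch of this idea (illustrative, not necessarily the exact internal implementation):

>>> import numpy as np
>>>
>>> x = np.random.normal(loc=0, scale=1, size=100000)
>>>
>>> # Shortest interval containing 95% of the sorted samples
>>> x_sorted = np.sort(x)
>>> n_included = int(np.ceil(0.95 * len(x_sorted)))
>>> widths = x_sorted[n_included - 1:] - x_sorted[:len(x_sorted) - n_included + 1]
>>> ci_low = x_sorted[np.argmin(widths)]
>>> ci_high = x_sorted[np.argmin(widths) + n_included - 1]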
mad(x, constant=1.4826)[source]

Median Absolute Deviation: a “robust” version of standard deviation.

Parameters
  • x (Union[list, np.array, pd.Series]) – A vector of values.

  • constant (float) – Scale factor. Use 1.4826 for results similar to default R.

Returns

float – The MAD.

Examples

>>> import neurokit2 as nk
>>> nk.mad([2, 8, 7, 5, 4, 12, 5, 1])
3.7064999999999997
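
Equivalently, the MAD can be computed directly with numpy (a sketch reproducing the value above):

>>> import numpy as np
>>>
>>> x = np.array([2, 8, 7, 5, 4, 12, 5, 1])
>>>
>>> # MAD = constant * median(|x - median(x)|)
>>> mad = 1.4826 * np.median(np.abs(x - np.median(x)))  # 3.7065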

mutual_information(x, y, method='varoquaux', bins=256, sigma=1, normalized=True)[source]

Computes the (normalized) mutual information (MI) between two vectors from a joint histogram. The mutual information of two variables is a measure of the mutual dependence between them. More specifically, it quantifies the “amount of information” obtained about one variable by observing the other variable.

Parameters
  • x (Union[list, np.array, pd.Series]) – A vector of values.

  • y (Union[list, np.array, pd.Series]) – A vector of values.

  • method (str) – Method to use. Can either be ‘varoquaux’ or ‘nolitsa’.

  • bins (int) – Number of bins to use while creating the histogram.

  • sigma (float) – Sigma for Gaussian smoothing of the joint histogram. Only used if method==’varoquaux’.

  • normalized (bool) – Compute the normalized mutual information. Only used if method==’varoquaux’.

Returns

float – The computed similarity measure.

Examples

>>> import neurokit2 as nk
>>>
>>> x = [3, 3, 5, 1, 6, 3]
>>> y = [5, 3, 1, 3, 4, 5]
>>>
>>> nk.mutual_information(x, y, method="varoquaux") 
0.23600751227291816
>>>
>>> nk.mutual_information(x, y, method="nolitsa") 
1.4591479170272448
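
The core idea, estimating MI from a discretized joint distribution, can be sketched as follows (illustrative; the 'varoquaux' method additionally smooths the joint histogram with a Gaussian kernel):

>>> import numpy as np
>>>
>>> x = np.random.normal(size=1000)
>>> y = x + np.random.normal(size=1000)
>>>
>>> # Joint and marginal probability distributions from a 2D histogram
>>> joint, _, _ = np.histogram2d(x, y, bins=16)
>>> pxy = joint / joint.sum()
>>> px, py = pxy.sum(axis=1), pxy.sum(axis=0)
>>>
>>> # MI = sum over non-empty bins of pxy * log(pxy / (px * py))
>>> nonzero = pxy > 0
>>> mi = np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px[:, None] * py[None, :])[nonzero]))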

References

  • Studholme, C., Hill, D. L. G., & Hawkes, D. J. (1998). A normalized entropy measure of 3-D medical image alignment. In Proc. Medical Imaging 1998, vol. 3338, San Diego, CA, pp. 132-143.

rescale(data, to=[0, 1])[source]

Rescale data.

Rescale a numeric variable to a new range.

Parameters
  • data (Union[list, np.array, pd.Series]) – Raw data.

  • to (list) – New range of values of the data after rescaling.

Returns

list – The rescaled values.

Examples

>>> import neurokit2 as nk
>>>
>>> nk.rescale(data=[3, 1, 2, 4, 6], to=[0, 1]) 
[0.4, 0.0, 0.2, 0.6000000000000001, 1.0]
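
The underlying transformation is a simple linear mapping; a numpy sketch reproducing the example above:

>>> import numpy as np
>>>
>>> data, to = np.array([3, 1, 2, 4, 6]), [0, 1]
>>>
>>> # Map [min(data), max(data)] linearly onto [to[0], to[1]]
>>> rescaled = (data - data.min()) / (data.max() - data.min()) * (to[1] - to[0]) + to[0]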
standardize(data, robust=False, window=None, **kwargs)[source]

Standardization of data.

Performs a standardization of data (Z-scoring), i.e., centering and scaling, so that the data is expressed in terms of standard deviation (i.e., mean = 0, SD = 1) or Median Absolute Deviation (median = 0, MAD = 1).

Parameters
  • data (Union[list, np.array, pd.Series]) – Raw data.

  • robust (bool) – If True, centering is done by subtracting the median from the variables and dividing it by the median absolute deviation (MAD). If False, variables are standardized by subtracting the mean and dividing it by the standard deviation (SD).

  • window (int) – Perform a rolling window standardization, i.e., apply a standardization on a window of the specified number of samples that rolls along the main axis of the signal. Can be used for complex detrending.

  • **kwargs (optional) – Other arguments to be passed to pandas.rolling().

Returns

list – The standardized values.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> # Simple example
>>> nk.standardize([3, 1, 2, 4, 6, np.nan]) 
[...]
>>> nk.standardize([3, 1, 2, 4, 6, np.nan], robust=True) 
[...]
>>> nk.standardize(np.array([[1, 2, 3, 4], [5, 6, 7, 8]]).T) 
 array(...)
>>> nk.standardize(pd.DataFrame({"A": [3, 1, 2, 4, 6, np.nan],
...                              "B": [3, 1, 2, 4, 6, 5]})) 
          A         B
0       ...       ...
...
>>>
>>> # Rolling standardization of a signal
>>> signal = nk.signal_simulate(frequency=[0.1, 2], sampling_rate=200)
>>> z = nk.standardize(signal, window=200)
>>> nk.signal_plot([signal, z], standardize=True)
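
The rolling variant is conceptually a z-score computed within a moving window; a pandas sketch of the idea (edge handling may differ from the internal implementation, and samples where the window is incomplete can be NaN):

>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(frequency=[0.1, 2], sampling_rate=200)
>>> x = pd.Series(signal)
>>>
>>> # Rolling z-score: subtract the rolling mean, divide by the rolling SD
>>> z_rolling = (x - x.rolling(200, min_periods=1).mean()) / x.rolling(200, min_periods=1).std()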
summary_plot(x, **kwargs)[source]

Descriptive plot.

Visualize a distribution with density, histogram, boxplot and rug plots all at once.

Examples

>>> import neurokit2 as nk
>>> import numpy as np
>>>
>>> x = np.random.normal(size=100)
>>> fig = nk.summary_plot(x)
>>> fig 

Complexity

Submodule for NeuroKit.

complexity_apen(signal, delay=1, dimension=2, r='default', corrected=False, **kwargs)

Approximate entropy (ApEn)

Python implementations of the approximate entropy (ApEn) and its corrected version (cApEn). Approximate entropy is a technique used to quantify the amount of regularity and the unpredictability of fluctuations in time-series data. The advantages of ApEn include a lower computational demand (it can be designed to work with small data samples, i.e., fewer than 50 data points, and can be applied in real time) and reduced sensitivity to noise. However, ApEn is heavily dependent on the record length and lacks relative consistency.

This function can be called either via entropy_approximate() or complexity_apen(), and the corrected version via complexity_capen().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see complexity_delay()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (similarity threshold). It corresponds to the filtering level - max absolute difference between segments. If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • corrected (bool) – If true, will compute corrected ApEn (cApEn), see Porta (2007).

  • **kwargs – Other arguments.

Returns

float – The approximate entropy as a float value.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy1 = nk.entropy_approximate(signal)
>>> entropy1 
>>> entropy2 = nk.entropy_approximate(signal, corrected=True)
>>> entropy2 
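
As noted above, the 'default' tolerance corresponds to 0.2 times the standard deviation of the signal; a sketch of setting it explicitly (assuming the r argument documented above):

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>>
>>> # 'default' tolerance: r = 0.2 * SD of the signal (for dimension = 2)
>>> r = 0.2 * np.std(signal)
>>> entropy = nk.entropy_approximate(signal, r=r)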

References

  • EntroPy: https://github.com/raphaelvallat/entropy

  • Sabeti, M., Katebi, S., & Boostani, R. (2009). Entropy and complexity measures for EEG signal classification of schizophrenic and control participants. Artificial intelligence in medicine, 47(3), 263-274.

  • Shi, B., Zhang, Y., Yuan, C., Wang, S., & Li, P. (2017). Entropy analysis of short-term heartbeat interval time series during regular walking. Entropy, 19(10), 568.

complexity_capen(signal, delay=1, dimension=2, r='default', *, corrected=True, **kwargs)

Approximate entropy (ApEn)

Python implementations of the approximate entropy (ApEn) and its corrected version (cApEn). Approximate entropy is a technique used to quantify the amount of regularity and the unpredictability of fluctuations in time-series data. The advantages of ApEn include a lower computational demand (it can be designed to work with small data samples, i.e., fewer than 50 data points, and can be applied in real time) and reduced sensitivity to noise. However, ApEn is heavily dependent on the record length and lacks relative consistency.

This function can be called either via entropy_approximate() or complexity_apen(), and the corrected version via complexity_capen().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see complexity_delay()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (similarity threshold). It corresponds to the filtering level - max absolute difference between segments. If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • corrected (bool) – If true, will compute corrected ApEn (cApEn), see Porta (2007).

  • **kwargs – Other arguments.

Returns

float – The approximate entropy as a float value.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy1 = nk.entropy_approximate(signal)
>>> entropy1 
>>> entropy2 = nk.entropy_approximate(signal, corrected=True)
>>> entropy2 

References

  • EntroPy: https://github.com/raphaelvallat/entropy

  • Sabeti, M., Katebi, S., & Boostani, R. (2009). Entropy and complexity measures for EEG signal classification of schizophrenic and control participants. Artificial intelligence in medicine, 47(3), 263-274.

  • Shi, B., Zhang, Y., Yuan, C., Wang, S., & Li, P. (2017). Entropy analysis of short-term heartbeat interval time series during regular walking. Entropy, 19(10), 568.

complexity_cmse(signal, scale='default', dimension=2, r='default', *, composite=True, refined=False, fuzzy=False, show=False, **kwargs)

Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.

Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

This function can be called either via entropy_multiscale() or complexity_mse(). Moreover, variants can be directly accessed via complexity_cmse(), complexity_rcmse(), complexity_fuzzymse() and complexity_fuzzyrcmse().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use range(len(signal) / (dimension + 10)). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.

  • refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)

  • fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

  • show (bool) – Show the entropy values for each scale factor.

  • **kwargs – Optional arguments.

Returns

float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curve, which is essentially the sum of sample entropy values over the range of scale factors.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy1 = nk.entropy_multiscale(signal, show=True)
>>> entropy1 
>>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True)
>>> entropy2 
>>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True)
>>> entropy3 
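
The multiscale procedure rests on coarse-graining: at scale factor tau, the signal is averaged over non-overlapping windows of tau samples before sample entropy is computed. A numpy sketch of the coarse-graining step alone (illustrative):

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>>
>>> # Coarse-grain at scale factor 5: average non-overlapping windows of 5 samples
>>> scale = 5
>>> n_windows = len(signal) // scale
>>> coarse = np.mean(np.reshape(signal[:n_windows * scale], (n_windows, scale)), axis=1)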

References

  • pyEntropy: https://github.com/nikdon/pyEntropy

  • Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.

  • Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.

  • Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.

  • Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.

  • Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.

complexity_d2(signal, delay=1, dimension=2, r=64, show=False)

Correlation Dimension.

Python implementation of the Correlation Dimension D2 of a signal.

This function can be called either via fractal_correlation() or complexity_d2().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see complexity_delay()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (str or int or list) – The sequence of radiuses to test. If an integer is passed, will get an exponential sequence ranging from 2.5% to 50% of the distance range. Methods implemented in other packages can be used via setting r='nolds' or r='Corr_Dim'.

  • show (bool) – Plot of correlation dimension if True. Defaults to False.

Returns

D2 (float) – The correlation dimension D2.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>>
>>> fractal1 = nk.fractal_correlation(signal, r="nolds", show=True)
>>> fractal1 
>>> fractal2 = nk.fractal_correlation(signal, r=32, show=True)
>>> fractal2 
>>>
>>> signal = nk.rsp_simulate(duration=120, sampling_rate=50)
>>>
>>> fractal3 = nk.fractal_correlation(signal, r="nolds", show=True)
>>> fractal3 
>>> fractal4 = nk.fractal_correlation(signal, r=32, show=True)
>>> fractal4 

References

  • Bolea, J., Laguna, P., Remartínez, J. M., Rovira, E., Navarro, A., & Bailón, R. (2014). Methodological framework for estimating the correlation dimension in HRV signals. Computational and mathematical methods in medicine, 2014.

  • Boon, M. Y., Henry, B. I., Suttle, C. M., & Dain, S. J. (2008). The correlation dimension: A useful objective measure of the transient visual evoked potential?. Journal of vision, 8(1), 6-6.

  • nolds

  • Corr_Dim

complexity_delay(signal, delay_max=100, method='fraser1986', show=False)[source]

Estimate optimal Time Delay (tau) for time-delay embedding.

The time delay (Tau) is one of the two critical parameters involved in the construction of the time-delay embedding of a signal.

Several authors suggested different methods to guide the choice of Tau:

  • Fraser and Swinney (1986) suggest using the first local minimum of the mutual information between the delayed and non-delayed time series, effectively identifying a value of Tau for which they share the least information.

  • Theiler (1990) suggested selecting Tau at the point where the autocorrelation between the signal and its lagged version first crosses the value 1/e.

  • Casdagli (1991) suggests instead taking the first zero-crossing of the autocorrelation (a minimal sketch of this criterion is shown after this list).

  • Rosenstein (1993) suggests approximating the point close to 40% of the slope of the average displacement from the diagonal (ADFD).
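
For instance, the Casdagli-style criterion from the list above can be sketched in a few lines (an illustrative approximation, not the internal implementation):

>>> import numpy as np
>>>
>>> signal = np.sin(np.linspace(0, 20 * np.pi, 2000))
>>>
>>> # Autocorrelation of the demeaned signal for increasing lags
>>> x = signal - np.mean(signal)
>>> autocorr = np.correlate(x, x, mode="full")[len(x) - 1:] / np.sum(x ** 2)
>>>
>>> # First lag at which the autocorrelation crosses zero
>>> tau = int(np.argmax(autocorr < 0))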

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay_max (int) – The maximum time delay (Tau or lag) to test.

  • method (str) – The method to use. Can be one of ‘fraser1986’, ‘theiler1990’, ‘casdagli1991’, ‘rosenstein1993’.

  • show (bool) – If true, will plot the mutual information values for each value of tau.

Returns

int – Optimal time delay.

Examples

>>> import neurokit2 as nk
>>>
>>> # Artificial example
>>> signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01)
>>> nk.signal_plot(signal)
>>>
>>> delay = nk.complexity_delay(signal, delay_max=1000, show=True, method="fraser1986")
>>> delay = nk.complexity_delay(signal, delay_max=1000, show=True, method="theiler1990")
>>> delay = nk.complexity_delay(signal, delay_max=1000, show=True, method="casdagli1991")
>>> delay = nk.complexity_delay(signal, delay_max=1000, show=True, method="rosenstein1993")
>>>
>>> # Realistic example
>>> ecg = nk.ecg_simulate(duration=60*6, sampling_rate=150)
>>> signal = nk.ecg_rate(nk.ecg_peaks(ecg, sampling_rate=150)[0], sampling_rate=150, desired_length=len(ecg))
>>> nk.signal_plot(signal)
>>>
>>> delay = nk.complexity_delay(signal, delay_max=1000, show=True)

References

  • Gautama, T., Mandic, D. P., & Van Hulle, M. M. (2003, April). A differential entropy based method for determining the optimal embedding parameters of a signal. In 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). (Vol. 6, pp. VI-29). IEEE.

  • Camplani, M., & Cannas, B. (2009). The role of the embedding dimension and time delay in time series forecasting. IFAC Proceedings Volumes, 42(7), 316-320.

  • Rosenstein, M. T., Collins, J. J., & De Luca, C. J. (1994). Reconstruction expansion as a geometry-based framework for choosing proper delay times. Physica-Section D, 73(1), 82-98.

complexity_dfa(signal, windows='default', overlap=True, integrate=True, order=1, multifractal=False, q=2, show=False)

(Multifractal) Detrended Fluctuation Analysis (DFA or MFDFA)

Python implementation of Detrended Fluctuation Analysis (DFA) or Multifractal DFA of a signal. Detrended fluctuation analysis, much like the Hurst exponent, is used to find long-term statistical dependencies in time series.

This function can be called either via fractal_dfa() or complexity_dfa(), and its multifractal variant can be directly accessed via fractal_mfdfa() or complexity_mfdfa().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • windows (list) – A list containing the lengths of the windows (number of data points in each subseries). Also referred to as ‘lag’ or ‘scale’. If ‘default’, will set it to a logarithmic scale (so that each window scale has the same weight) with a minimum of 4 and a maximum of a tenth of the length (to have more than 10 windows to calculate the average fluctuation).

  • overlap (bool) – Defaults to True, where the windows will have a 50% overlap with each other, otherwise non-overlapping windows will be used.

  • integrate (bool) – It is common practice to convert the signal to a random walk (i.e., detrend and integrate, which corresponds to the signal ‘profile’). Note that it leads to the flattening of the signal, which can lead to the loss of some details (see Ihlen, 2012 for an explanation). Note that for strongly anticorrelated signals, this transformation should be applied two times (i.e., provide np.cumsum(signal - np.mean(signal)) instead of signal).

  • order (int) – The order of the polynomial trend, 1 for the linear trend.

  • multifractal (bool) – If True, compute Multifractal Detrended Fluctuation Analysis (MFDFA), in which case the argument q is taken into account.

  • q (list) – The sequence of fractal exponents when multifractal=True. Must be a sequence between -10 and 10 (note that zero will be removed, since the code does not converge there). Setting q=2 (default) gives a result close to a standard DFA. For instance, Ihlen (2012) uses q=[-5, -3, -1, 0, 1, 3, 5].

  • show (bool) – Visualise the trend between the window size and the fluctuations.

Returns

dfa (float) – The DFA coefficient.

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=3, noise=0.05)
>>> dfa1 = nk.fractal_dfa(signal, show=True)
>>> dfa1 
>>> dfa2 = nk.fractal_mfdfa(signal, q=np.arange(-3, 4), show=True)
>>> dfa2 

References

  • Ihlen, E. A. F. E. (2012). Introduction to multifractal detrended fluctuation analysis in Matlab. Frontiers in physiology, 3, 141.

  • Hardstone, R., Poil, S. S., Schiavone, G., Jansen, R., Nikulin, V. V., Mansvelder, H. D., & Linkenkaer-Hansen, K. (2012). Detrended fluctuation analysis: a scale-free view on neuronal oscillations. Frontiers in physiology, 3, 450.

  • nolds

  • Youtube introduction

complexity_dimension(signal, delay=1, dimension_max=20, method='afnn', show=False, R=10.0, A=2.0, **kwargs)[source]

Estimate optimal Dimension (m) for time-delay embedding.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see complexity_delay()).

  • dimension_max (int) – The maximum embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’) to test.

  • method (str) – Method can either be afnn (average false nearest neighbour) or fnn (false nearest neighbour).

  • show (bool) – Visualize the result.

  • R (float) – Relative tolerance (for fnn method).

  • A (float) – Absolute tolerance (for fnn method).

  • **kwargs – Other arguments.

Returns

int – Optimal dimension.

Examples

>>> import neurokit2 as nk
>>>
>>> # Artificial example
>>> signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01)
>>> delay = nk.complexity_delay(signal, delay_max=500)
>>>
>>> values = nk.complexity_dimension(signal, delay=delay, dimension_max=20, show=True)

References

  • Cao, L. (1997). Practical method for determining the minimum embedding dimension of a scalar time series. Physica D: Nonlinear Phenomena, 110(1-2), 43-50.

complexity_embedding(signal, delay=1, dimension=3, show=False)[source]

Time-delay embedding of a time series (a signal)

A dynamical system can be described by a vector of numbers, called its ‘state’, that aims to provide a complete description of the system at some point in time. The set of all possible states is called the ‘state space’.

Takens’s (1981) embedding theorem suggests that a sequence of measurements of a dynamic system includes in itself all the information required to completely reconstruct the state space. Delay coordinate embedding attempts to identify the state s of the system at some time t by searching the past history of observations for similar states, and, by studying the evolution of similar states, infer information about the future of the system.

How to visualize the dynamics of a system? A sequence of state values over time is called a trajectory. Depending on the system, different trajectories can evolve to a common subset of state space called an attractor. The presence and behavior of attractors gives intuition about the underlying dynamical system. We can visualize the system and its attractors by plotting the trajectory of many different initial state values and numerically integrating them to approximate their continuous time evolution on discrete computers.

This function is adapted from EntroPy and is equivalent to the delay_embedding() function from ‘nolds’.
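
Concretely, time-delay embedding stacks lagged copies of the signal: row i of the embedded matrix is [x(i), x(i + Tau), ..., x(i + (m - 1) * Tau)]. A numpy sketch (illustrative, matching the output shape described below):

>>> import numpy as np
>>>
>>> x = np.sin(np.linspace(0, 8 * np.pi, 1000))
>>> delay, dimension = 50, 3
>>>
>>> # Row i of the embedding is [x(i), x(i + delay), x(i + 2 * delay)]
>>> n = len(x) - (dimension - 1) * delay
>>> embedded = np.column_stack([x[i * delay:i * delay + n] for i in range(dimension)])
>>> embedded.shape
(900, 3)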

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see complexity_delay()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • show (bool) – Plot the reconstructed attractor.

Returns

array – Embedded time-series, of shape (n_times - (order - 1) * delay, order)

See also

complexity_delay(), complexity_dimension()

Examples

>>> import neurokit2 as nk
>>>
>>> # Artificial example
>>> signal = nk.signal_simulate(duration=2, frequency=5, noise=0.01)
>>>
>>> embedded = nk.complexity_embedding(signal, delay=50, dimension=2, show=True)
>>> embedded = nk.complexity_embedding(signal, delay=50, dimension=3, show=True)
>>> embedded = nk.complexity_embedding(signal, delay=50, dimension=4, show=True)
>>>
>>> # Realistic example
>>> ecg = nk.ecg_simulate(duration=60*4, sampling_rate=200)
>>> signal = nk.ecg_rate(nk.ecg_peaks(ecg, sampling_rate=200)[0], sampling_rate=200, desired_length=len(ecg))
>>>
>>> embedded = nk.complexity_embedding(signal, delay=250, dimension=2, show=True)
>>> embedded = nk.complexity_embedding(signal, delay=250, dimension=3, show=True)
>>> embedded = nk.complexity_embedding(signal, delay=250, dimension=4, show=True)

References

  • Gautama, T., Mandic, D. P., & Van Hulle, M. M. (2003, April). A differential entropy based method for determining the optimal embedding parameters of a signal. In 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). (Vol. 6, pp. VI-29). IEEE.

complexity_fuzzycmse(signal, scale='default', dimension=2, r='default', *, composite=True, refined=False, fuzzy=True, show=False, **kwargs)

Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.

Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

This function can be called either via entropy_multiscale() or complexity_mse(). Moreover, variants can be directly accessed via complexity_cmse(), complexity_rcmse(), complexity_fuzzymse() and complexity_fuzzyrcmse().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use range(len(signal) / (dimension + 10)). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.

  • refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)

  • fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

  • show (bool) – Show the entropy values for each scale factor.

  • **kwargs – Optional arguments.

Returns

float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curve, which is essentially the sum of sample entropy values over the range of scale factors.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy1 = nk.entropy_multiscale(signal, show=True)
>>> entropy1 
>>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True)
>>> entropy2 
>>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True)
>>> entropy3 

References

  • pyEntropy: https://github.com/nikdon/pyEntropy

  • Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.

  • Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.

  • Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.

  • Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.

  • Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.

complexity_fuzzyen(signal, delay=1, dimension=2, r='default', **kwargs)

Fuzzy entropy (FuzzyEn)

Python implementations of the fuzzy entropy (FuzzyEn) of a signal.

This function can be called either via entropy_fuzzy() or complexity_fuzzyen().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see complexity_delay()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • **kwargs – Other arguments.

Returns

float – The fuzzy entropy as a float value.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy = nk.entropy_fuzzy(signal)
>>> entropy 
complexity_fuzzymse(signal, scale='default', dimension=2, r='default', composite=False, refined=False, *, fuzzy=True, show=False, **kwargs)

Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.

Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

This function can be called either via entropy_multiscale() or complexity_mse(). Moreover, variants can be directly accessed via complexity_cmse(), complexity_rcmse(), complexity_fuzzymse() and complexity_fuzzyrcmse().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use range(len(signal) / (dimension + 10)). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.

  • refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)

  • fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

  • show (bool) – Show the entropy values for each scale factor.

  • **kwargs – Optional arguments.

Returns

float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curve, which is essentially the sum of sample entropy values over the range of scale factors.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy1 = nk.entropy_multiscale(signal, show=True)
>>> entropy1 
>>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True)
>>> entropy2 
>>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True)
>>> entropy3 

References

  • pyEntropy (https://github.com/nikdon/pyEntropy)

  • Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.

  • Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.

  • Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.

  • Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.

  • Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.

complexity_fuzzyrcmse(signal, scale='default', dimension=2, r='default', composite=False, *, refined=True, fuzzy=True, show=False, **kwargs)

Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.

Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

This function can be called either via entropy_multiscale() or complexity_mse(). Moreover, variants can be directly accessed via complexity_cmse(), complexity_rcmse(), complexity_fuzzymse() and complexity_fuzzyrcmse().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use range(len(signal) / (dimension + 10)). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.

  • refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)

  • fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

  • show (bool) – Show the entropy values for each scale factor.

  • **kwargs – Optional arguments.

Returns

float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curve, which is essentially the sum of sample entropy values over the range of scale factors.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy1 = nk.entropy_multiscale(signal, show=True)
>>> entropy1 
>>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True)
>>> entropy2 
>>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True)
>>> entropy3 

References

  • pyEntropy (https://github.com/nikdon/pyEntropy)

  • Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.

  • Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.

  • Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.

  • Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.

  • Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.

complexity_mfdfa(signal, windows='default', overlap=True, integrate=True, order=1, *, multifractal=True, q=2, show=False)

(Multifractal) Detrended Fluctuation Analysis (DFA or MFDFA)

Python implementation of Detrended Fluctuation Analysis (DFA) or Multifractal DFA of a signal. Detrended fluctuation analysis, much like the Hurst exponent, is used to find long-term statistical dependencies in time series.

This function can be called either via fractal_dfa() or complexity_dfa(), and its multifractal variant can be directly accessed via fractal_mfdfa() or complexity_mfdfa()

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • windows (list) – A list containing the lengths of the windows (number of data points in each subseries). Also referred to as ‘lag’ or ‘scale’. If ‘default’, will set it to a logarithmic scale (so that each window scale has the same weight) with a minimum of 4 and a maximum of a tenth of the length (to have more than 10 windows to calculate the average fluctuation).

  • overlap (bool) – Defaults to True, where the windows will have a 50% overlap with each other, otherwise non-overlapping windows will be used.

  • integrate (bool) – It is common practice to convert the signal to a random walk (i.e., detrend and integrate, which corresponds to the signal ‘profile’). Note that it leads to the flattening of the signal, which can lead to the loss of some details (see Ihlen, 2012 for an explanation). Note that for strongly anticorrelated signals, this transformation should be applied two times (i.e., provide np.cumsum(signal - np.mean(signal)) instead of signal).

  • order (int) – The order of the polynomial trend, 1 for the linear trend.

  • multifractal (bool) – If True, compute Multifractal Detrended Fluctuation Analysis (MFDFA), in which case the argument q is taken into account.

  • q (list) – The sequence of fractal exponents when multifractal=True. Must be a sequence between -10 and 10 (note that zero will be removed, since the code does not converge there). Setting q = 2 (default) gives a result close to a standard DFA. For instance, Ihlen (2012) uses q=[-5, -3, -1, 0, 1, 3, 5].

  • show (bool) – Visualise the trend between the window size and the fluctuations.

Returns

dfa (float) – The DFA coefficient.

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=3, noise=0.05)
>>> dfa1 = nk.fractal_dfa(signal, show=True)
>>> dfa1 
>>> dfa2 = nk.fractal_mfdfa(signal, q=np.arange(-3, 4), show=True)
>>> dfa2 

References

  • Ihlen, E. A. F. E. (2012). Introduction to multifractal detrended fluctuation analysis in Matlab. Frontiers in physiology, 3, 141.

  • Hardstone, R., Poil, S. S., Schiavone, G., Jansen, R., Nikulin, V. V., Mansvelder, H. D., & Linkenkaer-Hansen, K. (2012). Detrended fluctuation analysis: a scale-free view on neuronal oscillations. Frontiers in physiology, 3, 450.

  • nolds

  • Youtube introduction

complexity_mse(signal, scale='default', dimension=2, r='default', composite=False, refined=False, fuzzy=False, show=False, **kwargs)

Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.

Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

This function can be called either via entropy_multiscale() or complexity_mse(). Moreover, variants can be directly accessed via complexity_cmse(), complexity_rcmse(), complexity_fuzzymse() and complexity_fuzzyrcmse().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use range(len(signal) / (dimension + 10)). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.

  • refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)

  • fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

  • show (bool) – Show the entropy values for each scale factor.

  • **kwargs – Optional arguments.

Returns

float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curve, which is essentially the sum of sample entropy values over the range of scale factors.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy1 = nk.entropy_multiscale(signal, show=True)
>>> entropy1 
>>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True)
>>> entropy2 
>>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True)
>>> entropy3 

References

  • pyEntropy (https://github.com/nikdon/pyEntropy)

  • Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.

  • Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.

  • Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.

  • Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.

  • Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.

complexity_optimize(signal, delay_max=100, delay_method='fraser1986', dimension_max=20, dimension_method='afnn', r_method='maxApEn', show=False)[source]

Find optimal complexity parameters.

Estimate optimal complexity parameters Dimension (m), Time Delay (tau) and tolerance ‘r’.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay_max (int) – See complexity_delay().

  • delay_method (str) – See complexity_delay().

  • dimension_max (int) – See complexity_dimension().

  • dimension_method (str) – See complexity_dimension().

  • r_method (str) – See complexity_r().

  • show (bool) – Defaults to False.

Returns

  • optimal_dimension (int) – Optimal dimension.

  • optimal_delay (int) – Optimal time delay.

Examples

>>> import neurokit2 as nk
>>>
>>> # Artificial example
>>> signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01)
>>> parameters = nk.complexity_optimize(signal, show=True)
>>> parameters 
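
The estimated parameters can then be fed into the entropy functions. A sketch, assuming the returned object is a dictionary exposing ‘delay’, ‘dimension’ and ‘r’ keys (the exact keys are an assumption here, not confirmed by this docstring):

>>> sampen = nk.entropy_sample(signal,
...                            delay=parameters["delay"],  # assumed key
...                            dimension=parameters["dimension"],  # assumed key
...                            r=parameters["r"])  # assumed key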

References

  • Gautama, T., Mandic, D. P., & Van Hulle, M. M. (2003, April). A differential entropy based method for determining the optimal embedding parameters of a signal. In 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). (Vol. 6, pp. VI-29). IEEE.

  • Camplani, M., & Cannas, B. (2009). The role of the embedding dimension and time delay in time series forecasting. IFAC Proceedings Volumes, 42(7), 316-320.

  • Rosenstein, M. T., Collins, J. J., & De Luca, C. J. (1994). Reconstruction expansion as a geometry-based framework for choosing proper delay times. Physica-Section D, 73(1), 82-98.

  • Cao, L. (1997). Practical method for determining the minimum embedding dimension of a scalar time series. Physica D: Nonlinear Phenomena, 110(1-2), 43-50.

  • Lu, S., Chen, X., Kanters, J. K., Solomon, I. C., & Chon, K. H. (2008). Automatic selection of the threshold value r for approximate entropy. IEEE Transactions on Biomedical Engineering, 55(8), 1966-1972.

complexity_plot(signal, delay_max=100, delay_method='fraser1986', dimension_max=20, dimension_method='afnn', r_method='maxApEn', *, show=True)

Find optimal complexity parameters.

Estimate optimal complexity parameters Dimension (m), Time Delay (tau) and tolerance ‘r’.

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay_max (int) – See complexity_delay().

  • delay_method (str) – See complexity_delay().

  • dimension_max (int) – See complexity_dimension().

  • dimension_method (str) – See complexity_dimension().

  • r_method (str) – See complexity_r().

  • show (bool) – Defaults to True.

Returns

  • optimal_dimension (int) – Optimal dimension.

  • optimal_delay (int) – Optimal time delay.

Examples

>>> import neurokit2 as nk
>>>
>>> # Artificial example
>>> signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01)
>>> parameters = nk.complexity_optimize(signal, show=True)
>>> parameters 

References

  • Gautama, T., Mandic, D. P., & Van Hulle, M. M. (2003, April). A differential entropy based method for determining the optimal embedding parameters of a signal. In 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings.(ICASSP’03). (Vol. 6, pp. VI-29). IEEE.

  • Camplani, M., & Cannas, B. (2009). The role of the embedding dimension and time delay in time series forecasting. IFAC Proceedings Volumes, 42(7), 316-320.

  • Rosenstein, M. T., Collins, J. J., & De Luca, C. J. (1994). Reconstruction expansion as a geometry-based framework for choosing proper delay times. Physica-Section D, 73(1), 82-98.

  • Cao, L. (1997). Practical method for determining the minimum embedding dimension of a scalar time series. Physica D: Nonlinear Phenomena, 110(1-2), 43-50.

  • Lu, S., Chen, X., Kanters, J. K., Solomon, I. C., & Chon, K. H. (2008). Automatic selection of the threshold value r for approximate entropy. IEEE Transactions on Biomedical Engineering, 55(8), 1966-1972.

complexity_r(signal, delay=None, dimension=None, method='maxApEn', show=False)[source]

Estimate the optimal tolerance (similarity threshold).

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see delay_optimal()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • method (str) – If ‘maxApEn’, the r value that maximizes the approximate entropy (ApEn) will be returned. If ‘traditional’, r = 0.2 times the standard deviation of the signal will be returned.

  • show (bool) – If true and method is ‘maxApEn’, will plot the ApEn values for each value of r.

Returns

float – The optimal r as a similarity threshold. It corresponds to the filtering level - max absolute difference between segments.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> delay = nk.complexity_delay(signal)
>>> dimension = nk.complexity_dimension(signal, delay=delay)
>>> r = nk.complexity_r(signal, delay, dimension)
>>> r 
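
For comparison, the ‘traditional’ rule described above is a one-liner (ddof=1 assumed; the implementation may use the population standard deviation instead):

>>> import numpy as np
>>> r_traditional = 0.2 * np.std(signal, ddof=1)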

References

  • Lu, S., Chen, X., Kanters, J. K., Solomon, I. C., & Chon, K. H. (2008). Automatic selection of the threshold value r for approximate entropy. IEEE Transactions on Biomedical Engineering, 55(8), 1966-1972.

complexity_rcmse(signal, scale='default', dimension=2, r='default', composite=False, *, refined=True, fuzzy=False, show=False, **kwargs)

Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.

Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

This function can be called either via entropy_multiscale() or complexity_mse(). Moreover, variants can be directly accessed via complexity_cmse(), complexity_rcmse(), complexity_fuzzymse() and complexity_fuzzyrcmse().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use range(len(signal) / (dimension + 10)). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.

  • refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)

  • fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

  • show (bool) – Show the entropy values for each scale factor.

  • **kwargs – Optional arguments.

Returns

float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curve, which is essentially the sum of sample entropy values over the range of scale factors.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy1 = nk.entropy_multiscale(signal, show=True)
>>> entropy1 
>>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True)
>>> entropy2 
>>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True)
>>> entropy3 

References

  • pyEntropy (https://github.com/nikdon/pyEntropy)

  • Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.

  • Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.

  • Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.

  • Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.

  • Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.

complexity_sampen(signal, delay=1, dimension=2, r='default', **kwargs)

Sample Entropy (SampEn)

Python implementation of the sample entropy (SampEn) of a signal.

This function can be called either via entropy_sample() or complexity_sampen().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see delay_optimal()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • **kwargs (optional) – Other arguments.

Returns

float – The sample entropy as a float value.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy = nk.entropy_sample(signal)
>>> entropy

complexity_se(signal)

Shannon entropy (SE)

Python implementation of Shannon entropy (SE). Entropy is a measure of the unpredictability of a state, or equivalently, of its average information content. Shannon entropy (SE) is one of the first and most basic measures of entropy and a foundational concept of information theory; it quantifies the amount of information in a variable.

This function can be called either via entropy_shannon() or complexity_se().

Parameters

signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

Returns

float – The Shannon entropy as float value.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy = nk.entropy_shannon(signal)
>>> entropy 

complexity_simulate(duration=10, sampling_rate=1000, method='ornstein', hurst_exponent=0.5, **kwargs)[source]

Simulate chaotic time series.

Generates a time series using either a (fractional) Ornstein–Uhlenbeck process or the discrete approximation of the Mackey-Glass delay differential equation described by Grassberger & Procaccia (1983).

Parameters
  • duration (int) – Desired length of the signal (in seconds).

  • sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).

  • method (str) – The method: can be ‘ornstein’ (default) for a (fractional) Ornstein–Uhlenbeck process, or ‘mackeyglass’ to use the Mackey-Glass equation.

  • hurst_exponent (float) – The Hurst exponent of the (fractional) Ornstein–Uhlenbeck process (used when method='ornstein'). Defaults to 0.5.

  • **kwargs – Other arguments.

Returns

array – Simulated complexity time series.

Examples

>>> import neurokit2 as nk
>>>
>>> signal1 = nk.complexity_simulate(duration=30, sampling_rate=100, method="ornstein")
>>> signal2 = nk.complexity_simulate(duration=30, sampling_rate=100, method="mackeyglass")
>>> nk.signal_plot([signal1, signal2])
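
The Hurst exponent of the Ornstein–Uhlenbeck variant can also be adjusted; values other than 0.5 yield a fractional process. A brief sketch using the hurst_exponent argument documented above:

>>> signal3 = nk.complexity_simulate(duration=30, sampling_rate=100,
...                                  method="ornstein", hurst_exponent=0.7)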

entropy_approximate(signal, delay=1, dimension=2, r='default', corrected=False, **kwargs)[source]

Approximate entropy (ApEn)

Python implementations of the approximate entropy (ApEn) and its corrected version (cApEn). Approximate entropy is a technique used to quantify the amount of regularity and unpredictability of fluctuations in time-series data. Its advantages include a lower computational demand (ApEn can be designed to work with small data samples, i.e., fewer than 50 data points, and can be applied in real time) and a lower sensitivity to noise. However, ApEn is heavily dependent on the record length and lacks relative consistency.

This function can be called either via entropy_approximate() or complexity_apen(), and the corrected version via complexity_capen().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see delay_optimal()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (similarity threshold). It corresponds to the filtering level - max absolute difference between segments. If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • corrected (bool) – If true, will compute corrected ApEn (cApEn), see Porta (2007).

  • **kwargs – Other arguments.

Returns

float – The approximate entropy as a float value.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy1 = nk.entropy_approximate(signal)
>>> entropy1 
>>> entropy2 = nk.entropy_approximate(signal, corrected=True)
>>> entropy2 

References

  • EntroPy (https://github.com/raphaelvallat/entropy)

  • Sabeti, M., Katebi, S., & Boostani, R. (2009). Entropy and complexity measures for EEG signal classification of schizophrenic and control participants. Artificial intelligence in medicine, 47(3), 263-274.

  • Shi, B., Zhang, Y., Yuan, C., Wang, S., & Li, P. (2017). Entropy analysis of short-term heartbeat interval time series during regular walking. Entropy, 19(10), 568.

entropy_fuzzy(signal, delay=1, dimension=2, r='default', **kwargs)[source]

Fuzzy entropy (FuzzyEn)

Python implementation of the fuzzy entropy (FuzzyEn) of a signal.

This function can be called either via entropy_fuzzy() or complexity_fuzzyen().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see delay_optimal()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • **kwargs – Other arguments.

Returns

float – The fuzzy entropy as a float value.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy = nk.entropy_fuzzy(signal)
>>> entropy

entropy_multiscale(signal, scale='default', dimension=2, r='default', composite=False, refined=False, fuzzy=False, show=False, **kwargs)[source]

Multiscale entropy (MSE) and its Composite (CMSE), Refined (RCMSE) or fuzzy version.

Python implementations of the multiscale entropy (MSE), the composite multiscale entropy (CMSE), the refined composite multiscale entropy (RCMSE) or their fuzzy version (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

This function can be called either via entropy_multiscale() or complexity_mse(). Moreover, variants can be directly accessed via complexity_cmse(), complexity_rcmse(), complexity_fuzzymse() and complexity_fuzzyrcmse().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • scale (str or int or list) – A list of scale factors used for coarse graining the time series. If ‘default’, will use range(len(signal) / (dimension + 10)). If ‘max’, will use all scales until half the length of the signal. If an integer, will create a range until the specified int.

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • composite (bool) – Returns the composite multiscale entropy (CMSE), more accurate than MSE.

  • refined (bool) – Returns the ‘refined’ composite MSE (RCMSE; Wu, 2014)

  • fuzzy (bool) – Returns the fuzzy (composite) multiscale entropy (FuzzyMSE, FuzzyCMSE or FuzzyRCMSE).

  • show (bool) – Show the entropy values for each scale factor.

  • **kwargs – Optional arguments.

Returns

float – The point-estimate of multiscale entropy (MSE) as a float value corresponding to the area under the MSE values curve, which is essentially the sum of sample entropy values over the range of scale factors.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy1 = nk.entropy_multiscale(signal, show=True)
>>> entropy1 
>>> entropy2 = nk.entropy_multiscale(signal, show=True, composite=True)
>>> entropy2 
>>> entropy3 = nk.entropy_multiscale(signal, show=True, refined=True)
>>> entropy3 
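
Coarse-graining, on which the multiscale computation rests, averages the signal within non-overlapping windows whose length equals the scale factor (Costa et al., 2005). A minimal sketch for a single, arbitrary scale:

>>> import numpy as np
>>> tau = 3  # illustrative scale factor
>>> n = len(signal) // tau
>>> coarse = np.mean(np.reshape(signal[:n * tau], (n, tau)), axis=1)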

References

  • pyEntropy (https://github.com/nikdon/pyEntropy)

  • Richman, J. S., & Moorman, J. R. (2000). Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology-Heart and Circulatory Physiology, 278(6), H2039-H2049.

  • Costa, M., Goldberger, A. L., & Peng, C. K. (2005). Multiscale entropy analysis of biological signals. Physical review E, 71(2), 021906.

  • Gow, B. J., Peng, C. K., Wayne, P. M., & Ahn, A. C. (2015). Multiscale entropy analysis of center-of-pressure dynamics in human postural control: methodological considerations. Entropy, 17(12), 7926-7947.

  • Norris, P. R., Anderson, S. M., Jenkins, J. M., Williams, A. E., & Morris Jr, J. A. (2008). Heart rate multiscale entropy at three hours predicts hospital mortality in 3,154 trauma patients. Shock, 30(1), 17-22.

  • Liu, Q., Wei, Q., Fan, S. Z., Lu, C. W., Lin, T. Y., Abbod, M. F., & Shieh, J. S. (2012). Adaptive computation of multiscale entropy and its application in EEG signals for monitoring depth of anesthesia during surgery. Entropy, 14(6), 978-992.

entropy_sample(signal, delay=1, dimension=2, r='default', **kwargs)[source]

Sample Entropy (SampEn)

Python implementation of the sample entropy (SampEn) of a signal.

This function can be called either via entropy_sample() or complexity_sampen().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see delay_optimal()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (float) – Tolerance (i.e., filtering level - max absolute difference between segments). If ‘default’, will be set to 0.2 times the standard deviation of the signal (for dimension = 2).

  • **kwargs (optional) – Other arguments.

Returns

float – The sample entropy as a float value.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy = nk.entropy_sample(signal)
>>> entropy 
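
Conceptually, SampEn is the negative natural logarithm of the conditional probability that two sequences similar for dimension points remain similar when one more point is added (Richman & Moorman, 2000). A schematic sketch of that final step, with purely hypothetical match counts A (at dimension + 1) and B (at dimension):

>>> import numpy as np
>>> A, B = 120, 180  # hypothetical template-match counts
>>> sampen_schematic = -np.log(A / B)
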
entropy_shannon(signal)[source]

Shannon entropy (SE)

Python implementation of Shannon entropy (SE). Entropy is a measure of the unpredictability of a state, or equivalently, of its average information content. Shannon entropy (SE) is one of the first and most basic measures of entropy and a foundational concept of information theory; it quantifies the amount of information in a variable.

This function can be called either via entropy_shannon() or complexity_se().

Parameters

signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

Returns

float – The Shannon entropy as float value.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>> entropy = nk.entropy_shannon(signal)
>>> entropy 
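
A minimal sketch of the underlying formula, H = -sum(p * log2(p)), estimating the probabilities p from the frequency of each distinct value (the actual implementation may bin or discretize the signal differently):

>>> import numpy as np
>>> _, counts = np.unique(signal, return_counts=True)
>>> p = counts / counts.sum()
>>> shannon_manual = -np.sum(p * np.log2(p))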

fractal_correlation(signal, delay=1, dimension=2, r=64, show=False)[source]

Correlation Dimension.

Python implementation of the Correlation Dimension D2 of a signal.

This function can be called either via fractal_correlation() or complexity_d2().

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • delay (int) – Time delay (often denoted ‘Tau’, sometimes referred to as ‘lag’). In practice, it is common to have a fixed time lag (corresponding for instance to the sampling rate; Gautama, 2003), or to find a suitable value using some algorithmic heuristics (see delay_optimal()).

  • dimension (int) – Embedding dimension (often denoted ‘m’ or ‘d’, sometimes referred to as ‘order’). Typically 2 or 3. It corresponds to the number of compared runs of lagged data. If 2, the embedding returns an array with two columns corresponding to the original signal and its delayed (by Tau) version.

  • r (str or int or list) – The sequence of radiuses to test. If an integer is passed, will get an exponential sequence ranging from 2.5% to 50% of the distance range. Methods implemented in other packages can be used via setting r='nolds' or r='Corr_Dim'.

  • show (bool) – Plot of correlation dimension if True. Defaults to False.

Returns

D2 (float) – The correlation dimension D2.

Examples

>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=2, frequency=5)
>>>
>>> fractal1 = nk.fractal_correlation(signal, r="nolds", show=True)
>>> fractal1 
>>> fractal2 = nk.fractal_correlation(signal, r=32, show=True)
>>> fractal2 
>>>
>>> signal = nk.rsp_simulate(duration=120, sampling_rate=50)
>>>
>>> fractal3 = nk.fractal_correlation(signal, r="nolds", show=True)
>>> fractal3 
>>> fractal4 = nk.fractal_correlation(signal, r=32, show=True)
>>> fractal4 
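
As a rough illustration of the default radius rule described above, an exponential sequence spanning 2.5% to 50% of a range can be built as follows (here using the signal’s amplitude range as a crude stand-in for the inter-point distance range):

>>> import numpy as np
>>> d = np.max(signal) - np.min(signal)  # crude proxy for the distance range
>>> radii = np.exp(np.linspace(np.log(0.025 * d), np.log(0.5 * d), 32))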

References

  • Bolea, J., Laguna, P., Remartínez, J. M., Rovira, E., Navarro, A., & Bailón, R. (2014). Methodological framework for estimating the correlation dimension in HRV signals. Computational and mathematical methods in medicine, 2014.

  • Boon, M. Y., Henry, B. I., Suttle, C. M., & Dain, S. J. (2008). The correlation dimension: A useful objective measure of the transient visual evoked potential?. Journal of vision, 8(1), 6-6.

  • nolds

  • Corr_Dim

fractal_dfa(signal, windows='default', overlap=True, integrate=True, order=1, multifractal=False, q=2, show=False)[source]

(Multifractal) Detrended Fluctuation Analysis (DFA or MFDFA)

Python implementation of Detrended Fluctuation Analysis (DFA) or Multifractal DFA of a signal. Detrended fluctuation analysis, much like the Hurst exponent, is used to find long-term statistical dependencies in time series.

This function can be called either via fractal_dfa() or complexity_dfa(), and its multifractal variant can be directly accessed via fractal_mfdfa() or complexity_mfdfa()

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • windows (list) – A list containing the lengths of the windows (number of data points in each subseries). Also referred to as ‘lag’ or ‘scale’. If ‘default’, will set it to a logarithmic scale (so that each window scale has the same weight) with a minimum of 4 and a maximum of a tenth of the length (to have more than 10 windows to calculate the average fluctuation).

  • overlap (bool) – Defaults to True, where the windows will have a 50% overlap with each other, otherwise non-overlapping windows will be used.

  • integrate (bool) – It is common practice to convert the signal to a random walk (i.e., detrend and integrate, which corresponds to the signal ‘profile’). Note that it leads to the flattening of the signal, which can lead to the loss of some details (see Ihlen, 2012 for an explanation). Note that for strongly anticorrelated signals, this transformation should be applied two times (i.e., provide np.cumsum(signal - np.mean(signal)) instead of signal).

  • order (int) – The order of the polynomial trend, 1 for the linear trend.

  • multifractal (bool) – If True, compute Multifractal Detrended Fluctuation Analysis (MFDFA), in which case the argument q is taken into account.

  • q (list) – The sequence of fractal exponents when multifractal=True. Must be a sequence between -10 and 10 (note that zero will be removed, since the code does not converge there). Setting q = 2 (default) gives a result close to a standard DFA. For instance, Ihlen (2012) uses q=[-5, -3, -1, 0, 1, 3, 5].

  • show (bool) – Visualise the trend between the window size and the fluctuations.

Returns

dfa (float) – The DFA coefficient.

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=3, noise=0.05)
>>> dfa1 = nk.fractal_dfa(signal, show=True)
>>> dfa1 
>>> dfa2 = nk.fractal_mfdfa(signal, q=np.arange(-3, 4), show=True)
>>> dfa2 
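
Custom window sizes can also be passed explicitly as a list. A sketch building a log-spaced sequence with expspace() (documented in the Miscellaneous section below), following the ‘default’ bounds described above; num=20 is arbitrary:

>>> windows = nk.expspace(4, len(signal) // 10, num=20)
>>> dfa3 = nk.fractal_dfa(signal, windows=windows)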

References

  • Ihlen, E. A. F. E. (2012). Introduction to multifractal detrended fluctuation analysis in Matlab. Frontiers in physiology, 3, 141.

  • Hardstone, R., Poil, S. S., Schiavone, G., Jansen, R., Nikulin, V. V., Mansvelder, H. D., & Linkenkaer-Hansen, K. (2012). Detrended fluctuation analysis: a scale-free view on neuronal oscillations. Frontiers in physiology, 3, 450.

  • nolds

  • Youtube introduction

fractal_mandelbrot(size=1000, real_range=(-2, 2), imaginary_range=(-2, 2), threshold=4, iterations=25, buddha=False, show=False)[source]

Generate a Mandelbrot (or a Buddhabrot) fractal.

Vectorized function to efficiently generate an array containing values corresponding to a Mandelbrot fractal.

Parameters
  • size (int) – The size in pixels (corresponding to the width of the figure).

  • real_range (tuple) – The Mandelbrot set is defined within the (-2, 2) complex space (the real part being the x-axis and the imaginary part the y-axis). Adjusting these ranges can be used to pan, zoom and crop the figure.

  • imaginary_range (tuple) – The Mandelbrot set is defined within the (-2, 2) complex space (the real part being the x-axis and the imaginary part the y-axis). Adjusting these ranges can be used to pan, zoom and crop the figure.

  • iterations (int) – Number of iterations.

  • threshold (int) – The threshold used, increasing it will increase the sharpness (not used for buddhabrots).

  • buddha (bool) – Whether to return a buddhabrot.

  • show (bool) – Visualize the fractal.

Returns

array – An array containing the values of the fractal (which is also plotted if show=True).

Examples

>>> import matplotlib.pyplot as plt
>>> import neurokit2 as nk
>>>
>>> # Mandelbrot fractal
>>> nk.fractal_mandelbrot(show=True) 
array(...)
>>> # Zoom at seahorse valley
>>> nk.fractal_mandelbrot(real_range=(-0.76, -0.74), imaginary_range=(0.09, 0.11),
...                       iterations=100, show=True) 
array(...)
>>>
>>> # Draw manually
>>> m = nk.fractal_mandelbrot(real_range=(-2, 0.75), imaginary_range=(-1.25, 1.25))
>>> plt.imshow(m.T, cmap="viridis") 
>>> plt.axis("off") 
>>> plt.show() 
>>>
>>> # Buddhabrot
>>> b = nk.fractal_mandelbrot(size=1500, real_range=(-2, 0.75), imaginary_range=(-1.25, 1.25),
...                           buddha=True, iterations=200)
>>> plt.imshow(b.T, cmap="gray") 
>>> plt.axis("off") 
>>> plt.show() 
>>>
>>> # Mixed
>>> m = nk.fractal_mandelbrot()
>>> b = nk.fractal_mandelbrot(buddha=True, iterations=200)
>>>
>>> mixed = m - b
>>> plt.imshow(mixed.T, cmap="gray") 
>>> plt.axis("off") 
>>> plt.show() 
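
For intuition, the escape-time rule underlying the set iterates z <- z**2 + c for each pixel’s complex coordinate c and counts the steps until the squared magnitude of z exceeds the threshold (or the iteration budget runs out). A minimal sketch for a single point:

>>> c = complex(-0.75, 0.1)  # an arbitrary point in the complex plane
>>> z, n = 0, 0
>>> while abs(z) ** 2 < 4 and n < 25:
...     z = z ** 2 + c
...     n += 1
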
fractal_mfdfa(signal, windows='default', overlap=True, integrate=True, order=1, *, multifractal=True, q=2, show=False)

(Multifractal) Detrended Fluctuation Analysis (DFA or MFDFA)

Python implementation of Detrended Fluctuation Analysis (DFA) or Multifractal DFA of a signal. Detrended fluctuation analysis, much like the Hurst exponent, is used to find long-term statistical dependencies in time series.

This function can be called either via fractal_dfa() or complexity_dfa(), and its multifractal variant can be directly accessed via fractal_mfdfa() or complexity_mfdfa()

Parameters
  • signal (Union[list, np.array, pd.Series]) – The signal (i.e., a time series) in the form of a vector of values.

  • windows (list) – A list containing the lengths of the windows (number of data points in each subseries). Also referred to as ‘lag’ or ‘scale’. If ‘default’, will set it to a logarithmic scale (so that each window scale has the same weight) with a minimum of 4 and a maximum of a tenth of the length (to have more than 10 windows to calculate the average fluctuation).

  • overlap (bool) – Defaults to True, where the windows will have a 50% overlap with each other, otherwise non-overlapping windows will be used.

  • integrate (bool) – It is common practice to convert the signal to a random walk (i.e., detrend and integrate, which corresponds to the signal ‘profile’). Note that it leads to the flattening of the signal, which can lead to the loss of some details (see Ihlen, 2012 for an explanation). Note that for strongly anticorrelated signals, this transformation should be applied two times (i.e., provide np.cumsum(signal - np.mean(signal)) instead of signal).

  • order (int) – The order of the polynomial trend, 1 for the linear trend.

  • multifractal (bool) – If True, compute Multifractal Detrended Fluctuation Analysis (MFDFA), in which case the argument q is taken into account.

  • q (list) – The sequence of fractal exponents when multifractal=True. Must be a sequence between -10 and 10 (note that zero will be removed, since the code does not converge there). Setting q = 2 (default) gives a result close to a standard DFA. For instance, Ihlen (2012) uses q=[-5, -3, -1, 0, 1, 3, 5].

  • show (bool) – Visualise the trend between the window size and the fluctuations.

Returns

dfa (float) – The DFA coefficient.

Examples

>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=3, noise=0.05)
>>> dfa1 = nk.fractal_dfa(signal, show=True)
>>> dfa1 
>>> dfa2 = nk.fractal_mfdfa(signal, q=np.arange(-3, 4), show=True)
>>> dfa2 

References

  • Ihlen, E. A. F. E. (2012). Introduction to multifractal detrended fluctuation analysis in Matlab. Frontiers in physiology, 3, 141.

  • Hardstone, R., Poil, S. S., Schiavone, G., Jansen, R., Nikulin, V. V., Mansvelder, H. D., & Linkenkaer-Hansen, K. (2012). Detrended fluctuation analysis: a scale-free view on neuronal oscillations. Frontiers in physiology, 3, 450.

  • nolds

  • Youtube introduction

Miscellaneous

Submodule for NeuroKit.

bio_analyze(data, sampling_rate=1000, method='auto')[source]

Automated analysis of bio signals.

Wrapper for other bio analyze functions of electrocardiography signals (ECG), respiration signals (RSP), electrodermal activity (EDA) and electromyography signals (EMG).

Parameters
  • data (DataFrame) – The DataFrame containing all the processed signals, typically produced by bio_process(), ecg_process(), rsp_process(), eda_process(), or emg_process().

  • sampling_rate (int) – The sampling frequency of the signals (in Hz, i.e., samples/second). Defaults to 1000.

  • method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’ where the right method will be chosen based on the mean duration of the data (‘event-related’ for duration under 10s).

Returns

DataFrame – DataFrame of the analyzed bio features. See docstrings of ecg_analyze(), rsp_analyze(), eda_analyze() and emg_analyze() for more details. Also returns Respiratory Sinus Arrhythmia features produced by ecg_rsa() if interval-related analysis is carried out.

See also

ecg_analyze(), rsp_analyze(), eda_analyze(), emg_analyze()

Examples

>>> import neurokit2 as nk
>>>
>>> # Example 1: Event-related analysis
>>> # Download data
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data
>>> df, info = nk.bio_process(ecg=data["ECG"], rsp=data["RSP"], eda=data["EDA"],
...                           keep=data["Photosensor"], sampling_rate=100)
>>>
>>> # Build epochs
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral",
...                                           "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1,
...                           epochs_end=1.9)
>>>
>>> # Analyze
>>> nk.bio_analyze(epochs, sampling_rate=100) 
>>>
>>> # Example 2: Interval-related analysis
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.bio_process(ecg=data["ECG"], rsp=data["RSP"], sampling_rate=100)
>>>
>>> # Analyze
>>> nk.bio_analyze(df, sampling_rate=100)

bio_process(ecg=None, rsp=None, eda=None, emg=None, keep=None, sampling_rate=1000)[source]

Automated processing of bio signals.

Wrapper for other bio processing functions of electrocardiography signals (ECG), respiration signals (RSP), electrodermal activity (EDA) and electromyography signals (EMG).

Parameters
  • data (DataFrame) – The DataFrame containing all the respective signals (e.g., ecg, rsp, Photosensor etc.). If provided, there is no need to fill in the other arguments denoting the channel inputs. Defaults to None.

  • ecg (Union[list, np.array, pd.Series]) – The raw ECG channel.

  • rsp (Union[list, np.array, pd.Series]) – The raw RSP channel (as measured, for instance, by a respiration belt).

  • eda (Union[list, np.array, pd.Series]) – The raw EDA channel.

  • emg (Union[list, np.array, pd.Series]) – The raw EMG channel.

  • keep (DataFrame) – Dataframe or channels to add by concatenation to the processed dataframe (for instance, the Photosensor channel).

  • sampling_rate (int) – The sampling frequency of the signals (in Hz, i.e., samples/second). Defaults to 1000.

Returns

  • bio_df (DataFrame) – DataFrame of the following processed bio features:

    - “ECG”: the raw signal, the cleaned signal, the heart rate, and the R-peaks indexes. Also generated by ecg_process().

    - “RSP”: the raw signal, the cleaned signal, the rate, and the amplitude. Also generated by rsp_process().

    - “EDA”: the raw signal, the cleaned signal, the tonic component, the phasic component, indexes of the SCR onsets, peaks, amplitudes, and half-recovery times. Also generated by eda_process().

    - “EMG”: the raw signal, the cleaned signal, and the amplitudes. Also generated by emg_process().

    - “RSA”: Respiratory Sinus Arrhythmia features generated by ecg_rsa(), if both ECG and RSP are provided.

  • bio_info (dict) – A dictionary containing the samples of peaks, troughs, amplitudes, onsets, offsets, periods of activation, recovery times of the respective processed signals.

See also

ecg_process(), rsp_process(), eda_process(), emg_process()

Example

>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=30, sampling_rate=250)
>>> rsp = nk.rsp_simulate(duration=30, sampling_rate=250)
>>> eda = nk.eda_simulate(duration=30, sampling_rate=250, scr_number=3)
>>> emg = nk.emg_simulate(duration=30, sampling_rate=250, burst_number=3)
>>>
>>> bio_df, bio_info = nk.bio_process(ecg=ecg, rsp=rsp, eda=eda, emg=emg, sampling_rate=250)
>>>
>>> # Visualize all signals
>>> fig = nk.standardize(bio_df).plot(subplots=True)
>>> fig 

Submodule for NeuroKit.

as_vector(x)[source]

Convert to vector.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> x = nk.as_vector(x=range(3))
>>> y = nk.as_vector(x=[0, 1, 2])
>>> z = nk.as_vector(x=np.array([0, 1, 2]))
>>> z 
>>>
>>> x = nk.as_vector(x=0)
>>> x 
>>>
>>> x = nk.as_vector(x=pd.Series([0, 1, 2]))
>>> y = nk.as_vector(x=pd.DataFrame([0, 1, 2]))
>>> y

expspace(start, stop, num=50, base=1)[source]

Exponential range.

Creates a list of integer values of a given length from start to stop, spread by an exponential function.

Parameters
  • start (int) – Minimum range values.

  • stop (int) – Maximum range values.

  • num (int) – Number of samples to generate. Default is 50. Must be non-negative.

  • base (float) – If 1, will use np.exp(), if 2 will use np.exp2().

Returns

array – An array of integer values spread by the exponential function.

Examples

>>> import neurokit2 as nk
>>> nk.expspace(start=4, stop=100, num=10) 
array([  4,   6,   8,  12,  17,  24,  34,  49,  70, 100])

find_closest(closest_to, list_to_search_in, direction='both', strictly=False, return_index=False)[source]

Find the number in an array that is closest to a given number x.

Parameters
  • closest_to (float) – The target number(s) to find the closest of.

  • list_to_search_in (list) – The list of values to look in.

  • direction (str) – “both” for smaller or greater, “greater” for only greater numbers and “smaller” for the closest smaller.

  • strictly (bool) – False for strictly superior or inferior, or True for including equal.

  • return_index (bool) – If True, will return the index of the closest value in the list.

Returns

closest (int) – The closest number in the array.

Example

>>> import neurokit2 as nk
>>>
>>> # Single number
>>> x = nk.find_closest(1.8, [3, 5, 6, 1, 2])
>>> x  
>>>
>>> y = nk.find_closest(1.8, [3, 5, 6, 1, 2], return_index=True)
>>> y  
>>>
>>> # Vectorized version
>>> x = nk.find_closest([1.8, 3.6], [3, 5, 6, 1, 2])
>>> x  
find_consecutive(x)[source]

Find and group consecutive values in a list.

Parameters

x (list) – The list to look in.

Returns

list – A list of tuples corresponding to groups containing all the consecutive numbers.

Examples

>>> import neurokit2 as nk
>>>
>>> x = [2, 3, 4, 5, 12, 13, 14, 15, 16, 17, 20]
>>> nk.find_consecutive(x)
[(2, 3, 4, 5), (12, 13, 14, 15, 16, 17), (20,)]

listify(**kwargs)[source]

Transforms arguments into lists of the same length.

Examples

>>> import neurokit2 as nk
>>>
>>> nk.listify(a=3, b=[3, 5], c=[3]) 
{'a': [3, 3], 'b': [3, 5], 'c': [3, 3]}