
Exploring NWB Datasets on DANDI with Neurosift

INCF Neuroinformatics Assembly, September 26, 2024

Jeremy Magland, Center for Computational Mathematics, Flatiron Institute

Developed in collaboration with CatalystNeuro

Neurosift overview

The main goal of Neurosift is to provide a user-friendly browser-based interface for exploring DANDI NWB files in a shared and collaborative environment.

DANDI Archive / Neurosift Integration

image

image

Click around to explore this file.

Neurosift URL

Let’s examine the Neurosift URL for that file:

https://neurosift.app/?p=/nwb&url=https://api.dandiarchive.org/api/assets/47be899c-27a8-4864-a1e9-d7a3f92e522e/download/&dandisetId=000552&dandisetVersion=0.230630.2304

The url query parameter holds the address of the NWB file, here an asset download link on the DANDI Archive API. You can set it to the URL of any remote NWB file, not just one hosted on DANDI.
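As a minimal sketch of how such a link is put together, the snippet below assembles the same URL from its parts and then splits it back into query parameters. The helper name build_neurosift_url is hypothetical, and treating dandisetId and dandisetVersion as optional is an assumption based on the note that non-DANDI files also work.

```python
from urllib.parse import urlsplit, parse_qs

NEUROSIFT_BASE = "https://neurosift.app/"


def build_neurosift_url(nwb_url, dandiset_id=None, dandiset_version=None):
    """Hypothetical helper: assemble a Neurosift link for a remote NWB file."""
    parts = ["p=/nwb", f"url={nwb_url}"]
    # Assumption: the two Dandiset parameters are optional extras.
    if dandiset_id is not None:
        parts.append(f"dandisetId={dandiset_id}")
    if dandiset_version is not None:
        parts.append(f"dandisetVersion={dandiset_version}")
    return NEUROSIFT_BASE + "?" + "&".join(parts)


asset_url = (
    "https://api.dandiarchive.org/api/assets/"
    "47be899c-27a8-4864-a1e9-d7a3f92e522e/download/"
)
link = build_neurosift_url(asset_url, "000552", "0.230630.2304")
print(link)

# Split the link back into its query parameters.
params = parse_qs(urlsplit(link).query)
for name, values in params.items():
    print(f"{name} = {values[0]}")
```

The file URL is inserted without percent-encoding, matching the link shown above. A file URL that itself contains & or # characters would likely need encoding first.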

Opening local NWB files

You can also view NWB files stored on your own machine. This is not easy to do on Dandihub, so run the commands below on your local computer.

# One-time install (or upgrade) of the neurosift Python package
pip install --upgrade neurosift

# Open the local file (will open a browser window)
neurosift view-nwb /path/to/file.nwb

You can try this out by downloading one of these relatively small NWB files.

Neurosift as DANDI Browser

The DANDI REST API is open to the public, so Neurosift can also function as an alternative DANDI Archive explorer!

https://neurosift.app

image
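Because the API is public, the same listings can be requested from a script. The following is a rough sketch using only the Python standard library. The asset-listing endpoint, the page_size parameter, and the response field names (results, path, asset_id) are assumptions about the DANDI REST API and should be checked against its documentation. The download endpoint follows the form of the Neurosift link shown earlier.

```python
import json
from urllib.request import urlopen

DANDI_API = "https://api.dandiarchive.org/api"


def assets_endpoint(dandiset_id, version):
    """URL that lists the assets of one Dandiset version (assumed layout)."""
    return f"{DANDI_API}/dandisets/{dandiset_id}/versions/{version}/assets/"


def download_endpoint(asset_id):
    """Download URL for one asset, in the form used by the Neurosift link."""
    return f"{DANDI_API}/assets/{asset_id}/download/"


def list_assets(dandiset_id, version, page_size=5):
    """Fetch the first page of assets; requires network access."""
    url = f"{assets_endpoint(dandiset_id, version)}?page_size={page_size}"
    with urlopen(url, timeout=10) as response:
        return json.load(response).get("results", [])


try:
    for asset in list_assets("000552", "0.230630.2304"):
        # Field names are assumptions; .get avoids a KeyError if they differ.
        print(asset.get("path"), download_endpoint(asset.get("asset_id")))
except (OSError, ValueError) as err:
    print(f"DANDI API request failed: {err}")
```

Any download URL produced this way can be passed to Neurosift through the url query parameter.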

Efficient streaming of NWB files

For more information, see the LINDI project.

For example, open this 137 GB file. A “Using LINDI” indicator in the left panel means that Neurosift found the pre-indexed .nwb.lindi.json file in the cloud and used it instead of the .nwb file. The .json file efficiently stores all of the metadata and references the original file for the data chunks.

Here’s the corresponding LINDI JSON file for inspection: https://lindi.neurosift.org/dandi/dandisets/000409/assets/8e55b7ac-a085-43c0-9dc9-c577bcbe1824/nwb.lindi.json

LINDI uses a JSON representation of Zarr with external references to large binary chunks.
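To make the idea concrete, here is a toy illustration of a reference-style JSON layout: small metadata entries are stored inline, while each data chunk is a pointer of the form [url, byte offset, byte length] into the original file. The key names, the example.org URL, and the offsets are all made up for illustration. This is not the exact LINDI schema, so inspect the JSON file linked above for the real layout.

```python
import json

# Toy reference store: metadata inline, data chunks as external references.
toy_refs = {
    "refs": {
        ".zgroup": json.dumps({"zarr_format": 2}),
        "data/.zarray": json.dumps(
            {"shape": [4], "chunks": [2], "dtype": "<f8", "zarr_format": 2}
        ),
        # [url, offset, length]: where the chunk bytes live in the big file
        "data/0": ["https://example.org/file.nwb", 1024, 16],
        "data/1": ["https://example.org/file.nwb", 1040, 16],
    }
}


def range_request(ref):
    """Turn a [url, offset, length] reference into a URL plus HTTP Range header."""
    url, offset, length = ref
    return url, {"Range": f"bytes={offset}-{offset + length - 1}"}


for key, value in toy_refs["refs"].items():
    if isinstance(value, list):
        print(key, "->", range_request(value))
    else:
        print(key, "-> inline metadata,", len(value), "characters")
```

Only the chunks a viewer actually needs are fetched with range requests, which is why a very large file can be opened without downloading it.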

Streaming objects from NWB files using Python

You can load NWB objects using Python.

image

import lindi

url = 'https://lindi.neurosift.org/dandi/dandisets/000552/assets/25b641ae-5a56-49c2-893c-7dd19d039912/nwb.lindi.json'

# Load the remote file
f = lindi.LindiH5pyFile.from_lindi_file(url)

# Load the neurodata object
X = f['/processing/behavior/SleepStates']

# Column datasets (named `ids` to avoid shadowing Python's built-in `id`)
ids = X['id']
label = X['label']
start_time = X['start_time']
stop_time = X['stop_time']

print(f'Shape of id: {ids.shape}')
print(f'Shape of start_time: {start_time.shape}')
print(f'Shape of stop_time: {stop_time.shape}')

# Read the whole label column into memory and print it
print(label[()])

# Output:
# Shape of id: (46,)
# Shape of start_time: (46,)
# Shape of stop_time: (46,)
# ['Awake' 'Non-REM' 'Awake' 'Non-REM' 'Awake' 'Non-REM' 'REM' 'Awake'
#  'Non-REM' 'REM' 'Awake' 'Non-REM' 'Awake' 'Non-REM' 'Awake' 'Non-REM'
#  'REM' 'Awake' 'Non-REM' 'Awake' 'Non-REM' 'REM' 'Awake' 'Non-REM' 'Awake'
#  'Non-REM' 'REM' 'Awake' 'Non-REM' 'Awake' 'Non-REM' 'Awake' 'Non-REM'
#  'Awake' 'Non-REM' 'Awake' 'Non-REM' 'Awake' 'Non-REM' 'REM' 'Awake'
#  'Non-REM' 'REM' 'Awake' 'Non-REM' 'Awake']

To try this, open a Jupyter notebook. On Dandihub, choose File -> New -> Notebook and select the Python 3 kernel. Then paste the code above into the first cell and run it.

The last line of the code prints the labels data. You can also run it on its own in a later cell:

print(label[()])
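Once the columns are loaded, ordinary Python can summarize them. As a small sketch, the hypothetical helper below adds up the time spent in each labeled state. The three-row input is placeholder data chosen for illustration, not values from the file.

```python
from collections import defaultdict


def total_duration_by_label(labels, start_times, stop_times):
    """Sum (stop - start) for each label; inputs are equal-length sequences."""
    totals = defaultdict(float)
    for lab, t0, t1 in zip(labels, start_times, stop_times):
        # Labels may arrive as bytes depending on how the strings are stored.
        if isinstance(lab, bytes):
            lab = lab.decode()
        totals[str(lab)] += float(t1) - float(t0)
    return dict(totals)


# Placeholder rows standing in for the streamed columns.
demo = total_duration_by_label(
    ["Awake", "Non-REM", "Awake"],
    [0.0, 10.0, 25.0],
    [10.0, 25.0, 30.0],
)
print(demo)  # {'Awake': 15.0, 'Non-REM': 15.0}
```

With the streamed columns from the notebook, the call would be total_duration_by_label(label[()], start_time[()], stop_time[()]).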

Neurosift tabs

What are the different Neurosift tabs?

image

NWB Tab

The NWB tab gives a hierarchical layout of the Neurodata objects in the NWB file with links to various visualization plugins.

image

What are the checkboxes for?

RAW Tab

The RAW tab shows the raw HDF5 structure: groups, datasets, attributes, etc.

image

Tip: to inspect the contents of a larger dataset, open the browser developer console and click on the CIRCLE icon. The contents of the dataset will be printed to the console. For example, in this example file, go to the RAW tab and navigate to processing -> behavior -> Blink -> TimeSeries -> data, then open the browser developer console and click the CIRCLE icon.
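A similar hierarchical view can be sketched in Python. The function below walks any h5py-like object, treating anything with a keys() method as a group and anything else as a dataset with shape and dtype. It is demonstrated on a nested dictionary stand-in so that it runs without a file. The Leaf type and the dtypes shown are placeholders, and the assumption that a file opened with lindi behaves the same way has not been verified here.

```python
from collections import namedtuple

# Stand-in for a dataset: only the two attributes the walker reads.
Leaf = namedtuple("Leaf", ["shape", "dtype"])


def describe_tree(group, depth=0, max_depth=4):
    """Return indented lines describing groups and datasets under `group`."""
    lines = []
    indent = "  " * depth
    for name in sorted(group.keys()):
        child = group[name]
        if hasattr(child, "keys"):  # group-like: recurse into it
            lines.append(f"{indent}{name}/")
            if depth + 1 < max_depth:
                lines.extend(describe_tree(child, depth + 1, max_depth))
        else:  # dataset-like: report shape and dtype
            lines.append(f"{indent}{name}  shape={child.shape} dtype={child.dtype}")
    return lines


toy_file = {
    "processing": {
        "behavior": {
            "SleepStates": {
                "id": Leaf((46,), "int64"),      # dtype is a placeholder
                "label": Leaf((46,), "object"),  # dtype is a placeholder
            }
        }
    }
}

tree_lines = describe_tree(toy_file)
print("\n".join(tree_lines))
```

The max_depth argument keeps the listing short for deeply nested files.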

WIDGETS Tab

The WIDGETS tab provides a widget-centric view of the file. For each relevant visualization plugin, you can see the neurodata objects that can be opened with it.

image

SPECIFICATIONS Tab

The SPECIFICATIONS tab lets you visualize the HDMF spec that is embedded in the NWB file.

image

DENDRO Tab

This tab will be discussed elsewhere.

ANNOTATIONS Tab

Finally, the ANNOTATIONS tab is an advanced feature that lets you add annotations (notes) to Dandisets, NWB files, and individual Neurodata objects. Other users will be able to see your annotations.

Note: this is an experimental feature and is subject to change or removal. We will not cover it in this tutorial.

Advanced DANDI Queries

Click on “advanced query” in the upper-right corner of the main Neurosift page.

You can filter by neurodata types. For example, in this screenshot, I searched for all Dandisets that have an object of type Units AND an object of type ElectricalSeries. This is based on a pre-indexing of public DANDI that includes only the first 100 assets of each Dandiset.

image
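The filter described above amounts to a subset test: a Dandiset matches when every requested neurodata type appears among the types indexed for it. The sketch below shows that logic on a made-up index. The identifiers and type sets are placeholders, and the real index is built and queried by Neurosift itself.

```python
def dandisets_with_all_types(index, required_types):
    """Return sorted IDs whose indexed types include every required type."""
    required = set(required_types)
    return sorted(
        dandiset_id
        for dandiset_id, types_found in index.items()
        if required <= set(types_found)
    )


# Hypothetical index: Dandiset ID -> neurodata types seen in its indexed assets.
toy_index = {
    "dandiset-A": {"Units", "ElectricalSeries", "TimeIntervals"},
    "dandiset-B": {"Units"},
    "dandiset-C": {"ElectricalSeries", "Units"},
    "dandiset-D": {"ImageSeries"},
}

matches = dandisets_with_all_types(toy_index, ["Units", "ElectricalSeries"])
print(matches)  # ['dandiset-A', 'dandiset-C']
```

Because only the first 100 assets of each Dandiset are indexed, a Dandiset whose matching objects appear only in later assets would be missed by such a query.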

You can then select a subset of these Dandisets and perform a SPECIAL query using JSONPath or JavaScript syntax. For example you could ask it to:

image

Example: Dandiset 000458

On DANDI Archive: 000458

Here’s a thorough description of the dataset for the purpose of reanalysis.

This notebook shows how to use the DANDI API to summarize all of the sessions: 001_summarize_contents.ipynb

Here’s one of the examples in Neurosift: sub-551397/sub-551397_ses-20210211_behavior+ecephys.nwb

image

As you can see, this file contains EEG, LFP, epochs, trials, running speed, and units.

In the Units section click on “Raster Plot” and then adjust the number of visible units.

image

Peri-Stimulus Time Histograms (PSTH)

Let’s explore the Peri-Stimulus Time Histograms (PSTH)!