Collaborative scientific visualization in the browser with Figurl

Jeremy Magland and Jeff Soules, Center for Computational Mathematics, Flatiron Institute

Flatiron-Wide Autumn Meeting (FWAM), October 2023




Follow along: https://github.com/magland
Figurl FWAM 2023

Visualization in the scientific process

Visualization is critical at every stage of the data-centric scientific process

  • Data exploration
  • Quality control
  • Curation
  • Analysis and interpretation
  • Communication

Benefits of interactive visualizations

Compared with static plots, interactive visualizations enable:

  • Exploration of large, complex datasets
  • Exploration of parameter space
  • Building intuition about the dataset
  • Identification of outliers and artifacts (QC)
  • Interactive / collaborative curation

Current limitations in sharing interactive visualizations

Interactive visualizations are not easily shared as they often require:

  • Specific operating system / hardware
  • Specialized software
  • Transfer of large datasets
  • Expertise in setting up / reproducing the visualization

Why web-based software?

  • Easy to use
  • Easy to share
  • Cross-platform
  • Development cycle advantages (simplifies distribution, etc.)
  • Integrates naturally with cloud resources
  • Collaboration
  • Reproducibility
  • Limitations: no native access to local files/software, requires internet connection, limited access to previous versions, requires coding in JavaScript

Existing browser-based visualization tools

Observable, Vega, Plotly, Bokeh, D3, Matplotlib, Google Charts, VisPy, Altair, Deck.gl, P5.js, Three.js, Babylon.js, A-Frame, Pixi.js, Recharts, NVD3, C3.js, Chart.js, Vega-Lite, ECharts, Highcharts, Leaflet

... and many more

These are powerful tools, but they typically need to be embedded in a framework to be usable and shareable.


Executable notebooks (browser-based solution)

  • Jupyter, Google Colab, etc.
  • Pros: reproducible, self-documenting, interactive
  • Shareable on GitHub, but only as a static rendering of output cells
  • Other systems exist for rendering in static pages, but they have limitations
  • Cons: requires a running backend, cluttered by code, etc.

Binder: Making notebooks shareable

  • Enable anyone to run code from public repositories.
  • Allow authors to create interactive versions of their code.
  • Scale to handle many users at once.

Binder: How it works

  • Git Repository: Start with code/notebooks in a git repo (e.g., GitHub)
  • Dependencies: (requirements.txt or environment.yml)
  • Docker Image: Converts repo into Docker image
  • Launch Instance: Starts instance of that image
  • Interactive Sessions: Jupyter notebooks
  • Ephemeral: Changes during session are ephemeral; lost when session ends.
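
The first and fourth steps meet in the launch link: the link names the repository, and opening it starts an instance. The sketch below builds such a link, assuming the public mybinder.org service and its v2/gh URL layout; the function name, the organization, the repository, and the notebook path are hypothetical placeholders, not taken from the talk.

```python
from urllib.parse import quote, urlencode


def binder_launch_url(org, repo, ref="HEAD", notebook=None):
    """Build a mybinder.org launch link for a GitHub repository.

    Assumption: the public mybinder.org service and its v2/gh URL layout.
    """
    url = (
        "https://mybinder.org/v2/gh/"
        f"{quote(org, safe='')}/{quote(repo, safe='')}/{quote(ref, safe='')}"
    )
    if notebook is not None:
        # labpath asks Binder to open this file in JupyterLab after launch
        url += "?" + urlencode({"labpath": notebook})
    return url


# Hypothetical repository and notebook, for illustration only
print(binder_launch_url("example-org", "example-repo", notebook="notebooks/demo.ipynb"))
```

Nothing in the link pins the built image, so the wait for the backend described on the following slides still applies each time the link is opened.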

Binder: Limitations

  • Requires running backend.

    • Depending on number of users/views, can be expensive
    • Infrastructure maintenance
  • User needs to wait for the backend to start

    • Matplotlib demo: https://matplotlib.org/ (a relatively fast example, 30-90 sec)
    • If image is large or needs to be built, can take several minutes
  • Binder links can break over time


Binder: Can be very slow to load

Here is a powerful example that is very slow to load because the image needs to be built at startup. (In my experience, these often ultimately fail to load.)


https://pythoninchemistry.org/sim_and_scat/classical_methods/van_der_waals


Observable (observablehq.com)

  • Interactive Exploration: Users play with data and code
  • Collaboration: Work with others in real-time
  • Reactivity: Changes propagate automatically
  • Reproducibility: Ensures results can be reproduced by others
  • Sharing and Publishing: Disseminate findings easily

Example: centroid-and-voronoi-polygons


Observable: Limitations

  • Uses modified JavaScript (main limitation)
  • Performance issues for large datasets
  • Visualizations must be embedded in notebooks

Clickable hyperlink method

  • Example: Neuroglancer (from Google)
  • Dissemination using URLs
  • No backend server required (client-side computation)
  • Open viewer

Example Neuroglancer URL

https://neuroglancer-demo.appspot.com/#!{'layers':{'image':{'type':'image'_'source':'precomputed://gs://neuroglancer-public-data/flyem_fib-25/image'}_'ground-truth':{'type':'segmentation'_'source':'precomputed://gs://neuroglancer-public-data/flyem_fib-25/ground_truth'_'segments':['21894'_'22060'_'158571'_'24436'_'2515']}}_'navigation':{'pose':{'position':{'voxelSize':[8_8_8]_'voxelCoordinates':[2914.500732421875_3088.243408203125_4045]}}_'zoomFactor':30.09748283999932}_'perspectiveOrientation':[0.3143535554409027_0.8142156600952148_0.4843369424343109_-0.06040262430906296]_'perspectiveZoom':443.63404517712684_'showSlices':false}
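
The viewer state in this link appears to live entirely in the fragment after "#!", written in a URL-safe variant of JSON: single quotes delimit strings and underscores stand in for commas outside strings. The sketch below decodes a shortened copy of the URL under that assumption. The function names are hypothetical, and string escapes are not handled.

```python
import json


def url_safe_to_json(fragment):
    """Convert the URL-safe encoding back to standard JSON text.

    Assumption: ' delimits strings and _ replaces commas outside strings.
    Underscores inside strings (e.g. flyem_fib-25) are left untouched.
    """
    out = []
    in_string = False
    for ch in fragment:
        if ch == "'":
            in_string = not in_string
            out.append('"')
        elif ch == "_" and not in_string:
            out.append(",")
        else:
            out.append(ch)
    return "".join(out)


def decode_state(url):
    """Return the viewer state stored after '#!' as a Python dict."""
    _, _, fragment = url.partition("#!")
    return json.loads(url_safe_to_json(fragment))


# Shortened version of the URL above, keeping one layer and two settings
example_url = (
    "https://neuroglancer-demo.appspot.com/#!"
    "{'layers':{'image':{'type':'image'_"
    "'source':'precomputed://gs://neuroglancer-public-data/flyem_fib-25/image'}}_"
    "'navigation':{'zoomFactor':30.09748283999932}_"
    "'showSlices':false}"
)
state = decode_state(example_url)
print(state["layers"]["image"]["source"])
```

Because the whole state travels in the link, sharing a view needs no server-side session; the data itself still has to be reachable in a cloud bucket.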


Neuroglancer: Limitations

  • This is a specialized tool
  • Requires data be prepared and uploaded to a cloud bucket (expertise and special configuration)

Figurl: overview

  • Simplifies sharing of interactive figures
    • Run Python script to generate shareable URL
    • Clickable hyperlink method
    • No backend server
    • Fast loading, not embedded in notebook
  • Create custom visualization plugins
    • Static HTML bundles in the cloud
    • React/TypeScript (or other frameworks)
  • Promotes scientific collaboration, communication, reproducibility

Figurl: Plotly example

import plotly.express as px
import figurl as fig

# Load the iris dataset and create a Plotly figure
iris = px.data.iris()
ff = px.scatter_3d(iris, x='sepal_length',
                   y='sepal_width', z='petal_width',
                   color='species')

# Create and print the figURL
url = fig.Plotly(ff).url(label='plotly example - iris 3d')
print(url)

https://figurl.org/f?v=gs://figurl/plotly-1&d=sha1://5c6ec276ce9a3b20b208aaff911b037ce4052e51&label=plotly example - iris 3d


Figurl: Embedding in presentations / websites


Figurl architecture


Figurl Neurophysiology Example (figurl)


Figurl Neurophysiology Example

import spikeextractors as se

# Load the recording and sorting
recording, sorting = ...

# prepare SpikeInterface widget
widget = ...

# Prepare and print the figURL
url = widget.url(label='example')
print(url)

https://figurl.org/f?v=gs://figurl/spikesortingview-10&d=sha1://8d61e59b2806cf927ca1bd265923c23f5c37b990&label=experiment1_Record Node 104%23Neuropix-PXI-100.ProbeA-AP_recording1 - kilosort2_5 - Sorting Summary


Figurl Timeseries Graph Example

import numpy as np
import sortingview.views as vv

G = vv.TimeseriesGraph(
    legend_opts={'location': 'northwest'},
    y_range=[-15, 15], hide_x_gridlines=False, hide_y_gridlines=True
)
n1 = 5000
t = np.arange(0, n1) / n1 * 10
v = t * np.cos((2 * t)**2)
G.add_line_series(name='blue line', t=t, y=v.astype(np.float32), color='blue')
n2 = 400
t = np.arange(0, n2) / n2 * 10
v = t * np.cos((2 * t)**2)
G.add_marker_series(name='red marker', t=t, y=v.astype(np.float32), color='red', radius=4)
v = t + 1
G.add_line_series(name='green dash', t=t, y=v.astype(np.float32), color='green', width=5, dash=[12, 8])
t = np.arange(0, 12) / 12 * 10
v = -t - 1
G.add_marker_series(name='black marker', t=t, y=v.astype(np.float32), color='black', radius=8, shape='square')

print(G.url(label='TimeseriesGraph-Example'))

https://figurl.org/f?v=gs://figurl/spikesortingview-10&d=sha1://e6ca2d115aa3b92b6da77643f07349cb8f9b5546&label=TimeseriesGraph-Example


Anatomy of a Figurl URL
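
Judging from the example links in this talk, a Figurl URL carries three query parameters: v (where the visualization plugin bundle lives), d (the content URI of the figure data), and label. The sketch below splits one of the URLs shown earlier into those parts. The parameter roles are inferred from the examples rather than from a specification, and the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_figurl(url):
    """Split a Figurl link into its query parameters (first value of each)."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


# URL printed by the TimeseriesGraph example
parts = parse_figurl(
    "https://figurl.org/f"
    "?v=gs://figurl/spikesortingview-10"
    "&d=sha1://e6ca2d115aa3b92b6da77643f07349cb8f9b5546"
    "&label=TimeseriesGraph-Example"
)
print(parts["v"])      # plugin bundle location (assumed role)
print(parts["d"])      # content URI of the figure data (assumed role)
print(parts["label"])  # human-readable label
```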


Figurl Altair Example

import figurl as fig

# From: https://altair-viz.github.io/gallery/simple_histogram.html
import altair as alt
from vega_datasets import data

source = data.movies.url

chart = alt.Chart(source).mark_bar().encode(alt.X("IMDB_Rating:Q", bin=True), y='count()')

# Create and print the figURL
url = fig.Altair(chart).url(label='example altair chart')
print(url)

https://figurl.org/f?v=gs://figurl/vegalite-2&d=sha1://f5920cbea57e42211f7cf83065230132713a3f01&label=example altair chart


Figurl 3D Surface Example

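import kachery_cloud as kcl
import volumeview as vv  # assumption: vv is the volumeview package, matching the volumeview plugin in the URL below

vtk_uri = 'sha1://e54d59b5f12d226fdfe8a0de7d66a3efd1b83d69?label=rbc_001.vtk'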
vtk_path = kcl.load_file(vtk_uri)

vertices, faces = vv._parse_vtk_unstructured_grid(vtk_path)

W = vv.Workspace()
S = W.add_surface(name='red-blood-cell', vertices=vertices, faces=faces)
W.add_surface_scalar_field(name='scalarX', surface=S, data=vertices[:, 0])
W.add_surface_scalar_field(name='scalarY', surface=S, data=vertices[:, 1])
W.add_surface_scalar_field(name='scalarZ', surface=S, data=vertices[:, 2])

F = W.create_figure()
url = F.url(label='rbc_surface_scalar_fields')
print(url)

https://figurl.org/f?v=gs://figurl/volumeview-4&d=sha1://5a9cc08b0d8ce7a71c132b41bb5f88b9247568ba&label=rbc_surface_scalar_fields


3D surface


Figurl uses Kachery

Content Addressable Storage (CAS) database in the cloud

  • Minimal configuration for upload
  • Download from anywhere
  • Python client or Command-line client
  • Serverless infrastructure
  • Organized into zones (labs can host zones / pay for storage)
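
To make the content-addressable idea concrete, here is a toy in-memory store in which the address of a blob is the SHA-1 hash of its bytes, mirroring the sha1:// URIs used throughout this talk. It is only an illustration of the principle: the class and method names are hypothetical and this is not how kachery-cloud itself is implemented.

```python
import hashlib


class InMemoryCAS:
    """Toy content-addressable store: the key is the SHA-1 of the content."""

    PREFIX = "sha1://"

    def __init__(self):
        self._blobs = {}

    def store(self, content):
        """Store bytes and return a sha1:// URI derived from the content."""
        digest = hashlib.sha1(content).hexdigest()
        self._blobs[digest] = content
        return self.PREFIX + digest

    def load(self, uri):
        """Return the bytes for a sha1:// URI, ignoring any ?label= suffix."""
        if not uri.startswith(self.PREFIX):
            raise ValueError("expected a sha1:// URI")
        digest = uri[len(self.PREFIX):].split("?", 1)[0]
        return self._blobs[digest]


cas = InMemoryCAS()
uri = cas.store(b"test-content\n")
print(uri)
print(cas.load(uri + "?label=test_content.txt"))
```

Two consequences follow from addressing by hash: storing the same bytes twice yields the same URI, and anyone who downloads the content can check it against the URI.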

Storing kachery data

echo "test-content" > test_content.txt
kachery-cloud-store test_content.txt
# output:
# sha1://b971c6ef19b1d70ae8f0feb989b106c319b36230?label=test_content.txt

From Python

import kachery_cloud as kcl

uri = kcl.store_text('example text', label='example.txt')
# uri = "sha1://d9e989f651cdd269d7f9bb8a215d024d8d283688?label=example.txt"

Retrieving kachery data

kachery-cloud-load sha1://b971c6ef19b1d70ae8f0feb989b106c319b36230

From Python

import kachery_cloud as kcl

w = kcl.load_text('sha1://d9e989f651cdd269d7f9bb8a215d024d8d283688?label=example.txt')

x = kcl.load_json('sha1://d0d9555e376ff13a08c6d56072808e27ca32d54a?label=example.json')

y = kcl.load_npy("sha1://bb55205a2482c6db2ace544fc7d8397551110701?label=example.npy")

z = kcl.load_pkl("sha1://20d178d5a1264fc3267e38ca238c23f3e2dcd5d2?label=example.pkl")

Lab-specific visualization plugins (Loren Frank Lab)


Lab-specific visualization plugins (with Ralph Peterson)


Advanced: Embedding Figurl Figures in Markdown Documents

https://github.com/dcmnts/isosplit-paper/blob/main/isosplit.md

yields

https://doc.figurl.org/gh/dcmnts/isosplit-paper/blob/main/isosplit.md
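
The pair of links above suggests a simple rewrite rule: replace the https://github.com/ prefix with https://doc.figurl.org/gh/ and keep the rest of the path. The sketch below applies that rule. It generalizes from the single example shown here, so treat the rule and the function name as assumptions.

```python
GITHUB_PREFIX = "https://github.com/"
DOC_PREFIX = "https://doc.figurl.org/gh/"


def github_to_figurl_doc(github_url):
    """Map a GitHub Markdown file URL to its rendered doc.figurl.org URL.

    Assumption: only the prefix changes, as in the example above.
    """
    if not github_url.startswith(GITHUB_PREFIX):
        raise ValueError("expected a URL starting with " + GITHUB_PREFIX)
    return DOC_PREFIX + github_url[len(GITHUB_PREFIX):]


doc_url = github_to_figurl_doc(
    "https://github.com/dcmnts/isosplit-paper/blob/main/isosplit.md"
)
print(doc_url)
```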


Summary

  • Many powerful visualization tools exist, but frictionless sharing is still a challenge
  • Figurl is an open-source browser-based visualization tool
    • Uses the clickable hyperlink method
    • Reduces friction for sharing interactive visualizations
    • Promotes scientific collaboration

Future directions

  • Develop additional visualization plugins
  • Expand collaborations and attract developers
  • Improve kachery infrastructure

Thank you!

  • Flatiron: Jeff Soules
  • Frank lab: Loren Frank, Eric Denovellis, Kyu Hyun Lee, Alison Comrie, Michael Coulter
  • Allen Institute: Alessio Buccino

https://github.com/flatironinstitute/figurl
https://github.com/flatironinstitute/kachery-cloud

