Flatiron Software
Recent advances in calcium imaging acquisition techniques are creating datasets on the order of terabytes per week. Analyzing this much data in a reasonable amount of time requires algorithms that are efficient in both memory and computation. This project implements a set of essential methods for the calcium imaging movie analysis pipeline. Fast and scalable algorithms are implemented for motion correction, movie manipulation, and source and spike extraction. CaImAn also contains some routines for the analysis of behavior from video cameras. In summary, CaImAn provides a general-purpose tool for handling large movies, with special emphasis on tools for two-photon and one-photon calcium imaging and behavioral datasets.
A computational toolbox for large-scale calcium imaging data analysis. The code implements the CNMF algorithm for simultaneous source extraction and spike inference from large-scale calcium imaging movies. Many more features are included. The code is suitable for the analysis of somatic imaging data. An improved implementation for the analysis of dendritic/axonal imaging data will be added in the future.
Fast sinc transform libraries which compute sums of the sinc and sinc^2 kernels between N arbitrary points in 1, 2, or 3 dimensions. This has applications in MRI and band-limited function approximation. The naive cost is O(N^2), whereas our algorithm is quasi-linear in N. Written by our 2017 summer intern Hannah Lawrence.
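For orientation, the naive quadratic-cost sum that such a library is designed to replace can be sketched in a few lines of Python. This is only an illustration in 1D. The function names are hypothetical, and the unnormalized convention sinc(x) = sin(x)/x is an assumption rather than something stated above.

```python
import math


def sinc(x):
    # Assumed convention: unnormalized sinc, sin(x)/x, with value 1 at x = 0.
    return 1.0 if x == 0.0 else math.sin(x) / x


def direct_sinc_sum(points, weights, squared=False):
    """Naive evaluation of s_i = sum_j w_j * K(x_i - x_j) in 1D.

    K is sinc, or sinc squared when squared=True. Every pair of points
    is visited, so the cost is quadratic in the number of points.
    """
    sums = []
    for xi in points:
        total = 0.0
        for xj, wj in zip(points, weights):
            k = sinc(xi - xj)
            total += wj * (k * k if squared else k)
        sums.append(total)
    return sums
```

For example, `direct_sinc_sum([0.0, math.pi], [1.0, 2.0])` visits all four point pairs; the double loop is the part a fast transform avoids.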
FFT-accelerated interpolation-based t-SNE (FIt-SNE) is an efficient implementation of t-SNE (t-distributed stochastic neighbor embedding) for dimensionality reduction and visualization of high-dimensional datasets. This code can perform 1000 iterations of t-SNE on one million data points in under 2 minutes on a desktop, which is many times faster than any other existing code. Written by Manas Rachh with collaborators at Yale.
Figurl lets you use Python to generate shareable figURLs (permalinks) to interactive visualizations. With minimal configuration, these can be generated from any computer with access to the internet. Data objects required for the visualization are stored in kachery-cloud and are referenced by content-address strings. Domain-specific visualization plugins are also stored in the cloud and are developed using ReactJS. The central website, figurl.org, pairs the visualization plugin with the data object to create the shareable interactive views.
FINUFFT is a set of libraries to efficiently compute three types of nonuniform fast Fourier transform (NUFFT) to a specified precision, in one, two, or three dimensions, on a multi-core shared-memory machine. The library has a very simple interface, does not need any precomputation step, is written in C++ (using OpenMP and FFTW), and has wrappers to C, Fortran, MATLAB, Octave, and Python. As an example, given M arbitrary real numbers x_j and complex numbers c_j, with j=1,…,M, and a requested integer number of modes N, the 1D type-1 (aka “adjoint”) transform evaluates the N numbers f_k = sum_{j=1}^{M} c_j exp(i k x_j), for integers k with -N/2 <= k <= N/2 - 1.
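To make the 1D type-1 transform concrete, here is a direct (slow) evaluation in plain Python. It is a sketch of the mathematical definition only and does not call FINUFFT. The function name is hypothetical, and the formula f_k = sum_j c_j exp(i k x_j) with modes k = -N/2, …, N/2 - 1 and a positive exponent sign is the assumed convention.

```python
import cmath


def nudft_type1(x, c, n_modes):
    """Directly evaluate f_k = sum_j c_j * exp(1j * k * x_j).

    x: nonuniform real points, c: complex strengths (same length as x).
    Returns n_modes values for k = -(n_modes // 2), ..., in increasing k.
    The cost is O(n_modes * len(x)); a NUFFT approximates the same
    numbers to a requested precision at much lower cost.
    """
    k_min = -(n_modes // 2)
    return [
        sum(cj * cmath.exp(1j * k * xj) for xj, cj in zip(x, c))
        for k in range(k_min, k_min + n_modes)
    ]
```

A direct evaluation like this is mainly useful as a reference for checking the accuracy of a fast transform on small problems.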
FMM3D is a set of libraries to compute N-body interactions governed by the Laplace and Helmholtz equations, to a specified precision, in three dimensions, on a multi-core shared-memory machine. The 3D fast multipole method evaluates potentials (and gradients, etc) at a large number of targets due to a large number of sources, in linear or quasi-linear time. Our implementation exploits efficient plane wave expansions, SIMD-accelerated kernel evaluations, and multi-threading.
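As a point of comparison, the direct sum that a fast multipole method accelerates can be written as a double loop. The sketch below is plain Python and does not use FMM3D. The function name is hypothetical, and the unscaled Laplace kernel 1/r is an assumption (libraries differ in whether they include a 1/(4π) factor).

```python
import math


def direct_laplace_potential(sources, charges, targets):
    """Evaluate u(t) = sum_j q_j / |t - s_j| at each target t in 3D.

    sources and targets are sequences of (x, y, z) tuples. The cost is
    O(len(sources) * len(targets)), which is what an FMM reduces to
    linear or quasi-linear time.
    """
    potentials = []
    for t in targets:
        total = 0.0
        for s, q in zip(sources, charges):
            r = math.dist(t, s)
            if r > 0.0:  # skip a source that coincides with the target
                total += q / r
        potentials.append(total)
    return potentials
```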
IronClust is a fast and drift-resistant spike sorting pipeline. The accuracy of spike sorting is validated by multiple ground-truth datasets from a number of contributing labs. IronClust can take advantage of a GPU or a compute cluster if available. IronClust requires MATLAB with the image, parallel, and signal processing toolboxes. IronClust supports Windows, Mac, and Linux.
ISO-SPLIT is an efficient clustering algorithm that handles an unknown number of unimodal clusters in low to moderate dimension, without any user-adjustable parameters. It is based on repeated tests for unimodality (using isotonic regression and a modified Hartigan dip test) applied to 1D projections of pairs of putative clusters. It handles non-Gaussian clusters of widely varying densities and populations well, and in such settings has been shown to outperform K-means variants, Gaussian mixture models, and density-based methods.
This repository contains an efficient single-threaded implementation in C++, with a MATLAB/MEX interface.
It was invented and coded by Jeremy Magland, with contributions to the algorithm and tests by Alex Barnett, at SCDA/Flatiron Institute.
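Isotonic regression, one of the building blocks named in the ISO-SPLIT description, can be illustrated with the classic pool-adjacent-violators approach. The sketch below is a generic least-squares non-decreasing fit in plain Python. It is not ISO-SPLIT's unimodality test or code from the repository, and the function name is hypothetical.

```python
def isotonic_regression(y):
    """Least-squares non-decreasing fit to the sequence y.

    Uses pool-adjacent-violators: keep a stack of blocks, each holding
    the sum and count of the values it covers, and merge the last two
    blocks whenever their means are out of order.
    """
    blocks = []  # each block is [sum_of_values, number_of_values]
    for value in y:
        blocks.append([float(value), 1])
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit
```

For instance, `isotonic_regression([1, 3, 2, 4])` pools the out-of-order pair 3, 2 into their mean and leaves the other values unchanged.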
Kachery-cloud is a network for sharing scientific data files, live feeds, mutable data and calculation results between lab computers and browser-based user interfaces. Resources are organized into projects which are accessed via registered Python clients. Using simple Python commands you can store files, data objects, mutables or live feeds, and then retrieve or access these on a remote machine (or in a browser via JavaScript) by referencing universal URI strings. In the case of static content, URIs are essentially content hashes, thus forming a content-addressable storage database. While the primary purpose of kachery-cloud at this time is to support figurl, it can also be used independently in collaborative scientific research workflows and for improving scientific reproducibility and dissemination.
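The idea of content-addressable storage can be sketched with the Python standard library alone. The example below is not the kachery-cloud API: the `ContentStore` class, its in-memory dictionary, and the choice of a SHA-1 digest with a `sha1://` prefix are assumptions made purely for illustration.

```python
import hashlib


class ContentStore:
    """Toy in-memory store where the URI of a blob is derived from its bytes."""

    def __init__(self):
        self._blobs = {}

    def store(self, data: bytes) -> str:
        # Identical content always produces the same URI.
        digest = hashlib.sha1(data).hexdigest()
        self._blobs[digest] = data
        return "sha1://" + digest

    def load(self, uri: str) -> bytes:
        digest = uri.split("://", 1)[1]
        data = self._blobs[digest]
        # The address doubles as an integrity check on retrieval.
        if hashlib.sha1(data).hexdigest() != digest:
            raise ValueError("stored content does not match its address")
        return data
```

Because the address is a function of the content, storing the same bytes twice yields one entry, and anyone holding the URI can verify that what they retrieved is what was originally stored.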
MountainSort is spike sorting software developed by Jeremy Magland, Alex Barnett, and Leslie Greengard at the Center for Computational Biology, Flatiron Institute, in close collaboration with Jason Chung and Loren Frank at the UCSF Department of Physiology. MountainSort is a plugin package to MountainLab, a general framework for scientific data analysis, sharing, and visualization.
MountainLab is data processing, sharing, and visualization software for scientists. It is built around MountainSort, a spike sorting algorithm, but is designed to be more generally applicable.
Riccati is an efficient numerical solver for a class of ordinary differential equations whose solutions may oscillate extremely rapidly. Standard routines available from scientific computing libraries typically struggle with these types of equations: their runtime grows with the oscillation frequency. Riccati, by contrast, can achieve a frequency-independent (constant) runtime. The package is written in Python, complete with documentation, tests, and interactive examples. It implements the robust algorithm described in Agocs & Barnett (2022), which can switch between two different numerical methods on the fly, adapting to the behavior of the solution, and can choose its own stepsize and other parameters to achieve a user-specified accuracy.
SpikeForest is a reproducible, continuously updating platform which benchmarks the performance of spike sorting codes across a large curated database of electrophysiological recordings with ground truth. It consists of this website for presenting our up-to-date findings, a Python package which contains the tools for running the SpikeForest analysis, and an expanding collection of electrophysiology recordings with ground-truth spiking information.
Stan is an open-source platform for statistical modeling and high-performance statistical computation. Users rely on Stan for statistical modeling, data analysis, and prediction in the social, biological, and physical sciences, engineering, business, medicine, finance, education, and sports. Users specify log density functions in Stan’s probabilistic programming language and get full Bayesian statistical inference with MCMC sampling (NUTS, HMC), approximate Bayesian inference with variational inference (ADVI), and penalized maximum likelihood estimation with optimization and Laplace approximation (L-BFGS). Stan’s math library provides differentiable real and complex special functions, probability functions, and linear algebra using reverse- and forward-mode automatic differentiation. Stan has interfaces supporting statistical workflow in R, Python, and Julia.