Flatiron Software

CaImAn Python

Recent advances in calcium imaging acquisition techniques are creating datasets on the order of terabytes per week. Memory- and computationally efficient algorithms are required to analyze terabytes of data in a reasonable amount of time. This project implements a set of essential methods required in the calcium imaging movie analysis pipeline. Fast and scalable algorithms are implemented for motion correction, movie manipulation, and source and spike extraction. CaImAn also contains some routines for the analysis of behavior from video cameras. In summary, CaImAn provides a general-purpose tool for handling large movies, with special emphasis on tools for two-photon and one-photon calcium imaging and behavioral datasets.

CaImAn-MATLAB

A computational toolbox for large-scale calcium imaging data analysis. The code implements the CNMF (constrained nonnegative matrix factorization) algorithm for simultaneous source extraction and spike inference from large-scale calcium imaging movies. Many more features are included. The code is suitable for the analysis of somatic imaging data. An improved implementation for the analysis of dendritic/axonal imaging data will be added in the future.

Chunkflow

Modern imaging methods, such as light microscopy, electron microscopy, and synchrotron X-ray imaging, have enabled high-resolution 3D imaging of large samples. As a result, more and more terabyte-scale or even petabyte-scale image volumes are being produced. Traditional software that runs on a single computer can no longer handle them, and distributed computing, especially cloud computing, is usually preferred. At the same time, diverse scientific tasks have led to a variety of image processing pipelines, which nevertheless usually share some common operations. Chunkflow is designed to tackle these challenges. The image volume is decomposed into chunks that are distributed across computation nodes. Thanks to its hybrid cloud architecture, users can run tasks on a local cluster and a public cloud at the same time, using both CPUs and GPUs. Currently, over fifty operators can be composed on the command line to build customized pipelines instantly. Users can also easily plug in their own Python code as a new operator. Chunkflow was built in practical projects and has already been used to produce over 18 petabytes of result volumes. The largest deployment so far used over 3300 GPU instances in Google Cloud across three regions, and Chunkflow remained robust and reliable.
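
The chunk decomposition described above can be illustrated with a minimal sketch. This is not Chunkflow's API: the function name `chunk_bounds` and the example volume and chunk shapes are assumptions chosen for illustration. It only shows how a 3D volume can be tiled into bounding boxes that could each be handed to a separate worker.

```python
from itertools import product


def chunk_bounds(volume_shape, chunk_shape):
    """Yield (starts, stops) index tuples that tile volume_shape.

    Hypothetical helper for illustration; not part of Chunkflow.
    """
    per_axis = []
    for size, step in zip(volume_shape, chunk_shape):
        # Chunks at the far edge are clipped so they never exceed the volume.
        per_axis.append([(lo, min(lo + step, size)) for lo in range(0, size, step)])
    for box in product(*per_axis):
        starts = tuple(lo for lo, _ in box)
        stops = tuple(hi for _, hi in box)
        yield starts, stops


# Example: a (z, y, x) volume of 100 x 512 x 512 voxels in 64 x 256 x 256 chunks.
chunks = list(chunk_bounds((100, 512, 512), (64, 256, 256)))
for starts, stops in chunks:
    print(starts, stops)
```

Each bounding box is independent of the others, which is what makes this kind of task easy to distribute across nodes.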

RealNeuralNetworks.jl

Because neurons and blood vessels are string-like, they can be abstracted as curved tubes with centerlines and radii. This representation can be used for morphological analysis, such as measuring path length and branching angle. Given an accurate voxel segmentation, the computation of object centerlines and radii is called skeletonization, and RealNeuralNetworks.jl is developed to do that. Unlike most related packages, it combines the synaptic connectivity graph with morphological features and can be used to explore the relationship between synaptic connectivity and morphology. Julia is a relatively new programming language that is becoming popular in data science. RealNeuralNetworks.jl is a Julia package, and its algorithms are written from scratch for efficiency and to minimize dependencies.

Fast sinc transform library

Fast sinc transform libraries which compute sums of the sinc and sinc² kernels between N arbitrary points in 1, 2, or 3 dimensions. This has applications in MRI and band-limited function approximation. The naive cost is O(N²), whereas our algorithm is quasi-linear in N. Written by our 2017 summer intern Hannah Lawrence.
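
For reference, the naive quadratic-cost sum that a fast transform is meant to replace can be sketched in a few lines for the 1D case. This is a brute-force illustration, not the library's interface: the function names and the unnormalized convention sinc(x) = sin(x)/x are assumptions made here.

```python
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the value 1 at x = 0 (assumed convention)."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def direct_sinc_sums(points, weights, squared=False):
    """Brute-force s_i = sum_j w_j * k(x_i - x_j), with k = sinc or sinc^2.

    The double loop visits every pair of points, so the cost grows as N^2.
    """
    sums = []
    for xi in points:
        total = 0.0
        for xj, wj in zip(points, weights):
            k = sinc(xi - xj)
            total += wj * (k * k if squared else k)
        sums.append(total)
    return sums


# Example with two points in 1D.
print(direct_sinc_sums([0.0, math.pi], [1.0, 2.0]))
```

A direct sum like this is mainly useful as a correctness check on small inputs.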

FFT-accelerated Interpolation-based t-SNE

FFT-accelerated interpolation-based t-SNE (FIt-SNE) is an efficient implementation of t-SNE (t-distributed stochastic neighbor embedding) for dimensionality reduction and visualization of high-dimensional datasets. This code can perform 1000 iterations of t-SNE on one million data points in under 2 minutes on a desktop, which is many times faster than any other existing code. Written by Manas Rachh with collaborators at Yale.

Figurl

Figurl lets you use Python to generate shareable figURLs (permalinks) to interactive visualizations. With minimal configuration, these can be generated from any computer with access to the internet. Data objects required for the visualization are stored in kachery-cloud and are referenced by content-address strings. Domain-specific visualization plugins are also stored in the cloud and are developed using ReactJS. The central website, figurl.org, pairs the visualization plugin with the data object to create the shareable interactive views.

FINUFFT

FINUFFT is a set of libraries to efficiently compute three types of nonuniform fast Fourier transform (NUFFT) to a specified precision, in one, two, or three dimensions, on a multi-core shared-memory machine. The library has a very simple interface, does not need any precomputation step, is written in C++ (using OpenMP and FFTW), and has wrappers to C, Fortran, MATLAB, Octave, and Python. As an example, given M arbitrary real numbers $x_j$ and complex numbers $c_j$, with $j = 1, \dots, M$, and a requested integer number of modes N, the 1D type-1 (aka “adjoint”) transform evaluates the N numbers $f_k = \sum_{j=1}^{M} c_j e^{i k x_j}$ for integers $-N/2 \le k \le N/2 - 1$.
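
To make the 1D type-1 transform concrete, here is a direct (slow) evaluation of the sum of c_j exp(i k x_j) over the nonuniform points, for N consecutive integer modes k starting at -N/2. This is a reference sketch only, not FINUFFT's interface: the function name `nudft_type1`, the positive sign in the exponent, and the mode ordering are assumptions made here, and the cost is O(NM) rather than that of a fast transform.

```python
import cmath
import math


def nudft_type1(x, c, n_modes):
    """Directly evaluate f_k = sum_j c_j * exp(i * k * x_j).

    Modes run over n_modes consecutive integers starting at -(n_modes // 2)
    (assumed ordering). Hypothetical reference routine, not a library call.
    """
    k_min = -(n_modes // 2)
    result = []
    for k in range(k_min, k_min + n_modes):
        f_k = sum(cj * cmath.exp(1j * k * xj) for xj, cj in zip(x, c))
        result.append(f_k)
    return result


# Example: one unit-strength point at x = pi/2, four modes k = -2, -1, 0, 1.
modes = nudft_type1([math.pi / 2], [1.0 + 0.0j], 4)
for k, f_k in zip(range(-2, 2), modes):
    print(k, f_k)
```

A brute-force routine like this is handy for checking the output of a fast library on small problems.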

FMM3D

FMM3D is a set of libraries to compute N-body interactions governed by the Laplace and Helmholtz equations, to a specified precision, in three dimensions, on a multi-core shared-memory machine. The 3D fast multipole method evaluates potentials (and gradients, etc.) at a large number of targets due to a large number of sources, in linear or quasi-linear time. Our implementation exploits efficient plane wave expansions, SIMD-accelerated kernel evaluations, and multi-threading.
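
The N-body sum that a fast multipole method accelerates can be written directly as a double loop over targets and sources. The sketch below does this for the Laplace case. It is not FMM3D's interface: the function name is hypothetical and the kernel is taken as q/r with no normalization constant, which is an assumption, since conventions differ between libraries.

```python
import math


def direct_laplace_potential(sources, charges, targets):
    """Brute-force u(t) = sum_j q_j / |t - s_j| at each target t.

    Cost is (number of targets) x (number of sources). Kernel normalization
    (plain 1/r) is an assumed convention for this illustration.
    """
    potentials = []
    for t in targets:
        total = 0.0
        for s, q in zip(sources, charges):
            r = math.dist(t, s)
            if r > 0.0:  # skip a source that coincides with the target
                total += q / r
        potentials.append(total)
    return potentials


# Example: two point charges evaluated at one target.
u = direct_laplace_potential(
    sources=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    charges=[1.0, 2.0],
    targets=[(0.0, 0.0, 1.0)],
)
print(u)
```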

IronClust

IronClust is a fast and drift-resistant spike sorting pipeline. The accuracy of spike sorting is validated by multiple ground-truth datasets from a number of contributing labs. IronClust can take advantage of a GPU or a compute cluster if available. IronClust requires MATLAB with the Image Processing, Parallel Computing, and Signal Processing toolboxes. IronClust supports Windows, Mac, and Linux.

ISO-SPLIT

ISO-SPLIT is an efficient clustering algorithm that handles an unknown number of unimodal clusters in low to moderate dimension, without any user-adjustable parameters. It is based on repeated tests for unimodality, using isotonic regression and a modified Hartigan dip test, applied to 1D projections of pairs of putative clusters. It handles non-Gaussian clusters of widely varying densities and populations well, and in such settings has been shown to outperform K-means variants, Gaussian mixture models, and density-based methods.
This repository contains an efficient single-threaded implementation in C++, with a MATLAB/MEX interface.
It was invented and coded by Jeremy Magland, with contributions to the algorithm and tests by Alex Barnett, at SCDA/Flatiron Institute.

Kachery-cloud

Kachery-cloud is a network for sharing scientific data files, live feeds, mutable data and calculation results between lab computers and browser-based user interfaces. Resources are organized into projects which are accessed via registered Python clients. Using simple Python commands you can store files, data objects, mutables or live feeds, and then retrieve or access these on a remote machine (or in a browser via JavaScript) by referencing universal URI strings. In the case of static content, URIs are essentially content hashes, thus forming a content-addressable storage database. While the primary purpose of kachery-cloud at this time is to support figurl, it can also be used independently in collaborative scientific research workflows and for improving scientific reproducibility and dissemination.
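
The idea of content-addressable storage, where the URI of static content is derived from a hash of the content itself, can be sketched with a small in-memory store. This is not the kachery-cloud client API: the class `ContentStore`, its methods, the choice of SHA-1, and the `sha1://` URI layout are all assumptions made for illustration.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}

    def store(self, data: bytes) -> str:
        """Save data and return a URI built from its content hash."""
        digest = hashlib.sha1(data).hexdigest()
        self._blobs[digest] = data
        return f"sha1://{digest}"

    def load(self, uri: str) -> bytes:
        """Fetch data by URI and verify that it still matches its hash."""
        scheme, _, digest = uri.partition("://")
        if scheme != "sha1":
            raise ValueError(f"unsupported URI scheme: {scheme!r}")
        data = self._blobs[digest]
        if hashlib.sha1(data).hexdigest() != digest:
            raise ValueError("stored content does not match its address")
        return data


store = ContentStore()
uri = store.store(b"example spike train data")
print(uri)
print(store.load(uri))
```

Because the address depends only on the bytes, storing identical content twice yields the same URI, and any reader can check that what it retrieved is what was referenced.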

SkellySim

SkellySim is a package for simulating cellular components such as flexible filaments, motor proteins, and arbitrary rigid bodies. It is designed to be highly scalable, supporting both OpenMP- and MPI-style parallelism, and uses the efficient STKFMM/PVFMM libraries for hydrodynamic resolution.

Gala

Galactic Dynamics is the study of the formation, history, and evolution of galaxies using the orbits of objects — numerically-integrated trajectories of stars, dark matter particles, star clusters, or galaxies themselves. Gala is an Astropy-affiliated Python package that aims to provide efficient tools for performing common tasks needed in Galactic Dynamics research. Much of this code uses Python for flexible, user-friendly interfaces that interact with wrappers around low-level code (primarily C) to enable fast computations. Common operations include gravitational potential and force evaluations, orbit integrations, dynamical coordinate transformations, and computing chaos indicators for nonlinear dynamics. Gala heavily uses the units and astronomical coordinate systems defined in the Astropy core package.
