SMS scnews item created by Dario Strbenac at Thu 26 Aug 2021 0900
Type: Seminar
Distribution: World
Expiry: 30 Sep 2021
Calendar1: 30 Aug 2021 1300-1330
CalLoc1: Zoom videoconferencing
Auth: dario@ (dstr7320) in SMS-SAML

Statistical Bioinformatics Webinar: Gao -- Efficient Non-negative Matrix Factorisation for Incrementally Increasing Data Sets

Presented by Mr. Chao Gao (University of Michigan, USA)

Integrating large single-cell gene expression, chromatin accessibility and DNA
methylation datasets requires general and scalable computational approaches.  Here we
describe online integrative non-negative matrix factorization (iNMF), an algorithm for
integrating large, diverse and continually arriving single-cell datasets.  Our approach
scales to arbitrarily large numbers of cells using fixed memory, iteratively
incorporates new datasets as they are generated and allows many users to simultaneously
analyze a single copy of a large dataset by streaming it over the internet.  Iterative
data addition can also be used to map new data to a reference dataset.  Comparisons with
previous methods indicate that the improvements in efficiency do not sacrifice dataset
alignment and cluster preservation performance.  We demonstrate the effectiveness of
online iNMF by integrating more than 1 million cells on a standard laptop, integrating
large single-cell RNA sequencing and spatial transcriptomic datasets, and iteratively
constructing a single-cell multi-omic atlas of the mouse motor cortex.