Presenter: Ren Yi, Ph.D. Candidate, New York University
Title: Multitask learning methods for efficient learning of data poor problems in biology
Abstract: Recent advances in computational genomics have benefited largely from the explosion of high-throughput genomic data and from the exponential growth of biological databases. However, as more sequencing technologies become available and large genomic consortia begin to crowdsource data from larger cohorts of research groups, data heterogeneity has become an increasingly prominent issue. Integrating data across multiple sources is becoming particularly important for a growing number of biological systems. Biological data are typically highly skewed towards a small number of model organisms, factors, and conditions for which wet lab experiments have higher success rates. This skew introduces further technical challenges when building machine learning models for data-poor problems. Today I will use cell-type-specific transcription factor binding prediction as an example to show how an effective multitask learning strategy can improve learning on data-poor problems by drawing on data-rich ones.