Statistical modeling of isoform splicing dynamics from RNA-seq time series data (1602.06317v1)
Abstract: Isoform quantification is an important goal of RNA-seq experiments, yet it remains problematic for genes with low expression or several isoforms. These difficulties may in principle be ameliorated by exploiting correlated experimental designs, such as time series or dosage response experiments. Time series RNA-seq experiments, in particular, are becoming increasingly popular, yet there are no methods that explicitly leverage the experimental design to improve isoform quantification. Here we present DICEseq, the first isoform quantification method tailored to correlated RNA-seq experiments. DICEseq explicitly models the correlations between different RNA-seq experiments to aid the quantification of isoforms across experiments. Numerical experiments on simulated data sets show that DICEseq yields more accurate results than state-of-the-art methods, an advantage that can become considerable at low coverage levels. On real data sets, our results show that DICEseq provides substantially more reproducible and robust quantifications, increasing the correlation of estimates from replicate data sets by up to 10% on genes with low or moderate expression levels (bottom third of all genes). Furthermore, DICEseq makes it possible to quantify the trade-off between temporal sampling of RNA and depth of sequencing, frequently an important choice when planning experiments. Our results have strong implications for the design of RNA-seq experiments, and offer a novel tool for improved analysis of such data sets. Python code is freely available at http://diceseq.sf.net.