Modeling and automation of sequencing-based characterization of RNA structure

Sharon Aviran, Cole Trapnell, Julius B. Lucks, Stefanie A. Mortimer, Shujun Luo, Gary P. Schroth, Jennifer A. Doudna, Adam P. Arkin, Lior Pachter.
Proceedings of the National Academy of Sciences (2011)

Abstract

Sequence census methods reduce molecular measurements such as transcript abundance and protein-nucleic acid interactions to counting problems via DNA sequencing. We focus on a novel assay utilizing this approach, called selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq), that can be used to characterize RNA secondary and tertiary structure. We describe a fully automated data analysis pipeline for SHAPE-Seq analysis that includes read processing, mapping, and structural inference based on a model of the experiment. Our methods rely on the solution of a series of convex optimization problems for which we develop efficient and effective numerical algorithms. Our results can be easily extended to other chemical probes of RNA structure, and also generalized to modeling polymerase drop-off in other sequence census-based experiments.