To the Editor:
Recent progress in genome editing technologies, in particular the CRISPR-Cas9 nuclease system, has provided new opportunities to investigate the biological functions of genomic sequences by targeted mutagenesis (1-4). Double-strand breaks (DSBs) arising from site-specific Cas9 cleavage can be resolved by non-homologous end joining (NHEJ) or homology-directed repair (HDR), which result in a spectrum of outcomes, including insertions, deletions, nucleotide substitutions and, in the case of HDR, recombination of extrachromosomal donor sequences (1-3,5,6). Deep sequencing of amplified genomic regions or whole genome sequencing (WGS) allows quantitative and sensitive detection of targeted mutations. However, to date, no standard analytic tool has been developed to systematically enumerate and visualize these events, resulting in inconsistencies among different experiments and across laboratories. Challenging issues for the interpretation of CRISPR-Cas9-edited sequences include amplification or sequencing errors, experimental variation in sequence quality, ambiguous alignment of variable length indels, deconvolution of mixed HDR-NHEJ outcomes, and analytical complexities resulting from large WGS data sets and pooled experiments where many different target sites are present in a single sequencing library. To address these issues and help standardize data analysis, we developed CRISPResso, a robust and easy-to-use computational pipeline (Supplementary Note 1 and Supplementary Fig. 1). CRISPResso enables accurate quantification and visualization of CRISPR-Cas9 outcomes, as well as comprehensive evaluation of effects on coding sequences, non-coding elements and selected off-target sites.
CRISPResso is a suite of computational tools to qualitatively and quantitatively evaluate the outcomes of genome-editing experiments in which target loci are subject to deep sequencing. It provides an integrated, user-friendly interface that can be operated by biologists and bioinformaticians alike (Supplementary Figs. 1 and 2). Compared with existing tools (7), CRISPResso offers several notable features, including the following: batch sample analysis by command line interface; integration with other pipelines; tunable parameters of sequence quality and alignment fidelity; discrete measurement of insertions, deletions and nucleotide substitutions (which are ignored by other methods); tunable windows around the cleavage site to minimize false-positive classification; quantification of frameshift versus in-frame coding mutations; and distinction between NHEJ, HDR and mixed mutation events. CRISPResso automates the following steps: first, filtering low-quality reads; second,...
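Two of the features listed above, the tunable window around the cleavage site and the frameshift versus in-frame distinction, can be illustrated with a minimal Python sketch. This is not CRISPResso's implementation: the `Indel` record, the `overlaps_window` and `classify_read` helpers, and the `half_window` parameter are hypothetical names introduced here, and the sketch assumes that the amplicon lies entirely within coding sequence and that indels have already been called from an alignment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Indel:
    """Hypothetical record for one insertion or deletion in an aligned read."""
    start: int    # 0-based reference position where the event begins
    ref_len: int  # reference bases removed (0 for a pure insertion)
    alt_len: int  # bases inserted (0 for a pure deletion)


def overlaps_window(indel: Indel, cut_site: int, half_window: int) -> bool:
    """Return True if the event touches [cut_site - half_window, cut_site + half_window)."""
    window_start = cut_site - half_window
    window_end = cut_site + half_window
    # A pure insertion removes no reference bases; as a simplification it is
    # treated as occupying the single reference position at `start`.
    event_end = indel.start + max(indel.ref_len, 1)
    return indel.start < window_end and event_end > window_start


def classify_read(indels: List[Indel], cut_site: int, half_window: int) -> str:
    """Label a read using only the indels that fall inside the window."""
    kept = [i for i in indels if overlaps_window(i, cut_site, half_window)]
    if not kept:
        # Events far from the cut site are assumed to be sequencing or PCR
        # noise and do not count as editing.
        return "unmodified"
    # The net length change decides whether the reading frame is preserved.
    net_change = sum(i.alt_len - i.ref_len for i in kept)
    return "in-frame" if net_change % 3 == 0 else "frameshift"


if __name__ == "__main__":
    reads = {
        "3-bp deletion at cut site": [Indel(100, 3, 0)],
        "1-bp deletion near cut site": [Indel(98, 1, 0)],
        "2-bp deletion far from cut site": [Indel(10, 2, 0)],
    }
    for name, indels in reads.items():
        print(name, "->", classify_read(indels, cut_site=100, half_window=5))
```

Widening `half_window` makes the classification more permissive, while narrowing it discards more distal events as noise; the trade-off is the one the tunable window is described as addressing.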