Author(s): Jasmin H Bavarva, Megha J Bavarva, Enusha Karunasena
drug discovery; next-generation sequencing; single-cell sequencing; technology development
Next-generation sequencing (NGS) technologies have given us the opportunity to explore genomes in unprecedented detail. Alongside these advances, NGS (much like all technology) has its own challenges, creating a plethora of analytical and biological considerations that must be addressed before these methods can be standardized. Meanwhile, single-cell sequencing has become a recent trend among scientists who are optimistic that this approach will better characterize cancer (and other similarly complex diseases) [1,2]. The technology necessary to sequence a complete genome from a 'single' cell has been available for some time, but now, with growing popularity and mainstream acceptance, these methods are soon to become commonplace tools among the many existing NGS technologies. Will this technique be the missing link that scaffolds data between genome sequencing methods, which currently contribute volumes of data but only nominal insight into therapies? To address this question, we review current trends in sequencing technology and their relevance to novel therapies.
Before we place our confidence in the idea that the genome of a single cell is informative and descriptive enough to define multifactorial diseases (e.g., cancer), the technology and the science derived from it must be rigorously tested and validated, a process that requires time and effort. DNA amplification is fundamental to all current sequencing technologies, and for single-cell sequencing this step is critical. However, DNA amplification is well known for introducing errors, and these technical imperfections are impossible to avoid [3]. High-fidelity polymerase has a low error rate of ~1 × 10⁻⁵ errors per nucleotide [4]. Yet this rate is striking when expressed in simpler terms: ~30,000 base substitution errors introduced per human genome (~3 × 10⁹ bases) amplified in a single PCR cycle. These errors accumulate with the number of cycles, and still more are introduced in regions prone to polymerase slippage, such as repeat regions, further raising the error rate. To overcome this hurdle one could sequence each strand multiple times. The caveat is that this requires multiple strands, and producing them through amplification introduces new errors, so that natural irregularities are compounded by artificial ones, all before sequencing begins. Additionally, single-cell sequencing is complicated by the lack of experimental replicates, so variability between cells may be difficult to substantiate. As a result, during analysis of these data, an artifactual variation might pass as a 'natural' characteristic of the cell, with no clear scientific procedure to determine its validity in the absence of reasonable controls (i.e., biological replicates). The next challenge is the question of how well the sequence we determine represents the real sequence.
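The arithmetic behind the ~30,000 figure can be sketched as follows. This is a minimal illustration, not a model of PCR: it assumes a haploid human genome of roughly 3 × 10⁹ bases (a value implied by, but not stated alongside, the quoted figure) and treats error accumulation along one lineage of copies as linear in the number of cycles. The function name and constants are hypothetical and chosen only for this example.

```python
# Assumption: haploid human genome of ~3e9 bases; the text quotes only
# the polymerase error rate and the resulting ~30,000 errors.
GENOME_SIZE_BASES = 3_000_000_000
ERROR_RATE_PER_NUCLEOTIDE = 1e-5  # high-fidelity polymerase rate quoted in the text


def expected_errors_per_copy(genome_size: int, error_rate: float, cycles: int = 1) -> float:
    """Expected substitution errors on a genome copy that has been
    re-copied `cycles` times in succession (linear approximation that
    ignores repeat-region slippage and repeated hits at the same site)."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return genome_size * error_rate * cycles


if __name__ == "__main__":
    for cycles in (1, 10, 30):
        errors = expected_errors_per_copy(
            GENOME_SIZE_BASES, ERROR_RATE_PER_NUCLEOTIDE, cycles
        )
        print(f"{cycles:>2} cycle(s): ~{round(errors):,} expected substitution errors")
```

Under these assumptions a single cycle gives about 30,000 expected substitutions, and the count grows in proportion to the number of cycles. Real amplification would deviate from this, particularly in the slippage-prone repeat regions mentioned above.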
Unfortunately, this problem persists not only in existing NGS technologies but also in Sanger sequencing, which is more than two decades old and considered a gold standard [5,6]. Accuracy rates of 100% are impractical to expect; thus, biological interpretations are dependent upon statistical methods to...