Boston Children's Hospital, Massachusetts, United States
Abstract:
Introduction: Over two-thirds of patients with potential genetic diseases remain without an explanation after undergoing next-generation sequencing (NGS). The lack of a diagnosis can result from the inability of NGS to detect all variants, or from candidate variants lacking sufficient clinical or molecular evidence to establish pathogenicity. Short read length is a fundamental technical limitation of NGS data and complicates the accurate detection of variants in structurally complex loci of the human genome. As a result, NGS analyses are typically restricted to profiling the “accessible” genome, ignoring duplicated and repetitive regions as well as variant classes known to underlie many diseases, such as complex rearrangements. Furthermore, many genomic variants are challenging to interpret without complementary molecular information, including DNA methylation and RNA expression (e.g. mRNA sequencing) data.
Long-read sequencing (LRS) technologies have the potential to address these limitations by generating long nucleotide sequences that resolve complex loci, provide a complete picture of the human genome, and yield a haplotype-phased methylation map of relevant disease loci. In this project, we investigate the application of LRS in three families undergoing the diagnostic odyssey.
Methods: Three families (9 individuals in total; each family consisted of a proband, mother, and father) were selected for long-read multi-omics evaluation. Families had new research draws of 2 EDTA tubes and 1 PAXgene tube. All individuals underwent short-read whole-genome sequencing (WGS), short-read RNA sequencing (cDNA from polyA selection), short-read mitochondrial genome sequencing, LRS of the genome and methylome, LRS of the transcriptome (cDNA from polyA selection), LRS of the mitochondrial genome, and EPIC DNA methylation microarray profiling. Multiple classes of variants were detected using the long-read data, with a focus on de novo variant identification. The EPIC DNA methylation microarray data were compared with nanopore methylation-frequency data at varying read depths to assess the accuracy and dynamic range of 5-methylcytosine (5mC) detection.
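As an illustration of the array-versus-nanopore comparison described in the Methods, the following is a minimal sketch, not the pipeline used in this study. It assumes that array beta values and nanopore per-CpG read counts have already been matched by CpG identifier; the function names, input layout, and the choice of Pearson correlation as the concordance metric are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (NaN if undefined)."""
    n = len(xs)
    if n < 2:
        return float("nan")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def concordance_by_depth(array_beta, nanopore_calls, min_depths):
    """Compare array beta values with nanopore methylation frequencies.

    array_beta:     dict mapping CpG id -> beta value in [0, 1] (hypothetical layout)
    nanopore_calls: dict mapping CpG id -> (methylated_reads, total_reads)
    min_depths:     iterable of minimum read-depth thresholds to evaluate

    Returns a dict mapping each threshold to (number of CpGs kept, Pearson r).
    """
    results = {}
    for depth in min_depths:
        betas, freqs = [], []
        for cpg, beta in array_beta.items():
            call = nanopore_calls.get(cpg)
            if call is None:
                continue  # CpG not covered by nanopore reads
            methylated, total = call
            if total == 0 or total < depth:
                continue  # below the read-depth threshold
            betas.append(beta)
            freqs.append(methylated / total)
        results[depth] = (len(betas), pearson(betas, freqs))
    return results
```

Evaluating several thresholds side by side shows how many CpG sites are retained and how closely the two platforms agree as the required read depth increases.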
Results: We identified likely causal variants in 2 of 3 families using this multi-omics approach. Transcriptome profiling supported the classification of a potential splice-altering variant in family 1, illustrating the utility of multi-omics profiling in establishing a diagnosis. For the third family, the comprehensive evaluation allowed for the elimination of several candidate conditions and prevented unnecessary additional medical interventions.
Conclusion: For patients and families on the diagnostic odyssey, a comprehensive evaluation using multiple techniques may result in a diagnosis. Even when it does not, a multi-omics approach can facilitate the exclusion of erroneous diagnoses. This up-front testing investment can therefore prevent unnecessary interventions.