- © Golder Wilson, et al.
A Protocol for Qualifying DNA Variants Associated with Complex Diseases like Ehlers-Danlos syndrome
- Golder Wilson 1,2
- Vijay Tonk 1,2
- 1 - Texas Tech University Health Science Center
- 2 - Lubbock
Jun 28, 2021
Abstract
A protocol for DNA variant qualification adds medical perspective to consensus grading based on impact of the DNA change on its encoded RNA/protein and its prevalence in normal people versus those with the implicated disease. This clinically familiar 1-4+ qualification can encourage physician acceptance of genomic analysis, often delayed by equivocal qualifications like ‘variant of uncertain significance’ that are of little help for patient management. Clinical qualification is particularly important as genomics progresses beyond targeted DNA sequencing for single-gene diseases and provides diagnostic utility for polygenic, multifactorial diseases like Ehlers-Danlos syndrome. Clinical qualification requires appreciation of disease etiology rather than symptoms and of aggregate rather than single DNA changes. These are necessary perspectives if genomics is to attain its remarkable potential for precision medicine.
Introduction
As microscopically visible changes in chromosomes contributed laboratory support for diagnoses like Down syndrome, and submicroscopic beta-globin gene change complemented histologic evidence for sickle cell anemia, physicians and specialists added chromosome and DNA change to their diagnostic menus (Wyandt et al., 2017). Very different approaches were required when technology enabled multiplex tests: the sequential multi-channel analysis with computer-20 (SMAC20) combined familiar tests with unexpected results, and the newborn metabolic screen placed unfamiliar tests with unusual results under physician responsibility. Physicians were adept at interpreting electrolytes or cholesterol and acquainted with metabolic disease and referrals, but the transition to genomic screening of all chromosomal regions for extra or missing DNA (by microarray analysis, analogous to looking at a book for extra or missing pages) or to screening all genes for DNA sequence change (by NextGen sequencing, analogous to proofing the book text for typos) brought them into the alien world of nucleotide and amino acid notations that is exemplified below (Wyandt et al., 2017). Particularly daunting was the lack of the usual pathological characterization of DNA results as high, low, out-of-range, or normal. Scanning all human genes for altered sequences at rates of a billion nucleotides per hour was made possible by basic scientists and applied commercially to individual genomes with necessary molecular emphasis. Consensus guidelines (MacArthur et al., 2014; Richards et al., 2015) framed by these scientists and medical geneticists were as uninformative to general physicians and patients as the alien notation, most notably ‘variant of uncertain significance’ (VUS), the qualification given to the overwhelming majority of DNA results.
Even the more decisive qualifications of benign (not harmful) or pathogenic (disease-causing) were not very useful because DNA laboratories based associated diagnoses on prior disease associations as listed in the Online Mendelian Inheritance in Man database (www.omim.org), not on those the physician was suspecting or on the many new associations expected from novel genomic screening (Weerakkody et al., 2016; Wilson, 2019b). Unfamiliar language and the appropriate scientific hesitation in distinguishing chance association from genetic causation have resulted in a DNA-doctor divide, with prestigious medical journals editorializing that genomics has not met expectations of DNA-guided precision medicine (Hunter and Drazen, 2019). Yet genomics promises to extend early detection, prevention, and even cure based on rare allele detection to the multigene-environmental (multifactorial) disease associations that can affect 1% of the population and the majority of us as we age. Genomic analysis finds not only the major mutations that have been targets of single-gene sequencing or allele detection but also less damaging coding changes that interact to cause variable expression and determine disease severity (Weerakkody et al., 2016; Wilson, 2019b). We thus present a clinical qualification protocol for DNA variants that can inform patients and physicians of their diagnostic utility, emphasizing that 1) DNA change, like any laboratory test, cannot make a diagnosis without experienced clinical correlations; 2) DNA variants must be associated with a disease such as Ehlers-Danlos syndrome (EDS), not with component symptoms such as the chronic fatigue, anxiety, or fibromyalgia that occur in EDS; and 3) optimal DNA result interpretation comes from partnership between molecular laboratory specialists and clinicians experienced with the patient’s medical findings. Physicians can view the protocol in Fig. 1 as having two dimensions.
First is the determination of how much a DNA variant will alter its encoded RNA or protein, i.e., how much the consequent alteration in molecular structure will affect product function. This less clinical step in variant qualification devolves to a single point: functional analysis of the altered product using cell-free (in vitro) or computer (in silico) systems cannot replicate complex physiologic relationships like that between infection and septic shock or that between tissue laxity and dysautonomia in EDS (Tinkle and Levy, 2019; Wilson, 2019a). Our protocol thus takes a more lenient approach in considering most DNA variants to have diagnostic utility, provided they are rare in the normal population compared to individuals with the disease in question as shown by emerging databases (NCBI-1000 Genomes, gnomAD). The second dimension is squarely in the physician’s domain: deciding if the variant gene’s ongoing association with disease correlates with patient symptoms. Because our protocol was developed to qualify DNA variants encountered in patients with EDS, some knowledge of this disease spectrum is necessary to illustrate the advantages of a combined molecular and clinical qualification protocol. Its influence from multiple genes, its multiple complications that vary even among affected relatives, and its unifying pathologic mechanisms make EDS ideal for demonstrating the role of clinical perspective in promoting a genomic approach to complex disease (Tinkle and Levy, 2019; Wilson, 2019a).
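The rarity criterion (a variant uncommon in the normal population relative to patients with the disease) can be illustrated with a small calculation. The following is a minimal sketch, not part of the published protocol: the function names are hypothetical, the allele counts in the usage note are invented for illustration, and the one-sided Fisher exact test is simply one reasonable way to ask whether a variant is enriched among patient alleles. In practice the counts would come from population databases and a well-delineated patient cohort.

```python
from math import comb


def enrichment_ratio(case_alt, case_total, ctrl_alt, ctrl_total):
    """Ratio of variant allele frequency in patients to that in controls."""
    case_af = case_alt / case_total
    ctrl_af = ctrl_alt / ctrl_total
    if ctrl_af == 0:
        return float("inf")  # variant never seen in controls
    return case_af / ctrl_af


def fisher_enrichment_p(case_alt, case_total, ctrl_alt, ctrl_total):
    """One-sided Fisher exact p-value for enrichment of the variant in patients.

    Sums the hypergeometric probabilities of seeing `case_alt` or more
    variant alleles among the patient alleles, given the overall totals.
    """
    alt_total = case_alt + ctrl_alt          # all variant alleles observed
    all_alleles = case_total + ctrl_total    # all alleles sequenced
    upper = min(case_total, alt_total)
    favorable = sum(
        comb(case_total, k) * comb(ctrl_total, alt_total - k)
        for k in range(case_alt, upper + 1)
    )
    return favorable / comb(all_alleles, alt_total)
```

For example, with hypothetical counts of 4 variant alleles among 1,000 patient alleles and 2 among 100,000 control alleles, `enrichment_ratio(4, 1000, 2, 100000)` is about 200 and `fisher_enrichment_p(4, 1000, 2, 100000)` is far below 0.05. Such a result would support, but not by itself establish, diagnostic utility.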
EDS occurs among the 20% of females and 10% of males who are hypermobile (double-jointed), at least 10% of them (1-2% of the population) having enough tissue laxity to cause joint sub- or dislocation with injury and fractures; wear-and-tear osteoarthritis with chronic pain; fragile and elastic skin with bruising and scarring; deformities like curved spine, flat feet, or altered gait from skeletal extensibility; cardiovascular tissue weakness leading to mitral valve prolapse, aneurysms, and varicosities; neuromuscular problems including migraines, neuropathies from compression, and poor balance (Wilson, 2019a). Equally disabling is the pooling of blood in the lower extremities (leg varicosities, pelvic congestion) that causes dizziness upon standing, menorrhagia, bladder issues, and the reactive adrenergic response that increases heart rate and blood pressure to restore cerebral circulation. This pulsatile “fight-or-flight” reaction produces tachycardia, anxiety, and chronic fatigue; digestive suppression with irritable bowel and reflux; rashes, hives, and food intolerances from enhanced mast cell activity—all symptoms of autonomic imbalance or dysautonomia (Wilson, 2019a). Clinical knowledge of how these tissue laxity and dysautonomia mechanisms cause multiple medical problems is an essential dimension of the qualification protocol in Fig. 1, one exemplified by EDS but applicable to any clinical condition.
Reagents and Equipment
Computerized databases for investigating gene attributes, prior reports of DNA variants in disease contexts, and prevalence of DNA variants in normal and disease populations.
DNA variant qualification protocol
The clinical qualification protocol is best illustrated by considering a specific result from whole exome sequencing (WES) in a woman with every symptom of EDS listed above. WES looks at the 1.5% of genomic DNA that encodes proteins (exons). DNA changes in coding regions are easier to interpret via the genetic code (3 nucleotide codons representing an amino acid residue) than those of whole genome sequencing (WGS). The woman’s report listed one DNA variant as possibly significant: COL3A1 c.3818A>G p.Lys1273Arg mat VUS. This result begins with the altered gene (COL3A1 abbreviating the collagen type III, alpha-1 chain gene) and proceeds to list its DNA nucleotide change (A to G at nucleotide 3818 from the start of the gene), the consequent protein amino acid change (lysine to arginine at position 1273 from the start of the protein), the parent of origin if determined (mat for maternal), and qualification (VUS or variant of uncertain significance).
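The structure of the report line just described can be made concrete with a short parser. This is a minimal sketch under stated assumptions: the regular expression covers only simple single-nucleotide substitutions written as in the example, the `pat` (paternal) origin code is assumed by analogy with `mat`, and the class and function names are hypothetical rather than taken from any laboratory software. The consistency check relies on the standard convention that coding DNA positions are numbered from the first nucleotide of the start codon, so that codon number = (position - 1) // 3 + 1.

```python
import re
from typing import NamedTuple, Optional

# gene, c.<pos><ref>><alt>, p.<Ref><pos><Alt>, optional origin, optional qualification
REPORT_PATTERN = re.compile(
    r"^(?P<gene>[A-Z][A-Z0-9]*)\s+"
    r"c\.(?P<cdna_pos>\d+)(?P<ref_nt>[ACGT])>(?P<alt_nt>[ACGT])\s+"
    r"p\.(?P<ref_aa>[A-Z][a-z]{2})(?P<aa_pos>\d+)(?P<alt_aa>[A-Z][a-z]{2})"
    r"(?:\s+(?P<origin>mat|pat))?"      # 'pat' is an assumed counterpart of 'mat'
    r"(?:\s+(?P<qualification>\S.*))?$"
)


class ReportedVariant(NamedTuple):
    gene: str
    cdna_position: int
    ref_nucleotide: str
    alt_nucleotide: str
    ref_amino_acid: str
    protein_position: int
    alt_amino_acid: str
    origin: Optional[str]
    qualification: Optional[str]

    def codon_matches_protein_position(self) -> bool:
        """True if the nucleotide position falls inside the reported codon."""
        return (self.cdna_position - 1) // 3 + 1 == self.protein_position


def parse_report_line(line: str) -> ReportedVariant:
    """Split a report line such as the COL3A1 example into its components."""
    match = REPORT_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"Unrecognized variant notation: {line!r}")
    return ReportedVariant(
        gene=match["gene"],
        cdna_position=int(match["cdna_pos"]),
        ref_nucleotide=match["ref_nt"],
        alt_nucleotide=match["alt_nt"],
        ref_amino_acid=match["ref_aa"],
        protein_position=int(match["aa_pos"]),
        alt_amino_acid=match["alt_aa"],
        origin=match["origin"],
        qualification=match["qualification"],
    )
```

Parsing `"COL3A1 c.3818A>G p.Lys1273Arg mat VUS"` yields gene `COL3A1`, nucleotide 3818 (A to G), residue 1273 (Lys to Arg), origin `mat`, and qualification `VUS`. Nucleotide 3818 is the middle position of codon 1273, so the nucleotide and protein descriptions agree.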
Molecular steps in Fig. 1 evaluate the variant’s disruption of product structure (D), its presence in a DNA region conserved or not conserved in evolution (E), its alteration of product function by the functional (in vitro or in silico) analysis mentioned above (F), and the relevance of the altered gene (G) to the disease in question. This last step is a dynamic one as indicated by the dual arrow in Fig. 1; the gene’s prevalence and relationship with disease or disease mechanism (tissue laxity-dysautonomia in the case of EDS) increase or decrease as more patients have genomic DNA sequencing results.
These D-G steps follow consensus qualification approaches (MacArthur et al., 2014; Richards et al., 2015) and their modifications (Duzkale et al., 2013; Nykamp et al., 2017) but place more emphasis on structural impact as assessed quantitatively from evolutionary change: Grantham scores, which combine side-chain composition, polarity, and volume and correlate with frequencies of amino acid replacement (Grantham, 1974), and tracking of regional homology to simpler organisms (NCBI-Genome with links to the UC Santa Cruz and Ensembl genome browsers), or from stringent consideration of amino acid size and conformation (amino acids of similar size and charge may spend more time in different conformations, as shown by Ramachandran plots; Tam et al., 2020). Such biochemical emphasis would recognize that the C=N bond in the guanidinium group of the arginine replacement in our patient introduces a planar region that is lacking in lysine, despite both amino acids having positively charged side chains at cellular pH. The clinical bottom lines, as mentioned above, are the inability to model complex pathogenic relationships like tissue laxity-dysautonomia in our patient by functional analysis (decreased size of F in Fig. 1, left) and the more lenient candidate approach to DNA variant relevance (G), like that followed in gene mapping studies before the human genome project localized every gene.
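To show how a Grantham score is obtained, the sketch below applies Grantham's distance formula to a few residues that appear in this protocol. It is an illustration only: the property values (composition, polarity, volume) and weighting constants are transcribed from Grantham (1974) as best recalled and should be checked against the original table or a published matrix before any real use, and the function name is hypothetical. Scores run from roughly 0 to 200 (the upper bound in practice is 215), with larger values indicating more dissimilar residues.

```python
from math import sqrt

# Side-chain properties per residue: (composition c, polarity p, volume v).
# Values transcribed from Grantham (1974); verify before relying on them.
PROPERTIES = {
    "Lys": (0.33, 11.3, 119.0),
    "Arg": (0.65, 10.5, 124.0),
    "Gly": (0.74, 9.0, 3.0),
    "Ala": (0.00, 8.1, 31.0),
}

# Weights for each squared property difference, and the factor that scales
# the mean distance to 100 (constants as given by Grantham, 1974).
ALPHA, BETA, GAMMA = 1.833, 0.1018, 0.000399
SCALE = 50.723


def grantham_distance(ref: str, alt: str) -> float:
    """Weighted Euclidean distance between two residues' side-chain properties."""
    c1, p1, v1 = PROPERTIES[ref]
    c2, p2, v2 = PROPERTIES[alt]
    raw = sqrt(
        ALPHA * (c1 - c2) ** 2
        + BETA * (p1 - p2) ** 2
        + GAMMA * (v1 - v2) ** 2
    )
    return SCALE * raw
```

With these values, `round(grantham_distance("Lys", "Arg"))` gives 26, a conservative substitution by this measure. This is why the protocol also weighs conformational differences, such as the planar guanidinium group of arginine, that a single distance score does not capture.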
Molecular considerations lead to consensus qualifications (Richards et al., 2015) of benign, VUS, or pathogenic in the second column of Fig. 1, now converted to variant impact (Vi) scores of 0-2+ that broadcast low to high alteration of gene product structure. The dynamic gene relevance (G) is combined with how well historical findings (H) and inheritance of variant and disease (I) in family members suggest diagnostic utility in this particular patient. The woman’s many tissue laxity and dysautonomia findings, plus the presence of symptoms and variant in both her and her mother, justify the addition of 2 more pluses to give a maximal 4+ variant qualification score (third column, Fig. 1). Corresponding to the number of pluses are variant (V) and diagnostic utility (DU) qualifications in the fourth column, the qualifier * standing for No-1+ to E/evidenced-4+ diagnostic utility as explained in the figure legend. Because additional DNA variants with synergistic (Posey et al., 2017) or other (Green et al., 2013; Posey et al., 2017) disease mechanisms are often encountered (fifth column), they may add a plus or another diagnosis to the final result (none were found in our female patient). Last comes qualification of the overall DNA result with an MDna 1 to 4+ score, translating the molecular consensus scores to one of clinical utility as emphasized by the “MD” prefix. This final step assigns one qualification to the overall DNA result, with MDna 0+ or 1+ conveying low utility or contribution of the result to the patient’s clinical diagnosis and MDna 4+ conveying high diagnostic utility.
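The arithmetic of this scoring can be sketched in a few lines. The sketch below is one reading of the text, not the authors' implementation. It assumes that a Vi score of 0-2 is supplied by the molecular steps, that clinical pluses are simply added and capped at 4, that scores 0 through 4 map in order onto the qualifiers No, U, M, S, and E from the figure legend, and that one or more synergistic variants add a single plus. The labels VMDU and VSDU are formed by analogy with VnoDU, VUDU, and VEDU, and all function names are hypothetical.

```python
MAX_SCORE = 4

# Assumed one-to-one mapping of 0-4 pluses onto the legend's qualifiers
# (No-none, U-uncertain, M-moderate, S-strong, E-evidenced).
LABELS = ("VnoDU", "VUDU", "VMDU", "VSDU", "VEDU")


def variant_utility_score(vi: int, clinical_pluses: int) -> int:
    """Add clinical pluses (history, inheritance, etc.) to the Vi score, capped at 4."""
    if not 0 <= vi <= 2:
        raise ValueError("Vi score must be 0, 1, or 2")
    if clinical_pluses < 0:
        raise ValueError("clinical_pluses cannot be negative")
    return min(MAX_SCORE, vi + clinical_pluses)


def utility_label(score: int) -> str:
    """Translate a 0-4 score into a V*DU qualification."""
    return LABELS[score]


def mdna_score(primary_score: int, synergistic_variants: int = 0) -> int:
    """Overall result score: synergistic variants add one plus, capped at 4."""
    bonus = 1 if synergistic_variants > 0 else 0
    return min(MAX_SCORE, primary_score + bonus)


def format_mdna(score: int) -> str:
    """Render the overall score in the 'MDna n+' style used in the text."""
    return f"MDna {score}+"
```

For the patient described here, a 2+ starting score with two added pluses gives `variant_utility_score(2, 2) == 4`, labeled `VEDU`. With no additional variants found, `format_mdna(mdna_score(4))` returns `"MDna 4+"`.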
Knowledge of clinical mechanisms and relationships must define the diagnosis that accompanies variant 1-4+ utility, in this case realizing that changes in the COL3A1 gene can cause milder forms of EDS instead of the oft-assumed vascular type (Byers et al., 2017). Equally essential was the recognition that our patient’s multiple tissue laxity and dysautonomia findings suggest the encompassing diagnosis of EDS rather than component problems like osteoarthritis, scoliosis, chronic fatigue syndrome, or anxiety disorder (Tinkle and Levy, 2019; Wilson, 2019a). Although databases documenting variant occurrence and its association with particular diagnoses are accumulating (NCBI-dbSNP, NCBI-ClinVar, ClinGen), more clinical input is required before these correlations meet the physician demands and the validity wished for by medical authorities (Hunter and Drazen, 2019).
When the clinician (optimally a subspecialist familiar with the disease in question) adds input to the molecular qualification of DNA variants, one has a combined medical DNA diagnostic utility (MDna 1-4+ in Fig. 1) that directly informs physicians and patients in the way that chemical laboratory and imaging reports have always done. Also useful in this guidance are connotations of qualifications like VnoDU (variant of no diagnostic utility), VUDU (variant of uncertain diagnostic utility) or VEDU (Veda—variant of evidenced diagnostic utility) that highlight low or high utility. Hugely important is the negative qualification of common variants like those in the methylene tetrahydrofolate reductase (MTHFR) gene that mislead many patients (upper panel, last column), emphasizing their appropriate use for identity or ancestry testing rather than for disease association.
Examples of our patient’s variant qualification and others are given in the last column of Fig. 1 and explained in the legend, including the option to note the type of DNA test employed. One could add G1 to indicate targeted sequencing of one gene, A1-n for testing of one or more alleles as done with multiplex PCR for cystic fibrosis, Gn for sequencing a panel of genes related to a particular disease (epilepsy, cardiomyopathy, EDS), WES or WGS for whole exome or genome sequencing. The protocol is meant to catalyze binary approaches, one encouraging further function and molecular studies by scientists, another documenting variant prevalence in well-delineated disease categories by experienced physicians. Synthesis of these two approaches along with better understanding of noncoding DNA will foster a genomic medicine that predicts, ameliorates, and precisely manages any condition with substantial genetic influence.
Figure 1 Legend. Protocol for sequential molecular, genetic, and clinical qualification of DNA variants.
Medical diagnostic utility qualification begins on the left with ratings of D--disruption of protein-RNA structure (Grantham, 1974; NCBI-Genome), E--evolutionary conservation (NCBI-Genome, Tam et al., 2020), F--functional analysis, and G--gene-disease relevance that generate 0-2+ Vi (variant impact) scores correlating with benign to variant of uncertain significance (VUS) to pathogenic qualifications from consensus guidelines (MacArthur et al., 2014; Richards et al., 2015); middle ratings add pluses for increasing gene variant-disease correlation (G), more typical history findings (H), and family history/inheritance (I) correlations that will often convert unhelpful VUS qualifications to variant diagnostic utility scores V*DU where * = No-none, U-uncertain, M-moderate, S-strong, E-evidenced. Final 0-4+ medical DNA diagnostic utility (MDna) scores acknowledge the presence of additional variants qualified as having synergistic (V*DUS) or other (V*DUO) action (Posey et al., 2017), upgrading utility in the former case or suggesting other diagnoses (including incidental or secondary gene change—Green et al., 2013) in the latter. 
Appended to MD diagnostic utility scores can be the DNA test employed, where G1 = single gene, Gn = n gene panel, wes/wgs = whole exome/genome sequencing, A1 = single allele as in allele-specific oligonucleotide screening for sickle cell anemia, and An = n allele multiplex polymerase chain reaction (PCR) as in screening for cystic fibrosis; the implied diagnosis should be added by appropriate medical rather than laboratory specialists as with the indicated examples for the MTHFR (methylene tetrahydrofolate reductase), COL3A1 (collagen type III alpha-1 chain), COL5A1 (collagen type V alpha-1 chain), BRCA1 (breast-ovarian cancer-1), HBB (beta-globin), and HFE (hemochromatosis) genes; c.677C>T, cytidylate to thymidylate nucleotide change at complementary DNA position 677; p.Ala300Val, p.Gly300Ala, or p.Gly300Glu, alanine to valine, glycine to alanine, or glycine to glutamate change at protein position 300; p.Cys100Gly, cysteine to glycine change at protein position 100; p.Lys1273Arg, lysine to arginine change at protein position 1273; p.His63Asp/p.Cys282Tyr, compound heterozygous histidine to aspartate/cysteine to tyrosine change at protein position 63/282 (one gene copy or heterozygous change implied for the other variants); p.Glu6Ala, glutamate to alanine amino acid change at protein position 6; matSx, variant inherited from mother with symptoms of the indicated diagnosis, adding 1+ to the variant diagnostic utility score by inheritance (I).
Time Taken
Depending on prior observations, the time to investigate and qualify a DNA variant is between 20 minutes and one hour.
References
Byers, P. H., Belmont, J., Black, J. et al. (2017). Diagnosis, natural history, and management in vascular Ehlers-Danlos syndrome. Am. J. Med. Genet. Part C Semin. Med. Genet. 175C:27-39
Duzkale, H., Shen, J., McLaughlin, H., et al. (2013). A systematic approach to assessing the clinical significance of genetic variants. Clin. Genet. 84:453–463. doi: 10.1111/cge.12257.
Grantham, R. (1974). Amino acid difference formula to help explain protein evolution. Science. 185, 862-864; matrices showing 0-200 scores for amino acid substitutions are available online, e. g., https://gist.github.com/danielecook/501f03650bca6a3db31ff3af2d413d2a
Green, R. C., Berg, J. S., Grody, W. W., et al. (2013). American College of Medical Genetics and Genomics. ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing. Genet. Med. 15:565-574. doi: 10.1038/gim.2013.73.
Hunter, D. J., Drazen, J. M. (2019). Has the genome granted our wish yet? New Engl. J. Med. 380:2391-2393. doi: 10.1056/NEJMp1904511.
MacArthur, D. G., Manolio, T. A., Dimmock, D. P., et al. (2014). Guidelines for investigating causality of sequence variants in human disease. Nature. 508:469-476. doi: 10.1038/nature13127.
Nykamp, K., Anderson, M., Powers, M., et al. (2017). Sherloc: a comprehensive refinement of the ACMG–AMP variant classification criteria. Genet. Med. 19:1105-1117.
Posey, J. E., Harel, T., Liu, P., et al. (2017). Resolution of disease phenotypes resulting from multilocus genomic variation. New Engl. J. Med. 376:21-31. doi: 10.1056/NEJMoa1516767.
Richards, S., Aziz, N., Bale, S., et al. ACMG Laboratory Quality Assurance Committee (2015). Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 17:405-24. doi: 10.1038/gim.2015.30.
Tam, B., Sinha, S., Wang, S. M. (2020). Combining Ramachandran plot and molecular dynamics simulation for structural-based variant classification: Using TP53 variants as model. Comput. Struct. Biotechnol. J. 18:4033-4039. doi: 10.1016/j.csbj.2020.11.041.
Tinkle, B. T., Levy, H. P. (2019). Symptomatic joint hypermobility: The hypermobile type of Ehlers-Danlos syndrome and the hypermobility spectrum disorders. Med. Clin. N. Amer. 103:1021-1033. doi: 10.1016/j.mcna.2019.08.002.
Weerakkody, R. A., Vandrovkova, J., Kanonidau, C., et al. (2016). Targeted next-generation sequencing makes new molecular diagnoses and expands genotype-phenotype relationship in Ehlers-Danlos syndrome. Genet. Med. 18:1119-1127. doi: 10.1038/gim.2016.14
Wilson, G. N. (2019a) Clinical analysis supports articulo-autonomic dysplasia as a unifying pathogenic mechanism in Ehlers-Danlos syndrome and related conditions. J. Biosciences. Med. 7:149-168. doi: 10.4236/jbm.2019.76010
Wilson, G. N. (2019b) Genomic analysis of 727 patients with Ehlers-Danlos syndrome I: Clinical perspective relates 23 genes to a maternally influenced arthritis-adrenaline disorder. J. Biosciences. Med. 7:181-204. doi: 10.4236/jbm.2019.712015.
Wyandt, H. E., Wilson, G. N., Tonk, V. S. (2017). Gene and genome sequencing: Interpreting genetic variation at the nucleotide level. In: Human chromosome variation: Heteromorphism, polymorphism, and pathogenesis Ed.2 Ch.11. Springer Nature, Singapore Malaysia.
Associated Publications
These are in the reference list.