Unknown DNA Sequences Identified That May Be Critical to Human Health
Numerous short RNA sequences that code for microproteins and peptides have been identified, providing new opportunities for the study of diseases and the development of drugs.
Researchers from Duke-NUS Medical School and their collaborators have discovered thousands of previously unknown DNA sequences in the human genome that code for microproteins and peptides that could be critical for human health and disease.
“Much of what we understand about the known two per cent of the genome that codes for proteins comes from looking for long strands of protein-coding nucleotide sequences, or long open reading frames,” explained computational biologist Dr Sonia Chothani, a research fellow with Duke-NUS’ Cardiovascular and Metabolic Disorders (CVMD) Programme and first author of the study. “Recently, however, scientists have discovered small open reading frames (smORFs) that can also be translated from RNA into small peptides, which have roles in DNA repair, muscle formation and genetic regulation.”
Because disrupting smORFs can cause disease, scientists have been seeking to identify them and the tiny peptides they code for. However, the currently available techniques are limited.
“Much of the current datasets do not provide information that is detailed enough to identify smORFs in RNA,” added Dr Chothani. “The majority also comes from analyses of immortalised human cells that are propagated—sometimes for decades—to study cell physiology, function and disease. However, these cell lines aren’t always accurate representations of human physiology.”
In a recent study published in Molecular Cell, Chothani and her colleagues from Singapore, Germany, the United Kingdom and Australia present an approach they created to address these challenges. They scoured existing ribosome profiling datasets for short strands of RNA with periodic three-base sections that covered more than 60% of the RNA’s length. They then performed their own RNA sequencing and ribosome profiling to establish a combined dataset of six kinds of cells and five types of tissue derived from hundreds of patients.
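To give a feel for what a three-base periodicity check might look like, here is a minimal Python sketch. It is not the study's pipeline. It assumes one simplified reading of the criterion: a codon counts as "periodic" when its first position carries more ribosome footprint reads than either of the other two, and a candidate passes when more than 60% of its codons are periodic. The function names, the input format (one read count per nucleotide) and the example counts are all hypothetical.

```python
from typing import Sequence


def periodic_coverage(psite_counts: Sequence[int]) -> float:
    """Return the fraction of codons whose first base has the most reads.

    Assumption (not from the study): psite_counts holds one ribosome
    footprint count per nucleotide, starting at the candidate start codon.
    Any trailing bases that do not fill a whole codon are ignored.
    """
    n_codons = len(psite_counts) // 3
    if n_codons == 0:
        return 0.0

    periodic = 0
    for i in range(n_codons):
        f0, f1, f2 = psite_counts[3 * i : 3 * i + 3]
        # A codon is "periodic" here only if frame 0 strictly dominates.
        if f0 > f1 and f0 > f2:
            periodic += 1
    return periodic / n_codons


def looks_translated(psite_counts: Sequence[int], min_coverage: float = 0.6) -> bool:
    """Apply the 'more than 60% of the length' cut-off to the coverage."""
    return periodic_coverage(psite_counts) > min_coverage


if __name__ == "__main__":
    # Made-up counts for a five-codon candidate; the third codon has no reads.
    example = [9, 1, 0, 7, 2, 1, 0, 0, 0, 5, 1, 1, 8, 0, 2]
    print(periodic_coverage(example))
    print(looks_translated(example))
```

Real analyses typically apply statistical tests to the periodicity signal and correct for read length and offset, so this sketch is only meant to illustrate the idea of a coverage threshold.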
Analyses of these data identified nearly 8,000 smORFs. Interestingly, they were highly specific to the tissues that they were found in, meaning that these smORFs may perform a function specific to their environment. The team also identified 603 microproteins coded by some of these smORFs.
“The genome is littered with smORFs,” said Assistant Professor Owen Rackham, senior author of the study from the CVMD Programme. “Our comprehensive and spatially resolved map of human smORFs highlights overlooked functional components of the genome, pinpoints new players in health and disease and provides a resource for the scientific community as a platform to accelerate discoveries.”
Professor Patrick Casey, Senior Vice-Dean of Research at Duke-NUS, said, “With the healthcare system evolving to not only treat diseases but also prevent them, identifying potential new targets for disease research and drug development could open avenues to new solutions. This research by Dr Chothani and her team, published as a resource for the scientific community, brings important insights to the field.”
Reference: “A high-resolution map of human RNA translation” by Sonia P. Chothani, Eleonora Adami, Anissa A. Widjaja, Sarah R. Langley, Sivakumar Viswanathan, Chee Jian Pua, Nevin Tham Zhihao, Nathan Harmston, Giuseppe D’Agostino, Nicola Whiffin, Wang Mao, John F. Ouyang, Wei Wen Lim, Shiqi Lim, Cheryl Q.E. Lee, Alexandra Grubman, Joseph Chen, J.P. Kovalik, Karl Tryggvason, Jose M. Polo, Lena Ho, Stuart A. Cook, Owen J.L. Rackham and Sebastian Schafer, 15 July 2022, Molecular Cell. DOI: 10.1016/j.molcel.2022.06.023