Bill Majoros, PhD, earns prestigious Maximizing Investigators' Research Award (MIRA) for Career in Genomics Research

The National Institute of General Medical Sciences has awarded Bill Majoros, PhD, assistant professor in the Duke Department of Biostatistics and Bioinformatics, a $1.9 million research grant to be used over five years to develop methods for identifying aberrations in gene expression to improve disease diagnosis. The grant, via a Maximizing Investigators’ Research Award (MIRA), marks a significant achievement in his career. The MIRA program aims to provide investigators with greater stability and flexibility to enhance scientific productivity and breakthroughs. 

“Bill is an incredibly creative scientist, and the MIRA award is a great opportunity to unleash his creativity at Duke,” said Tim Reddy, PhD, associate professor in the Duke Departments of Biostatistics and Bioinformatics and Molecular Genetics and Microbiology. “Dr. Majoros is an exceptional researcher, educator, and community member. The MIRA award is the perfect way to recognize and promote his contributions.”

This work will contribute to developing a better understanding of the human genome and how it operates. Majoros said he and his team aim to answer the question, “How much variability is there in healthy individuals in gene expression across multiple individuals?”

Majoros said many patients who are experiencing symptoms of a disorder go to clinics seeking answers, but if they don’t receive a definitive diagnosis, their provider can’t suggest treatment, or insurance companies won’t cover the suggested treatment plans. “It’s extremely hard to diagnose people with genetic diseases, and when we look for these aberrations in gene expression, we’re trying to generate some possibilities that the clinicians can then track down and dig into,” he said.

Majoros and his lab will be using data generated by experimentalists at Duke as the starting point for the project. “Experimentalists are the people who are conducting experiments to try and find all the regions in the genome that regulate genes. As a computational scientist, I will be testing different statistical analyses and developing algorithms from data generated from those experiments.”

The human genome has between 20,000 and 30,000 protein-coding genes and another 10,000 genes that produce RNA and perform other functions for the cell. Gene expression determines how much protein a given gene produces. “A high expression means that this gene is pumping out tons of protein. Low expression means you're not getting much protein from the gene. In terms of aberrations and how mutations may be problematic, if an individual needs a lot of protein, but they're just not generating enough protein, then that can cause disease,” Majoros explained.

An example of this situation is Duchenne muscular dystrophy, one of the most severe forms of inherited muscular dystrophy. It primarily affects males because it is caused by a mutation in the dystrophin gene, which is located on the X chromosome and encodes the protein dystrophin. Duchenne occurs when the gene produces no dystrophin or not enough of it, leading to muscle degeneration and weakness.

Majoros will specifically be developing models to identify allele-specific expression, in which the two copies of a gene are expressed at different levels. “We’re developing statistical methods that will be able to find more of these cases more reliably, because the methods that people are currently using miss a lot of those cases,” Majoros said.

This work comes with several challenges. “The way our genome evolved, it didn’t really come with a user manual, and there doesn’t seem to be a sensible design to it. It’s almost as if you have to figure out what every individual gene normally does, and then from there, you have to ask, ‘what are the ways in which an aberration, or a perturbation to the gene could potentially be problematic?’”

He said it’s one of the reasons why people who do computational work can never really give a definitive answer. “We give possibilities, and someone must then track them down experimentally to see what the cells do when you disturb them in different ways.”

Majoros has been studying genetics for more than 20 years and was involved in the Human Genome Project. “Our first goal was to sequence the genome, which is 3 billion letters long. Then once we finished that, we realized we didn’t know where the genes were. From that, I got involved in developing algorithms to find genes and I published ‘Methods for Computational Gene Prediction,’ the world's first book on finding the genes.”

He is currently in the third phase of his research and hopes to learn more about the ‘dark region’ of the genome. “It’s very difficult to understand because while we’re starting to learn how some genes work and how they make proteins, we still know very little about the parts of the genome that actually regulate genes,” Majoros said. “While it does get harder as you move forward, it’s fascinating to be at the forefront and watch the field progress.”

Majoros has an extensive history with Duke. He earned his PhD from the University in 2017, is a co-investigator in the Center for Combinatorial Gene Regulation, and is a faculty member in several programs, including the Duke Center for Genomic and Computational Biology, the Master of Biostatistics program, and the Computational Biology and Bioinformatics PhD program.