Proteins are the workhorses of the cell. They are required for the structure, function and regulation of the body’s tissues and organs. Although half of all proteins in modern cells are symmetric complexes, less is known about them because they are more difficult to work with, both experimentally and computationally. Bruce Donald is the principal investigator (PI) on a new NIH R01 grant that aims to develop tools to make these symmetric proteins easier to work with.
Symmetric proteins have two or more components – called subunits – that fit together to form a complex. Like synchronized swimmers, these subunits are identical in shape, but they are translated and rotated to form symmetrical patterns. And they can be fearsome: symmetric proteins coat the outside of viruses like HIV, Zika and Ebola. They also form channels in cell membranes and are the target of many modern drugs.
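To make the idea of rotated, identical subunits concrete, here is a minimal Python sketch. It is a toy illustration only, not the Donald Lab's software: the three-point "subunit", the choice of the z-axis as the symmetry axis, and the function names `rotate_z` and `build_cyclic_complex` are all assumptions made for this example. It builds a three-subunit ring by rotating one subunit in steps of 120 degrees.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def build_cyclic_complex(subunit, n):
    """Return n copies of `subunit`; copy k is rotated by k * (360 / n) degrees.

    `subunit` is a list of (x, y, z) coordinates. The z-axis is assumed
    to be the symmetry axis (a hypothetical choice for this sketch).
    """
    step = 2 * math.pi / n
    return [[rotate_z(p, k * step) for p in subunit] for k in range(n)]

# Hypothetical toy subunit made of three "atoms".
subunit = [(5.0, 0.0, 0.0), (6.0, 1.0, 0.5), (7.0, 0.0, 1.0)]
trimer = build_cyclic_complex(subunit, 3)

for k, copy in enumerate(trimer):
    print(k, [tuple(round(v, 2) for v in p) for p in copy])
```

Every copy has the same internal shape as the original subunit; only its position and orientation around the axis differ, which is what makes the complex symmetric.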
Typically, researchers use three techniques to determine the 3D molecular architecture of proteins: X-ray crystallography, nuclear magnetic resonance (NMR) and cryo-electron microscopy. Each technique, though, has drawbacks. Proteins can be difficult to crystallize for X-ray diffraction. This is particularly true of membrane proteins, which sit in the cell membrane. NMR is difficult to use with larger proteins and can provide ambiguous measurements. Cryo-electron microscopy is best used for very large proteins, and many symmetric proteins are too small. In all of these techniques, the measurements are indirect, so researchers have to make inferences.
NMR distance measurements of symmetric proteins are inherently ambiguous. These ambiguities lead to multiple ways of assembling the subunits of a protein. While building models to fit the data, Donald and his team began developing new algorithms based on algebraic topology to better handle the challenges that arise when studying symmetric proteins.
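The ambiguity can be illustrated with a small Python sketch. This is a simplified toy model, not the team's topology-based algorithm: the two-point subunit, the three-fold symmetry about the z-axis, the distance cutoff, and the function name `consistent_assignments` are all assumptions made for the example. Because the subunits are identical, a measured short distance between two atoms does not say which copy the second atom belongs to, so the sketch lists every copy that would satisfy the measurement.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def consistent_assignments(subunit, n, i, j, max_dist):
    """List the subunit copies k for which atom i in copy 0 lies within
    `max_dist` of atom j in copy k.

    More than one k in the result means the distance measurement alone
    cannot tell which pairing is the real one.
    """
    hits = []
    for k in range(n):
        partner = rotate_z(subunit[j], 2 * math.pi * k / n)
        if math.dist(subunit[i], partner) <= max_dist:
            hits.append(k)
    return hits

# Hypothetical two-atom subunit in a three-fold symmetric complex.
subunit = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]

# A loose distance cutoff is satisfied by more than one copy.
print(consistent_assignments(subunit, 3, 0, 1, max_dist=2.0))
# A tight cutoff leaves only the within-subunit pairing.
print(consistent_assignments(subunit, 3, 0, 1, max_dist=1.2))
```

Each extra entry in the returned list corresponds to a different way of assembling the subunits that agrees with the same measurement.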
They began studying diacylglycerol kinase (DAGK), a drug target in bacterial cell walls, and found two structure models. One had previously been solved using X-ray crystallography, and one using NMR. The structures appeared vastly different. “They can’t both be right,” Donald said. “So, first we wanted to see if we could make the NMR data match what we found with the X-ray crystallography model.”
It turns out they could, which led to another question: How many other models are out there? By making small mathematical changes to the protein structure diagrams, they could generate more folds, test them, and see which ones fit the data. They generated all of the possible structures and analyzed them to see which fit the data best. In addition to recovering the NMR and crystal structures, they found other folds that are predicted to fit the data even better.
Donald worked with Jeff Martin, a software engineer and computer science Ph.D. student (’14) in the Donald Lab, to develop these methods. “We applied them in our labs, and they looked promising,” Donald said, “so we wanted to see how widespread this is.” They took a case from the literature where there was disagreement. “This could actually be a very big problem for experimental protocols, so we are proposing to fix this problem in our grant.” The team will develop algorithms that work for any symmetric protein and that also apply to particular symmetric systems of interest in virology and immunology, like the Zika virus and HIV.
“Mathematical symmetry is important in how biology structures the interactions between molecular partners, and we can use topology to understand those interactions and analyze the data that tells us what the structures are,” Donald said.
This research is supported by National Institutes of Health grant R01GM118543. Other collaborators include Leonard Spicer, Ph.D.; Pei Zhou, Ph.D.; Hashim Al-Hashimi, Ph.D.; Scott Schmidler, Ph.D.; and Jeffrey Hoch, Ph.D.