$1M grant fuels discovery and analysis of proteoforms

May 31, 2016

Opening the door to a better understanding of proteins, and to our ability to treat and prevent diseases, the National Institutes of Health awarded $1.18 million to faculty at the Indiana University School of Informatics and Computing and the IU School of Medicine at IUPUI for a research collaboration that unites two cutting-edge technologies in the discovery and analysis of proteoforms.

Professor Xiaowen Liu from the Department of BioHealth Informatics at the School of Informatics and Computing and Professor Yunlong Liu from the Department of Medical and Molecular Genetics at the School of Medicine are principal investigators on the project, which combines their respective expertise in mass spectrometry-based top-down proteomics data analysis and RNA (ribonucleic acid) sequencing.

The research team for the project, “Computational tools for top down mass spectrometry based proteoform identification and proteogenomics,” also includes Professor Harikrishna Nakshatri from the School of Medicine and Professor Si Wu from the University of Oklahoma.

In tandem, the technologies represent an enormous leap in discovering and analyzing proteins that can now be seen “top-down,” or intact, through mass spectrometry. This bird’s-eye view opens up a new world for understanding the number and classification of proteins, and poses a challenge in grasping their breadth and complexity.

The limitations of previous tools necessitated breaking the proteins into pieces, and the data models used to analyze them don’t translate to the comprehensive top-down view. Xiaowen Liu is one of the few researchers creating algorithms that accommodate the intricacy of the top-down mass spectrometry data.

His novel data model, the “mass graph,” will incorporate Yunlong Liu’s RNA sequencing (RNA-Seq) modeling expertise to further aid in accurate identification at the proteome level. RNA sequences provide information that enables more precision in creating a data template for targeting proteins. The team will also develop a software pipeline that utilizes these technologies.

How many proteoforms might be discovered? There may be hundreds of thousands, says Yunlong Liu. “The proteins are more than just a sequence of amino acids. Some amino acids can be modified, and this technology will allow us to see those modification patterns as well.”

Getting a better picture of proteoforms will enhance our understanding of living organisms—and present the opportunity for advances in diagnosing patients. Comparing the protein forms for healthy samples and patient samples could reveal biomarkers that will improve medical prognoses, says Xiaowen Liu.

And there are very promising translational benefits for treatments, says Yunlong Liu. “We need to understand the form of the protein before we design a drug to target it. And so there is great therapeutic potential.”

Media Contact

Joanne Lovrinic