Protein Primary Structure Determination Methods
Proteins are biological macromolecules built from many amino acids joined by peptide bonds. Every physiologically functional protein in the body has a defined structure: a characteristic amino acid composition, a specific amino acid sequence, and a specific spatial arrangement of the peptide chain. This molecular structure, determined by the amino acid sequence and the spatial arrangement of the peptide chain, is the structural basis for each protein's unique physiological function.
Twenty standard amino acids make up the proteins of the human body, and the number of possible sequences and spatial arrangements is practically unlimited. Different amino acid sequences and specific spatial arrangements give rise to the tens of thousands of proteins in the human body, which together carry out an enormous range of physiological functions.
In 1952, the Danish scientist Kaj Linderstrøm-Lang proposed describing the complex molecular structure of proteins in hierarchical levels. Four levels are recognized today: primary, secondary, tertiary, and quaternary structure. The latter three are collectively referred to as higher-order structures or spatial conformations. The sequence of amino acids from the N-terminus to the C-terminus is called the primary structure of the protein. The main chemical bond in the primary structure is the peptide bond. The positions of all disulfide bonds in the protein molecule are also considered part of the primary structure. Commonly used methods for determining primary structure are as follows.
Edman Degradation Method
This is the classical method for determining the N-terminal sequence of a protein. In each cycle, the N-terminal amino acid is labeled, cleaved off, and identified, while the remaining peptide bonds are left intact; repeating the cycle reads the sequence one residue at a time from the N-terminus. However, the method does not work when the N-terminus is blocked or chemically modified (for example, by acetylation).
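The cycle-by-cycle logic described above can be illustrated with a toy simulation. This is only a sketch of the readout order, not a model of the chemistry: the function name edman_cycles, the n_blocked flag, and the max_cycles limit of 30 are illustrative assumptions rather than part of any instrument software.

```python
def edman_cycles(peptide, n_blocked=False, max_cycles=30):
    """Toy model of Edman degradation (hypothetical helper, not real chemistry).

    Each cycle removes and reports the current N-terminal residue.
    max_cycles is an assumed stand-in for the limited practical read length.
    """
    if n_blocked:
        # A blocked or modified N-terminus cannot enter the first cycle.
        raise ValueError("N-terminus is blocked; no residue can be released")

    remaining = peptide
    calls = []  # list of (cycle number, residue identified in that cycle)
    for cycle in range(1, max_cycles + 1):
        if not remaining:
            break
        calls.append((cycle, remaining[0]))  # identify the released residue
        remaining = remaining[1:]            # the rest of the chain stays intact
    return calls, remaining


calls, rest = edman_cycles("MKTAY", max_cycles=3)
print(calls)  # residues read in cycles 1 to 3
print(rest)   # part of the chain that was not sequenced
```

With the example peptide MKTAY and three cycles, the sketch reports M, K, and T in order and leaves AY unread, which mirrors why the method is best suited to short N-terminal stretches.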
Mass Spectrometry
This is currently one of the most widely used and effective methods. Mass spectrometry accurately measures the mass of proteins or peptides, and the fragmentation pattern helps infer the amino acid sequence. Tandem mass spectrometry (tandem MS, or MS/MS) in particular uses two stages of mass analysis: a peptide ion is selected in the first stage, fragmented, and its fragments are measured in the second stage, which allows the amino acid sequence to be read in more detail.
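To make the link between mass and sequence concrete, the following minimal sketch computes the monoisotopic mass of a peptide and the singly charged b- and y-ion series that MS/MS fragmentation typically produces. It assumes an unmodified peptide made of the 20 standard residues; the function names peptide_mass and fragment_ions are illustrative, and real data analysis would rely on dedicated software.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # charge carrier for singly protonated ions


def peptide_mass(seq):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def fragment_ions(seq):
    """Singly charged b- and y-ion m/z values for every backbone cleavage."""
    b_ions, y_ions = [], []
    for i in range(1, len(seq)):
        b = sum(RESIDUE_MASS[aa] for aa in seq[:i]) + PROTON
        y = sum(RESIDUE_MASS[aa] for aa in seq[i:]) + WATER + PROTON
        b_ions.append(round(b, 5))
        y_ions.append(round(y, 5))
    # Report y ions as y1, y2, ... (counted from the C-terminus).
    return b_ions, y_ions[::-1]


print(round(peptide_mass("GAS"), 5))
print(fragment_ions("GAS"))
```

The mass difference between neighbouring ions in a series equals one residue mass, which is how a sequence is read from a spectrum. Note that leucine and isoleucine have identical masses and cannot be told apart this way.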
Genomic Sequencing and Bioinformatics Analysis
Because proteins are encoded by genes, the amino acid sequence of a protein can be inferred by determining the DNA sequence of the corresponding gene and translating it in silico with bioinformatics tools.
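The in silico translation step can be sketched in a few lines. The example below assumes the standard genetic code, a coding sequence that is already in frame, and no introns; the function name translate is illustrative. It yields only the encoded sequence and says nothing about post-translational processing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding DNA sequence until the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCTCTTAA"))  # codons: ATG GCC TCT TAA
```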
Each of these methods has its advantages and limitations. For example, mass spectrometry is fast and sensitive but places requirements on the purity and quantity of the sample, whereas Edman degradation is suited to shorter sequences and is slower. In practice, the most suitable method, or a combination of methods, is selected based on the nature of the sample and the available resources.