Structure of Proteins

Introduction

Proteins are organic molecules made up of amino acids. These amino acids are linked together by peptide bonds to form long chains. The sequence of amino acids in these chains determines the final structure, properties, and functions of a protein.

The chains of amino acids in a protein do not remain as simple linear strands. Rather, they fold and arrange themselves into molecules with complex structures. To make sense of this complexity, the structure of proteins is studied at four levels of organization:

  1. Primary structure
  2. Secondary structure
  3. Tertiary structure
  4. Quaternary structure

In this article, we will study these four levels of protein structure. 

Primary Structure

The primary structure is the most basic level of protein structure. At this level, we study the sequence of amino acids that are linked by peptide bonds in the polypeptide chains of a protein. All the properties and functions of a protein depend on this sequence.

Peptide Bond

A peptide bond is a covalent bond formed when the carboxyl group of one amino acid reacts with the amino group of another amino acid, releasing a molecule of water. The resulting dipeptide still has a free amino group at one end and a free carboxyl group at the other end, so it can form additional peptide bonds at both ends.

All the amino acids in a protein are linked in sequence by peptide bonds to form long linear chains. The sequence of amino acids in a polypeptide chain depends on the sequence of nucleotides in the gene encoding the protein. Each amino acid is specified by a group of three nucleotides known as a codon, and the full set of codons makes up the genetic code.

Characteristics of peptide bond

The peptide bond has partial double-bond character. It is shorter and more rigid than a typical single covalent bond. This rigidity prevents free rotation of the groups around it.

The carbonyl and amide groups of the peptide bond do not release or accept protons and so remain uncharged at physiologic pH. However, the peptide bond is polar because of its highly electronegative atoms (O and N), and it therefore takes part in the formation of hydrogen bonds.

In a peptide chain, only the terminal carboxyl and amino groups and the groups present in the side chains of the amino acids can accept or release protons at the physiologic pH of the body.

A peptide chain formed by peptide bonds has a free amino group at one end and a free carboxyl group at the other end. These ends are known as the N-terminal and the C-terminal respectively.

Numbering of amino acids

When the sequence of a peptide chain is written, the N-terminal is placed on the left and the C-terminal on the right. The amino acids are numbered from the N-terminal to the C-terminal of the peptide chain. For example, a tetrapeptide with an N-terminal valine followed by glycine, methionine, and a C-terminal leucine is numbered as follows:

  • N-terminal valine is numbered 1
  • Glycine is numbered 2
  • Methionine is numbered 3
  • C-terminal leucine is numbered 4

This is the universally accepted convention for numbering amino acids. Whenever an amino acid in a peptide chain is referred to, it is identified by its number counted from the N-terminal.
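
The numbering convention can be sketched in a few lines of Python. This is only an illustration: the helper name `number_residues` is hypothetical, and the peptide is the Val-Gly-Met-Leu tetrapeptide from the example above, written N-terminal first.

```python
# Hypothetical helper: number residues from the N-terminal (position 1)
# to the C-terminal, following the convention described in the text.
def number_residues(residues):
    """Return (position, residue) pairs, counting from the N-terminal."""
    return [(position, name) for position, name in enumerate(residues, start=1)]


# The tetrapeptide from the text, written N-terminal first.
tetrapeptide = ["Val", "Gly", "Met", "Leu"]
numbered = number_residues(tetrapeptide)

for position, name in numbered:
    print(position, name)
```

Counting starts at 1 rather than 0 so that the positions match the way residues are quoted in biochemistry.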

Determining the Primary Structure

The first step in determining the primary structure of a polypeptide chain is to identify and quantify the individual amino acids present in it. This is done using an automated instrument called an amino acid analyzer.

The peptide bonds are first cleaved by hydrolysis to release the individual amino acids. The analyzer then separates the amino acids by ion-exchange chromatography and measures the amount of each one using a spectrophotometer.

After the individual amino acids have been quantified, their sequence can be determined by a stepwise process. This process uses phenylisothiocyanate, also known as the Edman reagent, to label the N-terminal residue of the polypeptide chain. The Edman reagent forms a phenylthiohydantoin derivative with the N-terminal amino acid. This derivative makes the N-terminal peptide bond unstable, so the bond is cleaved and the labeled amino acid is released while the rest of the peptide bonds remain intact. The process is repeated, and the amino acid released in each cycle is identified. This sequencing process is now automated.

Polypeptides with more than 100 amino acids are first cleaved into smaller polypeptides, which are then sequenced using the above method.
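
The cycle-by-cycle logic of this method can be mimicked with a short Python sketch. It models only the bookkeeping (remove the N-terminal residue, record it, repeat) and none of the chemistry. The function name `edman_sequence` and its `max_cycles` parameter are hypothetical names chosen for illustration.

```python
def edman_sequence(peptide, max_cycles=None):
    """Simulate repeated Edman cycles on a peptide given N-terminal first.

    Each cycle removes and records only the current N-terminal residue,
    leaving the remaining peptide intact for the next cycle.
    """
    remaining = list(peptide)
    identified = []
    cycles = len(remaining) if max_cycles is None else min(max_cycles, len(remaining))
    for _ in range(cycles):
        n_terminal = remaining.pop(0)  # cleave off the labeled N-terminal residue
        identified.append(n_terminal)  # identify the released residue
    return identified, remaining


identified, remaining = edman_sequence(["Val", "Gly", "Met", "Leu"], max_cycles=2)
print(identified)  # residues read so far, in order from the N-terminal
print(remaining)   # shortened peptide still to be sequenced
```

Because every cycle exposes a new N-terminal residue, the order in which residues are identified is the order of the primary structure.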

Determining primary structure by DNA Sequencing

The amino acid sequence of a polypeptide chain can also be determined indirectly by sequencing the nucleotides of its gene. This is a relatively easy process if the gene encoding the polypeptide is already known.
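
As a rough sketch of how a nucleotide sequence maps to an amino acid sequence, the Python below reads a DNA coding strand three nucleotides at a time. The codon table is deliberately partial and contains only the entries needed here. The input sequence is a made-up example, not a real gene, and the sketch ignores details such as locating the start codon.

```python
# Partial codon table (DNA coding-strand codons); only the entries needed here.
CODON_TABLE = {
    "GTT": "Val",
    "GGT": "Gly",
    "ATG": "Met",
    "CTT": "Leu",
    "TAA": "Stop",
}


def translate(coding_dna):
    """Read the sequence three nucleotides at a time and return the residues."""
    residues = []
    for start in range(0, len(coding_dna) - 2, 3):
        codon = coding_dna[start:start + 3]
        amino_acid = CODON_TABLE[codon]  # raises KeyError for codons not in the partial table
        if amino_acid == "Stop":
            break
        residues.append(amino_acid)
    return residues


# Hypothetical coding sequence for the tetrapeptide Val-Gly-Met-Leu.
print(translate("GTTGGTATGCTTTAA"))
```

A complete implementation would use the full table of 64 codons.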

Secondary Structure

The polypeptide chains in a protein neither lie in a straight line nor fold at random. Rather, amino acids lying close to one another in the linear polypeptide chain form regular arrangements. These regular arrangements, formed by interactions among the amino acids, are termed the secondary structure of proteins. The three most commonly occurring secondary structures are the alpha-helix, beta-sheets, and beta-bends (beta-turns).

Each of these secondary structures is described below.

Alpha-Helix

The alpha-helix is the most common polypeptide helix found in nature. It is a spiral structure in which a tightly packed polypeptide backbone forms the core and the side chains of the amino acids point outwards. This outward orientation avoids steric hindrance between the side chains.

The alpha-helix is stabilized by extensive hydrogen bonding among the amino acids of the polypeptide chain. The hydrogen bonds are formed between the carbonyl group of one peptide bond and the amide group of another peptide bond four residues further along the chain. These bonds run roughly parallel to the axis of the helix.

There are 3.6 amino acids in each turn of the alpha-helix. Amino acids that are three to four residues apart in the primary structure therefore lie close together in space in this secondary structure.
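
The figure of 3.6 residues per turn can be turned into a small worked calculation. The Python sketch below looks only at the angular position of residues as seen when looking down the helix axis. It is a simplification that ignores the rise along the axis, and the function name `angular_separation` is hypothetical.

```python
RESIDUES_PER_TURN = 3.6  # value given in the text

# Rotation around the helix axis contributed by each residue, in degrees.
DEGREES_PER_RESIDUE = 360 / RESIDUES_PER_TURN


def angular_separation(i, j):
    """Smallest angle (degrees) between residues i and j viewed down the axis."""
    angle = (abs(j - i) * DEGREES_PER_RESIDUE) % 360
    return min(angle, 360 - angle)


# Compare residue 1 with the residues one to five positions further along.
for offset in range(1, 6):
    print(offset, round(angular_separation(1, 1 + offset), 1))
```

Each residue turns the chain by about 100 degrees, so residues three and four positions apart end up only about 60 and 40 degrees apart on the face of the helix, closer than residues one, two, or five positions apart.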

Three types of amino acids disrupt the alpha-helix:

  • Proline, whose secondary amino group does not fit into the right-handed spiral of the alpha-helix
  • Charged amino acids, which repel each other and disrupt the helix when present in large numbers
  • Bulky amino acids, which distort the helix when present in large numbers

Examples

Proteins containing alpha-helices include keratin and myoglobin. Keratin is almost entirely alpha-helical and is a fibrous protein present in hair, nails, etc. Myoglobin is also highly alpha-helical, but it is a globular protein found mainly in the skeletal muscles.

Beta-sheets

The beta-sheet is another type of secondary structure found abundantly in proteins. As the name indicates, beta-sheets have a sheet-like appearance, and all of their peptide bonds are involved in hydrogen bonding.

There are two essential differences between the alpha-helix and beta-sheets:

  • Beta-sheets consist of two or more polypeptide chains, or two or more segments of the same polypeptide chain, whereas an alpha-helix is formed by a single polypeptide chain.
  • Hydrogen bonds in a beta-sheet are perpendicular to the polypeptide backbone, while they are parallel to it in the alpha-helix.

Beta-sheets can be of two types, parallel or anti-parallel:

  • In parallel sheets, the polypeptide chains run in the same direction, with their N-terminals at one end and their C-terminals at the other.
  • In anti-parallel sheets, the polypeptide chains run in opposite directions, so that the N-terminal of one chain lies next to the C-terminal of the other. Anti-parallel sheets are usually formed between segments of a single polypeptide chain when it folds back on itself.

Beta-sheets are also held together by hydrogen bonds between the peptide bond components of the polypeptide chains. There are two types of hydrogen bonds in beta-sheets:

  • Inter-chain bonds, formed between two different polypeptide chains
  • Intra-chain bonds, formed between two segments of the same polypeptide chain

Examples

Many fibrous proteins have a structure made up almost entirely of beta-sheets. Examples include the amyloid proteins found in the brain.

Beta-bends

These are the secondary structures that reverse the direction of a polypeptide chain. When a single polypeptide chain folds onto itself to form beta-sheets between its segments, the folding is done by beta-bends. These bends connect the two antiparallel sheets. 

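Beta-bends often include charged residues. A beta-bend is generally made up of four amino acids. Proline, which introduces a kink into the polypeptide chain, and glycine, which has the smallest side chain, are frequently found in beta-bends.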

In addition to the hydrogen bonds, ionic bonds between the charged groups also play a role in stabilizing beta-bends.

Super secondary structures

In making globular proteins like myoglobin, hemoglobin, etc., multiple secondary structures are combined resulting in the formation of specific geometric patterns called super secondary structures or motifs. Multiple alpha helices, beta sheets, etc. are connected via beta-bends to form these motifs. 

They usually result from the close packing of the secondary structures and assist in the folding of proteins. 

Tertiary Structure

Tertiary structure involves the formation of structural and functional units called domains. This structural level is seen only in the globular proteins like hemoglobin and myoglobin.

Within the globular proteins, the tertiary structure is arranged in such a way that it has hydrophobic amino acids buried inside the core while the hydrophilic amino acids are present on the outer surface of proteins.

Domains

Domains are the structural and functional units of polypeptides. Polypeptides having more than 200 amino acids usually have two or more domains. 

Domains have a core that is made up of super-secondary structures or motifs. The rest of the polypeptide chain is folded around this core of the domain. Folding of the polypeptide chain in each domain is independent of folding in other domains. 

Folding

Folding of the polypeptide chain is dependent on the interactions between the side chains of amino acids within a domain. The following interactions among the amino acids are responsible for stabilizing the tertiary structure and folding of the polypeptide chain. 

  1. Disulfide bonds: These are covalent bonds formed between the sulfhydryl groups of two cysteine residues, producing a cystine residue. The two cysteine residues may be hundreds of amino acids apart in the chain, but folding brings them together so that the disulfide bond can form. These bonds contribute greatly to the stability of the tertiary structure and help protect the protein against denaturation.
  2. Hydrophobic interaction: These interactions among the non-polar amino acids render them buried on the inside of the globular proteins. 
  3. Hydrogen bonds: These are formed between amino acids that have oxygen or nitrogen in their side chains. Hydrogen bonds between the hydrophilic amino acids on the surface of globular proteins and the surrounding water are responsible for the solubility of these proteins in aqueous solutions.
  4. Ionic interactions: These are formed between the charged groups of amino acids. 
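
The four interaction types listed above can be illustrated with a very simplified Python classifier. Everything in it is an assumption made for illustration: the grouping of amino acids (one-letter codes) into classes is a common textbook simplification, the function name `likely_interaction` is hypothetical, and real interactions also depend on distance and geometry, which this sketch ignores.

```python
# Simplified side-chain classes (one-letter codes); an illustrative assumption.
NONPOLAR = set("AVLIMFWP")
POSITIVE = set("KR")
NEGATIVE = set("DE")
POLAR_O_OR_N = set("STNQY")  # uncharged side chains containing O or N

NO_INTERACTION = "no specific interaction in this simplified model"


def likely_interaction(a, b):
    """Guess the main side-chain interaction between residues a and b."""
    if a == "C" and b == "C":
        return "disulfide bond"
    if a in NONPOLAR and b in NONPOLAR:
        return "hydrophobic interaction"
    if (a in POSITIVE and b in NEGATIVE) or (a in NEGATIVE and b in POSITIVE):
        return "ionic interaction"
    hydrogen_bonders = POLAR_O_OR_N | POSITIVE | NEGATIVE
    # Require at least one uncharged polar partner so that like charges
    # (for example Lys-Lys) are not reported as hydrogen bonded.
    if (a in POLAR_O_OR_N or b in POLAR_O_OR_N) and a in hydrogen_bonders and b in hydrogen_bonders:
        return "hydrogen bond"
    return NO_INTERACTION


for pair in [("C", "C"), ("L", "V"), ("K", "D"), ("S", "N")]:
    print(pair, likely_interaction(*pair))
```

The order of the checks matters: the disulfide case is tested first because cysteine does not belong to any of the other classes used here.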

Guided by these interactions, folding of the polypeptide chain occurs within the cells in seconds to minutes in a non-random, ordered fashion and the protein acquires its final functional compact form.

Quaternary Structure

Proteins that are made up of a single polypeptide chain are called monomeric proteins. The proteins that are made of multiple polypeptide chains are called multimeric proteins. 

In multimeric proteins, the arrangement of the multiple polypeptide chains is called the quaternary structure of the protein. These polypeptide chains are mostly held together by non-covalent interactions.

For example, hemoglobin is made up of four polypeptide chains: two alpha chains and two beta chains. One alpha and one beta chain together form a dimer. Thus, hemoglobin consists of two alpha-beta dimers, which are held together by non-covalent bonds in its quaternary structure.

Summary

Proteins are highly complex biopolymers. Their structure is studied at four levels of organization. 

The primary structure involves the arrangement, number, and sequencing of amino acids in polypeptide chains. 

  • Amino acids present in the polypeptide chain can be quantified and sequenced by an automated analyzer. 
  • Amino acid sequence can also be determined by the nucleotide sequence of the corresponding gene. 

The secondary structure involves the specific geometric arrangement of polypeptide chains. It includes:

  • Alpha-helix
  • Beta-sheets
  • Beta-bends

Secondary structures are stabilized by hydrogen bonds. 

Super secondary structures are formed by combining these secondary structures via beta-bends.

Tertiary structures are seen only in globular proteins and involve folding of polypeptide chains within different domains. 

Each domain is an independent structural and functional unit of protein. 

Folding inside each domain is dependent on the four types of interactions that stabilize the tertiary structure. These are:

  1. Disulfide bridges
  2. Hydrogen bonds
  3. Ionic interactions
  4. Hydrophobic interactions

Quaternary structure is seen in multimeric proteins and involves the arrangement of multiple polypeptide chains held together mainly by non-covalent interactions.

Frequently Asked Questions

What are the four levels of protein structure?

The four levels of protein structure include primary structure, secondary structure, tertiary structure and quaternary structure. 

What is the secondary structure of proteins?

The secondary structure of proteins consists of the regular arrangements formed by polypeptide chains. It includes the alpha-helix, beta-sheets, and beta-bends (beta-turns).

What are the types of proteins?

On the basis of shape, there are two types of proteins: fibrous proteins and globular proteins. Fibrous proteins are in the form of fibres and are built mainly from secondary structures. Globular proteins have a globular shape and have tertiary or quaternary structures.

Where are structural proteins found?

Structural proteins include collagen, keratin, elastin, etc. They are found in ligaments, bones, hair, nails, beaks, feathers, etc. 
