The Structure of Genes and the Control of Gene Expression


  • Regulation of the patterns of gene expression in eukaryotic cells is a complicated process that occurs at various levels, from elements within the DNA to modification of complete proteins.
  • Not every piece of DNA holds instructions to make a protein; some sequences exist to regulate the expression of other genes.
  • Genes are made up of introns and exons. Introns are transcribed but are removed before the final mRNA is produced.
  • Promoters and enhancers are regulatory elements that do not code for proteins, but help ensure that genes are expressed in the correct place at the correct time.

DNA transmits its information to the cell mainly by encoding proteins. The genome (all the DNA in the cell) does not simply consist of a sequence of letters for the cell to interpret. It can be divided into functional units called genes. Genes encode specific proteins, which carry out most of the cell's work and allow it to perform its required functions.

A very simplistic view of the eukaryotic genome is that every gene codes for one protein with a particular function in the cell. However, the reality is much more complex, as only about 1.5% of the human genome is made up of sequences that actually code for proteins. What is the rest of the DNA doing? Most of this so-called “non-coding” DNA was initially considered to be “junk”, but more recent studies have shown that many of these non-protein-coding regions are important in controlling gene activity.

To better understand what these non-coding regions are doing, we first have to understand the structure of a eukaryotic gene.

How is a gene structured?

The genome is not simply a chain of protein-coding genes one after the other. Even within one gene, the protein-coding sequences are interrupted by non-coding regions. These non-coding interruptions are known as intervening sequences or introns. Conversely, the coding sequences that are actually expressed are called exons.

Most, but not all, eukaryotic structural genes contain introns. Importantly, these introns are initially transcribed along with the exons to form the pre-mRNA. They are then cut out of the transcript, and the remaining exons are joined together before mRNA processing is complete. This process is called RNA splicing. The completed, processed mRNA is called the mature mRNA.
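The splicing step described above can be sketched as a toy Python model. This is only an illustration: the segment sequences and the function names `transcribe_pre_mrna` and `splice` are invented for this example, and real splicing depends on sequence signals at the intron boundaries that are not modelled here.

```python
# Toy model of a gene: an ordered list of (kind, RNA sequence) segments.
# The sequences are made up for illustration and are not from a real gene.
gene_segments = [
    ("exon", "AUGGCC"),
    ("intron", "GUAAGUUUCAG"),
    ("exon", "UUUGAA"),
    ("intron", "GUGAGUCCCAG"),
    ("exon", "CCGUAA"),
]


def transcribe_pre_mrna(segments):
    """The pre-mRNA contains both exons and introns, in order."""
    return "".join(seq for _, seq in segments)


def splice(segments):
    """Remove the introns and join the remaining exons in order."""
    return "".join(seq for kind, seq in segments if kind == "exon")


pre_mrna = transcribe_pre_mrna(gene_segments)
mature_mrna = splice(gene_segments)

print("pre-mRNA:   ", pre_mrna)
print("mature mRNA:", mature_mrna)
```

The mature mRNA is shorter than the pre-mRNA because only the exon sequences remain.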

Generally, more complex organisms have more and larger introns. One reason for the existence of the intron/exon structure is that exons can code for different functional regions of proteins. By including or excluding certain exons, a gene can produce various forms of its protein in different tissues or at different times. This is possible because the splicing machinery can skip certain exons and include others, creating transcripts with different sequences. This process is called alternative splicing, and it represents an important layer of control over the proteins that are produced in cells.
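As a rough sketch of alternative splicing, the Python example below builds two different transcripts from the same set of exons. The exon sequences, the exon numbering and the helper `splice_isoform` are hypothetical and exist only to show the idea of exon inclusion and exclusion.

```python
# Hypothetical exons of a single gene, keyed by their order in the gene.
exons = {
    1: "AUGGCC",
    2: "UUUGAA",
    3: "GGGCCC",
    4: "CCGUAA",
}


def splice_isoform(exons, included):
    """Join only the chosen exons, keeping their original order."""
    return "".join(exons[number] for number in sorted(included))


# Two transcripts from the same gene: one skips exon 3, the other skips exon 2.
isoform_a = splice_isoform(exons, {1, 2, 4})
isoform_b = splice_isoform(exons, {1, 3, 4})

print("isoform A:", isoform_a)
print("isoform B:", isoform_b)
```

Both transcripts start and end with the same exons, but the middle region differs, so the two resulting proteins would share some functional regions and differ in others.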

Gene Expression

Genes encode proteins, and proteins dictate cell function. Therefore, the combination of genes that are active in a particular cell determines the identity of the cell and the tasks it carries out. The activation of genes is called gene expression. It is tightly regulated at several points: the initial process of transcription, splicing, translation of the protein, and modification of the final protein structure. Control at each of these steps allows cell types to be maintained with the required characteristics.

How is gene expression regulated?

The main control point for gene expression is usually the start of transcription. That is, the cell controls the signal that tells it to produce a particular mRNA and, from it, a particular protein.

Transcript processing provides an additional level of regulation. This level of regulation includes splicing, where alternative transcripts can be produced depending on the needs of the cell. Additionally, newly synthesized transcripts can be enzymatically broken down to control protein levels in the cell in response to different cues.

The variety of cell types that exist in a multicellular organism comes from the variety of possible gene expression profiles. Different cell types possess distinct sets of regulators that initiate or repress the production of different transcripts and proteins.

Regulatory elements

As mentioned previously, a substantial part of the genome consists of regulatory sequences. These sequences do not code for a protein themselves, but they are key in modulating the expression of other genes.

Transcription usually starts when RNA polymerase binds to an important regulatory region called the gene promoter. This sequence is usually located just upstream from the starting point for transcription. The binding of specific proteins to the promoter is usually required for the gene to be transcribed. Thus, the cell can regulate whether the gene is expressed or not through these regulatory factors, which can be general or cell-type specific.
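The idea that a gene is only transcribed when the right proteins are available to bind its promoter can be sketched in Python. This is a deliberately simplified, all-or-nothing model: the gene names, factor names and functions below are invented for illustration, and real promoters respond in a graded way to many factors.

```python
# Hypothetical genes and the regulatory factors their promoters require.
required_factors = {
    "housekeeping_gene": {"general_factor"},
    "liver_gene": {"general_factor", "liver_factor"},
}

# Hypothetical cell types and the regulatory factors present in each.
cell_types = {
    "liver_cell": {"general_factor", "liver_factor"},
    "skin_cell": {"general_factor"},
}


def is_transcribed(gene, factors_in_cell):
    """A gene is transcribed only if every required factor is present."""
    return required_factors[gene] <= factors_in_cell


def expressed_genes(factors_in_cell):
    """List the genes that this cell's set of factors can switch on."""
    return sorted(g for g in required_factors if is_transcribed(g, factors_in_cell))


for name, factors in cell_types.items():
    print(name, "expresses", expressed_genes(factors))
```

In this sketch the general factor switches on the housekeeping gene in both cells, while the cell-type-specific factor restricts the second gene to the liver cell.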

More recently, another type of regulatory element has been discovered, called the enhancer sequence. Enhancers also act as binding sites for regulatory proteins, and they help to control and fine-tune the expression of a gene. There are no strict criteria for what defines an enhancer region, and the landscape of enhancers in the genome is hugely complicated. Some enhancers are only active in certain cell types or conditions, while others are active all the time. One interesting feature of enhancers is that they can be located thousands of nucleotides away from the gene they control. Further complicating things, some regulatory elements or proteins can affect the transcription of multiple genes, and some regulatory proteins can even have different roles for different genes!
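One way to picture the fine-tuning role of enhancers is the toy Python function below. It assumes, purely for illustration, that a bound promoter gives a basal level of expression and that each active enhancer multiplies that level by some factor. The function name, the multiplicative rule and the numbers are all assumptions and are not measured values.

```python
def expression_level(promoter_bound, active_enhancer_boosts, basal_rate=1.0):
    """Toy model: no promoter binding means no transcription.

    Each active enhancer multiplies the basal rate by its boost factor.
    The multiplicative rule is an assumption made for this sketch.
    """
    if not promoter_bound:
        return 0.0
    level = basal_rate
    for boost in active_enhancer_boosts:
        level *= boost
    return level


# The same gene in three hypothetical situations.
print(expression_level(False, [5.0]))      # promoter not bound
print(expression_level(True, []))          # promoter bound, no active enhancers
print(expression_level(True, [2.0, 3.0]))  # promoter bound, two active enhancers
```

The point of the sketch is that the promoter acts like an on/off requirement, while enhancers adjust how strongly the gene is expressed once it is on.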

The timed switching on and off of genes represents one layer of control over the protein content of a cell, and thus the identity of that cell. Part of the complexity of the eukaryotic genome comes from the presence of multiple regulators on the DNA. The concerted actions of regulatory sequences in the DNA, the presence of regulatory factors, and the post-transcriptional and post-translational processing of a transcript or a protein allow fine-tuning of the gene expression patterns in a cell. This allows cells to perform their required tasks and to respond quickly to changes in the environment.
