Manipulating Genomes (OCR A-Level Biology A): Revision Notes
Uniqueness of DNA
DNA contains the genetic instructions for all living organisms, and whilst much of our DNA is shared with other humans and even other species, specific regions make each individual's DNA unique. This uniqueness forms the basis of DNA databases used in forensic science and identification.

DNA databases and individual identity
DNA held in forensic databases represents only a small portion of an individual's total genetic material. However, this portion is sufficient to identify a person uniquely, with the exception of identical twins, who share the same DNA sequence.
Although humans share the majority of their genome with each other and have significant DNA similarities with other species, certain regions show variation between individuals. These variable regions can be analysed to create a unique genetic profile.
The large shared portion makes sense when we consider that humans use many of the same enzymes for fundamental processes such as respiration, and share developmental genes such as homeobox genes with other eukaryotes.
DNA analysis techniques, combined with sophisticated data storage and retrieval systems, have transformed our ability to identify individuals and understand genetic diversity since the 1980s.
What is a genome?
The genome is the minimum quantity of genetic material that contains one complete set of all the genes of an individual, population, or species. In humans, this includes the DNA found in 22 autosomes (non-sex chromosomes), the X chromosome, the Y chromosome, and mitochondrial DNA. Plant genomes also include chloroplast DNA.
Genomics is the application of genetics and molecular biology techniques to map genes on chromosomes and determine the complete base sequence of genes or entire genomes.
Genome size and complexity
Genomes vary dramatically in size and complexity across different organisms. Since most genomes consist of double-stranded DNA, their size is measured in:
- Base pairs (bp): individual paired nucleotides
- Kilobase pairs (kbp): 1000 bp (10^3 bp)
- Megabase pairs (Mbp): 1 000 000 bp (10^6 bp)
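As a quick illustration of these units, the following Python sketch converts a size in base pairs into the most convenient unit. The function name `format_genome_size` is hypothetical, chosen only for this example, and the thresholds simply follow the definitions 1 kbp = 1000 bp and 1 Mbp = 1 000 000 bp.

```python
BP_PER_KBP = 1_000        # 1 kbp = 10^3 bp
BP_PER_MBP = 1_000_000    # 1 Mbp = 10^6 bp


def format_genome_size(size_bp: int) -> str:
    """Return a genome size as text in bp, kbp or Mbp (hypothetical helper)."""
    if size_bp >= BP_PER_MBP:
        return f"{size_bp / BP_PER_MBP:.1f} Mbp"
    if size_bp >= BP_PER_KBP:
        return f"{size_bp / BP_PER_KBP:.1f} kbp"
    return f"{size_bp} bp"


print(format_genome_size(115_400_000))  # 115.4 Mbp
print(format_genome_size(370_000))      # 370.0 kbp
print(format_genome_size(500))          # 500 bp
```

The example inputs are arbitrary sizes chosen to exercise each unit.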
Examples of genomes
| Organism | Genome structure | Size | Number of genes |
|---|---|---|---|
| HIV (virus) | Single-stranded RNA | ~9700 bases | 9 |
| Escherichia coli (bacterium) | Single, circular double-stranded DNA | bp | ~4000 |
| Mouse (Mus musculus) | Linear double-stranded DNA in 19 autosomes + X and Y chromosomes | bp | ~23000 |
| Mouse mitochondrial genome | Circular double-stranded DNA | bp | 13 proteins + tRNA and rRNA genes |
| Human (Homo sapiens) | Linear double-stranded DNA in 22 autosomes + X and Y chromosomes | bp | ~21000 |
| Human mitochondrial genome | Circular double-stranded DNA | bp | Proteins + tRNA and rRNA genes |
This comparison reveals several key patterns:
- Viral genomes are the simplest, containing minimal genetic information
- Prokaryotic genomes are smaller and less complex than eukaryotic genomes
- Interestingly, humans have slightly fewer genes than mice (~21000 vs ~23000), despite our perceived complexity
- Mitochondrial genomes are remarkably similar in size between mice and humans
Structure of eukaryotic genomes
Eukaryotic genomes contain several distinct types of DNA sequence, each with different functions.
Structural genes
Structural genes code for polypeptides by specifying the sequence of amino acids. Each structural gene contains:
- Exons: coding sequences that are transcribed into mRNA and translated into protein
- Introns: non-coding sequences that are transcribed but removed during mRNA processing before translation
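The removal of introns can be pictured with a toy model. The sketch below does not use a real gene: the sequence, the exon coordinates and the function name `splice` are all invented for illustration. It simply joins the exon segments of a transcript and discards everything in between.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments (0-based start, end-exclusive) and drop the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Invented transcript: exons in upper case, introns in lower case for readability.
pre_mrna = "AUGGCC" + "guaaguuuag" + "UUUGGA" + "guccucag" + "CCAUAA"
exons = [(0, 6), (16, 22), (30, 36)]

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)            # AUGGCCUUUGGACCAUAA
print(len(mature_mrna) // 3)  # 6 codons
```

In this made-up transcript, 36 bases are transcribed but only 18 remain after the two introns are removed.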
Worked Example: Human Haemoglobin β-polypeptide Gene
The gene coding for the β-polypeptide of human haemoglobin demonstrates the typical structure of a eukaryotic gene:
Total gene length: bp
Exon composition:
- Three exons with a combined length of bp
- These code for a polypeptide of 146 amino acids
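Calculation: Since the genetic code is a triplet code, three bases specify one amino acid, so number of amino acids = number of coding bases ÷ 3.
Intron composition:
- Remaining sequence (total gene length minus combined exon length): bp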
- This consists of two introns that are removed during mRNA processing
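The arithmetic behind a worked example of this kind follows two simple rules: number of amino acids = coding bases ÷ 3, and intron length = total gene length minus combined exon length. The sketch below expresses those rules in Python. The function names and the numbers passed in are placeholders assumed for illustration, not the figures for the haemoglobin gene.

```python
def amino_acids_coded(exon_bp: int) -> int:
    """Triplet code: three bases specify one amino acid."""
    return exon_bp // 3


def intron_length(gene_bp: int, exon_bp: int) -> int:
    """Whatever is not exon within the gene is treated as intron."""
    return gene_bp - exon_bp


# Placeholder values only (assumed for the example, not taken from the gene).
gene_bp, exon_bp = 1500, 450
print(amino_acids_coded(exon_bp))       # 150
print(intron_length(gene_bp, exon_bp))  # 1050
```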
Regulatory genes
Regulatory genes control gene expression. Some regulatory genes code for proteins called transcription factors, which bind to DNA and influence whether a gene is transcribed. Others code for various forms of RNA that also control transcription.
Promoters
Promoters are control sequences located upstream (towards the 5' end) of genes. They serve as binding sites for RNA polymerase at the start of transcription. The presence and accessibility of promoters determine whether a gene can be transcribed.
Non-coding DNA
Long stretches of DNA separate structural and regulatory genes. Previously dismissed as "junk DNA", these regions are now understood to play roles in gene regulation and chromosome structure.
Organellar genomes
Eukaryotic genomes include DNA in organelles:
- Mitochondrial DNA (mtDNA): present in all eukaryotes, this circular double-stranded DNA resembles prokaryotic genomes with minimal non-coding sequences
- Chloroplast DNA (ctDNA): present in plants and some protoctists, also circular and prokaryotic-like in structure
Both organellar genomes originated from ancient prokaryotes that were engulfed by early eukaryotic cells through endosymbiosis.
The Arabidopsis thaliana genome
Thale cress (Arabidopsis thaliana) serves as a model organism for plant genetics. It has a diploid number of 10 chromosomes (2n = 10) in the nucleus, plus DNA in chloroplasts and mitochondria.

The Arabidopsis genome structure demonstrates the typical organisation of a plant genome:
| Genome component | Size (Mbp) | Number of genes |
|---|---|---|
| Nuclear chromosome I | 29.1 | 6543 |
| Nuclear chromosome II | 19.6 | 4036 |
| Nuclear chromosome III | 23.2 | 5220 |
| Nuclear chromosome IV | 17.6 | 3825 |
| Nuclear chromosome V | 25.9 | 5874 |
| Total nuclear genome | 115.4 | 25498 |
| Mitochondrion | 0.37 | 58 |
| Chloroplast | 0.15 | 79 |
This table shows that the nuclear genome contains the vast majority of genes, whilst organellar genomes are much smaller. The variation in chromosome size reflects differences in the number of genes and amount of non-coding DNA each contains.
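The totals in the table can be checked, and a rough gene density worked out, with a few lines of Python. The sizes and gene counts below are copied from the table; the variable names and the choice to express density as genes per Mbp are assumptions made for this sketch.

```python
# Values copied from the table: chromosome -> (size in Mbp, number of genes).
chromosomes = {
    "I": (29.1, 6543),
    "II": (19.6, 4036),
    "III": (23.2, 5220),
    "IV": (17.6, 3825),
    "V": (25.9, 5874),
}

# Sum the five nuclear chromosomes and compare with the "Total" row.
total_size = round(sum(size for size, _ in chromosomes.values()), 1)
total_genes = sum(genes for _, genes in chromosomes.values())
print(total_size, total_genes)  # 115.4 25498

# Gene density (genes per Mbp, rounded) for each chromosome.
density = {name: round(genes / size) for name, (size, genes) in chromosomes.items()}
print(density)
```

On these figures the density comes out at roughly 206 to 227 genes per Mbp, so the larger chromosomes carry more genes in approximate proportion to their size.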
Key Points to Remember:
- The genome contains one complete set of all genes, including nuclear and organellar DNA in eukaryotes
- Whilst humans share most DNA sequences with each other and other species, specific variable regions make each person's DNA unique (except identical twins)
- Eukaryotic genomes contain structural genes (with exons and introns), regulatory genes, promoters, and non-coding DNA
- Genome size is measured in base pairs (bp), kilobase pairs (kbp = 1000 bp), or megabase pairs (Mbp = 1 000 000 bp)