DNA – A double helix made of base pairs
In order to understand epigenetics, it helps to have a broad view of the human genomic landscape. This module will review the basic organization of the genome, from the tiny units called base pairs through larger structures, such as chromosomes.
DNA contains four bases, represented by the letters A (for adenine), T (for thymine), G (for guanine), and C (for cytosine). These bases are attached to a “string,” or backbone, made of alternating phosphate groups and a special type of sugar called deoxyribose. DNA contains two of these strings twisted around each other: a double helix. The bases on one side of the double helix pair up with the bases on the other side, and these pairings hold the two strings together. The bases in DNA are selective about their pairings. Normally, they form only two specific combinations: G pairs with C, and A pairs with T. These A-T and G-C base pairs (abbreviated bp) are not only the foundation of DNA; they are also the basic unit of measurement in the genome.
Figure 1. DNA has a double-helix shape. Bases are found in pairs on the inside of the double helix. The bases in DNA are named A, T, G, and C. T forms pairs with A, and vice versa. G forms pairs with C, and vice versa. Base pairs can be used as units to measure the length of DNA.
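The pairing rules described above can be written as a small lookup table. The following is a minimal Python sketch, not part of the original module: the function names and the example sequence are hypothetical, and the sketch pairs bases position by position without modeling the direction of each strand.

```python
# Pairing rules as stated in the text: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand: str) -> str:
    """Return the bases that pair with `strand`, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())


def length_in_bp(strand: str) -> int:
    """Each base on one strand forms one base pair with its partner,
    so the length in bp equals the number of bases on one strand."""
    return len(strand)


example = "GATTACA"  # hypothetical sequence, for illustration only
print(partner_strand(example))  # CTAATGT
print(length_in_bp(example))    # 7
```

Because the pairing is selective, knowing one side of the double helix is enough to work out the other side.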
Figure 1 (above) shows the stereotypical model of DNA: a double-helix shape with color-coded base pairs. While this model provides a good introduction to DNA, it is an oversimplification of the architecture of the human genome. The double-helix model in Figure 1 is like zooming in on a single leaf on a single tree, when what you are really trying to understand is the forest. In reality, there is such a vast amount of DNA in the genome that it must be divided up and condensed many, many times in order to be packaged inside the microscopic cells of the human body. The sections below first quantify exactly how big a storage problem this is and then slowly zoom out to provide a bird’s-eye view of the human genome.
Big genomes lead to storage problems
The human genome is very large; it is made of over 6 billion bp, or about 2 meters of DNA (over 6 and a half feet), which contains about 38,000 genes. Nearly every cell in the body contains its own complete copy of the genome. This massive quantity of DNA creates a storage problem. Consider: the largest cell in the human body is the oocyte, or “egg cell,” which is only found in females and is only about a tenth of a millimeter in diameter, barely large enough to see without a microscope. Two meters of DNA is roughly 20,000 times longer than that cell is wide, and the DNA must fit inside the cell’s nucleus, which is smaller still. Additionally, not every cell in the body needs access to all 38,000 genes all the time. If all the cells in every tissue and organ tried to express every gene in the genome simultaneously, the chaos would be completely incompatible with life.
Our bodies manage both the storage problem and the gene expression problem through many levels of DNA compression. Compressing the genome allows it to fit inside the nucleus of a single cell. By controlling which parts of the genome are more or less tightly compressed, a cell can access just those parts of the DNA which are necessary to its long-term role in the body, while retaining the flexibility to respond to pressing demands from a changing environment.
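The figure of about 2 meters of DNA can be checked with simple arithmetic. The sketch below assumes that each base pair adds about 0.34 nanometers to the length of the double helix; that spacing is a commonly quoted value and is not stated in this module, and the function name is hypothetical.

```python
# Assumption (not from the text): each base pair adds about 0.34 nanometers
# to the length of the double helix.
NM_PER_BP = 0.34
GENOME_BP = 6_000_000_000  # "over 6 billion bp", from the text
METERS_PER_FOOT = 0.3048


def dna_length_meters(base_pairs: int, nm_per_bp: float = NM_PER_BP) -> float:
    """Stretched-out length of a DNA molecule, in meters."""
    return base_pairs * nm_per_bp * 1e-9  # 1 nanometer = 1e-9 meters


length_m = dna_length_meters(GENOME_BP)
length_ft = length_m / METERS_PER_FOOT
print(f"{length_m:.2f} m")    # 2.04 m
print(f"{length_ft:.1f} ft")  # 6.7 ft
```

Under that assumption, 6 billion bp comes out to about 2 meters, which agrees with the length given above.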
Major motifs in genome compression
DNA compression has three major repeating motifs: lines, coils, and loops.
Figure 3. Lines, coils, and loops are the three major motifs of DNA compression.
In the sections that follow, we will watch as a linear strand of DNA becomes coiled (forming chromatosomes). Then the coil itself will also be coiled, looped, and coiled again (forming chromatin). In the end, DNA will be condensed into a shape that looks roughly like a line (forming chromosomes). The contortions that the genome undergoes are dizzying, but it is important to appreciate their complexity in order to understand why epigenetics is a necessary form of regulation in big genomes and how epigenetic changes act on multiple levels of genomic organization.
Beginning with the end in mind: Chromosomes
It will help to know in advance that human DNA does not exist as one continuous double-helix string. Human DNA is divided into 46 “strings” of DNA, which can be organized into 23 pairs. Each separate string is called a chromosome, a term we will return to shortly.
Figure 4. With one exception (human mitochondrial DNA), all human DNA is organized into structures called chromosomes. The figure above represents a karyotype—a diagram or photograph of all the chromosomes in a single cell. Human beings have 46 chromosomes in each cell, which form 23 pairs. One copy in each pair is inherited from a person’s mother (yellow) and the other is inherited from a person’s father (blue). The chromosomes in the 23rd pair are called the “sex chromosomes” and are labeled X or Y. Usually, but not always, a biological male has one X chromosome and one Y chromosome (XY). Usually, but not always, a biological female has two X chromosomes (XX).
Nucleosomes & chromatosomes
The first level of DNA compression is the nucleosome. Nucleosomes are made of DNA wrapped around a core of proteins. The center of each nucleosome is made of eight pieces of protein called histones, which come together to form a histone core.
Figure 5. Eight histones come together to form one histone core. A histone core is at the center of every nucleosome.
If DNA is like a string, then histone cores are the spools about which this string is wound.
Figure 6. DNA coils around the histone core to form a nucleosome.
Pieces of DNA called linker DNA protrude from each “spool” to connect one nucleosome to the next. A special linker histone makes contact with the linker DNA as it enters and exits the nucleosome. When you add a linker histone to a nucleosome, the entire unit is called a chromatosome.
Figure 7. A chromatosome is constructed by adding a linker histone to a nucleosome. The linker histone pinches linker DNA as it enters and exits the nucleosome.
Each chromatosome can hold only 166 bp of DNA (146 bp around the nucleosome, plus 20 bp held by the linker histone) [5,6]. Human genes vary in length, with a median length of 14,000 bp and an average of 27,000 bp [7]. Thus, many chromatosomes are required to spool up a single gene, and many more are needed to contain the entire genome.
Figure 8. Many chromatosomes are needed to spool up the entire human genome. Linker DNA connects one chromatosome to the next.
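To get a feel for how many chromatosomes "many" means, the numbers above can be divided out. This is a rough Python sketch with a hypothetical function name. It assumes every chromatosome holds exactly 166 bp and ignores any additional linker DNA between chromatosomes, so the results are upper estimates rather than measured counts.

```python
import math

# From the text: 146 bp around the nucleosome plus 20 bp held by the linker histone.
BP_PER_CHROMATOSOME = 146 + 20


def chromatosomes_needed(base_pairs: int) -> int:
    """Smallest whole number of chromatosomes that can hold `base_pairs` of DNA."""
    return math.ceil(base_pairs / BP_PER_CHROMATOSOME)


print(chromatosomes_needed(14_000))  # median-length gene: 85
print(chromatosomes_needed(27_000))  # average-length gene: 163

genome_count = chromatosomes_needed(6_000_000_000)
print(f"about {genome_count / 1e6:.0f} million")  # about 36 million
```

Even a gene of median length needs dozens of spools, and the whole genome needs tens of millions.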
The next level of DNA compression is chromatin. Chromatosomes and the linker DNA between them are twisted together into a tight coil called a chromatin fiber. The chromatin fiber can be further condensed by looping it and then coiling it again.
Figure 9. The structure called “chromatin” actually represents several levels of DNA compression. Chromatosomes coil to form 30-nanometer chromatin fibers. Thirty-nanometer chromatin fibers loop to form 300-nanometer chromatin fibers. Three-hundred-nanometer fibers are further coiled to form 700-nanometer chromatin fibers.
The final level of DNA compression is the chromosome. Chromatin fibers are folded on themselves to produce a linear chromosome. Recall: human DNA is not contiguous. It is divided into 46 pieces. Each of these 46 pieces is a separate chromosome, all of which are contained in the nucleus.
Figure 10. Chromatin folds to form a condensed, linear structure called a chromosome. A complete set of 46 chromosomes fits inside the nucleus of each cell in the human body.
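Dividing the genome evenly among its 46 chromosomes gives a sense of how much DNA each one must package. The sketch below is only an average: real chromosomes differ widely in size, the function name is hypothetical, and the 0.34 nanometers per base pair spacing is an assumption that does not come from this module.

```python
GENOME_BP = 6_000_000_000  # "over 6 billion bp", from the text
CHROMOSOMES = 46           # from the text
NM_PER_BP = 0.34           # assumption: length added by each base pair


def average_chromosome(genome_bp: int, chromosomes: int) -> tuple[float, float]:
    """Return (average bp per chromosome, average stretched length in centimeters)."""
    avg_bp = genome_bp / chromosomes
    avg_cm = avg_bp * NM_PER_BP * 1e-7  # 1 nanometer = 1e-7 centimeters
    return avg_bp, avg_cm


avg_bp, avg_cm = average_chromosome(GENOME_BP, CHROMOSOMES)
print(f"{avg_bp / 1e6:.0f} million bp")  # 130 million bp
print(f"{avg_cm:.1f} cm")                # 4.4 cm
```

On average, then, each chromosome condenses several centimeters of DNA into a structure small enough to share a nucleus with 45 others.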
When genes are being expressed (being transcribed), the many levels of DNA compression cause a problem. It is necessary for enzymes to access the DNA and gently pull apart a section of the double helix in order to read the underlying code, but chromatin and histones are a prohibitive barrier. Thus, the cell has ways of modifying histones or temporarily removing them from key sites. Increasing and decreasing the accessibility of specific genes, or even whole chromosomes, is the secret of epigenetics.
Tweaking the degree of compression is one of the major tools of epigenetic regulation.
Figure 11. DNA is packaged and compressed into chromosomes that fit within the nucleus of a cell. Epigenetics regulates gene expression by altering the accessibility of specific parts of the genome to enzymes and other important molecules. This is achieved, in part, by changing the level of compression at key places in the genome.
References
Morton, N. E. Parameters of the human genome. Proc. Natl. Acad. Sci. U. S. A. 88, 7474–6 (1991).
Ezkurdia, I. et al. Multiple evidence strands suggest that there may be as few as 19,000 human protein-coding genes. Hum. Mol. Genet. (2014). doi:10.1093/hmg/ddu309
Bianconi, E. et al. An estimation of the number of cells in the human body. Ann. Hum. Biol. 40, 463–71 (2013).
Bednar, J. & Dimitrov, S. Chromatin under mechanical stress: from single 30 nm fibers to single nucleosomes. FEBS J. 278, 2231–43 (2011).
Luger, K., Maeder, A. W., Richmond, R. K., Sargent, D. F. & Richmond, T. J. Crystal structure of the nucleosome core particle at 2.8 Å resolution. Nature 389, 251–260 (1997).
International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).
Teaching / Learning Materials
- U. of Utah Genetic Science Learning Center: Epigenetics Multimedia
- NOVA ScienceNow: Epigenetics with Dr. Randy L. Jirtle (Video)
- NHGRI Fact Sheet: Epigenomics
- NCBI Bookshelf: Epigenomics Help – Epigenomics Scientific Background
- Geneimprint: Gateway to gene imprinting information
- EpiGenie Science Writers
- Scitable by Nature Education: Genetics Topic Room
Organizations and Landmark Projects
- International Human Epigenome Consortium
- NIH Roadmap Epigenomics Project
- ENCODE Project: Encyclopedia of DNA Elements
- NHGRI: National Human Genome Research Institute
- NIH Epigenomics Program
- NCHPEG BSSR: Genetics and Social Science – Expanding Transdisciplinary Research
Scholarly Review Articles
- Environmental epigenomics and disease susceptibility Jirtle R. and Skinner M., Nat Rev Genet (2007) 8(4):253-62 doi: 10.1038/nrg2045
- Epigenetics, chromatin and genome organization: recent advances from the ENCODE project Siggens L. and Ekwall K., J Intern Med (2014) doi: 10.1111/joim.12231
- Epigenetics as a unifying principle in the aetiology of complex traits and diseases Petronis A., Nature (2010) 465(7299):721-7 doi: 10.1038/nature09230
- Epigenetics and the environmental regulation of the genome and its function Zhang T. and Meaney M., Annu Rev Psychol (2010) 61:439-66 doi: 10.1146/annurev.psych.60.110707.163625
Epigenomic Data: Visualize, Browse, and Download