In response to a question on CIX which surprised me...
Someone asked what it meant that their genome was only a couple of percent different to that of a chimpanzee. When told about their having megabytes of genome, they were surprised - which astonished me. So I wrote this.
Genomes are *big*.
Each strand of DNA in the pair that forms a double helix has one of 4
possible "letters" (chemically, "bases") in each position: A, G, C or T.
Each only pairs with one matching letter: A <=> T or G <=> C. These base
pairs (BPs) are the building blocks of the genetic code.
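To make the pairing rule concrete, here is a minimal Python sketch. The names PAIRS and complement are my own, purely for illustration, and it ignores the direction in which each strand is read; it only shows that one strand fully determines the other.

```python
# Each base pairs with exactly one partner: A <=> T, G <=> C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the bases that would sit opposite `strand` in the double helix."""
    return "".join(PAIRS[base] for base in strand.upper())


if __name__ == "__main__":
    print(complement("GATTACA"))  # CTAATGT
```

Because the second strand carries no extra information, a genome's size is counted in base pairs rather than in individual bases.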
Humans have around 3,000,000,000 base pairs: 3 billion or 3 thousand
million.
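For anyone who thinks in bytes, the arithmetic can be sketched in a few lines of Python. This is a back-of-envelope estimate, not a real file format: it assumes the most naive encoding possible (2 bits per base pair, since there are 4 letters), and it takes "a couple of percent" to mean exactly 2%. Both figures and all the names are my assumptions.

```python
HUMAN_BASE_PAIRS = 3_000_000_000   # "around 3 billion", as quoted above
BITS_PER_BASE_PAIR = 2             # 4 possible letters -> 2 bits (assumed naive encoding)
CHIMP_DIFFERENCE_PERCENT = 2       # "a couple of percent", taken as exactly 2 (assumption)


def genome_bytes(base_pairs: int, bits_per_base_pair: int = BITS_PER_BASE_PAIR) -> int:
    """Storage needed for one copy of the genome, in bytes."""
    return base_pairs * bits_per_base_pair // 8


def differing_base_pairs(base_pairs: int, percent: int) -> int:
    """How many base pairs a given percentage difference amounts to."""
    return base_pairs * percent // 100


if __name__ == "__main__":
    print(genome_bytes(HUMAN_BASE_PAIRS) / 1_000_000, "MB")
    print(differing_base_pairs(HUMAN_BASE_PAIRS, CHIMP_DIFFERENCE_PERCENT), "base pairs differ")
```

On those assumptions the genome comes to about 750 megabytes, and a 2% difference is about 60 million base pairs: a small percentage of a very large number.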
The human genome is thus pretty big. It's far from the biggest, though;
the genome of the onion is about 20x larger and the largest currently
known genome belongs to a single-celled organism, a species of amoeba -
/Amoeba dubia/ - at 670 thousand million base pairs.
Only about 2-3% of this codes for proteins or contains "punctuation" such
as start/stop messages. The rest is of unknown purpose, the so-called
"junk DNA" - although that term is now deprecated as some of it seems to
do *something.*
In humans, that 2-3% comprises about 35,000 genes.
Many bacteria get by with only about 2,000 genes. /Mycoplasma genitalium/
has the smallest known genome of a self-reproducing organism, at 483 genes
in 580,000 BPs. The thale cress /Arabidopsis thaliana/ has the smallest
known plant genome at 25,498 genes in 115,409,949 BPs.
Yeast has about 6,000 (in 12M BPs) and rice about 60,000 (in 430M BPs).
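One way to compare these figures is to divide genome size by gene count. The Python sketch below does that using only the numbers quoted in this post (rounded as given above); the table layout and the bp_per_gene name are mine. Note that the result is total genome per gene, not the length of a gene, since most of the genome isn't genes at all.

```python
# (genes, base pairs), using only the figures quoted in the text.
GENOMES = {
    "Mycoplasma genitalium": (483, 580_000),
    "Yeast": (6_000, 12_000_000),
    "Arabidopsis thaliana": (25_498, 115_409_949),
    "Rice": (60_000, 430_000_000),
    "Human": (35_000, 3_000_000_000),
}


def bp_per_gene(genes: int, base_pairs: int) -> float:
    """Total genome size divided by gene count (not the length of a gene)."""
    return base_pairs / genes


if __name__ == "__main__":
    # List the organisms from smallest genome to largest.
    for name, (genes, bps) in sorted(GENOMES.items(), key=lambda kv: kv[1][1]):
        print(f"{name:<24}{genes:>8,}{bps:>16,}{bp_per_gene(genes, bps):>12,.0f}")
```

On these numbers the human genome has far more base pairs per gene than any of the others listed, which fits with only 2-3% of it coding for anything.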
Source for most of the figures:
http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/G/GenomeSizes.html