deoxyribonucleic acid : the nucleic acid in which the sugar is
deoxyribose, constituting the primary genetic material of all cellular
organisms and the DNA viruses, and occurring predominantly in the nucleus.
It is a linear or circular polymer with a backbone composed of deoxyribose
moieties that are linked by phosphate groups attached to their 5' and 3'
hydroxyls, with side chains composed of purine (adenine, guanine) and pyrimidine
(cytosine, thymine) bases attached to the sugars. The strands are twisted
to form a double helix and are antiparallel. DNA is duplicated by replication,
and it serves as a template for synthesis of ribonucleic acid (transcription).
monosaccharide
β-D-deoxyribofuranose
nitrogenous bases : one of the first things any biology student
learns is that DNA, the recipe for life, is written with 4 letters. But
what if you could add extra ones? Researchers who have managed to build,
and replicate, DNA with an ersatz fifth letter are on their way to finding
out. Getting the modified DNA to work requires the team to answer all sorts
of basic biochemistry questions. But the ultimate hope is that a few of
these artificial letters could be sprinkled into the genome of a living
microbe, to track its adaptation and evolution. A fifth base, called 3-fluorobenzene
(3FB), pairs with itself, forming a completely new base pair. The base
had to be designed so that it would fit into the DNA strand without disturbing
its structure. But a tougher challenge was to find something that DNA polymerase
would recognize. Molecules that had large flat surfaces were initially
designed so they would pack well in the DNA strand, a bit like pushing
a pile of poker chips into a neat stack. But the large flat areas overlapped
each other as they bonded, distorting the base pair so the polymerase was
unable to get past and extend the strand. 3FB is hydrophobic, and this turned
out to be enough to pair up 2 bases in a strand of DNA. As the DNA strand
replicates, the new base is picked up and matched against another 3FB
about 100 times less reliably than the standard bases are paired.
This is still fairly good: it means just one mistake per 1,000 base
pairs. Romesberg's team is trying to evolve polymerases that recognize the fake base
pairs and work with them more efficiently. So far the new polymerases are
not only better at working with Romesberg's base pairs, they are better
at matching the natural pairs too. There is no need to worry about the modified
microbes going out of control: they need a steady supply of the fake base
to reproduce unless they start making it themselves, which won't happen
this millennium.
purines
guanine (G)
adenine (A)
pyrimidines
thymine (T)
cytosine (C)
nucleotides
dAMP
dGMP
dTMP
dCMP
hydrogen bonds : in double-stranded DNA (dsDNA), adenine forms 2
hydrogen bonds with thymine, and cytosine forms 3 with guanine; these are
complementary base pairs.
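The pairing rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are hypothetical, and the bond counts are the ones given in the entry (2 for A-T, 3 for G-C).

```python
# Watson-Crick partners and hydrogen bonds per pair, as stated in the entry above.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(strand: str) -> str:
    """Return the antiparallel partner strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds in the duplex formed by strand and its partner."""
    return sum(H_BONDS[base] for base in strand.upper())
```

For the hypothetical strand "ATGC", reverse_complement gives the partner strand read in its own 5'->3' direction, and hydrogen_bonds adds 2 + 2 + 3 + 3.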
Maurice Wilkins's research provided the proof that James Watson and
Francis Crick needed to back up their theory about DNA's structure, which
is the cornerstone of understanding how the molecule replicates and transfers
its information (ref).
He pioneered a technique called X-ray fibre diffraction, which can reveal
the molecular structure of biological material such as collagen or DNA.
Previously, X-ray images could only be derived from crystals, which excluded
many large biological molecules that prefer to form strands. Wilkins worked
on the DNA project with Rosalind Franklin, who took the X-ray photograph
that gave Watson and Crick their eureka moment. He then spent almost 10
years rigorously verifying that breakthrough.
linking number : in topology, the total number of times one
strand of the DNA double helix winds around the other in a right-handed direction,
given a DNA molecule with constrained ends. 2 molecules differing only
in linking number are topoisomers.
writhing number (W) : in topology, the number of superhelical
turns in a DNA molecule with constrained ends.
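The two topological quantities above are tied together by the standard relation Lk = Tw + W, where Tw (twist, not defined in this glossary) counts the helical turns of the duplex itself. A minimal sketch, assuming that relation and an assumed relaxed B-DNA pitch of 10.5 bp per turn; the function names and the default value are illustrative only.

```python
def twist(linking_number: float, writhe: float) -> float:
    """Tw = Lk - W for a DNA molecule with constrained ends."""
    return linking_number - writhe


def relaxed_linking_number(length_bp: int, bp_per_turn: float = 10.5) -> float:
    """Lk0 of a relaxed molecule; 10.5 bp/turn is an assumed B-DNA value."""
    return length_bp / bp_per_turn


def linking_difference(linking_number: float, length_bp: int,
                       bp_per_turn: float = 10.5) -> float:
    """Lk - Lk0; topoisomers of one molecule differ only in this quantity."""
    return linking_number - relaxed_linking_number(length_bp, bp_per_turn)
```

A negative linking difference corresponds to an underwound molecule, a positive one to an overwound molecule.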
B-DNA : the usual right-handed double helical structure assumed by double-stranded
DNA.
Z-DNA : a form of DNA in which the phosphate groups form a dinucleotide
repeating unit zigzagging up a left-handed helix with a single, deep groove;
it is particularly likely to occur in stretches of alternating purines
and pyrimidines
spacer DNA : the nucleotide sequences occurring between genes, in
eukaryotes often long and including many repetitive sequences; particularly,
the DNA occurring between the genes encoding ribosomal RNA.
complementary or copy DNA (cDNA) : synthetic DNA transcribed from
a specific RNA through the action of the enzyme reverse transcriptase.
nuclear DNA (nDNA) : the DNA of the chromosomes found in the nucleus
of a eukaryotic cell.
mitochondrial DNA (mtDNA) : the DNA of the mitochondrial chromosome,
existing in several thousand copies per cell and inherited exclusively
from the mother. Its code differs both from that of nuclear DNA and from
that of any present day prokaryote, and it evolves 5 to 10 times more rapidly
than nuclear DNA.
recombinant DNA : a DNA molecule composed of linked sequences not
normally occurring within the same molecule, such as a bacterial plasmid
into which has been inserted a segment of viral DNA.
single copy DNA (scDNA) : nucleotide sequences present once in the
haploid genome, as are the majority of the gene sequences encoding polypeptides
in eukaryotes.
repetitive DNA : nucleotide sequences occurring multiply within
a genome; they are characteristic of eukaryotes and generally do not encode
polypeptides. Sequences may be clustered or dispersed, and repeated moderately
(10 to 10^4 copies per genome) to highly (>10^6 copies
per genome). Moderately repetitive DNA sequences encode some structural
genes for ribosomal RNA and histones; highly repetitive sequences are mostly
satellite DNA
satellite DNA : short, highly repeated DNA sequences found in eukaryotes,
usually in clusters in constitutive heterochromatin and generally not transcribed.
Modern DNA polymerases can polymerize even threonucleotides to form a DNA-threose nucleic
acid (TNA) hybrid molecule with excellent base-pairing properties.
Chromatin : the more readily
stainable portion of the cell nucleus, forming a network of nuclear fibrils.
It consists of deoxyribonucleic acid attached to a protein (primarily histone)
scaffold and is the carrier of the genes in inheritance. It occurs
in 2 states, euchromatin and heterochromatin, with different staining properties,
and during cell division it coils and folds to form the metaphase chromosomes.
The basic element of chromatin structure is the mononucleosome,
an octamer of histones (2 copies each of H2A, H2B, H3 and H4)
with 146 bp of DNA wrapped nearly twice around. Adjacent nucleosomes are
separated by between 20 and 60 bp of DNA referred to as linker or internucleosomal
DNA. A fifth histone, H1 ...:
..., binds to DNA as it exits the nucleosome and interacts with the linker
DNA. H1 is thought to be essential for the condensation of polynucleosomes
into higher order structures. These structures include a solenoid of helically
arrayed nucleosomes and then some superhelical twisting of solenoidal loops,
the bases of which are attached to a non-histone protein scaffold. Although
high-resolution X-ray crystallographic data exist that confirm the precise
structure of the mononucleosome and its associated DNA, higher order structures
are inferred from electron microscopic observations and various indirect
biochemical assays. Individual histones undergo posttranslational modification
by phosphorylation, methylation and acetylation in ways that are expected
to alter the local properties of chromatin structure, enhancing or inhibiting
access of proteins to specific DNA sequences. These modifications are targeted
by sequence-specific DNA-binding proteins (often transcription factors),
which recruit modifying enzymes through protein-protein interactions. Finally,
the positions of individual nucleosomes need not be static: they might
be able to slide along the DNA, transiently exposing different sequences
in the linker regions between them; factors that increase this sliding are thought to
promote accessibility. DNA wrapped in nucleosomes is sterically occluded,
creating obstacles for polymerase, regulatory, remodeling, repair and recombination
complexes, which require access to the wrapped DNA (ref)
(reproduced with permission from Nature
Reviews Immunology (Vol 3, No. 11, pp 890-899(2003)) copyright
Macmillan Magazines Ltd)
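The figures quoted above (146 bp wrapped per octamer, 20 to 60 bp of linker) allow a rough count of how many nucleosomes fit on a stretch of DNA. The sketch below is only an estimate under the assumption of evenly spaced nucleosomes; the function name and the default linker length are hypothetical.

```python
CORE_DNA_BP = 146  # DNA wrapped around one histone octamer (figure from the text)


def estimate_nucleosomes(length_bp: int, linker_bp: int = 40) -> int:
    """Estimate how many nucleosomes fit on length_bp of DNA.

    Assumes evenly spaced nucleosomes, each followed by one linker of
    linker_bp (restricted here to the 20-60 bp range quoted in the text).
    """
    if not 20 <= linker_bp <= 60:
        raise ValueError("linker_bp should lie between 20 and 60")
    return length_bp // (CORE_DNA_BP + linker_bp)
```

With the shortest linker the repeat length is 166 bp, with the longest it is 206 bp, so the estimate for a given stretch of DNA varies by roughly a fifth.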
chromosome : in animal cells, a structure
in the nucleus containing a linear thread of DNA, which transmits genetic
information and is associated with RNA and histones; during cell division,
the material (chromatin) composing the chromosome is compactly coiled,
making it visible with appropriate staining and permitting its movement
in the cell with minimal entanglement. Each organism of a species normally
has a characteristic number of chromosomes in its somatic cells, 46 being
the number normally present in man, including the 2 (XX or XY) which determine
the sex of the organism
Symbols used in chromosome nomenclature :
A–G : chromosome groups
1–22 : autosome numbers
X, Y : sex chromosomes
/ : diagonal line separating cell lines in descriptions of mosaicism
? : identification of chromosome or chromosome structure questionable
+ - : when placed before the chromosome number, these denote addition or
loss of a whole chromosome; when placed after the chromosome number, they
denote an increase or decrease in length of a chromosome part.
: : break with no reunion
:: : break with reunion
-> : from . . . to . . .
ace : acentric
cen : centromere
del : deletion
der : derivative chromosome
dic : dicentric
dup : duplication
end : endoreduplication
h : secondary constriction or negatively staining region
i : isochromosome
ins : insertion
inv : inversion
inv ins : inverted insertion
mar : marker chromosome
mat : maternal origin
p : short arm
pat : paternal origin
q : long arm
r : ring chromosome
rcp : reciprocal translocation
rec : recombinant chromosome
rob : robertsonian translocation
s : satellite
t : translocation
ter : terminal
Repeated symbols denote duplication of chromosome structure.
Symbols for rearrangements are placed before the chromosome number
and the rearranged chromosomes are placed in parentheses, e.g., t(14q21q),
r(18).
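The placement rule above (symbol first, rearranged chromosomes in parentheses) is regular enough to parse mechanically. The following is a minimal sketch covering only a handful of symbols from the table; parse_rearrangement is a hypothetical helper, not a complete nomenclature parser.

```python
import re
from typing import Tuple

# Subset of the symbol table above; extend as needed.
SYMBOLS = {
    "del": "deletion",
    "dup": "duplication",
    "i": "isochromosome",
    "ins": "insertion",
    "inv": "inversion",
    "r": "ring chromosome",
    "t": "translocation",
}

# A lowercase symbol followed by the rearranged chromosomes in parentheses.
PATTERN = re.compile(r"^([a-z]+)\(([^()]+)\)$")


def parse_rearrangement(text: str) -> Tuple[str, str]:
    """Split e.g. 't(14q21q)' into ('translocation', '14q21q')."""
    match = PATTERN.match(text.strip())
    if match is None or match.group(1) not in SYMBOLS:
        raise ValueError(f"unrecognized rearrangement: {text!r}")
    return SYMBOLS[match.group(1)], match.group(2)
```

Applied to the two examples in the text, the helper returns the rearrangement type together with the chromosome description found inside the parentheses.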
Human male chromosomes with Giemsa
banding (type G banding),
arranged as a karyotype (the full chromosome set of the nucleus
of a cell; by extension, the photomicrograph of chromosomes arranged according
to a standard classification)
bivalent (chromosome) : the
structure formed by a pair of homologous chromosomes joined by synapsis
along their length during the zygotene and pachytene stages of the first
meiotic prophase. After each of the paired chromosomes separates into 2
sister chromatids during the pachytene stage, this structure is then called
a tetrad.
daughter chromosomes : the name
for chromatids when they reach the poles of the cell in the anaphase stage
of mitosis.
gametic chromosome : chromosome
of a haploid cell (gamete).
homologous chromosomes : a
matching pair of chromosomes, one from each parent, with the same gene
loci in the same order.
dicentric chromosome : a structurally
abnormal chromosome with two centromeres.
acentric chromosome : a chromosome
with no centromere.
acrocentric chromosome : a
chromosome with the centromere near one end. In humans such chromosomes
have satellited short arms that carry genes for ribosomal RNA.
metacentric chromosome : a
chromosome with its centromere in the center and arms of equal length.
submetacentric chromosome
: a chromosome with its centromere slightly off-center so that the arms
are different in length.
telocentric chromosome : a
chromosome with a terminal centromere; not normally found in humans.
mitochondrial or small chromosome (m-chromosome) : a small single
circular chromosome within each mitochondrion, capable of synthesizing
protein since it contains ribosomal RNA, messenger RNA, and transfer RNA.
The mitochondrial chromosome carries the genes for 13 proteins and is the
basis for maternal inheritance (q.v.); some authorities consider it the
25th human chromosome in addition to the 22 autosomes and the X and Y chromosomes.
nucleolar chromosomes : those
in relation to which the nucleoli reorganize during the telophase of mitosis.
chromatid : one of the paired chromosome
strands, joined at the centromere, which make
up a metaphase chromosome, resulting from chromosome reduplication during
the S phase (DNA synthetic phase) of interphase.
nonsister chromatids : the 2
chromatids of one homologous chromosome with respect to those of the other
homologue.
sister chromatids : the 2 chromatids
of a chromosome held together by a centromere; dyads.
chromonema / chromoneme : the coiled central
thread of a chromatid, as opposed to the more densely coiled chromomere
regions.
giant or lampbrush chromosomes
: giant chromosomes of the oocytes of many lower animals arranged like
a cylindrical brush.
giant or polytene chromosomes
: giant bundles of unseparated chromonemata occurring
especially in the salivary glands of some insects
ring chromosome : a chromosome in
which both ends have been lost (deletion) and the 2 broken ends have reunited
to form a ring
heterotypical, odd
or sex chromosomes : chromosomes that are associated with the determination
of sex, in mammals constituting an unequal pair, the X and the Y chromosome.
W chromosomes : the sex chromosomes
of certain insects, birds, and fishes, in which the female is heterogametic
(i.e., has a W and a Z chromosome) and the males are homogametic
(having only Z chromosomes).
X chromosome : the female sex chromosome,
being the differential sex chromosome carried by half the male gametes
and all female gametes in man and other male-heterogametic species.
Y chromosome : the male sex chromosome,
being the differential sex chromosome carried by half the male gametes
and none of the female gametes in man and in some other male-heterogametic
species in which the homologue of the X chromosome has been retained.
somatic chromosome : a chromosome
of a diploid (tissue) cell of the body.
accessory, B or supernumerary chromosome : one or more extra chromosomes
found inconstantly in wild populations of certain species of animals; they
are not homologous to members of the regular set of chromosomes and apparently
exert little influence on the phenotype.
centromere / kinetochore / primary constriction
: the constricted portion of the chromosome at which the chromatids are
joined and by which the chromosome is attached to the spindle during cell
division. According to its location, a centromere is said to be metacentric
(central), submetacentric (off center), acrocentric (near
one end), or telocentric (at one end). The last type does not occur
in human chromosomes. At centromeres the more compact CENtromere
Protein-A (CENPA), which is necessary across eukaryotes to preserve centromere
location (ref),
replaces H3, which forms tetramers with histone H4
throughout most genomic chromatin. The spectrum of histone modifications
(such as acetylation, methylation and phosphorylation) present in human
and Drosophila melanogaster CEN chromatin is distinct from that
of both euchromatin and flanking heterochromatin. This distinct modification
pattern contributes to the unique domain organization and 3D structure
of centromeric regions, and/or to the epigenetic information that determines
centromere identity (ref).
arm ratio : a figure expressing the relation
of the length of the longer arm of a mitotic chromosome to that of the
shorter arm
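As a worked illustration of the definition above, the arm ratio is simply the longer arm length divided by the shorter one. The function name and the choice of returning infinity for a missing short arm (the telocentric case) are assumptions of this sketch; any length unit works as long as both arms use the same one.

```python
import math


def arm_ratio(p_arm: float, q_arm: float) -> float:
    """Ratio of the longer arm to the shorter arm of a mitotic chromosome."""
    if p_arm < 0 or q_arm < 0:
        raise ValueError("arm lengths must be non-negative")
    longer, shorter = max(p_arm, q_arm), min(p_arm, q_arm)
    if shorter == 0:
        # No measurable short arm (telocentric case): ratio is unbounded.
        return math.inf
    return longer / shorter
```

A ratio of 1 corresponds to arms of equal length (metacentric); the ratio grows as the centromere moves toward one end.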
Design of enzymes that work with artificial bases :
rational design
directed evolution : millions of mutant polymerases are generated by randomly
scrambling part of the natural enzyme's chemical structure. In one case
the mutant runs out of steam and stops working after adding 5 artificial
bases to a growing chain.
A variant of HIV reverse transcriptase (RT) in which Tyr 188 is replaced by
Leu (Y188L) emerged from experiments in which HIV was challenged to grow
in the presence of drugs targeted against the RT, such as L-697639, TIBO
and nevirapine (drugs that bind at a site near, but not in, the active
site). Such variants were tested for their ability to synthesize duplex DNA incorporating the non-standard
base pair between 2,4-diaminopyrimidine (pyDAD), a pyrimidine presenting
a hydrogen bond ‘donor–acceptor–donor’ pattern to the complementary base,
and xanthine (puADA), a purine presenting a hydrogen bond ‘acceptor–donor–acceptor’
pattern. This base pair fits the Watson–Crick geometry, but is joined by
a pattern of hydrogen bond donor and acceptor groups different from those
joining the GC and AT pairs. A second mutation, E478Q, was introduced into
the Y188L variant, in case the residual nuclease activity observed
was due to the RT and not to a contaminant. The doubly mutated RT incorporated
the non-standard pair with sufficient fidelity that the variant could be
used to amplify oligonucleotides containing pyDAD and puADA through several
rounds of a polymerase chain reaction (PCR) without losing the non-standard
base pair. This is the first time that DNA containing non-standard base
pairs with alternative hydrogen bonding patterns has been amplified by
a full PCR: most constructed polymerases fail when researchers try to
make multiple copies of artificial DNA using PCR (after several rounds
of copying, imperfect polymerases start to weed out non-standard DNA) (ref)
Stretches of transiently existing Z-DNA, of which there are perhaps
100,000 in the human genome, may help to switch on genes by making them
more accessible to proteins, such as transcription factors, that stimulate
gene activity. The vaccinia virus
may specifically hijack vulnerable Z-DNA, thereby crippling human cells.
The number of replication origin sequences in the genome does
not change during the lifespan of an organism, but the number of active
origins does vary according to the developmental stage. In the early embryo,
the greater number of active origins may be supported by the higher density
of origin recognition proteins and by the less constrained genomic regulation.
In later stages, more concerned with differentiation than proliferation,
it is thought that epigenetic marking restrains most potential origins
and restricts the speed of DNA replication and cell division, with the
rate of replication fork movement remaining more or less constant : the
efficiency of some origins relies more on nucleotide availability and/or
fork progression rate than on specific cis-sequences. Some initiation sites
identified lie in intergenic regions and co-map with previously identified
A+T rich matrix attachment regions (MARs) (ref).
Telomeres shorten by 100 bp
at each replication cycle: once they are only 3,000-5,000 bp long,
no further replication cycle can occur. Telomeres can be artificially
extended by using nanocircles that bind to the consensus sequence.
Mice have much longer telomeres than humans and do not normally undergo
significant telomere shortening (which occurs in premalignant cells and
aging
tissues in humans) and crisis. However, engineered mice that do not possess
mTerc (the RNA subunit of telomerase) display more gross chromosomal aberrations
and have a shift in tumour spectrum to one that resembles that of aged
humans. Mutations in the telomerase
RNA component (TERC) cause autosomal
dominant dyskeratosis congenita.
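The shortening arithmetic at the start of this section (100 bp lost per replication cycle, arrest at 3,000-5,000 bp) can be turned into a rough count of remaining cycles. This is a sketch under a deliberately simple linear model; the function name is hypothetical and the starting telomere length is an input the text does not give.

```python
def divisions_until_arrest(initial_bp: int,
                           critical_bp: int = 5000,
                           loss_per_division_bp: int = 100) -> int:
    """Replication cycles before the telomere reaches critical_bp.

    Linear model: a fixed loss per cycle (100 bp in the text) and a critical
    length somewhere in the 3,000-5,000 bp range quoted above.
    """
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    if initial_bp <= critical_bp:
        return 0
    # Ceiling division without floating point.
    return -(-(initial_bp - critical_bp) // loss_per_division_bp)
```

For a hypothetical starting length of 10,000 bp, the two ends of the critical range give 50 and 70 cycles respectively.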
Silencing
It often occurs through DNA methylation.
in Eukarya : 5mC
3-5% of all C residues in animals. 70-80% of all C residues in
CpG boxes (on both strands!) are 5mC : the main exceptions are CpG
islands (a.k.a. HpaII tiny fragments (HTF)) contained
in promoter regions of housekeeping genes.
many more in plants, where 5mC is contained in a -CpNpGp-
box (where N stands for any nucleotide)
virtually no 5mC is detectable in DNA from some species (insects,
...)
in Bacteria :
< 1% of all A residues are N6mA
in Escherichia coli, N6mA in the 5'-GATC-3' box
is used in mismatch repair
<< 1% of all C residues are N4mC
Silencing mainly exists in order to :
ensure tissue- and stage-specific gene expression. Methylation at regulatory
regions, especially promoters, correlates inversely with transcriptional activity:
sequences near silent genes generally are methylated, whereas those near
active regions are not. Scientists traditionally have measured these modifications
on a gene-by-gene basis, but in December 2004 (ref)
the Human Epigenome
Project (HEP), run by the Human Epigenome
Consortium, released its first results regarding DNA methylation patterns
in the human MHC, the most gene-dense region of its size in the human genome,
in 7 human tissues: adipose, brain, breast, liver, lung, muscle, and prostate.
The MHC is also highly polymorphic, which means the researchers could expect
detectable differences in methylation between individuals. At about 4 megabases,
the MHC is the largest region to have its DNA methylation pattern mapped
to date. In the mammalian genome, methyl groups attach to DNA at CpG dinucleotides,
where a cytosine base is followed by a guanine. By examining methylation
patterns at CpGs, researchers can infer which regions of the genome are
active in a particular cell. The authors examined methylation patterns
in likely regulatory regions of MHC genes, as well as in CpG-dense regions
within each gene. They isolated 253 DNA fragments, representing 90 genes,
which is > 70% of all expressed genes in the MHC. To sequence these fragments
the researchers used a method called bisulfite sequencing. When DNA is
treated with sodium bisulfite, unmethylated cytosines are converted to
uracil, but methylated cytosines remain untouched. The DNA is then subjected
to PCR and sequenced. Comparing the result with the original sequence
shows where the changes have occurred, which gives information on every
single CpG site. Researchers traditionally have sequenced multiple subclones
of bisulfite PCR products. Instead, Rakyan and his colleagues sequenced
PCR products directly, using a program they developed called epigenetic
sequencing methylation (ESME) analysis software. The program calculates
methylation levels by comparing the C to T signal at CpG sites. Using ESME
to sequence DNA methylation directly is better suited to high-throughput work
than sequencing subclones. More than 90% of the fragments were either hypomethylated
or hypermethylated, which fits the view that the epigenetic state of a genome has to be tightly
regulated: a region is kept either methylated or unmethylated.
But 14 amplicons in the MHC produced heterogeneous data: in the same tissue
type, some of the amplicons were methylated while others were not. Many
researchers think this pattern could arise from aberrant methylation in
some cells, and might underlie the etiology of certain diseases, especially
cancers. Alternatively, this heterogeneity might be found if different
cell types with different methylation profiles are present in the same
tissue. Methylation profiles varied somewhat between tissues and individuals.
These differences mean that the epigenome really exists in hundreds of
different forms: even with the best high-throughput methods imaginable,
profiling every epigenome in a person would be a massive amount
of work. Recently, however, the authors developed an alternative method,
using matrix-assisted laser desorption/ionization (MALDI) mass spectrometry,
to identify shortcuts that could reveal regional methylation patterns and
make it easier to determine a person's epigenomic status (ref).
Some sites essentially give the same information as an entire region,
so if a particular site ... is methylated, then all the others will be
methylated as well. Identifying these "methylation-variable positions"
will allow fast, automated epigenotyping of biological samples. The next
phase of the project is determining methylation profiles for human chromosomes
6, 13, 20, and 22.
prevent transcription of selfish DNA (5mC often causes transition
to T and nonsense mutations)
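The bisulfite logic described above (unmethylated C is read as T after conversion and PCR, methylated C stays C) can be sketched as follows. This is a simplified single-strand illustration, not the ESME algorithm: all function names are hypothetical, and methylation_level is only the plain C / (C + T) ratio suggested by the phrase "comparing the C to T signal".

```python
from typing import Dict, Set


def bisulfite_convert(sequence: str, methylated: Set[int]) -> str:
    """Simulate bisulfite treatment plus PCR on one strand.

    Unmethylated C becomes U and is read as T after PCR; cytosines at the
    0-based positions listed in `methylated` are left untouched.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(sequence.upper())
    )


def cpg_methylation_calls(reference: str, converted: str) -> Dict[int, bool]:
    """Compare converted read with reference; True means the CpG C survived."""
    ref = reference.upper()
    if len(ref) != len(converted):
        raise ValueError("reference and converted read must be aligned")
    return {
        i: converted[i] == "C"
        for i in range(len(ref) - 1)
        if ref[i:i + 2] == "CG"
    }


def methylation_level(c_signal: float, t_signal: float) -> float:
    """Fraction methylated at one CpG from pooled C and T signal (simplified)."""
    return c_signal / (c_signal + t_signal)
```

On the hypothetical reference "ACGTCGCA" with only position 1 methylated, the converted read keeps the first CpG cytosine and turns every other C into T, so the two CpG sites are called methylated and unmethylated respectively.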
Main involved enzymes are :
DNA C methyltransferases
DNMT1
(mainly a maintenance DNA C methyltransferase associated with replication
forks; only partially a de novo methylase : it has sex-specific promoters
and variable 5' exons). Complexes of Dnmt1 with transcriptional repressors,
DMAP1 and pRB, have been described, providing a direct link to transcriptional
regulation and tumor suppression (ref)
DNMT2
(down-regulated in cancers where peri-centromeric satellites are hypomethylated)
DNMT3A
(de novo DNA C methyltransferases) plays a role in methylating CpG
poor regions or repetitive DNA elements outside of the S phase of the cell
cycle
DNMT3L
is expressed in testes during a brief perinatal period in the non-dividing
precursors of spermatogonial stem cells and might have a function in the
de
novo methylation of dispersed repeated sequences in a premeiotic genome
scanning process that occurs in male germ cells at about the time of birth (ref).
dim-5, a gene that encodes a histone H3 Lys9 methyltransferase,
is required for DNA methylation in Neurospora crassa (ref).
The histone enzyme is, in turn, influenced by modifications of histone
H3. So even though DNA methylation is carried out by a DNA methyltransferase
encoded by dim-2, it still takes orders from the chromatin. Deletion
of genes that encode RNAi molecular machinery causes a loss of histone
H3 Lys9 methylation and impaired centromere function in Schizosaccharomyces
pombe (ref)
DNA C demethylase : it has not been
identified yet. While demethylation in dividing cells might be due just
to a decrease in DNA C methyltransferase activity, demethylation after
fertilization (i.e. in a non-dividing cell) requires the existence of a DNA
C demethylase.
methyl-CpG binding
domain (MBD) protein complexes include histone deacetylases (HDACs).
This recruitment of HDACs is suggested to promote local chromatin condensation
and thereby repress gene expression (ref)
MECP2
: it binds 5mC and recruits HDAC1
and SIN3A
: histone deacetylation promotes chromatin condensation, while SIN3A
is a transcriptional repressor. Mutations cause Rett
syndrome
X inactivation : flies, worms, and
mammals have evolved dosage compensation strategies to equalize
the levels of X-linked gene expression between males (XY) and females (XX)
by transcriptionally silencing one X chromosome in XX embryos
in worms, this is related to an ancient complex of proteins called the
13S
condensin complex that is involved in chromosome resolution and compaction
during both mitosis and meiosis, so the worm apparently
evolved dosage compensation by co-opting components that
previously served other roles and recruiting them to the new role of regulating gene
expression : certain proteins thus have dual roles, and their function is determined
by which complex they bind to: the mitotic/meiotic complex for chromosome
resolution and segregation or the dosage compensation complex (DCC)
for gene regulation. There are 3 classes of X chromosomal DNA : bits of
DNA that robustly recruit the DCC, bits that recruit the DCC but less robustly,
and, curiously, bits of the X chromosome that seem to have no recruitment
ability (despite containing known dosage-compensated genes) (ref).
Given that there are dual-function proteins, they must be properly loaded onto
the X chromosome only in hermaphrodites
MSL complexes bind the single male X chromosome in Drosophila
to increase transcription approximately 2-fold. Complexes contain at least
5 proteins and 2 noncoding RNAs, roX1 and roX2, which bind
to the X chromosome (not by simple RNA-DNA complementarity) (ref)
in the prevailing view, the XX zygote inherits 2 active X chromosomes,
one each from the mother and father, and X inactivation does not occur
until after implantation.
a chromatin-based counting mechanism restricts X inactivation to cells
with more than one X chromosome : genes on the 2 active X chromosomes in
undifferentiated, XX female embryonic stem cells (ES cells) are marked
by hyperacetylation of all core histones, hyper(di)methylation of H3 Lys4
(H3K4) and hypo(di)methylation of H3 Lys9 (H3K9), compared
with autosomal genes or genes on the single active X in XY male cells.
The mark is found on both coding and promoter regions. On differentiation,
and after the onset of X inactivation, the mark is reversed on the inactive
X, whose genes show extreme hypoacetylation of all 4 core histones,
hypo(di)methylation of H3K4 and hyper(di)methylation of H3K9. The mark
is retained on the active X in female ES cells for at least several days
of differentiation, but is not present in adult females. The selective
marking of X-linked genes in female ES cells distinguishes them from the
equivalent genes in male cells : the mark forms part of a chromatin-based mechanism
that restricts X-inactivation to cells with > 1 X chromosome. Parental
imprinting and/or counting mechanisms ensure that X
(inactive)-specific transcript (XIST) is expressed only on the inactive
X chromosome : differential de novo methylation at the Xist
promoter, which is mediated by Dnmt3a
and/or Dnmt3b,
is a consequence of monoallelic expression of Xist and a mechanism(s) other
than DNA methylation plays a principal role in initiating X-inactivation (ref).
Although maintenance-type DNA methylation is not essential for X-inactivation
to occur, it is required for the stable repression of Xist in differentiated
cells. Chromosome silencing then results from the accumulation of the non-coding
Xist RNA silencing signal, in cis, over the entire length of the
X chromosome. Following differentiation and X inactivation, the active
X remains hyperacetylated for several days, while the epigenetic marking
on the inactivated X is lost. Similar results are observed for DNA methylation.
CCCTC-binding
factor (CTCF), an 11-zinc-finger factor expressed primarily in the
nucleus of somatic cells and involved in gene regulation : it utilizes
different zinc fingers to bind varying DNA target sites and form methylation-sensitive
insulators that regulate X-chromosome inactivation. The presence of monoubiquitylated
histone
H2A (uH2A) on the inactive X temporally correlates with the recruitment
of Polycomb group (PcG) proteins belonging to Polycomb repressor complex
1 (PRC1), known to be involved in gene silencing. PRC1
Ring1B
protein is involved in genome-wide H2A ubiquitylation : Ring1B and its
closely related homolog Ring1A
are both implicated in uH2A enrichment on the inactive X (ref1,
ref2).
... but in mice there is evidence to the contrary : paternal X chromosome
is already silent at zygotic gene activation (2-cell stage) and exhibits
a gradient of silencing (i.e. genes close to the X-inactivation centre
show the greatest degree of inactivation, whereas more distal genes show
variable inactivation and can partially escape silencing). After implantation,
imprinted silencing in extraembryonic tissues becomes globalized and more
complete on a gene-by-gene basis (ref).
Anyway 15% of the genes on the inactive X chromosome are active in all
women : another 10% of genes from the inactive X are switched on in just
some women. Because the genes expressed from the inactive X are also expressed
from a woman's active X, women get a higher dose of these genes than men.
So these genes may underlie traits that differ between the sexes, and clinical
abnormalities in patients with abnormal X chromosomes (ref).
parental imprinting has an essential
role in normal embryonal mammalian development.
demethylation occurs immediately after egg fertilization, apart
from imprinted genes, which are protected : DNA demethylation
seems to precede gene reprogramming, and is absolutely necessary for oct4
transcription. Reprogramming by oocytes occurs in the absence of DNA replication
and RNA/protein synthesis. It is also selective, operating only on the
promoter, but not enhancers, of oct4; both a putative Sp1/Sp3 and
a GGGAGGG binding site are required for demethylation and transcription (ref).
sex-influenced tissue-specific remethylation at blastocyst stage (E5) (ref)
: the latter is different according to intraembryonic or extraembryonic
cell location
primordial germinal cells (PGCs) (but not somatic cells) undergo a new
demethylation while migrating along the genital ridge at E12-E13
in Mus
remethylation takes place during germinal cell maturation (gametogenesis).
Many imprinted genes have been shown to contain a differentially methylated
region (DMR) : the Polycomb group (PcG) protein embryonic
ectoderm development (EED), a histone deacetylase, is implicated in
epigenetic regulation of autosomal imprinted loci and imprinted X-chromosome
inactivation in extraembryonic cells but not of random X inactivation in
embryonic cells. CTCF-like
(CTCFL) is a CTCF paralog normally expressed primarily in the cytoplasm
of spermatocytes in a mutually exclusive pattern that correlates with resetting
of methylation marks during male germ cell differentiation (chromatin insulator) (ref).
Chromatin insulators demarcate expression domains by blocking the cis
effects of enhancers or silencers in a position-dependent manner. CTCF
carries a post-translational poly(ADP-ribosyl)ation,
which imparts chromatin insulator properties to CTCF at both imprinted
and nonimprinted loci. The poly(ADP-ribosyl)ation mark, which exclusively
segregates with the maternal allele of the insulator domain in the H19
imprinting control region, requires the bases that are essential for interaction
with CTCF3ref.
Epigenetic modifications in an imprinting cluster are controlled by a hierarchy
of DMRs suggesting long-range chromatin interactions rather than linear
spreading of DNA methylation (ref)
: DMRs in Igf2 and H19
contain chromatin boundaries, silencers and activators and regulate the
reciprocal expression of the 2 genes in a methylation-sensitive manner
by giving each exclusive access to a shared set of enhancers while
the intervening DNA loops out (ref).
> 20 genes are imprinted in the Homo sapiens genome : they all have
positive effects on fetal growth (especially neuronal growth)
maternally imprinted (paternally expressed) genes
1p31 : ARHI
: its expression is associated with growth suppression : tumor suppressor
gene whose function is abrogated in ovarian and breast cancers.
20q11.2-q12 : neuronatin
(NNAT) : found within an intron of the BLCAP
gene, but on the opposite strand. This gene is imprinted and is expressed
only from the paternal allele, while BLCAP is not imprinted.
20q13.2-q13.3 : Gαs
locus
: each of the upstream exons is within a differentially methylated region,
commonly found in imprinted genes. However, the close proximity (14 kb)
of two oppositely expressed promoter regions is unusual. In addition, one
of the alternate 5' exons introduces a frameshift relative to the other
transcripts, resulting in one isoform which is structurally unrelated to
the others. An antisense transcript exists, and may regulate imprinting
in this region.
Gnasxl encodes the unusual Gsα
isoform XLαs. Mice with mutations
in Gnasxl have poor postnatal growth and survival and a spectrum
of phenotypic effects that indicate that XLαs
controls a number of key postnatal physiological adaptations, including
suckling, blood glucose and energy homeostasis (ref)
Both Dnmt3a and Dnmt3L, but not Dnmt3b,
are required for methylation of most imprinted loci in paternal and maternal
germ cells, and additional factors are involved (ref).
Imprinting prevents parthenogenesis
in mammals and is often disrupted in congenital
malformation syndromes,
tumours
and cloned animals.
Journals : Nucleic
Acids Research [free at PubMedCentral : archive starts with Vol.
28(1); 2000]
Web resources : The
DNA double helix at Biology Learning Center, University of Arizona