
Core promoter gene transcriptions

Editor-In-Chief: Henry A. Hoff

The diagram shows an overview of the four core promoter elements, the B recognition element (BRE), TATA box, initiator element (Inr), and downstream promoter element (DPE), with their respective consensus sequences and their distances from the transcription start site.[1] Credit: Jennifer E.F. Butler & James T. Kadonaga.

A core promoter is that portion of the proximal promoter that contains the transcription start sites.

Biochemical definition: the minimal stretch of DNA sequence that is sufficient to direct accurate initiation of transcription. A core promoter is typically 60 to 120 base pairs long.

Genomics definition: short sequences surrounding the transcription start sites (TSSs).

It contains a binding site for RNA polymerase (RNA polymerase I, RNA polymerase II, or RNA polymerase III) holoenzymes.

A vast network of regulatory factors that contribute to the initiation of transcription by RNA polymerase ultimately target any specific gene’s core promoter.

The core promoter includes the transcription start site(s) (TSS).

That portion of the core promoter that is upstream of the TSS is also part of the proximal promoter.

The core promoter begins approximately 34 bp upstream of the TSS (position -34). “Several factors have been identified that bind to core promoters (reviewed in Smale, 1997)”.[2][3]

Genetics


This is an image of Bob, the guinea pig. Credit: selbst.

Genetics involves the expression, transmission, and variation of inherited characteristics.

Gene transcriptions


DNA is a double helix of interlinked nucleotides surrounded by an epigenome. On the basis of biochemical signals, an enzyme, specifically a ribonucleic acid (RNA) polymerase, is chemically bonded to one of the strands (the template strand) of this double helix. The polymerase, once phosphorylated, begins to catalyze the formation of RNA using the template strand. Although the catalysis may have more than one beginning nucleotide (a start site) and more than one ending nucleotide (a stop site) along the DNA, each nucleotide sequence catalyzed that ultimately produces approximately the same RNA is part of a gene. The catalysis of each RNA representation from the template DNA is a transcription, specifically a gene transcription. The overall process is also referred to as gene transcription.

Promoters


Def. a “section of DNA that controls the initiation of RNA transcription as a product of a gene”[4] is called a promoter.

Proximal promoters


Def. a section of promoter DNA that includes the transcription start sites and the region neighboring them is called a proximal promoter.

Cores


Def. a central or most important part of something is called a core.

Theoretical core promoters


Def. “the factors, including RNA polymerase II itself, that are minimally essential for transcription in vitro from an isolated core promoter” is called the basal machinery, or basal transcription machinery.[5]

Def. one or more sequence motifs containing the transcription start sites (TSSs), juxtaposed to the motif containing the TSSs, or in the proximal promoter that are only found in this core of motifs is called a core promoter.

Metal responsive elements


A metal responsive element (MRE), or TGC box, may occur in the core promoter of some human DNA genes.

“The metallothionein (MT) genes provide a good example of eucaryotic promoter architecture. MT genes specify the synthesis of low-molecular-weight metal-binding proteins. They are transcriptionally regulated by the metal ions cadmium and zinc (11), glucocorticoid hormones (18), interferon (14), interleukin-1 (22), and tumor promoters (2). The metal ion regulation of MTs is conferred by a short sequence element called the metal-responsive element (MRE [21]) or TGC box (31, 34), which functions as a metal ion-dependent enhancer.”[6]

GC boxes


Def. a “sequence of contiguous guanine, guanine, guanine, cytosine, and guanine, in that order, along a DNA strand”[7] is called a GC box.

“[A] GC box is a distinct pattern of nucleotides found in the promoter region of some eukaryotic genes upstream of the TATA box and approximately 110 bases upstream from the transcription initiation site. It has a consensus sequence GGGCGG which is position dependent and orientation independent. The GC elements are bound by transcription factors and have similar functions to enhancers.”[8][9]

“A large subclass of polymerase II promoters lacks both TATAA and CCAAT sequence motifs but contains multiple GC boxes. This promoter class includes several housekeeping genes (e.g., the genes encoding dihydrofolate reductase [DHFR] …, hydroxymethylglutaryl coenzyme A reductase [39], hypoxanthine guanine phosphoribosyltransferase [33], and adenosine deaminase [46]) [and] nonhousekeeping genes (e.g., the transforming growth factor alpha [9, 23], rat malic enzyme [36], human c-Ha-ras [21], epidermal growth factor receptor [22], and nerve growth factor receptor [42] genes).”[10]

“[A] GC box-binding factor is required for transcription and … a truncated promoter containing one GC box is transcriptionally inactive (44). … the DNA-protein interactions occurring at the GC boxes in the DHFR promoter are functionally distinct and that factors binding to the GC boxes must interact in a position-dependent manner.”[10]

“In promoters containing multiple GC boxes but lacking the TATAA box, transcription start sites may be single and specific, as observed in the nerve growth factor receptor gene (42) and the cellular retinol-binding protein gene (37), or there may be multiple heterogeneous start sites, such as those found in the c-myb (4), insulin receptor (45), and Ha-ras (21) genes. … GC boxes are responsible for directing transcription from the major and the minor start sites. … All TATAA-less promoters have at least two GC boxes”[10].

“A GC box sequence, one of the most common regulatory DNA elements of eukaryotic genes, is recognized by the Sp1 transcription factor; its consensus sequence is represented as 5′-G/T G/A GGCG G/T G/A G/A C/T-3′ [or 5′-KRGGCGKRRY-3′] (Briggs et al., 1986).”[11]

HY boxes


A core responsive element is the hypertrophy region HY box between -89 and -60 nucleotides (nts) upstream from the transcription start site.[12]

CAAT boxes


“[A] CCAAT box (also sometimes abbreviated a CAAT box or CAT box) is a distinct pattern of nucleotides with GGCCAATCT consensus sequence that occur upstream by 75-80 bases to the initial transcription site. The CAAT box signals the binding site for the RNA transcription factor, and is typically accompanied by a conserved consensus sequence. It is an invariant DNA sequence at about minus 70 base pairs from the origin of transcription in many eukaryotic promoters. Genes that have this element seem to require it for the gene to be transcribed in sufficient quantities. It is frequently absent from genes that encode proteins used in virtually all cells. This box along with the GC box is known for binding general transcription factors. CAAT and GC are primarily located in the region from 100-150bp upstream from the TATA box. Both of these consensus sequences belong to the regulatory promoter. Full gene expression occurs when transcription activator proteins bind to each module within the regulatory promoter. Protein specific binding is required for the CCAAT box activation. These proteins are known as CCAAT box binding proteins/CCAAT box binding factors. A CCAAT box is a feature frequently found before eukaryote coding regions”.[13]

B recognition elements


“The B recognition element (BRE) is a DNA sequence found in the promoter region of most genes in eukaryotes and Archaea.[14][15] The BRE is a cis-regulatory element that is found immediately upstream of the TATA box, and consists of 7 nucleotides.”[16]

“The Transcription Factor IIB (TFIIB) recognizes this sequence in the DNA, and binds to it. The fourth and fifth alpha helices of TFIIB intercalate with the major groove of the DNA at the BRE. TFIIB is one part of the preinitiation complex that helps RNA Polymerase II bind to the DNA.”[16]

The consensus sequence is 5′-G/C G/C G/A C G C C-3′.[17]

The general consensus sequence using degenerate nucleotides is 5′-SSRCGCC-3′, where S = G or C and R = A or G.[18]
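A degenerate consensus such as SSRCGCC can be searched for by expanding each degenerate code into a character class. The following is a minimal Python sketch and is not taken from the cited sources; the names `IUPAC`, `consensus_to_regex`, and `find_motif`, as well as the example sequence, are illustrative assumptions.

```python
import re

# Standard IUPAC nucleotide codes, including those used in this article.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def consensus_to_regex(consensus):
    """Expand a degenerate consensus into a regular expression string."""
    return "".join(
        IUPAC[c] if len(IUPAC[c]) == 1 else "[" + IUPAC[c] + "]"
        for c in consensus.upper()
    )

def find_motif(consensus, sequence):
    """Return the 0-based start index of every match, overlaps included."""
    # A lookahead group lets overlapping matches be reported.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Hypothetical sequence containing one BRE-like heptamer (GGACGCC).
bre_hits = find_motif("SSRCGCC", "TTGGACGCCTATAAAA")
```

Here `consensus_to_regex("SSRCGCC")` yields `[CG][CG][AG]CGCC`, and `bre_hits` holds the single start index 2. The sketch reads the sequence exactly as written, so strand and direction conventions remain the caller's responsibility.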

“The position in nucleotides (nt) relative to the transcription start site (TSS, +1)” is -35 for the BRE. Of human promoters, some “22-25% [are] BRE containing promoters … the functional consensus sequences for BRE … motif [is] still poorly defined.”[18]

EIF4E basal elements


The EIF4E basal element (4EBE), also written eIF4E basal element, is a basal promoter element for the eukaryotic translation initiation factor 4E. “Interactions between 4EBE and upstream activator sites are position, distance, and sequence dependent.”[19]

TATA boxes


Def. a “DNA sequence (cis-regulatory element) found in the promoter region of genes in archaea and eukaryotes”[20] is called a TATA box.

The TATA box can be an AT-rich sequence “located at a fixed distance upstream of the transcription start site”[5].

TBP-like factors


Notation: let the symbol TLF designate a TATA binding protein-like factor.

The human gene TBPL1 (TBP-like 1, also TLF and TRF2[5]), GeneID: 9519, encodes a protein that “does not bind to the TATA box and initiates transcription from TATA-less promoters.”[21]

Downstream TFIIB recognition


The downstream TFIIB recognition element (dBRE) has a consensus sequence in the transcription direction on the template strand of 3′-RTDKKKK-5′, using degenerate nucleotides, or 3′-A/G-T-A/G/T-G/T-G/T-G/T-G/T-5′.[22]
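The degenerate form RTDKKKK stands for a finite set of concrete heptamers. As a hedged illustration only, the Python sketch below enumerates them with `itertools.product`; the names `CODES` and `expand` are hypothetical, and the letters are taken in the order the text writes them, without any strand conversion.

```python
from itertools import product

# Only the codes needed for RTDKKKK: R = A/G, D = A/G/T, K = G/T.
CODES = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "D": "AGT", "K": "GT"}

def expand(consensus):
    """List every concrete sequence covered by a degenerate consensus."""
    return ["".join(p) for p in product(*(CODES[c] for c in consensus))]

dbre_variants = expand("RTDKKKK")
```

The list has 2 × 1 × 3 × 2 × 2 × 2 × 2 = 96 entries, which gives a sense of how permissive this seven-letter consensus is.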

dBRE is cis-TATA box, between the TATA box and the Inr or transcription start site (TSS) and trans-TSS.[22]

Initiator elements


For RNA polymerase II holoenzyme to transcribe a gene, the gene’s promoter must be located. After the promoter is located, the transcription start site (TSS) is pinpointed by using nucleotide sequences that include the TSS or perhaps allow distance measurement to the TSS. Within the promoter, most human genes lack a TATA box and have an initiator element (Inr) or downstream promoter element instead.

“RNA pol II itself recognizes features of the Inr which might assist the correct positioning of the polymerase on the promoter (Carcamo et al., 1991; Weis and Reinberg, 1997).”[2][23][24]

Transcription start sites


The transcription start site (TSS) is the location on the DNA template strand where transcription begins at the 3′-end of a gene.[25] This location corresponds to the 5′-end of the mRNA which by convention is used to designate DNA locations.[25] For example, the 5′-TATA-box-3′ designation refers to the directionality of the mRNA and corresponds to the 3′-TATA-box-5′ designation for nucleotides on the template strand.[25] The template strand is the DNA strand being transcribed by RNA polymerase.[25]

Downstream core elements


“[N]onredundant human promoter sequences 600 bp long (−499 to +100 bp around the TSS) [are available] from [the] Eukaryotic Promoter Database (EPD) release 75 (4, 68) (http://www.epd.isb-sib.ch/), and … promoters sequences 1,200 bp long (−1,000 to +200 bp) [are available] from the Database of Transcriptional Start Sites (DBTSS) (59, 74, 75) (http://dbtss.hgc.jp/index.html)”[26].

The downstream core element (DCE) is a transcription core promoter sequence that is within the transcribed portion of a gene.

The consensus sequence for the DCE is CTTC…CTGT…AGC.[26] These three consensus elements are referred to as subelements: “SI is CTTC, SII is CTGT, and SIII is AGC.”[26]

The number of nucleotides between each subelement can apparently vary down to none.

A core promoter that contains all three subelements may be much less common than one containing only one or two.[26] “SI resides approximately from +6 to +11, SII from +16 to +21, and SIII from +30 to +34.”[26]

SI as 3′-CTTC-5′ can occur as 3 of 4 (CTT, TTC) or 4 of 4 (CTTC). SII as 3′-CTGT-5′ can also occur as 3 of 4 (CTG, TGT) or 4 of 4 (CTGT). SIII as AGC is not known to vary.
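The quoted subelement windows can be checked mechanically once the position of the TSS in a sequence is known. The Python sketch below is an illustration under stated assumptions, not a published method: the names `DCE_WINDOWS` and `dce_subelements` are hypothetical, the windows are treated as strict even though the source says "approximately", only full 4 of 4 (or 3 of 3) matches are counted, and the sequence is read in the same order in which the text writes CTTC, CTGT, and AGC.

```python
# Windows (TSS-relative, inclusive) quoted in the text for each subelement.
DCE_WINDOWS = {
    "SI": ("CTTC", 6, 11),
    "SII": ("CTGT", 16, 21),
    "SIII": ("AGC", 30, 34),
}

def dce_subelements(sequence, tss_index):
    """Report which DCE subelements lie fully inside their windows.

    tss_index is the 0-based index of the +1 nucleotide in `sequence`.
    """
    found = {}
    for name, (motif, start, end) in DCE_WINDOWS.items():
        # Position +k downstream of the TSS sits at index tss_index + k - 1.
        window = sequence[tss_index + start - 1 : tss_index + end].upper()
        found[name] = motif in window
    return found

# Hypothetical 40-nt sequence starting at +1 with all three subelements.
example = ("A" * 5 + "ACTTCA" + "A" * 4 + "ACTGTA"
           + "A" * 8 + "AAGCA" + "A" * 6)
result = dce_subelements(example, 0)
```

For this constructed example every subelement is reported as present. A real scan would probably relax the windows and accept the 3 of 4 variants mentioned above.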

DCE SIII can function independently of SI and SII.[26]

Transcription factor II D (TFIID), a transcription factor that is part of the RNA polymerase II holoenzyme, interacts with promoters containing only SIII of the DCE, suggesting a critical spacing parameter between SIII and the TATA box, initiator element, or some combination of the two.[26] TFIID probably serves as a core promoter recognition complex.[26]

TAF1 interacts with the DCE in a sequence-dependent manner.[26]

The differences between core promoters with downstream elements may be explained by

  1. “TATA- and DPE-dependent promoters are specific for particular enhancers”[26],
  2. “preferences of activators for specific core promoter architectures”[26], and
  3. “the presence of a DCE or [downstream core promoter element (DPE)] might be indicative of an architecture designed for specific regulatory networks, such as the regulation of housekeeping promoters versus tissue-specific promoters (or other highly regulated promoters) or the regulation of subsets of viral promoters.”[26]

Motif ten elements

The motif ten element (MTE) is a downstream core promoter element that “promotes transcription by RNA polymerase II when it is located precisely at positions +18 to +27 relative to A+1 in the initiator (Inr) element.”[27]

The motif 10 consensus sequence is CSARCSSAACGS [5′-C-C/G-A-A/G-C-C/G-C/G-A-A-C-G-C/G-3′].[27] By convention, the consensus sequence 5′-C-C/G-A-A/G-C-C/G-C/G-A-A-C-G-C/G-3′ is stated as it would be transcribed into mRNA. In the direction of transcription on the template strand this consensus sequence becomes 3′-C-C/G-A-A/G-C-C/G-C/G-A-A-C-G-C/G-5′.

Downstream promoter elements


“The downstream promoter element (DPE) is a core promoter element … present in other species including humans and excluding Saccharomyces cerevisiae.[28] Like all core promoters, the DPE plays an important role in the initiation of gene transcription by RNA polymerase II.”[29]

The core sequence of the DPE is located precisely +28 to +32 nts relative to the A+1 nt in the Inr.[17]

Super core promoters


A super core promoter (SCP) contains a TATA box, Inr, motif ten element (MTE), and DPE in a single promoter.[30] The SCP is the strongest core promoter observed in vitro and yields high levels of transcription in conjunction with transcriptional enhancers.[28]

Promoters can be classified based on the motifs found in the core promoter. These include the TFIIB recognition element (BREu), typically from -37 bp to -32 bp upstream of the TSS; the TATA box, -31 bp to -26 bp upstream of the TSS; the Inr, -2 bp to +4 bp; the Motif Ten Element (MTE), +18 bp to +27 bp downstream of the TSS; and the Downstream Promoter Element (DPE), +28 bp to +32 bp.[31] Of these, DPE and BREu are the most common, present in 25% of human core promoters, whereas the TATA box is present in 13%.[31] The Downstream Core Element (DCE) (N5-7[CTTC]N7-8[CTGT]N7-11[AGC]N1-2), at +10 bp to +40 bp, can be present in promoters containing a TATA box and/or Inr and presumably does not occur with a DPE or MTE.[31]
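TSS-relative coordinates have no position 0: position -1 lies immediately upstream of +1. The following Python sketch shows one way to turn the ranges listed above into slices of a sequence. It is an illustration only: the names `ELEMENT_WINDOWS`, `to_index`, and `element_window` are hypothetical, and treating each listed range as a fixed, inclusive window is an assumption.

```python
# Element ranges (TSS-relative, inclusive) as listed in the text.
ELEMENT_WINDOWS = {
    "BREu": (-37, -32),
    "TATA": (-31, -26),
    "Inr": (-2, 4),
    "MTE": (18, 27),
    "DPE": (28, 32),
}

def to_index(position, tss_index):
    """Map a TSS-relative position to a 0-based index.

    There is no position 0: -1 is immediately upstream of +1 (the TSS).
    """
    if position == 0:
        raise ValueError("TSS-relative coordinates have no position 0")
    if position < 0:
        return tss_index + position
    return tss_index + position - 1

def element_window(sequence, tss_index, name):
    """Return the nucleotides occupying the named element's range."""
    start, end = ELEMENT_WINDOWS[name]
    lo = to_index(start, tss_index)
    hi = to_index(end, tss_index)
    if lo < 0 or hi >= len(sequence):
        raise ValueError("window falls outside the sequence")
    return sequence[lo : hi + 1]

# Hypothetical sequence: 40 nt upstream, TSS (+1) at index 40.
seq = "G" * 9 + "TATAAA" + "G" * 25 + "ACGTACGT" * 5
tata = element_window(seq, 40, "TATA")
inr = element_window(seq, 40, "Inr")
```

Because position 0 is skipped, the Inr range -2 to +4 covers six nucleotides rather than seven, which is why `inr` in this example has length 6.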

The BRE is specifically recognized by TFIIB, but all other core promoter elements are TFIID-interaction sites: TAF6 and TAF9 contact the DPE, TAF1 and TAF2 contact the Inr, and TAF1 contacts the DCE.[32][33]

Hypotheses


  1. Each portion of a DNA that becomes active has a core promoter.
  2. The core promoter is the “minimal portion of the promoter required to properly initiate transcription”.[34]

Comparisons of negative direction promoter elements

Butler (2002) | Watson (2014) | Juven-Gershon (2008) | Butler (2002)
~-37 to -32 BREu SSRCGCC | ~-31 to -26 TATA box TATAWAW | -2 to +4 Inr YYRNWYY | +28 to +32 DPE RGWYV
UTR nn(4560-2846) UTR nn(4560-2846) UTR nn(4560-2846) UTR nn(4560-2846)
TTACTCC at 4557
GGACC at 4546
ciAGTGTAA at 4533
ciTGTCT at 4518
GGACC at 4494
AGTCG at 4489
GGTCG at 4480
ciGGTCT at 4448
AGTCC at 4436
AGATG at 4430
GGTCA at 4415
TCACACT at 4361
GTCACA at 4359 Ngoc (2017)
CCCACT at 4353 Ngoc (2017) GGACA at 4369
TCGGACC at 4349
GTCACT at 4319 Ngoc (2017) GGACC at 4349
GGTCG at 4345
CCAGTTT at 4309
TCCAGT at 4307 Ngoc (2017)
TCGGACC at 4300 GGACC at 4300
ciGAACT at 4294
ciGGTCCGA at 4255 ciTGTCC at 4282
ciCGACT at 4276
CTGCACC at 4238 ciGAACC at 4268
TCGGTCT at 4233 GGTCG at 4261
GGTCC at 4253
TCACTCT at 4202 ciGGTCT at 4233
GTCACT at 4200 Ngoc (2017)
TCGAACC at 4188 AGATG at 4212
GGACA at 4208
ciGAACC at 4188
AGTTC at 4178
CCGGTCC at 4170 GGTCC at 4170
ciAGTACGG at 4118 ciCGACT at 4145
CCGTACC at 4107 AGTCC at 4138
CCGGTCC at 4102 GGTCG at 4130
TTACACT at 4092 GGACA at 4121
GGTCC at 4102
ciCAACC at 4097
TCACTCT at 4051 ciTATCT at 4079
TTGTATC at 4046
TCGGACC at 4037 AGATG at 4062
TCGGACC at 4037 AGATG at 4062
GAGACT at 4053 Matsumoto (2020)
GGACC at 4037
GGTCG at 4033
AGTTC at 4027
GGTTC at 4019
ciGAACT at 4012
ciAGTGTGG at 3967 ciCGACT at 3994
CCGGTCC at 3951 GGTTG at 3979
TTCACA at 3939 Ngoc (2017) GGACA at 3970
CTACTTT at 3922 GGTCC at 3951
CTACTTT at 3922 ciCAACC at 3946
ciCAACC at 3942
ciGGACT at 3932
TCATTCT at 3893 ciTGTCT at 3917
TCATTC at 3892 (Butler 2002) ciTGTCT at 3917
CTCATT at 3891 Ngoc (2017) ciTGTCT at 3917
GGACC at 3906
ciGGTCCGG at 3873 GGTCC at 3885
CTGGTCC at 3871 GGTCC at 3871
ciGGTATGG at 3858 ciCGACC at 3864
CTCATA at 3829 Ngoc (2017) GGACG at 3861
TCCACT at 3825 Ngoc (2017) AGTTC at 3844
CTACACC at 3810 AGACC at 3835
ciCATCT at 3820
ciCAACT at 3805
ciGAACC at 3793
CTGTTCT at 3759 ciGGACT at 3781
ciTGACC at 3749
GGACC at 3744
GGTCGTG at 3733 Kadonaga (2002)
GGTCG at 3731
ciCGACC at 3719
AGACG at 3706
GGTCG at 3701
GGTCG at 3682
ciTGTCT at 3672
ciCGACT at 3649
AGTCTC at 3645 Matsumoto (2020)
ciCAACC at 3606
ciCGTCT at 3589
GGTCC at 3585
GGACG at 3579
ciGAACT at 3571
GGTCC at 3564
AGACA at 3556
ciTGACT at 3542
ciCAACT at 3533
ciTAACC at 3529
ciGGTCTAG at 3488 AGTTG at 3523
TTGGTCT at 3486 ciCAACT at 3505
TCATTT at 3481 (Butler 2002) ciCAACT at 3505
GTCATT at 3480 Ngoc (2017) ciCAACT at 3505
TTGATCT at 3463 ciGGTCT at 3486
CCGTATC at 3446 ciGATCT at 3463
TTCACT at 3410 Ngoc (2017) ciTATCC at 3447
CCGAACT at 3401 AGTTG at 3431
ciAGTCCGA at 3398 ciTATCT at 3422
TCGTTCT at 3374 ciGAACT at 3401
TTGTTCT at 3340 AGTCC at 3396
TCGTTTT at 3313 ciTAACT at 3358
TTGTTCT at 3307 AGACA at 3319
TCGGACC at 3298 GGACC at 3298
TCGGTTC at 3273 GGACC at 3298
ciAGTGCGG at 3281 GGTCG at 3294
TCGGTTC at 3273 GGTTC at 3273
ciCATCT at 3256
GGTCC at 3249
ciCiGAACT at 3242
ciCGACT at 3224
AGTCC at 3217
CCACACC at 3186 GGTCG at 3209
CCCACA at 3184 Ngoc (2017) AGTCG at 3204
GGACA at 3200
TTGTATT at 3169 ciCGACC at 3180
CCACTTT at 3146
TCCACT at 3144 Ngoc (2017)
TTGTTCC at 3141
ciGGACCGG at 3130
TCGGACC at 3128 AGATG at 3158
GGTTG at 3137
TCGGACC at 3128 GGACC at 3128
GGTCG at 3124
ciCAACC at 3116
AGTCC at 3110
ciGAACT at 3103
ciCGACT at 3085
GGTCGTG at 3072 Kadonaga (2002)
CCGCACC at 3047 CCGCACC at 3047 GGTCG at 3070
ciGATTCGA at 3033 GGACA at 3061
TTGATTC at 3031 GGACA at 3061
ciCGACC at 3041
CCGATTT at 3009 ciCGACC at 3035
GGATA at 2996
AGATG at 2988
TTGATTC at 2914 ciCiGAACC at 2921
ciAAAGTAG at 2887 ciCiGAACC at 2921
TATATAT at 2872 TTCACA at 2860 Ngoc (2017) ciTATCT at 2903
ciTTATATA at 2871 ciTGTCT at 2878
ciTTTTATA at 2869 ciTGTCT at 2878
TATAAAA at 2853 ciTGTCT at 2878
GGTTA at 2848
UTR pn(4560-2846) UTR pn(4560-2846) UTR pn(4560-2846) UTR pn(4560-2846)
ciGGAATGA at 4555
TTAATTC at 4542
TCACATT at 4533
ciAGTCCAA at 4502
AGACA at 4507
CCACTTT at 4461 AGTCC at 4500
ciGATCC at 4476
AGATC at 4475
GGACA at 4468
CCACTCC at 4425 ciCATCC at 4456
ciGAACC at 4451
CCAGTTC at 4417 AGTTC at 4417
ciAGTGTGA at 4361 AGTTC at 4417
CTGCACT at 4340 ciTGTCT at 4371
AGACC at 4365
CCGGACT at 4327 ciGGACT at 4327
GGTCA at 4307
ciAAAATAA at 4221 GGATC at 4288
AGACG at 4235
ciTGTCT at 4210
ciAGTTCAA at 4177 AGACC at 4204
AGACA at 4181
AGTTC at 4175
GGATC at 4157
ciAATGTGA at 4092 AGTCC at 4126
ciAAAATAA at 4071 AGTTG at 4096
ciAGACCAG at 4032 ciCATCT at 4058
ciAGTTCAA at 4026 ciGAGACT at 4053 Matsumoto (2020)
AGACC at 4030
AGTTC at 4024
TCACACC at 3967 GGATC at 4006
GGTTG at 3945
ciGGAGTAA at 3891 AGATG at 3919
ciGGACCAG at 3870 ciCATCC at 3903
CCATACC at 3858 GGACC at 3868
ciGATGTGG at 3810 ciCAACT at 3849
ciTGTCT at 3833
GGTCG at 3813
CTGAACC at 3784 GGTTG at 3804
ciAATGCAG at 3772 ciGAACC at 3784
ciGGACTGG at 3749 AGACC at 3761
CTGGACT at 3747 GGACA at 3756
ciGGAACAG at 3725 ciGGACT at 3747
CCATTTC at 3688 ciCGTCC at 3698
ciAATCCAG at 3681 ciCGTCC at 3698
GGATA at 3655
ciGGACT at 3640
AGATG at 3627
AGATG at 3620
CTGCTCC at 3582 GGTTG at 3605
ciCATCT at 3551
GGTTG at 3532
CCAGATC at 3488 ciCAACT at 3524
ciAAACCAG at 3485
ciGAACTAG at 3462 AGATC at 3488
AGATA at 3465
ciGAACT at 3460
ciGAAGTGA at 3410 AGACA at 3433
ciAAATTGA at 3358 GGACA at 3389
ciAAAACAA at 3330
ciAGAGCAA at 3311 ciTGTCT at 3321
TTGCACT at 3289
TTGAACC at 3245 AGATC at 3276
GGTTG at 3261
ciGGTGTGG at 3186 ciGAACC at 3245
ciAAATTAG at 3176
ciAGACCAG at 3123 ciCATCT at 3154
AGACC at 3121
AGTTG at 3115
ciAAACTAA at 3030 GGATC at 3097
ciAAAATAA at 3013
ciAGAATGG at 3004
ciTGTCT at 2986
AGATA at 2981
AGACA at 2948
ciCAACT at 2911
TATAAA at 2874 Butler (2002) AGATG at 2905
AGATG at 2894
AGACA at 2880
Cores nn (2846-2811) Cores nn (2846-2811) Cores nn (2846-2811) Cores nn (2846-2811)
Cores pn (2846-2811) Cores pn (2846-2811) Cores pn (2846-2811) Cores pn (2846-2811)
ciAAAACAA at 2842 CAACC at 2844
Proximals nn (2811-2596) Proximals nn (2811-2596) Proximals nn (2811-2596) Proximals nn (2811-2596)
ciACTGAG at 2787 Ngoc (2017)
TCGTACT at 2784
TCGGACC at 2770 GGACC at 2770
ciAGTACGG at 2753 ciTGTCT at 2778
GTCACT at 2739 Ngoc (2017) GGTCG at 2766
TTGGACC at 2720 GGACC at 2720
ciCGACT at 2744
ciGAACT at 2714
ciCAACT at 2705
ciTTTATA at 2638 Butler (2002) ciCGACT at 2696
ciTGTCC at 2689
GTCACA at 2656 Ngoc (2017) GGTCG at 2681
GGACAT at 2673
ciATTTATA at 2638 TCACACC at 2658 GGACA at 2672
CCACTTT at 2619 GGTCA at 2654
TTGTACC at 2614 AGTCG at 2650
GGTTGT at 2611 Juven-Gershon (2010)
TCACACC at 2605 GGTTG at 2610
GTCACA at 2603 Ngoc (2017)
GGTCA at 2601
Proximals pn (2811-2596) Proximals pn (2811-2596) Proximals pn (2811-2596) Proximals pn (2811-2596)
ATGACT at 2786 Juven-Gershon (2010)
AGTTG at 2733
ciGAACC at 2717
AGTTG at 2704
AGACC at 2598
Distal nn (2596-1) Distal nn (2596-1) Distal nn (2596-1) Distal nn (2596-1)
ciCAACT at 2593
AGTCCT at 2588 Juven-Gershon (2010)
CCAGTCC at 2587 AGTCC at 2587
ciGAACT at 2580
ciAGTACGG at 2535 ciCGTCC at 2568
ciCGACT at 2562
CCGGTCC at 2519 GGTTG at 2547
TCATTCT at 2503 GGACA at 2538
TTGTTTT at 2490 GGTCC at 2519
ciTGTCC at 2514
TCGTTTT at 2476 AGTTA at 2496
TCACTCT at 2449
TCGGACC at 2435 ciTGTCT at 2443
ciAGTGTGG at 2418 GGACC at 2435
TTGGACC at 2385 GGTCG at 2431
ciCGTCC at 2389
GGACC at 2385
ciGAACT at 2379
ciCGTCC at 2367
ciCGACT at 2361
GGTCG at 2346
GGACA at 2337
CCACTTT at 2282 ciCGACC at 2326
TCGTACC at 2277 AGATG at 2294
TCGGACC at 2268 GGACC at 2268
TCAAACT at 2257 GGTCG at 2264
CCAGTCC at 2250 AGTCC at 2250
CCACGCC at 2197 ciAGTGCGG at 2208 ciCGACT at 2226
CCGCTTT at 2157 GGTCA at 2211
TTGTACC at 2152 AGATG at 2169
TCAAACT at 2141 GGTTG at 2148
AGTCC at 2134
ciGAACT at 2127
ciTGTCT at 2119
TCACATT at 2087 ciCGACT at 2109
CCGGTCC at 2077 GGATC at 2093
TTACACC at 2065 GGTCC at 2077
TCGTTCT at 2023 ciCGACC at 2069
TCGGACC at 2009 AGACA at 2029
ciAGTGCGG at 1992 ciTGTCT at 2017
GGACC at 2009
TTGGACC at 1959 GGTCG at 2005
CCGTACT at 1953 ciCGTCT at 1967
GGACC at 1959
ciCGTCC at 1941
ciTGACT at 1935
ciGAACC at 1927
CCGCACC at 1897 GGTCG at 1920
GGACA at 1911
ciCGACC at 1891
ciGGACCGA at 1843 AGATG at 1867
ciCAACT at 1853
GGACC at 1841
ciCGTCC at 1823
ciTTTATA at 1740 Butler (2002) GGTTC at 1817
ciAGTGCAG at 1773 ciCGACT at 1800
GGTCG at 1785
TTATACC at 1742 AGACA at 1776
ciAAAATAG at 1730 ciCGACC at 1756
CCGCGCC at 1762 ciCGACC at 1746
TTAATTT at 1697 ciTATCT at 1710
TATAAA at 1602 Butler (2002) ciGGTCT at 1670
ciCATCT at 1653
ciAGAACGG at 1608 ciGAACC at 1649
ciGGACT at 1623
TTGGATT at 1591 ciCGTCT at 1614
TTACTTT at 1582 GGTCG at 1611
CCGTTTT at 1561 ciTGTCT at 1567
TTGCTTC at 1555
ciGATATAG at 1528 GGTCA at 1532
AGATA at 1525
ciGGTCT at 1518
CCACACT at 1479 AGTTG at 1513
ciGGTCCGA at 1462 AGTCG at 1486
ciAGAGCGA at 1448 ciCGACC at 1464
GGTCC at 1460
AGACA at 1452
TTGTTTT at 1394 ciGGTCT at 1411
TCGTTTT at 1371 AGTTG at 1406
TTATTCT at 1365
TCAGACC at 1356 AGACC at 1356
TTGGATC at 1306 GGTCA at 1352
ciCGTCT at 1314
ciGATCC at 1307
GGATC at 1306
ciGAACT at 1300
ciCGTCC at 1288
CCGCACC at 1244 ciCGACT at 1282
AGTCC at 1275
GGTCG at 1267
CCACTTT at 1212 GGACA at 1258
TTGTACC at 1207 AGATG at 1224
ciGGACCGG at 1200 GGTTG at 1203
TCGGACC at 1198 GGACC at 1198
GGTCG at 1194
ciAGTGTGG at 1128 ciGGACT at 1173
GGTCG at 1140
GGACA at 1131
TCACTCT at 1079 ciCGACC at 1111
ciGAAGTGA at 1056 AGACA at 1085
ciTGTCT at 1073
GGTCG at 1061
TTGGACC at 1015 ciTAACC at 1045
ciCGTCT at 1023
GGACC at 1015
TTAGTCC at 984 ciGAACT at 1009
ciCGTCC at 997
ciCGACT at 991
AGTCC at 984
CCGTACC at 953 GGTCG at 976
TCGGTCC at 948 ciCATCT at 970
GGACA at 967
TCGCTCT at 913 GGTCC at 948
AGACA at 919
ciTGTCT at 907
TCGGACC at 899 GGACC at 899
ciAGTGTGG at 882 GGTCG at 895
TCGGTTC at 874 GGTTC at 874
GGTCC at 850
ciGAACT at 843
ciCGTCC at 831
ciCGACT at 825
CTACACC at 787 GGTCG at 810
GGACA at 801
ciCGACC at 781
TCGCACC at 741 AGATG at 758
ciGGACTGG at 734 GGTCG at 737
TCGGACT at 732 ciGGACT at 732
CCAGTCC at 714 GGTCG at 728
CCGGTTC at 692 AGTCC at 714
ciCGTCC at 697
ciAGTGCGG at 664 GGTTC at 692
CCGGTCC at 648 GGTCG at 676
GGACA at 667
GGTCC at 648
TTATACC at 605 ciTAACC at 643
ciGGACCGA at 598 AGATG at 624
CCAGTCC at 578 GGACC at 596
CCGGTTC at 556 ciTAACT at 585
AGTCC at 578
ciCGTCC at 565
ciTGTCC at 561
GGTTC at 556
TCGGACC at 508 GGTCG at 540
GGACC at 508
TCACTTT at 473 GGTCG at 504
TTGTATC at 468 AGATG at 481
TCGGACC at 459 GGACC at 459
CCAGTCC at 441 AGTCC at 441
CCGGTTC at 419 ciTGTCC at 424
CCACGCC at 380 GGTTC at 419
GGTCG at 403
GGACA at 394
CTGCTTT at 312 ciTATCT at 355
TCACTCT at 301 ciGAACC at 328
TATAAA at 221 Butler (2002) TTATACT at 274 ciTGACT at 307
TTGGTCC at 262 ciTGTCT at 289
TTATAAAA at 222 Carninci (2006) CTACATT at 247 ciCATCT at 284
TATAAAA at 183 Carninci (2006) ciGATACAA at 213 GGTCC at 262
CCATATT at 181 AGATA at 234
CCGTACT at 124 ciTGTCT at 168
ciCGACT at 140
CCGTTTC at 93 ciTGACT at 130
ciCATCC at 119
CTATACC at 77 ciTATCT at 100
TTGTTCC at 71 ciCAACT at 85
GGTCG at 35
ciTGACT at 17
ciTGTCT at 13
Distal pn (2596-1) Distal pn (2596-1) Distal pn (2596-1) Distal pn (2596-1)
AGTTG at 2592
GGTCA at 2585
GGATC at 2574
AAAACAA at 2509 AGTCC at 2543
AAAGCAA at 2480
AAAGCAA at 2474
GATTCGG at 2454
AGAGTGA at 2447
ciCTGCACT at 2426
ciTTGAACC at 2382 AGATC at 2413
GGTTG at 2398
ciCTACTCC at 2352 ciGAACC at 2382
AAACTAG at 2313
AATACAA at 2305
AGACCAG at 2263 ciCATCT at 2290
GGACA at 2271
AGACC at 2261
GGTCA at 2248
GGATC at 2239
GGTGCGG at 2197 GGTTG at 2234
AAAATGA at 2187 ciTGACC at 2189
GATACAA at 2180
AGACCAA at 2147 AGATA at 2177
AGTTTGA at 2141 ciTGTCT at 2165
AGTGTAA at 2087 AGACC at 2145
GGTGCAG at 2082
AATGTGG at 2065
AGAGCAA at 2021 ciTGTCT at 2031
ciCTGCACT at 2000 AGACC at 2121
GGACA at 2117
AGAATGG at 1948 AGATC at 1987
ciGGCGTGG at 1897 AGACTGA at 1935 ciGAACC at 1956
AAATTAG at 1887 ciGAGACT at 1933 Matsumoto (2020)
AATACAA at 1878
ciCATCT at 1863
ciCATCC at 1838
AGACC at 1834
AGATG at 1828
ciGATCC at 1813
GGATC at 1812
AATATGG at 1742 ciCGTCT at 1774
ciTTATTTT at 1727 ciTATCT at 1731
GAATTAA at 1696
AAAGCGG at 1680 ciGAACT at 1685
GAAATGA at 1663 ciGAACT at 1685
GAAACAA at 1585 AGATA at 1595
AATACAG at 1566 ciCATCC at 1572
AGAACGA at 1553 AGACA at 1569
AGTGCAA at 1536 AGACA at 1569
ciTATCC at 1529
GGTGTGA at 1479 ciCAACC at 1514
AGTGCAG at 1471 ciGATCT at 1482
ciTCGCTCT at 1450 ciGATCT at 1482
AGATG at 1438
AAAACAA at 1388 ciCAACC at 1407
ciCCATTTC at 1380 ciCAACC at 1407
AGAGCAA at 1369
AGTCTGG at 1356
ciCCAGTCT at 1354
ciTTGCACT at 1347
ciTTGCACC at 1339
ciGGCGTGG at 1244 ciTTGAACC at 1303 GGTTG at 1319
AAATTAG at 1234 ciGAACC at 1303
ciTGTCT at 1222
GGACGCC at 1153 ciCGACC at 1191
AGTTC at 1177
GGATC at 1167
ciAGAGTGA at 1077 GGACG at 1151
ciTCACTCC at 1058 ciTGTCT at 1087
ciAGATTGG at 1045 ciGAGACT at 1081 Matsumoto (2020)
ciTTGAACC at 1012 ciTGACT at 1051
GGTTG at 1028
ciGATCCAG at 975 ciGAACC at 1012
ciGATCC at 973
ciAGAGCGA at 911 AGATC at 972
ciTGTCT at 921
ciGAGACT at 915 Matsumoto (2020)
ciTTGAACC at 846 AGATC at 877
ciGATGTGG at 787 GGTTG at 862
ciAAATTAG at 777 ciGAACC at 846
ciAATACAA at 769 GGATG at 784
ciAGACCAG at 727 ciCGTCT at 754
ciAGTTCGA at 721 ciTGACC at 734
AGACC at 725
AGTTC at 719
ciAAATTGG at 643 GGTCA at 712
ciAATACAA at 635 GGATC at 703
ciAATATGG at 605 ciTAACC at 614
ciAGATTGA at 585 ciCATCC at 593
AGATC at 589
GGTCA at 576
GGTCA at 568
AGACA at 559
ciAAATTAG at 499 GGATC at 525
ciAATACGA at 492
ciAGTGCGA at 448 ciTGTCT at 479
GGTCA at 439
GGATC at 430
ciGGTGCGG at 380 AGACA at 422
ciAAACTGA at 307
ciAGAACAG at 288
ciAATATGA at 274
ciAAACCAG at 261
ciAGTTCAA at 255
ciGATGTAA at 247 AGTTC at 253
ciGAAACAA at 229 AGATG at 244
ciGGTATAA at 181 GGTCA at 206
ciAAAACAG at 167 AGACA at 170
CTGCATT at 152 AGTCG at 157
ciAAACTGA at 130 AGTCG at 157
ciGATATGG at 77 GGATA at 108
ciAAAACAA at 69 GGATA at 98
AGTTG at 84
ciGGACCAG at 34 GGATA at 74
CTGAATT at 20 AGATA at 57
ciAGACTGA at 17 GGACC at 32

Comparisons of positive direction promoter elements

Butler (2002) | Watson (2014) | Juven-Gershon (2008) | Butler (2002)
~-37 to -32 BREu SSRCGCC | ~-31 to -26 TATA box TATAWAW | -2 to +4 Inr YYRNWYY | +28 to +32 DPE RGWYV
Cores np(4445-4265) Cores np(4445-4265) Cores np(4445-4265) Cores np(4445-4265)
ciGGAACAG at 4445
ciGGTCTGG at 4416 GGTCC at 4420
ciGGTCT at 4414
ciGGAGTGA at 4350 ciGGTCT at 4380
CTGCACC at 4343 ciTGTCC at 4367
ciCGACC at 4358
AGACA at 4332
AGACG at 4319
GGTCA at 4269
Proximals np(4265-4050) Proximals np(4265-4050) Proximals np(4265-4050) Proximals np(4265-4050)
GGACA at 4252
ciTGACC at 4216
AGTTC at 4200
TTAGTTT at 4139 ciCATCC at 4183
ciGATTTAG at 4136
TTGATTT at 4134
TCACTCT at 4128
TCATTTT at 4120
ciGAAATGA at 4094
ciAGAACAG at 4069 GGATG at 4099
ciGATCC at 4081
GGATC at 4080
GGTTC at 4073
ciCGTCT at 4056
Distals np(4050-1) Distals np(4050-1) Distals np(4050-1) Distals np(4050-1)
ciGAACT at 4048
ciAGAGTGG at 4040
ciGGTGTGA at 3971 ciTGACC at 4018
ciAGTGTGG at 3966
ciAGTCTGA at 3924 ciGAACC at 3937
ciAGAGTGA at 3876 ciCAACC at 3911
ciGAACCAG at 3840 AGACA at 3893
ciAGAATGA at 3835 AGTCC at 3863
TCACACC at 3824 ciGAACC at 3856
ciAATCCGA at 3799 ciGAACC at 3838
GGTCA at 3820
ciCGACT at 3801
ciTGACC at 3784
ciGGTCT at 3771
ciTAACT at 3733
ciGAAGCGG at 3670 ciTGACC at 3714
ciCATCC at 3629
CTGTTCC at 3625
ciAGTGTGA at 3594 ciTGTCC at 3619
ciGGAATGA at 3567 ciGGTCT at 3608
CCAGACC at 3550 ciTGTCC at 3577
ciGGACCAG at 3547 GGATG at 3574
AGACC at 3550
TCACACT at 3507 GGACC at 3545
TCACACT at 3507 GGACA at 3530
ciAGTGCAG at 3465 GGTTG at 3490
ciGATGCAG at 3460 AGATG at 3475
ciGGAATGA at 3441 GGATG at 3457
TTGCATC at 3402 AGATG at 3418
CTGTTCC at 3352 ciCATCT at 3403
TTGCACT at 3343 ciTGTCT at 3392
TTGCACT at 3343 ciTATCC at 3384
TTGCACT at 3343 AGTTA at 3381
CCGCATC at 3328
CTGCACC at 3322
CTGCTCC at 3309
CTGGTCT at 3299 ciCATCT at 3329
TCGCTCT at 3276 ciGGTCT at 3299
TCGCTCT at 3276 ciCAACT at 3291
CTGGTCT at 3245 AGTCG at 3283
ciCGTCT at 3256
ciGGACCAA at 3174 ciGGTCT at 3245
ciGGACCAA at 3174 ciCGTCC at 3203
ciGAAATGG at 3168
ciAATATGG at 3162 GGACC at 3172
CCAGTCC at 3084 GGACA at 3131
CCAGTCC at 3084 ciCATCC at 3108
ciGGTCTGG at 3021 AGTCC at 3084
ciTGTCT at 3053
GGTTG at 3050
CCAGTCC at 2998 GGTTA at 3024
CCAGTCC at 2998 ciGGTCT at 3019
CCAGTCC at 2998 ciGGTCT at 3019
CTGCTCC at 2978 ciTGTCT at 3004
ciGGTCTGA at 2943 AGTCC at 2998
ciGATTTGA at 2871 AGTTC at 2954
ciGATTTGA at 2871 ciTGACT at 2945
ciGATTTGA at 2871 ciGGTCT at 2941
ciGATTTGA at 2871 GGTTC at 2922
ciTGACC at 2873
TCAGATT at 2868
ciAGAATGA at 2841 AGACC at 2861
ciGGTGCAA at 2801 ciCATCT at 2852
ciGGTGCAA at 2801 ciGGACT at 2820
ciGAACC at 2776
ciAAAGTGG at 2711 GGATA at 2737
ciAGAGCAA at 2705 GGATG at 2714
ciGGACTGA at 2674
ciGATATAA at 2662
CCACACT at 2636 ciGGACT at 2672
ciGAAATAG at 2626 AGTTA at 2666
ciTTTATA at 2588 Butler (2002) CCACACC at 2602 GGATA at 2659
TTATACC at 2590 AGTCA at 2618
TTATACC at 2590 AGTCA at 2613
TTATACC at 2590 AGTCA at 2607
CCGCACC at 2566 ciGAACC at 2579
CTAATTT at 2440 ciGATCC at 2514
CTAATTT at 2440 ciGGTCT at 2489
CTACACC at 2430 ciTGTCT at 2466
CTACACC at 2430 GGACA at 2460
GGATG at 2409
ciGGTGCAA at 2335 ciGATCC at 2378
ciAGTGCAG at 2327 ciCGACT at 2359
TCACTCT at 2306
CTGTTTC at 2263
TCAATCT at 2235 ciGGACT at 2271
ciAGATCAA at 2232 ciGGTCT at 2258
CCAGATC at 2230
ciGAACCAG at 2227
CTGCATT at 2206 AGATC at 2230
TCATATT at 2178 ciGAACC at 2225
GGTCA at 2220
ciTGACC at 2213
ciTGTCT at 2172
AGTTA at 2134
TCGCTTC at 2095 ciCAACC at 2120
ciCATCT at 2111
ciAGTGCAG at 2064 AGTCA at 2100
CCAGTCC at 2026 ciTGTCT at 2078
ciAAAGCAG at 2007 GGTCA at 2035
CTATTTC at 1978 AGTCC at 2026
ciGGTGTGG at 1971 ciCAACC at 2013
ciGAACTGG at 1953 AGTTC at 1987
ciGGTCT at 1958
CCACTTC at 1914 ciGAACT at 1951
ciCGTCC at 1930
GGTTC at 1926
GGATG at 1878
GGACA at 1869
ciTGTCT at 1862
ciGGCGCCC at 1770 ciGGTGTGG at 1805 AGTCC at 1841
ciAGTGCAG at 1787 AGTCC at 1826
GGGCGCC at 1769 ciGAACC at 1811
ciGGTGCGG at 1764
CCAGACT at 1744
GGACGCC at 1672 ciTGTCT at 1731
ciGGTCT at 1711
ciTGTCT at 1862
GGACA at 1693
GGTCG at 1687
ciGAAGCGG at 1636 GGACG at 1670
ciTGACC at 1662
ciAGTGCGG at 1590 ciCAACC at 1616
CTGCACT at 1472 AGTCG at 1528
ciCGTCT at 1493
ciAATGCGG at 1422
CTGCACT at 1372 ciCGTCT at 1393
ciAATGCGG at 1322
GCACGCC at 1302
ciAGTGCGG at 1254 ciCAACC at 1280
ciAGTGCGG at 1170
ciAGTGCGG at 1086
ciCGACT at 998
ciTGTCC at 993
ciGATCC at 965
AGATC at 964
ciCAACC at 944
ciGGACT at 914
ciCGACT at 898
ciTGTCC at 893
ciGATCC at 865
AGATC at 864
ciCAACC at 844
ciGGTGCAG at 784 ciGGACT at 814
CCGGACT at 746 AGTCC at 757
ciGGACT at 746
AGACA at 712
ciAGTGCGG at 666 ciCATCC at 698
ciAGTGCGG at 582 ciCATCC at 629
ciCAACC at 608
ciAGTGCGG at 498
ciGGTGCGG at 489
ciAGACCGG at 442
ciGGAGCGA at 429 AGACC at 440
GGACG at 410
AGACG at 398
CCACACT at 345 GGACG at 359
GGACG at 323
GGTTC at 305
ciAATGTGA at 230 ciTGTCT at 268
GGTCC at 218
GGACC at 187
CTGTTTT at 147 AGTCC at 172
AGATG at 166
TTGTATT at 115 GGTCA at 153
ciTGTCT at 100
ciAGAGTGG at 53 ciTGTCC at 82
GGATG at 59
GGACC at 37
ciCATCC at 30
Cores pp(4445-4265)
GGACC at 4424
AGACC at 4416
GGACC at 4409
ciGGTCT at 4330
ciCGTCT at 4317
ciGAACC at 4300
AGTCA at 4271
Proximals pp(4265-4050)
GGACG at 4231
ciGGACT at 4214
ciGGACT at 4186
ciCGACC at 4177
ciTAACT at 4161
ciGAACT at 4131
ciTGACT at 4089
ciGATCC at 4077
AGATC at 4076
ciTGTCC at 4070
ciGATCT at 4065
AGATC at 4064
AGTCG at 4052
Distals pp(4050-1)
ciCATCT at 4036
GGTCC at 4032
AGTCG at 4023
ciGAACT at 4016
AGTCG at 3997
ciCGACC at 3989
ciTGTCC at 3975
ciCGTCT at 3916
ciGGTCT at 3891
AGTCC at 3868
GGTCA at 3841
ciCGTCT at 3831
ciGGTCT at 3806
GGACC at 3787
ciCGACT at 3778
ciCGTCC at 3768
AGTCG at 3775
GGACC at 3758
ciCATCC at 3753
ciTGACT at 3735
AGTCC at 3728
GGTCG at 3720
ciCGTCC at 3694
GGTCC at 3687
GGACC at 3679
ciCGTCC at 3662
ciTGTCC at 3636
GGTTG at 3633
GGACA at 3622
GGACA at 3617
ciCGACT at 3588
ciTGTCC at 3571
ciGGTCT at 3548
GGTCC at 3536
ciCGACC at 3526
ciGATCC at 3522
GGACC at 3496
ciGATCC at 3484
ciCGTCT at 3473
ciCGTCC at 3466
GGACA at 3434
AGTTA at 3424
ciCATCT at 3416
AGACC at 3405
GGTCA at 3379
GGACC at 3362
AGACG at 3358
ciTGACC at 3345
AGACG at 3306
GGACC at 3296
AGTTG at 3290
AGACG at 3278
AGACG at 3267
AGATA at 3258
ciCGACC at 3242
GGTCG at 3239
ciGGTCT at 3221
ciCGTCT at 3214
ciTGTCT at 3179
AGTCG at 3155
ciCGTCC at 3147
ciTGTCT at 3133
ciCGTCC at 3128
ciTGACC at 3117
GGTCC at 3111
ciGGTCT at 3091
GGTCA at 3082
AGACG at 3060
GGACC at 3047
AGTCG at 3041
AGTCC at 3034
AGACC at 3021
GGTCC at 3016
GGTCA at 2996
GGACC at 2988
AGACC at 2983
AGACG at 2975
ciGGACT at 2968
AGACA at 2957
AGTCA at 2936
AGACA at 2925
ciCGACT at 2915
GGTTA at 2908
GGACC at 2891
AGACC at 2883
GGTCC at 2876
ciCGTCT at 2859
AGACG at 2856
ciTGTCT at 2837
ciCAACC at 2816
ciCGACC at 2810
GGTCC at 2780
ciCGACC at 2770
ciCGTCC at 2745
ciCGACC at 2734
ciCGTCT at 2721
ciCGTCC at 2683
ciTGACT at 2674
ciTGTCT at 2652
ciGATCC at 2639
ciTATCT at 2627
ciTTTATA at 2588 Butler (2002) CCACACC at 2602 AGTCC at 2620
ciGGCGTGG at 2566 TTATACC at 2590 AGTTC at 2615
GGTCA at 2605
GGTTC at 2593
GGTCC at 2574
GGACC at 2569
ciTATCC at 2550
ciCAACC at 2541
AGTCG at 2526
GGACG at 2520
AGTTC at 2508
GGACC at 2501
ciGATCC at 2482
GGATC at 2481
GGACC at 2433
ciTGTCT at 2414
ciCGACC at 2405
GGTTC at 2398
AGTCG at 2390
AGTCC at 2372
ciCGACC at 2320
GGTCC at 2316
AGACA at 2308
ciCGTCC at 2296
AGACA at 2260
ciCATCC at 2255
GGACA at 2250
AGTTA at 2233
ciGGTCT at 2228
ciGGACT at 2211
AGTCG at 2198
ciCAACC at 2185
AGACA at 2182
AGATC at 2167
ciTGTCC at 2125
AGTCC at 2115
AGTCG at 2102
AGTCA at 2098
AGTCA at 2060
GGTCG at 2052
GGTCA at 2024
GGTTG at 2012
AGACC at 1992
ciTGTCC at 1966
ciTGACC at 1953
ciCGTCT at 1937
ciCGTCC at 1905
GGTCC at 1893
ciCATCC at 1875
AGACC at 1864
GGACA at 1860
GGTCC at 1855
CCACGCC at 1764 ciAGTGCAG at 1787 GGACC at 1815
ciGAACC at 1799
ciCGTCC at 1788
ciCGACC at 1779
GGACG at 1776
ciGGTCT at 1742
ciCGACC at 1736
AGACG at 1733
ciGGACT at 1676
ciGGACT at 1660
ciGGTCT at 1631
AGTTG at 1621
AGTCG at 1603
GGATG at 1573
ciGGCGCCG at 1438 CTGCACT at 1472 AGACG at 1495
AGACC at 1476
GGACG at 1469
GGTCG at 1463
GGTCG at 1457
ciCGTCT at 1416
ciGGCGCCG at 1338 CTGCACT at 1372 GGACG at 1411
AGACG at 1395
AGACC at 1376
GGACG at 1369
GGTCG at 1363
GGTCG at 1357
ciCGTCT at 1316
GGACG at 1311
ciTGACT at 1286
GGATG at 1283
GGTTG at 1279
GGTCG at 1271
AGTCG at 1267
GGTCA at 1250
GGACC at 1199
GGATG at 1195
GGTCC at 1175
ciTGACC at 1140
GGTCG at 1127
CGACGCC at 1033 ciAGTGCGG at 1086 GGACG at 1118
GGACG at 1075
GGACA at 991
ciGGACT at 959
GGACC at 947
GGTTG at 943
ciGGTCT at 935
AGTCG at 931
GGACG at 907
GGACA at 891
ciGGACT at 859
GGACC at 847
GGTTG at 843
ciGGTCT at 835
AGTCG at 831
ciCGACC at 779
ciGGCGCGC at 682 CCGGACT at 746
ciGGACT at 725
GGTCC at 707
ciCGTCC at 658
GGATG at 649
GGTCG at 623
GGTCG at 617
AGTCG at 613
GGTTG at 607
GGACC at 598
ciTGTCC at 552
CCACGCC at 489 ciAGTGCGG at 498 GGTCC at 515
AGTCG at 511
ciGGTCT at 468
ciCGTCT at 438
GGACG at 435
GGTCC at 424
ciCGACC at 417
ciCGTCT at 396
ciCGACC at 386
ciCGTCC at 379
ciTGTCC at 365
ciTGACC at 347
GGTCG at 329
ciCGTCC at 318
GGACC at 286
ciCGACC at 277
AGACC at 270
AGACG at 223
GGTCC at 215
ciGGTCT at 204
ciCGTCC at 194
GGACG at 191
GGTTC at 177
ciTGTCC at 157
GGACA at 144
AGACC at 102
AGACA at 98
AGTCC at 90
GGACC at 40
GGTCC at 33
ciTAACC at 24
ciGGTCT at 15
GGTCC at 8
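
The hits tabulated above pair each degenerate consensus sequence (SSRCGCC, TATAWAW, YYRNWYY, RGWYV) with the positions where it occurs. A minimal Python sketch of how such a scan could be done follows. It is not the procedure used to build the table: the function names `reverse_complement` and `scan` are hypothetical, the "ci" prefix is assumed to mark a site whose complement inverse (reverse complement) fits the consensus, and positions are assumed to be 1-based start positions, which may differ from the numbering convention of the table.

```python
import re

# Standard IUPAC nucleotide codes, covering the letters used in the consensus sequences.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complement inverse of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def scan(seq, consensus):
    """List (label, position) pairs for every window that fits the consensus.

    Assumptions of this sketch: positions are 1-based window starts, and a
    window that fits in both orientations is reported once, without "ci".
    """
    seq = seq.upper()
    k = len(consensus)
    # Turn each IUPAC letter into a character class, e.g. RGWYV -> [AG][G][AT][CT][ACG].
    pattern = re.compile("".join(f"[{IUPAC[c]}]" for c in consensus.upper()))
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if pattern.fullmatch(window):
            hits.append((window, i + 1))
        elif pattern.fullmatch(reverse_complement(window)):
            # The sequence as written does not fit, but its complement inverse does.
            hits.append(("ci" + window, i + 1))
    return hits


if __name__ == "__main__":
    # Toy sequence, not taken from any gene discussed here.
    for label, pos in scan("AAGGTCCAAGGTCTAA", "RGWYV"):
        print(f"{label} at {pos}")
```

Run on the toy sequence, the sketch prints `GGTCC at 3` and `ciGGTCT at 10`: the first window fits RGWYV directly, while the second fits only after taking its complement inverse (AGACC).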
Acknowledgements

The content on this page was first contributed by: Henry A. Hoff.

Initial content for this page in some instances came from Wikiversity.

References

  1. Jennifer E.F. Butler, James T. Kadonaga (October 15, 2002). “The RNA polymerase II core promoter: a key component in the regulation of gene expression”. Genes & Development. 16 (20): 2583–92. doi:10.1101/gad.1026202. PMID 12381658.
  2. Gillian E. Chalkley and C. Peter Verrijzer (September 1, 1999). “DNA binding site selection by RNA polymerase II TAFs: a TAFII250-TAFII150 complex recognizes the Initiator” (PDF). The EMBO Journal. 18 (17): 4835–45. PMID 10469661. Retrieved 2012-04-26.
  3. S. T. Smale (1997). “Transcription initiation from TATA-less promoters within eukaryotic protein-coding genes”. Biochim. Biophys. Acta. 1351: 73–88. Retrieved 2012-04-26.
  4. Ceyockey (28 January 2005). “promoter”. San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2012-09-29.
  5. Stephen T. Smale and James T. Kadonaga (July 2003). “The RNA Polymerase II Core Promoter” (PDF). Annual Review of Biochemistry. 72 (1): 449–79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739. Retrieved 2012-05-07.
  6. Robert D. Andersen, Susan J. Taplitz, Sandy Wong, Greg Bristol, Bill Larkin, and Harvey R. Herschman (October 1987). “Metal-Dependent Binding of a Factor In Vivo to the Metal-Responsive Elements of the Metallothionein 1 Gene Promoter” (PDF). Molecular and Cellular Biology. 7 (10): 3574–81. doi:10.1128/MCB.7.10.3574. Retrieved 2013-04-15.
  7. Msh210 (23 February 2010). “GC box”. San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2013-01-27.
  8. Klug WS, Cummings MR, Spencer CA, Palladino MA (2009). Concepts of Genetics: Ninth Edition. San Francisco: Pearson Benjamin Cummings. pp. 463–464. ISBN 978-0-321-54098-0.
  9. “GC box”. San Francisco, California: Wikimedia Foundation, Inc. June 23, 2012. Retrieved 2013-01-27.
  10. Michael C. Blake, Robert C. Jambou, Andrew G. Swick, Jeanne W. Kahn, and Jane Clifford Azizkhan (December 1990). “Transcriptional Initiation Is Controlled by Upstream GC-Box Interactions in a TATAA-Less Promoter” (PDF). Molecular and Cellular Biology. 10 (12): 6632–41. doi:10.1128/MCB.10.12.6632. PMID 2247077. Retrieved 2013-01-27.
  11. H Imataka, K Sogawa, KI Yasumoto, Y Kikuchi, K Sasano, A Kobayashi, M Hayami, and Y Fujii-Kuriyama (October 1992). “Two regulatory proteins that bind to the basic transcription element (BTE), a GC box sequence in the promoter region of the rat P-4501A1 gene” (PDF). The EMBO Journal. 11 (10): 3663–71. PMID 1356762. Retrieved 2013-01-27.
  12. Akiro Higashikawa, Taku Saito, Toshiyuki Ikeda, Satoru Kamekura, Naohiro Kawamura, Akinori Kan, Yasushi Oshima, Shinsuke Ohba, Naoshi Ogata, Katsushi Takeshita, Kozo Nakamura, Ung-Il Chung, Hiroshi Kawaguchi (January 2009). “Identification of the core element responsive to runt-related transcription factor 2 in the promoter of human type X collagen gene”. Arthritis & Rheumatism. 60 (1): 166–78. doi:10.1002/art.24243. PMID 19116917. Retrieved 2013-06-18.
  13. “CAAT box”. San Francisco, California: Wikimedia Foundation, Inc. April 8, 2013. Retrieved 2013-04-14.
  14. Lagrange T, Kapanidis AN, Tang H, Reinberg D, Ebright RH (1998). “New core promoter element in RNA polymerase II-dependent transcription: sequence-specific DNA binding by transcription factor IIB”. Genes & Development. 12 (1): 34–44. doi:10.1101/gad.12.1.34. PMC 316406. PMID 9420329.
  15. Littlefield O, Korkhin Y, Sigler PB (1999). “The structural basis for the oriented assembly of a TBP/TFB/promoter complex”. Proceedings of the National Academy of Sciences of the USA. 96 (24): 13668–73. doi:10.1073/pnas.96.24.13668. PMC 24122. PMID 10570130.
  16. “B recognition element”. San Francisco, California: Wikimedia Foundation, Inc. January 30, 2013. Retrieved 2013-01-30.
  17. Alan K. Kutach, James T. Kadonaga (July 2000). “The Downstream Promoter Element DPE Appears To Be as Widely Used as the TATA Box in Drosophila Core Promoters” (PDF). Molecular and Cellular Biology. 20 (13): 4754–64. PMID 10848601. Retrieved 2012-07-15.
  18. Chuhu Yang, Eugene Bolotin, Tao Jiang, Frances M. Sladek, Ernest Martinez (March 7, 2007). “Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters”. Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. PMID 17123746.
  19. Mary Lynch, Li Chen, Michael J. Ravitz, Sapna Mehtani, Kevin Korenblat, Michael J. Pazin and Emmett V. Schmidt (August 2005). “hnRNP K Binds a Core Polypyrimidine Element in the Eukaryotic Translation Initiation Factor 4E (eIF4E) Promoter, and Its Regulation of eIF4E Contributes to Neoplastic Transformation”. Molecular and Cellular Biology. 25 (15): 6436–53. doi:10.1128/MCB.25.15.6436-6453.2005. Retrieved 2013-03-17.
  20. “TATA box”. San Francisco, California: Wikimedia Foundation, Inc. June 17, 2013. Retrieved 2014-05-07.
  21. National Center for Biotechnology Information (April 28, 2012). “TBPL1 TBP-like 1 [ Homo sapiens ]”. 8600 Rockville Pike, Bethesda MD, 20894 USA: U.S. National Library of Medicine. Retrieved 2012-04-30.
  22. Wensheng Deng, Stefan G.E. Roberts (October 15, 2005). “A core promoter element downstream of the TATA box that is recognized by TFIIB”. Genes & Development. 19 (20): 2418–23. doi:10.1101/gad.342405. PMID 16230532.
  23. J. Carcamo, L. Buckbinder and D. Reinberg (1991). “The initiator directs the assembly of a transcription factor IID-dependent transcription complex”. Proc. Natl. Acad. Sci. USA. 88: 8052–6. Retrieved 2012-04-26.
  24. L. Weis and D. Reinberg (1997). “Accurate positioning of RNA polymerase II on a natural TATA-less promoter is independent of TATA-binding protein associated factors and initiator-binding proteins” (PDF). Mol. Cell. Biol. 17: 2973–84. Retrieved 2012-04-26.
  25. Marketa J. Zvelebil, Jeremy O. Baum (2008). Dom Holdsworth, ed. Understanding bioinformatics. New York: Garland Science. p. 772. ISBN 978-0815340249.
  26. Dong-Hoon Lee, Naum Gershenzon, Malavika Gupta, Ilya P. Ioshikhes, Danny Reinberg and Brian A. Lewis (November 2005). “Functional Characterization of Core Promoter Elements: the Downstream Core Element Is Recognized by TAF1”. Molecular and Cellular Biology. 25 (21): 9674–86. doi:10.1128/MCB.25.21.9674-9686.2005. PMID 16227614. Retrieved 2010-10-23.
  27. Chin Yan Lim, Buyung Santoso, Thomas Boulay, Emily Dong, Uwe Ohler, and James T. Kadonaga (July 1, 2004). “The MTE, a new core promoter element for transcription by RNA polymerase II”. Genes & Development. 18 (13): 1606–17. doi:10.1101/gad.1193404. PMID 15231738. Retrieved 2013-02-10.
  28. Tamar Juven-Gershon, James T. Kadonaga (March 15, 2010). “Regulation of Gene Expression via the Core Promoter and the Basal Transcriptional Machinery”. Developmental Biology. 339 (2): 225–9. doi:10.1016/j.ydbio.2009.08.009. PMC 2830304. PMID 19682982.
  29. “Downstream promoter element”. San Francisco, California: Wikimedia Foundation, Inc. May 6, 2012. Retrieved 2012-05-20.
  30. Tamar Juven-Gershon, Susan Cheng & James T Kadonaga (23 October 2006). “Rational design of a super core promoter that enhances gene expression”. Nature Methods. 3: 917–922. doi:10.1038/nmeth937.
  31. Glenn A. Maston, Sara K. Evans, and Michael R. Green (2006). “Transcriptional Regulatory Elements in the Human Genome”. Annual Review of Genomics and Human Genetics. 7: 29–59. doi:10.1146/annurev.genom.7.080505.115623. https://www.annualreviews.org/doi/pdf/10.1146/annurev.genom.7.080505.115623.
  32. Lee DH, Gershenzon N, Gupta M, Ioshikhes IP, Reinberg D, Lewis BA (2005). “Functional characterization of core promoter elements: the downstream core element is recognized by TAF1”. Mol. Cell. Biol. 25: 9674–86.
  33. Stephen T. Smale and James T. Kadonaga (July 2003). “The RNA Polymerase II Core Promoter” (PDF). Annual Review of Biochemistry. 72 (1): 449–79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739.
  34. Cquan (2 October 2006). “Promoter (genetics)”. San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2016-01-09.