Misplaced Pages

GATA2

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

An intron is any nucleotide sequence within a gene that is not expressed or operative in the final RNA product. The word intron is derived from the term intragenic region, i.e., a region inside a gene. The term intron refers to both the DNA sequence within a gene and the corresponding RNA sequence in RNA transcripts. The non-intron sequences that become joined by this RNA processing to form the mature RNA are called exons.

114-428: GATA2 or GATA-binding factor 2 is a transcription factor, i.e. a nuclear protein which regulates the expression of genes. It regulates many genes that are critical for

228-400: A cistron. Although introns are sometimes called intervening sequences, the term "intervening sequence" can refer to any of several families of internal nucleic acid sequences that are not present in the final gene product, including inteins, untranslated regions (UTR), and nucleotides removed by RNA editing, in addition to introns. The frequency of introns within different genomes

342-445: A cryptic splice site or mutate a functional site. They can also be somatic cell mutations that affect splicing in a particular tissue or a cell line. When the mutant allele is in a heterozygous state this will result in production of two abundant splice variants; one functional and one non-functional. In the homozygous state the mutant alleles may cause a genetic disease such as the hemophilia found in descendants of Queen Victoria where

456-512: A different strength of interaction. For example, although the consensus binding site for the TATA-binding protein (TBP) is TATAAAA, the TBP transcription factor can also bind similar sequences such as TATATAT or TATATAA. Because transcription factors can bind a set of related sequences and these sequences tend to be short, potential transcription factor binding sites can occur by chance if
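The point about related sequences can be illustrated with a minimal sketch that scans a DNA string for windows close to the TATAAAA consensus. The mismatch threshold and the function names are assumptions made for illustration only; real binding-site prediction typically relies on position weight matrices rather than a simple mismatch count.

```python
CONSENSUS = "TATAAAA"  # TBP consensus site quoted in the text


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_sites(seq: str, consensus: str = CONSENSUS, max_mismatches: int = 2):
    """Return (position, window, mismatches) for every window of the sequence
    within max_mismatches of the consensus (threshold is hypothetical)."""
    seq = seq.upper()
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        d = hamming(window, consensus)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits
```

With this threshold, the two alternative sequences named in the text (TATATAT and TATATAA) both count as candidate sites, which also shows why short, degenerate motifs match by chance in long sequences.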

570-400: A fifth family, but little is known about the biochemical apparatus that mediates their splicing. They appear to be related to group II introns, and possibly to spliceosomal introns. Nuclear pre-mRNA introns (spliceosomal introns) are characterized by specific intron sequences located at the boundaries between introns and exons. These sequences are recognized by spliceosomal RNA molecules when

684-452: A gene on a chromosome into RNA, and then the RNA is translated into protein. Any of these steps can be regulated to affect the production (and thus activity) of a transcription factor. An implication of this is that transcription factors can regulate themselves. For example, in a negative feedback loop, the transcription factor acts as its own repressor: If the transcription factor protein binds

798-573: A group II intron, and intronization. In theory it should be easiest to deduce the origin of recently gained introns due to the lack of host-induced mutations, yet even introns gained recently did not arise from any of the aforementioned mechanisms. These findings thus raise the question of whether or not the proposed mechanisms of intron gain fail to describe the mechanistic origin of many novel introns because they are not accurate mechanisms of intron gain, or if there are other, yet to be discovered, processes generating novel introns. In intron transposition,

912-421: A host cell to promote pathogenesis. A well-studied example is the transcription activator-like effectors (TAL effectors) secreted by Xanthomonas bacteria. When injected into plants, these proteins can enter the nucleus of the plant cell, bind plant promoter sequences, and activate transcription of plant genes that aid in bacterial infection. TAL effectors contain a central repeat region in which there

1026-773: A living cell. Additional recognition specificity, however, may be obtained through the use of more than one DNA-binding domain (for example tandem DBDs in the same transcription factor or through dimerization of two transcription factors) that bind to two or more adjacent sequences of DNA. Transcription factors are of clinical significance for at least two reasons: (1) mutations can be associated with specific diseases, and (2) they can be targets of medications. Due to their important roles in development, intercellular signaling, and cell cycle, some human diseases have been associated with mutations in transcription factors. Many transcription factors are either tumor suppressors or oncogenes , and, thus, mutations or aberrant regulation of them

1140-417: A major role in determining sex in humans. Cells can communicate with each other by releasing molecules that produce signaling cascades within another receptive cell. If the signal requires upregulation or downregulation of genes in the recipient cell, often transcription factors will be downstream in the signaling cascade. Estrogen signaling is an example of a fairly short signaling cascade that involves

1254-853: A methylated CpG site, 175 transcription factors (34%) that had enhanced binding if their binding sequence had a methylated CpG site, and 25 transcription factors (5%) were either inhibited or had enhanced binding depending on where in the binding sequence the methylated CpG was located. TET enzymes do not specifically bind to methylcytosine except when recruited (see DNA demethylation). Multiple transcription factors important in cell differentiation and lineage specification, including NANOG, SALL4A, WT1, EBF1, PU.1, and E2A, have been shown to recruit TET enzymes to specific genomic loci (primarily enhancers) to act on methylcytosine (mC) and convert it to hydroxymethylcytosine (hmC), in most cases marking them for subsequent complete demethylation to cytosine. TET-mediated conversion of mC to hmC appears to disrupt

1368-497: A mutation in one of the introns in a blood clotting factor gene creates a cryptic 3' splice site resulting in aberrant splicing. A significant fraction of human deaths by disease may be caused by mutations that interfere with normal splicing; mostly by creating cryptic splice sites. Incorrectly spliced transcripts can easily be detected and their sequences entered into the online databases. They are usually described as "alternatively spliced" transcripts, which can be confusing because

1482-424: A result of factors such as infections or other stresses. In consequence, the signs and symptoms of their disease appear and/or become progressively more severe. The role of GATA2 deficiency in leading to any of the leukemia types is not understood. Likewise, the role of GATA2 overexpression in non-familial AML as well as development of the blast crisis in chronic myelogenous leukemia and progression of prostate cancer

1596-490: A second example of negative feedback, GATA2 transcription factor stimulates the expression of the GATA1 transcription factor, which in turn can displace GATA2 transcription factor from its gene-stimulating binding sites, thereby limiting GATA2's actions. The human GATA2 gene is expressed in hematological bone marrow cells at the stem cell and later progenitor cell stages of their development. Increases and/or decreases in

1710-419: A significant error rate even though there are spliceosome accessory factors that suppress the accidental cleavage of cryptic splice sites. Under ideal circumstances, the splicing reaction is likely to be 99.999% accurate (error rate of 10⁻⁵) and the correct exons will be joined and the correct intron will be deleted. However, these ideal conditions require very close matches to the best splice site sequences and

1824-443: A single gene and a single precursor mRNA transcript. The control of alternative RNA splicing is performed by a complex network of signaling molecules that respond to a wide range of intracellular and extracellular signals. Introns contain several short sequences that are important for efficient splicing, such as acceptor and donor sites at either end of the intron as well as a branch point site, which are required for proper splicing by

1938-475: A single gene. Furthermore, some introns play essential roles in a wide range of gene expression regulatory functions such as nonsense-mediated decay and mRNA export. After the initial discovery of introns in protein-coding genes of the eukaryotic nucleus, there was significant debate as to whether introns in modern-day organisms were inherited from a common ancient ancestor (termed the introns-early hypothesis), or whether they appeared in genes rather recently in

2052-452: A smaller number. Therefore, approximately 10% of genes in the genome code for transcription factors, which makes this family the single largest family of human proteins. Furthermore, genes are often flanked by several binding sites for distinct transcription factors, and efficient expression of each of these genes requires the cooperative action of several different transcription factors (see, for example, hepatocyte nuclear factors ). Hence,

2166-505: A specific DNA sequence . The function of TFs is to regulate—turn on and off—genes in order to make sure that they are expressed in the desired cells at the right time and in the right amount throughout the life of the cell and the organism. Groups of TFs function in a coordinated fashion to direct cell division , cell growth , and cell death throughout life; cell migration and organization ( body plan ) during embryonic development; and intermittently in response to signals from outside

2280-409: A specific location within the anticodon loop of unspliced tRNA precursors, and are removed by a tRNA splicing endonuclease. The exons are then linked together by a second protein, the tRNA splicing ligase. Note that self-splicing introns are also sometimes found within tRNA genes. Group I and group II introns are found in genes encoding proteins ( messenger RNA ), transfer RNA and ribosomal RNA in

2394-463: A target for therapeutic intervention. This overexpression is not due to mutation but rather caused at least in part by the overexpression of EVI1 , a transcription factor that stimulates GATA2 expression. GATA2 overexpression also occurs in prostate cancer where it appears to increase metastasis in the early stages of androgen-dependent disease and to stimulate prostate cancer cell survival and proliferation through activating by an unknown mechanism

2508-457: A tendency towards intron gain in larger species due to their smaller population sizes, and the converse in smaller (particularly unicellular) species. Biological factors also influence which genes in a genome lose or accumulate introns. Alternative splicing of exons within a gene after intron excision acts to introduce greater variability of protein sequences translated from a single gene, allowing multiple related proteins to be generated from

2622-430: A very wide range of living organisms. Following transcription into RNA, group I and group II introns also make extensive internal interactions that allow them to fold into a specific, complex three-dimensional architecture . These complex architectures allow some group I and group II introns to be self-splicing , that is, the intron-containing RNA molecule can rearrange its own covalent structure so as to precisely remove

2736-505: A wide range of tissues, GATA2 similarly interacts with HDAC3, LMO2, POU1F1, POU5F1, PML, SPI1, and ZBTB16. GATA2 binds to a specific nucleic acid sequence, viz. (T/A)GATA(A/G), on the promoter and enhancer sites of its target genes and in doing so either stimulates or suppresses the expression of these target genes. However, there are thousands of sites in human DNA with this nucleotide sequence but, for unknown reasons, GATA2 binds to <1% of these. Furthermore, all members of
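As a minimal sketch of how candidate sites for the (T/A)GATA(A/G) motif could be located, the following scans both strands of a DNA string with a regular expression. The function names are hypothetical, and a sequence match only marks a potential site: as the text notes, GATA2 occupies fewer than 1% of such sequences in human DNA.

```python
import re

# (T/A)GATA(A/G) written as a regex; the lookahead lets overlapping hits be found.
MOTIF = re.compile(r"(?=([TA]GATA[AG]))")
MOTIF_LEN = 6
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_gata_sites(seq: str):
    """Return sorted (start, strand, matched sequence) tuples.
    'start' is always a 0-based coordinate on the forward strand."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in MOTIF.finditer(seq)]
    n = len(seq)
    for m in MOTIF.finditer(reverse_complement(seq)):
        # Convert the reverse-strand offset back to a forward-strand start.
        hits.append((n - (m.start() + MOTIF_LEN), "-", m.group(1)))
    return sorted(hits)
```

Searching both strands matters because a motif on the reverse strand appears on the forward strand as its reverse complement, (C/T)TATC(T/A).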

2850-540: A wide variety of genes throughout organisms, bacteria, and viruses within all of the biological kingdoms. The fact that genes were split or interrupted by introns was discovered independently in 1977 by Phillip Allen Sharp and Richard J. Roberts, for which they shared the Nobel Prize in Physiology or Medicine in 1993, though credit was excluded for the researchers and collaborators in their labs that did

2964-419: Is chromatin immunoprecipitation (ChIP). This technique relies on chemical fixation of chromatin with formaldehyde , followed by co-precipitation of DNA and the transcription factor of interest using an antibody that specifically targets that protein. The DNA sequences can then be identified by microarray or high-throughput sequencing ( ChIP-seq ) to determine transcription factor binding sites. If no antibody

3078-450: Is a simple relationship between the identity of two critical residues in sequential repeats and sequential DNA bases in the TAL effector's target site. This property likely makes it easier for these proteins to evolve in order to better compete with the defense mechanisms of the host cell. It is common in biology for important processes to have multiple layers of regulation and control. This

3192-455: Is also true with transcription factors: Not only do transcription factors control the rates of transcription to regulate the amounts of gene products (RNA and protein) available to the cell but transcription factors themselves are regulated (often by other transcription factors). Below is a brief synopsis of some of the ways that the activity of transcription factors can be regulated: Transcription factors (like all proteins) are transcribed from

3306-534: Is associated with cancer. Three groups of transcription factors are known to be important in human cancer: (1) the NF-kappaB and AP-1 families, (2) the STAT family and (3) the steroid receptors . Below are a few of the better-studied examples: Approximately 10% of currently prescribed drugs directly target the nuclear receptor class of transcription factors. Examples include tamoxifen and bicalutamide for

3420-595: Is available for the protein of interest, DamID may be a convenient alternative. As described in more detail below, transcription factors may be classified by their (1) mechanism of action, (2) regulatory function, or (3) sequence homology (and hence structural similarity) in their DNA-binding domains. They are also classified by 3D structure of their DBD and the way it contacts DNA. There are two mechanistic classes of transcription factors: Transcription factors have been classified according to their regulatory function: Transcription factors are often classified based on

3534-442: Is called its DNA-binding domain. Below is a partial list of some of the major families of DNA-binding domains/transcription factors: The DNA sequence that a transcription factor binds to is called a transcription factor-binding site or response element . Transcription factors interact with their binding sites using a combination of electrostatic (of which hydrogen bonds are a special case) and Van der Waals forces . Due to

3648-403: Is followed by guanine in the 5' to 3' DNA sequence, a CpG site.) Methylation of CpG sites in a promoter region of a gene usually represses gene transcription, while methylation of CpGs in the body of a gene increases expression. TET enzymes play a central role in demethylation of methylated cytosines. Demethylation of CpGs in a gene promoter by TET enzyme activity increases transcription of
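The definition of a CpG site given above (a cytosine followed by a guanine, reading 5' to 3') translates directly into a short scan. This is only an illustrative sketch with a hypothetical function name; it locates CpG dinucleotides in a sequence and says nothing about whether they are methylated.

```python
def cpg_positions(seq: str):
    """Return the 0-based positions of every cytosine that is immediately
    followed by a guanine when the sequence is read 5' to 3'."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]
```

For the sequence ACGTCGCG, the CpG cytosines sit at positions 1, 4 and 6.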

3762-438: Is indeed the case. While the catalytic reaction may be accurate enough for effective processing most of the time, the overall error rate may be partly limited by the fidelity of transcription because transcription errors will introduce mutations that create cryptic splice sites. In addition, the transcription error rate of 10⁻⁵ to 10⁻⁶ is high enough that one in every 25,000 transcribed exons will have an incorporation error in one of

3876-419: Is located 9.5 kilobases (i.e. kb) downstream from the gene's transcript initiation site and is a critically important enhancer of the gene's expression. Regulation of GATA2 expression is highly complex. For example, in hematological stem cells, GATA2 transcription factor itself binds to one of these sites and in doing so is part of a functionally important positive feedback autoregulation circuit wherein

3990-493: Is no less than 0.1% per intron. This relatively high level of splicing errors explains why most splice variants are rapidly degraded by nonsense-mediated decay. The presence of sloppy binding sites within genes causes splicing errors and it may seem strange that these sites haven't been eliminated by natural selection. The argument for their persistence is similar to the argument for junk DNA. Although mutations which create or disrupt binding sites may be slightly deleterious,
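The per-intron and per-gene error rates quoted in this passage can be related with a short calculation. The sketch below assumes that splicing errors at different introns are independent and that every intron has the same error rate; both are simplifying assumptions that are not stated in the text, and the intron count used in the note that follows is hypothetical.

```python
def prob_any_splicing_error(per_intron_error: float, n_introns: int) -> float:
    """Probability that at least one intron in a transcript is mis-spliced,
    assuming independent and identical per-intron error rates."""
    if not 0.0 <= per_intron_error <= 1.0:
        raise ValueError("per_intron_error must be a probability")
    if n_introns < 0:
        raise ValueError("n_introns must be non-negative")
    return 1.0 - (1.0 - per_intron_error) ** n_introns
```

Under these assumptions, a gene with 20 introns and a 0.1% error rate per intron would yield a mis-spliced transcript roughly 2% of the time, which is of the same order as the 2% to 3% per-gene figure mentioned in the text.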

4104-409: Is not clear that they are "druggable" but progress has been made on Pax2 and the notch pathway. Gene duplications have played a crucial role in the evolution of species. This applies particularly to transcription factors. Once they occur as duplicates, mutations can accumulate in one copy without negatively affecting the regulation of downstream targets. However, changes of

4218-478: Is not understood. Scores of different types of inactivating GATA2 mutations have been associated with GATA2 deficiency; these include frameshift, point, insertion, splice site and deletion mutations scattered throughout the gene but concentrated in the region encoding the GATA2 transcription factor's C-ZnF, N-ZnF, and 9.5 kb sites. Rare cases of GATA2 deficiency involve large mutational deletions that include

4332-437: Is observed to vary widely across the spectrum of biological organisms. For example, introns are extremely common within the nuclear genome of jawed vertebrates (e.g. humans, mice, and pufferfish (fugu)), where protein-coding genes almost always contain multiple introns, while introns are rare within the nuclear genes of some eukaryotic microorganisms, for example baker's/brewer's yeast ( Saccharomyces cerevisiae ). In contrast,

4446-414: Is organized with the help of histones into compact particles called nucleosomes , where sequences of about 147 DNA base pairs make ~1.65 turns around histone protein octamers. DNA within nucleosomes is inaccessible to many transcription factors. Some transcription factors, so-called pioneer factors are still able to bind their DNA binding sites on the nucleosomal DNA. For most other transcription factors,

4560-401: Is termed the "GATA switch". In all events, the actions of GATA2 in controlling its target genes, particularly with reference to its interactions with many other gene-regulating factors, are extremely complex and not fully understood. Familial and sporadic inactivating mutations in one of the two parental GATA2 genes cause a reduction, i.e. a haploinsufficiency, in the cellular levels of

4674-628: Is that they contain at least one DNA-binding domain (DBD), which attaches to a specific sequence of DNA adjacent to the genes that they regulate. TFs are grouped into classes based on their DBDs. Other proteins such as coactivators , chromatin remodelers , histone acetyltransferases , histone deacetylases , kinases , and methylases are also essential to gene regulation, but lack DNA-binding domains, and therefore are not TFs. TFs are of interest in medicine because TF mutations can cause specific diseases, and medications can be potentially targeted toward them. Transcription factors are essential for

4788-455: Is the process by which mutations create novel introns from formerly exonic sequence. Thus, unlike other proposed mechanisms of intron gain, this mechanism does not require the insertion or generation of DNA to create a novel intron. The only hypothesized mechanism of recent intron gain lacking any direct evidence is that of group II intron insertion, which when demonstrated in vivo, abolishes gene expression. Group II introns are therefore likely

4902-611: The GATA2 gene cause a reduction in the cellular levels of GATA2 and the development of a wide range of familial hematological, immunological, lymphatic, and/or other disorders that are grouped together into a common disease termed GATA2 deficiency. Less commonly, these disorders are associated with non-familial (i.e. sporadic or acquired) GATA2-inactivating mutations. GATA2 deficiency often begins with seemingly benign abnormalities but if untreated progresses to life-threatening opportunistic infections, virus-induced cancers, lung failure,

5016-514: The TET1 protein that initiates a pathway of DNA demethylation . EGR1, together with TET1, is employed in programming the distribution of methylation sites on brain DNA during brain development and in learning (see Epigenetics in learning and memory ). Transcription factors are modular in structure and contain the following domains : The portion ( domain ) of the transcription factor that binds DNA

5130-442: The embryonic development , self-renewal , maintenance, and functionality of blood-forming , lymphatic system-forming , and other tissue-forming stem cells . GATA2 is encoded by the GATA2 gene, a gene which often suffers germline and somatic mutations which lead to a wide range of familial and sporadic diseases, respectively. The gene and its product are targets for the treatment of these diseases. Inactivating mutations of

5244-920: The estrogen receptor transcription factor: Estrogen is secreted by tissues such as the ovaries and placenta , crosses the cell membrane of the recipient cell, and is bound by the estrogen receptor in the cell's cytoplasm . The estrogen receptor then goes to the cell's nucleus and binds to its DNA-binding sites , changing the transcriptional regulation of the associated genes. Not only do transcription factors act downstream of signaling cascades related to biological stimuli but they can also be downstream of signaling cascades involved in environmental stimuli. Examples include heat shock factor (HSF), which upregulates genes necessary for survival at higher temperatures, hypoxia inducible factor (HIF), which upregulates genes necessary for cell survival in low-oxygen environments, and sterol regulatory element binding protein (SREBP), which helps maintain proper lipid levels in

5358-410: The formation of mature blood cells . Inactivation of one mouse Gata2 gene is neither lethal nor associated with most of the signs of human GATA2 deficiency; however, these animals do show a ~50% reduction in their hematopoietic stem cells along with a reduced ability to repopulate the bone marrow of mouse recipients. The latter findings, human clinical studies, and experiments on human tissues support

5472-566: The genomic level, DNA- sequencing and database research are commonly used. The protein version of the transcription factor is detectable by using specific antibodies . The sample is detected on a western blot . By using electrophoretic mobility shift assay (EMSA), the activation profile of transcription factors can be detected. A multiplex approach for activation profiling is a TF chip system where several different transcription factors can be detected in parallel. The most commonly used method for identifying transcription factor binding sites

5586-479: The mitochondrial genomes of vertebrates are entirely devoid of introns, while those of eukaryotic microorganisms may contain many introns. A particularly extreme case is the Drosophila dhc7 gene containing a ≥3.6 megabase (Mb) intron, which takes roughly three days to transcribe. On the other extreme, a 2015 study suggests that the shortest known metazoan intron length is 30 base pairs (bp) belonging to

5700-422: The myelodysplastic syndrome (i.e. MDS), and/or leukemia, principally acute myeloid leukemia (AML), less commonly chronic myelomonocytic leukemia (CMML), and rarely a lymphoid leukemia. Overexpression of the GATA2 transcription factor that is not due to mutations in the GATA2 gene appears to be a secondary factor that promotes the aggressiveness of non-familial EVI1 positive AML as well as

5814-641: The myelodysplastic syndrome, and/or leukemias, particularly AML. The various presentations of GATA2 deficiency include all cases of Monocytopenia and Mycobacterium Avium Complex/Dendritic Cell Monocyte, B and NK Lymphocyte deficiency (i.e. MonoMAC) and the Emberger syndrome as well as a significant percentage of cases of familial myelodysplastic syndrome/acute myeloid leukemia, congenital neutropenia, chronic myelomonocytic leukemia, aplastic anemia, and several other presentations. The L359V gain of function mutation (see above section on mutation) increases

5928-427: The preinitiation complex and RNA polymerase . Thus, for a single transcription factor to initiate transcription, all of these other proteins must also be present, and the transcription factor must be in a state where it can bind to them if necessary. Cofactors are proteins that modulate the effects of transcription factors. Cofactors are interchangeable between specific gene promoters; the protein complex that occupies

6042-456: The sequence similarity and hence the tertiary structure of their DNA-binding domains. The following classification is based on the 3D structure of their DBD and the way it contacts DNA. It was first developed for human TFs and later extended to rodents and also to plants. There are numerous databases cataloging information about transcription factors, but their scope and utility vary dramatically. Some may contain only information about

6156-705: The spliceosome . Some introns are known to enhance the expression of the gene that they are contained in by a process known as intron-mediated enhancement (IME). Actively transcribed regions of DNA frequently form R-loops that are vulnerable to DNA damage . In highly expressed yeast genes, introns inhibit R-loop formation and the occurrence of DNA damage. Genome-wide analysis in both yeast and humans revealed that intron-containing genes have decreased R-loop levels and decreased DNA damage compared to intronless genes of similar expression. Insertion of an intron within an R-loop prone gene can also suppress R-loop formation and recombination . Bonnet et al. (2017) speculated that

6270-532: The 359 amino acid position (i.e. within the N-ZnF site) of the transcription factor and has been detected in individuals undergoing the blast crisis of chronic myelogenous leukemia . Analyses of individuals with AML have discovered many cases of GATA2 deficiency in which one parental GATA2 gene was not mutated but silenced by hypermethylation of its gene promoter . Further studies are required to integrate this hypermethylation-induced form of GATA2 deficiency into

6384-426: The 3q21.3 locus plus contiguous adjacent genes; these mutations seem more likely than other types of GATA2 mutations to cause increased susceptibilities to viral infections, developmental lymphatic disorders, and neurological disturbances. One GATA2 mutation is a gain of function type, i.e. it is associated with an increase in the activity rather than levels of GATA2. This mutation substitutes valine for leucine in

6498-431: The DNA binding specificities of the single-copy Leafy transcription factor, which occurs in most land plants, have recently been elucidated. In that respect, a single-copy transcription factor can undergo a change of specificity through a promiscuous intermediate without losing function. Similar mechanisms have been proposed in the context of all alternative phylogenetic hypotheses, and the role of transcription factors in

6612-411: The DNA of its own gene, it down-regulates the production of more of itself. This is one mechanism to maintain low levels of a transcription factor in a cell. In eukaryotes , transcription factors (like most proteins) are transcribed in the nucleus but are then translated in the cell's cytoplasm . Many proteins that are active in the nucleus contain nuclear localization signals that direct them to
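The negative feedback loop described here, in which a transcription factor represses its own gene, can be sketched as a toy simulation. The model below is an assumption made purely for illustration: production follows a repressive Hill function, degradation is first-order, and all parameter values are arbitrary rather than measured.

```python
def simulate_tf_level(beta=10.0, gamma=1.0, K=1.0, n=2,
                      dt=0.01, steps=5000, autorepress=True):
    """Integrate dx/dt = production(x) - gamma * x with forward Euler steps.

    beta   maximal production rate (arbitrary units, hypothetical)
    gamma  first-order degradation rate (hypothetical)
    K, n   Hill constant and coefficient for self-repression (hypothetical)
    """
    x = 0.0
    for _ in range(steps):
        if autorepress:
            # The more protein is present, the less of it is produced.
            production = beta / (1.0 + (x / K) ** n)
        else:
            production = beta
        x += dt * (production - gamma * x)
    return x
```

With these illustrative parameters the self-repressing factor settles near 2 units, whereas the unregulated one climbs to about 10, consistent with the statement that negative autoregulation is one way to keep the level of a transcription factor low.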

6726-430: The DNA sequence is long enough. It is unlikely, however, that a transcription factor will bind all compatible sequences in the genome of the cell . Other constraints, such as DNA accessibility in the cell or availability of cofactors may also help dictate where a transcription factor will actually bind. Thus, given the genome sequence, it is still difficult to predict where a transcription factor will actually bind in

6840-479: The GATA transcription factor family bind to this same nucleotide sequence and in doing so may in certain instances serve to interfere with GATA2 binding or even displace the GATA2 that is already bound to these sites. For example, displacement of GATA2 bound to this sequence by the GATA1 transcription factor appears important for the normal development of some types of hematological stem cells. This displacement phenomenon

6954-484: The GATA2 deficiency syndrome. This epigenetic gene silencing also occurs in certain types of non-small-cell lung carcinoma and is suggested to have a protective effect on progression of the disease. Elevated levels of GATA2 transcription factor due to overexpression of its gene GATA2 are a common finding in AML. This overexpression is associated with a poor prognosis, appears to promote progression of the disease, and is therefore proposed to be

7068-475: The GATA2 transcription factor. In consequence, individuals commonly develop a disease termed GATA2 deficiency . GATA2 deficiency is a grouping of various clinical presentations in which GATA2 haploinsufficiency results in the development over time of hematological, immunological, lymphatic, and/or other presentations that may begin as apparently benign abnormalities but commonly progress to life-threatening opportunistic infections , virus infection-induced cancers ,

7182-399: The absence of any competing cryptic splice site sequences within the introns and those conditions are rarely met in large eukaryotic genes that may cover more than 40 kilobase pairs. Recent studies have shown that the actual error rate can be considerably higher than 10⁻⁵ and may be as high as 2% or 3% errors (error rate of 2 or 3 × 10⁻²) per gene. Additional studies suggest that the error rate

7296-442: The activity of the GATA2 transcription factor. The mutation occurs during the blast crisis of chronic myelogenous leukemia and is proposed to play a role in the transformation of the chronic and/or accelerated phases of this disease to its blast crisis phase. The repression of GATA2 expression due to methylation of promoter sites in the GATA2 gene rather than a mutation in this gene has been suggested to be an alternate cause for

7410-753: The actual proteins, some about their binding sites, or about their target genes. Examples include the following:

Intron

Introns are found in the genes of most eukaryotes and many eukaryotic viruses and they can be located in both protein-coding genes and genes that function as RNA (noncoding genes). There are four main types of introns: tRNA introns, group I introns, group II introns, and spliceosomal introns (see below). Introns are rare in Bacteria and Archaea (prokaryotes). Introns were first discovered in protein-coding genes of adenovirus, and were subsequently identified in genes encoding transfer RNA and ribosomal RNA. Introns are now known to occur within

7524-467: The adjacent gene is either up- or down-regulated . Transcription factors use a variety of mechanisms for the regulation of gene expression. These mechanisms include: Transcription factors are one of the groups of proteins that read and interpret the genetic "blueprint" in the DNA. They bind to the DNA and help initiate a program of increased or decreased gene transcription. As such, they are vital for many important cellular processes. Below are some of

7638-551: The androgen pathway in androgen-independent (i.e. castration-resistant) disease). This article incorporates text from the United States National Library of Medicine , which is in the public domain . Transcription factor In molecular biology , a transcription factor ( TF ) (or sequence-specific DNA-binding factor ) is a protein that controls the rate of transcription of genetic information from DNA to messenger RNA , by binding to

7752-413: The bacterial endosymbiont invaded the host genome. In the beginning these self-splicing introns excised themselves from the mRNA precursor but over time some of them lost that ability and their excision had to be aided in trans by other group II introns. Eventually a number of specific trans-acting introns evolved and these became the precursors to the snRNAs of the spliceosome. The efficiency of splicing

7866-411: The binding of 5mC-binding proteins including MECP2 and MBD ( Methyl-CpG-binding domain ) proteins, facilitating nucleosome remodeling and the binding of transcription factors, thereby activating transcription of those genes. EGR1 is an important transcription factor in memory formation. It has an essential role in brain neuron epigenetic reprogramming. The transcription factor EGR1 recruits

7980-456: The cell, such as a hormone . There are approximately 1600 TFs in the human genome . Transcription factors are members of the proteome as well as regulome . TFs work alone or with other proteins in a complex, by promoting (as an activator ), or blocking (as a repressor ) the recruitment of RNA polymerase (the enzyme that performs the transcription of genetic information from DNA to RNA) to specific genes. A defining feature of TFs

8094-450: The cell. Many transcription factors, especially some that are proto-oncogenes or tumor suppressors , help regulate the cell cycle and as such determine how large a cell will get and when it can divide into two daughter cells. One example is the Myc oncogene, which has important roles in cell growth and apoptosis . Transcription factors can also be used to alter gene expression in

8208-432: The claim of function must be accompanied by convincing evidence that multiple functional products are produced from the same gene. While introns do not encode protein products, they are integral to gene expression regulation. Some introns themselves encode functional RNAs through further processing after splicing to generate noncoding RNA molecules. Alternative splicing is widely used to generate multiple proteins from

8322-408: The combinatorial use of a subset of the approximately 2000 human transcription factors easily accounts for the unique regulation of each gene in the human genome during development . Transcription factors bind to either enhancer or promoter regions of DNA adjacent to the genes that they regulate based on recognizing specific DNA motifs. Depending on the transcription factor, the transcription of

8436-409: The conclusion that in humans both parental GATA2 genes are required for sufficient numbers of hematopoietic stem cells to emerge from the hemogenic endothelium during embryogenesis and for these cells and subsequent progenitor cells to survive, self-renew , and differentiate into mature cells. As GATA2 deficient individuals age, their deficiency in hematopoietic stem cells worsens, probably as

8550-412: The development of its valves. The human gene is also expressed in endothelium , some non-hematological stem cells, the central nervous system , and, to lesser extents, prostate, endometrium, and certain cancerous tissues. The Gata2 gene in mice has a structure similar to its human counterpart. Deletion of both parental Gata2 genes in mice is lethal by day 10 of embryogenesis due to a total failure in

8664-510: The diagnostic category of GATA2 deficiency. Non-mutational stimulation of GATA2 expression and consequential aggressiveness in EVI1-positive AML appears due to the ability of EVI1 , a transcription factor, to directly stimulate the expression of the GATA2 gene. The reason for the overexpression of GATA2 that begins in the early stages of prostate cancer is unclear but may involve the ability of FOXA1 to act indirectly to stimulate

8778-437: The duplication of this sequence on each side of the transposon. Such an insertion could intronize the transposon without disrupting the coding sequence when a transposon inserts into the sequence AGGT or encodes the splice sites within the transposon sequence. Where intron-generating transposons do not create target site duplications, elements include both splice sites GT (5') and AG (3') thereby splicing precisely without affecting

8892-579: The emergence of eukaryotes, or the initial stages of eukaryotic evolution, involved an intron invasion. Two definitive mechanisms of intron loss, reverse transcriptase-mediated intron loss (RTMIL) and genomic deletions, have been identified, and are known to occur. The definitive mechanisms of intron gain, however, remain elusive and controversial. At least seven mechanisms of intron gain have been reported thus far: intron transposition, transposon insertion, tandem genomic duplication, intron transfer, intron gain during double-strand break repair (DSBR), insertion of

9006-653: The evolution of all species. Transcription factors have a role in resistance activity, which is important for successful biocontrol activity. In Papiliotrema terrestris LS28, the transcription factors Yap1 and Rim101 contribute to resistance to oxidative stress and to alkaline pH sensing; using them as molecular tools has clarified the genetic mechanisms underlying biocontrol activity, which supports disease management programs based on biological and integrated control. There are different technologies available to analyze transcription factors. On

9120-474: The evolutionary process (termed the introns-late hypothesis). Another theory is that the spliceosome and the intron-exon structure of genes is a relic of the RNA world (the introns-first hypothesis). There is still considerable debate about which of these hypotheses is most correct, but the popular consensus at the moment is that following the formation of the first eukaryotic cell, group II introns from

9234-554: The experiments resulting in the discovery, Susan Berget and Louise Chow . The term intron was introduced by American biochemist Walter Gilbert : "The notion of the cistron [i.e., gene] ... must be replaced by that of a transcription unit containing regions which will be lost from the mature messenger – which I suggest we call introns (for intragenic regions) – alternating with regions which will be expressed – exons." (Gilbert 1978) The term intron also refers to intracistron , i.e., an additional piece of DNA that arises within

9348-951: The expression of the GATA2 gene. The full length GATA2 transcription factor is a moderately sized protein consisting of 480 amino acids. Of its two zinc fingers, C-ZnF (located toward the protein's C-terminus ) is responsible for binding to specific DNA sites while its N-ZnF (located toward the protein's N-terminus ) is responsible for interacting with various other nuclear proteins that regulate its activity. The transcription factor also contains two transactivation domains and one negative regulatory domain which interact with other nuclear proteins to up-regulate and down-regulate, respectively, its activity. In promoting embryonic and/or adult-type haematopoiesis (i.e. maturation of hematological and immunological cells), GATA2 interacts with other transcription factors (viz., RUNX1 , SCL/TAL1 , GFI1 , GFI1b , MYB , IKZF1 , Transcription factor PU.1 , LYL1 ) and cellular receptors (viz., MPL , GPR56 ). In

9462-404: The fact that splicing of RNA molecules containing group II introns generates branched introns (like those of spliceosomal RNAs), while group I introns use a non-encoded guanosine nucleotide (typically GTP) to initiate splicing, adding it on to the 5'-end of the excised intron. The spliceosome is a very complex structure containing up to one hundred proteins and five different RNAs. The substrate of

9576-565: The function of introns in maintaining genetic stability may explain their evolutionary maintenance at certain locations, particularly in highly expressed genes. The physical presence of introns promotes cellular resistance to starvation via intron enhanced repression of ribosomal protein genes of nutrient-sensing pathways. Introns may be lost or gained over evolutionary time, as shown by many comparative studies of orthologous genes. Subsequent analyses have identified thousands of examples of intron loss and gain events, and it has been proposed that

9690-412: The gene code for two Zinc finger structural motifs of the GATA2 transcription factor. These sites are critical for regulating the ability of the transcription factor to stimulate its target genes. The GATA2 gene has at least five separate sites which bind nuclear factors that regulate its expression. One particularly important such site is located in intron 4. This site, termed the 9.5 kb enhancer,

9804-535: The gene that they regulate. Other transcription factors differentially regulate the expression of various genes by binding to enhancer regions of DNA adjacent to regulated genes. These transcription factors are critical to making sure that genes are expressed in the right cell at the right time and in the right amount, depending on the changing requirements of the organism. Many transcription factors in multicellular organisms are involved in development. Responding to stimuli, these transcription factors turn on/off

9918-432: The gene's expression regulate the self-renewal , survival, and progression of these immature cells toward their final mature forms viz., erythrocytes , certain types of lymphocytes (i.e. B cells , NK cells , and T helper cells ), monocytes , neutrophils , platelets , plasmacytoid dendritic cells , macrophages and mast cells. The gene is likewise critical for the formation of the lymphatic system , particularly for

10032-469: The gene. The DNA binding sites of 519 transcription factors were evaluated. Of these, 169 transcription factors (33%) did not have CpG dinucleotides in their binding sites, and 33 transcription factors (6%) could bind to a CpG-containing motif but did not display a preference for a binding site with either a methylated or unmethylated CpG. There were 117 transcription factors (23%) that were inhibited from binding to their binding sequence if it contained
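The percentages quoted for the 519 evaluated transcription factors can be checked directly from the counts. The sketch below is only an arithmetic check; the category labels and the helper name `share` are illustrative and do not come from the study itself.

```python
# Counts reported in the text for the 519 transcription factors evaluated.
TOTAL = 519
counts = {
    "no CpG dinucleotide in binding site": 169,
    "binds CpG motif, no methylation preference": 33,
    "binding inhibited by methylated CpG": 117,
}

def share(count, total=TOTAL):
    """Return count as a percentage of total, rounded to a whole percent."""
    return round(100 * count / total)

for label, count in counts.items():
    print(f"{label}: {count}/{TOTAL} = {share(count)}%")
```

Rounded to whole percentages, the three counts give the 33%, 6%, and 23% stated in the text.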

10146-573: The human MST1L gene. The shortest known introns belong to the heterotrich ciliates, such as Stentor coeruleus , in which most (> 95%) introns are 15 or 16 bp long. Splicing of all intron-containing RNA molecules is superficially similar, as described above. However, different types of introns were identified through the examination of intron structure by DNA sequence analysis, together with genetic and biochemical analysis of RNA splicing reactions. At least four distinct classes of introns have been identified: Group III introns are proposed to be

10260-402: The human genome contains an average of 8.4 introns/gene (139,418 in the genome), the unicellular fungus Encephalitozoon cuniculi contains only 0.0075 introns/gene (15 introns in the genome). Since eukaryotes arose from a common ancestor ( common descent ), there must have been extensive gain or loss of introns during evolutionary time. This process is thought to be subject to selection, with
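Since intron density is the number of introns divided by the number of genes, the two figures given for each organism imply a gene count. The sketch below works this out; the resulting gene counts are derived from the quoted numbers rather than stated in the text, and the function name is illustrative.

```python
def implied_gene_count(total_introns, introns_per_gene):
    """Gene count implied by a total intron count and a mean intron density."""
    return round(total_introns / introns_per_gene)

# Figures quoted in the text.
human_genes = implied_gene_count(139_418, 8.4)
e_cuniculi_genes = implied_gene_count(15, 0.0075)

print(f"Human: about {human_genes} intron-annotated genes")
print(f"E. cuniculi: about {e_cuniculi_genes} genes")
print(f"Density ratio: {8.4 / 0.0075:.0f}-fold")
```

The density ratio shows the scale of the difference: on these figures the human genome carries roughly a thousand times more introns per gene than Encephalitozoon cuniculi.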

10374-422: The idea that tandem genomic duplication is a prevalent mechanism for intron gain. The testing of other proposed mechanisms in vivo, particularly intron gain during DSBR, intron transfer, and intronization, is possible, although these mechanisms must be demonstrated in vivo to solidify them as actual mechanisms of intron gain. Further genomic analyses, especially when executed at the population level, may then quantify

10488-594: The important functions and biological roles transcription factors are involved in: In eukaryotes , an important class of transcription factors called general transcription factors (GTFs) are necessary for transcription to occur. Many of these GTFs do not actually bind DNA, but rather are part of the large transcription preinitiation complex that interacts with RNA polymerase directly. The most common GTFs are TFIIA , TFIIB , TFIID (see also TATA binding protein ), TFIIE , TFIIF , and TFIIH . The preinitiation complex binds to promoter regions of DNA upstream to

10602-406: The intron and link the exons together in the correct order. In some cases, particular intron-binding proteins are involved in splicing, acting in such a way that they assist the intron in folding into the three-dimensional structure that is necessary for self-splicing activity. Group I and group II introns are distinguished by different sets of internal conserved sequences and folded structures, and by

10716-436: The large number of possible such mutations makes it inevitable that some will reach fixation in a population. This is particularly relevant in species, such as humans, with relatively small long-term effective population sizes. It is plausible, then, that the human genome carries a substantial load of suboptimal sequences which cause the generation of aberrant transcript isoforms. In this study, we present direct evidence that this

10830-526: The most commonly purported intron gain mechanism, a spliced intron is thought to reverse splice into either its own mRNA or another mRNA at a previously intron-less position. This intron-containing mRNA is then reverse transcribed and the resulting intron-containing cDNA may then cause intron gain via complete or partial recombination with its original genomic locus. Transposon insertions have been shown to generate thousands of new introns across diverse eukaryotic species. Transposon insertions sometimes result in

10944-432: The nature of these chemical interactions, most transcription factors bind DNA in a sequence specific manner. However, not all bases in the transcription factor-binding site may actually interact with the transcription factor. In addition, some of these interactions may be weaker than others. Thus, transcription factors do not bind just one sequence but are capable of binding a subset of closely related sequences, each with
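The idea that a transcription factor tolerates a set of closely related sequences can be illustrated by counting mismatches against a consensus site. The sketch below uses the TATA-binding protein consensus TATAAAA and the two variant sites mentioned elsewhere in this article. Treating the mismatch count as a stand-in for binding strength is a simplifying assumption; real affinities depend on which positions differ and are usually modeled with position weight matrices.

```python
def mismatches(site, consensus):
    """Count positions at which a candidate site differs from the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a != b for a, b in zip(site, consensus))

CONSENSUS = "TATAAAA"  # consensus binding site for TBP
for candidate in ("TATAAAA", "TATATAT", "TATATAA"):
    print(candidate, mismatches(candidate, CONSENSUS))
```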

11058-530: The nucleosome should be actively unwound by molecular motors such as chromatin remodelers . Alternatively, the nucleosome can be partially unwrapped by thermal fluctuations, allowing temporary access to the transcription factor binding site. In many cases, a transcription factor needs to compete for binding to its DNA binding site with other transcription factors and histones or non-histone chromatin proteins. Pairs of transcription factors and other proteins can play antagonistic roles (activator versus repressor) in

11172-415: The nucleus. But, for many transcription factors, this is a key point in their regulation. Important classes of transcription factors such as some nuclear receptors must first bind a ligand while in the cytoplasm before they can relocate to the nucleus. Transcription factors may be activated (or deactivated) through their signal-sensing domain by a number of mechanisms including: In eukaryotes, DNA

11286-473: The number of conserved introns flanked by repeats in other organisms, though, for statistical relevance. For group II intron insertion, the retrohoming of a group II intron into a nuclear gene was proposed to cause recent spliceosomal intron gain. Intron transfer has been hypothesized to result in intron gain when a paralog or pseudogene gains an intron and then transfers this intron via recombination to an intron-absent location in its sister paralog. Intronization

11400-458: The presumed ancestors of spliceosomal introns, acting as site-specific retroelements, and are no longer responsible for intron gain. Tandem genomic duplication is the only proposed mechanism with supporting in vivo experimental evidence: a short intragenic tandem duplication can insert a novel intron into a protein-coding gene, leaving the corresponding peptide sequence unchanged. This mechanism also has extensive indirect evidence lending support to

11514-439: The progression of prostate cancer . The GATA2 gene is a member of the evolutionarily conserved GATA transcription factor gene family. All vertebrate species tested so far, including humans and mice, express 6 GATA genes, GATA1 through GATA6 . The human GATA2 gene is located on the long (or "q") arm of chromosome 3 at position 21.3 (i.e. the 3q21.3 locus) and consists of 8 exons . Two sites, termed C-ZnF and N-ZnF, of

11628-503: The promoter DNA and the amino acid sequence of the cofactor determine its spatial conformation. For example, certain steroid receptors can exchange cofactors with NF-κB , which is a switch between inflammation and cellular differentiation; thereby steroids can affect the inflammatory response and function of certain tissues. Transcription factors and methylated cytosines in DNA both have major roles in regulating gene expression. (Methylation of cytosine in DNA primarily occurs where cytosine

11742-432: The protein-coding sequence. It is not yet understood why these elements are spliced, whether by chance, or by some preferential action by the transposon. In tandem genomic duplication, due to the similarity between consensus donor and acceptor splice sites, which both closely resemble AGGT, the tandem genomic duplication of an exonic segment harboring an AGGT sequence generates two potential splice sites. When recognized by

11856-408: The reaction is a long RNA molecule and the transesterification reactions catalyzed by the spliceosome require the bringing together of sites that may be thousands of nucleotides apart. All biochemical reactions are associated with known error rates and the more complicated the reaction the higher the error rate. Therefore, it is not surprising that the splicing reaction catalyzed by the spliceosome has

11970-515: The regulation of gene expression and are, as a consequence, found in all living organisms. The number of transcription factors found within an organism increases with genome size, and larger genomes tend to have more transcription factors per gene. There are approximately 2800 proteins in the human genome that contain DNA-binding domains, and 1600 of these are presumed to function as transcription factors, though other studies indicate it to be

12084-425: The regulation of the same gene . Most transcription factors do not work alone. Many large TF families form complex homotypic or heterotypic interactions through dimerization. For gene transcription to occur, a number of transcription factors must bind to DNA regulatory sequences. This collection of transcription factors, in turn, recruits intermediary proteins such as cofactors that allow efficient recruitment of

12198-413: The splice sites leading to a skipped intron or a skipped exon. Almost all multi-exon genes will produce incorrectly spliced transcripts but the frequency of this background noise will depend on the size of the genes, the number of introns, and the quality of the splice site sequences. In some cases, splice variants will be produced by mutations in the gene (DNA). These can be single-nucleotide polymorphisms (SNPs) that create

12312-474: The spliceosome, the sequence between the original and duplicated AGGT will be spliced, resulting in the creation of an intron without alteration of the coding sequence of the gene. Double-stranded break repair via non-homologous end joining was recently identified as a source of intron gain when researchers identified short direct repeats flanking 43% of gained introns in Daphnia. These numbers must be compared to
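The tandem duplication mechanism can be illustrated with a toy model: duplicating an exonic segment that contains AGGT yields two AGGT copies, and removing everything from the GT of the first copy through the AG of the second restores the original coding sequence. The sequence, coordinates, and function names below are made up for illustration, and the model ignores the branch point and other signals that real splicing requires.

```python
def tandem_duplicate(seq, start, end):
    """Insert a second copy of seq[start:end] immediately after the first."""
    return seq[:end] + seq[start:end] + seq[end:]

def splice_between_aggt(seq):
    """Excise the span from the GT of the first AGGT through the AG of the second.

    Returns (spliced_sequence, intron). If fewer than two AGGT motifs are
    present, the sequence is returned unchanged with an empty intron.
    """
    first = seq.find("AGGT")
    second = seq.find("AGGT", first + 1) if first != -1 else -1
    if second == -1:
        return seq, ""
    intron = seq[first + 2 : second + 2]  # begins with GT, ends with AG
    return seq[: first + 2] + seq[second + 2 :], intron

exon = "ATGCCAAGGTCTGAATTC"                 # hypothetical exon with one AGGT
duplicated = tandem_duplicate(exon, 4, 14)  # duplicate the segment CAAGGTCTGA
spliced, intron = splice_between_aggt(duplicated)

print(duplicated)
print(intron)
print(spliced == exon)
```

In this toy case the excised span has the same length as the duplicated segment and carries the GT (5') and AG (3') dinucleotides at its ends, which is why the coding sequence is left unaltered.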

12426-526: The splicing reactions are initiated. In addition, they contain a branch point, a particular nucleotide sequence near the 3' end of the intron that becomes covalently linked to the 5' end of the intron during the splicing process, generating a branched ( lariat ) intron. Apart from these three short conserved elements, nuclear pre-mRNA intron sequences are highly variable. Nuclear pre-mRNA introns are often much longer than their surrounding exons. Transfer RNA introns that depend upon proteins for removal occur at

12540-460: The term does not distinguish between real, biologically relevant, alternative splicing and processing noise due to splicing errors. One of the central issues in the field of alternative splicing is working out the differences between these two possibilities. Many scientists have argued that the null hypothesis should be splicing noise, putting the burden of proof on those who claim biologically relevant alternative splicing. According to those scientists,

12654-457: The transcription factor acts to promote its own production; in a second example of a positive feedback circuit, GATA2 stimulates production of Interleukin 1 beta and CXCL2 which act indirectly to stimulate GATA2 expression. In an example of a negative feedback circuit, the GATA2 transcription factor indirectly causes activation of the G protein coupled receptor , GPR65 , which then acts, also indirectly, to repress GATA2 gene expression. In

12768-447: The transcription of the appropriate genes, which, in turn, allows for changes in cell morphology or activities needed for cell fate determination and cellular differentiation . The Hox transcription factor family, for example, is important for proper body pattern formation in organisms as diverse as fruit flies and humans. Another example is the transcription factor encoded by the sex-determining region Y (SRY) gene, which plays

12882-495: The treatment of breast and prostate cancer , respectively, and various types of anti-inflammatory and anabolic steroids . In addition, transcription factors are often indirectly modulated by drugs through signaling cascades . It might be possible to directly target other less-explored transcription factors such as NF-κB with drugs. Transcription factors outside the nuclear receptor family are thought to be more difficult to target with small molecule therapeutics since it

12996-448: Was improved by association with stabilizing proteins to form the primitive spliceosome. Early studies of genomic DNA sequences from a wide range of organisms show that the intron-exon structure of homologous genes in different organisms can vary widely. More recent studies of entire eukaryotic genomes have now shown that the lengths and density (introns/gene) of introns vary considerably between related species. For example, while
