Mechanisms of Post-transcriptional Gene Regulation in Mammalian Cells
Thomas Tuschl studies the role of RNA-binding proteins (RBPs) and noncoding RNAs (ncRNAs) in post-transcriptional gene regulation in human cells. He develops genome-wide approaches for characterizing mRNA-RBP and mRNA-ncRNA interaction networks and their alterations in pathogenic states. These studies may lead to new prognostic, diagnostic, and therapeutic approaches.
Post-transcriptional regulation of mRNA stability and translation by noncoding RNAs (ncRNAs) and RNA-binding proteins (RBPs) is a vital cellular process. Disruption of mRNA-RBP and mRNA-ncRNA interactions by genetic mutation, deletion, or dysregulation of the interacting factors results in human disease. For example, loss of FMR1 gene expression causes fragile X syndrome, and dominant monoallelic mutations in the genes encoding FUS and TARDBP/TDP-43 trigger amyotrophic lateral sclerosis. Identifying the RNA targets of ncRNAs and RBPs that are relevant to the phenotypic changes observed in disease is difficult because of the complexity of the underlying interaction networks. The recent development of RNA deep-sequencing technology has enabled transcriptome-wide approaches for identifying the RNA-targeting networks of RBPs and ncRNAs.
Our laboratory uses a cell-based approach to identify the RNA target sites of RBPs, known as photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP). Cells are first cultured in medium supplemented with 4-thiouridine (4SU), and the live cells are then irradiated with 365-nm UV light to crosslink 4SU-modified RNA to the RBPs bound to it. Crosslinked RBP-RNA segments are subsequently isolated by immunoprecipitation from cell lysates, and the crosslinked RNAs are subjected to small RNA cDNA library preparation and deep sequencing. The resulting sequence reads are mapped to the genome and transcriptome, and clusters of sequence reads with T-to-C sequence transitions mark 4SU-crosslinked RNA-binding sites. We have applied this method to determine the RNA targets and underlying RNA recognition elements for several families of RBPs, including AGO/EIF2C, IGF2BP, QKI, PUM, FUS, ELAVL, and FMR1. PAR-CLIP typically yields thousands to tens of thousands of RNA-binding sites, consistent with complex RNA interaction networks. These natural RNA-binding sites are also excellent ligands for further biochemical studies and for collaborative structural studies conducted by Dinshaw Patel’s laboratory at Memorial Sloan-Kettering Cancer Center.
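The T-to-C signal described above can be illustrated with a minimal Python sketch. It is not the laboratory's analysis pipeline: it assumes reads that are already aligned without gaps, given in the sense orientation of the transcript and written in the DNA alphabet of the cDNA. The function name `t_to_c_sites`, the `(start, sequence)` read format, and the `min_conversions` threshold are all hypothetical choices made for this illustration.

```python
from collections import Counter


def t_to_c_sites(reference, aligned_reads, min_conversions=2):
    """Tally T-to-C mismatches at reference T positions.

    reference       -- reference sequence as a string (DNA alphabet)
    aligned_reads   -- iterable of (start, read_sequence) pairs; each read is
                       assumed to align without gaps from `start` (0-based)
    min_conversions -- hypothetical cutoff: minimum number of reads showing a
                       C at a reference T for the position to be reported

    Returns {position: (conversion_count, coverage)}.
    """
    coverage = Counter()
    conversions = Counter()
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            # Only reference T positions can carry a 4SU-derived transition.
            if pos >= len(reference) or reference[pos] != "T":
                continue
            coverage[pos] += 1
            if base == "C":
                conversions[pos] += 1
    return {
        pos: (conversions[pos], coverage[pos])
        for pos in sorted(conversions)
        if conversions[pos] >= min_conversions
    }


# Toy input, invented for illustration only.
reference = "GATTACAGTTGA"
reads = [(0, "GATCACA"), (1, "ATCACAG"), (2, "TTACAGT")]
print(t_to_c_sites(reference, reads))  # {3: (2, 3)}
```

In practice, positions with many conversions that fall within a cluster of overlapping reads would be the candidates for crosslinked binding sites; the single threshold used here stands in for whatever statistical criteria a real analysis applies.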
To assess the regulatory function of target RNA binding and its physiologic consequences, we overexpress the selected RBP or knock it down with small interfering RNA (siRNA) and monitor global changes in RNA and protein abundance, using the same cell systems established for defining the RNA targets. Correlating the regulated transcripts with the PAR-CLIP–identified targets reveals critical RNA sequence and context features and allows the RNA targets to be ranked for follow-up studies. Some of the complex bioinformatic analysis is carried out in collaboration with Uwe Ohler’s laboratory at Duke University. Specific phenotypic studies of the regulatory functions of RBPs may be conducted in animal models.
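As a rough illustration of this correlation step, the Python sketch below compares expression changes between transcripts with and without PAR-CLIP binding sites and ranks the targets by the size of their change. It is a simplified stand-in, not the analysis used by the laboratory or its collaborators: the function name `summarize_regulation`, the input dictionaries, and the ranking criterion (absolute log2 fold change) are assumptions, and the gene labels and numbers are invented placeholders.

```python
from statistics import median


def summarize_regulation(log2_fold_change, clip_site_counts):
    """Compare expression changes of PAR-CLIP targets and non-targets.

    log2_fold_change -- {gene: log2 fold change after RBP knockdown or
                        overexpression}
    clip_site_counts -- {gene: number of PAR-CLIP binding sites}; genes that
                        are absent are treated as having no sites
    """
    targets = {
        gene: change
        for gene, change in log2_fold_change.items()
        if clip_site_counts.get(gene, 0) > 0
    }
    non_targets = [
        change for gene, change in log2_fold_change.items() if gene not in targets
    ]
    # Hypothetical ranking criterion: largest absolute change first.
    ranked = sorted(targets, key=lambda gene: abs(targets[gene]), reverse=True)
    return {
        "median_target_change": median(targets.values()) if targets else None,
        "median_non_target_change": median(non_targets) if non_targets else None,
        "ranked_targets": ranked,
    }


# Placeholder values, invented purely to show the call.
changes = {"geneA": 1.2, "geneB": 0.8, "geneC": 0.1, "geneD": -0.1, "geneE": 0.0}
sites = {"geneA": 5, "geneB": 2}
print(summarize_regulation(changes, sites))
```

A shift in the median change of targets relative to non-targets would suggest that binding has a regulatory effect; a real analysis would add statistical testing and take binding-site number and position into account.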
The human genome encodes nearly 1,800 RBPs, several hundred of which represent mRNA-binding proteins contributing to pre-mRNA processing, transport, stability, and translation, as well as rRNA- and tRNA-binding proteins. One of the long-term aims of my laboratory is to determine a complete RNA-RBP interactome and to develop cell-based and molecular assays to elucidate the specific regulatory and basic molecular functions of mRNA-binding proteins.
Characterization of Noncoding RNAs
Our interest in post-transcriptional gene regulation was triggered by the discovery of short double-stranded RNAs (dsRNAs) as sequence-specific repressors of gene expression in processes known as RNA silencing and RNA interference. We defined the size and structure of the dsRNA-processing intermediates, the siRNAs, and studied their assembly into target RNA-binding and -cleaving ribonucleoprotein complexes. In this process, we also discovered hundreds of endogenous small RNAs implicated in gene regulation.
The most abundant class of small ncRNAs has been termed microRNAs (miRNAs); these are encoded in the genome in the form of short, slightly imperfect inverted repeats approximately 25 base pairs in length. Mature miRNAs are predominantly 21 or 22 nucleotides (nt) long, evolutionarily conserved, and present in every cell type. Humans express several hundred miRNA genes. A small subset of these miRNA genes is expressed with cell-type and developmental-stage specificity and contributes to the establishment or maintenance of cell lineages.
Because miRNAs are important negative regulators of gene expression, we continue to characterize their expression in normal and disease states in tissues and cells, and we are studying them in extracellular body fluids such as serum and plasma to evaluate their utility as biomarkers and in diagnostics. We have also developed a method for quantifying miRNAs in formalin-fixed, paraffin-embedded archival tissue sections using multicolor fluorescence RNA in situ hybridization (RNA FISH). The method enhances RNA fixation and probe specificity, controls for RNA quality by simultaneously monitoring unrelated abundant ncRNAs, and does not use the signal amplification otherwise required for miRNA visualization.
The second class of small ncRNAs, together with their interacting Piwi proteins, is specifically expressed in adult male germ cells and is required for germ cell maintenance and sperm formation. These ncRNAs, termed piRNAs, are 26 to 32 nt long, begin with a 5′ uridine, and are 2′-O-methylated at their 3′ ends. In contrast to miRNAs, which originate from dsRNA precursors, piRNAs are processed from long, single-stranded, nonconserved primary transcripts. The human genome encodes about 150 piRNA-producing genes, yielding hundreds of thousands of sequence-distinct piRNAs. Piwi proteins and piRNA biogenesis factors are essential for germ cell maintenance and have been implicated in protection against transposable elements. We are curating piRNA-producing genes and their borders in an attempt to gain insights into the signals necessary for their biogenesis. More recently, we initiated a search for piRNAs in the ovary, where germline cell expansion takes place during embryonic development rather than at the onset of puberty, as it does in the male reproductive organ.
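The sequence features given for the two small RNA classes (21 or 22 nt for most mature miRNAs; 26 to 32 nt with a 5′ uridine for piRNAs) can be turned into a crude first-pass filter for sequencing reads. The Python sketch below is only a heuristic under stated assumptions: the function name `classify_small_rna` and its labels are hypothetical, the 3′-end 2′-O-methylation cannot be seen in sequence alone, and real annotation relies on mapping reads to known genes rather than on length.

```python
def classify_small_rna(read):
    """Label a read by the length and 5' nucleotide features described above.

    Accepts RNA or cDNA letters (T is treated as U). Returns one of
    "miRNA-like", "piRNA-like", or "unclassified". Length alone does not
    establish identity; the labels are only a first-pass guess.
    """
    seq = read.strip().upper().replace("T", "U")
    if len(seq) in (21, 22):
        return "miRNA-like"
    if 26 <= len(seq) <= 32 and seq.startswith("U"):
        return "piRNA-like"
    return "unclassified"


# Synthetic sequences, invented for illustration only.
print(classify_small_rna("A" * 22))        # miRNA-like
print(classify_small_rna("T" + "A" * 27))  # piRNA-like
print(classify_small_rna("A" * 28))        # unclassified
```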
Finally, many longer ncRNAs (50–200 nt), such as tRNAs, snRNAs, and snoRNAs, have not been fully characterized in humans. We have started to develop RNA-sequencing approaches for measuring the absolute abundance of highly expressed ncRNAs and its variation in normal and disease states. To capture such variation in tissue archives at cellular resolution, we have developed multicolor RNA FISH protocols that use directly labeled probes for the analysis of archival tissue sections. This approach can be extended to capture specific mRNAs and provides a powerful alternative to current immunohistochemistry approaches.
Grants from the National Institutes of Health, Starr Foundation, and Simons Foundation provided partial support for these projects.