Transcriptional profiling by RNA sequencing (RNAseq) enables the sensitive and accurate characterization of the transcriptome1–4. Following enumeration of reads, various methods are available to normalize counts, estimate probability distributions and identify differential gene expression between biological conditions3,5,6. RNAseq has the added advantages of being extremely high-throughput and relatively inexpensive, with a high signal-to-noise ratio and a dynamic range encompassing 4–5 orders of magnitude. This combination of throughput and sensitivity enables the detection of rare transcripts from nanograms of RNA.
We have described a method to produce large quantities of highly enriched, electrically active glutamatergic neurons (ESNs) from suspension-adapted mouse embryonic stem cells (ESCs)7,8. This technique is a modification of the 4/4 method, and is based on the spatiotemporal changes in morphogens that occur during neural induction and patterning9–11. In this method, neuroepithelial stem cells (NESCs) are derived from ESCs by the withdrawal of leukemia inhibitory factor (LIF), then induced to undergo neurogenesis and neural patterning by supplementation with all-trans retinoic acid (RA) in the presence of fetal bovine serum12. We have modified the 4/4 method to include feeder cell-free suspension culture of ESCs; differentiation under rotary conditions to normalize the intra-aggregate environment; and neural induction and patterning using 6 µM RA. These refinements have resulted in a facile and economical method to generate large quantities of highly enriched glutamatergic neurons13. Using immunocytochemistry, we have shown that ESNs are composed mostly of glutamatergic neurons (~95%), with about 5% GABAergic neurons, and no evidence of dopaminergic, serotonergic, cholinergic or glycinergic subtypes7. Expression profiling by RNAseq has confirmed that the derived cultures are primarily glutamatergic, and has identified abundant expression of a wide range of cortical markers, including reelin, Pax6, Otx1, Ctip2 and Cux1/213–19.
Although aspects of corticogenesis have been replicated in vitro by the directed differentiation of pluripotent stem cells, spatiotemporal changes in gene expression responsible for regionalization and apical-to-basal patterning of the cerebral cortex are not well understood20–22. The embryological origin of the cerebral cortex is the telencephalon, which is the most anterior structure of the mammalian neural tube23. Cortical glutamatergic neurons are generated by proliferation and laminar patterning of the dorsal telencephalon, whereas inhibitory GABAergic neurons derive from the ventral telencephalon and migrate tangentially into cortical layers24,25. Our ability to derive predominantly glutamatergic neurons that express markers of the telencephalon and all six cortical layers is consistent with an in vitro model of the developing telencephalon and cortical layer formation23. We propose to apply this model to identify the temporal changes in gene and isoform expression associated with neural patterning and corticogenesis. If successful, we intend to use ESNs to functionally interrogate the roles of individual genes in executing the transcriptional programs underlying the formation of the cerebral cortex.
To evaluate the transcriptional changes associated with differentiation of cortical glutamatergic neurons, we conducted a longitudinal expression profile of the deep transcriptome during neurogenesis from days in vitro (DIV) -8 to 28, where DIV 0 corresponds to the end of differentiation. High-quality RNA was isolated from ESCs (DIV -8; n=4); neuroepithelial stem cells (DIV -4; n=3); radial glia (DIV 0; n=3); developmental stage (DS) I/II neurons (DIV 1; n=4); DS III/IV neurons (DIV 7; n=5); and maturing DS IV/V neurons at DIV 16 (n=4), DIV 21 (n=4) and DIV 28 (n=4). The summary data, including quality scores, are presented in Data File 1, and the raw transcript read counts for each biological replicate are presented in Data File 2. The FASTQ files generated for each biological replicate are available at the Sequence Read Archive (SRA), a freely accessible database provided by NCBI (http://www.ncbi.nlm.nih.gov/sra/), under the accession number PRJNA185305.
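For readers who wish to work with the replicate-level data, the sampling design described above can be written down as a small table. The following is a minimal sketch only: the time points, stage names and replicate numbers are taken from the text, whereas the variable names, the function name and the sample-label format are hypothetical and are not the identifiers used in Data File 1, Data File 2 or the SRA records.

```python
# Illustrative encoding of the sampling design described in the text.
# DIV values, stages and replicate counts come from the paragraph above;
# the names DESIGN and sample_labels, and the label format, are assumptions.
DESIGN = [
    # (DIV, stage, number of biological replicates)
    (-8, "ESC", 4),
    (-4, "NESC", 3),
    (0, "radial glia", 3),
    (1, "DS I/II neurons", 4),
    (7, "DS III/IV neurons", 5),
    (16, "DS IV/V neurons", 4),
    (21, "DS IV/V neurons", 4),
    (28, "DS IV/V neurons", 4),
]


def sample_labels(design):
    """Return one hypothetical label per biological replicate."""
    return [
        f"DIV{div}_rep{rep}"
        for div, _stage, n in design
        for rep in range(1, n + 1)
    ]


labels = sample_labels(DESIGN)
total_replicates = sum(n for _div, _stage, n in DESIGN)
```

Summing the replicate counts in this way gives the number of libraries to expect across the eight time points, which can serve as a quick check when matching downloaded files to the design.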
In addition to providing the basis for a study to characterize transcriptional processes involved in corticogenesis and neuronal maturation, we foresee several other applications for these data. First, the identification of genes and isoforms that exhibit differential expression in synchronization with known markers is expected to provide novel insight into molecular mechanisms of neurogenesis and neural patterning. Second, few algorithms are available for statistical determination of differential gene expression within longitudinal data sets. The depth and quality of these data suggest that they may be well suited for the development and validation of such methods. Third, we envision this dataset facilitating interspecific comparisons of transcription during neurogenesis and corticogenesis, providing insight into transcriptional mechanisms that are common to, or unique in, the evolution of the mammalian cerebral cortex. For these reasons, we are making these data publicly available for research in bioinformatics, stem cell biology and developmental neuroscience.