More than half of Caenorhabditis elegans pre-mRNAs lose their original 5′ ends in a process termed "trans-splicing," in which the RNA extending from the transcription start site (TSS) to the site of trans-splicing of the primary transcript, termed the "outron," is replaced with a 22-nt spliced leader. This complicates the mapping of TSSs, leading to a lack of available TSS mapping data for these genes. We used growth at low temperature and nuclear isolation to enrich for transcripts still containing outrons, applying a modified SAGE capture procedure and high-throughput sequencing to characterize 5′ termini in this transcript population. We report from these data both a landscape of 5′-end utilization for C. elegans and a representative collection of TSSs for 7351 trans-spliced genes. TSS distributions for individual genes were often dispersed, with a greater average number of TSSs for trans-spliced genes, suggesting that trans-splicing may remove selective pressure for a single TSS. Upstream of newly defined TSSs, we observed well-known motifs (including TATAA-box and SP1) as well as novel motifs. Several of these motifs showed association with tissue-specific expression and/or conservation among six worm species. Comparing TSS features between trans-spliced and non-trans-spliced genes, we found stronger signals among outron TSSs for preferential positioning of flanking nucleosomes and for downstream Pol II enrichment. Our data provide an enabling resource for both experimental and theoretical analysis of gene structure and function in C. elegans.