1-2. Significance of Full-length cDNA Data

Intensive analysis of the 5'-ends of full-length cDNAs enabled us to elucidate the genome-wide features of:
1-2-1. Detailed positions of the transcriptional start sites

1-2-2. Promoter regions adjacent to the TSS

1-2-1. Identification of the promoter regions
Although motifs important for understanding the transcriptional regulation of genes are embedded in promoters, the number of genes whose promoters have been identified is limited. In the Eukaryotic Promoter Database, which collects previously characterized promoter sequences, only several hundred human promoters have been registered. This may be because the exact mRNA start sites have not been identified for most genes. Conventional methods for identifying the mRNA start site, such as S1 mapping, primer extension, or 5' RACE, are technically difficult and often identify the start site inaccurately. The "oligo-capped" cDNA libraries have proved to be good resources for identifying the mRNA start sites of many genes. We aligned these transcriptional start sites onto the genomic sequences and retrieved the adjacent promoter regions.
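As a rough illustration of retrieving the region adjacent to an aligned start site, the following minimal sketch extracts a window around a TSS from a genomic sequence. The function names, the 0-based coordinate convention, and the default window sizes (1000 bases upstream, 200 downstream) are assumptions made for this example only; they are not taken from the original analysis.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def promoter_region(chrom_seq: str, tss: int, strand: str,
                    upstream: int = 1000, downstream: int = 200) -> str:
    """Return `upstream` bases before the TSS followed by `downstream`
    bases starting at the TSS, oriented 5'->3' along the transcript.

    `tss` is a 0-based index into `chrom_seq` (hypothetical convention).
    The window is clipped at the ends of the sequence.
    """
    if strand == "+":
        start = max(0, tss - upstream)
        end = min(len(chrom_seq), tss + downstream)
        return chrom_seq[start:end]
    if strand == "-":
        # On the minus strand, "upstream" lies at higher genomic coordinates.
        start = max(0, tss - downstream + 1)
        end = min(len(chrom_seq), tss + upstream + 1)
        return reverse_complement(chrom_seq[start:end])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    toy_genome = "AAAACCCCGGGGTTTT"  # toy sequence, not real data
    print(promoter_region(toy_genome, 8, "+", upstream=4, downstream=2))
    print(promoter_region(toy_genome, 7, "-", upstream=4, downstream=2))
```

Handling the minus strand separately keeps the returned sequence in transcript orientation, so that motif positions relative to the TSS can be compared across genes on either strand.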
1-2-2. Determination of the exact transcriptional start sites (TSS)
Detailed analysis of the 5'-ends of full-length cDNAs revealed that the exact mRNA start sites are scattered over a relatively wide region for most human genes. In vitro experiments using cell-free systems and viral promoters (e.g. the adenovirus major late promoter) have shown that diverse transcriptional initiation occurs, especially when the canonical TATA box is lacking from the promoter. However, there has been no report showing whether such diverse initiation actually occurs in human cells in vivo at the genome-wide level. In most cases, the transcription start sites have been regarded as "regions" rather than as static "positions". Previous reports have usually described only one or at most a few transcription start sites for each gene (see also the Eukaryotic Promoter Database, which collects previously reported transcription start sites, at http://www.epd.isb-sib.ch; Perier et al., 2000). Even if a gene has multiple start sites, most of them have been overlooked. The distribution of the mRNA start sites should reflect the dynamic nature of transcriptional initiation events in vivo. The precise information presented in this study about the position and frequency of the first transcribed nucleotide should lay the groundwork for elucidating the biophysical principles that govern transcription initiation.
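To make the idea of "position and frequency" concrete, the sketch below tallies how often each genomic position appears as a cDNA 5'-end for a gene and summarizes the spread. The input format (pairs of gene name and position), the function names, and the summary fields are hypothetical choices for illustration, not the procedure used in the study.

```python
from collections import Counter, defaultdict


def tss_distribution(five_prime_ends):
    """Count, per gene, how many cDNA 5'-ends map to each position.

    `five_prime_ends` is an iterable of (gene, position) pairs
    (a hypothetical input format for this example).
    """
    counts = defaultdict(Counter)
    for gene, pos in five_prime_ends:
        counts[gene][pos] += 1
    return counts


def summarize(position_counts):
    """Summarize one gene's start-site distribution."""
    positions = sorted(position_counts)
    total = sum(position_counts.values())
    top_count = max(position_counts.values())
    # Break ties deterministically by taking the lowest coordinate.
    major_site = min(p for p, c in position_counts.items() if c == top_count)
    return {
        "span": positions[-1] - positions[0] + 1,   # width of the TSS region
        "distinct_sites": len(positions),
        "major_site": major_site,
        "major_fraction": top_count / total,
    }


if __name__ == "__main__":
    # Toy data, not real measurements.
    ends = [("geneA", 100), ("geneA", 100), ("geneA", 103),
            ("geneA", 110), ("geneB", 50)]
    for gene, counter in tss_distribution(ends).items():
        print(gene, summarize(counter))
```

A gene whose 5'-ends all fall on one nucleotide yields a span of 1 and a major fraction of 1.0, whereas scattered initiation shows up as a wider span and a lower major fraction.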
