Full-length complementary DNAs (cDNAs) are essential for the correct annotation of genomic sequences and for the functional analysis of genes and their products. We isolated 155,144 RIKEN Arabidopsis full-length (RAFL) cDNA clones. The 3′-end expressed sequence tags (ESTs) of these clones were clustered into 14,668 nonredundant cDNA groups, which represent approximately 60% of the predicted genes. We also obtained 5′ ESTs from 14,034 nonredundant cDNA groups and used them to construct a promoter database. The sequence database of the RAFL cDNAs is useful for promoter analysis and for the correct annotation of predicted transcription units and gene products. Furthermore, the full-length cDNAs are a useful resource for analyzing the expression profiles, functions, and structures of plant proteins.
We have determined the full-length sequences of RAFL cDNA clones in collaboration with the Arabidopsis SSP group, which comprises investigators from the Salk Institute (PI: Dr. Joseph R. Ecker), the Stanford Genome Technology Center (PI: Dr. Ronald W. Davis), and the Plant Gene Expression Center (PI: Dr. Athanasios Theologis). Once their full-length sequences were determined, the RAFL cDNA clones became available from the RIKEN BioResource Center (BRC).
Rice full-length cDNA clones were collected and completely sequenced through a collaboration among the National Institute of Agrobiological Sciences (NIAS), the Foundation for Advancement of International Science (FAIS), and RIKEN, under the supervision of the Bio-oriented Technology Research Advancement Institution (BRAIN). The full-length cDNA libraries were constructed from approximately twenty types of stressed tissues of Oryza sativa L. ssp. japonica cv. Nipponbare, and 170,000 clones were randomly picked from them. Based on single-pass sequences of their 3′ ends, the clones were clustered into 28,000 independent groups. All representative clones have been completely sequenced.