The amount of finished sequence produced by the Caenorhabditis elegans Genome Project grew to over 14 Mb at the end of March. This marks more than a six-fold increase from the 2.2 Mb reported in mid-1994 [see HGN 6(2), 1-2 (July 1994)]. Of the new total, investigators headed by Richard Wilson and Robert Waterston [Washington University (WU), St. Louis] contributed 272 cosmids (8,339,124 bases), and researchers led by John Sulston (Sanger Centre, U.K.) contributed 183 cosmids (5,634,590 bases). The St. Louis group has also finished 39 yeast cosmids (1,248,089 bases). [Figures provided by LaDeana Hillier and David States, WU]

Sanger Centre Ftp Site
Cosmid sequences are available by anonymous ftp from ftp.sanger.ac.uk in the directory pub/databases/C.elegans_sequences (or via ftp://ftp.sanger.ac.uk/). The ftp site, which is updated regularly, contains all the Sanger Centre C. elegans sequence data as well as completed contigs from St. Louis.
Sequences are divided into the following three directories.
* EMBL_SEQUENCES: Finished, annotated, already submitted to public databases.
* FINISHED_SEQUENCES: Finished but not annotated; may contain errors and change from day to day.
* UNFINISHED_SEQUENCES: Very preliminary, may contain contamination; major changes when updated. Useful for mapping and gene hunting.
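The directory layout described above can be sketched as a short Python script that uses the standard ftplib module. This is only an illustration: the helper names (directory_paths, list_sequences, main) are hypothetical, and it assumes the three directories sit directly under pub/databases/C.elegans_sequences and that the server still accepts anonymous logins with this layout, none of which is verified here.

```python
from ftplib import FTP

# Host and base directory as given in the text above.
HOST = "ftp.sanger.ac.uk"
BASE_DIR = "pub/databases/C.elegans_sequences"
# Assumption: the three directories are direct children of BASE_DIR.
SUBDIRS = ("EMBL_SEQUENCES", "FINISHED_SEQUENCES", "UNFINISHED_SEQUENCES")


def directory_paths(base=BASE_DIR, subdirs=SUBDIRS):
    """Return the full server path of each sequence directory."""
    return [f"{base}/{name}" for name in subdirs]


def list_sequences(ftp, paths):
    """Map each directory path to the names the server reports for it.

    `ftp` is any object with an ftplib.FTP-style nlst(path) method.
    """
    return {path: ftp.nlst(path) for path in paths}


def main():
    # Anonymous login; needs network access and a server that still
    # exposes this layout (an assumption, not checked here).
    with FTP(HOST) as ftp:
        ftp.login()
        for path, names in list_sequences(ftp, directory_paths()).items():
            print(path, len(names), "entries")
```

Calling main() would print one line per directory with the number of entries found. Because FINISHED_SEQUENCES and UNFINISHED_SEQUENCES are described as changing frequently, any listing should be treated as a snapshot.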
For further information, contact Steven Jones (Fax: +44-1223/494919, sjj@sanger.ac.uk), who would appreciate comments on these sequences.

Washington University, St. Louis
The St. Louis group emphasizes making finished data available as rapidly as possible through the Sanger Centre ftp server and public sequence databases. Cosmids are processed individually; once a cosmid has been sequenced on both strands and its sequence conflicts and ambiguities have been resolved, it is annotated (including gene prediction and BLAST searches for homology) and submitted to the public databases.