first large set of genomic sequence data (whole genome shotgun reads,
or WGS) was contributed by the Washington University Genome Sequencing
Center (WUGSC) in November 2004. As a result, Biomphalaria glabrata
(BB02 strain) became a registered species in the NCBI trace archive
in summer 2005.
The Genome Center at Washington University
has been assigned the full project of sequencing the genome of Biomphalaria glabrata.
to) the webpage dedicated to the genome sequencing of Biomphalaria
are yielding progress in the characterization of the genome of Biomphalaria
Recently, preliminary assemblies became available from the Genome
Sequencing Center at Washington University.
*These are PRELIMINARY Bg Genomic Data,
subject to change*
at your own discretion!*
*Be aware that the assembly is highly fragmented*
*DO NOT contact the sequencing center on this subject*
*future assemblies will likely show improvements*
The sequencing effort is still in progress; these sequence data are
not to be used for independent large-scale analyses or reports.
Updates and progress reports will follow.
BLAST access to the data, and the sequence data themselves, are available through http://184.108.40.206/blast_bg/2index.html
This site will be available as long as traffic remains manageable.
The planning of this
project and exploration of available resources
- An initial meeting of consortium
members, held jointly with the 2005 Schistosoma and Filarial Genome
Network Meeting, Rockville MD, USA, 1-2 September 2005 (summary
- A meeting of white paper authors
with members of GSC senior staff, WUGSC, St. Louis MO, USA 6 October
A proposal for the specific strategy to characterize the genome of
Biomphalaria glabrata is now being drafted interactively, for approval
by NHGRI.
Report on Update Meeting
on "Progress and development of the genome sequencing project for
Biomphalaria glabrata" (October 2007, ASTMH 56th annual meeting, Philadelphia)
- Important points to
consider for your own research
1) The sequencing project will incorporate
manually improved characterization of 12 BACs. These will be selected
by WUGSC with input from the consortium. So if you have a favorite
gene, this is your chance to identify a relevant BAC clone from the
BB02 BAC library (see http://biology.unm.edu/biomphalaria-genome/BAC.html
for info and a link to the literature report), either using existing
info or by probing the library yourself (available from http://www.genome.arizona.edu/orders/).
The time frame for BAC selection will end by early to mid 2008.
Send specific clone identifiers to firstname.lastname@example.org; this information
will be forwarded to the sequencing center for consideration. In selecting
BACs for sequencing, WUGSC will consider the distribution of clones
among proposing research groups, the time frame, the progress of the genome
sequencing project and technical feasibility.
2) It is also possible to provide tissue- or cell-specific RNA samples
(BB02 strain Biomphalaria glabrata) to WUGSC. These
will be used to generate cDNA libraries and characterize additional
ESTs as part of the sequencing effort. Communicate with WUGSC (contact
info below) before sending samples to avoid duplicate effort. WUGSC
has already exhaustively sequenced the BB02 "whole body" library
and now has total RNA samples from BB02 hepatopancreas and ovotestes.
Dr S. Clifton, project leader of B. glabrata genome sequencing, WUGSC.
-Results to date derive from exploratory sequencing efforts by WUGSC.
Now that preparations have been completed, a strategy was presented
to complete the majority of sequencing within about a year.
-The sequencing strategy will include manually improved characterization
of 12 BACs from the BB02 BAC library. The target BACs will be selected
by WUGSC with input from the consortium.
-Sequencing will include an extensive 454 sequencing component (massively
-Sequencing efforts will include an EST component; submission of relevant
tissue/cell-specific samples from BB02 Biomphalaria glabrata is welcome.
-Data generated will be publicly available (download instructions are
on the web and below).
-WUGSC will assemble genomic sequence data and provide a level of automated
annotation. It is not anticipated that this will result in a final,
complete view of the genome. Further interpretation and development
of the genome will take continued effort from the scientific community.
Topics discussed at the meeting
- The consortium continues to intend that genome sequence data will
be publicly available for management and annotation.
- Dr Guilherme Oliveira (FIOCRUZ, Brazil) proposed to apply existing
expertise, computational infrastructure and funding available in Brazil
to initiate a database (design and format similar to SchistoDB) for
management and continued annotation of the Biomphalaria glabrata genome.
This offer merits serious consideration; in the meantime, it would be
helpful to be informed of alternative possibilities.
- WUGSC will continue to monitor the quality of sequence data with regard
to timely updates, automated annotation and removal of vector/linker
- The consortium will explore possibilities to interact with genome
efforts for other gastropod species (Aplysia, Lottia)
to aid interpretation of sequence data.
- Contribution to the sequencing effort (resources/annotation/funding)
from other (international) entities continues to be welcome.
As of November 2007,
WUGSC had generated ~60,000 ESTs (dbEST and trace archives), 60,000+
WGS reads, and 12,079 454 ESTs from BB02 snails (about 90% of all sequence
entries for Biomphalaria glabrata in GenBank) as part of the preparatory
phase of the project.
The “whole body” BB02 cDNA library has been exhausted (the chance
that the next sequence will be novel is ~0.100). Other tissue/cell-specific
BB02 cDNA libraries will be characterized, using samples provided by
Concerns regarding haplotype diversity were addressed by selfing and
bottlenecking the BB02 strain of Biomphalaria glabrata for six generations.
DNA is extracted from the resulting snails (whole body minus reproductive
organs). DNA is quantified from staining intensity on gels relative
to references, because a dark pigment that co-purifies with genomic DNA
makes spectrophotometry unreliable. With the DNA available, WUGSC will
proceed in the following manner (genome size estimate ~931 Mb):
1X coverage from plasmids (WGS) on 3730 (931 Mb)
10X 454 coverage (9.31 Gb, WGS)
~0.1X BAC End Sequence (BES) (66 Mb, or ~550 nt from both termini of 60,000 BACs)
454-based cDNA sequencing (ESTs)
12 "improved" BACs: Shotgun (SG) + Pre-Finish (PF) + manual
Further cDNA and genome sequencing could be considered once an evaluation
of the initial assembly has been completed. Sequencing/Assembly concerns
include: high A/T content (60%), unknown repetitive content, heterozygosity/recombination
(used 40 organisms for BAC library, and expect to use many for WGS DNA).
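The coverage figures in the sequencing plan can be checked with a little arithmetic. The following is a minimal Python sketch, not part of the WUGSC plan itself: the function and variable names are hypothetical, and it assumes that the "60,000" in the BES line counts BAC clones, each read ~550 nt from both ends.

```python
# Sanity check of the coverage figures quoted in the sequencing plan.
# All names here are illustrative; only the numbers come from the plan.

GENOME_SIZE_BP = 931_000_000  # estimated genome size (~931 Mb)


def bases_for_coverage(fold, genome_size_bp=GENOME_SIZE_BP):
    """Total bases needed to cover the genome `fold` times on average."""
    return fold * genome_size_bp


def coverage_from_bases(total_bases, genome_size_bp=GENOME_SIZE_BP):
    """Average fold coverage delivered by `total_bases` of sequence."""
    return total_bases / genome_size_bp


plasmid_bases = bases_for_coverage(1)   # 1X plasmid WGS on 3730
pyro_bases = bases_for_coverage(10)     # 10X 454 WGS

# ASSUMPTION: 60,000 BAC clones, both termini read, ~550 nt per terminus.
bes_bases = 60_000 * 2 * 550
bes_coverage = coverage_from_bases(bes_bases)

total_coverage = coverage_from_bases(plasmid_bases + pyro_bases + bes_bases)

print(f"1X plasmid WGS : {plasmid_bases / 1e6:.0f} Mb")
print(f"10X 454 WGS    : {pyro_bases / 1e9:.2f} Gb")
print(f"BES            : {bes_bases / 1e6:.0f} Mb ({bes_coverage:.3f}X)")
print(f"Combined       : {total_coverage:.2f}X")
```

Under these assumptions the BES component works out to about 0.07X, which the plan rounds to ~0.1X.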
Choose Genomes in the banner, then Invertebrates, then Biomphalaria glabrata,
then Sequences and maps (from the menu on the left side of the page).
Retrieving data from the FTP site:
Data will be put on the FTP site per request of a GSC staff member.
Access the GSC web site:
All package files are found in directories of their respective names
on the FTP site.
To unpack: see the readme file first!
gzip -d BGAC-aaa01.tar.gz    # or: gunzip BGAC-aaa01.tar.gz
tar xf BGAC-aaa01.tar
This will unpack into a directory containing subdirectory trees named
chromat_dir, phd_dir, traceinfo, fasta, qual and exp. These trees allow
for immediate use of consed, if installed. If your trace viewer requires
it, uncompress each trace file in the same way.
Ex: gzip -d aaa01a01.b1.gz
Ex: ftp genome.wustl.edu
cd /BGAC-aaa01
get traces.tar.gz
get exp.tar.gz
Then, after closing the ftp session:
gunzip traces.tar.gz
gunzip exp.tar.gz
tar -xvf traces.tar
tar -xvf exp.tar
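For those who prefer to script the unpacking, the gzip and tar steps above can be approximated with the Python standard library. This is a minimal sketch under stated assumptions: the function names unpack_package and gunzip_traces are hypothetical, the package is assumed to be an ordinary gzip-compressed tar file such as BGAC-aaa01.tar.gz, and the readme on the FTP site remains the authoritative guide.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def unpack_package(archive, dest="."):
    """Extract a .tar.gz package (the `gzip -d` + `tar xf` steps) into dest."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Newer Python versions provide extraction filters that reject
        # unsafe member paths; use the "data" filter when it is available.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return dest


def gunzip_traces(directory):
    """Decompress every *.gz file under directory, like `gzip -d` on each."""
    unpacked = []
    for gz_path in sorted(Path(directory).rglob("*.gz")):
        out_path = gz_path.with_suffix("")  # aaa01a01.b1.gz -> aaa01a01.b1
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        gz_path.unlink()  # gzip -d also removes the compressed file
        unpacked.append(out_path)
    return unpacked
```

A typical call would be unpack_package("BGAC-aaa01.tar.gz") followed by gunzip_traces("BGAC-aaa01/chromat_dir"), assuming the archive unpacks into a BGAC-aaa01 directory and the trace viewer needs uncompressed files.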
EST data are in dbEST, the NCBI Expressed Sequence Tags database.
DNA data (individual reads) are in the NCBI trace archives.
Note: The links to the NCBI databases are also provided on
WUGSC Contact information
Sandra W. Clifton, Ph.D.
GSC Assistant Director, Genetics Research Assistant Professor, Snail
Washington University Genome Sequencing Center.
314-286-1467 - OFFICE
Robert S. Fulton, MS
Finishing Group Leader
314-286-1810 - FAX
314-286-1838 - OFFICE
Production Group Leader
314-286-1460 - OFFICE
314-286-1810 - FAX