biomphalaria genome initiative

Biomphalaria glabrata genome sequencing

On behalf of the Biomphalaria glabrata genome initiative, a white paper proposal for the sequencing of the complete 931 Mb genome of the snail Biomphalaria glabrata (BB02 strain) was submitted to the National Human Genome Research Institute (NHGRI). In summer 2004, the Comparative Genome Evolution (CGE) working group included a "high priority" recommendation in its proposal. Biomphalaria glabrata was added to the sequencing pipeline shortly thereafter and listed as one of the new genomic sequencing targets.

The first large set of genomic sequence data (whole genome shotgun reads, or WGS) was contributed by the Washington University Genome Sequencing Center (WUGSC) in November 2004. As a result, Biomphalaria glabrata (BB02 strain) became a registered species in the NCBI trace archive in summer 2005.

The Genome Center at Washington University has been assigned the full project of sequencing the genome of Biomphalaria glabrata; see the webpage dedicated to the genome sequencing of Biomphalaria glabrata.


Continuing efforts are yielding progress in the characterization of the genome of Biomphalaria glabrata.
Recently, preliminary assemblies became available from the Genome Sequencing Center at Washington University.

*These are PRELIMINARY Bg Genomic Data, subject to change*
*Use at your own discretion!*
*Be aware that the assembly is highly fragmented*
*DO NOT contact the sequencing center on this subject*
*Future assemblies will likely show improvements*

The sequencing effort is still in progress; these sequence data are not to be used for independent large-scale analysis or reports.

Updates and progress reports will follow as appropriate.
BLAST access to the data, and the sequence data themselves, are available through
This site will be available as long as traffic remains manageable.

The planning of this project and exploration of available resources

  • An initial meeting of consortium members, joint with the 2005 Schistosoma and Filarial Genome Network Meeting, Rockville MD, USA 1-2 September 2005 (summary report PDF-file).
  • A meeting of white paper authors with members of GSC senior staff, WUGSC, St. Louis MO, USA 6 October 2005.

A proposal is now being drafted collaboratively for approval by NHGRI of the specific strategy to characterize the genome of Biomphalaria glabrata.

Report on Update Meeting on "Progress and development of the genome sequencing project for Biomphalaria glabrata",
(October 2007, ASTMH 56th annual meeting, Philadelphia PA USA).

  • Important points to consider for your own research

    1) The sequencing project will incorporate manually improved characterization of 12 BACs. These will be selected by WUGSC with input from the consortium. If you have a favorite gene, this is your chance to identify a relevant BAC clone from the BB02 BAC library (see, for info and link to the literature report), either using existing information or by probing the library yourself (available from ). The time frame for BAC selection will end by early to mid 2008.
    Send in specific clone identifiers; this information will be forwarded to the sequencing center for consideration. In selecting BACs for sequencing, WUGSC will consider the distribution of clones among proposing research groups, the time frame, the progress of the genome sequencing project, and technical feasibility.

    2) It is also possible to provide tissue- or cell-specific RNA samples (BB02 strain Biomphalaria glabrata) to WUGSC. These will be used to generate cDNA libraries and characterize additional ESTs as part of the sequencing effort. Communicate with WUGSC (contact info below) before sending samples to avoid duplicate effort. WUGSC has already exhaustively sequenced the BB02 "whole body" library and now has total RNA samples from BB02 hepatopancreas and ovotestes.

  • Summary Update meeting

Presentation by Dr S. Clifton, project leader of B. glabrata genome sequencing, WUGSC.
- Results to date derive from exploratory sequencing efforts by WUGSC. Now that preparations have been completed, a strategy was presented to complete the majority of sequencing within about a year.
- The sequencing strategy will include manually improved characterization of 12 BACs from the BB02 BAC library. The target BACs will be selected by WUGSC with input from the consortium.
- Sequencing will include an extensive 454 sequencing component (massively parallel sequencing).
- Sequencing efforts will include an EST component; submission of relevant tissue- or cell-specific samples from BB02 Biomphalaria glabrata is welcome.
- Data generated will be publicly available (download instructions are on the web and below).
- WUGSC will assemble genomic sequence data and provide a level of automated annotation. It is not anticipated that this will result in a final, complete view of the genome. Further interpretation and development of the genome will take continued effort from the scientific community.

Topics discussed at the meeting
- The consortium continues to intend that genome sequence data will be publicly available for management and annotation.
- Dr Guilherme Oliveira (FIOCRUZ, Brazil) proposed to apply existing expertise, computational infrastructure and funding available in Brazil to initiate a database (design and format similar to SchistoDB) for management and continued annotation of the Biomphalaria glabrata genome.
This offer merits serious consideration; in the meantime, it would be helpful to be informed of alternative possibilities.
- WUGSC will continue to monitor the quality of sequence data with regard to timely updates, automated annotation and removal of vector/linker sequences.
- The consortium will explore possibilities to interact with genome efforts for other gastropod species (Aplysia, Lottia) to aid interpretation of sequence data.
- Contribution to the sequencing effort (resources/annotation/funding) from other (international) entities continues to be welcome.

By November 2007, WUGSC had generated ~60000 ESTs (dbEST and trace archives), ~60000+ WGS reads, and 12079 454 ESTs from BB02 snails (about 90% of all sequence entries for Biomphalaria glabrata in GenBank) as part of the preparatory phase of the project.
The “whole body” BB02 cDNA library has been exhausted (the chance that the next sequence will be novel is ~0.100). Other tissue- or cell-specific BB02 cDNA libraries will be characterized, using samples provided by consortium members.
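The source does not say how the chance of a novel next sequence was computed. As a rough illustration only, one standard way to estimate such a figure is the Good-Turing estimator: the fraction of reads that belong to a gene or cluster seen exactly once. The sketch below assumes each EST has already been assigned a cluster label; the function name and the toy labels are hypothetical and are not WUGSC data.

```python
from collections import Counter

def novelty_probability(cluster_ids):
    """Good-Turing estimate of P(next read is novel): singletons / total reads.

    cluster_ids is an iterable with one cluster (gene) label per EST read.
    This is an illustrative assumption, not the method used by WUGSC.
    """
    counts = Counter(cluster_ids)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one read")
    # Clusters observed exactly once stand in for the unseen fraction.
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total

# Toy, made-up labels: 10 reads, one of which hits a cluster seen only once.
reads = ["g1"] * 5 + ["g2"] * 4 + ["g3"]
print(novelty_probability(reads))
```

A low value suggests that further sequencing of the same library mostly re-samples known transcripts, which is the sense in which a library is described as exhausted.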
Concerns regarding haplotype diversity were approached by selfing and bottlenecking the BB02 strain of Biomphalaria glabrata for six generations. DNA is extracted from the resulting snails (whole body minus reproductive organs). DNA is quantified from staining intensity on gels relative to references, because a dark pigment that co-purifies with genomic DNA made spectrophotometry unreliable. With the DNA available, WUGSC will proceed in the following manner (genome size estimate ~931 Mb):

1X coverage from plasmids (WGS) on 3730 (931 Mb)
10X 454 coverage (9.31 Gb, WGS)
~0.1X BAC End Sequence (BES) (66 Mb, or ~550 nt from both termini of 60,000 BACs)
454-based cDNA sequencing (ESTs)
12 "improved" BACs: Shotgun (SG) + Pre-Finish (PF) + manual improvement
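The base totals in this plan follow directly from the ~931 Mb genome size estimate. The short sketch below simply re-derives them as a sanity check; the function and variable names are illustrative, and the only inputs are the figures quoted in the plan.

```python
# Back-of-the-envelope check of the figures quoted in the sequencing plan.
GENOME_SIZE_BP = 931_000_000  # genome size estimate from the text (~931 Mb)

def bases_for_coverage(fold, genome_size_bp=GENOME_SIZE_BP):
    """Total bases needed to reach a given fold coverage of the genome."""
    return fold * genome_size_bp

def bac_end_bases(n_bacs, read_len_nt, ends_per_bac=2):
    """Total bases from reading a fixed length off both termini of each BAC."""
    return n_bacs * ends_per_bac * read_len_nt

plasmid_bp = bases_for_coverage(1)       # 1X plasmid WGS
pyro_bp = bases_for_coverage(10)         # 10X 454 WGS
bes_bp = bac_end_bases(60_000, 550)      # BAC end sequences
bes_fold = bes_bp / GENOME_SIZE_BP       # BES expressed as fold coverage

print(f"1X plasmid WGS : {plasmid_bp / 1e6:.0f} Mb")
print(f"10X 454 WGS    : {pyro_bp / 1e9:.2f} Gb")
print(f"BAC end reads  : {bes_bp / 1e6:.0f} Mb ({bes_fold:.2f}X)")
```

The BAC end figure works out to 66 Mb, or about 0.07X of the genome, which is consistent with the rounded "~0.1X" in the plan.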

Further cDNA and genome sequencing could be considered once an evaluation of the initial assembly has been completed. Sequencing and assembly concerns include high A/T content (60%), unknown repetitive content, and heterozygosity/recombination (40 organisms were used for the BAC library, and many are expected to be used for the WGS DNA).

Data Access

Choose Genomes in banner; Invertebrates; Biomphalaria glabrata; Sequences and maps (from menu on left side of page)

Retrieving data from the FTP site:
Data will be put on the FTP site per request of a GSC staff member.
Access the GSC web site:
All package files are found in directories of their respective names on the FTP site.
To unpack (see the readme file first!):
gzip (or gunzip) -d BGAC-aaa01.tar.gz
tar xf BGAC-aaa01.tar

This will unpack into a directory with subtrees named chromat_dir, edit_dir, phd_dir, traceinfo, fasta, qual, and exp. These trees allow for immediate use of consed, if installed. If your trace viewer requires it, uncompress each trace file in the same way.
Ex: gzip -d aaa01a01.b1.gz

Ex:
ftp
cd /BGAC-aaa01
get traces.tar.gz
get exp.tar.gz

gunzip traces.tar.gz
gunzip exp.tar.gz
tar -xvf traces.tar
tar -xvf exp.tar
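For users without command-line gzip and tar, the same unpacking steps can be sketched with the Python standard library. This is a minimal illustration, not a tool provided by GSC: the function names are hypothetical, and it assumes that the compressed trace files sit directly inside chromat_dir with a .gz suffix, as the aaa01a01.b1.gz example suggests.

```python
import gzip
import shutil
import tarfile
from pathlib import Path

def unpack_package(archive, dest):
    """Rough equivalent of `gzip -d pkg.tar.gz` followed by `tar xf pkg.tar`."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:gz" reads the gzip-compressed tarball in one step.
    # Only extract archives obtained from a source you trust.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return dest

def gunzip_traces(chromat_dir):
    """Decompress every *.gz file in a directory, like `gzip -d file.gz`."""
    unpacked = []
    for gz_path in sorted(Path(chromat_dir).glob("*.gz")):
        target = gz_path.with_suffix("")  # aaa01a01.b1.gz -> aaa01a01.b1
        with gzip.open(gz_path, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        gz_path.unlink()  # gzip -d also removes the compressed original
        unpacked.append(target)
    return unpacked

# Example call (the archive name is taken from the text; the layout is assumed):
# root = unpack_package("BGAC-aaa01.tar.gz", "unpacked")
# gunzip_traces(root / "BGAC-aaa01" / "chromat_dir")
```

As with the shell commands, read the readme file first, since the actual directory layout of a given package may differ from the one assumed here.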

EST data are in dbEST, the NCBI Expressed Sequence Tags database.
DNA data (individual reads) are in the NCBI trace archives.
Note: The links to the NCBI databases are also provided on

WUGSC Contact information
Sandra W. Clifton, Ph.D.
GSC Assistant Director, Genetics Research Assistant Professor, Snail Project Manager
Washington University Genome Sequencing Center.
314-286-1810 - FAX
314-286-1467 - OFFICE

Robert S. Fulton, MS
Finishing Group Leader
314-286-1810 - FAX
314-286-1838 - OFFICE

Lucinda Fulton
Production Group Leader
314-286-1460 - OFFICE
314-286-1810 - FAX

Updates will follow as information becomes available.
If you are preparing a Biomphalaria glabrata-related grant proposal, mentioning the genome sequencing project may be useful to your proposal's goals.