SNPlinks: A Guide to SNP Databases


NCBI dbSNP http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=snp

dbSNP is the principal repository of SNP data generated from the public Human Genome Mapping Project. The NCBI host site also holds the most important collection of nucleotide (GenBank), gene (LocusLink), protein (Protein), published article (PubMed and Medline) and inherited disorder (OMIM) databases currently available.

The NCBI homepage gives links to all of these databases: http://www.ncbi.nlm.nih.gov/ and the full site map is at: http://www.ncbi.nlm.nih.gov/Sitemap/index.html.
NCBI has held each of the draft versions of the human sequence since 1990 and now holds the reference sequence completed in April 2003. This consists of 2.9 billion bases (99% coverage of gene-containing DNA) with an error rate of 1 in 10,000 bases. dbSNP is regularly updated in synchrony with genome rebuilds, ensuring a high level of scrutiny.
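
As a rough illustration of what that error rate means in absolute terms, the two figures quoted above can be combined. This is only back-of-envelope arithmetic on the stated numbers, assuming errors are spread evenly; the variable names are chosen here for illustration.

```python
# Figures taken from the paragraph above.
GENOME_BASES = 2_900_000_000   # 2.9 billion bases in the reference sequence
BASES_PER_ERROR = 10_000       # stated error rate: 1 error in 10,000 bases

# Expected number of erroneous bases across the whole reference sequence.
expected_errors = GENOME_BASES // BASES_PER_ERROR
print(f"{expected_errors:,}")  # 290,000
```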

There is an extensive PDF-based user handbook that can be downloaded chapter by chapter: http://www.ncbi.nlm.nih.gov/books/bv.fcgi?call=bv.View..ShowTOC&rid=handbook.TOC&depth=2.

NCBI uses a standardised query system, termed Entrez, for the major databases it hosts: http://www.ncbi.nlm.nih.gov/Entrez/
This system provides a unified approach to constructing queries and integrates data from several sources in the information returned from a query, principally through cross-referenced hyperlinks. It provides the easiest way to collect SNPs that share particular characteristics, such as type of substitution or minimum allele frequency. Users familiar with PubMed queries can use exactly the same rules and syntax for constructing searches. Entrez Gene was added to Entrez SNP and the other 12 searchable databases in December 2003. There is a global search engine for querying all 14 data sets at once.
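
The PubMed-style syntax mentioned above combines terms of the form value[FIELD] with Boolean operators such as AND. The following is a minimal sketch of how such a query might be assembled and turned into a search URL. It assumes NCBI's E-utilities ESearch address, which this guide does not itself describe, and the field names GENE and ORGN are used for illustration only; the fields actually available for Entrez SNP should be checked on the NCBI site. The helper names build_term and build_esearch_url are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: NCBI's E-utilities ESearch endpoint (not mentioned in this guide).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(criteria):
    """Join (value, field) pairs into an Entrez-style query: value[FIELD] AND ..."""
    parts = []
    for value, field in criteria:
        # Quote multi-word values so they are searched as a phrase.
        text = f'"{value}"' if " " in value else value
        parts.append(f"{text}[{field}]")
    return " AND ".join(parts)


def build_esearch_url(term, db="snp", retmax=20):
    """Return a search URL for the given database and query term (no request is sent)."""
    return ESEARCH_URL + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


# Illustrative field names only; they are assumptions, not a list of Entrez SNP fields.
term = build_term([("BRCA1", "GENE"), ("human", "ORGN")])
print(term)
print(build_esearch_url(term))
```

The sketch only builds strings, so the same term can be pasted into the Entrez search box instead of being sent programmatically.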
 


The SNP Consortium http://snp.cshl.org/

This database is run by Cold Spring Harbor Laboratory on behalf of a private/public partnership of 17 organisations forming The SNP Consortium (TSC). It currently comprises 1.8 million loci, all of which are listed in dbSNP. Both databases are fully cross-referenced, but The SNP Consortium uses a different locus ID number, prefixed with the letters TSC. The database actively collates public-domain SNPs with the highest levels of validation, normally in samples from African, European and Asian populations.
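
Because the two databases use different identifiers for the same loci, it can help to sort a mixed list of IDs by prefix before looking them up. The sketch below is one possible way to do this. The TSC prefix comes from the description above; the rs (reference SNP) and ss (submitted SNP) prefixes are the usual dbSNP ones and are added here as an assumption. The function name classify_snp_id and the example IDs are made up for illustration.

```python
import re

# "TSC" is the prefix described above; "rs" and "ss" are assumed dbSNP prefixes.
PATTERNS = {
    "TSC": re.compile(r"^TSC(\d+)$", re.IGNORECASE),
    "dbSNP refSNP": re.compile(r"^rs(\d+)$", re.IGNORECASE),
    "dbSNP submitted": re.compile(r"^ss(\d+)$", re.IGNORECASE),
}


def classify_snp_id(identifier):
    """Return (source, numeric part) for a recognised ID, or None otherwise."""
    cleaned = identifier.strip()
    for source, pattern in PATTERNS.items():
        match = pattern.match(cleaned)
        if match:
            return source, int(match.group(1))
    return None


# Made-up identifiers, for illustration only.
for snp_id in ["TSC0012345", "rs12345", "ss67890", "unknown1"]:
    print(snp_id, classify_snp_id(snp_id))
```

Translating a TSC number into the matching dbSNP entry still requires the cross-references held in the databases themselves; the prefix only shows which database issued the ID.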


Applied Biosystems Assays-on-Demand Database at MyScience

http://myscience.appliedbiosystems.com/cdsEntry/Form/assay_search_basic.jsp

This database is a subset of 96,000 SNPs taken from the Celera Discovery System (CDS) subscription genome database (http://www.celeradiscoverysystem.com/), which currently comprises 4.1 million markers discovered from the private HGMP initiative. SNPs in this database have been selected specifically to provide a gene-centric SNP linkage map as part of the assay design service for the ABI TaqMan system. The database can be freely accessed and queried with several search criteria, within the limits of the data provided. However, the more detailed information necessary for developing genotyping assays, such as flanking sequence, requires a CDS subscription.


Ensembl http://www.ensembl.org/Homo_sapiens/

A genome database and browser run by EBI and The Sanger Centre. It parallels the NCBI content with a similar search system and graphical browser. The Sanger Centre website provides an alternative BLAST site (http://www.sanger.ac.uk/cgi-bin/blast/submitblast/hgp) and a sequence alignment tool, SSAHA (http://www.sanger.ac.uk/Software/analysis/SSAHA/).


HGVBase (formerly HGBase) http://hgvbase.cgb.ki.se/

Comprises 2.86 million human genome variants, including SNPs, DIPs (deletion/insertion polymorphisms) and STRs. It lists low-frequency variants and new mutations, plus about 40% of dbSNP loci re-curated for HGVBase. In addition, this site provides a very useful listing of 45 online SNP-related databases (http://hgvbase.cgb.ki.se/cgi-bin/main.pl?page=databases_.htm).


Chris Phillips 18.02.2004