The Intracellular Pathogen Cooperative Group @ St George's

Bioinformatics and Related Web Links

 Last updated : 5th Jan 2007

Cooperative Home Page

Old Phone Book (Internal)

St George's Portal

Seminars

London Technology Network Funding Sources

Stores

General Bioinformatics Resources

Local Resource pages

Local Bioinformatics Teaching Resources (Internal Access only)

Remote Resources/References

  • The International Society for Magnetic Resonance in Medicine is a nonprofit professional association devoted to furthering the development and application of magnetic resonance techniques in medicine and biology. The Society holds annual scientific meetings and sponsors other major educational and scientific workshops.

Primary Sequence Databases Tools

Nucleic Acid Sequence Databases

Protein Sequence Databases

  • The National Center for Biotechnology Information (NCBI): public databases, develops and distributes software tools for analyzing genome data, based in Bethesda MD US
    • Cross-database searching using ENTREZ: GenBank (nucleotides and proteins), PubMed (MEDLINE), 3D structures, genomes, and PopSet databases. This is a similar type of search engine to SRS at the EBI
    • Searching GenBank
    • Genome Resources
    • dbEST is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or Expressed Sequence Tags, from a number of organisms
  • European Bioinformatics Institute (EBI) Home Page: maintains and provides access to public databases such as EMBL and information services, has an out-station based in UK at Hinxton Cambs
    • Some of the EMBOSS tools with a web interface
    • SRS search session allows complex cross database linking of queries to most major sequence, functional and literature databases, as well as a range of tools
  • DNA Data Bank of Japan (DDBJ) Japanese National database resource
  • "estinformatics" resources/databases of ESTs, GSSs, etc. which have been cleaned up of vector and E.coli sequence
  • miRBase (http://microrna.sanger.ac.uk/) is the new home of microRNA data on the web, providing data previously accessible from the miRNA Registry. Old miRNA Registry addresses should redirect you to this page.
    • The miRBase Sequence Database is a searchable database of published miRNA sequences and annotation. The data were previously provided by the miRNA Registry.
    • The miRBase Registry continues to provide gene hunters with unique names for novel miRNA genes prior to publication of results.
    • The miRBase Targets database is a new resource of predicted miRNA targets in animals.
  • human, mouse, and rat miRNAs, ARGONAUTE — a database for gene regulation by mammalian microRNAs

NMR

Genome Sequence Databases and Genome Annotation

  • The Institute for Genomic Research (Formerly TIGR), an institute with similar goals and aims to the Sanger Centre: structural, functional and comparative analysis of genomes and gene products from a wide variety of organisms
  • The Sanger Centre Web Server: one of the leading genomics centres in the world, dedicated to analysing and understanding genomes. Provides access to software tools for interrogating the genomes of selected organisms.
  • Genome Sequencing projects at the Sanger
  • Laboratory of Genomics of Microbial Pathogens, Institut Pasteur
  • Ensembl Genome Server
  • NCBI - The whole genomes of over 1000 viruses and over 100 microbes can be found in Entrez Genome. The genomes represent both completely sequenced organisms
  • Encyclopedia of DNA Elements (ENCODE) project, started in September 2003, aims to map the functional elements of the human genome; see the UCSC ENCODE browser
  • GenomeNet is a Japanese network of database and computational services for genome research and related research areas in molecular and cellular biology
  • GeneCards Homepage, GeneCards™ is a database of human genes, their products and their involvement in diseases. It offers information about the functions of all human genes that have an approved symbol.
  • The Joint Genome Institute (JGI), established in 1997, is a consortium of scientists, engineers and support staff from the U.S. Department of Energy's Lawrence Berkeley, Lawrence Livermore, and Los Alamos National Laboratories. It aims to develop and exploit new sequencing and other high-throughput, genome-scale and computational technologies as a means for discovering and characterizing the basic principles and relationships underlying the organization, function, and evolution of living systems. DOE expanded its genomic research to include the Microbial Genome Initiative in 1994.

Organism specific databases

Gene Ontology (GO / Functional Groupings)

  • Department of Computer Science, Wayne State University hosts a collection of GO tools which are free to academics and include Onto-Express (OE), a novel tool able to automatically translate lists of differentially regulated genes into functional profiles.
  • have a look at http://www.geneontology.org/GO.tools.microarray.shtml or other tools at geneontology.org
  • WEGO (Web Gene Ontology Annotation Plot) is a useful tool for plotting GO annotation results : http://wego.genomics.org.cn/cgi-bin/wego/index.pl
  • GOToolBox web server. This site provides a series of programs allowing the functional investigation of groups of genes, based on the Gene Ontology™ resource. Gene Ontology for Significant Collection of Annotations: GO-Scan is a tool that selects and presents relevant Gene Ontology (GO) annotations for a gene "hit" list from an Affymetrix microarray experiment
  • GENETOOLS is a collection of web-based tools on top of a database that brings together information from a broad range of resources, and provides this in a manner particularly useful for genome-wide analyses. Today, the two main tools connected to this database are the NMC Annotation Database V2.0 and eGOn V2.0

Prokaryotic Genome Databases

Other Genome Databases

  • MBGD is a database for comparative analysis of completely sequenced microbial genomes, the number of which is now growing rapidly. The aim of MBGD is to facilitate comparative genomics from various points of view such as orthologue identification, paralogue clustering, motif analysis and gene order comparison.
  • TIGR Microbial Database: a listing of published microbial genomes and chromosomes and those in progress
  • The Integrated Microbial Genomes (IMG) system provides a framework for comparative analysis of the genomes sequenced by the Joint Genome Institute. The IMG produce some very useful web based comparative genomics tools.

Metabolic Databases (see also miscellaneous tools metabolic pathway modelling)

Protein Resources

Databases of Biological Markers & Genetic Disorders

  • The BioCyc Knowledge Library is a collection of Pathway/Genome Databases. Each database describes the genome and metabolic pathways of a single organism, with the exception of the MetaCyc.
    • EcoCyc is a bioinformatics database that describes the genome and the biochemical machinery of E. coli.
    • MetaCyc metabolic pathway database contains pathways from over 150 different organisms
  • Kyoto Encyclopedia of Genes and Genomes, KEGG integrates the information about genes and proteins generated by genome sequencing, functional genomics and proteomics with metabolic pathways
  • The Reactome knowledgebase relies on collaborations with research biologists to construct expert consensus views of key biological processes
    • Skypainter is a tool to determine which events (reactions and/or pathways) are statistically overrepresented in a set of genes as specified by submitted list of identifiers. In other words, given a list of genes, Skypainter can identify common events for these genes
  • Database for Annotation, Visualization and Integrated Discovery (DAVID) is a web-based tool that provides integrated solutions for the annotation and analysis of genome-scale datasets derived from high-throughput technologies such as microarray and proteomic platforms. Analysis results and graphical displays remain dynamically linked to primary data and external data repositories, thereby furnishing in-depth as well as broad-based data coverage. The functionality provided by DAVID accelerates the analysis of genome-scale datasets by facilitating the transition from data collection to biological meaning.
  • National Cancer Institute
  • OMIM -- Online: Mendelian Inheritance in Man: database catalog of human genes and genetic disorders
  • The IGMS is a comprehensive information system that combines the knowledge from genomic sequence, genetic map and genetic disorders databases.
  • PathDB: database on pathologically relevant mutated forms of transcription factors and transcription factor binding sites, held at BIOBASE
  • NCBI plays a major role in facilitating the identification and cataloging of SNPs through its creation and maintenance of the public SNP database (dbSNP)
  • dbSTS is an NCBI resource that contains sequence and mapping data on short genomic landmark sequences or Sequence Tagged Sites
  • Cytokine Online Pathfinder Encyclopaedia: COPE is an encyclopaedic dictionary of information relating to cytokines, compiled by Horst Ibelgaufts
  • Cytokines Web Provides information about cytokines and their receptors.
  • Pathogen Host Interaction Database (PHI-base): this database contains curated molecular and biological information on genes with a published effect on the outcome of host-pathogen interactions. Information is also given on the target sites of some anti-infective chemistries.
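
The kind of over-representation question Skypainter answers (are the genes in a submitted list found in a given event more often than chance would predict?) can be illustrated with a one-sided hypergeometric test. This is only a sketch of the general idea: the listing above does not say which statistic Skypainter actually uses, and the function name and parameters below are hypothetical.

```python
from math import comb


def overrepresentation_p(total_genes, genes_in_event, list_size, hits):
    """Probability of seeing at least `hits` event genes in a random
    list of `list_size` genes drawn from `total_genes` (hypergeometric
    upper tail). Assumed statistic, for illustration only."""
    upper = min(genes_in_event, list_size)
    denominator = comb(total_genes, list_size)
    tail = sum(
        comb(genes_in_event, i)
        * comb(total_genes - genes_in_event, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 annotated genes, 150 of them in the
    # pathway, a submitted list of 200 genes, 12 of which hit the pathway.
    print(overrepresentation_p(20000, 150, 200, 12))
```

A small p-value suggests the event is over-represented in the list; when many events are tested at once, a multiple-testing correction would normally be applied as well.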

Infectious Disease Databases and Resources

  • The primary mission of the BioHealthBase system is to assist scientific researchers in their development of vaccines, therapeutics, and diagnostics. The National Institute of Allergy and Infectious Disease (NIAID) Division of Microbiology and Infectious Diseases (DMID) recognizes the challenge posed by bioterrorism, the emergence of disease due to drug-resistant variants of etiologic organisms. DMID has envisioned a consortium of Bioinformatics Resource Centers (BRCs) for Biodefense and Emerging/Re-emerging Infectious Diseases that will provide information technology (IT) support for experimental studies of pathogenic organisms that could be used for biowarfare and bioterrorist activities, many of which also pose an ongoing threat to public health.

Other Functional and Motif based Secondary Databases

  • small RNA database. Small RNAs are broadly defined as the RNAs not directly involved in protein synthesis. Small RNAs are usually in the 75-400 nucleotide range, although some are as long as a thousand base pairs. They are synthesized by either RNA Polymerase I, II or III.
  • Ribosomal Database Project Online Analyses, The Ribosomal Database Project (RDP) provides ribosome related data services to the scientific community, including online data analysis, rRNA derived phylogenetic trees, and aligned and annotated rRNA sequences
  • TransTerm - Translational Signal Database, a database of sequence contexts about the stop and start codons of many species found in GenBank. TransTerm also contains codon usage data for these same species and summary statistics for the sequences analysed.


  • Blocks are multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins. The Blocks database can be searched in a variety of ways
  • SMART allows the identification and annotation of genetically mobile domains and the analysis of domain architectures. More than 500 domain families found in signalling, extracellular and chromatin-associated proteins are detectable. These domains are extensively annotated
  • InterPro is an integrated documentation resource for protein families, domains and sites. It combines a number of databases  that use different methodologies and a varying degree of biological information on well-characterised proteins to derive protein signatures.
  • TRANSFAC is a database on eukaryotic cis-acting regulatory DNA elements and trans-acting factors at BIOBASE
  • Gene Prediction, Extraction, Description and Analysis Tool PEDANT :Genome analysis and annotation Tool used by many of the automatically annotated databases. Site also lists Computational analysis of complete genomic sequences as well as Experimental and unfinished genomic sequences using PEDANT
  • EXProt (database for EXPerimentally verified Protein functions) is a new non-redundant database containing protein sequences for which the function has been experimentally verified
  • The BioModels Database is a new effort to develop a data resource that will allow biologists to store, search and retrieve published mathematical models of biological interest. The models in the BioModels Database are annotated and linked to relevant data resources, such as publications, databases of compounds and pathways, and controlled vocabularies
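
Codon usage data of the kind TransTerm reports can be pictured as a simple tally of in-frame triplets in a coding sequence. The sketch below is an illustration only, not TransTerm's own method; the function name is hypothetical and the input is assumed to be a coding sequence already in reading frame.

```python
from collections import Counter


def codon_usage(cds):
    """Count in-frame codons in a coding sequence (assumed to start in
    frame); any trailing partial codon is ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


if __name__ == "__main__":
    for codon, count in sorted(codon_usage("ATGGCCGCCTAA").items()):
        print(codon, count)
```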

PCR -Primer Design -siRNA

Collections of Real-Time PCR primer and probe sets

  • Real-Time PCR technical resource page : useful background on real-time systems maintained by M.Tevfik Dorak
  • Real-Time PCR technical tutorial produced by Margaret Hunt at University of South Carolina
  • GeneQuantification web pages, Technische Universität München: describes and summarises all technical aspects involved in quantitative gene expression analysis using real-time qPCR & qRT-PCR, and has a useful listing of interesting papers
  • Lux Primer design and ordering service at Invitrogen for real-time/qPCR and standard PCR applications
  • TM calculator for Designing LDR Probes, as a downloadable Excel template using Nearest Neighbour (NNM) (10-40bp, Breslauer, K.J. et al. (1986)) and the standard calculation of Meinkoth, J. and Wahl, G. (1984)
  • WWW programs for primer design :
  • Primer Banks for Real-time PCR
    • a public database holding real time PCR primers and probes for popular chemistries at University of Gent: RTPrimerDB for gene expression and methPrimerDB for methylation studies
    • Real Time PCR Primer Sets mainly Sybr Green probes listed
    • PrimerBank is a public resource for PCR primers. These primers are designed for gene expression detection or quantification (real-time PCR). PrimerBank contains about 180,000 primers covering most known human and mouse genes.
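
As a rough illustration of the "standard calculation" mentioned for the TM calculator above, the formula usually attributed to Meinkoth and Wahl (1984) can be written as Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 500/L. The sketch below assumes that commonly quoted form and leaves out the formamide correction; it is not taken from the Excel template itself, the function name is hypothetical, and the nearest-neighbour method is not shown.

```python
from math import log10


def tm_meinkoth_wahl(seq, na_molar=1.0):
    """Approximate melting temperature in degrees C using the commonly
    quoted Meinkoth & Wahl style formula (formamide term omitted).
    `na_molar` is the monovalent cation concentration in mol/L."""
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")
    if na_molar <= 0:
        raise ValueError("salt concentration must be positive")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / length
    return 81.5 + 16.6 * log10(na_molar) + 0.41 * gc_percent - 500.0 / length


if __name__ == "__main__":
    # A 20-mer with 50% GC at 1 M Na+ comes out at roughly 77 degrees C.
    print(round(tm_meinkoth_wahl("ATGC" * 5), 1))
```

Formulas of this type are generally applied to longer probes; for short oligonucleotides the nearest-neighbour approach is usually preferred.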

siRNA/RNAi technology

  • Dharmacon Research, Inc. was founded in 1995 to develop and commercialize a new technology for RNA oligonucleotide synthesis. One of the company's main current interests is small inhibitory RNAs. Dharmacon web interface for siRNA design
  • Invitrogen's RNAi Block-iT RNAi Designer page. According to Invitrogen, "The designer uses a rational design scheme based on statistical analysis of multiple validated siRNA training sets and a proprietary algorithm to select unique target sequences that have a markedly improved probability of success in silencing the target gene"
  • MWG offers licensed custom siRNA and pre-designed siRNA synthesis with siMAX™ technology; they also provide a free web-based design tool.
  • See these companies: Akceli, Inc.; Alnylam Pharmaceuticals; Ambion, Inc.; BD Biosciences CLONTECH; Dharmacon, Inc.; Imgenex Corporation; Mirus Corporation; Promega Corporation; QIAGEN N.V.; Sequitur, Inc.; Sirna Therapeutics, Inc.
  • See Nucleic Acid Sequence Databases for links to miRBase and Argonaute

Chemical/Drug databases  

Miscellaneous Databases and Information resources

  • Links to Databases, From this page you can access an ever increasing number of biological and related links. The number of entries in the database currently exceeds 2500.
  • MetaBase, the database of biological databases implemented using MediaWiki
  • genetools is a site providing a useful listing of bioinformatics tools; see the site map for a listing of databases, software utilities etc
  • Microbes.info is an internet gateway portal designed to bring useful and interesting microbiology informational resources
  • Institut für Molekulare Biotechnologie Jena: entry page with a range of useful links

Glossaries of terms

  • 2can EBI Bioinformatics Educational Resources
    • an excellent searchable glossary of terms
    • tutorials
    • EBIs own listing of resources on the internet
  • NHGRI Talking Glossary: the Talking Glossary of Genetic Terms was introduced to help people without scientific backgrounds understand the terms and concepts used in genetic research
  • American Type Culture Collection Home Page ATCC is a global nonprofit bioresource center that provides biological products, technical services
  • Cell Line Data Base, Servizio Biotecnologie, Istituto Nazionale per la Ricerca sul Cancro, Italy. CLDB, the first database set up within the Interlab Project, contains detailed information on 4,850 human and animal cell lines that are available in many Italian laboratories and in some of the most important European cell banks and cell culture collections.
  • CGSC: E.coli Genetic Stock Center The CGSC Database of E.coli genetic information includes genotypes and reference information for the strains in the CGSC collection, gene names, properties, and linkage map, gene product information, and information on specific mutations.
  • Invitrogen in 2005 made available a series of tools to not-for-profit organisations, including Vector NTI, Vector Designer, OligoPerfect Designer and LUX™ Designer, along with other online tools such as clickable pathway maps linked to gene and protein information. iPath™ contains 225 interactive maps of biological signaling and metabolic pathways.

Miscellaneous BioInformatics Tools

  • MSight, created by the Proteome Informatics Group, was specifically developed for the representation of mass spectra along with data from the separation step. The software allows graphical exploration inside huge datasets.
  • biosequence conversion tool is one of the miscellaneous tools available in the EBI toolbox, converts sequence between different file formats
  • EZ-Retrieve: a web server for batch retrieval of coordinate-specified human DNA sequences and underscoring putative transcription factor-binding sites.
  • RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). On average, almost 50% of a human genomic DNA sequence currently will be masked by the program
  • Transcription binding site search tool and viewer. An easy-to-use interface that can be installed and run from your desktop.
  • The Integrative Genomics Viewer (IGV) from the Broad Institute is a fast, flexible viewer for genomic data. IGV visually integrates datasets from various platforms and sources
  • Scriptome project tool box aims to provide experimental biologists with tools for exploring and manipulating biological data. The site provides Perl scripts which can be copied & pasted into an X Windows session to help bench biologists to "eyeball", filter, format, and analyze the many large files they get from those and other programs.
  • In silico experiments with complete bacterial genomes: includes in-silico PCR, Amplified Fragment Length Polymorphism PCR, restriction digestion and virtual Pulsed Field Gel Electrophoresis in 1.2% agarose on the products as well as other restriction enzyme based tools on 148 genomes
  • Tools, Databases: at IMB
  • Pairwise or Multiple sequence alignment
    • ThermonucleotideBLAST is a downloadable source code for searching a target database of nucleic acid sequences using an assay specific query. ThermonucleotideBLAST queries are based on biochemical assays (i.e. a pair of oligonucleotide sequences representing PCR primers or Padlock probes, a triplet of oligos representing PCR primers and a TaqMan probe or a single oligo representing a hybridization probe). Unlike existing programs (i.e. BLAST) which use heuristic measures of sequence similarity for identifying matches between a query and target sequence, ThermonucleotideBLAST uses physically relevant measures of sequence similarity -- free energy and melting temperature.
    • ClustalW at Baylor College of Medicine US, ClustalW is also available from the EBI sequence ToolBox tab and elsewhere
    • BCM Search Launcher: Multiple Sequence Alignments. Includes options for
    • MultAlin : Multiple sequence alignment with hierarchical clustering at INRA France
    • DIALIGN is a novel program for multiple alignment developed by Burkhard Morgenstern et al. It constructs pairwise and multiple alignments by comparing whole segments of the sequences. This approach is very efficient where sequences are not globally related but share only local similarities, as is the case with genomic DNA and with many protein families.
    • MGA allows a direct comparison of the genomic DNA sequences of sufficiently similar organisms by the use of anchored segments. An alignment of 74% of the complete genomes of three strains of E.coli (lengths: 5,528,445; 5,498,450; 4,639,221 bp) is produced in 30 minutes. Must be run locally
    • VMatch software tool for efficiently solving large scale sequence matching tasks. It uses string comparisons and masking to make multiple alignments to reveal unique sequences across many genomes and is very fast. Unfortunately Vmatch must also be run locally.
  • Biological Pathway modelling
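
The masked output described for RepeatMasker above (annotated repeats replaced by Ns) can be pictured with a few lines of code. The sketch below only illustrates the masking step for repeat coordinates that are already known; it does not detect repeats, is not RepeatMasker's own code, and the function name and 0-based, end-exclusive coordinate convention are assumptions made for the example.

```python
def mask_repeats(seq, repeats):
    """Return `seq` with every annotated repeat interval replaced by N.

    `repeats` is an iterable of (start, end) pairs, 0-based and
    end-exclusive (an assumed convention for this illustration).
    """
    masked = list(seq)
    for start, end in repeats:
        # Clamp each interval to the sequence bounds before masking.
        for i in range(max(start, 0), min(end, len(masked))):
            masked[i] = "N"
    return "".join(masked)


if __name__ == "__main__":
    print(mask_repeats("ACGTACGTAC", [(2, 5)]))
```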

BioInformatic & Software Resources

Programming Languages

Java

PERL DOS & HTML

C++

Links pages

on-line journals/periodicals 

Journal Collections & Bibliographic databases (PubMed/BIDS)

 

  • The ISI Web of Knowledge Service for UK Education provides a single route to all the Thomson Scientific products subscribed to by your institution.
  • Science Direct, SGUL has a subscription, however, ScienceDirect also has 60 plus free/complementary Journals listed
  • Scirus is the most comprehensive science-specific search engine available on the Internet, linking to more than 167 million indexed scientific pages and documents. With Scirus you can search through a variety of sources, such as Medline, ScienceDirect, BioMedCentral, preprint servers, patents and web sites relevant for your research.
  • Google Scholar is a new beta search engine able to carry out "deep searches" and is extremely good at specifically finding literature, including peer-reviewed papers, theses, books, preprints, abstracts and technical reports from all broad areas of research.
  • PubMed
  • PubMed Central is a searchable digital archive of life sciences journal literature at the U.S. National Institutes of Health (NIH)
  • An excellent  extensive listing of free on-line medical journals
  • SGUL Journals From May 15th 2006 there will be a new E-Journal only A-Z web list for SGUL users available here:
  • A separate Print A-Z list where you can check all the print journal holdings
  • World-Wide Web Virtual Library: Biosciences
  • World-Wide Web Virtual Library: Biochemistry, Biophysics, and Molecular Biology (Biosciences)
  • Wadsworth-Tuberculosis
  • Journal Impact Factors for 2002-2004 or this link includes 2005 as an excel file
    • On-line medical/educational references

      More are available on the support page for my own yr 2 CBL group, which may also be helpful to other MBBS students: tutoring Group Home Page

      • British National Formulary (requires free registration)
      • Department of Health Providing health and social care policy, guidance and publications.
        • Almost all current and many old DH publications, including statistical reports, surveys, press releases, circulars and legislation, are available in electronic form in this section. To find what you're looking for, type search terms into the publications library (by following the link to the right), or browse by category with the links below.
      • eMC - Electronic Medicines Compendium http://emc.medicines.org.uk/ The eMC provides Data Sheets and Summaries of Product Characteristics (SPCs) for 2,500 medicines licensed in the UK
      • eMedicine has good searchable general resources for learning
      • FlyingPublisher is a gateway to Free Online Medical Journals, Books and Web sites; this is one of the best free online resources for medicine, with over 650 online medical books written for and by doctors
      • Google Scholar is a new beta search engine able to carry out "deep searches" and is extremely good at specifically finding literature, including peer-reviewed papers, theses, books, preprints, abstracts and technical reports from all broad areas of research.
      • Health Protection Agency is an independent body that protects the health and well-being, they provide various resources particularly around infectious diseases
      • Medcyclopaedia™  The Encyclopaedia of Medical Imaging's eight book volumes: Physics, Techniques and Procedures, Normal Anatomy, Musculoskeletal and Soft Tissue Imaging, Gastrointestinal and Urogenital Imaging, Chest and Cardiovascular Imaging, Neuroradiology and Head and Neck Imaging, and Paediatric Imaging. Access is free (copy text and images for non-commercial use provided that you refer to the source)
      • MedlinePlus is one of the services provided by the US National Library of Medicine and the NIH. It provides searchable links to medical resources in the US; the dictionary tab is particularly useful for clarifying terminology and provides cross-indexing to the equivalent UK spelling and terminology.
      • National Library of Medicine
      • National Electronic Library of Health a portal for health issues and information. Athens password required for some resources
      • National Library of Medicine  The Library collects materials in all areas of biomedicine and health care, as well as works on biomedical aspects of technology, the humanities, and the physical, life, and social sciences. It includes searchable information resources for the layman such as MedLinePlus and health professional.
      • National Electronic Library of Health Programme is working with NHS Libraries to develop a digital library for NHS staff, patients and the public, it provides a portal for health issues and information. Athens password required for some resources
      • PBMRT - Bio/Chemical Journals and Newsletters

      • PubMed this site includes PubMed Books which is searched by selecting Books in the drop down next to the search text field
      • Scirus is the most comprehensive science-specific search engine available on the Internet, linking to more than 167 million indexed scientific pages and documents. With Scirus you can search through a variety of sources, such as Medline, ScienceDirect, BioMedCentral, preprint servers, patents and web sites relevant for your research.
      • PathCAL resource is a set of student tutorials on biology, pathology etc, aimed at those learning the basics of disease. Log in with your Athens username and password.
      • Primal Pictures provides an interactive multimedia overview of human anatomy, including a complete 3D model. You will need a St George's Athens account: connect to the Athens website (http://www.athensams.net/myathens), log in with your St George's Athens username and password, click on Resources, and select Ovid Online from the list of resources on screen
      • UpToDate is specifically designed to answer the clinical questions that arise in daily practice. It contains peer-reviewed clinical information across 13 clinical specialties including internal medicine, obstetrics and gynecology, general practice and paediatrics. As of Oct 08 access has ended.
      • Wikipedia: good general information. Web-based, free-content encyclopedia. Please remember some care should be taken to cross-validate internet sources.