Collection of software and tutorials

####Biology

#####Sequence alignment

Description:SeqAn is an open source C++ library of efficient algorithms and data structures for the analysis of sequences with the focus on biological data. Our library applies a unique generic design that guarantees high performance, generality, extensibility, and integration with other libraries. SeqAn is easy to use and simplifies the development of new software tools with a minimal loss of performance.

#####Pathway analysis

Protein attribute and protein-protein interactions

Description: PredictProtein integrates feature prediction for secondary structure, solvent accessibility, transmembrane helices, globular regions, coiled-coil regions, structural switch regions, B-values, disorder regions, inter-residue contacts, protein-protein and protein-DNA binding sites, sub-cellular localization, domain boundaries, beta-barrels, cysteine bonds, metal binding sites and disulphide bridges.

Evaluation:  Not done yet.

Usage: Web server and Debian package

Pros: see mirror

Cons: see mirror

Publication:http://www.ncbi.nlm.nih.gov/pubmed/15215403[ref:718, 20111127]

Demands: Academic users get three free uses per year.

Mirror:http://hpdb.hbu.cn/thesis/2005/jy/index.asp

 

6.GenomeTools

URL:http://genometools.org/

Description: The GenomeTools genome analysis system is a free collection of bioinformatics tools (in the realm of genome informatics: visualization, mapping, repeats, genome browser) combined into a single binary named gt. It is based on a C library named “libgenometools” which consists of several modules.

 

7.GenomeThreader

URL:http://genomethreader.org/

Description:GenomeThreader is a software tool to compute gene structure predictions. The gene structure predictions are calculated using a similarity-based approach where additional cDNA/EST and/or protein sequences are used to predict gene structures via spliced alignments. GenomeThreader was motivated by disabling limitations in GeneSeqer, a popular gene prediction program which is widely used for plant genome annotation.

Evaluation: Used for plant gene prediction, for example by PlantGDB.

Usage: Free of charge for non-commercial usage.

Publication: http://dl.acm.org/citation.cfm?id=1709691.1709739[ref:47, 20111127]

 

 

 

 

9.Mobyle

URL:https://projets.pasteur.fr/wiki/mobyle

Description:Mobyle is a framework and web portal specifically aimed at the integration of bioinformatics software and databanks. An integrative platform can do many things.

Publication:http://dx.doi.org/10.1093/bioinformatics/btp493[ref:44, 20111125]

 

10.FunNet

URL:http://www.funnet.info/

Description:FunNet is an original integrative tool for exploring transcriptional interactions in microarray gene expression datasets. The analytical approach implemented in FunNet relies on knowledge extracted from public annotation databases to improve the biological relevance of the modular interaction patterns identified in co-expression networks.

Evaluation: worth a try

 

11.Expression Profiler at the EBI

URL:http://www.ebi.ac.uk/expressionprofiler/

Description:Expression Profiler: Next Generation is an open, extensible web-based collaborative platform for microarray gene expression, sequence and PPI data analysis, exposing distinct chainable components for clustering, pattern discovery, statistics (through R), machine-learning algorithms and visualization.

Publication:nar.oxfordjournals.org/content/32/suppl_2/W465.full[ref:96, 20111125]

Evaluation: great

 

12.EGAN: Exploratory Gene Association Networks

URL:http://akt.ucsf.edu/EGAN/

Description:EGAN is a software tool that allows a bench biologist to visualize and interpret the results of high-throughput exploratory assays in an interactive hypergraph of genes, relationships (protein-protein interactions, literature co-occurrence, etc.) and meta-data (annotation, signaling pathways, etc.). EGAN provides comprehensive, automated calculation of meta-data coincidence (over-representation, enrichment) for user- and assay-defined gene lists, and provides direct links to web resources and literature (NCBI Entrez Gene, PubMed, KEGG, Gene Ontology, iHOP, Google, etc.).

Publication:http://bioinformatics.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=19933825[ref:9, 20111125]

Evaluation: interesting

 

#####Genome browser

13.IGV

URL:http://www.broadinstitute.org/igv/

Description:The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated datasets. It supports a wide variety of data types including sequence alignments, microarrays, and genomic annotations.

Evaluation: Very good.

14.Pathway Commons

URL:http://www.pathwaycommons.org/pc/home.do

Description: Browse and search pathways across multiple valuable public pathway databases. Download an integrated set of pathways in BioPAX format for global analysis.

Evaluation:Looks great, but no graphical output. It has an API and can work as a Cytoscape plugin.

Publication:http://nar.oxfordjournals.org/content/early/2010/11/10/nar.gkq1039.abstract[ref:19, 20111125]

 

15.QuickGO

URL:http://www.ebi.ac.uk/QuickGO/

Description:QuickGO is a web-based tool that allows easy browsing of the Gene Ontology (GO) and all associated electronic and manual GO annotations provided by the GO Consortium annotation groups. QuickGO offers a range of facilities including bulk downloads of GO annotation data which can be extensively filtered by a range of different parameters and GO slim set generation.

Feature: parent terms, child terms, co-occurring terms.

Evaluation: Very good

 

16.CGAP GO browser

URL:http://cgap.nci.nih.gov/Genes/GOBrowser

Description:With the CGAP GO browser, you can browse through the GO vocabularies, and find human and mouse genes assigned to each term.

Evaluation: maybe useful.

 

17.STRAP

URL:http://www.bumc.bu.edu/cardiovascularproteomics/cpctools/strap/

Description:STRAP, the Software Tool for Rapid Annotation of Proteins, saves you time by automatically annotating a protein list with information that helps you meaningfully interpret your mass spectrometry data.

Publication:pubs.acs.org/doi/abs/10.1021/ac901335x[ref:19, 20111125]

 

18.Manatee

URL:http://manatee.sourceforge.net/

Description:Manatee is a web-based tool used to perform manual functional annotation. It has been specifically designed to optimize the ability of curators to evaluate all available sequence-based and experimental data to assign the best possible annotation to a given gene product.
Manatee allows users to view, modify, and store annotation through interactions with an underlying relational database where all of the information is stored. Manatee supports the storage of multiple types of functional annotation including protein names, gene symbols, EC numbers, Gene Ontology terms, and associated supporting evidence. In addition, Manatee provides summary views of statistics and information from the genome as a whole.

Evaluation: JCVI and TIGR

 

19.PINGO

URL:http://www.psb.ugent.be/esb/PiNGO/Home.html

Description:PiNGO is a Java-based tool to easily find unknown genes in a network that are significantly associated with user-defined target Gene Ontology (GO) categories. PiNGO is implemented as a plugin for Cytoscape, a popular open source software platform for visualizing and integrating molecular interaction networks. PiNGO predicts the categorization of a gene based on the annotations of its neighbors, using the enrichment statistics of its sister tool BiNGO. Networks can either be selected from the Cytoscape interface or uploaded from file. The main advantage of PiNGO is its flexibility. PiNGO also takes full advantage of Cytoscape’s versatile visualization environment.

 

20.LEMONE

URL:http://bioinformatics.psb.ugent.be/software/details/LeMoNe

Description:LeMoNe is a software package for Learning Module Networks from gene expression data.

Evaluation: Not known

21.ENIGMA

URL:http://bioinformatics.psb.ugent.be/ENIGMA/

Description: ENIGMA is a software tool to extract gene expression modules from perturbational microarray data, based on the use of combinatorial statistics and graph-based clustering. The modules are further characterized by incorporating other data types, e.g. GO annotation, protein interactions and transcription factor binding information, and by suggesting regulators that might have an effect on the expression of (some of) the genes in the module.

Feature: A little old.

 

22.G-SESAME

URL:http://bioinformatics.clemson.edu/G-SESAME/index.php

Description:Gene Semantic Similarity Analysis and Measurement Tools.

Feature: Compare two genes or GO terms for their semantic similarity. Very slow.

 

23.ToppGene Suite

URL:http://toppgene.cchmc.org/

Description:A one-stop portal for gene list enrichment analysis and candidate gene prioritization based on functional annotations and protein interaction networks.

Feature: qi

 

24.TXTGate

URL:http://tomcat.esat.kuleuven.be/txtgate/

Description:TXTGate is a literature index database and is part of an experimental platform to evaluate (combinations of) information extraction and indexing from a variety of biological annotation databases. It is designed towards the summarization and analysis of groups of genes based on text.

Feature:

Publication: http://genomebiology.com/2004/5/6/R43[ref:64, 20111125]

 

 

25.DAVID

URL:http://david.abcc.ncifcrf.gov/

Description:

  Identify enriched biological themes, particularly GO terms
  Discover enriched functional-related gene groups
  Cluster redundant annotation terms
  Visualize genes on BioCarta & KEGG pathway maps
  Display related many-genes-to-many-terms on 2-D view.
  Search for other functionally related genes not in the list
  List interacting proteins
  Explore gene names in batch
  Link gene-disease associations
  Highlight protein functional domains and motifs
  Redirect to related literature
  Convert gene identifiers from one type to another.
  And more

Feature:

Evaluation: great

 

26.FunSpec

URL:http://funspec.med.utoronto.ca/

Description:FunSpec (an acronym for “Functional Specification”) inputs a list of yeast gene names, and outputs a summary of functional classes, cellular localizations, protein complexes, etc. that are enriched in the list.

 

27.FunCluster

URL:http://corneliu.henegar.info/FunCluster.htm

Description:“FunCluster” is a genomic data analysis algorithm which performs functional analysis of gene expression data obtained from cDNA microarray experiments. Besides automated functional annotation of gene expression data, FunCluster functional analysis aims to detect co-regulated biological processes through a specially designed clustering procedure involving biological annotations and gene expression data.

Feature: Algorithm may be useful.

 

28.FuncAssociate 2.0

URL:http://llama.med.harvard.edu/cgi/func/funcassociate

Description:FuncAssociate is a web-based tool that accepts as input a list of genes, and returns a list of GO attributes that are over- (or under-) represented among the genes in the input list.
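
Over-representation of an attribute in a gene list is commonly scored with a one-sided hypergeometric (Fisher-type) test. The sketch below illustrates that general idea only; it is not FuncAssociate's own code, the function names are made up for this example, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_upper_tail(population: int, annotated: int, sample: int, hits: int) -> float:
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the attribute."""
    max_hits = min(annotated, sample)
    total = comb(population, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return favourable / total


def enriched_terms(gene_list, annotations, background, alpha=0.05):
    """Return (term, p_value) pairs with p < alpha, smallest p first.

    `annotations` maps a term to the set of genes annotated with it.
    Assumption of this sketch: raw p-values, no multiple-testing correction.
    """
    background = set(background)
    genes = set(gene_list) & background
    results = []
    for term, members in annotations.items():
        members = set(members) & background
        hits = len(genes & members)
        if hits == 0:
            continue  # a term with no hits cannot be over-represented
        p = hypergeom_upper_tail(len(background), len(members), len(genes), hits)
        if p < alpha:
            results.append((term, p))
    return sorted(results, key=lambda pair: pair[1])
```

With a background of 10 genes, a term annotating 3 of them, and an input list consisting of exactly those 3 genes, the p-value is 1/120, so the term is reported at the default threshold.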

 

29.BLAST2GO

URL:http://www.blast2go.com/b2ghome

Description:An ALL in ONE tool for functional annotation of (novel) sequences and the analysis of annotation data.

 

30.ImmPort

URL:https://www.immport.org/immportWeb/home/home.do?loginType=full

Description:ImmPort, the Immunology Database and Analysis Portal, is a one stop shop to access reference and experiment data for immunologists. ImmPort provides advanced information technology support in the production, analysis, archiving, and exchange of scientific data for the diverse community of life science researchers supported by NIAID/DAIT.

Feature:

 

31.STEM

URL:http://gene.ml.cmu.edu/stem/

Description:The Short Time-series Expression Miner (STEM) is a Java program for clustering, comparing, and visualizing short time series gene expression data from microarray experiments (~8 time points or fewer). STEM allows researchers to identify significant temporal expression profiles and the genes associated with these profiles and to compare the behavior of these genes across multiple conditions.

Feature: Designed for short series of about 8 time points or fewer.

Evaluation: looks good

 

32.GeneMANIA

URL:http://www.genemania.org/

Description:GeneMANIA finds other genes that are related to a set of input genes, using a very large set of functional association data. Association data include protein and genetic interactions, pathways, co-expression, co-localization and protein domain similarity. You can use GeneMANIA to find new members of a pathway or complex, find additional genes you may have missed in your screen or find new genes with a specific function, such as protein kinases. Your question is defined by the set of genes you input.

Feature:Cytoscape plugin or web server

Evaluation: OpenHelix tutorial available. The results differ considerably from those of Pathway Studio.

 

 

33.BLIP

URL:http://www.blipkit.org/

Description:Blip is a collection of logic programming modules intended primarily for bioinformatics and biomedical applications, although it contains some modules which may be of more general interest. Blip is intended to be both an application library, and a deductive database/query system. Blip is written in SWI-Prolog, a fast, robust and scalable implementation of ISO Prolog.

Feature:

Evaluation:

 

34.GraphWeb

URL:http://biit.cs.ut.ee/graphweb/index.cgi

Description:GraphWeb is a public web server for graph-based analysis of biological networks that:

  • analyses directed and undirected, weighted and unweighted heterogeneous networks of genes, proteins and microarray probesets for many eukaryotic genomes;
  • integrates multiple diverse datasets into global networks;
  • incorporates multispecies data using gene orthology mapping;
  • filters nodes and edges based on dataset support, edge weight and node annotation;
  • detects gene modules from networks using a collection of algorithms;
  • interprets discovered modules using Gene Ontology, pathways, and cis-regulatory motifs.

Publication:http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2447774/[ref:27, 20111125]

 

35.AVADIS

URL:http://www.strandls.com/Avadis

Description:The collection of algorithms and visualizations in AVADIS® grows as new applications using the platform are developed. Currently, the algorithms that AVADIS® platform contains range from general purpose statistical mining and modelling algorithms, to text mining algorithms, to very application-specific algorithms for microarray/NGS data analysis, QSAR modelling and biological networks analysis. AVADIS® has a collection of powerful mining algorithms like PCA, ANOVA, T-test, clustering, classification and regression methods.

Feature:

 

 

36.BGI WEGO

URL:http://wego.genomics.org.cn/cgi-bin/wego/index.pl

Description:Web Gene Ontology Annotation Plotting. It has become one of the daily tools for downstream gene annotation analysis, especially when performing comparative genomics tasks.

Feature: It can compare annotation of several gene lists.

 

37.Whatizit

URL:http://www.ebi.ac.uk/webservices/whatizit/info.jsf

Description: Whatizit is a text processing system that allows you to do text-mining tasks on text. The tasks are defined by the pipelines in the drop-down list of the submission window, and the text can be pasted into the text area. The description of each individual task/pipeline can be found by following the link next to the submit button. Whatizit is also a Medline abstracts retrieval/search engine. Instead of providing the text by copy and paste, you can launch a Medline search. The abstracts that match your search criteria are retrieved and processed by a pipeline of your choice.

 

38.G2D

URL:http://www.ogic.ca/projects/g2d_2/

Description:G2D is a database of candidate genes for mapped inherited human diseases. Candidate priorities are automatically established by a data mining algorithm that extracts putative genes in the chromosomal region where the disease is mapped, and evaluates their possible relation to the disease based on the phenotype of the disorder.

 

39.Sequence format conversion

http://sequenceconversion.bugaco.com/converter/biology/sequences/

http://www.agapow.net/software/bioscripts.convert

http://biowiki.org/StockholmTools

40.VISTA

URL:http://genome.lbl.gov/vista/index.shtml

Description:VISTA is a comprehensive suite of programs and databases for comparative analysis of genomic sequences. There are two ways of using VISTA - you can submit your own sequences and alignments for analysis (VISTA servers) or examine pre-computed whole-genome alignments of different species.

 

41.Expasy

URL:http://www.expasy.org/

42.EBI

 

43.Anno-J

URL:http://www.annoj.org/

Description:Anno-J is a Web 2.0 application designed for visualizing deep sequencing data and other genome annotation data. It is intended to run in modern W3C compliant browsers, and allows flexible configuration of plugins and data streams from providers located anywhere on the internet.

 

 

44.JMP life science

URL:http://www.jmp.com/support/downloads/life_sciences_documentation/documentation.shtml

Description:Welcome to JMP Life Sciences, a powerful desktop software system for integrated statistical analysis of clinical, genetic marker, microarray, and spectral (proteomics and metabolomics, for example) data. JMP Life Sciences software consists of more than 200 independent analytical procedures (APs). The purpose of this manual is to provide you with informative examples of how to use JMP Life Sciences software to extract the maximum amount of useful information from your data.

 

45.Mahout

URL:http://mahout.apache.org/

Description:The Apache Mahout™ machine learning library’s goal is to build scalable machine learning libraries.

 

 

46.MATLAB (MathWorks)

URL:http://www.mathworks.cn/index.html

 

47.TINKER

URL:http://dasher.wustl.edu/ffe/

Description:The TINKER molecular modeling software is a complete and general package for molecular mechanics and dynamics, with some special features for biopolymers. TINKER has the ability to use any of several common parameter sets, such as Amber (ff94, ff96, ff98, ff99, ff99SB), CHARMM (19, 22, 22/CMAP), Allinger MM (MM2-1991 and MM3-2000), OPLS (OPLS-UA, OPLS-AA), Merck Molecular Force Field (MMFF), Liam Dang’s polarizable model, and the AMOEBA (2004, 2009) polarizable atomic multipole force field. Parameter sets for other widely-used force fields are under consideration for future releases.

 

48.vigyaan

URL:http://www.vigyaancd.org/

Description:At present the following ready to use software comes on VigyaanCD: Arka/GP, Artemis, Bioperl, BLAST (NCBI-tools), ClustalW/ClustalX, Cn3D, EMBOSS tools, Garlic, Glimmer, GROMACS, Ghemical, GNU R, Gnuplot, GIMP, ImageMagick, Jmol, MPQC, MUMmer, NJPlot, Open Babel, Octave, PSI3, PyMOL, Ramachandran plot viewer, Rasmol, Raster3D, Seaview, TINKER, XDrawChem, Xmgr and Xfig. GNU C/C++/Fortran compilers and additional Linux tools (such as ps2pdf) are also available.

 

49.ghemical

URL:http://www.bioinformatics.org/ghemical/ghemical/index.html

Description:Ghemical is a computational chemistry package.

 

50.MPQC

URL:http://www.mpqc.org/

Description:MPQC is the Massively Parallel Quantum Chemistry Program. It computes properties of atoms and molecules from first principles using the time-independent Schrödinger equation. It runs on a wide range of architectures ranging from individual workstations to symmetric multiprocessors to massively parallel computers.

 

51.GLIMMER

URL:http://www.cbcb.umd.edu/software/glimmer/

Description:Glimmer is a system for finding genes in microbial DNA, especially the genomes of bacteria, archaea, and viruses. Glimmer (Gene Locator and Interpolated Markov ModelER) uses interpolated Markov models (IMMs) to identify the coding regions and distinguish them from noncoding DNA.
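
To make the Markov-model idea concrete, here is a minimal sketch that trains one fixed-order Markov chain on coding sequences and one on noncoding sequences, then scores a candidate region by its log-odds. This is a deliberate simplification and not Glimmer's algorithm: a real IMM interpolates between several context lengths, whereas this example assumes a single fixed order with add-one smoothing. All function names are hypothetical.

```python
from collections import defaultdict
from math import log

ALPHABET = "ACGT"


def train_markov(sequences, order=2, pseudocount=1.0):
    """Estimate P(next base | previous `order` bases) with additive smoothing.

    Returns a function prob(context, base). Fixed order only: this sketch
    does not interpolate between context lengths.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0

    def prob(context, base):
        ctx = counts.get(context, {})  # unseen contexts fall back to uniform
        total = sum(ctx.values()) + pseudocount * len(ALPHABET)
        return (ctx.get(base, 0.0) + pseudocount) / total

    return prob


def log_odds(seq, coding_prob, noncoding_prob, order=2):
    """Sum of log(P_coding / P_noncoding) over positions with a full context.

    A positive score means the coding model explains the sequence better.
    """
    score = 0.0
    for i in range(order, len(seq)):
        context, base = seq[i - order:i], seq[i]
        score += log(coding_prob(context, base) / noncoding_prob(context, base))
    return score
```

For example, with toy training data such as `train_markov(["ATGGCGGCGGCGGCG"])` for coding and `train_markov(["ATATATATATATAT"])` for noncoding, a GC-rich query scores positive and an AT repeat scores negative.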

 

52.STADEN

URL:http://staden.sourceforge.net/

Description:A package, free to academics (commercial users are charged), that includes sequence assembly, trace viewing/editing and sequence analysis tools. It also includes a GUI to the free EMBOSS suite.

 

53.NetSurfP

Description:Protein Surface Accessibility and Secondary Structure Predictions

 

54.NetTurnP

Description: Prediction of Beta-turn regions in protein sequences

 

55.MODELLER

Description: Used for homology or comparative modeling of protein three-dimensional structures

 

56.AutoDock

Description:Suite of Automated Docking Tools

57.Gromacs

URL:http://www.gromacs.org

Description: A molecular dynamics package primarily designed for biomolecular systems such as proteins and lipids. GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.
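
As an illustration of what "simulating the Newtonian equations of motion" means numerically, the sketch below integrates m·x'' = F(x) for a single particle in one dimension with the velocity Verlet scheme. It is a generic textbook integrator chosen for this example, not code from GROMACS, and the harmonic force and all names are assumptions made for the demonstration.

```python
def velocity_verlet(position, velocity, force, mass, dt, steps):
    """Integrate m * x'' = F(x) for one particle in one dimension.

    Returns the list of (position, velocity) pairs, including the start.
    """
    x, v = position, velocity
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt   # advance position
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # advance velocity with averaged acceleration
        a = a_new
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Toy force field for the example: a spring with stiffness k."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy harmonic system."""
    return 0.5 * mass * v * v + 0.5 * k * x * x
```

Running `velocity_verlet(1.0, 0.0, harmonic_force, 1.0, 0.01, 1000)` and comparing `total_energy` at the first and last step shows the energy staying close to its initial value, which is the property that makes this family of integrators popular for molecular dynamics.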

 

58.Pymol

URL:http://en.wikipedia.org/wiki/Pymol

Description:PyMOL is an open-source, user-sponsored molecular visualization system created by Warren Lyford DeLano and commercialized by DeLano Scientific LLC, a private software company dedicated to creating useful tools that become universally accessible to scientific and educational communities. It can produce high-quality 3D images of small molecules and biological macromolecules, such as proteins. According to the author, almost a quarter of all published images of 3D protein structures in the scientific literature were made using PyMOL.

 

59.STING

URL:http://www.cbi.cnptia.embrapa.br/SMS

Description:STING (Sequence To and withIN Graphics) is a free Web-based suite of programs for a comprehensive analysis of the relationship between protein sequence, structure, function, and stability.

 

60.MEME

URL:http://meme.nbcr.net

Description:Motif-based sequence analysis tools

 

61.Cluster

URL:http://www.rocksclusters.org/wordpress/

 

62.Geneinfo

URL:http://code.google.com/p/geneinfo/

Description:Application to retrieve gene information from various sources

 

63.ProtFun

URL:http://www.cbs.dtu.dk/services/ProtFun/

Description:The ProtFun 2.2 server produces * ab initio * predictions of protein function from sequence. The method queries a large number of other feature prediction servers to obtain information on various post-translational and localizational aspects of the protein, which are integrated into final predictions of the cellular role, enzyme class (if any), and selected Gene Ontology categories of the submitted sequence.

 

64.Annotea

URL:http://www.w3.org/2001/Annotea/

Description:Annotea is a W3C LEAD (Live Early Adoption and Demonstration) project under Semantic Web Advanced Development (SWAD).

 

65.Protein structure

URL:http://predictioncenter.org/

 

 

66.BWA

 

67.BOWTIE

 

68.KAKS

 

69.Phostcon

 

70.EBI global gene search: global keyword search, specific genes and their families.

 

71.UGENE

URL:http://ugene.unipro.ru/

Description:

  • New tools: MrBayes, BWA (all platforms), update of Bowtie and BLAST tools
  • Short reads assembly viewer: performance and reads coloring improvements
  • Work with large data sets: open, view and annotate huge DNA files on a usual desktop
  • Workflow designer: new data filtering elements
  • Sequence viewer: new DNA flexibility and GC Frame Plot graphs
  • All in one package: download UGENE, documentation and external tools in a single file!

72.Phobos

URL:http://www.ruhr-uni-bochum.de/spezzoo/cm/cm_phobos.htm

Description:a tandem repeat search tool for complete genomes

 

73.TRedD

URL:http://tandem.sci.brooklyn.cuny.edu/

Description:Tandem repeats database and finding software

 

74.CREAD

URL:http://rulai.cshl.edu/cread/index.shtml

Description:Comprehensive Regulatory Element Analysis and Discovery

 

75.miRfocus

URL:http://www.mirfocus.org

76.mummer

URL:http://mummer.sourceforge.net/

77.Pathway related
URL:http://www.genecloud.org/#1098693
Description:Gene Cloud (genecloud.org) is a novel tool presenting gene-gene associations based on the scientific literature. It was developed by the Knockout Mouse Repository (www.komp.org) to help our customers find products related to other products they chose. We have built a detailed graph model of gene-gene associations based on how many times two genes are cited in the same article. If two genes are cited in many papers together, they are considered strongly connected.
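
The co-citation idea described here can be sketched in a few lines: count, for each pair of genes, the number of articles that mention both. This is only an illustration of the counting principle, not Gene Cloud's actual graph model, and the function names are invented for the example.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(articles):
    """Count, for every unordered gene pair, how many articles cite both genes.

    `articles` is an iterable of gene collections, one per article.
    Pairs are stored as alphabetically sorted tuples.
    """
    counts = Counter()
    for genes in articles:
        # set() ensures a gene mentioned twice in one article counts once
        for pair in combinations(sorted(set(genes)), 2):
            counts[pair] += 1
    return counts


def strongest_partners(counts, gene, top=3):
    """Return the genes most often co-cited with `gene`, strongest first."""
    partners = Counter()
    for (a, b), n in counts.items():
        if a == gene:
            partners[b] += n
        elif b == gene:
            partners[a] += n
    return partners.most_common(top)
```

Given three made-up articles citing `["TP53", "MDM2"]`, `["TP53", "MDM2", "ATM"]` and `["ATM", "CHEK2"]`, the pair (MDM2, TP53) gets a count of 2, so MDM2 is the strongest partner of TP53.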

 

78.Pathway Enrichment Analysis Tools

GSEA
DAVID
GOMiner
Babelomics
MAPPFinder
GOStats
Ontotools
GOTM
FunSpec
GeneMerge
FuncAssociate
GOToolBox
GFINDer
WebGestalt
GOAL
Pathway Explorer
PLAGE
t-profiler
WebBayGO
JProGO
ADGO
GeneTrail
GAZER
PathExpress

Picture related tools:

1.protein domain

http://prosite.expasy.org/cgi-bin/prosite/mydomains/

2.Exon-intron

http://wormweb.org/exonintron

3.Map protein domain to gene exon and get record from NCBI

http://code.google.com/p/variationtoolkit/

4.bioGPS

5.bioGraph

 

Network

1.Pajek

http://vlado.fmf.uni-lj.si/pub/networks/pajek/

CHENTONG
