pocl-4.0
PoCL is a portable open source (MIT-licensed) implementation of the OpenCL standard (1.2 with some 2.0 features supported).
Dorado is a high-performance, easy-to-use, open source basecaller for Oxford Nanopore reads.
pycistarget is a Python module to perform motif enrichment analysis in sets of regions with different tools and identify high confidence TF cistromes.
SCENIC+ is a Python package to build enhancer driven gene regulatory networks (GRNs) using combined or separate single-cell gene expression (scRNA-seq) and ...
DeepTCR is a Python package that has a collection of unsupervised and supervised deep learning methods to parse TCRSeq data.
IgBLAST facilitates the analysis of immunoglobulin and T cell receptor variable domain sequences.
MrBayes is a program for the Bayesian estimation of phylogeny.
The MEME Suite allows you to: * discover motifs using MEME, DREME (DNA only) or GLAM2 on groups of related DNA or protein sequences, * search sequence datab...
Hypermutation analysis software using BetaRat distribution for Bayesian analysis of the relative probability ratio (RPR) of observing mutations in two conte...
Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads.
Tesseract is an optical character recognition (OCR) engine.
Scientific workflow engine designed for simplicity & scalability.
A Python 3.8+ library for the PubWeb platform.
Seurat is an R package designed for QC, analysis, and exploration of single cell RNA-seq data. fhSeurat module has additional Bioconductor packages for singl...
AGAT: Another GTF/GFF Analysis Toolkit. Suite of tools to handle gene annotations in any GTF/GFF format.
Python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity.
This repository contains a reference implementation of the BGEN format, written in C++. The library can be used as the basis for BGEN support in other soft...
The alleleCount package primarily exists to prevent code duplication between some other projects, specifically AscatNGS and Battenberg. As of v4 the perl co...
iVar is a computational package that contains functions broadly useful for viral amplicon-based sequencing.
GoPeaks is a peak caller designed for CUT&TAG/CUT&RUN sequencing data.
An error model and pipeline for analyzing deep mutational scanning (DMS) data and diagnosing common experimental pathologies.
Spektral is a Python library for graph deep learning. The main goal of this project is to provide a simple but flexible framework for creating graph neural ...
Tcr Receptor Utilities for Solid Tissue (TRUST) is a computational tool to analyze TCR and BCR sequences using unselected RNA sequencing data, profiled from ...
The MEME Suite allows you to: * discover motifs using MEME, DREME (DNA only) or GLAM2 on groups of related DNA or protein sequences, * search sequence datab...
SuperLU is a general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations on high performance machines.
Regenie is a C++ program for whole genome regression modelling of large genome-wide association studies. It is developed and supported by a team of scientis...
This repository contains a reference implementation of the BGEN format, written in C++. The library can be used as the basis for BGEN support in other soft...
Interface to various variant calling formats.
EggNOG-mapper is a tool for fast functional annotation of novel sequences. It uses precomputed orthologous groups and phylogenies from the eggNOG database (h...
This is the RStudio Server version. RStudio is a set of integrated tools designed to help you be more productive with R.
The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyse next-generation resequencing data. The toolkit offers ...
SHAPEIT4 is a fast and accurate method for estimation of haplotypes (aka phasing) for SNP array and high coverage sequencing data.
PyTorch Geometric (PyG) is a geometric deep learning extension library for PyTorch.
splitpipe tool from Parse Biosciences. The pipeline takes FASTQ files and delivers processed data in the form of a cell-gene count matrix, which serves as t...
Delly is an integrated structural variant (SV) prediction method that can discover, genotype and visualize deletions, tandem duplications, inversions and tra...
kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing re...
Scientific workflow engine designed for simplicity & scalability.
GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of partic...
BaseSpace is a powerful website where biologists and informaticians can easily store, analyze, and share genetic data. BaseSpace is a commercial product fro...
R is a free software environment for statistical computing and graphics.
splitpipe tool from Parse Biosciences. The pipeline takes FASTQ files and delivers processed data in the form of a cell-gene count matrix, which serves as t...
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
R is a free software environment for statistical computing and graphics.
GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements. GRIDSS includes a genome-wide break-end assembler, as ...
Cell Ranger is a set of analysis pipelines that process Chromium single-cell RNA-seq output to align reads, generate gene-cell matrices and perform cluster...
Scientific workflow engine designed for simplicity & scalability.
PLINK2 is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally eff...
FlashPCA performs fast principal component analysis (PCA) of single nucleotide polymorphism (SNP) data.
The gdc-client provides several convenience functions over the GDC API which provides general download/upload via HTTPS.
Spectra stands for Sparse Eigenvalue Computation Toolkit as a Redesigned ARPACK. It is a C++ library for large scale eigenvalue problems, built on top of Ei...
RAxML-NG is a phylogenetic tree inference tool which uses maximum-likelihood (ML) optimality criterion. Its search heuristic is based on iteratively perform...
PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally effi...
Built to complement the rich, open source Python community, the Anaconda platform provides an enterprise-ready data analytics platform that empowers compani...
BBMap short read aligner, and other bioinformatic tools.
Metal - Meta Analysis Helper. The METAL software is designed to facilitate meta-analysis of large datasets (such as several whole genome scans) in a conveni...
BAli-Phy estimates multiple sequence alignments and evolutionary trees from DNA, amino acid, or codon sequences.
GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements. GRIDSS includes a genome-wide break-end assembler, as ...
EMAN2 is the successor to EMAN1. It is a broadly based greyscale scientific image processing suite with a primary focus on processing data from transmission...
Space Ranger is a set of analysis pipelines that process Visium spatial RNA-seq output and brightfield microscope images in order to detect tissue, align rea...
Cell Ranger ARC is a set of analysis pipelines that process Chromium Single Cell Multiome ATAC + Gene Expression sequencing data to generate a variety of a...
Topaz is a pipeline for particle picking in cryo-electron micrographs using neural networks and positive-unlabeled learning
PDFCrop is a Perl script that crops the white margins of PDF pages and rescales them to fit a standard size sheet of paper. It makes the printed pages far m...
RELION (for REgularised LIkelihood OptimisatioN) is a stand-alone computer program for Maximum A Posteriori refinement of (multiple) 3D reconstructions or 2...
Trinity represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-Seq data. Trinity combines three independen...
RevBayes provides an interactive environment for statistical computation in phylogenetics. It is primarily intended for modeling, simulation, and Bayesian i...
EPACTS is a versatile software pipeline to perform various statistical tests for identifying genome-wide association from sequence data through a user-frien...
SvABA - Structural variation and indel analysis by assembly
R is a free software environment for statistical computing and graphics.
Program for inferring admixture proportions and doing PCA with a single NGS sample. Inferences are based on a reference panel.
StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts
The SRA Toolkit, and the source-code SRA System Development Kit (SDK), will allow you to programmatically access data housed within SRA and convert it from...
Scientific workflow engine designed for simplicity & scalability.
STAR aligns RNA-seq reads to a reference genome using uncompressed suffix arrays.
medaka is a tool to create a consensus sequence from nanopore sequencing data.
Seurat is an R package designed for QC, analysis, and exploration of single cell RNA-seq data. fhSeurat module has additional Bioconductor packages for singl...
PyClone is a Python package that wraps rclone and provides a threaded interface for an installation at the host or container level.
The EIGENSOFT package combines functionality from our population genetics methods (Patterson et al. 2006) and our EIGENSTRAT stratification correction meth...
ClustalW2 is a general purpose multiple sequence alignment program for DNA or proteins.
Fast individual ancestry inference from DNA sequence data leveraging allele frequencies from multiple populations. iAdmix Using population allele frequencie...
A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high perform...
FACTERA (Fusion And Chromosomal Translocation Enumeration and Recovery Algorithm) is a tool for detection of genomic fusions in paired-end targeted (or genome-wide)...
The WiggleTools package allows genomewide data files to be manipulated as numerical functions, equipped with all the standard functional analysis operators (...
Raw sequences produced by next generation sequencing (NGS) machines may contain adapter, linker, barcode and fingerprint sequences. TagDust2 is a program to ...
MAGeCK-VISPR is a comprehensive quality control, analysis and visualization workflow for CRISPR/Cas9 screens. The workflow combines the MAGeCK algorithm to id...
The Giotto package consists of two modules, Giotto Analyzer and Viewer, which provide tools to process, analyze and visualize single-cell spatial expression ...
Project Homepage: ancestry
A generalized CRISPR guideRNA design tool.
cDNA_Cupcake is a miscellaneous collection of Python and R scripts used for analyzing sequencing data.
SQANTI3 is the first module of the Functional IsoTranscriptomics (FIT) framework, that also includes IsoAnnot and tappAS. Used for new long read-defined tra...
DNA methylation is a major epigenetic modification regulating several biological processes. A standard approach in the study of DNA methylation is bisulfite...
STAR aligns RNA-seq reads to a reference genome using uncompressed suffix arrays.
A tool for CNV discovery and genotyping from depth-of-coverage by mapped reads
A basic Python-based EGA download client
sbt (Scala’s Simple Build Tool) is an interactive build tool. Define your tasks in Scala and run them in parallel from sbt’s interactive shell.
Filtering and trimming of long read sequencing data.
Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
Aggregate results from bioinformatics analyses across many samples into a single report. MultiQC searches a given directory for analysis logs and compiles a ...
A tool to map bisulfite converted sequence reads and determine cytosine methylation states
Medaka is a tool to create a consensus sequence of nanopore sequencing data.
MOFA is a factor analysis model that provides a general framework for the integration of multi-omic data sets in a completely unsupervised fashion.
Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies. Previous attempts by other bioinforma...
A fast and effective stochastic algorithm to infer phylogenetic trees by maximum likelihood.
TraCeR reconstructs the sequences of rearranged and expressed T cell receptor genes from single-cell RNA-seq data. It then uses the TCR sequences to identify...
Project Homepage: TensorFlow
Seqtk is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format. It seamlessly parses both FASTA and FASTQ files which can also be...
Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type, zygosity, alternative allele and Indel length.
The Levenshtein Python C extension module contains functions for fast computation of: Levenshtein (edit) distance and edit operations, string similarity, appr...
t-distributed stochastic neighbor embedding (t-SNE) is widely used for visualizing single-cell RNA-sequencing (scRNA-seq) data, but it scales poorly to large...
A Python package for counting antibody TAGS from a CITE-seq and/or cell hashing experiment.
The smallgenomeutilities are a collection of scripts that are useful for handling and manipulating NGS data of small viral genomes. They are written in Python ...
RevBayes provides an interactive environment for statistical computation in phylogenetics. It is primarily intended for modeling, simulation, and Bayesian in...
Filtering and trimming of long read sequencing data.
Project Homepage: STAR
The Vienna RNA Package consists of a C code library and several stand-alone programs for the prediction and comparison of RNA secondary structures.
An alignment and variant-calling pipeline for Illumina deep sequencing of HIV-1, based on the probabilistic aligner HMMER
A bioinformatics tool to PRe-process and show INformation of SEQuence data.
Tools for processing BAM files; bamsormadup, bamcollate2, bammarkduplicates, bammaskflags, bamrecompress, bamsort, bamtofastq
LEMON stands for Library for Efficient Modeling and Optimization in Networks. It is a C++ template library providing efficient implementations of common data...
Nextflow is a bioinformatics workflow manager that enables the development of portable and reproducible workflows. It supports deploying workflows on a varie...
R is a free software environment for statistical computing and graphics.
BBMap short read aligner, and other bioinformatic tools.
The Chromium Single Cell ATAC Software Suite is a complete package for analyzing and visualizing single cell chromatin accessibility data produced by the Chr...
Boost provides free peer-reviewed portable C++ source libraries.
R is a free software environment for statistical computing and graphics.
Program proj is a standard Unix filter function which converts geographic longitude and latitude coordinates into cartesian coordinates
parasail is a SIMD C (C99) library containing implementations of the Smith-Waterman (local), Needleman-Wunsch (global), and semi-global pairwise sequence ali...
Qcat is a Python command-line tool for demultiplexing Oxford Nanopore reads from FASTQ files.
Implements a class of univariate and multivariate spatial generalised linear mixed models for areal unit data, with inference in a Bayesian setting using Mar...
The Gurobi Optimizer allows users to state their toughest business problems as mathematical models, and then automatically considers billions or even trillio...
Universal Command Line Environment for AWS. Includes the awscli-plugin-endpoint package.
R is a free software environment for statistical computing and graphics. Built for Ubuntu 18.04.
STAR-Fusion is a component of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT). STAR-Fusion uses the STAR aligner to identify candidate fusion transc...
Software package for signal-level analysis of Oxford Nanopore sequencing data.
Chromium Single Cell Software Suite is a set of software applications for analyzing and visualizing single cell 3’ RNA-seq data produced by the 10x Genomics ...
SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-p...
A C library for reading/writing high-throughput sequencing data. HTSlib also provides the bgzip, htsfile, and tabix utilities.
The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyse next-generation resequencing data. The toolkit offers a...
BamTools provides both a programmer’s API and an end-user’s toolkit for handling BAM files.
Guppy software supports MinIT and MinION instruments from Nanopore Technologies
Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical comput...
R is a free software environment for statistical computing and graphics.
BEAST is a cross-platform program for Bayesian MCMC analysis of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies infe...
SPRING is a collection of pre-processing scripts and a web browser-based tool for visualizing and interacting with high dimensional data. View an example dat...
umap is installed as a Python module. Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisatio...
Python is a programming language that lets you work more quickly and integrate your systems more effectively.
Larry Wall’s Practical Extraction and Report Language
Pplacer places reads on a phylogenetic tree. guppy (Grand Unified Phylogenetic Placement Yanalyzer) yanalyzes them. rppr is a helpful tool for working with r...
OCaml is a general purpose industrial-strength programming language with an emphasis on expressiveness and safety. Developed for more than 20 years at Inria ...
FastANI is developed for fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI). ANI is defined as mean nucleotide identity of ort...
Prodigal (Prokaryotic Dynamic Programming Genefinding Algorithm) is a microbial (bacterial and archaeal) gene finding program developed at Oak Ridge National...
GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes. It is computationally efficient and design...
The latest Fred Hutch Python build with over 470 modules.
Python is a programming language that lets you work more quickly and integrate your systems more effectively. Basic Python package to be used as base package...
The LLVM Core libraries provide a modern source- and target-independent optimizer, along with code generation support for many popular CPUs (as well as some ...
Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of diffe...
The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.
We often have to convert between sequence formats and do little tasks on them, and it’s not worth writing scripts for that. Seqmagick is a kickass little uti...
Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. It is a distributed collabo...
The gdc-client provides several convenience functions over the GDC API which provides general download/upload via HTTPS.
PEAR is an ultrafast, memory-efficient and highly accurate pair-end read merger. It is fully parallelized and can run with as low as just a few kilobytes of ...
HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using pro...
Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal samp...
The program PHASE implements a Bayesian statistical method for reconstructing haplotypes from population genotype data. Documentation: http://stephenslab.uch...
beagle-lib is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages.
MUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form. AMOS makes use of it.
The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyse next-generation resequencing data. The toolkit offers a...
FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with u...
FastQC is a quality control application for high throughput sequence data. It reads in sequence data in a variety of formats and can either provide an intera...
FLASH (Fast Length Adjustment of SHort reads) is a very fast and accurate software tool to merge paired-end reads from next-generation sequencing experiments...
Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
Accelerated BLAST compatible local sequence aligner
Perl binding for MySQL
Bioperl is the product of a community effort to produce Perl code which is useful in biology. Examples include Sequence objects, Alignment objects and databa...
BEAST is a cross-platform program for Bayesian MCMC analysis of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies infe...
Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human g...
Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of diffe...
BEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archivi...
BBMap short read aligner, and other bioinformatic tools.
Variant Effect Predictor (VEP) determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and p...
MariaDB is an enhanced, drop-in replacement for MySQL.
Perl binding for MySQL
Bioperl is the product of a community effort to produce Perl code which is useful in biology. Examples include Sequence objects, Alignment objects and databa...
Perl library for reading files using HTSlib including BAM/CRAM, Tabix and BCF database files
BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data. BBTools can handle common sequencing file ...
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM, BAM and CRAM formats, written by Heng ...
HTSlib is an implementation of a unified C library for accessing common file formats, such as SAM, CRAM and VCF, used for high-throughput sequencing data, an...
BamTools provides both a programmer’s API and an end-user’s toolkit for handling BAM files. The BAM Format is a binary format for storing sequence data.
The BEDTools allow a fast and flexible way of comparing large datasets of genomic features. The BEDtools utilities allow one to address common genomics tasks...
BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF.
Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of ab...
SHAP package added to Python build: Python-3.6.5-foss-2016b-fh3. SHAP (SHapley Additive exPlanations) is a unified approach to explain the output of any machi...
AGFusion is a Python package for annotating gene fusions from the human or mouse genomes. AGFusion simply needs the reference genome, the two gene partners, ...
JupyterLab is the next-generation web-based user interface for Project Jupyter. The Jupyter Notebook is an open-source web application that allows you to cre...
cisTEM is user-friendly software to process cryo-EM images of macromolecular complexes and obtain high-resolution 3D reconstructions from them.
R is a free software environment for statistical computing and graphics. This is our biggest and best R release ever. Over 835 R packages and Bioconductor pa...
Industrial-strength natural language processing in Python
The UniRef90 database is downloaded to /shared/biodata/ncbi-blast/uniref90.fasta. To be used with PSI-BLAST and to support PSIPRED.
The Basic Local Alignment Search Tool (BLAST) finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to ...
The PSIPRED Protein Sequence Analysis Workbench aggregates several UCL structure prediction methods into one location.
LoFreq is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. It makes full use of base-call qualities an...
The XHMM (eXome-Hidden Markov Model) software suite calls copy number variation from next generation sequencing projects, where exome capture was used or tar...