Stephen Woloszynek and I have started to develop an R package, theseus, to help microbiologists and ecologists analyze amplicon sequencing data. Many packages already focus on numerical ecology, two of the most widely used being vegan and phyloseq. Each package (not just those two) has its own focus and, correspondingly, its own strengths and weaknesses.

My intention for theseus is to:

  • develop new approaches, or formalize known ones, for handling and processing microbial community data, starting from raw reads and continuing through read trimming, preprocessing, and denoising
  • augment current functionality in other packages, including expansion of analytical methods and the visualization of results
  • provide users access to published datasets

For those who are interested, theseus (the package) is named for both Theseus (the man) and the thought experiment surrounding his ship.

“Official” DESCRIPTION: An approach to the visualization, analysis, and interpretation of (microbial) community composition data, especially those originating from amplicon sequencing. Analysis techniques include constrained and unconstrained ordination and visualizing taxonomic abundances and spatial patterns, among others. Methods intended to assist bioinformaticians and ecologists in selecting read trimming by quality scores and preprocessing/denoising of datasets are also provided.