RTK: efficient rarefaction analysis of large datasets

Authors

  • P. Saary
  • K. Forslund
  • P. Bork
  • F. Hildebrand

Journal

  • Bioinformatics

Citation

  • Bioinformatics 33 (16): 2594-2595

Abstract

  • Motivation: The rapidly expanding microbiomics field is generating increasingly larger datasets, characterizing the microbiota in diverse environments. Although classical numerical ecology methods provide a robust statistical framework for their analysis, software currently available is inadequate for large datasets and some computationally intensive tasks, like rarefaction and associated analysis.
  • Results: Here we present a software package for rarefaction analysis of large count matrices, as well as estimation and visualization of diversity, richness and evenness. Our software is designed for ease of use, operating at least 7x faster than existing solutions, despite requiring 10x less memory.
  • Availability and implementation: C++ and R source code (GPL v.2) as well as binaries are available from https://github.com/hildebra/Rarefaction and from CRAN (https://cran.r-project.org/).
  • Contact: bork@embl.de, falk.hildebrand@embl.de
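
For readers unfamiliar with the term, rarefaction subsamples each sample's counts to a common depth without replacement, so that richness and diversity can be compared across samples of unequal size. The following is a minimal Python sketch of that general idea only; it is not RTK's implementation or API, and the function names (`rarefy`, `richness`) and the toy counts are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=None):
    """Draw `depth` reads without replacement from a {taxon: count} mapping."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the sample's total count")
    rng = random.Random(seed)
    # Expand the counts into one label per read, then subsample the pool.
    # This is simple but memory-hungry; it is for illustration only.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    return Counter(rng.sample(pool, depth))


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for n in counts.values() if n > 0)


# Hypothetical toy sample with 100 reads across four taxa.
sample = {"taxon_a": 50, "taxon_b": 30, "taxon_c": 15, "taxon_d": 5}
rarefied = rarefy(sample, depth=40, seed=1)
print(sum(rarefied.values()), richness(rarefied))
```

Repeating the draw with different seeds and averaging the richness gives a more stable estimate than a single subsample.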


DOI

  • doi:10.1093/bioinformatics/btx206