| Title: | A Comprehensive Microbiome Data Processing Pipeline |
|---|---|
| Description: | Provides tools for cleaning, processing, and preparing microbiome sequencing data (e.g., 16S rRNA) for downstream analysis. Supports CSV, TXT, and Excel file formats. The main function, ezclean(), automates microbiome data transformation, including format validation, transposition, numeric conversion, and metadata integration. It also handles taxonomic levels efficiently, resolves duplicated taxa entries, and outputs a well-structured, analysis-ready dataset. The companion function ezstat() runs statistical tests and summarizes results, while ezviz() produces publication-ready visualizations. |
| Authors: | Utsav Lamichhane [aut, cre] |
| Maintainer: | Utsav Lamichhane <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.0 |
| Built: | 2026-05-20 06:09:44 UTC |
| Source: | https://github.com/cran/mbX |
Processes microbiome and metadata files (e.g., 16S rRNA sequencing data) to produce an analysis-ready dataset. Supports CSV, TXT, and 'Excel' file formats. This function validates file formats, reads the data, and merges the datasets by the common column 'SampleID'. If a 'Taxonomy' column exists, the data are filtered to include only rows matching the provided taxonomic level.
    ezclean(microbiome_data, metadata, level = "d")
| microbiome_data | A string specifying the path to the microbiome data file. |
|---|---|
| metadata | A string specifying the path to the metadata file. |
| level | A string indicating the taxonomic level for filtering the data (e.g., "genus"). |
A data frame containing the cleaned and merged dataset.
    ## Not run:
    mb <- system.file("extdata", "microbiome.csv", package = "mbX")
    md <- system.file("extdata", "metadata.csv", package = "mbX")
    if (nzchar(mb) && nzchar(md)) {
      cleaned_data <- ezclean(mb, md, "g")
      head(cleaned_data)
    } else {
      message("Sample data files not found.")
    }
    ## End(Not run)
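The filter, transpose, and merge steps described for ezclean() can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy tables, the `g__` rank-prefix convention in the 'Taxonomy' strings, and the sample columns `S1` and `S2` are all assumptions made for illustration, and real input files may be laid out differently.

```r
# Hypothetical toy inputs (assumed layout: one row per taxon, one column per sample)
abundance <- data.frame(
  Taxonomy = c("d__Bacteria;p__Firmicutes;g__Lactobacillus",
               "d__Bacteria;p__Firmicutes",
               "d__Bacteria;p__Bacteroidota;g__Bacteroides"),
  S1 = c(10, 25, 5),
  S2 = c(0, 30, 12),
  stringsAsFactors = FALSE
)
metadata <- data.frame(SampleID = c("S1", "S2"), Group = c("A", "B"),
                       stringsAsFactors = FALSE)

# Keep only rows whose deepest rank matches the requested level (assumed "g__" prefix)
level <- "g"
last_rank <- sub(".*;", "", abundance$Taxonomy)        # last ';'-separated field
keep <- startsWith(last_rank, paste0(level, "__"))
genus_rows <- abundance[keep, ]

# Transpose so that samples become rows and taxa become numeric columns
counts <- t(as.matrix(genus_rows[, c("S1", "S2")]))
colnames(counts) <- sub("^g__", "", last_rank[keep])
samples <- data.frame(SampleID = rownames(counts), counts,
                      row.names = NULL, stringsAsFactors = FALSE)

# Merge with the metadata on the shared 'SampleID' column
cleaned <- merge(metadata, samples, by = "SampleID")
print(cleaned)
```

The result has one row per sample, with the metadata columns first and one abundance column per retained taxon.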
Performs Kruskal-Wallis tests, post-hoc Dunn comparisons, and Compact Letter Display (CLD) summaries, and generates boxplots annotated with CLD letters for taxa abundances grouped by a chosen metadata variable.
    ezstat(microbiome_data, metadata, level, selected_metadata)
| microbiome_data | Character; path to the microbiome abundance table (CSV, TSV, XLS, or XLSX). |
|---|---|
| metadata | Character; path to the sample metadata file (CSV, TXT, XLS, or XLSX). |
| level | Character; taxonomic rank to aggregate at (e.g. "genus", "g"). |
| selected_metadata | Character; name of the categorical metadata column to group by. |
This function first calls ezclean() to produce a cleaned, merged table of sample IDs, metadata, and taxa abundances at the requested taxonomic level. It then:

1. Runs a Kruskal-Wallis test on each taxon and writes the results with FDR correction.
2. Performs Dunn's pairwise post-hoc tests (BH-adjusted) for taxa with a Kruskal-Wallis p-value ≤ 0.05.
3. Computes CLD letters for significantly different groups and writes a summary Excel file.
4. Generates high-resolution (900 dpi) boxplots annotated with CLD letters.

Invisibly returns the data.frame of cleaned sample-taxa abundances used for all analyses.
    ## Not run:
    mb <- system.file("extdata", "microbiome.csv", package = "mbX")
    md <- system.file("extdata", "metadata.csv", package = "mbX")
    if (nzchar(mb) && nzchar(md)) {
      ezstat(mb, md, "genus", "Group")
    }
    ## End(Not run)
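The testing steps described for ezstat() can be sketched with base R alone. This is a rough illustration under stated assumptions, not the function's code: the toy table and the column names `TaxonX` and `TaxonY` are invented, and because Dunn's test is not part of base R, the sketch substitutes BH-adjusted pairwise Wilcoxon rank-sum tests for the post-hoc step. CLD letters, Excel output, and plotting are omitted.

```r
# Hypothetical cleaned table: one grouping column plus one column per taxon
cleaned <- data.frame(
  Group  = rep(c("A", "B", "C"), each = 5),
  TaxonX = c(1, 2, 3, 4, 5, 11, 12, 13, 14, 15, 21, 22, 23, 24, 25),
  TaxonY = c(5, 1, 9, 3, 7, 6, 2, 8, 4, 10, 3.5, 6.5, 1.5, 8.5, 5.5),
  stringsAsFactors = FALSE
)
taxa <- c("TaxonX", "TaxonY")

# Step 1: Kruskal-Wallis test per taxon, then Benjamini-Hochberg (FDR) correction
kw_p <- vapply(taxa, function(tx) {
  kruskal.test(cleaned[[tx]], factor(cleaned$Group))$p.value
}, numeric(1))
kw <- data.frame(taxon = taxa, p_value = kw_p,
                 p_fdr = p.adjust(kw_p, method = "BH"),
                 row.names = NULL, stringsAsFactors = FALSE)

# Step 2: post-hoc comparisons only for taxa with raw KW p <= 0.05
# (pairwise Wilcoxon tests stand in for Dunn's test here)
significant <- kw$taxon[kw$p_value <= 0.05]
posthoc <- lapply(setNames(significant, significant), function(tx) {
  pairwise.wilcox.test(cleaned[[tx]], cleaned$Group,
                       p.adjust.method = "BH")$p.value
})

print(kw)
print(posthoc)
```

In this toy data only `TaxonX` separates the groups, so it is the only taxon carried into the pairwise step.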
Generates publication-ready visualizations for microbiome data. This function first processes the microbiome and metadata files using ezclean(), then creates a bar plot using ggplot2. Supported file formats are CSV, TXT, and 'Excel'. Note: Only one of the parameters top_taxa or threshold should be provided.
    ezviz(
      microbiome_data,
      metadata,
      level,
      selected_metadata,
      top_taxa = NULL,
      threshold = NULL,
      flip = FALSE
    )
| microbiome_data | A string specifying the path to the microbiome data file. |
|---|---|
| metadata | A string specifying the path to the metadata file. |
| level | A string indicating the taxonomic level for filtering the data (e.g., "genus"). |
| selected_metadata | A string specifying the metadata column used for grouping. |
| top_taxa | An optional numeric value indicating the number of top taxa to keep. Use this OR threshold, but not both. |
| threshold | An optional numeric value indicating the minimum threshold value; taxa below this threshold will be grouped into an "Other" category. |
| flip | Logical. If 'TRUE', the order of the stacks is reversed. |
A ggplot object containing the visualization.
    mb <- system.file("extdata", "microbiome.csv", package = "mbX")
    md <- system.file("extdata", "metadata.csv", package = "mbX")
    plot_obj <- ezviz(
      microbiome_data = mb,
      metadata = md,
      level = "genus",
      selected_metadata = "sample_type",
      top_taxa = 20,
      flip = FALSE
    )
    print(plot_obj)
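The choice between top_taxa and threshold can be illustrated with a small base R sketch of how low-abundance taxa might be folded into an "Other" column before plotting. The helper `collapse_taxa()` is hypothetical and not part of mbX, and ranking taxa by their mean abundance across samples is an assumption; ezviz() may use a different criterion.

```r
# Hypothetical helper: collapse minor taxa into a single "Other" column
collapse_taxa <- function(abund, top_taxa = NULL, threshold = NULL) {
  # Mirror the documented rule: supply top_taxa OR threshold, not both
  if (!is.null(top_taxa) && !is.null(threshold)) {
    stop("Provide only one of 'top_taxa' or 'threshold'.")
  }
  if (is.null(top_taxa) && is.null(threshold)) {
    return(abund)  # nothing to collapse
  }
  # Assumed ranking criterion: mean abundance across samples
  means <- sort(colMeans(abund), decreasing = TRUE)
  keep <- if (!is.null(top_taxa)) {
    names(means)[seq_len(min(top_taxa, length(means)))]
  } else {
    names(means)[means >= threshold]
  }
  rest <- setdiff(colnames(abund), keep)
  out <- abund[, keep, drop = FALSE]
  if (length(rest) > 0) {
    out$Other <- rowSums(abund[, rest, drop = FALSE])
  }
  out
}

# Toy abundance table: rows are samples, columns are taxa
abund <- data.frame(A = c(50, 40), B = c(30, 35), C = c(15, 20), D = c(5, 5))
top2   <- collapse_taxa(abund, top_taxa = 2)
by_thr <- collapse_taxa(abund, threshold = 10)
print(top2)
print(by_thr)
```

Both modes preserve each sample's total abundance; they differ only in how the cut between named taxa and "Other" is chosen.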