Convert TSV file to BIOM file

Context

This conversion tool transforms an abundance table in TSV format into a BIOM file. BIOM is a standard format in microbial ecology pipelines; it stores abundances, taxonomic metadata, and sequence information in a structured way. The tool is especially useful for bringing manually processed or exported TSV data back into a BIOM-compatible workflow.

How it works

The program parses the input TSV file that contains abundance data and metadata. If provided, it also integrates an additional TSV file that holds information about multiple taxonomic affiliations, which can occur when sequences have ambiguous assignments. Based on these inputs, the script reconstructs a BIOM file that preserves the hierarchical structure of taxonomies. If sequences are included in the TSV under the seed_sequence tag, the program can export them into a FASTA file.
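The sequence export step can be pictured with a minimal Python sketch. It is not the tsv_to_biom.py implementation, only an illustration of turning rows that carry a seed_sequence value into FASTA records. The observation_name column, the sample column, and the function name tsv_to_fasta are assumptions made for this example; only seed_sequence is taken from the description above.

```python
import csv
import io


def tsv_to_fasta(tsv_text, id_column="observation_name", seq_column="seed_sequence"):
    """Build FASTA text from a tab-separated table (illustrative helper, not part of FROGS)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    records = []
    for row in reader:
        sequence = (row.get(seq_column) or "").strip()
        if not sequence:
            continue  # rows without a sequence produce no FASTA record
        records.append(">{}\n{}\n".format(row[id_column], sequence))
    return "".join(records)


# Hypothetical two-row table: the column layout is assumed, not the real FROGS export.
example = (
    "observation_name\tseed_sequence\tsampleA\n"
    "Cluster_1\tACGTACGT\t12\n"
    "Cluster_2\t\t3\n"
)
fasta = tsv_to_fasta(example)
print(fasta, end="")
```

In this sketch, a row with an empty seed_sequence is simply skipped; how the real tool treats such rows is not stated here.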

Configuration: 16S V3V4 Swarm

sbatch -J tsv_to_biom -o LOGS/tsv_to_biom.out -e LOGS/tsv_to_biom.err -c 8 --export=ALL --wrap="module load devel/Miniforge/Miniforge3 && module load bioinfo/FROGS/FROGS-v5.0.2 && tsv_to_biom.py --input-tsv FROGS/SWARM/affiliation.tsv --input-multi-affi FROGS/SWARM/multi_aff.tsv --output-biom FROGS/SWARM/affiliation_bis.biom --output-fasta FROGS/SWARM/affiliation.fasta --log-file FROGS/SWARM/tsv_to_biom.log && module unload bioinfo/FROGS/FROGS-v5.0.2"
(to see all settings: tsv_to_biom.py --help)