A new pipeline to identify microbial life history traits!
This pipeline requires the software below to be installed. Click on the icons for the installation instructions.
To run this pipeline simply execute the following command. Additional parameters are described below.
nextflow run jotech/mlht
The input file must be in CSV format with two columns, specifying the id and the path to the assembly file. The first line of the file must be a header. You can find an example of the input file in assets/samples.csv. The following is an example of a valid input file:
id,file
SAM-ID,/path/to/assembly.fasta
Each id must be unique and must not contain spaces. The path to the assembly file can point to a local file or a remote file (e.g. over FTP or HTTP).
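The rules above can be checked before launching a run. The following is a minimal sketch, not part of the pipeline: `validate_samplesheet` is a hypothetical helper name, and it assumes the header is exactly `id,file` as in the example and that paths contain no commas.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the pipeline): checks a sample sheet
# against the rules described above. Assumes the header is exactly "id,file".
validate_samplesheet() {
  local sheet="$1"
  awk -F',' '
    BEGIN { bad = 0 }
    NR == 1 {
      if ($1 != "id" || $2 != "file") { print "line 1: header must be id,file"; bad = 1 }
      next
    }
    /^[[:space:]]*$/   { next }                                   # ignore blank lines
    NF != 2            { print "line " NR ": expected 2 columns"; bad = 1; next }
    $1 == ""           { print "line " NR ": empty id"; bad = 1 }
    $1 ~ /[[:space:]]/ { print "line " NR ": id contains spaces"; bad = 1 }
    seen[$1]++         { print "line " NR ": duplicate id " $1; bad = 1 }
    $2 == ""           { print "line " NR ": empty file path"; bad = 1 }
    END {
      if (NR == 0) { print "file is empty"; bad = 1 }
      exit bad                                                    # 0 = valid, 1 = problems found
    }
  ' "$sheet"
}

# Usage (prints one message per problem and returns non-zero if any are found):
#   validate_samplesheet assets/samples.csv
```

The function only inspects the text of the sheet; it does not check that the listed assembly files exist, since remote paths are allowed.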
The first time you run the pipeline it will download the required databases to a folder called dbs. Then, you can pass the databases as parameters to avoid the download step. This is how the databases are passed to the pipeline:
- Bakta: --bakta_db ./dbs/bakta
- Eggnog: --eggnog_db ./dbs/eggnog
- Antismash: --antismash_db ./dbs/antismash
- Platon: --platon_db ./dbs/platon
- dbCAN: --dbcan_db ./dbs/dbcan
- Kofamscan: --kofam_profiles ./dbs/kofam/profiles --kofam_ko_list ./dbs/kofam/ko_list
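Typing all of these flags by hand is error-prone, so one option is to assemble them from whatever is already present in the database folder. The sketch below is an illustration only: `mlht_db_args` is a hypothetical helper name, and it assumes the folder layout matches the paths listed above. Databases that are missing are simply left out, so the pipeline would download them as usual.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the pipeline): prints, one token per
# line, the database flags listed above for every database that already
# exists under the given folder (default: ./dbs).
mlht_db_args() {
  local dbs="${1:-./dbs}"
  local name
  for name in bakta eggnog antismash platon dbcan; do
    if [ -d "$dbs/$name" ]; then
      printf '%s\n' "--${name}_db" "$dbs/$name"
    fi
  done
  # Kofamscan needs both the profiles folder and the KO list.
  if [ -d "$dbs/kofam/profiles" ] && [ -e "$dbs/kofam/ko_list" ]; then
    printf '%s\n' --kofam_profiles "$dbs/kofam/profiles" \
                  --kofam_ko_list "$dbs/kofam/ko_list"
  fi
}

# Usage (bash 4+; paths containing newlines are not supported):
#   mapfile -t db_args < <(mlht_db_args ./dbs)
#   nextflow run jotech/mlht "${db_args[@]}"
```

The `-d` and `-e` tests follow symbolic links, so the helper works whether the database folder holds links into work/ or real copies.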
The dbs/ folder consists of symbolic links to the work/ folder. If you would like to delete work/ but keep the databases, you can do so by running the following commands:
cp -r --dereference dbs/ hard_dbs/
rm -rf work/ && rm -rf dbs/ && mv hard_dbs/ dbs/
It is often useful to cache the conda environment to avoid downloading the same packages multiple times. To do so, you can set the following environment variable:
export NXF_CONDA_CACHEDIR=/path/to/conda/environment/cache/directory/
If you use this pipeline, please cite the following paper:
Zimmermann, J., Mendoza-Mejía, N., et al. (2024). A new pipeline to identify microbial life history traits.
The paper is not yet published; please contact the authors of this repo for further information.