Translating Tools to Toolchest

You probably run computational biology software on the command line like this:

$ kraken2 --db ./langmead_lab_standard/ --threads 8 --minimum-base-quality 25 --report ./kraken2_output/kraken2_report.txt example.fastq 1> ./kraken2_output/kraken2_output.txt

With Toolchest, the above translates to:

Python:

toolchest.kraken2(
    inputs="./example.fastq",
    output_path="./kraken2_output/",
    tool_args="--minimum-base-quality 25",
)

R:

toolchest$kraken2(
    inputs = "./example.fastq",
    output_path = "./kraken2_output/",
    tool_args = "--minimum-base-quality 25"
)

Because the Toolchest client is a Python or R library, the syntax looks different from the command line, but the functionality is the same. Command-line arguments become parameters of the Toolchest function call. Let's walk through how to translate your command to Toolchest.

Inputs

You can specify input files as if you were running the tool on your own machine. This is usually handled by the inputs parameter, though the exact syntax differs for a few tools.

Outputs

Any output files generated will be downloaded to a single directory. You can specify either the output directory or the output file path itself with the output_path parameter. (Whether this is a path to a directory or file will differ from tool to tool.)

If the directory doesn't exist yet, Toolchest will create it for you.
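Because output_path can point to either a directory or a single file, a small check after the run can show what was downloaded. The following is a minimal sketch using only the Python standard library; list_outputs is a hypothetical helper, not part of the Toolchest client.

```python
from pathlib import Path


def list_outputs(output_path):
    """Return the files found at output_path, sorted by path.

    Hypothetical helper, not part of the Toolchest client. It accepts
    either a directory or a single file, since the page notes that
    output_path can be one or the other depending on the tool.
    """
    root = Path(output_path)
    if not root.exists():
        return []  # nothing has been downloaded yet
    if root.is_file():
        return [root]
    # Walk the directory recursively and keep regular files only.
    return sorted(p for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    for path in list_outputs("./kraken2_output/"):
        print(path)
```

Run it after the Toolchest call returns to confirm that the files you expect are present.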

Databases

If a tool needs a database (e.g., a reference genome), you can specify it with database_name and database_version. These parameters take the place of a local database path such as --db.

For each tool that requires a database, Toolchest provides a selection of databases that are ready to use.
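As an illustration, the kraken2 call from the top of this page could select a database as sketched below. This is a sketch under assumptions: run_kraken2 is a hypothetical wrapper, and the values "standard" and "1" in the usage comment are placeholders, so check the tool's documentation page for the real database options. The client is passed in as an argument so the function itself makes no assumption about how it is imported.

```python
def run_kraken2(client, database_name, database_version):
    """Run the page's kraken2 example against a chosen database.

    Hypothetical wrapper. `client` is the imported Toolchest client;
    the keyword parameters are the ones named on this page.
    """
    return client.kraken2(
        inputs="./example.fastq",
        output_path="./kraken2_output/",
        tool_args="--minimum-base-quality 25",
        # These two parameters replace the --db path from the command line.
        database_name=database_name,
        database_version=database_version,
    )


# Usage (assumed import name and placeholder database values):
#   import toolchest_client as toolchest
#   run_kraken2(toolchest, "standard", "1")
```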

Other Arguments

We take care of translating the command line arguments for input, output, and infrastructure configuration. That means you don't need to set --threads, --report, or --db.

All other custom arguments will be handled by the tool_args parameter in the form of a string. This contains everything you'd usually pass on the command line (e.g., "--minimum-base-quality 25").

For certain tools, some arguments may be handled by their own specific parameters.

To make sure we're running the software at its best, some arguments may not be accepted in tool_args.
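If you are converting an existing command, the rule above can be sketched as a small filter: drop the flags Toolchest manages and keep the rest as the tool_args string. strip_managed_flags is a hypothetical helper, not part of the Toolchest client, and it assumes each managed flag takes exactly one value, written as either "--flag value" or "--flag=value". Pass it only the options portion of the command, without the input files or shell redirection.

```python
import shlex

# Flags this page says Toolchest sets for you.
MANAGED_FLAGS = {"--threads", "--report", "--db"}


def strip_managed_flags(options):
    """Return the options string with managed flags and their values removed.

    Hypothetical helper, not part of the Toolchest client.
    """
    kept = []
    skip_next = False
    for token in shlex.split(options):
        if skip_next:
            # This token is the value of a managed flag; drop it too.
            skip_next = False
            continue
        name = token.split("=", 1)[0]
        if name in MANAGED_FLAGS:
            # In the "--flag value" form, the value is the next token.
            skip_next = "=" not in token
            continue
        kept.append(token)
    return shlex.join(kept)


options = (
    "--db ./langmead_lab_standard/ --threads 8 "
    "--minimum-base-quality 25 "
    "--report ./kraken2_output/kraken2_report.txt"
)
print(strip_managed_flags(options))  # --minimum-base-quality 25
```

The result is the same string passed as tool_args in the example at the top of this page.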

Tool-Specific Details

Each tool has a documentation page listing:

  • input syntax
  • output syntax and output files
  • database options
  • supported arguments

Check out the relevant page for more tool-specific info!

