AWS Integration

Toolchest supports integrating with AWS. If you have files or databases stored on AWS, you can use them with Toolchest calls, provided that Toolchest has access to them.

Input Files

Files stored on AWS's S3 service can be passed in as inputs, using the file's S3 URI. For example:

import toolchest_client as toolchest
toolchest.set_key("YOUR_KEY")

toolchest.kraken2(
    inputs="s3://toolchest-demo-data/SRR16201572_R1.fastq",
    output_path="./",
)

The same call in R:

library(toolchest)
toolchest$set_key("YOUR_KEY")

toolchest$kraken2(
    inputs = "s3://toolchest-demo-data/SRR16201572_R1.fastq",
    output_path = "./"
)
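
A malformed S3 URI is an easy mistake to make when passing remote inputs, so it can help to check the string before starting a run. The following is a minimal sketch using only the Python standard library. `parse_s3_uri` is a hypothetical helper written for this illustration and is not part of the Toolchest client.

```python
from typing import Tuple


def parse_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an S3 URI into (bucket, key).

    Hypothetical helper for illustration; not part of toolchest_client.
    Raises ValueError if the string is not of the form s3://bucket/key.
    """
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"Not an S3 URI (expected s3://...): {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the object key.
    bucket, _, key = uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError(f"S3 URI is missing a bucket name: {uri!r}")
    return bucket, key


print(parse_s3_uri("s3://toolchest-demo-data/SRR16201572_R1.fastq"))
# ('toolchest-demo-data', 'SRR16201572_R1.fastq')
```

The check is purely syntactic: it does not confirm that the object exists or that Toolchest can read it.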

Output to S3

Some tools support uploading outputs directly to your custom S3 bucket. For these runs, specify the S3 bucket/prefix as output_path. For example:

import toolchest_client as toolchest
toolchest.set_key("YOUR_KEY")

toolchest.kraken2(
    inputs="./example.fastq",
    output_path="s3://your-output/your-intended-subfolder",
)

The same call in R:

library(toolchest)
toolchest$set_key("YOUR_KEY")

toolchest$kraken2(
    inputs = "./example.fastq",
    output_path = "s3://your-output/your-intended-subfolder"
)

Custom Databases

If you have a database stored on S3, use the custom_database_path parameter to specify an S3 URI that serves as a subfolder or common prefix for all of your intended database files.

Toolchest needs permission to list and read every object under this S3 URI, because it copies the prefix recursively to transfer all of the database files.

import toolchest_client as toolchest
toolchest.set_key("YOUR_KEY")

toolchest.kraken2(
    inputs="./example.fastq",
    output_path="./example_output_dir",
    custom_database_path="s3://your-databases/your-kraken2-database",
)

The same call in R:

library(toolchest)
toolchest$set_key("YOUR_KEY")

toolchest$kraken2(
    inputs = "./example.fastq",
    output_path = "./example_output_dir",
    custom_database_path = "s3://your-databases/your-kraken2-database"
)

This feature is currently available for Kraken 2.
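
Because everything under the custom_database_path prefix is copied, it can be worth previewing which objects share that prefix before a run. The sketch below is one way to do that, assuming you have the separate boto3 package and your own AWS credentials. `list_database_keys` is a hypothetical helper and not part of the Toolchest client; it takes the S3 client as an argument so that it can be exercised without network access.

```python
from typing import Any, List


def list_database_keys(s3_client: Any, bucket: str, prefix: str) -> List[str]:
    """Return every object key in `bucket` that starts with `prefix`.

    Hypothetical helper for illustration. `s3_client` is expected to behave
    like boto3.client("s3"): it must provide get_paginator("list_objects_v2").
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    keys: List[str] = []
    # list_objects_v2 returns results in pages; the paginator walks all of them.
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects carry no "Contents" entry.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys
```

With boto3 installed and credentials configured, calling `list_database_keys(boto3.client("s3"), "your-databases", "your-kraken2-database")` would list the keys under the example prefix used above. This shows what your own credentials can see; it does not verify Toolchest's access.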

Granting Toolchest Access to Your S3 Bucket

Add the following policy to the bucket that you want Toolchest to access:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Toolchest",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::172533437917:role/toolchest-worker-node-role"
            },
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::YOUR_BUCKET_NAME",
                "arn:aws:s3:::YOUR_BUCKET_NAME/*"
            ]
        }
    ]
}

Replace YOUR_BUCKET_NAME with your bucket's name. The policy uses standard AWS IAM syntax, so you can restrict Toolchest's access to specific files or folders.
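
As an illustration of restricting access to one folder, the sketch below generates either the full-bucket policy shown above or a variant limited to a single prefix. `build_toolchest_bucket_policy` and its `prefix` parameter are hypothetical names chosen for this example. The prefix-restricted form uses the standard `s3:prefix` condition key; whether a given Toolchest feature works under a narrower policy is an assumption to confirm with Toolchest.

```python
import json
from typing import Optional

# Principal ARN copied from the policy shown above.
TOOLCHEST_ROLE_ARN = "arn:aws:iam::172533437917:role/toolchest-worker-node-role"


def build_toolchest_bucket_policy(bucket_name: str, prefix: Optional[str] = None) -> dict:
    """Build a bucket policy granting Toolchest read access.

    Hypothetical helper for illustration. With prefix=None the result matches
    the policy above; otherwise access is limited to objects under `prefix`.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    principal = {"AWS": TOOLCHEST_ROLE_ARN}
    if prefix is None:
        statements = [{
            "Sid": "Toolchest",
            "Effect": "Allow",
            "Principal": principal,
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
        }]
    else:
        prefix = prefix.strip("/")
        statements = [
            {
                # ListBucket applies to the bucket itself, so it is narrowed
                # with a condition on the requested listing prefix.
                "Sid": "ToolchestList",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
                "Condition": {"StringLike": {"s3:prefix": [prefix, f"{prefix}/*"]}},
            },
            {
                # GetObject applies to objects, so it is narrowed by resource ARN.
                "Sid": "ToolchestRead",
                "Effect": "Allow",
                "Principal": principal,
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/{prefix}/*"],
            },
        ]
    return {"Version": "2012-10-17", "Statement": statements}


# Print a policy limited to the example database folder used earlier.
print(json.dumps(
    build_toolchest_bucket_policy("your-databases", "your-kraken2-database"),
    indent=4,
))
```

The printed JSON can be pasted into the bucket's policy editor in the AWS console.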

After you add this policy, send us your bucket's name via email or Slack to complete the setup process.

On-Prem Toolchest

Toolchest also supports an on-prem deployment option contained entirely within your own AWS account. Get in touch with us if you'd like to know more!

