# Cloud Storage

## Where Does Transformer Lab Store Files?
Transformer Lab runs as a central "coordinator" node but dispatches workloads to "worker" nodes. All of these nodes (the workers and the coordinator) need a common view of a shared storage directory. This directory can live in the cloud (usually recommended) or on shared storage that is mounted on all nodes at a common path (e.g. using NFS).

If you use our `s3` or `gcs` storage options, Transformer Lab mounts the bucket automatically; you don't have to mount any drives yourself. If you use our `localfs` storage engine instead, you point it at a directory that appears to be a local path but is mounted at the operating-system level on NFS or another shared storage system.
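The requirement above boils down to this: every node must see the same directory at the same path. A minimal sanity check each node could run is sketched below; the function name and probe-file scheme are illustrative, not part of Transformer Lab:

```python
import os
import socket
import tempfile

def check_shared_workspace(path):
    """Return True if this node can see and write to the shared directory."""
    if not os.path.isdir(path):
        return False
    # Write a uniquely named probe file, then clean it up.
    probe = os.path.join(path, f".probe-{socket.gethostname()}-{os.getpid()}")
    try:
        with open(probe, "w") as f:
            f.write("ok")
        os.remove(probe)
        return True
    except OSError:
        return False

# Run with the same path on the coordinator and every worker;
# all of them should print True.
print(check_shared_workspace(tempfile.gettempdir()))
```

If any node prints `False`, fix the mount before starting Transformer Lab, since workloads dispatched to that node will not see the shared workspace.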
## AWS S3 Storage

To use AWS S3 as remote storage:

1. Set `TFL_REMOTE_STORAGE_ENABLED=true` in your `.env` file.
2. Configure AWS credentials for the `transformerlab-s3` profile.

### Using AWS CLI (Recommended)

If you have the AWS CLI installed, run:

```bash
aws configure --profile transformerlab-s3
```

Enter your AWS Access Key ID, Secret Access Key, default region, and output format when prompted.
### Manual Configuration

Create or edit the AWS credentials file at `~/.aws/credentials` and add:

```ini
[transformerlab-s3]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
```

Ensure the profile has the necessary permissions to create and manage S3 buckets.
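To sanity-check the credentials file before starting Transformer Lab, you could parse it with Python's standard library. This helper is hypothetical, not part of Transformer Lab, but the file format it checks is the standard AWS INI layout shown above:

```python
import configparser
import os

def has_profile(credentials_path="~/.aws/credentials", profile="transformerlab-s3"):
    """Check that the credentials file defines the profile with both required keys."""
    config = configparser.ConfigParser()
    config.read(os.path.expanduser(credentials_path))  # missing file reads as empty
    return (profile in config
            and "aws_access_key_id" in config[profile]
            and "aws_secret_access_key" in config[profile])
```

If this returns `False`, the `aws configure` step above either wasn't run or wrote a different profile name, which is a quick way to isolate credential problems.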
## Google Cloud Storage (GCS)

To use Google Cloud Storage instead of AWS S3:

1. Set `TFL_REMOTE_STORAGE_ENABLED=true` in your `.env` file.
2. Set `REMOTE_WORKSPACE_HOST=gcp` in the same `.env` file.
3. Optionally, set `GCP_PROJECT` to specify the Google Cloud project. If not set, it defaults to `transformerlab-workspace`.
4. Configure Google Cloud credentials:

### Using gcloud CLI (Recommended)

If you have the Google Cloud CLI installed, authenticate and set the project:

```bash
gcloud auth application-default login
gcloud config set project transformerlab-workspace  # or your project name
```

### Manual Configuration
Set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable to the path of your service account key JSON file:

```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service-account-key.json"
```

You can obtain a service account key from the Google Cloud Console under IAM & Admin > Service Accounts.
Ensure the service account has the necessary permissions for Cloud Storage operations (Storage Admin or equivalent).
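Before launching, you can verify that the variable points at a plausible service-account key. The small check below (illustrative, not part of Transformer Lab) only inspects the file locally and makes no API calls, so it won't catch permission problems, only a missing or malformed key file:

```python
import json
import os

def check_gcp_key(path=None):
    """Return True if the key file exists and has service-account fields."""
    path = path or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    if not os.path.isfile(path):
        return False
    try:
        with open(path) as f:
            key = json.load(f)
    except (OSError, json.JSONDecodeError):
        return False
    # Service-account key files downloaded from the console carry these fields.
    return key.get("type") == "service_account" and "private_key" in key
```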
## Azure Blob Storage

To use Azure Blob Storage instead of AWS S3 or GCS:

1. Set `TFL_REMOTE_STORAGE_ENABLED=true` in your `.env` file.
2. Set `TFL_STORAGE_PROVIDER=azure` in the same `.env` file.
3. Configure Azure credentials using one of the following approaches:

### Option A: Connection String (Simplest)

Set the `AZURE_STORAGE_CONNECTION_STRING` environment variable in your `.env` file:

```bash
AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=your_account;AccountKey=your_key;EndpointSuffix=core.windows.net"
```

You can find your connection string in the Azure Portal under Storage account → Access keys.
### Option B: Account Name + Key

Set the storage account name and access key separately:

```bash
AZURE_STORAGE_ACCOUNT="your_account_name"
AZURE_STORAGE_KEY="your_account_key"
```

### Option C: Account Name + SAS Token

If you prefer to use a Shared Access Signature (SAS) token instead of the full account key:

```bash
AZURE_STORAGE_ACCOUNT="your_account_name"
AZURE_STORAGE_SAS_TOKEN="your_sas_token"
```

Ensure the SAS token has sufficient permissions for read, write, list, and delete operations on containers and blobs.
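Because the three options use overlapping environment variables, it helps to be clear about which one you have actually configured. A sketch of how a loader might decide, assuming the variable names above (the function and its precedence order are illustrative, not Transformer Lab's actual logic):

```python
import os

def azure_auth_mode(env=None):
    """Return which Azure credential option is configured, or None."""
    env = os.environ if env is None else env
    if env.get("AZURE_STORAGE_CONNECTION_STRING"):
        return "connection_string"  # Option A
    if env.get("AZURE_STORAGE_ACCOUNT") and env.get("AZURE_STORAGE_KEY"):
        return "account_key"        # Option B
    if env.get("AZURE_STORAGE_ACCOUNT") and env.get("AZURE_STORAGE_SAS_TOKEN"):
        return "sas_token"          # Option C
    return None
```

Setting only `AZURE_STORAGE_ACCOUNT` with neither a key nor a SAS token leaves you with no usable credentials, so pick one option and set its variables completely.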
## Local Storage

To use a shared filesystem (e.g. NFS) that is accessible via a local path:

1. Set `TFL_STORAGE_PROVIDER=localfs` in your `.env` file.
2. Set `TFL_STORAGE_URI=/path/to/your/shared/folder` in the same `.env` file.
3. Remove the line `TFL_REMOTE_STORAGE_ENABLED=true` from your `.env` file if it exists.
4. If you run tasks with SkyPilot, configure hostPath volume mounts so your `TFL_STORAGE_URI` is available inside SkyPilot task pods. See SkyPilot Volume Mounts for localfs.
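Taken together, the first three steps leave a `.env` along these lines (the path is an example; use whatever directory is mounted at the same location on all of your nodes):

```bash
# .env for localfs storage over a shared mount (example path)
TFL_STORAGE_PROVIDER=localfs
TFL_STORAGE_URI=/mnt/shared/transformerlab
# Note: TFL_REMOTE_STORAGE_ENABLED=true must NOT be present in this mode.
```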