# Transformer Lab — For Teams

> Self-hosting and team deployment docs for Transformer Lab.


## for-teams

- [Transformer Lab Documentation](/for-teams.md): Please contact us on Discord if you need any help :)

### advanced-install

- [Setting up Authentication](/for-teams/advanced-install/authentication.md): Transformer Lab supports several authentication methods. Enable one or more of the following providers by setting environment variables in the Transformer Lab .env file.
- [Cloud Storage](/for-teams/advanced-install/cloud-storage.md): Where Does Transformer Lab Store Files
- [Connecting to Email SMTP](/for-teams/advanced-install/email.md): You can configure Transformer Lab to send invite and signup confirmation emails as part of the team invite workflow.
- [SkyPilot Volume Mounts for localfs](/for-teams/advanced-install/skypilot-volume-mounts.md): When you use TFL_STORAGE_PROVIDER=localfs, Transformer Lab expects a shared network filesystem path (for example NFS) that is visible from the jobs it launches.
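
The localfs entry above depends on a shared path being visible and usable wherever jobs run. As a rough illustration only, the sketch below shows one way to sanity-check such a path from a node. The function name `check_shared_path` and the example mount point are hypothetical and are not part of Transformer Lab.

```python
import os
import sys
import tempfile


def check_shared_path(path: str) -> list[str]:
    """Return a list of problems found with a shared storage path (empty if none)."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path is not absolute")
    if not os.path.isdir(path):
        problems.append("path does not exist or is not a directory")
        return problems
    try:
        # Create and automatically delete a scratch file to confirm write access.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        problems.append(f"path is not writable: {exc}")
    return problems


if __name__ == "__main__":
    # Hypothetical mount point; pass your own shared path as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "/mnt/shared"
    issues = check_shared_path(target)
    print("OK" if not issues else "\n".join(issues))
```

Running the same check on the Transformer Lab host and on a compute node is a quick way to spot a mount that exists in one place but not the other. It does not prove that both nodes see the same underlying filesystem.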

### agent-skill

- [Agent Skill (for AI Coding Agents)](/for-teams/agent-skill.md): Transformer Lab ships an agent skill that teaches AI coding agents — Claude Code, and other compatible assistants — how to drive Transformer Lab through the lab CLI.

### architecture

- [Overall Architecture](/for-teams/architecture.md): An overview of Transformer Lab's architecture.

### autoresearch

- [Autoresearch](/for-teams/autoresearch.md): What is Autoresearch?

### cli

- [Transformer Lab CLI](/for-teams/cli.md): Transformer Lab can be accessed via CLI using the lab executable.

### configure-compute

- [AWS](/for-teams/configure-compute/aws.md): The AWS compute provider lets Transformer Lab launch ephemeral EC2 instances for training jobs directly from your AWS account. Each job gets its own instance, which self-terminates when the job finishes or crashes.
- [Azure](/for-teams/configure-compute/azure.md): The Azure compute provider lets Transformer Lab launch ephemeral Azure VMs for training jobs directly from your Azure subscription. Each job gets its own VM, which self-terminates when the job finishes or crashes.
- [dstack](/for-teams/configure-compute/dstack.md): After installing dstack and starting Transformer Lab, follow these steps to add it as a compute provider.
- [GCP](/for-teams/configure-compute/gcp.md): The GCP compute provider lets Transformer Lab launch ephemeral Google Compute Engine VMs for training jobs directly from your Google Cloud project. Each job gets its own VM, which self-terminates when the job finishes or crashes.
- [Runpod](/for-teams/configure-compute/runpod.md): After setting up a Runpod account and API key and starting Transformer Lab, follow these steps to add it as a compute provider.
- [SkyPilot](/for-teams/configure-compute/skypilot.md): After installing SkyPilot and starting Transformer Lab, follow these steps to add it as a compute provider.
- [Slurm](/for-teams/configure-compute/slurm.md): After installing Slurm and starting Transformer Lab, follow these steps to add it as a compute provider.

### faq

- [Deploying Transformer Lab in Airgapped & Secure Environments](/for-teams/faq/air-gapped-systems.md): Research environments with strict security requirements—including "airgapped" systems without internet connectivity—present unique challenges for modern software. At Transformer Lab, we don't expect you to change your security posture for us. Instead, we’ve built our platform to be infrastructure-agnostic, ensuring it respects the boundaries of your internal network.
- [Core Concepts](/for-teams/faq/concepts.md): This page defines the key terms used throughout Transformer Lab and explains how they relate to each other.
- [S3 Alternatives](/for-teams/faq/s3-alternatives.md): While our documentation frequently references AWS S3, it is important to know that S3 is not a hard requirement. Transformer Lab is designed to be storage-agnostic.

### install

- [Install Instructions](/for-teams/install.md): Prerequisites

### install-gpu-orchestrator

- [Installing dstack](/for-teams/install-gpu-orchestrator/install-dstack.md): dstack is an open-source orchestrator for running AI workloads across clouds and on-prem infrastructure.
- [Installing Runpod](/for-teams/install-gpu-orchestrator/install-runpod.md): Runpod is a cloud provider that offers on-demand GPU infrastructure. Transformer Lab can connect to Runpod using an API key that you can generate from your Runpod account.
- [Installing SkyPilot](/for-teams/install-gpu-orchestrator/install-skypilot.md): You can install SkyPilot on the same compute node that Transformer Lab itself runs on.
- [Installing Slurm](/for-teams/install-gpu-orchestrator/install-slurm.md): Production
- [Choosing Between SkyPilot and Slurm](/for-teams/install-gpu-orchestrator/skypilot-vs-slurm.md): Transformer Lab abstracts the job submission process, so for the most part Slurm and SkyPilot work similarly from a user-interface perspective in Transformer Lab.

### interact-examples

- [Run a Jupyter Notebook](/for-teams/interact-examples/jupyter.md): Running a Jupyter Notebook Service
- [Run an Ollama Server](/for-teams/interact-examples/ollama.md): Running an Ollama Server Service
- [Get Direct SSH Access to a Node](/for-teams/interact-examples/ssh.md): Running an SSH Service
- [Interact with a Model using vLLM](/for-teams/interact-examples/vllm.md): Running a vLLM Server Service
- [Run VSCode on a Remote Machine](/for-teams/interact-examples/vscode.md): Running a VSCode Service

### lab-sdk

- [Lab SDK](/for-teams/lab-sdk.md): The Lab SDK is a Python library that provides a simple, unified interface for integrating machine learning scripts with Transformer Lab.

### running-a-service

- [Running an Interactive Service](/for-teams/running-a-service.md): What is an Interactive Service?

### running-a-task

- [Creating Tasks From Scratch](/for-teams/running-a-task/creating-scratch-tasks.md): This guide explains how to create a task from scratch, how task files appear on the compute machine, and how to modify your training scripts so that important outputs are available later in the GUI.
- [Multi-Node Tasks](/for-teams/running-a-task/multi-node-tasks-slurm-skypilot.md): This guide explains how multi-node tasks behave when resources.num_nodes > 1, and how to write task.yaml for the Slurm and SkyPilot providers.
- [Quick Start](/for-teams/running-a-task/quick-start.md): This quick start helps a new Teams user go from login to running their first task.
- [Task Parameters](/for-teams/running-a-task/task-parameters.md): This guide explains how to define and configure task parameters in Transformer Lab. Parameters are used to pass configuration values, hyperparameters, and other settings to your task scripts, which can be accessed via lab.get_config().
- [Task Submission Overview](/for-teams/running-a-task/task-submission.md): This page explains how tasks work in Transformer Lab and how the different pieces fit together. It links out to focused guides for the GUI, CLI, and advanced task configurations.
- [Sweeps](/for-teams/running-a-task/task-submission-advanced.md): Once you are comfortable running single tasks, you can take advantage of parameterization and sweeps to explore many configurations automatically.
- [Task Submission Using AI Agents](/for-teams/running-a-task/task-submission-agent-skill.md): Transformer Lab ships an Agent Skill that lets Claude Code (or any coding agent that supports skills) drive the lab CLI on your behalf. Once installed, you can ask your agent in natural language to create tasks, queue jobs on compute providers, stream logs, and pull down artifacts — without memorizing CLI flags or writing task.yaml by hand.
- [Task Submission Using the CLI](/for-teams/running-a-task/task-submission-cli.md): Transformer Lab provides a CLI called lab for managing tasks and jobs from the terminal.
- [Using Existing Training Scripts Inside Tasks](/for-teams/running-a-task/task-submission-existing-scripts.md): If your tasks already launch your own training or evaluation scripts (via the run: command in task.yaml), you do not need to rewrite those scripts for Transformer Lab.
- [Task Submission Using the GUI](/for-teams/running-a-task/task-submission-gui.md): This guide shows how to submit tasks from the Transformer Lab user interface, using the same building blocks you see in the app.
- [Task YAML Structure](/for-teams/running-a-task/task-yaml-structure.md): This guide explains how to format YAML files for creating tasks in Transformer Lab. Tasks define jobs that run on compute providers and can include training scripts, evaluation scripts, or any other computational workloads.

### sweeps

- [Running a Sweep](/for-teams/sweeps.md): What is a Sweep?
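
A sweep explores many configurations automatically. As a conceptual sketch only, the code below expands a grid of parameter values into one configuration per combination. It is generic Python and does not show Transformer Lab's own sweep syntax. The function name `expand_grid` and the parameter names are hypothetical.

```python
from itertools import product


def expand_grid(grid: dict[str, list]) -> list[dict]:
    """Expand {name: [values...]} into one config dict per combination."""
    names = list(grid)
    # product() yields every combination of values, in the order of `names`.
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]


if __name__ == "__main__":
    # Hypothetical hyperparameters, chosen only for illustration.
    for config in expand_grid({"learning_rate": [1e-4, 3e-4], "batch_size": [8, 16]}):
        print(config)
```

The number of runs is the product of the list lengths, so two values for each of two parameters produces four configurations.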

### update

- [Update Instructions](/for-teams/update.md): To update Transformer Lab's server to the latest version, run the server update command in the CLI.

### viewing-jobs-and-artifacts

- [Working with Checkpoints](/for-teams/viewing-jobs-and-artifacts/checkpoints.md): What is a Checkpoint?
- [Evaluation Results & Chart View](/for-teams/viewing-jobs-and-artifacts/evals-viewing.md): This guide assumes you’ve already run a task that produced evaluation results.

### why

- [Why Should I Use Transformer Lab at My Research Lab?](/for-teams/why.md): Your lab likely relies on Slurm to manage your compute cluster. Or your lab may be using cloud compute services from Azure, AWS, or GCP. Or perhaps you are using a cluster that runs Kubernetes.