Transformer Lab

Our mission is to accelerate the pace of machine learning.

We work at the forefront of machine learning and AI research, both through our own research lab and through the tools we build in partnership with some of the world's best labs.


§1

Research

Transformer Lab is dedicated to exploring the frontier of artificial intelligence. We conduct research across diverse domains in machine learning and publish our findings in the open.

The lab's defining properties are velocity and versatility. We take on challenges across distinct domains of machine learning, with a bias toward novelty and a deep love for the technically intriguing.

§2

Publications

Selected results from our lab's recent work.

  1. Asaria, Salomone, Gandhi. A Cross-Model VLM-Judge Protocol for Single-Image 3D Mesh Quality (and Why Cheap Proxies Fall Short). arXiv preprint, Jun 2026. [3D]  arXiv:2606.18451
    A standardized evaluation protocol for single-image-to-3D mesh generators, using 24-view rendering and position-bias correction — and showing that common proxies like CLIP similarity and geometry-validity metrics don't substitute for a VLM judge.
  2. Asaria, Salomone, Gandhi. Reliable Neural-Codec Text-to-Speech by ASR Self-Verification and Distillation: Near-Zero Catastrophic Failures Across Models and Codecs. arXiv preprint, Jun 2026. [AUDIO]  arXiv:2606.18323
    ASR-based self-verification drives catastrophic failures (silence, early termination, repetition) to near zero in autoregressive neural-codec TTS, then distills the behavior for inference-time efficiency — generalizing across four TTS systems and three codecs.
  3. Asaria, Salomone, Gandhi. Neither Parallel Nor Sequential: How DiffusionGemma Actually Commits Tokens. arXiv preprint, Jun 2026. [LLM]  arXiv:2606.14620
    A close look at token-commitment patterns in DiffusionGemma 26B. Contrary to parallel-decoding marketing, the behavior is neither parallel nor block-autoregressive: the model shows a weak left-to-right bias and substantial within-batch ordering ambiguity.
  4. Asaria, Salomone, Gandhi. Realizing Native INT8 Compute for Diffusion Transformers on Consumer GPUs: A Fused INT8 GEMM Kernel for Ideogram 4.0. arXiv preprint, Jun 2026. [SYSTEMS]  arXiv:2606.14598
    A fused Triton kernel that properly drives the INT8 tensor cores on consumer Ampere GPUs — ~1.1× end-to-end speedup, making 1024px generation feasible on a single RTX 3090.
  5. Gandhi, Asaria, Salomone. Holding the FP8 Quality Ceiling at 8-Bit Weights and Activations: INT8 and GGUF Post-Training Quantization of Ideogram 4.0 for Consumer GPUs. arXiv preprint, Jun 2026. [VISION]  arXiv:2606.12280
    Post-training quantization of Ideogram 4.0 where INT8 W8A8 comes out statistically indistinguishable from FP8 on key quality metrics, with INT8 and GGUF Q4_K both cutting compute for consumer-GPU deployment.

→ Read all of our research

§3

Research Tooling

Our lab doesn't just release papers and code; we also partner with the world's best labs, across academia and industry, to unlock velocity for their researchers (and their researchers' agents). The tools we build are designed to accelerate the entire research loop, from planning to publication.

platform

Transformer Lab

The workbench our researchers live in: train, tune, evaluate, and inspect models across modalities from one interface.

open source · self-hostable

orchestration

GPU-cluster coordination

Hundreds of distributed jobs across RunPod, Lambda, AWS, Azure, GCP, and in-house hardware — coordinated automatically.

multi-cloud · autoscaling

announcing soon

Intelligence collection

A new approach to gathering and grounding knowledge — not quite ready to reveal. For now we're sharing it only with our closest partners.

stay tuned

announcing soon

Intelligence orchestration

Still under wraps — for now we're sharing it only with our closest partners.

stay tuned

§4

Research at Maximum Velocity

Science is, fundamentally, a search through an infinite space of possible truths. Our goal is to transform research from a highly manual, sequential bottleneck into a massively parallel utility you can dial up, so that scientists can act as conductors of an intellectual orchestra that discovers previously uncharted paths. Let's discover the unknown together!