One post tagged with "quantization"

Quantizing Ideogram 4.0 onto a 3090: an INT8 build that matches FP8 and a 4-bit GGUF that beats NF4

June 9, 2026 · 6 min read

Person

Person

Person

Person

INT8 and GGUF post-training quantization of Ideogram 4.0 on Ampere consumer GPUs, with the evidence behind each number.

Quick summary: Ideogram released two builds of their new model — a high-quality version meant for massive data center GPUs, and a lower-quality version for consumer cards. We took the high-quality version, shrank it down to the memory footprint of the consumer build, and kept the premium performance intact.