ByteDance Releases SDXL-Lightning Model For Text-To-Image Generation, Increasing Speed by 10 Times
According to Jiemian News, citing informed sources, ByteDance has released the open-source text-to-image model SDXL-Lightning. The model reportedly generates high-quality, high-resolution images in a very short time, making it one of the fastest text-to-image models currently available.
Text-to-image is a technology that uses artificial intelligence to generate images from text descriptions. Mainstream models in this field currently rely on diffusion-based generation, which gradually transforms random noise into an image over many iterative steps. While this approach can produce realistic images, it consumes substantial computational resources and is slow: generating a single high-quality image typically takes about 5 seconds.
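As a rough illustration of the iterative process described above, the toy loop below (not ByteDance's code; every name here is invented for illustration) starts from random noise and nudges a one-dimensional sample toward a target value over many small denoising steps, standing in for the network-predicted updates of a real diffusion model:

```python
import random

def denoise_step(x, target, step, total_steps):
    # Toy "denoiser": move a fraction of the way toward the target.
    # A real diffusion model would predict this update with a large neural network.
    alpha = 1.0 / (total_steps - step)
    return x + alpha * (target - x)

def sample(total_steps, target=1.0, seed=0):
    random.seed(seed)
    x = random.gauss(0.0, 1.0)  # start from pure noise
    for step in range(total_steps):
        x = denoise_step(x, target, step, total_steps)
    return x
```

Each of the (say) 50 calls to `denoise_step` stands in for one full forward pass of a large network, which is why many-step diffusion sampling is slow.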
ByteDance's SDXL-Lightning uses a progressive adversarial distillation technique to achieve much faster generation. The model can produce high-quality, high-resolution images in just 2 or 4 inference steps, a roughly tenfold speedup that also cuts compute cost to about one-tenth, making it the fastest text-to-image model at 1024 resolution.
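The distillation idea can be sketched with a toy one-dimensional example (purely illustrative; the real method trains a student network against a discriminator, none of which appears here): a "teacher" denoises in many small steps, and a "student" is fit so that a few large steps land on the same result.

```python
def teacher_sample(x, target, steps=50):
    """Many small denoising steps: the slow, standard diffusion process."""
    a = 1.0 / steps
    for _ in range(steps):
        x = (1 - a) * x + a * target
    return x

def make_student(teacher_steps=50, student_steps=4):
    """Toy 'distillation': solve in closed form for a per-step decay so that
    a few big student steps reproduce many small teacher steps. Real
    progressive adversarial distillation instead trains a network with an
    adversarial objective; only the few-step idea carries over."""
    decay = (1 - 1.0 / teacher_steps) ** (teacher_steps / student_steps)
    def student_sample(x, target):
        for _ in range(student_steps):
            x = decay * x + (1 - decay) * target
        return x
    return student_sample
```

The student reaches the teacher's output with 4 model evaluations instead of 50, which is the source of the roughly tenfold speedup described in the article.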
The ByteDance intelligent creation team stated that the model is an improved version of the open-source text-to-image model SDXL and is compatible with other tools and plugins in the open-model community. SDXL-Lightning can be plugged in as an acceleration module for SDXL models of various styles, such as cartoon and animation models, and supports the popular control plugin ControlNet and the generation software ComfyUI, making it easier for developers, researchers, and creative professionals to combine these tools for innovation and collaboration.
The model has been publicly released on the AI open-source community Hugging Face, where it ranks high on the trending-models list and has also become a popular application on Hugging Face Spaces.