FLUX.2 [klein] 4B: Build Small starter
A 4-billion-parameter open-weights image model (Apache 2.0). Text-to-image, image editing, running a LoRA, and training your own, all in one pipeline. Runs on consumer GPUs (~13 GB) and on ZeroGPU. It is fast (4 steps), and there is nothing to set up: no token, no gating.
Duplicate this Space or fork it as a starting point for your project.
Upload a photo and describe the change. Same model, pipe(prompt=…, image=…). Good for restyling, edits, and turning a friend's photo into something new.
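A minimal sketch of that edit call, assuming a klein pipeline has already been loaded (for example with Flux2KleinPipeline.from_pretrained). The helper name edit_image is illustrative and not part of the Space's code; the 4-step default follows the distilled model's setting quoted above.

```python
def edit_image(pipe, image, prompt, steps=4):
    """Run one image edit: the text-to-image call plus an `image` argument.

    `pipe` is an already-loaded klein pipeline and `image` is a PIL image.
    `steps=4` assumes the distilled model; a base model would need more steps.
    """
    result = pipe(prompt=prompt, image=image, num_inference_steps=steps)
    return result.images[0]
```

With a loaded pipeline, a call would look like edit_image(pipe, PIL.Image.open("friend.jpg"), "make it a snowy evening"), where the file name and prompt are placeholders.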
Compare base vs a LoRA at the same seed, side by side. Two ways to load one:
- Pick a ready-made klein LoRA from the Hub below (zero setup), or
- Upload your own .safetensors if you've trained one.
Some of these are image-edit LoRAs (sprite sheet, red-zoom): pick one and an input-image box appears, pre-filled with the repo's own example so it runs right away. The text-to-image ones (TRpFrog, sks dog) just need the prompt. The correct trigger fills in automatically.
Want to train your own? See the 🛠️ Train a LoRA tab. Browse more on the Hub: klein 4B LoRAs.
Train a LoRA in 3 steps
Teach klein a specific style, character, or look of your own. A run takes about 30 minutes and roughly $0.50 of rented GPU time. Worth doing when none of the ready-made LoRAs in the 🎯 Your LoRA tab give you the look you want.
AI Toolkit has a no-code web UI, so you don't have to edit YAML by hand.
| Path | Best for | Setup |
|---|---|---|
| RunPod template (easiest) | most people, ~$0.50/run | one-click deploy · UI auto-launches |
| AI Toolkit UI locally | you have a 24 GB+ NVIDIA GPU | git clone + npm run build_and_start |
| Modal (serverless) | no local GPU, pay per second | pip install modal |
New to it? Watch Ostris' 2-minute walkthrough. BFL's own docs: klein LoRA training and a worked example.
- 15–40 images that share one look. Varied subjects/angles, ≥1024 px.
- One .txt caption per image, same filename.
- Caption the content, not the style. Describe what's in the image and nothing about the look. The model picks up the style on its own, so don't write "watercolor / retro / muted palette."
- Start every caption with a made-up trigger word (MYSTYLE7, ZK_TOON) so it can't collide with real words. Use it identically everywhere.
MYSTYLE7. A portrait of an older man with a beard, plain background.
MYSTYLE7. A wide shot of a fishing boat on calm water at dawn.
No time to caption by hand? Auto-caption with any vision model ("describe content only, no style words"), then skim for leaks.
Use the ready config configs/my_lora_klein_4b.yaml (Files tab). Change just three lines marked <<< CHANGE >>>:
- name: your output folder
- trigger_word: must match your captions
- datasets: folder_path: where your images are
Then paste it into the AI Toolkit UI (or run python run.py <config>) and hit Start. About 1800 steps take 30–40 min on a 4090.
Two common mistakes to avoid:
- Keep arch: "flux2_klein_4b" in the config. Without it the trainer crashes (#691).
- Watch the sample images, not the loss. Loss keeps dropping past the point where the images look best, which is usually around step 750–1500. Pick that checkpoint's .safetensors.
Download your .safetensors, open the 🎯 Your LoRA tab, upload it, put your trigger word in the prompt, and compare base vs LoRA.
In code, two lines on top of the normal pipeline:
# pipe is the already-loaded klein pipeline
pipe.load_lora_weights("my_lora.safetensors")
img = pipe(prompt="MYSTYLE7. a portrait, studio lighting",
           num_inference_steps=50, guidance_scale=4.0).images[0]
Train your own FLUX.2 [klein] LoRA in ~30 minutes
A LoRA teaches klein a style, character, or look from a handful of images. It's a quick way to give your project its own visual identity, and it earns the 🎯 Well-Tuned merit badge.
This guide is self-contained. You need: ~20 images, a GPU for ~30 minutes
(a RunPod RTX 4090 is about $0.50 for a full run), and the config in
configs/my_lora_klein_4b.yaml.
Trainer used here: ostris/ai-toolkit, a
popular community trainer (one of several; you can use any klein-compatible trainer).
Model: FLUX.2-klein-base-4B (Apache 2.0).
Don't want to train? You don't have to. The 🎯 Your LoRA tab in the Space has a dropdown of ready-made klein LoRAs from the Hub, and there are dozens more at huggingface.co/models. Train one when you want a specific look that the ready-made ones don't cover.
Pick your path (easiest first)
AI Toolkit has a no-code web UI, so you don't edit YAML by hand unless you want to. Three ways to run it:
| Path | Best for | Setup |
|---|---|---|
| RunPod official template | most people, ~$0.50/run | one click, UI auto-launches |
| AI Toolkit UI locally | you have a 24 GB+ NVIDIA GPU | git clone + npm run build_and_start |
| Modal (serverless) | no local GPU, pay per second | pip install modal && modal setup |
RunPod (recommended): deploy the official
AI Toolkit by Ostris template
on an RTX 4090 (24 GB) or L40S with an ≥80 GB volume. The pod auto-launches the
AI Toolkit web UI: you create a job, point it at your images, pick
FLUX.2-klein-base-4B, and click Start. The configs/my_lora_klein_4b.yaml in
this Space is the same job expressed as YAML, so you can either fill the UI form
or paste the config. Ostris has a 2-minute walkthrough video.
Local UI: if you have the GPU, follow the
ai-toolkit README: clone,
install, npm run build_and_start, open localhost:8675. Same UI as the pod.
Modal: clone ai-toolkit, pip install modal, modal setup, add a READ
HF token, then run their Modal training command. Good if you'd rather not rent a
pod. Steps are in the ai-toolkit README.
Whichever you pick, the dataset + caption rules below are identical, and you end
up with a .safetensors you load in the 🎯 Your LoRA tab.
1. Build a dataset (15–40 images)
A style LoRA is the easy win for a weekend. Collect 15–40 images that share one look: your own art, photos you have rights to, or public-domain works (Wikimedia Commons is reliable and license-clean).
- Diverse subjects, angles, compositions. Avoid the same background repeated.
- ≥1024 px on the long edge.
- Name them img (1).png, img (2).png, … each paired with img (1).txt, …
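The pairing rule is easy to verify before uploading. The sketch below is one way to do it with the standard library only; the function name find_unpaired and the list of image extensions are assumptions for illustration, not something the trainer requires.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your dataset uses.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}

def find_unpaired(folder):
    """Return (images without a caption, captions without an image)."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    # Index both kinds of file by stem, e.g. "img (1)" for "img (1).png".
    images = {p.stem: p.name for p in files if p.suffix.lower() in IMAGE_SUFFIXES}
    captions = {p.stem: p.name for p in files if p.suffix.lower() == ".txt"}
    missing_caption = sorted(n for stem, n in images.items() if stem not in captions)
    orphan_caption = sorted(n for stem, n in captions.items() if stem not in images)
    return missing_caption, orphan_caption
```

Two empty lists mean every image has a caption with the same file name and no caption is left over.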
Captions: describe the CONTENT, never the style
This is the one rule people get wrong. For a style LoRA, your captions must describe what is in the image and say nothing about the style, because the style is exactly what you want the model to infer on its own.
Each caption starts with your trigger word, then a literal description:
MYSTYLE7. A portrait of an older man with a beard, facing left, plain background.
MYSTYLE7. A wide shot of a small fishing boat on calm water at dawn.
Do not write "watercolor", "painterly", "vellum", "retro", "muted palette". If you describe the style in the caption, the model learns to need that word instead of baking the style into the weights.
Don't want to caption by hand? Any vision model (Qwen2.5-VL, GPT-4o, Gemini) can auto-caption with a "describe only content, no style words" prompt. Then skim the .txt files and delete any style adjectives that leaked in (they leak ~25% of the time).
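Skimming for leaked style words can be partly automated. This is a rough sketch: the word list is only the examples quoted above and should be extended with whatever adjectives describe your own style, and the function name find_style_leaks is illustrative.

```python
import re

# Illustrative list taken from the examples above; extend it for your style.
STYLE_WORDS = ["watercolor", "painterly", "vellum", "retro", "muted palette"]

def find_style_leaks(caption, style_words=STYLE_WORDS):
    """Return the style words found in a caption (case-insensitive, whole words)."""
    found = []
    for word in style_words:
        pattern = r"\b" + re.escape(word) + r"\b"
        if re.search(pattern, caption, flags=re.IGNORECASE):
            found.append(word)
    return found
```

A caption that returns an empty list still deserves a quick read, since a fixed word list cannot catch every style description.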
Pick a trigger word that is not a real word so it can't collide with the
model's vocabulary: MYSTYLE7, RISO_PR1NT, ZK_TOON. Use it identically in
every caption and in the config.
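A small check for the "use it identically" rule, as a sketch: it reads every .txt file in the dataset folder and reports those that do not begin with the exact trigger string. The function name is hypothetical and the comparison is deliberately case-sensitive.

```python
from pathlib import Path

def captions_missing_trigger(folder, trigger):
    """Return caption file names whose text does not start with the trigger word."""
    bad = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8").lstrip()
        # Case-sensitive on purpose: "mystyle7" and "MYSTYLE7" are different tokens.
        if not text.startswith(trigger):
            bad.append(path.name)
    return bad
```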
2. Get a GPU
RunPod (recommended, ~$0.50/run):
- Deploy the official AI Toolkit (Ostris) template on an RTX 4090 (24 GB) or L40S. It auto-launches the AI Toolkit UI.
- Volume ≥ 80 GB mounted at /workspace (weights + checkpoints need room).
- Upload your dataset folder to /workspace/datasets/my_style/ and the config to /workspace/configs/my_lora_klein_4b.yaml.
Colab: an L4 or A100 runtime works too. Run pip install ai-toolkit, upload
the dataset, and point the config at your Drive path.
3. Edit three lines in the config
Open configs/my_lora_klein_4b.yaml and change the lines marked <<< CHANGE >>>:
- name: your output folder name.
- trigger_word: your trigger (must match your captions).
- datasets: folder_path: where you uploaded the images.
Also update the three sample.prompts to use your trigger so the in-training
preview images show your style forming.
The one line you must NOT delete: arch: "flux2_klein_4b". Without it
ai-toolkit falls back to a Stable Diffusion loader and crashes on a missing-unet
error (issue #691). The
official BFL example omits it. That is a bug; the included config keeps it.
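Before starting a run, a quick pre-flight check can catch a deleted arch line or a trigger mismatch. The sketch below scans the config as plain text rather than parsing YAML, and it assumes arch and trigger_word each sit on their own line in the form quoted above; the function name preflight is hypothetical.

```python
import re

def preflight(config_text, trigger):
    """Return a list of problems found in the config text (empty means none found)."""
    problems = []
    # The arch line must survive editing; accept it with or without quotes.
    arch_pattern = r'^\s*arch:\s*["\']?flux2_klein_4b["\']?\s*(#.*)?$'
    if not re.search(arch_pattern, config_text, flags=re.MULTILINE):
        problems.append("missing arch: flux2_klein_4b")
    # trigger_word must equal the string used at the start of every caption.
    trigger_pattern = r'^\s*trigger_word:\s*["\']?([^"\'#\n]+?)["\']?\s*(#.*)?$'
    match = re.search(trigger_pattern, config_text, flags=re.MULTILINE)
    if match is None:
        problems.append("missing trigger_word")
    elif match.group(1) != trigger:
        problems.append(f"trigger_word is {match.group(1)!r}, captions use {trigger!r}")
    return problems
```

Calling preflight(open("configs/my_lora_klein_4b.yaml").read(), "MYSTYLE7") should return an empty list once the edits are done. A commented-out arch line is reported as missing.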
4. Train
In the AI Toolkit UI: paste the config, click Start. Or from the CLI:
cd /app/ai-toolkit
python run.py /workspace/configs/my_lora_klein_4b.yaml
It checkpoints every 250 steps into /app/ai-toolkit/output/<name>/ and writes
sample images alongside. An 1800-step run on a 4090 takes roughly 30–40 minutes.
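The figures above imply a rough per-step time that helps when planning a shorter or longer run. This is back-of-the-envelope arithmetic only: real throughput depends on resolution, batch size, and the GPU, and the sketch assumes periodic checkpoints land on exact multiples of 250.

```python
def seconds_per_step(minutes, steps):
    """Average wall-clock seconds per training step."""
    return minutes * 60 / steps

def estimate_minutes(steps, sec_per_step):
    """Projected run time in minutes for a given step count."""
    return steps * sec_per_step / 60

def checkpoint_steps(total_steps, every=250):
    """Steps at which a periodic checkpoint lands (a final save is not counted)."""
    return list(range(every, total_steps + 1, every))

fast = seconds_per_step(30, 1800)  # 1.0 s/step at the quick end of the range
slow = seconds_per_step(40, 1800)  # about 1.33 s/step at the slow end
```

Under these assumptions, checkpoint_steps(1800) gives seven periodic checkpoints, from step 250 through step 1750, and a 1000-step run would take roughly 17 to 22 minutes.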
Watch the samples, not the loss. Loss keeps dropping past the point where the
images start to overfit. For most style LoRAs the visual peak is around
step 750–1500, not the final step. Open the sample images, pick the
checkpoint that looks best, and use that .safetensors.
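To narrow the search, the checkpoints in that step window can be listed first. The sketch below assumes the step number is the last run of digits in each checkpoint's file name (for example my_style_000001000.safetensors); that naming is an assumption, so check what your output folder actually contains.

```python
import re
from pathlib import Path

def checkpoints_in_window(output_dir, low=750, high=1500):
    """Return sorted (step, path) pairs for checkpoints with low <= step <= high."""
    picked = []
    for path in Path(output_dir).glob("*.safetensors"):
        # Assumption: the last run of digits in the file name is the step count.
        numbers = re.findall(r"\d+", path.stem)
        if not numbers:
            continue  # e.g. a final file saved without a step suffix
        step = int(numbers[-1])
        if low <= step <= high:
            picked.append((step, path))
    return sorted(picked)
```

The result is only a shortlist; the choice between those checkpoints still comes from looking at their sample images.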
5. Use it
Download the .safetensors you picked, then in this Space open the
🎯 Your LoRA tab, upload it, put your trigger word in the prompt, and
compare base vs your fine-tune at the same seed.
In code it's two lines on top of the normal pipeline:
from diffusers import Flux2KleinPipeline
import torch

# Load the base model the LoRA was trained on
pipe = Flux2KleinPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-base-4B", torch_dtype=torch.bfloat16
).to("cuda")

# The two extra lines: load the LoRA, then prompt with the trigger word
pipe.load_lora_weights("my_lora.safetensors")
img = pipe(
    prompt="MYSTYLE7. a portrait of a person, studio lighting",
    num_inference_steps=50, guidance_scale=4.0,
).images[0]
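The same-seed comparison from the tab can be approximated in code. This sketch assumes the pipeline accepts a generator argument and exposes unload_lora_weights, as diffusers pipelines with LoRA support generally do; the helper name and the make_generator parameter are illustrative. Call it on a pipeline that has no LoRA loaded yet.

```python
def compare_base_vs_lora(pipe, lora_path, prompt, make_generator, seed=0, **call_kwargs):
    """Render one prompt at one seed without and with the LoRA; return (base, tuned)."""
    # First pass: plain model, fresh generator seeded with `seed`.
    base = pipe(prompt=prompt, generator=make_generator(seed), **call_kwargs).images[0]
    pipe.load_lora_weights(lora_path)
    try:
        # Second pass: same prompt and an identically seeded generator, LoRA active.
        tuned = pipe(prompt=prompt, generator=make_generator(seed), **call_kwargs).images[0]
    finally:
        pipe.unload_lora_weights()  # leave the pipeline in its base state
    return base, tuned
```

With PyTorch, make_generator could be lambda s: torch.Generator("cuda").manual_seed(s), and call_kwargs would carry num_inference_steps and guidance_scale. Creating a fresh generator for each pass matters, because reusing one generator would give the second image different noise.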
You trained on base (50 steps). The LoRA also loads on the distilled
FLUX.2-klein-4B (4 steps) for fast demos, with mild drift. The Space's LoRA tab uses distilled by default for speed; switch KLEIN_MODEL_ID to base for the cleanest results.
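Switching variants also means switching the step count. The sketch below picks 50 or 4 steps from the model ID. It assumes KLEIN_MODEL_ID is read from the environment and that the distilled repo sits under the same black-forest-labs organisation as the base one; check app.py for how this Space actually does it.

```python
import os

BASE_ID = "black-forest-labs/FLUX.2-klein-base-4B"
DISTILLED_ID = "black-forest-labs/FLUX.2-klein-4B"  # assumed full repo ID

def default_steps(model_id):
    """50 steps for a base checkpoint, 4 for a distilled one."""
    return 50 if "-base-" in model_id else 4

def resolve_model_id(default=DISTILLED_ID):
    """Read the model ID from KLEIN_MODEL_ID (assumed to be an env variable)."""
    return os.environ.get("KLEIN_MODEL_ID", default)
```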
Quick knobs
| You want | Change in the config |
|---|---|
| More texture / grain | linear: 128, keep timestep_type: shift |
| Cleaner / simpler shapes | weight_decay: 0.0002 |
| Flat vector / graphic look | linear: 64, conv: 32, 15–20 images |
| A character (not a style) | 10–15 images, steps: 1000 |
| Loss stalls | lr: 0.0002 |
| Loss oscillates | lr: 0.00005 |
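The table can also be kept as data and applied to a loaded config. In this sketch the knob names are made up for illustration, the values are copied from the table, and the config is assumed to be a nested dict (for example from a YAML loader). Because the exact nesting of the config is not shown here, the helper updates a key wherever it already exists instead of assuming a path.

```python
import copy

# Values copied from the table; the dictionary keys are illustrative labels.
# Image-count advice (15-20 or 10-15 images) is a dataset change, not a config key.
KNOBS = {
    "more_texture": {"linear": 128, "timestep_type": "shift"},
    "cleaner_shapes": {"weight_decay": 0.0002},
    "flat_vector": {"linear": 64, "conv": 32},
    "character": {"steps": 1000},
    "loss_stalls": {"lr": 0.0002},
    "loss_oscillates": {"lr": 0.00005},
}

def apply_knob(config, knob):
    """Return a copy of `config` with each existing key named in the knob updated."""
    updated = copy.deepcopy(config)
    overrides = KNOBS[knob]

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in overrides and not isinstance(value, (dict, list)):
                    node[key] = overrides[key]  # replace a scalar setting in place
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(updated)
    return updated
```

Keys that are absent from the config (conv, for instance, if the file has no such line) are skipped without warning, so those still need to be added by hand.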
More detail: BFL klein training docs · training example.
FLUX.2 [klein]
Compact, open-weights image models from Black Forest Labs. Generate and edit images in one pipeline, run on consumer GPUs.
Models: this Space runs klein 4B distilled. The family is a 2×2 grid: size (4B / 9B) × variant (distilled / base).
- FLUX.2-klein-4B: 4B distilled, 4 steps, Apache 2.0 (what this Space runs)
- FLUX.2-klein-base-4B: 4B base, 50 steps, Apache 2.0 (LoRA-training target)
- FLUX.2-klein-9B: 9B distilled, 4 steps, higher fidelity, gated
- FLUX.2-klein-base-9B: 9B base, 50 steps, gated
- All BFL models on Hugging Face
Code & docs
- github.com/black-forest-labs/flux2: official inference code (Apache 2.0)
- docs.bfl.ai: API & model docs
- Training a klein LoRA · worked example
- ostris/ai-toolkit: a popular community LoRA trainer (one of several)
This Space
- Built for the Build Small Hackathon
- Duplicate it, or open the Files tab to read app.py and the guides
- Browse community klein LoRAs: 4B · 9B
FLUX.2 [klein] · GitHub · Docs · Build Small Hackathon