CriticalResist8 in crushagent
The guide to local image creation with Z-image
Okay, I had a whole guide written but had to debug something on my computer and now it's gone 😩
Until I rewrite it, I will reserve this space for quick steps.
One
Install ComfyUI: https://github.com/comfyanonymous/ComfyUI - they have a new desktop installer for Windows and Mac now; I assume it works if they offer it. Otherwise, follow the Manual Install instructions, which start at "git clone this repo".
Two
Install the drivers for your card. Update your GPU drivers AND the compute drivers for AI (on NVIDIA this is CUDA). You can google for them, but you need them for your GPU to do the generating, which is what makes it take 30 seconds and not 3 hours. If you're on Linux, I assume you know how to set up a Python virtual environment with pip in it.
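If you want to confirm the drivers took before going further, here's a tiny sanity check you can run from the same Python environment ComfyUI uses (it assumes PyTorch gets installed by ComfyUI's requirements; the messages are mine):

```python
# Quick sanity check that PyTorch can actually see your GPU.
def gpu_status():
    try:
        import torch  # installed as part of ComfyUI's requirements
    except ImportError:
        return "PyTorch not installed yet - finish the ComfyUI install first"
    if torch.cuda.is_available():
        return "CUDA OK: " + torch.cuda.get_device_name(0)
    return "No CUDA device found - check your NVIDIA driver / CUDA install"

print(gpu_status())
```

If it prints the "CUDA OK" line with your card's name, generation will run on the GPU.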
Three
While this is installing, grab z-image turbo .gguf files from here:
https://huggingface.co/jayn7/Z-Image-Turbo-GGUF/tree/main
So, this is a .gguf model. It's quantized, meaning you lose some accuracy in the neural network, but if you look at their comparison (https://cdn-uploads.huggingface.co/production/uploads/651f78681719ac0cec346537/EpzgxY40FbLEE3oGUBDIi.png), it doesn't change a whole lot compared to bf16, which is the full-weights model. You could easily go down to Q5_K_S or even Q4 without losing any noticeable quality.
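For intuition, here's a toy sketch of what quantization does: weights get snapped to a coarser grid so they fit in fewer bits. This is not the actual GGUF math (Q4_K, Q5_K, Q8_0 use smarter block-wise schemes), but the size-versus-accuracy trade-off is the same idea.

```python
# Toy uniform quantizer: snap each weight to one of 2**bits evenly
# spaced levels between the smallest and largest weight.
def quantize(weights, bits):
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    step = (hi - lo) / levels
    return [round((w - lo) / step) * step + lo for w in weights]

weights = [0.013, -0.248, 0.991, -1.0, 0.5]
coarse = quantize(weights, 4)  # 4-bit: each weight is off by at most half a step
```

Fewer bits means a smaller file but a coarser grid, which is why Q4 files are so much smaller than bf16 while still landing close to the original weights.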
Which gguf do you pick, though? That depends on how much VRAM your GPU has. I recommend a model that is ~3 GB below your GPU's VRAM, so for 8 GB of VRAM, pick a ~5 GB model. YMMV though. Here is how you download models on huggingface:
(The small download icon next to the model's size. PS: the size is also the disk space it will take.) You can also download all of them and try them out individually.
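The rule of thumb above (pick a file roughly 3 GB smaller than your VRAM) can be written out as a tiny helper. The filenames and sizes below are hypothetical placeholders; check the actual repo listing for the real numbers:

```python
# Pick the largest .gguf that leaves ~3 GB of VRAM headroom.
def pick_quant(vram_gb, files):
    """files: {filename: size_gb}. Returns the best-fitting filename."""
    budget = vram_gb - 3
    fitting = [name for name, size in files.items() if size <= budget]
    if not fitting:
        return min(files, key=files.get)  # nothing fits: take the smallest
    return max(fitting, key=files.get)

# Hypothetical sizes - look at the repo for the real ones.
sizes = {"Q4_K_S.gguf": 3.6, "Q5_K_S.gguf": 4.3, "Q6_K.gguf": 5.0, "Q8_0.gguf": 6.6}
print(pick_quant(8, sizes))  # 8 GB card -> the ~5 GB file
```

The headroom is for the activations, the VAE, and whatever else your OS keeps on the card; with no headroom you get slow offloading or out-of-memory errors.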
Download the gguf, it will take a while.
While it's downloading, let's go download --
Four
The VAE and Text encoder
Again, nothing technical here, but you need these two files. Think of the model as a package: it needs its individual parts to function. You can get the other parts from here: https://huggingface.co/Comfy-Org/z_image_turbo/tree/main
What we need is in split_files/vae and split_files/text_encoders. It's the two big files: the text encoder is about 8 GB and the VAE is about 335 MB. I think this is the ComfyUI devs' own repo on huggingface.
Five - Recap
By now you should have:
ComfyUI installed somewhere on your computer (git clone the repo, install the PyTorch dependencies, run pip install -r requirements.txt)
a GGUF model of Z-Image Turbo
the VAE
the text encoder
We are not quite done yet, but almost.
Six
Place the 3 downloaded model files as follows:
gguf: ComfyUI/models/unet
vae: ComfyUI/models/vae
text encoder: ComfyUI/models/text_encoders
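As a double-check, here are the three destinations from this step written as a tiny helper; the filenames are hypothetical placeholders for whatever you actually downloaded:

```python
import os

# Folder for each kind of file, relative to your ComfyUI folder.
DEST = {
    "gguf": "models/unet",
    "vae": "models/vae",
    "text_encoder": "models/text_encoders",
}

def destination(kind, filename, comfy_root="ComfyUI"):
    """Full path the downloaded file should be moved to."""
    return os.path.join(comfy_root, DEST[kind], filename)

print(destination("gguf", "z_image_turbo-Q6_K.gguf"))  # hypothetical filename
```

If a file ends up in the wrong folder, the matching dropdown in the workflow simply won't list it, which is the most common cause of red fields later on.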
Then, try to run main.py in the ComfyUI folder (if you did the manual install). You need Python 3.13 at the latest to run it; I assume you know what that means, otherwise you would have used the desktop installer. You can run the script with:
cd path/to/ComfyUI
python main.py
And see if it launches. If it all worked correctly, it will say something like "To see the GUI go to: http://127.0.0.1:8188". You can copy that URL and open it in your browser of choice (Firefox, Chrome, whichever).
If it didn't work, go to DeepSeek. Copy the full log, starting from the python main.py command, send it to DeepSeek, tell it what you were doing, and it will help you debug.
Seven
You will be greeted by a splash screen the first time; this is normal. At the time of writing, the first item in the splash screen is the Z-Image-Turbo sample workflow; you can click on that and import it immediately. If it's not there when you read this guide, you can download the workflow from Comfy's GitHub: click the download button at the top right, above the JSON code.
Then, simply drag and drop the downloaded file into ComfyUI's interface, it will automatically create the workflow.
⚠ At this point, if there are text fields in red in the workflow, it likely means you didn't put some of the files in the correct folders. Because we use a .gguf model, however, it is correct and expected that the 'unet_name' field will be red. Follow along to fix this.
Now, we can't use that workflow immediately, because we downloaded gguf models. Oh boy, are you ready for a couple more steps?
Eight
You need ComfyUI Manager. This is something everyone has with their ComfyUI anyway; they just don't package it with the install, because of licensing issues I imagine. Anyway, go to this git repo: https://github.com/Comfy-Org/ComfyUI-Manager. Clone the repo as per the instructions into ComfyUI/custom_nodes, then restart ComfyUI completely (quit the terminal and relaunch python main.py). It should install itself.
Nine
This time you should see a blue "Manager" button at the top right of ComfyUI. Click on it, then Custom Nodes Manager, then type 'gguf' in the search bar and install ComfyUI-GGUF (if there are multiple results, take whichever one the list preselects for you). Wait, then click the restart button when prompted.
Ten
We are going to make changes to the base workflow. Right-click the huge node with the prompt in it and choose "Unpack subgraph". This basically ungroups the various nodes that make up this giant node. You can move them around by clicking and dragging with your mouse; it's a very visual interface, and that's the whole point of node-based approaches. Move them around so you can see them all individually.
Eleven
For the final part, we are going to replace one of the nodes - specifically, the "Load Diffusion Model" one. Tap the N key on your keyboard to open the node library, then search for 'gguf'. Under 'bootleg', there will be a 'Unet Loader (GGUF)' node if you installed the ComfyUI-GGUF extension correctly.
Add that GGUF node to your workflow by simply dragging and dropping it into your workflow area.
Twelve
Select the "Load Diffusion Model" node, and its connection to the next node will be highlighted in white -- it goes to a bypassed (deactivated) 'model sampling' node. Put the two next to each other.
Then, simply drag a line with the mouse cursor from the GGUF 'model' purple circle to the model sampling one:
Once you release the mouse it should look like this:
And just like that, you've had your first taste of working with a node-based approach. This is basically a pipeline, and every node is a step in the pipeline. Yes, this gets very complicated very quickly, which is why I prefer Automatic1111's interface for most things.
In that Unet Loader (GGUF) node, it should have found your unet_name already (the .gguf model). If it hasn't, open the dropdown menu and see if you can select it (if you still don't find it, the file might be in the wrong folder, or you may need to restart Comfy, but it should find it eventually).
At this point press Ctrl+S to save your workflow. Next time you open Comfy, it will be right there waiting for you.
Thirteen - final step
To confirm everything works, simply try to run the workflow by clicking "Queue". It will take a while because it's loading the model files into your VRAM (plus system RAM if you don't have enough VRAM for all of it), but once it's loaded, image generation goes much faster.
Click the little >_ icon on the lower-left sidebar to view the console. If you run into an error while it's loading the model or generating the image, you can send that console log to DeepSeek and let it troubleshoot.
If everything works, congratulations, you have an image on the right! It gets automatically saved to ComfyUI/output on your computer.
Final tip: you can move your nodes around, we only use 3 or 4 in this workflow:
Yes, it looks terrible, but you will only use the prompt, the KSampler settings (I recommend steps 9, cfg 1, sampler dpmpp_sde and scheduler ddim_uniform, though I'm still playing around with the settings), and the width and height of the final image. On my GPU it does really well at 1024x1024 px -- a bigger image means more time to generate. You can always upscale it afterwards with an upscaler.
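For reference, here are those starting-point settings collected in one place. These are this guide's numbers, not official defaults, so treat them as something to tweak:

```python
# Starting-point KSampler/image settings from this guide - tweak freely.
settings = {
    "steps": 9,               # Turbo models need very few steps
    "cfg": 1.0,               # Turbo models run at cfg 1
    "sampler_name": "dpmpp_sde",
    "scheduler": "ddim_uniform",
    "width": 1024,
    "height": 1024,
}
print(settings["sampler_name"])
```

Changing width and height is the quickest way to trade quality for speed; the sampler and scheduler names must match the dropdown entries in the KSampler node exactly.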
You made it to the end, congratulations!
I know this is a shitload of text, and I didn't even get into how to install git or Python if you don't have them, or setting up a Python virtual environment just for Comfy. But either DeepSeek can help with that, or hopefully someone will eventually make a YouTube tutorial if there isn't one already...