Is this a half-finished model? The output image looks nothing like the input.

#11
by JosephusCheung - opened

The output image deviates significantly from the input image. No matter how you adjust the generation parameters, the resemblance just isn't there. The results are even worse than direct alignment using SigLIP. What's the benefit of an LLM here?

image.png
