How to generate a caption from an image in Fooocus

At times, you may come across an image that you want to describe accurately, either to generate a similar image with minor adjustments or because writing a prompt in English is difficult. Fooocus offers a caption generation feature for both cases. This tutorial walks through it step by step.

Steps

We will be using the image below for this tutorial. Feel free to use your own image instead.

Once you have the source image, do the following:

  • Check the Input Image checkbox
  • Check the Advanced checkbox below the text prompt
  • Click the Describe tab
  • Drag the source image to the canvas from your file viewer application (e.g. Windows Explorer or Mac Finder)
  • In Content Type, check Photograph

Then press Describe this image into Prompt.

The prompt “a man is riding a bicycle with his dog in the basket” is generated and appears in the prompt field.

For anime or art images like the one below, select Art/Anime for Content Type.

The generated caption is “open mouth, blue eyes, standing, outdoors, day, tree, pokemon (creature), no humans, sunlight, grass, nature, forest, fox, skateboard, orange fur”.

When you want to fine-tune a model using LoRA, you can also use this caption generation feature to create captions for the images in your training dataset.
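If you collect captions this way, you still need to store them in a form your training tool can read. The sketch below assumes the common convention of one text file per image, with the same base name as the image (for example bike.jpg and bike.txt); check what your LoRA trainer actually expects. The function name write_caption_files and the folder layout are hypothetical and not part of Fooocus, and the captions are assumed to have been copied by hand from the prompt field.

```python
from pathlib import Path


def write_caption_files(captions: dict[str, str], dataset_dir: str) -> list[Path]:
    """Write one .txt caption file next to each image named in `captions`.

    `captions` maps an image file name (relative to `dataset_dir`) to the
    caption copied from the Fooocus prompt field. Hypothetical helper,
    assuming a trainer that reads same-named .txt sidecar files.
    """
    root = Path(dataset_dir)
    written = []
    for image_name, caption in captions.items():
        image_path = root / image_name
        if not image_path.is_file():
            # Skip entries whose image is missing, so no orphan captions are created.
            continue
        caption_path = image_path.with_suffix(".txt")
        # Trim stray whitespace and end the file with a single newline.
        caption_path.write_text(caption.strip() + "\n", encoding="utf-8")
        written.append(caption_path)
    return written
```

For example, calling write_caption_files({"bike.jpg": "a man is riding a bicycle with his dog in the basket"}, "dataset") would create dataset/bike.txt, provided dataset/bike.jpg exists.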
