How to Use an Image as a Prompt with IP-Adapter in Automatic1111

IP-Adapter lets you use an image as a prompt, alongside the usual text prompt, to generate new images. For example, to create a wood carving in the shape of an apple, you can enter ‘wood carving’ as the text prompt and provide an image of an apple as the image prompt. In this example, the text prompt supplies the material and style, while the image supplies the shape.

IP-Adapter also works with image-to-image and inpainting, and it is straightforward to set up and use. Let’s dive in.

Installation

Using IP-Adapter requires the ControlNet extension and the IP-Adapter models. If you are not sure how to install the extension, check out our step-by-step tutorial, How to use ControlNet in Automatic1111 Part 2: Installation. You can download the IP-Adapter models from https://huggingface.co/h94/IP-Adapter. Download the following files and put them under stable-diffusion-webui/extensions/sd-webui-controlnet/models:

  • ip-adapter-full-face_sd15.bin
  • ip-adapter-plus_sd15.bin
  • ip-adapter_sd15_light.bin
  • ip-adapter-plus-face_sd15.bin
  • ip-adapter_sd15.bin
  • ip-adapter_sd15_vit-G.bin
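To double-check that every file landed in the right folder, a small script can help. The following is a minimal sketch, not part of the ControlNet extension: the function name missing_models is hypothetical, and the default path assumes you run the script from the directory that contains stable-diffusion-webui.

```python
from pathlib import Path

# File names taken from the download list above.
EXPECTED_MODELS = [
    "ip-adapter-full-face_sd15.bin",
    "ip-adapter-plus_sd15.bin",
    "ip-adapter_sd15_light.bin",
    "ip-adapter-plus-face_sd15.bin",
    "ip-adapter_sd15.bin",
    "ip-adapter_sd15_vit-G.bin",
]


def missing_models(models_dir):
    """Return the expected model files that are not present in models_dir."""
    folder = Path(models_dir)
    return [name for name in EXPECTED_MODELS if not (folder / name).is_file()]


if __name__ == "__main__":
    # Assumption: the script is run from the parent of stable-diffusion-webui.
    default_dir = Path("stable-diffusion-webui/extensions/sd-webui-controlnet/models")
    missing = missing_models(default_dir)
    if missing:
        print("Missing models:")
        for name in missing:
            print("  " + name)
    else:
        print("All IP-Adapter models found.")
```

If the script reports missing files, download them again and check that they were not saved with a different name or extension.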

That completes the setup. Let’s move on to how to use it.

Using IP-Adapter

You can use IP-Adapter for text-to-image, image-to-image, and inpainting. For this tutorial, let’s use text-to-image to make a wood carving of an apple similar to the one shown at the top of this page. The image prompt is a photo of an apple, which you will load into the ControlNet section of the txt2img tab. You can download the image below. After downloading, make sure that the image is still 512×512.
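If you would rather check the image size from a script than in an image viewer, the sketch below reads the width and height straight from the file header using only the Python standard library. It assumes the image was saved as a PNG (other formats store their dimensions differently), and the file name apple.png is only a placeholder.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) read from the IHDR chunk of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    # A PNG starts with an 8-byte signature, followed by the IHDR chunk:
    # 4-byte length, the 4-byte tag "IHDR", then width and height.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} does not look like a PNG file")
    width, height = struct.unpack(">II", header[16:24])
    return width, height


if __name__ == "__main__":
    # Placeholder file name; replace with the path to your downloaded image.
    size = png_size("apple.png")
    print("OK" if size == (512, 512) else f"Unexpected size: {size}")
```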

Load the SD 1.5 model and enter “wood carving” in the positive prompt field.

In the ControlNet section, do the following:

  • Load the image of an apple by dragging the image from your file viewer into the canvas below Single Image.
  • Check Enable.
  • For Control Type, select IP-Adapter.
  • For Preprocessor, select ip-adapter_clip_sd15.
  • For Model, select ip-adapter_sd15.
  • For Control Weight, set 1.
  • For Starting Control Step, set 0.
  • For Ending Control Step, set 1.
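The same settings can also be written down as a request body if you drive Automatic1111 from a script instead of the browser. The sketch below only builds the body and does not send it. It rests on assumptions that are not covered in this tutorial: the web UI is started with the --api flag, the body is posted to the /sdapi/v1/txt2img endpoint, and the ControlNet unit uses the field names shown here (image, module, model, weight, guidance_start, guidance_end). These names can differ between versions of the ControlNet extension, and the model name may need the hash suffix shown in your Model dropdown, so treat them as placeholders to verify against your installation.

```python
import base64
import json


def build_txt2img_payload(image_bytes, prompt="wood carving", weight=1.0):
    """Build a txt2img request body with a single IP-Adapter ControlNet unit."""
    unit = {
        "enabled": True,                                          # Check Enable
        "image": base64.b64encode(image_bytes).decode("ascii"),   # the apple photo
        "module": "ip-adapter_clip_sd15",                         # Preprocessor
        "model": "ip-adapter_sd15",                               # Model (assumed name, may need a hash suffix)
        "weight": weight,                                         # Control Weight
        "guidance_start": 0.0,                                    # Starting Control Step
        "guidance_end": 1.0,                                      # Ending Control Step
    }
    return {
        "prompt": prompt,
        "width": 512,
        "height": 512,
        # Assumed layout for passing ControlNet units through the web UI API.
        "alwayson_scripts": {"controlnet": {"args": [unit]}},
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for the contents of the apple image file.
    payload = build_txt2img_payload(b"placeholder image bytes")
    print(json.dumps(payload, indent=2)[:400])
```

Keeping the settings in one function makes it easy to rerun the same generation with a different Control Weight.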

Now hit Generate. The result is an image of an apple, but it does not look like wood.

This means that the image prompt is overpowering the text prompt. To reduce its influence, set Control Weight to 0.4.

Hit Generate again. This time, the result is an image of an apple-shaped wood carving.
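A value like 0.4 usually takes a few tries to find, and the best value depends on your prompts. As a rough aid, the sketch below lists combinations of Control Weight, Starting Control Step, and Ending Control Step to try one by one. The helper name sweep_settings and the candidate values are illustrative only, not recommended settings.

```python
from itertools import product


def sweep_settings(weights, starts, ends):
    """Return every (weight, start, end) combination where start < end."""
    combos = []
    for weight, start, end in product(weights, starts, ends):
        # A unit that ends before it starts would never be applied.
        if start < end:
            combos.append({"weight": weight, "start": start, "end": end})
    return combos


if __name__ == "__main__":
    # Illustrative candidate values; adjust to taste.
    for setting in sweep_settings([0.2, 0.4, 0.6, 0.8, 1.0], [0.0], [0.5, 1.0]):
        print(setting)
```

Generating one image per combination with a fixed seed makes it easier to see what each setting changes.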

In addition to Control Weight, depending on the prompts that you use, you may also want to tweak Starting Control Step and Ending Control Step to get the result you want.

IP-Adapter can also be very powerful for inpainting. For instance, in the image below, cyborgs were inpainted using IP-Adapter. Rather than relying solely on a text prompt for guidance, pre-generated images of a gorilla and a cyborg served as additional cues to steer the diffusion process. This approach helps the inpainted area match the concept you have in mind, reducing the uncertainty that comes from using only a text prompt.

References

[1] tencent-ailab. IP-Adapter. Retrieved from https://github.com/tencent-ailab/IP-Adapter.
