Course Rockstar
Technology · All Levels

Prompt Engineering for Vision Models

By Abigail Morgan on Coursera

About This Course

Prompt engineering applies to vision models as well as text models. Depending on the vision model, a prompt may be text, but it can also be a set of pixel coordinates, a bounding box, or a segmentation mask. In this course, you’ll learn to prompt different vision models: Meta’s Segment Anything Model (SAM), a universal image segmentation model; OWL-ViT, a zero-shot object detection model; and Stable Diffusion 2.0, a widely used diffusion model. You’ll also use a fine-tuning technique called DreamBooth to tune a diffusion model to associate a text label with an object of your choice.

In detail, you’ll explore:

1. Image Generation: Prompt with text and by adjusting hyperparameters such as strength, guidance scale, and number of inference steps.
2. Image Segmentation: Prompt with positive or negative point coordinates, and with bounding box coordinates.
3. Object Detection: Prompt with natural language to produce a bounding box that isolates specific objects within an image.
4. In-painting: Combine the above techniques to replace objects within an image with generated content.
5. Personalization with Fine-tuning: Generate custom images based on pictures of people or places that you provide, using DreamBooth.
6. Iterating and Experiment Tracking: Prompting and hyperparameter tuning are iterative processes, so tracking experiments helps identify the most effective combinations. This course uses Comet, a library for tracking experiments and optimizing visual prompt engineering workflows.
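To make the iterative tuning idea more concrete, here is a minimal sketch of enumerating hyperparameter combinations so that each one can be run and logged separately. It is not taken from the course: the helper `build_runs`, the `SEARCH_SPACE` values, the example prompt, and the key names `strength`, `guidance_scale`, and `num_inference_steps` are all assumptions chosen for illustration. It uses only the Python standard library and does not call any image model or experiment tracker.

```python
from itertools import product

# Hypothetical search space; these values are illustrative, not from the course.
SEARCH_SPACE = {
    "strength": [0.5, 0.8],
    "guidance_scale": [5.0, 7.5, 10.0],
    "num_inference_steps": [25, 50],
}


def build_runs(prompt, space):
    """Return one config dict per combination of hyperparameter values."""
    names = list(space)
    runs = []
    for values in product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        config["prompt"] = prompt
        runs.append(config)
    return runs


runs = build_runs("a photo of a red bicycle", SEARCH_SPACE)
for run_id, config in enumerate(runs):
    # In a real workflow, each config would be passed to the image pipeline
    # and logged to an experiment tracker together with the generated image.
    print(run_id, config)
```

Keeping every run as a plain dictionary makes it straightforward to hand the same record to both the generation step and whatever tracking tool is in use, so results can later be compared side by side.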

Frequently Asked Questions

How much does Prompt Engineering for Vision Models cost?

Visit the Prompt Engineering for Vision Models course page for current pricing and available discounts.

Who teaches Prompt Engineering for Vision Models?

Prompt Engineering for Vision Models is taught by Abigail Morgan, DeepLearning.AI.

What skill level is Prompt Engineering for Vision Models for?

This course is designed for learners of all levels.

Students: 0
Duration: 1 hour
Level: All Levels
Language: English
Platform: Coursera