Generative AI is having a moment. ChatGPT and art generators such as DALL-E 2, Stable Diffusion and Midjourney have proven their potential, and now millions are wracking their brains over how to get their outputs to look something like the vision in their head.

This is the goal of prompt engineering: the skill of crafting an input to deliver a desired result from generative AI.

Image created using Midjourney. Prompt: oil painting of a child with their grandparent enjoying a moment together and looking at each other. The child’s face is full of wonder and the grandparent’s face is lined with years of living, nostalgia, happy and sad memories and the wisdom of their years. Detailed faces. – – ar 3:2 – – no glasses

Despite being trained on more data and computational resources than ever before, generative AI models have limitations. For instance, they’re not trained to produce content aligned with goals such as truth, insight, reliability and originality.

They also lack common sense and a fundamental understanding of the world, which means they can generate flawed (and even nonsensical) content.

As such, prompt engineering is essential for unlocking generative AI’s capabilities. And luckily it isn’t a technical skill. It’s mostly about trial and error, and keeping a few things in mind.

ChatGPT

First, let’s use ChatGPT to illustrate how prompt engineering can be used for text outputs. If it’s used effectively, ChatGPT can generate essays, computer code, business plans, cover letters, poetry, jokes, and more.

Since it’s a chatbot, you may be inclined to engage with it conversationally. But this isn’t the best approach if you want tailored results. Instead, adopt the mindset that you’re programming the machine to perform a writing task for you.

Create a content brief similar to what you might give a hired professional writer. The key is to provide as much context as possible and use specific and detailed language. You can include information about:

  • your desired focus, format, style, intended audience and text length
  • a list of points you want addressed
  • what perspective you want the text written from, if applicable
  • and specific requirements, such as no jargon.

If you want a longer piece, you can generate it in steps. Start with the first few paragraphs and ask ChatGPT to continue in the next prompt. If you’re unsatisfied with a specific portion, you can ask for it to be rewritten according to new instructions.

But remember: no matter how much you tinker with your prompts, ChatGPT is subject to inaccuracies and making things up. So don’t take anything at face value. In the example below, the output mentions a “report” that doesn’t exist. It probably included this because my prompt asked it to use only reliable sources.

I used prompt engineering to get ChatGPT to write this news article, which provides inaccurate information.

Art generators

Midjourney is one of the most popular tools for art generation, and one of the easiest for beginners. So let’s use it for our next example.

Unlike for text generation, elaborate prompts aren’t necessarily better for image generation. The following example shows how a basic prompt combined with a style keyword is enough to create a variety of interesting images. Your style keyword may refer to a genre, art movement, technique, artist or specific work.

The following images were based on the prompt leopard on tree followed by different style keywords. These were (from the top left clockwise) synthwave, hyperrealist, expressionist and in the style of Zena Holloway. Holloway is a British photographer known for capturing her subjects in ethereal and somewhat surreal scenes, most often underwater.

Images generated by Midjourney.

You can also add keywords relating to:

  • image qualities, such as “beautiful” or “high definition”
  • objects you want pictured
  • and lighting and colours.

With Midjourney, you can even use certain specific commands for different features, including ––ar or ––aspect to set the aspect ratio, ––no to omit certain objects, and ––c to produce more “unusual” results. This command accepts values between 0-100 after it, where the default is 0 and 100 leads to the most unusual result.

You can also use ––s or ––stylize to generate more artistic images (at the expense of following the prompt less closely).

The following example applies some of these ideas to create a fantasy image with a dreamlike and futuristic look. The prompt used here was dreamy futuristic cityscape, beautiful, clouds, interesting colors, cinematic lighting, 8k, 4k ––ar 7:4 ––c 25 ––no windows.

Image generated by Midjourney.

Midjourney accepts multiple prompts for one image if you use a double colon. This can lead to results such as the image below, where I provided separate prompts for the owl and plants. The full prompt was oil painting of an ethereal owl :: flowers, colors :: abstract :: wisdom ––ar 7:4.

Image generated by Midjourney.

A more advanced type of prompting is to include an image as part of the prompt. Midjourney will then take the style of that image into account when generating a new one.

A good way to find inspiration and ideas is to explore the Midjourney gallery and style libraries.

Despite stunning results, generative AI is subject to inconsistencies such as the floating branch in this image. Prompt: woman watching the sunset, magical realism, very beautiful, nature, colourful, very detailed – – ar 7:4

A career of the future?

As generative AI models enter everyday life, prompting skills are likely to become more in-demand, especially from employers looking to get results using AI generators.

Some commentators are asking if becoming a “prompt engineer” may be a way for professionals such as designers, software engineers and content writers to save their jobs from automation, by integrating generative AI into their work. Others have suggested prompt engineering will itself be a career.

It’s hard to predict what role prompt engineering will play as AI models advance.

But it’s almost a given that more sophisticated generators will be able to handle more complex requests, inviting users to stretch their creativity. They will likely also have a better grasp of our preferences, reducing the need for tinkering.


This article is republished from The Conversation under a Creative Commons license. Read the original article.

Image: Midjourney/Marcel Scharth, Author provided

Marcel is a Lecturer in Business Analytics at the University of Sydney Business School. His research specialises in the fields of statistics, econometrics, machine learning, and data science.

Related content