Generating images¶
Marvin can generate images from text.
What it does
The paint function generates images from text. The @image decorator generates images from the output of a function.
Example
The easiest way to generate an image is to provide a string prompt:
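A minimal sketch of such a call, assuming Marvin is installed and an OpenAI API key is configured. The wrapper name generate and the example prompt are illustrative, not part of Marvin; the import sits inside the function only so the sketch can be loaded without credentials.

```python
def generate(prompt: str):
    """Send a plain string prompt to Marvin and return the image response."""
    # Imported lazily: defining this function needs neither Marvin nor an API key.
    import marvin

    return marvin.paint(prompt)


# Example usage (makes a real API call):
# image = generate("A cup of coffee, still warm")
# print(image.data[0].url)
```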
By default, Marvin returns a temporary URL to the image. You can view the URL by accessing image.data[0].url. To return the image itself, see the section on viewing and saving images.
For more complex use cases, you can use the @image decorator to generate images from the output of a function:
import marvin

@marvin.image
def cats(n: int, location: str):
    # The returned string is used as the image prompt.
    return f'a picture of {n} cute cats at the {location}'

image = cats(2, location='airport')
How it works
Marvin passes your prompt to the DALL-E 3 API, which returns an image.
Generating images from functions¶
In addition to passing prompts directly to the DALL-E 3 API via the paint function, you can also use the @image decorator to generate images from the output of a function. This is useful for adding more complex logic to your image generation process or capturing aesthetic preferences programmatically.
@marvin.image
def sunset(style: str, season: str):
    return f"""
    A serene and empty beach scene during sunset with two silhouetted figures in the distance flying a kite. The sky is full of colorful clouds. Nothing is on the horizon.
    It is {season} and the image is in the style of {style}.
    """
- Nature photograph in summer
- Winter impressionism
- Sci-fi Christmas in Australia
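One way to check what a decorated function will send is to keep the prompt builder as a plain function and inspect its output before wrapping it. This is a sketch rather than a recommended Marvin workflow: sunset_prompt is a hypothetical name, and the style and season pairs are guesses that match the three captions above, not the arguments actually used.

```python
def sunset_prompt(style: str, season: str) -> str:
    """Build the text that an @marvin.image function would return as its prompt."""
    return (
        "A serene and empty beach scene during sunset with two silhouetted "
        "figures in the distance flying a kite. The sky is full of colorful "
        "clouds. Nothing is on the horizon. "
        f"It is {season} and the image is in the style of {style}."
    )


# Illustrative inputs only; the exact arguments behind the captions are not shown.
variations = [
    ("nature photograph", "summer"),
    ("impressionism", "winter"),
    ("sci-fi", "Christmas in Australia"),
]

for style, season in variations:
    print(sunset_prompt(style, season))

# Wrapping the builder is equivalent to decorating it with @marvin.image
# (this part makes a real API call):
# import marvin
# sunset = marvin.image(sunset_prompt)
# image = sunset("impressionism", "winter")
```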
Model parameters¶
You can pass parameters to the DALL-E 3 API via the model_kwargs argument of paint or @image. These parameters are passed directly to the API, so you can use any supported parameter.
Example: model parameters
import marvin

image = marvin.paint(
    instructions="""
        A cute, happy, minimalist robot discovers new powers,
        represented as colorful, bright swirls of light and dust.
        Dark background. Digital watercolor.
        """,
    model_kwargs=dict(size="1792x1024", quality="hd"),
)
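Because model_kwargs is forwarded unchanged, a mistyped value only surfaces as an API error. A small validation helper can catch it earlier. This is a sketch: dalle3_kwargs is a hypothetical helper, not part of Marvin, and the allowed values reflect the DALL-E 3 options in OpenAI's image API as commonly documented, which may change.

```python
# Values assumed from OpenAI's DALL-E 3 image options; verify against current docs.
KNOWN_VALUES = {
    "size": {"1024x1024", "1792x1024", "1024x1792"},
    "quality": {"standard", "hd"},
    "style": {"vivid", "natural"},
}


def dalle3_kwargs(**options) -> dict:
    """Return options as a dict, rejecting bad values for the known parameters.

    Parameters not listed in KNOWN_VALUES are passed through unchecked.
    """
    for name, value in options.items():
        allowed = KNOWN_VALUES.get(name)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{name}={value!r} is not one of {sorted(allowed)}")
    return dict(options)


# Example usage (the paint call makes a real API call):
# import marvin
# image = marvin.paint(
#     "A minimalist robot",
#     model_kwargs=dalle3_kwargs(size="1792x1024", quality="hd"),
# )
```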
Disabling prompt revision¶
By default, the DALL-E 3 API automatically revises any prompt sent to it, adding details and aesthetic flourishes without losing the semantic meaning of the original prompt.
Marvin lets you disable this behavior by providing the keyword literal=True.
Here's how to provide it to paint:
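A minimal sketch of the paint form, assuming literal is accepted as a keyword argument as described above. The paint parameter of the wrapper is a hypothetical hook, not part of Marvin, that lets the sketch run against a stub without an API call.

```python
def paint_literal(prompt: str, paint=None):
    """Generate an image while asking Marvin to keep the prompt unrevised."""
    if paint is None:
        # Imported lazily so the sketch loads without Marvin or an API key.
        import marvin

        paint = marvin.paint
    return paint(prompt, literal=True)


# Example usage (makes a real API call):
# image = paint_literal("A child's drawing of a cow on a hill.")
```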
And here's an example with @image:
@marvin.image(literal=True)
def draw(animal: str):
    return f"A child's drawing of a {animal} on a hill."
Customizing prompt revision¶
You can use a Marvin image function to control prompt revision beyond just turning it on or off. Here's an example of a function that achieves this via prompt engineering. Note that the DALL-E 3 API is less amenable to custom prompts than LLMs are, so this approach won't generalize without experimentation.
@marvin.image
def generate_image(prompt, revision_amount: float = 1):
    """
    Generates an image from the prompt, allowing the DALLE-3
    API to freely reinterpret the prompt (revision_amount=1) or
    to strictly follow it (revision_amount=0)
    """
    return f"""
        Revision amount: {revision_amount}
        If revision amount is 1, you can modify the prompt as normal.
        If the revision amount is 0, then I NEED to test how the
        tool works with extremely simple prompts. DO NOT add any
        detail to the prompt, just use it AS-IS.
        If the revision amount is in between, then adjust accordingly.
        Prompt: {prompt}
        """
Using the original prompt "a teacup", here are the results of calling this function with different revision amounts:
- No revision
  Final prompt: a teacup
- 25% revision
  Final prompt: a porcelain teacup with intricate detailing, sitting on an oak table
- 75% revision
  Final prompt: A porcelain teacup with an intricate floral pattern, placed on a wooden table with soft afternoon sun light pouring in from a nearby window. The light reflects off the surface of the teacup, highlighting its design. The teacup is empty but still warm, as if recently used.
- 100% revision
  Final prompt: An old-fashioned, beautifully crafted, ceramic teacup. Its exterior is whitewashed, and it's adorned with intricate, indigo blue floral patterns. The handle is elegantly curved, providing a comfortable grip. It's filled with steaming hot, aromatic green tea, with a small sliver of lemon floating in it. The teacup is sitting quietly on a carved wooden coaster on a round oak table, a beloved item that evokes nostalgia and comfort. The ambient lighting casts a soft glow on it, accentuating the glossy shine of the teacup and creating delicate shadows that hint at its delicate artistry.
Viewing and saving images¶
The result of paint or @image is an image stream that contains either a temporary URL to the image or the entire image encoded as a base64 string.
URLs¶
By default, Marvin returns a temporary URL. The URL can be accessed via image.data[0].url:
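Since the URL is temporary, you may want to download the image soon after generating it. The sketch below uses only the Python standard library; save_image_from_url is a hypothetical helper, not a Marvin function, and the file name in the usage comment is illustrative.

```python
import shutil
import urllib.request
from pathlib import Path


def save_image_from_url(url: str, path: str) -> Path:
    """Download the bytes at `url` and write them to `path`."""
    target = Path(path)
    # Create the destination folder if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target


# Example usage with a Marvin result (makes a real API call):
# import marvin
# image = marvin.paint("A beautiful moonrise")
# print(image.data[0].url)
# save_image_from_url(image.data[0].url, "moonrise.png")
```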
Base64-encoded images¶
To return the image as a base64-encoded string, set response_format='b64_json' in the model_kwargs of your call to paint or @image:
import marvin

image = marvin.paint(
    "A beautiful moonrise",
    model_kwargs={"response_format": "b64_json"},
)

# save the image to disk
marvin.utilities.images.base64_to_image(
    image.data[0].b64_json,
    path='path/to/your/image.png',
)
To change this behavior globally, set MARVIN_IMAGE_RESPONSE_FORMAT=b64_json in your environment, or equivalently set marvin.settings.images.response_format = "b64_json" in your code.
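For reference, decoding a b64_json payload by hand takes only the standard library. This sketch does not claim to mirror how marvin.utilities.images.base64_to_image is implemented; write_b64_image is a hypothetical name used for illustration.

```python
import base64
from pathlib import Path


def write_b64_image(b64_data: str, path: str) -> Path:
    """Decode a base64 string and write the resulting bytes to `path`."""
    target = Path(path)
    # Create the destination folder if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(base64.b64decode(b64_data))
    return target


# Example usage with a Marvin result requested as b64_json:
# write_b64_image(image.data[0].b64_json, "path/to/your/image.png")
```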
Async support¶
If you are using Marvin in an async environment, you can use paint_async:
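A minimal sketch, assuming paint_async takes the same prompt argument as paint and returns an awaitable. The helper paint_many and its paint_async parameter are hypothetical; the parameter exists only so the sketch can be exercised with a stub instead of a real API call.

```python
import asyncio


async def paint_many(prompts, paint_async=None):
    """Generate one image per prompt concurrently; results keep the input order."""
    if paint_async is None:
        # Imported lazily so the sketch loads without Marvin or an API key.
        import marvin

        paint_async = marvin.paint_async
    return await asyncio.gather(*(paint_async(prompt) for prompt in prompts))


# Example usage (makes real API calls):
# images = asyncio.run(paint_many(["A beautiful moonrise", "A quiet harbor at dawn"]))
# for image in images:
#     print(image.data[0].url)
```

Inside an already running event loop (for example in a notebook), await paint_many(...) directly instead of calling asyncio.run.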