OpenAI’s new AI image generator pushes the limits in detail and prompt fidelity – Ars Technica

by Norman Scott

OpenAI has announced DALL-E 3, the latest version of its AI image synthesis model, which is fully integrated with ChatGPT. Like its predecessors, it is a text-to-image generator that creates novel images from written descriptions, known as prompts. OpenAI has not released detailed technical information about DALL-E 3, but like previous versions, the model has likely been trained on millions of images created by human artists and photographers. It probably also benefits from new training techniques and a longer training time.

Judging by the samples provided by OpenAI, DALL-E 3 appears to be a highly capable image synthesis model that accurately follows prompts. The model generates images with minimal deformations, maintaining fidelity to the given instructions. Compared to its predecessor, DALL-E 3 refines small details more effectively, creating engaging images without the need for manual tweaks or prompt engineering.

DALL-E 3 also excels at handling text within images, which was a challenge for earlier models. It accurately renders in-image text such as labels and signs, allowing for more comprehensive and realistic image generation. This sets DALL-E 3 apart from competing models that struggle to achieve similar results.

One notable aspect of DALL-E 3 is its integration with ChatGPT: the AI assistant can now serve as a brainstorming partner, refining images based on the context of the current conversation. This integration may introduce novel capabilities and improve the overall user experience.

OpenAI acknowledges the controversies surrounding AI image generation technology. To address concerns, DALL-E 3 has been designed to decline requests for images in the style of living artists. OpenAI also provides a form for creators to opt out of having their images used to train future models. However, it remains to be seen whether these measures will satisfy artists who argue for opt-in AI training.

In terms of copyright, OpenAI says that, as with DALL-E 2, images created with DALL-E 3 belong to the users who create them. The company explicitly states that users do not need its permission to reprint, sell, or merchandise the generated images. However, copyright protection for purely AI-generated artwork remains in debate, as current US copyright policy does not recognize it.

OpenAI has taken steps to ensure the safety of DALL-E 3. The model incorporates keyword and image detection filters to restrict the generation of violent, sexual, or hateful content. It also declines requests to generate images of public figures by name. OpenAI has actively worked with experts to identify and mitigate potential risks, including harmful biases and the creation of propaganda and misinformation.

Overall, DALL-E 3 represents a significant advancement in AI image synthesis. Its ability to accurately render images based on prompts and handle in-image text generation sets it apart from other models. As it becomes available to ChatGPT Plus and Enterprise customers in October, it is expected to further expand the capabilities of AI-generated imagery.
