I got an early look at ChatGPT Images 2.0, and it's impressive - with one exception

Elyse Betters Picaro / ZDNET

ZDNET’s key takeaways

OpenAI reframes images as a visual language.
Thinking mode builds context-aware infographics.
Brand consistency is still inconsistent in early testing.

Today, OpenAI introduced ChatGPT Images 2.0, its next-generation image model, which the company says is focused on precision, usability, and complex visual tasks.

The most notable new capability is the ability to combine text and images to build complex, beautiful pages. OpenAI is reframing the whole idea of image generation from a process that creates decorations (their word) to a language (also their term).

Also: The best AI image generators of 2026: There’s only one clear winner now

OpenAI describes it as, “A good image does what a good sentence does: it selects, arranges, and reveals. It can explain a mechanism, stage a mood, test an idea, or make an argument.”

Thinking capabilities enable complex workflows

In addition to its vastly improved ability to mix text and graphics, the new model uses enhanced thinking capabilities. It can generate multiple images per prompt with continuity across outputs. This approach is possible because the model actually integrates reasoning into the image output.

Created by ChatGPT/Screenshot by David Gewirtz/ZDNET

This shift is big. Instead of just generating an image that almost matches the prompt details, Images 2.0 can take a much vaguer prompt, like “Generate an infographic about activities I should do with tomorrow’s weather in San Francisco in mind.”

Also: How to switch from ChatGPT to Gemini

From this prompt, the AI will gather weather and activity data about San Francisco, determine activities appropriate to the weather, and then build an image or set of images that match the results.

According to OpenAI, “In this model, Images 2.0 acts more like a visual thought partner, helping carry a project from rough concept to finished asset with significantly less work on your part.”

Precision and design control improve usability

Many of us have long struggled to convince ChatGPT to generate images in a specific desired aspect ratio. Often, the AI stubbornly produces what it wants. But now, with Images 2.0, the model has support for “aspect ratios as wide as 3:1 and as tall as 1:3.”
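If you plan to script image requests, a quick pre-flight check against that quoted range can save a wasted generation. What follows is a minimal sketch, not anything from OpenAI: the function name is hypothetical, and treating both the 3:1 and 1:3 limits as inclusive is my assumption.

```python
from fractions import Fraction

# Limits quoted in the article: as wide as 3:1 and as tall as 1:3.
# Treating both limits as inclusive is an assumption.
WIDEST = Fraction(3, 1)
TALLEST = Fraction(1, 3)


def is_supported_aspect_ratio(width: int, height: int) -> bool:
    """Return True if width:height falls within the 1:3 to 3:1 range.

    Hypothetical helper for illustration; not part of any OpenAI SDK.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive integers")
    # Fraction keeps the comparison exact, avoiding float rounding at the edges.
    ratio = Fraction(width, height)
    return TALLEST <= ratio <= WIDEST


if __name__ == "__main__":
    for w, h in [(16, 9), (3, 1), (1, 3), (4, 1), (1, 4)]:
        print(f"{w}:{h} ->", is_supported_aspect_ratio(w, h))
```

Under those assumptions, 16:9 and the two boundary ratios pass, while 4:1 and 1:4 are rejected.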

The model also supports higher-fidelity outputs that (mostly) produce accurate object placement, detailed text rendering, and complex compositions. We’ll see if we can remove the word “mostly” from that sentence after the product is officially released.

Also: I tried Personal Intelligence, and it was accurate (but unsettling)

The AI also supports small text, UI elements, and stylistic constraints at up to 2K resolution. Cool.

Testing the preview

I was given access to a day-before-release preview, and the model is impressive, mostly. I fed it a screenshot of the ZDNET home page and a draft of the Images 2.0 press release.

Then I instructed, “Based on the contents of the press release, generate a 16:9 infographic about the new image update and generate it using the ZDNET brand style as shown in the ZDNET home page document.”

Also: I tried Google Photos’ new AI Enhance tool: How it crops, relights, and fixes your shots – sometimes

The model did a great job on the infographic, but try as it might, it couldn’t reproduce the ZDNET logo. On its first try, it rendered the Z in ZDNET with a slight droop.

I tried a variety of requests on the order of, “Fix the ZDNET logo. The Z droops in your version but is not droopy in the actual logo.” But Images 2.0 never managed to fix it.

So I started a new session. This time, I included the instruction, “Use special care to reproduce the ZDNET logo accurately.”

Also: I tested ChatGPT Plus vs. Gemini Pro to see which is better – and if it’s worth switching

Here’s where things got very odd. For its first run, the model somehow dug up a copy of ZDNET’s logo from before our 2022 redesign. This logo is nowhere to be found on our current home page. Weirdly, it rendered that old logo using the current color scheme. The model then pushed the logo and the infographic information off the left edge of the image. It also chose a light blue for “Images 2.0” that isn’t a ZDNET brand color.

I tried mightily to convince it to use the current logo. I managed to get it to push the image to the right, so nothing was cut off. But adding the prompt, “Use the ZDNET logo that’s on the provided page. Don’t search for an alternate logo,” did nothing to fix the problem.

I took one more shot at the problem before deciding to return to finishing up this article. Once again, I started a new session so the AI didn’t have muscle memory from its earlier miscalculations.

Also: This powerful Gemini setting made my AI results way more personal and accurate

The model messed up the logo again. This time, the AI decided to add a rudder shape to the stem of the stretched-out capital D.

To be fair, I’m using a pre-release version of Images 2.0. I’ll be back with a much more comprehensive test run of the model after the official product release.

I also tried a similar test using a different document with Google’s Nano Banana Pro, but because it didn’t handle the synthesis the way that this new version of OpenAI’s product does, it wasn’t really able to repeat the results I got here. We’ll know more as we do more advanced tests.

Pricing and availability

The new model is available today to all ChatGPT and Codex users. Advanced outputs and the thinking capability are available to ChatGPT Plus, Pro, Business, and Enterprise users. Be sure to select “Thinking” from the ChatGPT dropdown bar at the top of the screen.

At the time of writing, before release, the new Images 2.0 model is only available on the desktop. But OpenAI promises that these capabilities will be in the mobile version as well, including the ability to finger-select images using your mobile touchscreen.

The images are also available via API using the gpt-image-2 model. API pricing varies depending on the quality, thinkiness (my word), and desired image resolution.

If an AI can handle layout and content together, will that change how you approach design projects? Let us know in the comments below.

You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.
