
ZDNET’s key takeaways
- GPT-5.5 delivers polished, helpful answers across tasks.
- Strong performance across writing, coding, and reasoning tasks.
- Overeagerness hurts accuracy and instruction following.
OpenAI has released GPT-5.5, which can be reductively described as better and faster than GPT-5.4. The new large language model shows improvements in agentic coding, conceptual clarity, scientific research ability, and accuracy across knowledge work.
This release follows closely on the heels of the introduction of ChatGPT Images 2.0 earlier this week, which combines AI intelligence with image generation. And if it also seems like we just discussed the release of GPT-5.4, you’re not wrong.
As the following chart shows, OpenAI’s release cadence has sped up dramatically, most likely because AI coding has significantly reduced OpenAI’s development time.
That chart was generated entirely by ChatGPT 5.5 Thinking using Images 2.0. All I did was tell the AI that I wanted to visualize the release cadence between GPT releases and wanted it presented in the ZDNET brand style. I also provided a PNG of the ZDNET logo.
The whole process, including some minor corrections, took less than 10 minutes. I’ve been researching data and creating professional-looking informational charts like this by hand since the invention of computer graphics. Something like this would take at least two hours to create, not 10 minutes.
I’ve already done some testing of the Images 2.0 capabilities. I’ll be back with more next week. In this article, I’m focusing on GPT-5.5’s knowledge capabilities.
I ran GPT-5.5 through my 10-point testing process. I was both impressed and annoyed. The results were solid, but the model tended to be a little too exuberant, doing work I didn’t ask it to do.
Since GPT-5.5 is only available in paid tiers (Plus and above), I used ChatGPT Plus for my tests. Right now, my Plus account only shows GPT-5.5 available for the Thinking effort level in both Standard and Extended. I picked Standard Thinking. That’s the effort I used for these tests.
Let’s get started.
Test 1: Summarize a news story
- Available points: 10
- Awarded points: 5
This test looks at how well the AI can read a story on the web and explain it. I used Yahoo News because Yahoo doesn’t block AI access. I also looked for a story that’s as non-political as possible. Today, that meant I had to go a good way down the news page to find a story on the recent LaGuardia runway crash.
GPT-5.5 did correctly summarize the meat of the story, but it didn’t follow my instructions to use Yahoo News as the source. For GPT-5.2, I deducted one point because ChatGPT used information from Axios and Yahoo. This time, I took off five points, because it used information from AP, The Sun, Wall Street Journal, The Guardian, and even Wikipedia.
If I had wanted a comprehensive news answer, that would have been fine. But the prompt specifically said to look at Yahoo News, and GPT-5.5 pretty much ignored that instruction.
There’s a big push from all the AI companies about running autonomous agents. But if even a simple summary prompt can’t be followed correctly, it doesn’t give me confidence that it’s safe to let agents run wild on long-horizon projects. Just sayin’.
Test 2: Academic concept explanation
- Available points: 10
- Awarded points: 10
This challenge asked the AI to explain educational constructivism to a five-year-old. It tested how well the AI can research and report on a concept, and then adjust its explanation style to the desired target level.
GPT-5.5 provided a very clear answer that included an example a five-year-old could picture and understand. All 10 points were awarded.
Test 3: Math and analysis
- Available points: 10
- Awarded points: 10
This test was designed to check the AI’s math and pattern-recognition skills. I handed the model a sequence of numbers. Those numbers were part of a math trope called the Fibonacci sequence, but I didn’t tell the AI that.
When asked to fill in some numbers in the sequence, the AI had to understand the pattern and perform the calculations to complete the sequence. It did the math correctly.
The AI was also instructed to “explain your reasoning.” All I got back was, “The sequence is the Fibonacci sequence: each number is the sum of the two numbers before it.” This was a correct explanation and comparable to the results from previous releases.
I awarded this test 10 points because, although brief, the answer was correct.
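For readers who want to see the rule in action, here is a minimal Python sketch of the fill-in-the-blanks task. It is only an illustration: the function name `fill_fibonacci_gaps` and the sample sequence are assumptions, not the actual prompt or the model's output.

```python
from typing import List, Optional


def fill_fibonacci_gaps(seq: List[Optional[int]]) -> List[Optional[int]]:
    """Fill None entries using the rule x[i] = x[i-1] + x[i-2].

    Any window of three neighbors with exactly one unknown can be solved,
    so we keep sweeping until a full pass fills nothing new.
    """
    out = list(seq)
    changed = True
    while changed:
        changed = False
        for i in range(len(out) - 2):
            a, b, c = out[i], out[i + 1], out[i + 2]
            if a is not None and b is not None and c is None:
                out[i + 2] = a + b      # next = sum of the two before it
                changed = True
            elif a is not None and b is None and c is not None:
                out[i + 1] = c - a      # middle = next minus first
                changed = True
            elif a is None and b is not None and c is not None:
                out[i] = c - b          # first = next minus middle
                changed = True
    return out


# Hypothetical sample with two missing values (marked None).
sample = [1, 1, 2, None, 5, 8, None, 21]
print(fill_fibonacci_gaps(sample))
```

Entries that cannot be determined from their neighbors are simply left as `None`.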
Test 4: Cultural discussion
- Available points: 10
- Awarded points: 10
This test asked the AI to assemble a case, form a coherent argument, and present an opinion on an issue that doesn’t have a definitive right or wrong answer. I asked, “Do you think social media has improved or worsened communication in society? Provide two reasons for your view.”
Interestingly, GPT-5.5 thought social media “has worsened communication overall.” I tended to agree. The model provided two solid reasons. The first was that it “often rewards speed and reaction over thoughtfulness.” The second was that social media “tends to create information bubbles.” For each reason, GPT-5.5 provided a supporting paragraph.
Both of those reasons were valid. It also shared a quick list of the positive benefits of social media, including helping people stay connected, organize for causes, and share information widely.
GPT-5.5 gave an answer that was concise, well-considered, and clear. It got 10 points for this test.
Test 5: Literary analysis
- Available points: 10
- Awarded points: 10
This test examined the AI’s understanding of a piece of contemporary literature: A Game of Thrones, the first book in the A Song of Ice and Fire series. The test asked what the main themes are, and why they’re important.
GPT-5.5 gave me back a 632-word response that broke the book down into the following themes:
- Power and its cost
- The collapse of heroic fantasy ideals
- Family, loyalty, and inherited conflict
- Honor versus pragmatism
- Identity and self-invention
- The human cost of war
- The danger of political distraction
- Prophecy, religion, and uncertainty
- Justice and revenge
- The return of the ignored past
GPT-5.5 provided clear explanations for each theme, why it was included, how it related to the book, and what it meant to the overall series. It’s hard to be strictly objective with something like this, but I really got the feeling this was the most nuanced answer I’ve seen to this question from my various GPT model tests.
All 10 points were awarded.
Test 6: Travel itinerary
- Available points: 10
- Awarded points: 9
This test evaluated the AI’s knowledge of geographic areas and its ability to create a useful travel itinerary based on specific interests. I asked it to plan a week-long vacation in Boston in March focused on technology and history.
Of all the times I’ve asked this question of AIs, GPT-5.5 produced the best version for points of interest and day schedules. The model didn’t just hit the major tourist landmarks; it also pointed out a nice mix of historic and tech points of interest. GPT-5.5 took into account that March is likely to be a bit unpleasant, so it mixed in both indoor and outdoor activities, along with fallback plans.
While it didn’t recommend a lot of eateries, GPT-5.5 did recommend Legal Sea Foods, which is one of my personal favorite locations. The model lost a point because it made absolutely no reference to costs.
I feel like GPT-5.5 really grokked (yes, I did that) what someone would want in an itinerary by providing a strong list of activities to get excited about. But the AI didn’t fulfill the travel advisor part of the process because it didn’t cover budgeting.
Test 7: Emotional support
- Available points: 10
- Awarded points: 10
The emotional support question asked for advice and words of encouragement for an upcoming job interview. I have to say I really liked this AI’s response.
The AI included some encouragement, like “The interview is not an interrogation. It’s a mutual fit conversation.” It also gave some practical advice. First, GPT-5.5 suggested preparing three stories the job seeker could use during the interview: one about solving a problem, one about working with others, and one about learning or recovering from something difficult.
The model gave a simple breathing exercise. It said that it’s okay to pause before answering a question. It was also encouraging, noting that landing the interview meant there was already something about the candidate that the hiring company found interesting.
Good, solid, useful answers: 10 points.
Test 8: Translation and cultural relevance
- Available points: 10
- Awarded points: 9
My test prompt asked GPT-5.5 to translate a phrase from English to Latin and then explain the cultural relevance of Latin in today’s world.
The phrase I asked it to translate was, “The celebration will take place tomorrow in the town square.” GPT-5.5 gave me back two choices, “Celebratio cras in foro oppidi fiet,” and what it called a slightly more formal alternative, “Celebratio cras in foro publico oppidi habebitur.”
The first version is a word-for-word translation of the requested phrase. But the second translates back to English as, “The celebration will be held tomorrow in the town’s public forum,” which was not the phrase I asked for.
GPT-5.5 may have thought it was helpful to provide an additional variation, but for someone who doesn’t speak Latin, all that approach does is confuse the issue. Which is the Latin phrase that should be used? I’m deducting a point for overeagerness that doesn’t strictly follow the prompt.
As for the second half of the question, GPT-5.5 answered briefly, but accurately.
Test 9: Coding test
- Available points: 10
- Awarded points: 10
Chatbot coding test results are interesting. They’re different in nature from the kinds of results you get when testing coding agents like Codex or Claude Code.
While the LLMs in the chatbots and coding agents are generally similar, I’ve found that the models are considerably more accurate on requests when running in the coding agents than when running in the chatbots. I haven’t been able to get any of the AI companies to explain why, but I’m guessing it has something to do with how the two different tools allocate resources and training data.
The test case for this question was the second test in my coding metrics article, which asked the AI to clean up a buggy snippet of code for validating whether a dollar amount was properly entered into a field.
The AI passed this test. The one thing the AI’s code did that could be an issue is rejecting a number that included a comma. But that’s actually still a safe response. If the user enters “1,000.00,” the code returns false. It might take the user a moment to try again with “1000.00,” but it won’t harm the system.
GPT-5.5 got all 10 points for this test.
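To make the comma trade-off concrete, here is a minimal Python sketch of a strict dollar-amount validator. This is not the code from my test or the model's output; the function name `is_valid_dollar_amount` and the exact rule (ASCII digits, optionally followed by a decimal point and one or two more digits) are assumptions chosen only to show the fail-safe behavior.

```python
import re

# Assumed rule: one or more ASCII digits, then an optional decimal part
# with one or two digits. Commas, currency symbols, and signs are rejected.
_DOLLAR_PATTERN = re.compile(r"[0-9]+(\.[0-9]{1,2})?")


def is_valid_dollar_amount(text: str) -> bool:
    """Return True only if the whole string matches the strict pattern."""
    # fullmatch requires the entire input to match, so trailing junk
    # (including a stray newline) is rejected.
    return _DOLLAR_PATTERN.fullmatch(text) is not None


for candidate in ["1000.00", "1,000.00", "12.5", "12.345", ""]:
    print(repr(candidate), is_valid_dollar_amount(candidate))
```

Rejecting “1,000.00” costs the user a retry, but nothing ambiguous ever reaches the rest of the system. A more forgiving validator could strip commas first, at the cost of having to decide what inputs like “1,00.0” should mean.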
Test 10: Creative writing
- Available points: 10
- Awarded points: 10
This test is among the most fun in the entire question suite. It asked GPT-5.5 to write a story longer than 1,500 words, as described in the second prompt in this article. The aim was to explore the creativity and comprehensiveness of the chatbot’s answer.
Unlike the other tests, I ran this evaluation in Extended mode to see just how good the story could get. I’m not sure the AI took much advantage of this option, because it only ran for eight seconds. Still, it was frickin’ awesome.
GPT-5.5 gave me back 4,049 words, which I think is the longest story I’ve gotten back from an AI in all my tests of this particular challenge.
I liked how GPT-5.5 opened the story by saying, “By the year 2339, most of Boston had become very good at pretending it was not old.” I was hooked.
I tried to get Voice Mode to read it to me like a bedtime story. However, the AI first said the story was too long. It then offered to read the story to me section by section. When I agreed to that approach, nothing happened; it just hung. I’m not deducting points for that failure because it isn’t part of the standard evaluation test, but it’s disappointing nonetheless.
Unfortunately, since I asked the AI to read the story via Voice Mode, I can’t share the output from within ChatGPT. What I didn’t know is that the three-dot icon after the response had a ‘Read aloud’ option, which probably would have worked.
That said, I copied the response to Google Docs, so you can still read it there, if you so desire.
Here are a few more quotes from the full response:
- Jackson, who had clearly been waiting all his life to hear someone say “the one in the back” in a mysterious bookstore, looked radiant. Ophelia looked as if she was beginning to calculate exits.
- “My dear,” Archibald said, “by 2339, proof works however the rich can persuade it to.”
- One stopped before Jackson: a slim book bound in copper mesh titled The Gentleman’s Guide to Looking Ridiculous with Conviction. Jackson gasped. “I feel seen.”
- This time, a small envelope slid out and landed in Archibald’s lap. It was addressed in his own hand. To myself, if I become unbearable.
- The purple door stood open behind them. Beyond it, the front of the shop looked warm, ordinary, and only mildly impossible.
I’ve given this writing assignment before, and in every incarnation it has been impressive. But this output took the delightful cozy paranormality to an entirely new level. Enthusiastically 10 out of 10.
For kicks, I asked GPT-5.5 to “draw me a picture that perfectly illustrates this story in 16:9 aspect ratio.” Here’s what was returned:
The AI correctly illustrated all the characters to the point that I could identify each character. Jackson, mentioned above, is the guy with the hat. Archibald is the guy with the cane.
Overall test results
Overall, the tests can award up to 100 points. The current version, GPT-5.5, scored 93. GPT-5.2 scored 92. GPT-5.1 scored 91. You might think this latest build would do better than a point or two improvement over the previous versions, but the model’s own overeagerness brought it down.
On the first test, the one asking about current news, I asked the AI to summarize one source. Instead, it looked for the same news from six separate sources. It overreached and lost points.
The same problem occurred with the translation assignment. I asked GPT-5.5 to translate a sentence to another language, one I presumably don’t speak. It gave back two translations to choose from. Now, how is that helpful? If I don’t speak the language, how would I choose which translation I like better?
These two overzealous reactions cost the model six points. It could have scored a 99 (losing one point for skipping budget information on the travel question). But, instead, it scored a mere 93.
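For anyone who wants to check the arithmetic, here is a short Python tally using the per-test scores reported above. The dictionary keys and variable names are just labels chosen for this sketch.

```python
# Awarded points per test, as reported in each section above (10 available each).
awarded = {
    "news_summary": 5,
    "concept_explanation": 10,
    "math_analysis": 10,
    "cultural_discussion": 10,
    "literary_analysis": 10,
    "travel_itinerary": 9,
    "emotional_support": 10,
    "translation": 9,
    "coding": 10,
    "creative_writing": 10,
}

# Points deducted specifically for overeagerness.
overeager_deductions = {"news_summary": 5, "translation": 1}

total = sum(awarded.values())
lost_to_overeagerness = sum(overeager_deductions.values())
hypothetical_total = total + lost_to_overeagerness

print(f"Actual score: {total}/100")
print(f"Lost to overeagerness: {lost_to_overeagerness}")
print(f"Score without those deductions: {hypothetical_total}/100")
```

The only remaining deduction in the hypothetical total is the single point for the missing budget information in the travel test.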
That said, I quite like this release. The answers were all good, apart from the excessive enthusiasm. The ability to add relevant images, such as the infographic at the beginning and the bookstore illustration at the end, opens avenues for fun and work effectiveness.
I see no reason to recommend against GPT-5.5. I will be using the model as my default choice moving forward. Stay tuned, because I’ll be doing a lot more with the improved image features of Images 2.0 in ChatGPT with GPT-5.5.
Do you prefer a model that gives one exact answer or one that provides extra options? Let us know in the comments below.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.