
LLMs vs. Geolocation: GPT-5 Performs Worse Than Other AI Models

Pulse Reporter
Last updated: August 14, 2025 12:43 pm


In June, Bellingcat ran 500 geolocation tests, comparing LLMs from various companies against one another, as well as Google Lens, a staple tool for finding the location of photos.

At the time, ChatGPT o4-mini-high emerged as the clear winner, with Google Lens outperforming most other models. Just two months later, with new versions of these AI tools available, we re-ran the trial, this time adding Google “AI Mode,” GPT-5, GPT-5 Thinking, and Grok 4 to the mix.

These five photos were excluded from our most recent trial as they were published in our previous article.

The original test used 25 of Bellingcat’s own holiday photos. From cities to remote countryside, the images included scenes both with and without recognisable features, such as roads, signage, mountains, or architecture. Photos were sourced from every continent.

For the updated trial, five test photos were excluded, as they had appeared in a previous article, thus compromising the integrity of the results.

All 24 models’ responses were ranked on a scale from 0 to 10, with 10 indicating an accurate and specific identification (such as a neighbourhood, path, or landmark) and 0 indicating no attempt to identify the location at all.
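
As a rough illustration of how per-photo scores on this 0 to 10 scale can be turned into a per-model comparison, here is a minimal Python sketch. The model names and every score in it are made-up placeholders, not Bellingcat's data, and the `summarise` helper is hypothetical; the article does not describe how its averages were computed beyond the scale itself.

```python
from statistics import mean, pstdev

# Placeholder scores only (one value per test photo, 0-10 scale).
# These numbers are invented for illustration and are NOT Bellingcat's results.
scores = {
    "model_a": [10, 0, 7, 3],
    "model_b": [5, 5, 4, 5],
}


def summarise(per_photo_scores):
    """Return the average, spread, best and worst score for one model."""
    for s in per_photo_scores:
        if not 0 <= s <= 10:
            raise ValueError(f"score {s} is outside the 0-10 scale")
    return {
        "mean": mean(per_photo_scores),      # average accuracy across photos
        "spread": pstdev(per_photo_scores),  # how uneven the answers were
        "best": max(per_photo_scores),
        "worst": min(per_photo_scores),
    }


summaries = {name: summarise(vals) for name, vals in scores.items()}

for name, summary in summaries.items():
    print(name, summary)
```

In this toy data, `model_a` has the slightly higher average even though it also produces the single worst answer, which is why it can help to look at the spread of scores alongside the mean.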

Google AI Mode was shown to be the most capable geolocation tool overall.

Grok 4 gave both better and worse answers compared to Grok 3 but, on average, scored marginally higher. However, it was still less accurate than older versions of Gemini and GPT.

GPT-5, even in ‘Thinking’ and ‘Pro’ modes, was a considerable downgrade compared with the capabilities demonstrated by GPT o4-mini-high. In one example, of a city street with skyscrapers in the background, o4-mini-high correctly identified the street, whereas GPT-5 in Thinking mode pointed to the wrong country.

Despite delivering faster answers, GPT-5 appeared to sacrifice accuracy. A surprising number of errors and a general sense of disappointment in the new model have also been reported by other users.

Bellingcat tested GPT-5 and its ‘Thinking’ mode via the Plus subscription, which costs roughly the same as access to o4-mini-high prior to its retirement. Five of the most difficult test images were also run through GPT-5 Pro. But even Pro, with a premium price tag of €200 per month, did not geolocate the photos any more accurately than GPT o4-mini-high.

A Beach, a Hotel and a Ferris Wheel

The disparity between Google and the GPT models became even more apparent in Test 25, a photo of a shoreline hotel in Noordwijk, the Netherlands, with a Ferris wheel rising just beyond the dunes.

Test 25: A photo of Noordwijk beach in the Netherlands. Credit: Bellingcat.

In the previous trial, most older models, including those from GPT, Claude, Gemini and Grok, accurately identified the country as the Netherlands but failed to locate the town. Many latched onto the Ferris wheel but pointed instead to the seaside town of Scheveningen, which also has a Ferris wheel, though situated on a pier, not among the sand dunes.

However, the newest models, GPT-5 Pro and Thinking, were even less accurate, identifying a beach in France, an entirely different country.

Unfortunately for open source researchers, following the release of GPT-5, OpenAI removed the option to select older models such as o4-mini-high. After a wave of negative feedback, OpenAI reinstated GPT-4o as the default model for paid subscribers. However, the most capable geolocation models identified in Bellingcat’s testing remain inaccessible.

Google AI Mode, on the other hand, was the first, and only model so far, to correctly identify Noordwijk as the location in Test 25.

Although AI Mode is powered by a version of Gemini 2.5, it outperformed Gemini 2.5 Pro Deep Research in these tests. Described by Google as its “most powerful AI search, with more advanced reasoning and multimodality,” AI Mode geolocated test images with greater accuracy than any GPT models, including our previous winner, o4-mini-high.

AI Mode is currently only available in India, the United Kingdom and the United States.

The majority of models, at some point, returned a hallucination. Users should not rely solely on the answers provided by LLMs. Even the best options, including Google AI Mode, still, at times, confidently point to the wrong location.

The difference in models’ capabilities compared with just two months ago shows how quickly this field is evolving. However, OpenAI’s recent changes also suggest that progress is not guaranteed, and that AI’s ability to geolocate may plateau or even worsen over time. As new models emerge, Bellingcat will continue to test them.

Thanks to Nathan Patin for contributing to the original benchmark tests.

