Meet the AI Agent With Multiple Personalities

Pulse Reporter
Last updated: April 16, 2025 4:49 pm
In the coming years, agents are widely expected to take over more and more chores on behalf of humans, including using computers and smartphones. For now, though, they’re too error prone to be much use.

A new agent called S2, created by the startup Simular AI, combines frontier models with models specialized for using computers. The agent achieves state-of-the-art performance on tasks like using apps and manipulating files, and suggests that turning to different models in different situations may help agents advance.

“Computer-using agents are different from large language models and different from coding,” says Ang Li, cofounder and CEO of Simular. “It’s a different type of problem.”

In Simular’s approach, a powerful general-purpose AI model, like OpenAI’s GPT-4o or Anthropic’s Claude 3.7, is used to reason about how best to complete the task at hand, while smaller open source models step in for tasks like interpreting web pages.

Li, who was a researcher at Google DeepMind before founding Simular in 2023, explains that large language models excel at planning but aren’t as good at recognizing the elements of a graphical user interface.
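
As a rough illustration of this division of labor, the sketch below routes planning requests to a generalist model and interface-level requests to a specialist one. It is not Simular’s implementation: the `Model` class, the `make_router` helper, the task kinds, and the stub models are all hypothetical names invented for this example, and the "models" are plain functions so the code runs without any API access.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical task kinds; the article only distinguishes planning from
# interface-level work such as interpreting web pages.
PLANNING = "planning"
GROUNDING = "grounding"


@dataclass
class Model:
    name: str
    run: Callable[[str], str]  # stand-in for a real model call


def make_router(generalist: Model, specialist: Model) -> Callable[[str, str], str]:
    """Return a function that sends each request to the model suited to it."""
    routes: Dict[str, Model] = {PLANNING: generalist, GROUNDING: specialist}

    def route(kind: str, prompt: str) -> str:
        # Unknown kinds fall back to the generalist model.
        model = routes.get(kind, generalist)
        return f"[{model.name}] {model.run(prompt)}"

    return route


# Stub models (assumptions for illustration only).
generalist = Model("generalist", lambda p: f"plan for: {p}")
specialist = Model("specialist", lambda p: f"located element: {p}")

route = make_router(generalist, specialist)
print(route(PLANNING, "book a flight"))
print(route(GROUNDING, "the Search button"))
```

A real system would presumably decide the task kind automatically rather than take it as an argument; the explicit `kind` parameter here just keeps the routing step visible.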

S2 is designed to learn from experience with an external memory module that records actions and user feedback and uses those recordings to improve future actions.
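
One way to picture such a memory module is the minimal sketch below. It assumes a simple design that the article does not describe: episodes are stored in a list, feedback is a hypothetical +1 or -1 score, and recall uses naive word overlap between task descriptions. The names `Episode` and `ExperienceMemory` are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Episode:
    task: str
    actions: List[str]
    feedback: int  # hypothetical score: +1 helpful, -1 unhelpful


@dataclass
class ExperienceMemory:
    episodes: List[Episode] = field(default_factory=list)

    def record(self, task: str, actions: List[str], feedback: int) -> None:
        """Store what the agent did for a task and how the user rated it."""
        self.episodes.append(Episode(task, actions, feedback))

    def recall(self, task: str) -> Optional[List[str]]:
        """Return actions from the positively rated past episode whose
        task description shares the most words with the new task."""
        words = set(task.lower().split())
        candidates = [
            e for e in self.episodes
            if e.feedback > 0 and words & set(e.task.lower().split())
        ]
        if not candidates:
            return None
        best = max(candidates, key=lambda e: len(words & set(e.task.lower().split())))
        return best.actions


memory = ExperienceMemory()
memory.record("book a flight to Toronto", ["open browser", "search flights", "pick fare"], +1)
memory.record("book a flight to Paris", ["open email"], -1)
print(memory.recall("book a flight to Boston"))
```

The negatively rated episode is never returned, so poor past attempts do not get reused; a production system would likely need a far better similarity measure than word overlap.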

On particularly complex tasks, S2 performs better than any other model on OSWorld, a benchmark that measures an agent’s ability to use a computer operating system.

For example, S2 can complete 34.5 percent of tasks that involve 50 steps, beating OpenAI’s Operator, which can complete 32 percent. Similarly, S2 scores 50 percent on AndroidWorld, a benchmark for smartphone-using agents, while the next best agent scores 46 percent.

Victor Zhong, a computer scientist at the University of Waterloo in Canada and one of the creators of OSWorld, believes that future big AI models may incorporate training data that helps them understand the visual world and make sense of graphical user interfaces.

“This will help agents navigate GUIs with much higher precision,” Zhong says. “I think in the meantime, before such fundamental breakthroughs, state-of-the-art systems will resemble Simular in that they combine multiple models to patch the limitations of single models.”

To prepare for this column, I used Simular to book flights and scour Amazon for deals, and it seemed better than some of the open source agents I tried last year, including AutoGen and vimGPT.

But even the smartest AI agents are, it seems, still troubled by edge cases and occasionally exhibit odd behavior. In one instance, when I asked S2 to help find contact information for the researchers behind OSWorld, the agent got stuck in a loop hopping between the project page and the login for OSWorld’s Discord.

OSWorld’s benchmarks show why agents remain more hype than reality for now. While humans can complete 72 percent of OSWorld tasks, agents are foiled 38 percent of the time on complex tasks. That said, when the benchmark was released in April 2024, the best agent could complete only 12 percent of the tasks.
