Hidden costs in AI deployment: Why Claude models may be 20-30% more expensive than GPT in enterprise settings

Pulse Reporter
Last updated: May 1, 2025 10:21 pm

It’s a well-known fact that different model families can use different tokenizers. However, there has been limited analysis of how the process of “tokenization” itself varies across these tokenizers. Do all tokenizers produce the same number of tokens for a given input text? If not, how different are the generated tokens? How significant are the differences?

In this article, we explore these questions and examine the practical implications of tokenization variability. We present a comparative story of two frontier model families: OpenAI’s ChatGPT vs Anthropic’s Claude. Although their advertised “cost-per-token” figures are highly competitive, experiments reveal that Anthropic models can be 20–30% more expensive than GPT models.

API Pricing — Claude 3.5 Sonnet vs GPT-4o

As of June 2024, the pricing structure for these two advanced frontier models is highly competitive. Anthropic’s Claude 3.5 Sonnet and OpenAI’s GPT-4o have identical costs for output tokens, while Claude 3.5 Sonnet offers a 40% lower cost for input tokens.

Source: Vantage

The hidden “tokenizer inefficiency”

Despite the Anthropic model’s lower input token rates, we observed that the total cost of running experiments (on a given set of fixed prompts) with GPT-4o is much lower than with Claude 3.5 Sonnet.

Why?

The Anthropic tokenizer tends to break the same input into more tokens than OpenAI’s tokenizer does. As a result, for identical prompts, Anthropic models produce considerably more tokens than their OpenAI counterparts. Consequently, while the per-token cost for Claude 3.5 Sonnet’s input may be lower, the increased tokenization can offset these savings, leading to higher overall costs in practical use cases.

This hidden cost stems from the way Anthropic’s tokenizer encodes information, often using more tokens to represent the same content. The token count inflation has a significant impact on costs and context window utilization.
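
The arithmetic behind this trade-off can be sketched as follows. The per-million-token prices are assumptions, chosen only to match the relationship described above (identical output price, 40% lower input price for Claude 3.5 Sonnet). The workload sizes and the `overhead` value are hypothetical, and the sketch assumes the same overhead applies to input and output tokens.

```python
def cost_usd(input_tokens, output_tokens, input_price, output_price):
    """Cost of one workload; prices are in USD per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def claude_to_gpt_cost_ratio(gpt_input_tokens, gpt_output_tokens, overhead,
                             gpt_prices=(5.0, 15.0), claude_prices=(3.0, 15.0)):
    """Ratio of Claude cost to GPT cost for the same text.

    `overhead` is the fractional token inflation (0.30 means 30% more tokens).
    The default prices are assumptions, not figures quoted in this article.
    """
    gpt_cost = cost_usd(gpt_input_tokens, gpt_output_tokens, *gpt_prices)
    # The same text needs (1 + overhead) times as many Claude tokens.
    claude_cost = cost_usd(gpt_input_tokens * (1 + overhead),
                           gpt_output_tokens * (1 + overhead),
                           *claude_prices)
    return claude_cost / gpt_cost


# Hypothetical workload: one million input and one million output GPT tokens.
print(claude_to_gpt_cost_ratio(1_000_000, 1_000_000, overhead=0.30))
```

Under these assumed prices, the ratio depends strongly on the input/output mix: it moves toward 1 + overhead as output tokens dominate, and stays below 1 for input-only workloads.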

Domain-dependent tokenization inefficiency

Different types of domain content are tokenized differently by Anthropic’s tokenizer, leading to varying levels of increased token counts compared to OpenAI’s models. The AI research community has noted similar tokenization differences. We tested our findings on three popular domains: English articles, code (Python) and math.

Domain             GPT Tokens   Claude Tokens   % Token Overhead
English articles   77           89              ~16%
Code (Python)      60           78              ~30%
Math               114          138             ~21%

% Token Overhead of Claude 3.5 Sonnet Tokenizer (relative to GPT-4o). Source: Lavanya Gupta
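
The overhead column can be reproduced from the two token-count columns. A minimal sketch, using only the counts in the table above; the function and variable names are illustrative:

```python
# Token counts copied from the table: (GPT-4o tokens, Claude 3.5 Sonnet tokens).
TOKEN_COUNTS = {
    "English articles": (77, 89),
    "Code (Python)": (60, 78),
    "Math": (114, 138),
}


def token_overhead_pct(gpt_tokens, claude_tokens):
    """Extra Claude tokens as a percentage of the GPT token count."""
    return (claude_tokens - gpt_tokens) / gpt_tokens * 100


for domain, (gpt, claude) in TOKEN_COUNTS.items():
    print(f"{domain}: ~{token_overhead_pct(gpt, claude):.0f}%")
```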

When comparing Claude 3.5 Sonnet to GPT-4o, the degree of tokenizer inefficiency varies significantly across content domains. For English articles, Claude’s tokenizer produces approximately 16% more tokens than GPT-4o for the same input text. This overhead increases sharply with more structured or technical content: for mathematical equations, the overhead stands at 21%, and for Python code, Claude generates 30% more tokens.

This variation arises because some content types, such as technical documents and code, often contain patterns and symbols that Anthropic’s tokenizer fragments into smaller pieces, leading to a higher token count. In contrast, more natural language content tends to exhibit a lower token overhead.

Other practical implications of tokenizer inefficiency

Beyond the direct effect on costs, there is also an indirect impact on context window utilization. Anthropic models advertise a larger context window of 200K tokens, versus OpenAI’s 128K tokens, but because Anthropic’s tokenizer is more verbose, the effective usable token space may be smaller than the advertised figure suggests. Hence, there can be a gap between the “advertised” and the “effective” context window sizes.
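
One way to make the “effective” window concrete is to express an advertised window in GPT-4o-equivalent tokens. The sketch below assumes the per-domain overheads measured in this article hold uniformly across a full context window, which is a simplification; the function name is illustrative.

```python
def gpt_equivalent_window(advertised_tokens, overhead):
    """Approximate amount of text, measured in GPT-4o tokens, that fits in a
    window of `advertised_tokens` when the tokenizer emits (1 + overhead)
    times as many tokens for the same text."""
    return int(advertised_tokens / (1 + overhead))


# Overheads taken from the domain table in this article.
OVERHEADS = {"English articles": 0.16, "Code (Python)": 0.30, "Math": 0.21}

for domain, overhead in OVERHEADS.items():
    print(domain, gpt_equivalent_window(200_000, overhead))
```

The difference between the advertised figure and each printed value is the portion of the window consumed by tokenizer verbosity for that domain.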

Implementation of tokenizers

GPT models use Byte Pair Encoding (BPE), which merges frequently co-occurring character pairs to form tokens. Specifically, the latest GPT models use the open-source o200k_base tokenizer. The actual tokens used by GPT-4o (in the tiktoken tokenizer) are publicly viewable.
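
The merge step that BPE relies on can be illustrated with a toy trainer. This is a character-level sketch for intuition only: it is not tiktoken's implementation, production tokenizers operate on bytes with pre-tokenization rules, and the function names are illustrative.

```python
from collections import Counter


def most_frequent_pair(sequences):
    """Return the most common adjacent symbol pair, or None if there is none."""
    pairs = Counter()
    for seq in sequences:
        pairs.update(zip(seq, seq[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(seq, pair):
    """Replace every occurrence of `pair` in `seq` with one merged symbol."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged


def train_bpe(words, num_merges):
    """Learn up to `num_merges` merges; return the merges and the tokenized words."""
    sequences = [list(word) for word in words]
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(sequences)
        if pair is None:
            break
        merges.append(pair)
        sequences = [merge_pair(seq, pair) for seq in sequences]
    return merges, sequences


merges, tokens = train_bpe(["low", "lower", "lowest"], num_merges=2)
print(merges)
print(tokens)
```

A vocabulary with more learned merges tends to cover the same text with fewer tokens, which is one plausible reason two tokenizers can disagree on token counts.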

Python

{
    # reasoning
    "o1-xxx": "o200k_base",
    "o3-xxx": "o200k_base",

    # chat
    "chatgpt-4o-": "o200k_base",
    "gpt-4o-xxx": "o200k_base",  # e.g., gpt-4o-2024-05-13
    "gpt-4-xxx": "cl100k_base",  # e.g., gpt-4-0314, etc., plus gpt-4-32k
    "gpt-3.5-turbo-xxx": "cl100k_base",  # e.g., gpt-3.5-turbo-0301, -0401, etc.
}
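
A mapping like this is typically consulted by prefix match on the model name. The sketch below is a hypothetical lookup, not tiktoken's own code: the prefixes are derived from the listing above by dropping the "xxx" placeholder, which is an assumption, and the function name is illustrative. The tiktoken library provides its own `tiktoken.encoding_for_model` for real use.

```python
# Prefixes assumed from the listing above ("xxx" placeholder removed).
PREFIX_TO_ENCODING = {
    "o1-": "o200k_base",
    "o3-": "o200k_base",
    "chatgpt-4o-": "o200k_base",
    "gpt-4o-": "o200k_base",
    "gpt-4-": "cl100k_base",
    "gpt-3.5-turbo-": "cl100k_base",
}


def encoding_name_for(model_name):
    """Return the encoding name for the longest matching prefix, or None."""
    matches = [p for p in PREFIX_TO_ENCODING if model_name.startswith(p)]
    if not matches:
        return None
    return PREFIX_TO_ENCODING[max(matches, key=len)]


print(encoding_name_for("gpt-4o-2024-05-13"))
print(encoding_name_for("gpt-4-0314"))
```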

Unfortunately, not much can be said about Anthropic’s tokenizers, as they are not as directly and easily available as GPT’s. Anthropic released its Token Counting API in Dec 2024. However, it was soon retired in later 2025 versions.

Latenode reports that “Anthropic uses a unique tokenizer with only 65,000 token variations, compared to OpenAI’s 100,261 token variations for GPT-4.” This Colab notebook contains Python code to analyze the tokenization differences between GPT and Claude models. Another tool that enables interfacing with some common, publicly available tokenizers validates our findings.

The ability to proactively estimate token counts (without invoking the actual model API) and budget costs is crucial for AI enterprises.

Key Takeaways

  • Anthropic’s competitive pricing comes with hidden costs:
    While Anthropic’s Claude 3.5 Sonnet offers 40% lower input token costs compared to OpenAI’s GPT-4o, this apparent cost advantage can be misleading due to differences in how input text is tokenized.
  • Hidden “tokenizer inefficiency”:
    Anthropic’s tokenizer is inherently more verbose, producing more tokens for the same text. For businesses that process large volumes of text, understanding this discrepancy is crucial when evaluating the true cost of deploying models.
  • Domain-dependent tokenizer inefficiency:
    When choosing between OpenAI and Anthropic models, evaluate the nature of your input text. For natural language tasks, the cost difference may be minimal, but technical or structured domains may lead to significantly higher costs with Anthropic models.
  • Effective context window:
    Due to the verbosity of Anthropic’s tokenizer, its larger advertised 200K context window may offer less effective usable space relative to OpenAI’s 128K than the headline figures suggest, leading to a potential gap between the advertised and actual context window.

Anthropic did not respond to VentureBeat’s requests for comment by press time. We’ll update the story if they respond.
