Hidden costs in AI deployment: Why Claude models may be 20-30% more expensive than GPT in enterprise settings

Last updated: May 2, 2025 6:35 am
Published May 2, 2025

It is a well-known fact that different model families can use different tokenizers. However, there has been limited analysis of how the process of "tokenization" itself varies across these tokenizers. Do all tokenizers result in the same number of tokens for a given input text? If not, how different are the generated tokens? How significant are the differences?

In this article, we explore these questions and examine the practical implications of tokenization variability. We present a comparative story of two frontier model families: OpenAI's ChatGPT vs Anthropic's Claude. Although their advertised "cost-per-token" figures are highly competitive, experiments reveal that Anthropic models can be 20–30% more expensive than GPT models.

API Pricing — Claude 3.5 Sonnet vs GPT-4o

As of June 2024, the pricing structure for these two advanced frontier models is highly competitive. Both Anthropic's Claude 3.5 Sonnet and OpenAI's GPT-4o have identical costs for output tokens, while Claude 3.5 Sonnet offers a 40% lower cost for input tokens.

[Pricing comparison image] Source: Vantage

The hidden “tokenizer inefficiency”

Despite the lower input token rates of the Anthropic model, we observed that the total cost of running experiments (on a given set of fixed prompts) with GPT-4o was much lower compared to Claude 3.5 Sonnet.

Why?

The Anthropic tokenizer tends to break down the same input into more tokens compared to OpenAI's tokenizer. This means that, for identical prompts, Anthropic models produce considerably more tokens than their OpenAI counterparts. As a result, while the per-token cost of Claude 3.5 Sonnet's input may be lower, the increased tokenization can offset these savings, leading to higher overall costs in practical use cases.


This hidden cost stems from the way Anthropic's tokenizer encodes information, often using more tokens to represent the same content. The token count inflation has a significant impact on costs and context window utilization.

Domain-dependent tokenization inefficiency

Different types of domain content are tokenized differently by Anthropic's tokenizer, leading to varying levels of increased token counts compared to OpenAI's models. The AI research community has noted similar tokenization differences. We tested our findings on three popular domains, namely: English articles, code (Python), and math.

Domain             GPT Tokens   Claude Tokens   % Token Overhead
English articles       77            89               ~16%
Code (Python)          60            78               ~30%
Math                  114           138               ~21%

% Token Overhead of Claude 3.5 Sonnet Tokenizer (relative to GPT-4o). Source: Lavanya Gupta

When comparing Claude 3.5 Sonnet to GPT-4o, the degree of tokenizer inefficiency varies significantly across content domains. For English articles, Claude's tokenizer produces roughly 16% more tokens than GPT-4o for the same input text. This overhead increases sharply with more structured or technical content: for mathematical equations, the overhead stands at 21%, and for Python code, Claude generates 30% more tokens.

This variation arises because some content types, such as technical documents and code, often contain patterns and symbols that Anthropic's tokenizer fragments into smaller pieces, leading to a higher token count. In contrast, more natural language content tends to exhibit a lower token overhead.
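The overhead percentages in the table above follow directly from the raw token counts. A quick sketch to reproduce them:

```python
# Reproduce the % token overhead figures from the table above.
# Overhead = (Claude tokens - GPT tokens) / GPT tokens.
counts = {
    "English articles": (77, 89),
    "Code (Python)": (60, 78),
    "Math": (114, 138),
}

for domain, (gpt_tokens, claude_tokens) in counts.items():
    overhead = (claude_tokens - gpt_tokens) / gpt_tokens
    print(f"{domain}: ~{overhead:.0%} more tokens")
# → English articles: ~16%, Code (Python): ~30%, Math: ~21%
```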

Other practical implications of tokenizer inefficiency

Beyond the direct implication on costs, there is also an indirect impact on context window utilization. While Anthropic models claim a larger context window of 200K tokens, versus OpenAI's 128K tokens, due to verbosity the effective usable token space may be smaller for Anthropic models. Hence, there could potentially be a small or large difference between the "advertised" context window sizes and the "effective" context window sizes.
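To make this concrete, the advertised window can be deflated by the measured token overhead to estimate how much content actually fits. A rough sketch, treating the overhead figures from the table above as representative of each domain (an assumption):

```python
# Deflate Claude's advertised 200K window by the measured token overhead
# to estimate how much content fits, in GPT-4o-token-equivalent terms.
ADVERTISED_CLAUDE_WINDOW = 200_000

# Overheads from the table above (assumed representative of each domain).
overheads = {"English articles": 0.16, "Code (Python)": 0.30, "Math": 0.21}

for domain, overhead in overheads.items():
    effective = ADVERTISED_CLAUDE_WINDOW / (1 + overhead)
    print(f"{domain}: ~{effective:,.0f} GPT-4o-equivalent tokens")
```

Even at 30% overhead the deflated window (~154K GPT-4o-equivalent tokens) still exceeds GPT-4o's 128K, but the gap between advertised and effective capacity is considerably narrower than the headline 200K-vs-128K comparison suggests.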


Implementation of tokenizers

GPT models use Byte Pair Encoding (BPE), which merges frequently co-occurring character pairs to form tokens. Specifically, the latest GPT models use the open-source o200k_base tokenizer. The encodings used by GPT models (in the tiktoken tokenizer) are listed below.

Python

{
    # reasoning
    "o1-xxx": "o200k_base",
    "o3-xxx": "o200k_base",

    # chat
    "chatgpt-4o-": "o200k_base",
    "gpt-4o-xxx": "o200k_base",  # e.g., gpt-4o-2024-05-13
    "gpt-4-xxx": "cl100k_base",  # e.g., gpt-4-0314, etc., plus gpt-4-32k
    "gpt-3.5-turbo-xxx": "cl100k_base",  # e.g., gpt-3.5-turbo-0301, -0401, etc.
}
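The BPE idea itself is simple: start from individual characters and repeatedly merge the most frequent adjacent symbol pair. A minimal, self-contained toy illustration (not the actual o200k_base implementation, which operates on bytes with a pretrained merge table):

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Toy BPE: repeatedly merge the most frequent adjacent symbol pair."""
    # Start with each word as a sequence of single characters.
    corpus = [list(w) for w in words]
    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs across the corpus.
        pairs = Counter()
        for symbols in corpus:
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace every occurrence of the best pair with a merged symbol.
        merged = best[0] + best[1]
        for symbols in corpus:
            i = 0
            while i < len(symbols) - 1:
                if (symbols[i], symbols[i + 1]) == best:
                    symbols[i:i + 2] = [merged]
                else:
                    i += 1
    return merges, corpus

merges, corpus = bpe_merges(["lower", "lowest", "low"], num_merges=3)
print(merges)  # → [('l', 'o'), ('lo', 'w'), ('low', 'e')]
```

A tokenizer trained on a different corpus learns different merges, which is exactly why the same input text can split into different token counts across model families.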

Unfortunately, not much can be said about Anthropic's tokenizers, as their tokenizer is not as directly and easily accessible as GPT's. Anthropic released their Token Counting API in December 2024. However, it was soon deprecated in later 2025 versions.

Latenode reports that "Anthropic uses a unique tokenizer with only 65,000 token variations, compared to OpenAI's 100,261 token variations for GPT-4." This Colab notebook contains Python code to analyze the tokenization differences between GPT and Claude models. Another tool that allows interfacing with some common, publicly accessible tokenizers validates our findings.

The ability to proactively estimate token counts (without invoking the actual model API) and budget costs is crucial for AI enterprises.
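For budgeting, per-workload cost can be sketched from estimated token counts and published per-million-token rates. The rates below are the June 2024 list prices discussed earlier (in $ per 1M tokens) and are an assumption here; verify current pricing and substitute your own measured token counts:

```python
# Budget a workload's cost from estimated token counts and
# per-million-token rates (June 2024 list prices; verify before use).
PRICES = {
    "gpt-4o": {"input": 5.00, "output": 15.00},
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
}

def request_cost(model, input_tokens, output_tokens):
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Same Python-code prompt: Claude's tokenizer emits ~30% more input tokens.
gpt_cost = request_cost("gpt-4o", input_tokens=60_000, output_tokens=10_000)
claude_cost = request_cost("claude-3.5-sonnet", input_tokens=78_000, output_tokens=10_000)
print(f"GPT-4o: ${gpt_cost:.3f}  Claude 3.5 Sonnet: ${claude_cost:.3f}")
```

Because output rates are identical and output verbosity varies per model and prompt, the comparison is only as good as the token estimates fed into it.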

Key Takeaways

  • Anthropic's competitive pricing comes with hidden costs:
    While Anthropic's Claude 3.5 Sonnet offers 40% lower input token costs compared to OpenAI's GPT-4o, this apparent cost advantage can be misleading due to differences in how input text is tokenized.
  • Hidden "tokenizer inefficiency":
    Anthropic models are inherently more verbose. For businesses that process large volumes of text, understanding this discrepancy is crucial when evaluating the true cost of deploying models.
  • Domain-dependent tokenizer inefficiency:
    When choosing between OpenAI and Anthropic models, evaluate the nature of your input text. For natural language tasks, the cost difference may be minimal, but technical or structured domains may lead to significantly higher costs with Anthropic models.
  • Effective context window:
    Due to the verbosity of Anthropic's tokenizer, its larger advertised 200K context window may offer less effective usable space than OpenAI's 128K, leading to a possible gap between the advertised and actual context window.

Anthropic did not respond to VentureBeat's requests for comment by press time. We'll update the story if they respond.
