RoboNewsWire – Latest Insights on AI, Robotics, Crypto and Tech Innovations

New model design could fix high enterprise AI costs

By GT | November 6, 2025 | AI


Enterprise leaders grappling with the steep costs of deploying AI models could find a reprieve thanks to a new architecture design.

While the capabilities of generative AI are attractive, the models’ immense computational demands for both training and inference result in prohibitive expenses and mounting environmental concerns. At the centre of this inefficiency is the “fundamental bottleneck” of an autoregressive process that generates text sequentially, token by token.

For enterprises processing vast data streams, from IoT networks to financial markets, this limitation makes generating long-form analysis both slow and economically challenging. However, a new research paper from Tencent AI and Tsinghua University proposes an alternative.

A new approach to AI efficiency

The research introduces Continuous Autoregressive Language Models (CALM). This method re-engineers the generation process to predict a continuous vector rather than a discrete token.

A high-fidelity autoencoder “compress[es] a chunk of K tokens into a single continuous vector,” which holds a much higher semantic bandwidth.

Instead of processing something like “the”, “cat”, “sat” in three steps, the model compresses them into one. This design directly “reduces the number of generative steps,” attacking the computational load.
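
The step reduction described above can be sketched in a few lines. This is only an illustration of the counting argument, not the paper's autoencoder: the helper names `chunk_tokens` and `generative_steps` are hypothetical, and plain Python lists stand in for the learned continuous vectors.

```python
import math


def chunk_tokens(tokens, chunk_size):
    """Group a token sequence into consecutive chunks of up to chunk_size tokens."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def generative_steps(num_tokens, chunk_size):
    """Autoregressive steps needed when each step emits one chunk of chunk_size tokens."""
    if num_tokens < 0 or chunk_size < 1:
        raise ValueError("num_tokens must be >= 0 and chunk_size >= 1")
    return math.ceil(num_tokens / chunk_size)


# In CALM each chunk would be encoded as one continuous vector; here it stays a list.
print(chunk_tokens(["the", "cat", "sat"], 3))  # [['the', 'cat', 'sat']]
print(generative_steps(1000, 1))  # 1000 steps, one token per step
print(generative_steps(1000, 4))  # 250 steps, four tokens per step
```

Fewer steps does not translate one-to-one into lower cost, since each step may do more work; the FLOP figures reported in the paper are the relevant measure.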

The experimental results demonstrate a better performance-compute trade-off. A CALM model grouping four tokens per step delivered performance “comparable to strong discrete baselines, but at a significantly lower computational cost”.

One CALM model, for instance, required 44 percent fewer training FLOPs and 34 percent fewer inference FLOPs than a baseline Transformer of similar capability. This points to a saving on both the initial capital expense of training and the recurring operational expense of inference.
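
To make those percentages concrete, the arithmetic below applies them to a made-up budget. The baseline figures are hypothetical, expressed in arbitrary compute units, and do not come from the paper; FLOPs are also only a proxy for actual spend.

```python
def remaining_compute(baseline_units, percent_saved):
    """Compute left after a percentage reduction, in the same units as the baseline."""
    if not 0 <= percent_saved <= 100:
        raise ValueError("percent_saved must be between 0 and 100")
    return baseline_units * (100 - percent_saved) // 100


# Hypothetical baseline budgets in arbitrary compute units (assumed, not from the paper).
BASELINE_TRAINING_UNITS = 1_000_000
BASELINE_INFERENCE_UNITS_PER_MONTH = 50_000

calm_training = remaining_compute(BASELINE_TRAINING_UNITS, 44)
calm_inference = remaining_compute(BASELINE_INFERENCE_UNITS_PER_MONTH, 34)

print(calm_training)   # 560000
print(calm_inference)  # 33000
```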

Rebuilding the toolkit for the continuous domain

Moving from a finite, discrete vocabulary to an infinite, continuous vector space breaks the standard LLM toolkit. The researchers had to develop a “comprehensive likelihood-free framework” to make the new model viable.

For training, the model cannot use a standard softmax layer or maximum likelihood estimation. To solve this, the team used a “likelihood-free” objective with an Energy Transformer, which rewards the model for accurate predictions without computing explicit probabilities.
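
One well-known likelihood-free objective is the energy score, a strictly proper scoring rule that needs only samples from the model. The sketch below shows a Monte Carlo estimate of it; whether this matches the paper's exact loss is an assumption, and the function names and Gaussian toy data are illustrative only.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def energy_score_loss(samples, target):
    """Sample-based energy score, negated so that lower is better.

    loss = 2 * mean_i ||s_i - target|| - mean_{i != j} ||s_i - s_j||
    The first term rewards samples near the target; the second keeps the
    model from collapsing onto a single point. No probabilities are computed.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    fidelity = sum(euclidean(s, target) for s in samples) / n
    spread = sum(
        euclidean(samples[i], samples[j])
        for i in range(n)
        for j in range(n)
        if i != j
    ) / (n * (n - 1))
    return 2 * fidelity - spread


def draw_gaussian(mean, dim, count, rng):
    """Toy stand-in for a generative head: unit-variance Gaussian samples."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
target = [0.0] * 8
loss_centred = energy_score_loss(draw_gaussian(0.0, 8, 64, rng), target)
loss_shifted = energy_score_loss(draw_gaussian(3.0, 8, 64, rng), target)
print(loss_centred < loss_shifted)  # samples centred on the target score better
```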

This new training method also required a new evaluation metric. Standard metrics like perplexity are inapplicable as they rely on the same likelihoods the model no longer computes.

The team proposed BrierLM, a novel metric based on the Brier score that can be estimated purely from model samples. Validation confirmed BrierLM as a reliable alternative, showing a “Spearman’s rank correlation of -0.991” with traditional loss metrics.
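
The reason a Brier-style score can be estimated from samples alone follows from a simple identity. For a predicted distribution P and an observed outcome y, the Brier score is sum_k (P(k) - 1[k = y])^2 = sum_k P(k)^2 - 2 P(y) + 1. With two independent samples x1 and x2 from P, the indicator 1[x1 = x2] has expectation sum_k P(k)^2 and 1[x1 = y] has expectation P(y). The sketch below checks this numerically on a toy distribution; how BrierLM aggregates such terms across a corpus is not covered here, and the function names are hypothetical.

```python
import random


def brier_score(probs, outcome):
    """Exact Brier score of a categorical prediction against one observed outcome."""
    return sum((p - (1.0 if k == outcome else 0.0)) ** 2 for k, p in enumerate(probs))


def sampled_brier_term(x1, x2, outcome):
    """Unbiased estimate of 1 - Brier from two independent model samples.

    E[1[x1 == y] + 1[x2 == y] - 1[x1 == x2]] = 2 * P(y) - sum_k P(k)^2
    """
    return int(x1 == outcome) + int(x2 == outcome) - int(x1 == x2)


rng = random.Random(0)
probs = [0.6, 0.3, 0.1]  # toy distribution, used only to draw samples and to check
outcome = 0
exact = 1.0 - brier_score(probs, outcome)

trials = 200_000
total = 0
for _ in range(trials):
    x1, x2 = rng.choices(range(len(probs)), weights=probs, k=2)
    total += sampled_brier_term(x1, x2, outcome)
estimate = total / trials

print(abs(estimate - exact) < 0.02)  # the sample-only estimate tracks the exact value
```

The estimator never reads `probs` directly; it only compares samples, which is what makes the approach usable for a model that exposes no likelihoods.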

Finally, the framework restores controlled generation, a key feature for enterprise use. Standard temperature sampling is impossible without a probability distribution. The paper introduces a new “likelihood-free sampling algorithm,” including a practical batch approximation method, to manage the trade-off between output accuracy and diversity.
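
One way to lower the temperature with nothing but a sampler is rejection: draw n independent samples and keep the result only when all n agree. The chance that all n equal x is P(x)^n, so accepted outputs follow a distribution proportional to P(x)^n, which corresponds to temperature 1/n. The sketch below illustrates that idea on a toy categorical sampler. It is an assumption that this resembles the paper's algorithm, it assumes outputs can be compared for equality (for example after decoding to tokens), and the names are hypothetical.

```python
import random
from collections import Counter


def sample_low_temperature(sampler, n, rng, max_tries=100_000):
    """Return a draw distributed proportionally to P(x)**n using only the sampler."""
    for _ in range(max_tries):
        draws = [sampler(rng) for _ in range(n)]
        if all(d == draws[0] for d in draws):
            return draws[0]  # accept only when all n draws agree
    raise RuntimeError("no agreeing draws within max_tries")


# Toy black-box sampler; the algorithm above never looks at these weights.
TOY_PROBS = {"a": 0.6, "b": 0.3, "c": 0.1}


def base_sampler(rng):
    return rng.choices(list(TOY_PROBS), weights=list(TOY_PROBS.values()), k=1)[0]


rng = random.Random(0)
runs = 20_000
counts = Counter(sample_low_temperature(base_sampler, 2, rng) for _ in range(runs))
freq_a = counts["a"] / runs

# At temperature 1/2 the target is P(a)^2 / sum_k P(k)^2.
expected_a = 0.6 ** 2 / sum(p ** 2 for p in TOY_PROBS.values())
print(abs(freq_a - expected_a) < 0.02)  # "a" becomes more dominant than its base 0.6
```

Plain rejection wastes more draws as n grows or as the distribution flattens, which is one plausible motivation for the batch approximation the paper mentions.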

Reducing enterprise AI costs

This research offers a glimpse into a future where generative AI is not defined purely by ever-larger parameter counts, but by architectural efficiency.

The current path of scaling models is hitting a wall of diminishing returns and escalating costs. The CALM framework establishes a “new design axis for LLM scaling: increasing the semantic bandwidth of each generative step”.

While this is a research framework and not an off-the-shelf product, it points to a powerful and scalable pathway towards ultra-efficient language models. When evaluating vendor roadmaps, tech leaders should look beyond model size and begin asking about architectural efficiency.

The ability to reduce FLOPs per generated token will become a defining competitive advantage, enabling AI to be deployed more economically and sustainably across the enterprise, from the data centre to data-heavy edge applications.


AI News is powered by TechForge Media.


