
Diffbot’s AI model doesn’t guess — it knows, thanks to a trillion-fact knowledge graph

Last updated: January 12, 2025 2:05 pm
Published January 12, 2025

Diffbot, a small Silicon Valley company best known for maintaining one of the world’s largest indexes of web knowledge, announced today the release of a new AI model that promises to address one of the biggest challenges in the field: factual accuracy.

The new model, a fine-tuned version of Meta’s Llama 3.3, is the first open-source implementation of a system known as graph retrieval-augmented generation, or GraphRAG.

Unlike conventional AI models, which rely solely on vast amounts of preloaded training data, Diffbot’s LLM draws on real-time information from the company’s Knowledge Graph, a constantly updated database containing more than a trillion interconnected facts.

“We have a thesis: that eventually general-purpose reasoning will get distilled down into about 1 billion parameters,” said Mike Tung, Diffbot’s founder and CEO, in an interview with VentureBeat. “You don’t actually want the knowledge in the model. You want the model to be good at just using tools so that it can query knowledge externally.”

How it works

Diffbot’s Knowledge Graph is a sprawling, automated database that has been crawling the public web since 2016. It categorizes web pages into entities such as people, companies, products and articles, extracting structured information using a combination of computer vision and natural language processing.

Every four to five days, the Knowledge Graph is refreshed with millions of new facts, ensuring it stays up to date. Diffbot’s AI model leverages this resource by querying the graph in real time to retrieve information, rather than relying on static knowledge encoded in its training data.

For example, when asked about a recent news event, the model can search the web for the latest updates, extract relevant facts, and cite the original sources. This process is designed to make the system more accurate and transparent than traditional LLMs.
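The retrieve-then-cite pattern described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Diffbot’s implementation: the `Fact` type, the in-memory `GRAPH`, the example.com URLs and the `retrieve` and `build_prompt` helpers are all hypothetical stand-ins for a real knowledge graph and its query API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    source: str  # URL the fact was extracted from (provenance)


# Toy stand-in for a knowledge graph; hypothetical data, not Diffbot's.
GRAPH = [
    Fact("Diffbot", "founder", "Mike Tung", "https://example.com/about"),
    Fact("Diffbot", "industry", "AI", "https://example.com/profile"),
    Fact("Mike Tung", "role", "CEO", "https://example.com/team"),
]


def retrieve(entity: str, hops: int = 1) -> list[Fact]:
    """Collect facts about `entity`, following edges up to `hops` extra steps."""
    frontier, seen, found = {entity}, set(), []
    for _ in range(hops + 1):
        next_frontier = set()
        for fact in GRAPH:
            if fact.subject in frontier and fact not in seen:
                seen.add(fact)
                found.append(fact)
                next_frontier.add(fact.obj)  # follow the edge to the object
        frontier = next_frontier
    return found


def build_prompt(question: str, facts: list[Fact]) -> str:
    """Ground the question in retrieved facts, numbering each for citation."""
    lines = [
        f"[{i}] {f.subject} {f.predicate} {f.obj} (source: {f.source})"
        for i, f in enumerate(facts, start=1)
    ]
    return (
        "Answer using only the facts below and cite them by number.\n"
        + "\n".join(lines)
        + f"\nQuestion: {question}"
    )


facts = retrieve("Diffbot", hops=1)
prompt = build_prompt("Who leads Diffbot?", facts)
print(prompt)  # this grounded prompt would then be passed to the language model
```

Because every fact carries its source URL into the prompt, the model can cite where each claim came from, and updating the graph changes the answer without retraining anything.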

“Imagine asking an AI about the weather,” Tung said. “Instead of generating an answer based on outdated training data, our model queries a live weather service and provides a response grounded in real-time information.”
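Tung’s weather example is an instance of tool use: the model emits a structured request, the host program runs it, and the answer is built from the result. A rough sketch follows, assuming a hypothetical `get_weather` stub with canned data in place of a live service and a hand-written tool call in place of real model output.

```python
from typing import Callable


def get_weather(city: str) -> dict:
    """Stub for a live weather service; returns canned data for illustration."""
    return {"city": city, "temp_c": 18, "conditions": "cloudy"}


# Registry mapping tool names to callables (names are hypothetical).
TOOLS: dict[str, Callable[..., dict]] = {"get_weather": get_weather}


def run_tool_call(call: dict) -> dict:
    """Execute a tool call of the form {"name": ..., "arguments": {...}}."""
    tool = TOOLS.get(call["name"])
    if tool is None:
        raise KeyError(f"unknown tool: {call['name']}")
    return tool(**call["arguments"])


def answer(tool_result: dict) -> str:
    """Format the reply from the tool result instead of from model memory."""
    return (
        f"It is {tool_result['temp_c']} degrees Celsius and "
        f"{tool_result['conditions']} in {tool_result['city']}."
    )


# In a real system the model would emit this structured call itself.
call = {"name": "get_weather", "arguments": {"city": "Menlo Park"}}
reply = answer(run_tool_call(call))
print(reply)
```

The model’s job in this design is to pick the right tool and arguments; the facts themselves live outside the weights.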

How Diffbot’s Knowledge Graph beats traditional AI at finding facts

In benchmark tests, Diffbot’s approach appears to be paying off. The company reports its model achieves an 81% accuracy score on FreshQA, a Google-created benchmark for testing real-time factual knowledge, surpassing both ChatGPT and Gemini. It also scored 70.36% on MMLU-Pro, a more difficult version of a standard test of academic knowledge.

Perhaps most significantly, Diffbot is making its model fully open-source, allowing companies to run it on their own hardware and customize it for their needs. This addresses growing concerns about data privacy and vendor lock-in with major AI providers.

“You can run it locally on your machine,” Tung noted. “There’s no way you can run Google Gemini without sending your data over to Google and shipping it outside of your premises.”

Open-source AI could transform how enterprises handle sensitive data

The release comes at a pivotal moment in AI development. Recent months have seen mounting criticism of large language models’ tendency to “hallucinate” or generate false information, even as companies continue to scale up model sizes. Diffbot’s approach suggests an alternative path forward, one focused on grounding AI systems in verifiable facts rather than attempting to encode all human knowledge in neural networks.

“Not everyone’s going after just bigger and bigger models,” Tung said. “You can have a model that has more capability than a big model with kind of a non-intuitive approach like ours.”

Industry experts note that Diffbot’s Knowledge Graph-based approach could be particularly valuable for enterprise applications where accuracy and auditability are crucial. The company already provides data services to major companies including Cisco, DuckDuckGo and Snapchat.

The model is available immediately through an open-source release on GitHub and can be tested through a public demo at diffy.chat. For organizations wanting to deploy it internally, Diffbot says the smaller 8-billion-parameter model can run on a single Nvidia A100 GPU, while the full 70-billion-parameter version requires two H100 GPUs.
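Those hardware figures line up with a back-of-the-envelope estimate of weight memory. The sketch below assumes 16-bit weights (2 bytes per parameter) and 80 GB cards, neither of which the article states, and it ignores the KV cache and activations, so treat the results as a lower bound rather than a sizing guide.

```python
import math

BYTES_PER_PARAM_FP16 = 2  # assumption: weights stored in 16-bit precision
GB = 1e9                  # decimal gigabytes


def weight_memory_gb(params_billion: float,
                     bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Memory needed for the weights alone, in GB."""
    return params_billion * 1e9 * bytes_per_param / GB


def min_gpus(params_billion: float, gpu_memory_gb: float) -> int:
    """Fewest GPUs whose combined memory can hold the weights."""
    return math.ceil(weight_memory_gb(params_billion) / gpu_memory_gb)


small = weight_memory_gb(8)   # 8B parameters
full = weight_memory_gb(70)   # 70B parameters

# Assumption: 80 GB of memory per GPU.
print(f"8B:  {small:.0f} GB of weights, at least {min_gpus(8, 80)} GPU")
print(f"70B: {full:.0f} GB of weights, at least {min_gpus(70, 80)} GPUs")
```

Under these assumptions the 8B weights take about 16 GB and fit on one card, while the 70B weights take about 140 GB and need two, which matches the counts Diffbot gives.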

Looking ahead, Tung believes the future of AI lies not in ever-larger models, but in better ways of organizing and accessing human knowledge: “Facts get stale. A lot of these facts will be moved out into explicit places where you can actually modify the knowledge and where you can have data provenance.”

As the AI industry grapples with challenges around factual accuracy and transparency, Diffbot’s release offers a compelling alternative to the dominant bigger-is-better paradigm. Whether it succeeds in shifting the field’s direction remains to be seen, but it has certainly demonstrated that when it comes to AI, size isn’t everything.

