Over half of us now use AI to search the web, but the stubbornly low factual accuracy of popular tools creates new business risks.
While generative AI (GenAI) offers undeniable efficiency gains, a new investigation highlights a disparity between user trust and technical accuracy that poses particular risks to corporate compliance, legal standing, and financial planning.
For the C-suite, the adoption of these tools represents a classic ‘shadow IT’ challenge. According to a survey of 4,189 UK adults conducted in September 2025, around a third of users believe AI is already more important to them than standard web searching. If employees trust these tools for personal queries, they are almost certainly using them for business research.
The investigation, conducted by Which?, suggests that unverified reliance on these platforms could be costly. Around half of AI users report trusting the information they receive to a ‘reasonable’ or ‘great’ extent. Yet, looking at the granularity of the responses provided by AI models, that trust is often misplaced.
The accuracy gap when using AI to search the web
The study examined six leading tools – ChatGPT, Google Gemini (both standard and ‘AI Overviews’), Microsoft Copilot, Meta AI, and Perplexity – across 40 common questions spanning finance, law, and consumer rights.
Perplexity achieved the highest total score at 71%, closely followed by Google Gemini AI Overviews at 70%. In contrast, Meta scored the lowest at 55%. ChatGPT, despite its widespread adoption, received a total score of 64%, making it the second-lowest performer among the tools tested. This disconnect between market dominance and reliable output underlines the danger of assuming popularity equals performance in the GenAI space.
However, the investigation revealed that all of these AI tools frequently misread information or offered incomplete advice that could pose serious business risks. For financial officers and legal departments, the nature of these errors is particularly concerning.
When asked how to invest a £25,000 annual ISA allowance, both ChatGPT and Copilot failed to spot a deliberate error in the prompt regarding the statutory limit (the actual annual allowance is £20,000). Instead of correcting the figure, they offered advice that potentially risked breaching HMRC rules.
While Gemini, Meta, and Perplexity successfully identified the error, the inconsistency across platforms necessitates a rigorous “human-in-the-loop” protocol for any business process involving AI to ensure accuracy.
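To make that protocol concrete, here is a minimal sketch of a pre-flight check that flags suspect figures for human review before a query ever reaches a model. It is illustrative only: the limits table, the regex, and the workflow are assumptions made for this article, not part of the Which? methodology, and a real deployment would verify limits against current HMRC guidance.

```python
# Minimal sketch: flag suspect statutory figures for human review before a
# prompt is sent to an AI tool. The limits table is illustrative only.
import re

# Hypothetical policy table; a real system would track current HMRC guidance.
STATUTORY_LIMITS = {
    "annual ISA allowance": 20_000,  # GBP
}

def preflight_check(prompt: str) -> list[str]:
    """Return warnings for any £ figure exceeding a known statutory limit."""
    amounts = [int(m.replace(",", "")) for m in re.findall(r"£([\d,]+)", prompt)]
    warnings = []
    for name, limit in STATUTORY_LIMITS.items():
        for amount in amounts:
            if amount > limit:
                warnings.append(
                    f"£{amount:,} exceeds the {name} of £{limit:,}; "
                    "route to a human reviewer before acting on AI advice."
                )
    return warnings

issues = preflight_check("How should I invest my £25,000 annual ISA allowance?")
for issue in issues:
    print(issue)  # non-empty here: the prompt overstates the £20,000 limit
```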
For legal teams, the tendency of AI to generalise regional legislation when using it for web search presents a distinct business risk. The testing found it common for tools to overlook the fact that legal statutes often differ between UK nations, such as Scotland versus England and Wales.
Moreover, the investigation highlighted an ethical gap in how these models handle high-stakes queries. On legal and financial matters, the tools did not consistently advise users to consult a registered professional. For example, when queried about a dispute with a builder, Gemini advised withholding payment – a tactic that experts noted could place a user in breach of contract and weaken their legal position.
This “overconfident advice” creates operational hazards. If an employee relies on an AI for preliminary compliance checks or contract review without verifying the jurisdiction or legal nuance, the organisation could face regulatory exposure.
Source transparency issues
A primary concern for enterprise data governance is the lineage of information. The investigation found that although AI search tools bear a high responsibility to be transparent, they frequently cited sources that were vague, non-existent, or of dubious accuracy, such as old forum threads. This opacity can lead to financial inefficiency.
In one test regarding tax codes, ChatGPT and Perplexity provided links to premium tax-refund companies rather than directing the user to the free official HMRC tool. These third-party services are often characterised by high fees.
In a business procurement context, such algorithmic bias from AI search tools could lead to unnecessary vendor spend or engagement with service providers that pose a high risk because they fail to meet corporate due diligence standards.
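One lightweight control against this sourcing problem is an approved-domain allow-list that screens AI-cited links before they enter a workflow. The sketch below is a hypothetical illustration; the domain list and URLs are examples chosen for this article, not findings from the investigation.

```python
# Minimal sketch: screen AI-cited links against an approved-domain allow-list
# before they enter a procurement workflow. Domains and URLs are illustrative.
from urllib.parse import urlparse

APPROVED_DOMAINS = {"gov.uk", "legislation.gov.uk"}  # hypothetical policy list

def is_approved(url: str) -> bool:
    """True if the URL's host is an approved domain or one of its subdomains."""
    host = urlparse(url).netloc.lower()
    return any(host == d or host.endswith("." + d) for d in APPROVED_DOMAINS)

cited = [
    "https://www.gov.uk/check-income-tax-current-year",  # free official tool
    "https://example-tax-refunds.co.uk/claim",           # fee-charging third party
]
for url in cited:
    verdict = "accept" if is_approved(url) else "flag for manual review"
    print(f"{verdict}: {url}")
```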
The major technology providers acknowledge these limitations, placing the burden of verification firmly on the user – and, by extension, the enterprise.
A Microsoft spokesperson emphasised that its tool acts as a synthesiser rather than an authoritative source. “Copilot answers questions by distilling information from multiple web sources into a single response,” the company noted, adding that it “encourage[s] people to verify the accuracy of content.”
OpenAI, responding to the findings, said: “Improving accuracy is something the whole industry’s working on. We’re making good progress and our latest default model, GPT-5, is the smartest and most accurate we’ve built.”
Mitigating AI business risk through policy and workflow
For business leaders, the path forward is not to ban AI tools – which often increases risk by driving usage further into the shadows – but to implement robust governance frameworks that ensure the accuracy of their output when they are used for web search:
- Enforce specificity in prompts: The investigation notes that AI is still learning to interpret prompts. Corporate training should emphasise that vague queries yield bad data. If an employee is researching legislation, they must specify the jurisdiction (e.g., “legal rules for England and Wales”) rather than assuming the tool will infer the context.
- Mandate source verification: Trusting a single output is operationally unsound. Employees must demand to see sources and check them manually. The study suggests that for high-risk topics, users should verify findings across multiple AI tools or “double source” the information (a sketch of this follows the list). Tools like Google’s Gemini AI Overviews, which let users review the provided web links directly, scored slightly better because they facilitated this verification process.
- Operationalise the “second opinion”: At this stage of technical maturity, GenAI outputs should be treated as just one opinion among many. For complex issues involving finance, law, or medical data, AI lacks the ability to fully comprehend nuance. Business policy must dictate that qualified human advice remains the final arbiter for decisions with real-world consequences.
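As a hypothetical illustration of the “double source” control, the sketch below puts the same question to several tools and escalates to a human whenever the answers disagree. The query_tool function is a stand-in returning canned answers; it does not reflect any vendor’s real API.

```python
# Minimal sketch: "double source" a high-risk question across several AI tools
# and escalate to a human whenever the answers disagree. query_tool() is a
# stand-in returning canned answers; it does not reflect any vendor's real API.
from collections import Counter

def query_tool(tool: str, question: str) -> str:
    """Placeholder for a real API call to the named tool."""
    canned = {"gemini": "£20,000", "perplexity": "£20,000", "copilot": "£25,000"}
    return canned[tool]

def double_source(question: str, tools: list[str]) -> str:
    answers = Counter(query_tool(t, question) for t in tools)
    top, count = answers.most_common(1)[0]
    if count < len(tools):  # any disagreement blocks automatic acceptance
        return f"DISAGREEMENT {dict(answers)}; refer to a qualified professional"
    return top

print(double_source("What is the annual ISA allowance?",
                    ["gemini", "perplexity", "copilot"]))
```

The design point mirrors the article’s argument: agreement across tools is necessary but not sufficient, so even a unanimous answer should still be spot-checked against primary sources for regulated decisions.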
AI tools are evolving and their web search accuracy is gradually improving, but as the investigation concludes, relying on them too heavily right now could prove costly. For the enterprise, the difference between an efficiency gain from AI and a compliance failure lies in the verification process.
See also: How Levi Strauss is using AI for its DTC-first business model

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events, including the Cyber Security Expo. Click here for more information.
AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.
