Amazon has introduced Nova Act, a sophisticated AI model engineered for smarter agents that can execute tasks within web browsers.
While large language models popularised the concept of “agents” as tools that answer queries or retrieve information via methods such as Retrieval-Augmented Generation (RAG), Amazon envisions something more robust. The company defines agents not just as responders but as entities capable of performing tangible, multi-step tasks in diverse digital and physical environments.
“Our dream is for agents to perform wide-ranging, complex, multi-step tasks like organising a wedding or handling complex IT tasks to increase business productivity,” said Amazon.
Current market offerings often fall short: many agents require continuous human supervision, and their functionality depends on comprehensive API integration, something that is not feasible for all tasks. Nova Act is Amazon’s answer to these limitations.
Alongside the model, Amazon is releasing a research preview of the Amazon Nova Act SDK. Using the SDK, developers can create agents capable of automating web tasks like submitting out-of-office notifications, scheduling calendar holds, or enabling automatic email replies.
The SDK aims to break down complex workflows into reliable “atomic commands” such as searching, checking out, or interacting with specific interface elements like dropdowns or popups. Detailed instructions can be added to refine these commands, allowing developers to, for instance, instruct an agent to bypass an insurance upsell during checkout.
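The idea of splitting a workflow into atomic commands can be sketched in a few lines of Python. This is an illustration only, not the Nova Act SDK: `RecordingAgent`, its `act` method, and `buy_item` are hypothetical names, and the agent here merely records the instructions it receives instead of driving a browser.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingAgent:
    """Hypothetical stand-in for a browser agent; it only records commands."""

    history: list[str] = field(default_factory=list)

    def act(self, instruction: str) -> None:
        # A real agent would drive a browser here; this sketch just logs.
        self.history.append(instruction)


def buy_item(agent: RecordingAgent, item: str) -> list[str]:
    """Run a purchase workflow as a sequence of small, atomic commands."""
    steps = [
        f"search for {item}",
        "select the first result",
        "add the item to the cart",
        # Extra detail refines a command, e.g. steering around an upsell.
        "proceed to checkout; if offered insurance, decline it",
    ]
    for step in steps:
        agent.act(step)
    return agent.history
```

Keeping each step small means a failure can be traced to one command and retried or reworded without rewriting the whole workflow.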
To further enhance accuracy, the SDK supports browser manipulation via Playwright, API calls, Python integrations, and parallel threading to overcome web page load delays.
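To show why parallel threading helps with slow pages, here is a minimal sketch using only Python’s standard library. It does not use the Nova Act SDK: `run_task` is a hypothetical placeholder whose `time.sleep` call stands in for a page load, and the delay value is an assumption chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_task(task: str, load_delay: float = 0.2) -> str:
    """Hypothetical agent task; the sleep stands in for a slow page load."""
    time.sleep(load_delay)
    return f"done: {task}"


def run_in_parallel(tasks: list[str]) -> list[str]:
    """Run independent tasks on separate threads so page-load waits overlap."""
    with ThreadPoolExecutor(max_workers=len(tasks) or 1) as pool:
        # map() returns results in the same order as the input tasks.
        return list(pool.map(run_task, tasks))
```

Because the threads spend most of their time waiting, three tasks finish in roughly the time of one instead of the sum of all three.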
Nova Act: Exceptional performance on benchmarks
Unlike other generative models that show middling accuracy on complex tasks, Nova Act prioritises reliability. Amazon highlights its model’s scores of over 90% on internal evaluations for specific capabilities that typically challenge competitors.
Nova Act achieved 0.939 on the ScreenSpot Web Text benchmark, which measures how well a model follows natural language instructions for text-based interactions, such as adjusting font sizes. Competing models such as Claude 3.7 Sonnet (0.900) and OpenAI’s CUA (0.883) trail by 0.039 and 0.056 respectively.
Similarly, Nova Act scored 0.879 on the ScreenSpot Web Icon benchmark, which tests interactions with visual elements like rating stars or icons. On the GroundUI Web test, designed to assess an AI’s proficiency in navigating various user interface elements, Nova Act slightly trailed competitors; Amazon sees this as an area ripe for improvement as the model evolves.
Amazon stresses its focus on delivering practical reliability. Once an agent built using Nova Act functions as expected, developers can deploy it headlessly, integrate it as an API, or even schedule it to run tasks asynchronously. In one demonstrated use case, an agent automatically orders a salad for delivery every Tuesday evening without requiring ongoing user intervention.
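A recurring job like the Tuesday salad order needs some way to work out when it should next run. The sketch below shows one way to compute that with Python’s standard library; it is not part of the Nova Act SDK, the function name is hypothetical, and treating 18:00 as “evening” is an assumption.

```python
from datetime import datetime, timedelta

TUESDAY = 1  # datetime.weekday() counts Monday as 0


def next_tuesday_evening(now: datetime, hour: int = 18) -> datetime:
    """Return the next Tuesday at the given hour, strictly after `now`."""
    days_ahead = (TUESDAY - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # Today is Tuesday but the slot has passed; use next week's.
        candidate += timedelta(days=7)
    return candidate
```

A scheduler such as cron or a task queue could then trigger the headless agent at the returned time.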
Amazon sets out its vision for scalable and smart AI agents
One of Nova Act’s standout features is its ability to transfer its user interface understanding to new environments with minimal additional training. Amazon shared an instance where Nova Act performed admirably in browser-based games, even though its training had not included video game experiences. This adaptability positions Nova Act as a versatile agent for diverse applications.
This capability is already being leveraged in Amazon’s own ecosystem. Within Alexa+, Nova Act enables self-directed web navigation to complete tasks for users, even when API access isn’t comprehensive enough. This represents a step towards smarter AI assistants that can function independently, harnessing their skills in more dynamic ways.
Amazon is clear that Nova Act represents the first stage in a broader mission to craft intelligent, reliable AI agents capable of handling increasingly complex, multi-step tasks.
Expanding beyond simple instructions, Amazon’s focus is on training agents through reinforcement learning across varied, real-world scenarios rather than overly simplistic demonstrations. This foundational model serves as a checkpoint in a long-term training curriculum for Nova models, indicating the company’s ambition to reshape the AI agent landscape.
“The most valuable use cases for agents have yet to be built,” Amazon noted. “The best developers and designers will discover them. This research preview of our Nova Act SDK enables us to iterate alongside these builders through rapid prototyping and iterative feedback.”
Nova Act is a step towards making AI agents truly useful for complex, digital tasks. From rethinking benchmarks to emphasising reliability, its design philosophy is centred around empowering developers to move beyond what’s possible with current-generation tools.