Say hello to Action Agent

Enterprise-ready autonomous AI that works on your behalf

Waseem AlShikh | July 29, 2025

Over the last decade, we’ve built AI systems that excel at specialized tasks. Models can write poetry, diagnose diseases, generate photorealistic images, and produce working code, all from natural language descriptions. But for all this progress, the primary mode of interaction with AI has remained conversational. We tell the AI what we want, and it tells us how to do it, but humans still bear the burden of executing the work.

Today, that changes. The era of asking AI what to do is over, and the era of AI actually doing it has begun.

I’m thrilled to announce the release of Action Agent, available today in open beta for all WRITER customers. Action Agent is a general-purpose autonomous agent that represents a fundamental leap in how we interact with technology. It doesn’t just provide instructions; it executes them. It can understand complex, multi-step requests, create a plan, and then autonomously use the same tools we do – browsers, terminals, file systems, code interpreters – to get the job done.

This isn’t just another chatbot with a few new tricks. This is the first truly autonomous agent purpose-built for the enterprise, balancing power and agency with the security, governance, and control that enterprises demand. Launching in open beta lets us deliver value to customers faster while working closely with them to shape our roadmap.

How it works

To build an agent capable of true autonomy, we had to rethink the entire operational paradigm. The result is an architecture built on three core principles: a secure, sandboxed environment; a dynamic, self-correcting planning and execution framework; and a robust set of native tools and capabilities.

The execution environment: A secure sandbox for every session

Every time a user initiates a session with Action Agent, we spin up a dedicated, containerized Linux environment. This is not a simulation; it’s a real, fully-functional operating system with its own file system, terminal, and sandboxed internet access. This approach provides three key advantages:

Security by isolation: All operations are confined to the temporary environment, which is completely separate from your local machine or network to ensure your data and systems are protected.
Handles complex, multi-step tasks: The environment remembers the agent’s progress, allowing it to manage multiple tasks concurrently. Sessions continue running when a user switches between them or closes the app.
Real tools for real work: By working in a real computing environment, Action Agent has access to a vast library of tools, as well as the ability to install new software and run code in any language.
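To make the per-session isolation concrete, here is a minimal sketch of how one container per session could be described. It assumes Docker as the container runtime, which this post does not specify, and the helper names, image, resource limits, and the `sandbox-egress` network are all hypothetical. The function only builds the command; it does not run anything.

```python
import uuid
from typing import List


def new_session_id() -> str:
    """Return a short random identifier for a session (hypothetical scheme)."""
    return uuid.uuid4().hex[:12]


def sandbox_command(session_id: str, image: str = "ubuntu:24.04") -> List[str]:
    """Build, but do not execute, a `docker run` command for one isolated session.

    Assumption: Docker is the runtime. The limits and network name below are
    illustrative values, not WRITER's actual configuration.
    """
    return [
        "docker", "run",
        "--detach",                      # keep running after the client disconnects
        "--rm",                          # remove the container when it stops
        "--name", f"session-{session_id}",
        "--cap-drop", "ALL",             # drop Linux capabilities the agent does not need
        "--memory", "2g",                # illustrative memory cap
        "--cpus", "2",                   # illustrative CPU cap
        "--network", "sandbox-egress",   # hypothetical pre-created, filtered network
        image,
        "sleep", "infinity",             # keep the container alive for later commands
    ]


if __name__ == "__main__":
    print(" ".join(sandbox_command(new_session_id())))
```

Giving each session its own container name is what keeps concurrent sessions separate in this sketch, and removing the container on exit mirrors the temporary nature of the environment described above.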

The planning framework and execution loop

At the heart of Action Agent is Palmyra X5, WRITER’s latest adaptive reasoning LLM, with a new ‘deep thinking’ mode enabled. This leap in reasoning capability powers a rigorous planning framework and execution loop that enables dynamic problem-solving and the delivery of complete artifacts.

Action Agent first breaks down a user’s request into a series of concrete, actionable steps. The plan is stored as todo.md, a simple, human-readable Markdown file that serves as both a roadmap and a progress tracker.

*Action Agent project plan stored as a todo.md file*
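As a rough illustration of the idea, the sketch below models a plan as a list of steps and renders it as a Markdown checklist. The exact layout of Action Agent's todo.md is not documented in this post, so the `Plan` and `Step` classes and the checkbox format are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    description: str
    done: bool = False


@dataclass
class Plan:
    """Hypothetical in-memory form of a todo.md plan."""
    title: str
    steps: List[Step] = field(default_factory=list)

    def to_markdown(self) -> str:
        # Render each step as a Markdown task-list item.
        lines = [f"# {self.title}", ""]
        for step in self.steps:
            box = "x" if step.done else " "
            lines.append(f"- [{box}] {step.description}")
        return "\n".join(lines) + "\n"

    def mark_done(self, index: int) -> None:
        self.steps[index].done = True

    def next_pending(self) -> Optional[Step]:
        # The first unfinished step is the one to work on next.
        return next((s for s in self.steps if not s.done), None)


if __name__ == "__main__":
    plan = Plan("Quarterly report", [Step("Collect data"), Step("Build chart")])
    plan.mark_done(0)
    print(plan.to_markdown())
```

Because the plan is plain text, a person can read or edit it at any point, which is the property the roadmap-and-tracker role depends on.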

Then, Action Agent begins methodically working through the steps of its todo.md while communicating its actions transparently, applying a rigorous execution loop:

Acts: Starts each task by writing its own scripts and tool calls, then executes them
Observes: Uses a reflection mechanism to evaluate whether the task was completed correctly
Refines: Discovers workarounds when a step fails or is left incomplete

*Action Agent executing commands and communicating its actions*

The real power of this system lies in its ability to self-correct. If a command fails, an API returns an error, or a web page doesn’t load correctly, Action Agent doesn’t just give up. It analyzes the error, updates the todo.md file with a new approach, and tries again. This iterative, self-correcting loop allows it to navigate the complexities and unpredictabilities of the real world.
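The act, observe, refine cycle described above can be sketched in a few lines. This is a simplified illustration, not Action Agent's implementation: the `run_step` function and the example actions are hypothetical, and the alternative approaches are listed up front rather than generated by a model after each failure.

```python
from typing import Callable, List, Tuple

Action = Callable[[], str]


def run_step(attempts: List[Action], check: Callable[[str], bool]) -> Tuple[bool, List[str]]:
    """Try each approach in order until one produces an acceptable result."""
    log: List[str] = []
    for number, act in enumerate(attempts, start=1):
        try:
            result = act()                      # Act: run the current approach
        except Exception as error:
            log.append(f"attempt {number} failed: {error}")
            continue                            # Refine: fall through to the next approach
        if check(result):                       # Observe: did the output meet the goal?
            log.append(f"attempt {number} succeeded")
            return True, log
        log.append(f"attempt {number} rejected: {result!r}")
    return False, log


def fetch_from_api() -> str:
    # Stand-in for a tool call that fails.
    raise ConnectionError("API returned 503")


def read_cached_copy() -> str:
    # Stand-in for a workaround that succeeds.
    return "revenue,42"


if __name__ == "__main__":
    ok, log = run_step([fetch_from_api, read_cached_copy], check=lambda out: "," in out)
    print(ok, log)
```

In this example the first approach raises an error, the loop records it, and the second approach passes the check, so the step completes without the user stepping in.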

Once all tasks have been executed, Action Agent delivers final artifacts, including dashboards, files, images, websites, and more. Users can provide feedback, ask for modifications, or build on previous work, offering a collaborative and iterative experience.

Tools and native capabilities

Today, Action Agent has access to dozens of preconfigured tools for getting work done across multiple domains, including:

Full-spectrum web interaction: It can browse websites, fill out forms, click buttons, and extract information from complex, dynamic web pages.
Data analysis and visualization: It can process structured and unstructured data, perform complex calculations, and generate charts and graphs.
File and system operations: It can create, read, write, and delete a range of file types, as well as perform a wide range of system-level operations.
Code execution and software development: It can write, test, and debug code in multiple languages, and even deploy applications.

We will also be adding 600+ more tools across 80+ platforms, available soon via our implementation of Model Context Protocol (MCP).

The proof is in the benchmarks

Putting it all together – the sandboxed operating environment, the rigorous planning and execution loop, and the enterprise-grade security, control, and supervision – Action Agent outperforms other agents on industry-standard benchmarks. It scores 61% on the most difficult level of the General AI Assistants (GAIA) benchmark.

Action Agent also demonstrates robust performance across domains, with the highest overall score on the Computer Use Benchmark (CUB) leaderboard, which evaluates performance across six distinct industry verticals.

Enterprise-grade security, control, and supervision

As AI autonomy increases, so does operational risk. And in a world where synthetic labor is abundant, the core value shifts to judgment. We built Action Agent for the realities of enterprise scale, where small errors compound fast. Security, control, and supervision aren’t overhead — they’re what make scaling AI safe and reliable.

Enterprise agents demand multi-layered security, comprehensive data governance, and brand protection mechanisms that enforce business-specific rules beyond basic safety models. They require precision controls that generate verifiable results – not just plausible content – and administrative oversight that enables both real-time monitoring and after-the-fact auditing. Additionally, to meet enterprise standards, AI systems must be built from the ground up on governance frameworks with role-based permissions and systematic risk management, rather than being consumer tools adapted for business use.

The road ahead

With Action Agent, we’re setting the pace toward artificial general intelligence for the enterprise. Today is just the beginning, and there’s no slowing down. We’re launching in open beta because we want to build the future with our customers, and over the next few months, we’ll remain relentlessly focused on expanding Action Agent’s capabilities, defining new user patterns, and deepening integrations with third-party tools and systems. Even as roadmaps shift and technology leaps forward, our vision remains the same – to be the platform that automates complex work by connecting people with the data, models, tools, and systems they need to get work done.

We can’t wait to see what you’ll achieve with Action Agent – try it out today and let us know what you think.