Introducing intelligent actions with Palmyra X 004

Powerful new model sets the bar with action capabilities

Writer Team | October 9, 2024

AI is already starting to reshape enterprise workflows. In 2024, 96% of companies say they view generative AI as a key enabler, with 82% anticipating rapid growth in its adoption across various departments.

The transformative potential of this emerging technology lies in making strategic decisions on what work and how much work AI can reliably take on. And when you give LLMs the ability to do even more with the tools you already use, your company stands to benefit from even more efficiency, scalability, and revenue growth.

With the release of our new top-ranking LLM, Palmyra X 004, we’re taking another step toward more impactful AI. Palmyra X 004 boasts state-of-the-art reasoning through novel training techniques and a suite of powerful new features and capabilities, including:

  • The ability for the LLM to take action in external systems, like software, databases, and other Writer apps via tool calling
  • Automatic data integration with built-in retrieval augmented generation (RAG), including chain-of-thought and source transparency
  • Code generation and deployment
  • An expanded 128k context window
  • Structured output generation for simpler system integration (coming in a few weeks)

Palmyra X 004 joins Writer’s widely popular Palmyra family of enterprise-grade LLMs, which support multilingual capabilities in 30+ languages and multi-modal inputs across images, audio, and video. The new model has been benchmarked against models on Berkeley’s Tool Calling Leaderboard, and early results show it leads by a significant margin, outperforming models from providers including OpenAI, Anthropic, Meta, and Google (official listing coming soon). It also ranks among the top 10 models on Stanford HELM. Palmyra X 004 is also the first model of its size to be trained on synthetic data, at a fraction of the cost reported by major AI labs, showcasing our commitment to innovative and cost-effective ways to scale AI.

Palmyra X 004 is available today in Ask Writer, our prebuilt chat app, and AI Studio, our suite of development tools.

Mobilizing AI apps that can take action in your enterprise tools

Imagine AI collapsing all the manual steps involved in the product development lifecycle — pulling feedback, analyzing it, prioritizing features, and creating tickets — without human intervention. Today, this process involves time-consuming, manual extraction of feedback, bug reports, and cross-functional decision-making, slowing down the development lifecycle.

With Palmyra X 004’s action capabilities, custom AI apps built in AI Studio can use a mechanism called tool calling, which lets the model interact with tools and services that an AI engineer makes available to it. Tool calling unlocks more sophisticated workflows by arming an LLM with tools beyond its built-in knowledge. You can read more about how tool calling works in our engineering blog.
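To make the mechanism concrete, here is a minimal sketch of the general tool calling pattern in Python. It is not the Writer SDK: the tool name `get_open_tickets`, the shape of the tool description, and the simulated model output are all hypothetical, chosen only to show how an engineer exposes a function and how the app runs the call the model requests.

```python
import json

# Hypothetical tool: a stub standing in for a real support-system lookup.
def get_open_tickets(product: str) -> list[dict]:
    return [{"id": 101, "product": product, "summary": "Export to CSV fails"}]

# Description the engineer makes available to the model
# (JSON Schema style; the exact field names are an assumption).
TOOLS = [{
    "name": "get_open_tickets",
    "description": "Return open support tickets for a product.",
    "parameters": {
        "type": "object",
        "properties": {"product": {"type": "string"}},
        "required": ["product"],
    },
}]

# Maps tool names the model may request to real Python callables.
REGISTRY = {"get_open_tickets": get_open_tickets}

def dispatch(tool_call: dict):
    """Run the function the model asked for, with the arguments it generated."""
    name = tool_call["name"]
    if name not in REGISTRY:
        raise KeyError(f"model requested unknown tool: {name}")
    args = json.loads(tool_call["arguments"])
    return REGISTRY[name](**args)

# Simulated model output: the model picks a tool and fills in its arguments.
model_tool_call = {"name": "get_open_tickets", "arguments": '{"product": "Dashboard"}'}
result = dispatch(model_tool_call)
print(result)
```

In a real app, the tool result would be sent back to the model so it can compose the final answer for the user.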

Palmyra X 004 combined with the Writer full-stack platform approach enables developers to more easily build complex AI apps integrated with their unique business systems.

Take action with Palmyra X 004
Action capabilities allow Palmyra to perform real work (updating systems, performing transactions, sending emails, triggering workflows) after processing an input. Actions are a stepping stone toward agentic AI.

Returning to the product roadmapping example above, let’s say a product manager is prioritizing new feature requests based on customer feedback in Zendesk. Palmyra can now automate this process by interpreting the request and deciding which tools it needs to respond to the user. Once it identifies the right tools, it sequences the actions it needs to take. It then gathers feedback from Zendesk, runs it through a prioritization tool, and creates a Jira ticket, all automatically.
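The roadmapping flow can be sketched as a chain of three tools, where each step feeds the next. Everything below is illustrative: the functions are local stubs rather than real Zendesk or Jira API calls, the feedback data is made up, and the `plan` list stands in for the sequence the model would decide on.

```python
from collections import Counter

def fetch_feedback() -> list[str]:
    # Stand-in for a Zendesk query; returns feature tags from customer feedback.
    return ["dark mode", "csv export", "dark mode", "sso", "dark mode", "csv export"]

def prioritize(feedback: list[str]) -> str:
    # Toy prioritization tool: the most requested feature wins.
    return Counter(feedback).most_common(1)[0][0]

def create_ticket(feature: str) -> dict:
    # Stand-in for a Jira call; returns the ticket that would be created.
    return {"project": "PROD", "title": f"Feature request: {feature}", "status": "open"}

TOOLS = {
    "fetch_feedback": fetch_feedback,
    "prioritize": prioritize,
    "create_ticket": create_ticket,
}

# Hypothetical ordered plan, as the model might sequence it.
plan = ["fetch_feedback", "prioritize", "create_ticket"]

def run_plan(steps: list[str]):
    """Execute tools in order, passing each result into the next tool."""
    result = None
    for i, step in enumerate(steps):
        tool = TOOLS[step]
        result = tool() if i == 0 else tool(result)
    return result

ticket = run_plan(plan)
print(ticket["title"])
```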

Beyond third-party tools, Writer also offers unique enterprise-specific tools that extend Palmyra X 004’s ability to take action. These include:

  • A built-in graph-based RAG tool that automatically brings company data from your Knowledge Graph into custom chat apps in Writer. For example, a user might ask an internal enablement Writer app, “Compare our Q3 product performance metrics to last year,” and Palmyra X 004 will automatically infer that the built-in RAG tool needs to be called to fetch performance data before analyzing it.
  • An application endpoint using the Applications API that lets you build multi-step workflows with your no-code Writer apps. This would allow a developer to build a custom app that combines the power of other Writer no-code apps, like a Writer email text-gen app and a Writer chat app powered by a domain-specific model, like Palmyra-Med.

Action capabilities help enterprises integrate tools with AI, simplify repetitive workflows and context switching for end users, and empower non-technical teams. Palmyra also now takes on some of the burden from developers by dynamically managing workflows and making decisions on tool use without constant developer input.

Example actions include: pull data from a CRM, update financial records, send an email, forecast revenue performance, retrieve support tickets, write and deploy code, generate an XML report, adjust ad spend, assign project tasks, launch a campaign, pay invoices, and analyze infrastructure logs.

By simplifying the coding and automating decision-making, tool calling makes it easier to activate a wide range of powerful, multi-action workflows, all in response to natural language:

  • A financial institution can automatically pull data from external financial databases via API, perform complex analyses defined by functions, and update dashboards.
  • A non-technical team member in the manufacturing industry can retrieve product and supply chain data and run SQL queries without writing any code.
  • A healthcare payor can automate claim processing, pulling relevant patient data automatically from a graph of patient health records while maintaining compliance and privacy.
  • A frontend web developer can debug, write, and deploy new code for a webpage, and publish it directly in a CMS.

Building the market-leading LLM for action capabilities

Writer is committed to building scalable AI solutions that meet the stringent accuracy and reliability needs of the enterprise. With Palmyra X 004, we took a unique approach of training with synthetic data, helping us produce the top-ranking model at a significantly lower cost than other frontier models, and setting a new standard for cost-efficiency in model development. Our four-year track record of innovation in LLM development — across open-source, closed environment, vision, and domain-specific models for industry verticals — is recognized by leading researchers.

Tool calling accuracy

Early results show Palmyra X 004 is the leading model on Berkeley’s Tool Calling Leaderboard benchmarks by a large margin (listing coming soon). It’s ranked as the most accurate model for tool calling and API selection, ahead of all GPT, Llama, Claude, and Gemini models.

The benchmarks put LLMs through real-world scenarios to evaluate their ability to select the correct tools, determine which API to call, and successfully execute a function. The result is an accurate, fast, reliable model that can execute multiple tool calls in sequence or in parallel in one interaction with the user.
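As a rough illustration of the difference between parallel and sequential tool calls, the sketch below runs two independent lookups at the same time, then a dependent calculation afterward. The tool names and revenue figures are placeholders invented for the example, and the thread pool is just one way an app could execute the calls a model returns.

```python
from concurrent.futures import ThreadPoolExecutor

def get_revenue(quarter: str) -> int:
    # Stub data source with placeholder figures.
    return {"Q3-2023": 120, "Q3-2024": 150}[quarter]

def percent_change(old: int, new: int) -> float:
    return round((new - old) / old * 100, 1)

# Parallel: the two lookups do not depend on each other,
# so they can be executed concurrently. map() preserves input order.
with ThreadPoolExecutor() as pool:
    old, new = pool.map(get_revenue, ["Q3-2023", "Q3-2024"])

# Sequential: this call needs both earlier results, so it runs after them.
change = percent_change(old, new)
print(change)
```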

Average performance across tool calling benchmarks
The benchmarks measure overall accuracy, along with the ability to plan and structure (AST) and to execute (Exec) a single tool call, or multiple sequential or parallel tool calls, in one step (single-turn).
  • Top accuracy (acc): Palmyra X 004 achieves 78.76% in overall accuracy in identifying and executing the correct tool call, leading the industry by a nearly 20% margin.
  • Leading ability to structure a call (AST): Palmyra X 004 achieves the highest average performance of 87.93% on correctly planning and organizing tool call(s) before execution, demonstrating its ability to accurately interpret a user input, generate the correct parameters, and sequence the steps for a tool call.
  • Leading ability to execute a call (Exec): Palmyra X 004 achieves 88.27% performance on executing tool call(s), ranking highest among all models at efficiently carrying out actions across enterprise systems.
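The distinction between the AST and Exec categories can be sketched in a few lines. This is a simplified illustration of the idea, not the leaderboard's actual scoring code: the `add` function, the call format, and both checker functions are assumptions made for the example.

```python
import json

def add(a: int, b: int) -> int:
    return a + b

FUNCTIONS = {"add": add}

def ast_check(call: dict, expected_name: str, expected_args: dict) -> bool:
    """Structural check: right function and right parameters, without running anything."""
    return call["name"] == expected_name and json.loads(call["arguments"]) == expected_args

def exec_check(call: dict, expected_result) -> bool:
    """Execution check: actually run the call and compare the outcome."""
    fn = FUNCTIONS[call["name"]]
    return fn(**json.loads(call["arguments"])) == expected_result

good = {"name": "add", "arguments": '{"a": 2, "b": 3}'}
swapped = {"name": "add", "arguments": '{"a": 3, "b": 2}'}

print(ast_check(good, "add", {"a": 2, "b": 3}), exec_check(good, 5))
print(ast_check(swapped, "add", {"a": 2, "b": 3}), exec_check(swapped, 5))
```

The `swapped` call fails the structural check but still passes the execution check, which is why a benchmark might report the two scores separately.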

Palmyra X 004 also debuted as one of the world’s top 10 models on both HELM Lite, a holistic framework for evaluating foundation models, and HELM MMLU, which tests understanding across 57 subjects, scoring 86.1% and 81.3% respectively.

Getting started with Palmyra X 004

Palmyra X 004 is a foundational step in making Writer the central nervous system of an enterprise, seamlessly connecting data, tools, and departments with minimal human intervention or custom development. Combined with Writer’s full-stack platform, these new capabilities open up a new frontier of innovative AI use cases.

Palmyra X 004 is available now in Ask Writer and AI Studio. Talk to your developers about getting started with actions through Palmyra X 004 today.