Inside Writer


Introducing intelligent actions with Palmyra X4

Powerful new model sets the bar with action capabilities

Writer Team | October 9, 2024

AI is already starting to reshape enterprise workflows. In 2024, 96% of companies say they view generative AI as a key enabler, with 82% anticipating rapid growth in its adoption across various departments.

The transformative potential of this emerging technology lies in deciding strategically which work, and how much of it, AI can reliably take on. When you give LLMs the ability to do more with the tools you already use, your company stands to gain efficiency, scalability, and revenue growth.

With the release of our new top-ranking LLM, Palmyra X4, we’re taking another step toward more impactful AI. Palmyra X4 boasts state-of-the-art reasoning through novel training techniques and a suite of powerful new features and capabilities, including:

The ability for the LLM to take action in external systems, like software, databases, and other WRITER apps via tool calling
Automatic data integration with built-in retrieval-augmented generation (RAG), including chain-of-thought and source transparency
Code generation and deployment
An expanded 128k context window
Structured output generation for simpler system integration (coming in a few weeks)

Palmyra X4 joins WRITER’s widely popular Palmyra family of enterprise-grade LLMs, which support 30+ languages and multimodal inputs across images, audio, and video. The new model has been benchmarked against models on Berkeley’s Tool Calling Leaderboard, and early results show it leading by a significant margin, outperforming models from providers including OpenAI, Anthropic, Meta, and Google (official listing coming soon). It’s also top-ranked on Stanford HELM. Palmyra X4 is also the first model of its size to be trained on synthetic data at a fraction of the cost reported by major AI labs, showcasing our commitment to innovative and cost-effective ways to scale AI.

Palmyra X4 is available today in Ask WRITER, our prebuilt chat app, and AI Studio, our suite of development tools.

Mobilizing AI apps that can take action in your enterprise tools

Imagine AI collapsing all the manual steps involved in the product development lifecycle (pulling feedback, analyzing it, prioritizing features, and creating tickets) without human intervention. Today, this process involves time-consuming manual extraction of feedback and bug reports, followed by cross-functional decision-making, all of which slows down the development lifecycle.

Palmyra X4’s action capabilities in custom AI apps built in AI Studio are enabled by a mechanism called tool calling, which allows the model to interact with tools and services made available to it by an AI engineer. Tool calling unlocks more sophisticated workflows by arming an LLM with tools beyond its built-in knowledge. You can read more about how tool calling works in our engineering blog.
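As a rough illustration of the mechanism, the sketch below shows how an engineer might describe a tool and execute a call that a model requests. It assumes the JSON-schema style of tool description common to tool-calling APIs; the `create_ticket` tool, the `TOOLS` registry, and `run_tool_call` are hypothetical names for this example, not part of the WRITER API, and the model's output is simulated with a hard-coded string.

```python
import json

# Hypothetical tool description in the JSON-schema style used by many
# tool-calling APIs. Field names here are illustrative assumptions.
CREATE_TICKET_SPEC = {
    "type": "function",
    "function": {
        "name": "create_ticket",
        "description": "Create a ticket in the team's issue tracker.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "priority": {"type": "string", "enum": ["low", "medium", "high"]},
            },
            "required": ["title", "priority"],
        },
    },
}


def create_ticket(title, priority):
    """Stand-in for a real integration; returns the ticket it would create."""
    return {"title": title, "priority": priority, "status": "open"}


# Registry mapping each tool name to its description and implementation.
TOOLS = {"create_ticket": (CREATE_TICKET_SPEC, create_ticket)}


def run_tool_call(name, arguments_json):
    """Execute one tool call requested by the model.

    The model only emits a tool name plus JSON arguments; the application
    checks the request and decides how to execute it.
    """
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    spec, implementation = TOOLS[name]
    arguments = json.loads(arguments_json)
    required = spec["function"]["parameters"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return implementation(**arguments)


# Simulated model output: a tool name and a JSON string of arguments.
result = run_tool_call(
    "create_ticket", '{"title": "Add CSV export", "priority": "high"}'
)
print(result)
```

The point of the registry is that the model never runs code directly: it can only ask for tools the engineer has registered, and the application validates each request before acting on it.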

Palmyra X4 combined with the WRITER full-stack platform approach enables developers to more easily build complex AI apps integrated with their unique business systems.

Take action with Palmyra X 004: action capabilities allow Palmyra to perform real work (updating systems, performing transactions, sending emails, triggering workflows) after processing an input. Actions are a stepping stone toward agentic AI.

Returning to the product roadmapping example above, let’s say a product manager is prioritizing new feature requests in Zendesk based on customer feedback. Palmyra can now automate this process by interpreting the request and deciding which tools it needs to respond to the user. Once it identifies the right tools, it sequences the actions it needs to take: it gathers feedback from Zendesk, runs it through a prioritization tool, and creates a Jira ticket, all automatically.
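The workflow above can be sketched as a simple loop in which the application asks the model for the next tool call, runs it, and feeds the result back. Everything here is a stand-in: `stub_model` replaces the LLM with fixed rules, the three tools return made-up sample data instead of calling Zendesk or Jira, and none of the names come from the WRITER platform.

```python
from typing import Optional


def fetch_feedback(source):
    """Hypothetical stand-in for a Zendesk query; returns sample data."""
    return [
        {"feature": "CSV export", "votes": 42},
        {"feature": "Dark mode", "votes": 17},
    ]


def prioritize(items):
    """Hypothetical prioritization tool: pick the most requested feature."""
    return max(items, key=lambda item: item["votes"])


def create_ticket(title):
    """Hypothetical stand-in for creating a Jira ticket."""
    return {"key": "PROJ-1", "title": title}


TOOLS = {
    "fetch_feedback": fetch_feedback,
    "prioritize": prioritize,
    "create_ticket": create_ticket,
}


def stub_model(history) -> Optional[dict]:
    """Stand-in for the LLM: choose the next tool call from what has run so far."""
    done = [step["tool"] for step in history]
    if "fetch_feedback" not in done:
        return {"tool": "fetch_feedback", "args": {"source": "zendesk"}}
    if "prioritize" not in done:
        return {"tool": "prioritize", "args": {"items": history[-1]["result"]}}
    if "create_ticket" not in done:
        top_feature = history[-1]["result"]["feature"]
        return {"tool": "create_ticket", "args": {"title": top_feature}}
    return None  # nothing left to do


def run_workflow(max_steps=10):
    """Run tool calls until the model stops asking or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        call = stub_model(history)
        if call is None:
            break
        result = TOOLS[call["tool"]](**call["args"])
        history.append({"tool": call["tool"], "result": result})
    return history


history = run_workflow()
print([step["tool"] for step in history])
print(history[-1]["result"])
```

The `max_steps` cap is a deliberate design choice in this sketch: when a model decides which tools to call, a step budget keeps a confused plan from looping indefinitely.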

Beyond third-party tools, WRITER also offers unique enterprise-specific tools that extend Palmyra X4’s ability to take action. These include:

A built-in graph-based RAG tool that automatically brings in company data from your Knowledge Graph into custom chat apps in WRITER. For example, a user might ask an internal enablement WRITER app, “Compare our Q3 product performance metrics to last year,” and Palmyra X4 will automatically infer that the built-in RAG tool needs to be called to fetch performance data before analyzing it.
An application endpoint using the Applications API that lets you build multi-step workflows with your no-code WRITER apps. This would allow a developer to build a custom app that combines the power of other WRITER no-code apps, like a WRITER email text-gen app and a WRITER chat app powered by a domain-specific model, like Palmyra Med.

Action capabilities help enterprises integrate tools with AI, simplify repetitive workflows and context switching for end users, and empower non-technical teams. Palmyra also now takes on some of the burden from developers by dynamically managing workflows and making decisions on tool use without constant developer input.

Examples of actions: pull data from a CRM, update financial records, send an email, forecast revenue performance, retrieve support tickets, write and deploy code, generate an XML report, adjust ad spend, assign project tasks, launch a campaign, pay invoices, analyze infrastructure logs.

By simplifying the coding and automating decision-making, tool calling makes it easier to activate a wide range of powerful, multi-action workflows, all in response to natural language:

A financial institution can automatically pull data from external financial databases via API, perform complex analyses defined by functions, and update dashboards.
A non-technical team member in the manufacturing industry can retrieve product and supply chain data and run SQL queries without writing any code.
A healthcare payor can automate claim processing, pulling relevant patient data automatically from a graph of patient health records while maintaining compliance and privacy.
A frontend web developer can debug, write, and deploy new code for a webpage, and publish it directly in a CMS.

Building the market-leading LLM for action capabilities

WRITER is committed to building scalable AI solutions that meet the stringent accuracy and reliability needs of the enterprise. With Palmyra X4, we took a unique approach of training with synthetic data, helping us produce the top-ranking model at a significantly lower cost than other frontier models, and setting a new standard for cost-efficiency in model development. Our four-year track record of innovation in LLM development — across open-source, closed environment, vision, and domain-specific models for industry verticals — is recognized by leading researchers.

Early results show Palmyra X4 is the leading model on Berkeley’s Tool Calling Leaderboard benchmarks by a large margin (listing coming soon). It’s ranked as the most accurate model for tool calling and API selection, ahead of all GPT, Llama, Claude, and Gemini models.

The benchmarks put LLMs through real-world scenarios to evaluate their ability to select the correct tools, determine which API to call, and successfully execute a function. The result is an accurate, fast, reliable model that can execute multiple tool calls in sequence or in parallel in one interaction with the user.
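To make the difference between parallel and sequential tool calls concrete, here is a minimal sketch using Python's standard `concurrent.futures` module. It assumes the model has requested two independent lookups followed by a step that depends on both results. The tools, their sample return values, and the `run_parallel` helper are hypothetical and exist only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def get_open_tickets(team):
    """Hypothetical support-system lookup; returns sample data."""
    return {"team": team, "open": 12}


def get_revenue(quarter):
    """Hypothetical finance lookup; returns sample data."""
    return {"quarter": quarter, "amount": 250000}


def summarize(tickets, revenue):
    """Hypothetical follow-up step that needs both earlier results."""
    return (
        f"{tickets['open']} open tickets; "
        f"{revenue['quarter']} revenue {revenue['amount']}"
    )


TOOLS = {
    "get_open_tickets": get_open_tickets,
    "get_revenue": get_revenue,
    "summarize": summarize,
}


def run_parallel(calls):
    """Run independent tool calls concurrently, keeping results in request order."""
    with ThreadPoolExecutor(max_workers=max(1, len(calls))) as pool:
        futures = [pool.submit(TOOLS[c["tool"]], **c["args"]) for c in calls]
        return [future.result() for future in futures]


# Parallel: neither lookup depends on the other, so both can run at once.
parallel_calls = [
    {"tool": "get_open_tickets", "args": {"team": "support"}},
    {"tool": "get_revenue", "args": {"quarter": "Q3"}},
]
tickets, revenue = run_parallel(parallel_calls)

# Sequential: the summary can only run after both lookups have finished.
summary = TOOLS["summarize"](tickets=tickets, revenue=revenue)
print(summary)
```

Independent calls benefit from running concurrently because most of their time is spent waiting on external systems, while dependent calls have to wait for the results they consume.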

Average performance across tool calling benchmarks. Benchmarks include overall accuracy and the ability to plan and structure (AST) and execute (Exec) a single tool call, or multiple sequential or parallel tool calls, in one step (single-turn).

Top accuracy (acc): Palmyra X4 achieves 78.76% in overall accuracy in identifying and executing the correct tool call, leading the industry by a nearly 20% margin.
Leading ability to structure a call (AST): Palmyra X4 achieves the highest average performance of 87.93% on correctly planning and organizing tool call(s) before execution, demonstrating its ability to accurately interpret a user input, generate the correct parameters, and sequence the steps for a tool call.
Leading ability to execute a call (Exec): Palmyra X4 achieves 88.27% performance on executing tool call(s), ranking highest against all models to efficiently carry out actions across enterprise systems.
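The difference between the AST and Exec measures can be illustrated with a small sketch. It reads AST as abstract syntax tree matching, where a predicted call is compared structurally against a reference without being run, and Exec as running the call and comparing the result. This is a simplified assumption about how such checks work, not the leaderboard's actual evaluation harness, and the `convert` tool and helper names are hypothetical.

```python
import ast


def parse_call(source):
    """Parse a call like f(a=1, b="x") into (name, kwargs) without running it."""
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("expected a simple function call")
    if node.args or any(kw.arg is None for kw in node.keywords):
        raise ValueError("only keyword arguments are supported in this sketch")
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return node.func.id, kwargs


def ast_match(predicted, expected):
    """Structural check: same function name and same keyword arguments."""
    return parse_call(predicted) == parse_call(expected)


def exec_match(predicted, expected_result, tools):
    """Execution check: run the predicted call and compare its result."""
    name, kwargs = parse_call(predicted)
    if name not in tools:
        return False
    return tools[name](**kwargs) == expected_result


def convert(amount, rate):
    """Hypothetical tool: convert an amount using an exchange rate."""
    return round(amount * rate, 2)


TOOLS = {"convert": convert}

expected = "convert(amount=10, rate=0.5)"
reordered = "convert(rate=0.5, amount=10)"
different_args = "convert(amount=20, rate=0.25)"

# Argument order does not affect the structural match.
print(ast_match(reordered, expected))
# Different arguments fail the structural check but can still yield the same result.
print(ast_match(different_args, expected))
print(exec_match(different_args, 5.0, TOOLS))
```

The two checks can disagree, which is why a benchmark might report them separately: a call can be structurally different from the reference yet produce the right outcome, or match structurally and still fail when run against a live system.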

Palmyra X4 also debuted as one of the world’s top 10 models on both HELM Lite, a holistic framework for evaluating foundation models, and HELM MMLU, which tests understanding across 57 subjects, scoring 86.1% and 81.3% respectively.

Getting started with Palmyra X4

Palmyra X4 is a foundational step in making WRITER the central nervous system of an enterprise, seamlessly connecting data, tools, and departments with minimal human intervention or custom development. Combined with WRITER’s full-stack platform, these new capabilities open up a new frontier of innovative AI use cases.

Palmyra X4 is available now in Ask WRITER and AI Studio. Talk to your developers about getting started with actions through Palmyra X4 today.
