Generating alpha in the age of AI: Weighing rewards against risk

Writer Team   |  May 9, 2025

AI is a seismic shift that’s already redefining how we approach investing. Its ability to analyze large amounts of structured and unstructured data at unprecedented speeds enables us to uncover patterns and investment insights that would otherwise take weeks, if not months, to develop. But off-the-shelf LLMs hallucinate, and that’s too much risk for enterprises. Firms need a purpose-built solution that consistently delivers accurate outputs.

Writer is an end-to-end platform built by leading AI researchers and seasoned financial professionals who have the domain expertise needed to deliver on these complex use cases. Writer customers are already transforming the way they’re conducting investment research, building portfolios, and delivering alpha to their clients.

Head of financial services at Writer Dilshoda Yergasheva and Director of Developer Relations Sam Julien recently led a financial services AMA. They covered our new industry benchmark, FailSafeQA, as well as alpha-generating AI use cases and the pitfalls of off-the-shelf LLMs.

Summarized by Writer

  • AI’s capability to analyze large amounts of data quickly is revolutionizing investment research and portfolio construction, enabling the discovery of deeper insights and patterns.
  • Off-the-shelf AI models can be unreliable and prone to errors, such as making up information, which is unacceptable for the highly regulated financial services industry.
  • Writer’s FailSafeQA benchmark tests AI models for robustness and context grounding, revealing that domain-specific models like Palmyra Fin outperform general models in handling real-world issues and avoiding inaccurate answers.
  • Writer’s specialized AI agents provide tailored financial insights, step-by-step guidance, and real-time data without compromising accuracy or compliance.
  • Financial firms can use Writer to enhance research efficiency, optimize portfolios, and maintain a competitive edge by staying informed on market movements and identifying biases in investment patterns.

The role of AI in financial services

Financial services relies heavily on vast amounts of data paired with the knowledge and context to properly analyze and advise. And AI presents a massive opportunity: Goldman Sachs’ CEO says 95% of an S-1 filing can be written by AI. That’s just the tip of the iceberg: AI can turbocharge bespoke research and complex, alpha-generating tasks.

Access to AI platforms like Writer democratizes investment knowledge, enabling investors to:

  • Enhance research efficiency: AI can analyze large datasets — including non-traditional and hard-to-access data sources — providing deeper insights into market trends and company performance.
  • Optimize portfolios: By using AI-driven insights, investors can build more balanced and risk-aware portfolios, adapting to market changes with greater agility.
  • Stay informed: AI tools can monitor real-time portfolio data, ensuring you remain updated on market movements and news, and maintain a competitive edge.
  • Work with a real-time coach: AI can serve as an investment coach based on your historical investment patterns. It can help pinpoint biases — like a habit of holding on to positions for too long.

The risks of AI in financial services

While AI can give firms a competitive edge in separating signal from noise, it’s important to highlight the risks. When assessing AI vendors, there are a few considerations you need to keep in mind to make sure your AI partner is well-suited for financial services.

“We know that off-the-shelf models built for the general public, not the enterprise, tend to hallucinate,” Yergasheva says. “So when you choose to power investment workflows with AI, it is paramount that you are working with a reliable and accurate model.”

To power mission-critical use cases — like investment research and portfolio construction — you need to be able to trust that the information from your chosen model is consistently accurate. This is where off-the-shelf LLMs fall short. Trading on the wrong information will cost you and your clients.

Financial services is a highly regulated industry that requires trustworthy information, which is what domain-specific models deliver. To better assess the accuracy and reliability of domain-specific LLMs used for financial use cases, our research team developed the FailSafeQA benchmark.

Why domain-specific models matter

Most benchmarks test AI in ideal scenarios with perfect datasets. Unfortunately, the real world isn’t so tidy, especially in finance where compliance is non-negotiable.

One of the lead researchers and authors of FailSafeQA built a grammatical error correction system that aced all industry tests. But when our CTO and Co-Founder Waseem AlShikh gave it a simple, real-world message like “Hey, man. How are you?” the system crashed and burned.

“This experience illustrates something fundamental,” says Julien. “When we talk about enterprise-grade safety, we need to consider many more scenarios than just accuracy on perfect inputs.”

FailSafeQA

Human error consideration

We tested 24 different AI models, including seven “thinking models” like OpenAI’s o1 and o3-mini and DeepSeek R1, to see how they handle two major types of problems: what happens when the question is wrong, and what happens when the context is missing?

Testing these models from this perspective gives better insight into how models perform in real-world scenarios. We looked at common user mistakes like spelling errors, incomplete queries, and using everyday language instead of industry jargon. We also checked what happens when the document is missing, hard to read, or irrelevant.
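The user mistakes described above can be simulated with simple text perturbations. The sketch below is illustrative only: `add_typos` and `truncate_query` are hypothetical helpers, not the FailSafeQA implementation, and the sample question is made up.

```python
import random


def add_typos(query: str, rate: float = 0.1, seed: int = 0) -> str:
    """Swap adjacent letters to mimic spelling errors (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed keeps the perturbation reproducible
    chars = list(query)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the swapped pair so it is not swapped back
        else:
            i += 1
    return "".join(chars)


def truncate_query(query: str, keep_fraction: float = 0.5) -> str:
    """Keep only the leading words to mimic an incomplete query (hypothetical helper)."""
    words = query.split()
    keep = max(1, int(len(words) * keep_fraction))
    return " ".join(words[:keep])


# Made-up question, used only to show the perturbations.
question = "What was the fund's expense ratio in the most recent fiscal year?"
variants = {
    "original": question,
    "misspelled": add_typos(question, rate=0.15),
    "incomplete": truncate_query(question),
}
for name, text in variants.items():
    print(f"{name}: {text}")
```

Sending each variant to the same model and comparing the answers gives a rough sense of how sensitive it is to imperfect input.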

Then, we focused on two key metrics: robustness and context grounding.

“Robustness means the AI works well even with these issues, while context grounding means it doesn’t make up answers when it shouldn’t. For example, you’re asking about mutual fund information,” Julien says. “You do want the model to respond if you just made a little spelling error in your question or the prospectus had a couple of OCR problems, but you definitely don’t want the model to make stuff up if you upload the wrong prospectus.”
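To make the two metrics concrete, here is a minimal sketch of how they could be tallied from labeled test outcomes. The `Result` fields and both scoring functions are simplified assumptions for illustration, not the benchmark's actual scoring method, and the sample outcomes are invented.

```python
from dataclasses import dataclass


@dataclass
class Result:
    answerable: bool  # did the provided context actually contain the answer?
    answered: bool    # did the model give an answer rather than decline?
    correct: bool     # was the answer right (only meaningful if answered)?


def robustness(results: list[Result]) -> float:
    """Share of answerable cases (typos, OCR noise, etc.) answered correctly."""
    cases = [r for r in results if r.answerable]
    if not cases:
        return 0.0
    return sum(r.answered and r.correct for r in cases) / len(cases)


def context_grounding(results: list[Result]) -> float:
    """Share of unanswerable cases (missing or wrong document) where the model declined."""
    cases = [r for r in results if not r.answerable]
    if not cases:
        return 0.0
    return sum(not r.answered for r in cases) / len(cases)


# Invented outcomes, for illustration only.
results = [
    Result(answerable=True, answered=True, correct=True),     # typo in question, right answer
    Result(answerable=True, answered=True, correct=False),    # OCR noise, wrong answer
    Result(answerable=False, answered=False, correct=False),  # wrong prospectus, declined
    Result(answerable=False, answered=True, correct=False),   # missing document, fabricated
]
print(f"robustness: {robustness(results):.2f}")
print(f"context grounding: {context_grounding(results):.2f}")
```

Keeping the two scores separate matters: a model can look strong on one while failing the other, for example by answering everything confidently, including questions it has no basis to answer.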

Key findings from our FailSafeQA benchmark

Figure: Thinking model vs. chat model performance on finance-specific tasks, with Writer at the top of the FailSafeQA Compliance Score.

Our FailSafeQA benchmark revealed some surprising insights. Reasoning models, while great at answering questions, also made up answers up to 41% of the time. Turns out, they’re not “thinking” as much as we’d like to believe.

Then you have models like Palmyra Fin, Gemini Flash, and the Qwen series that showed much stronger context grounding. They’re more likely to politely decline to answer when the information is missing — a necessary feature in finance.

“This shows the need for domain-specific models and that reasoning models, while useful for some tasks, are not magic bullets for all scenarios,” Julien explains. “When we talk about the deep understanding of financial analysis, Palmyra Fin outshines the rest of the models.”

How Writer agents ensure compliance

Writer’s top-ranked specialized LLM, Palmyra Fin, is purpose-built for financial use cases. It understands and navigates the complexities of financial data to deliver accurate and insightful analysis. To see how it excels in real-world scenarios, let’s dive into two sample agents.

Palmyra Fin assistant

You can chat with Palmyra Fin assistant in general mode about various topics. Yergasheva demonstrates what happens when you discuss statistical techniques for portfolio optimization with Palmyra Fin assistant.

“Before, in order to be able to access this knowledge, you would need to either look through a number of finance or statistical books, or you would have to work with other folks,” Yergasheva says. “A lot of firms have an apprentice model where seasoned portfolio managers kind of teach you how to do these techniques. With Writer, you can have that type of a coach right next to you, telling you what are the different ways you can optimize some of your portfolios.”

Palmyra Fin assistant offers step-by-step guides on setting up these optimization techniques, including the data needed, calculations required, and even example Python code. Whether you’re a finance newbie or a seasoned pro, you can access advanced strategies with ease.
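As a rough idea of what such example code might look like, the sketch below computes minimum-variance weights for a two-asset portfolio using the standard closed-form solution. It is our own illustration, not actual Palmyra Fin output; the function names and the volatility and correlation inputs are assumptions chosen for the example.

```python
import math


def min_variance_weights(vol_a: float, vol_b: float, corr: float) -> tuple[float, float]:
    """Closed-form minimum-variance weights for two assets (short positions allowed)."""
    var_a, var_b = vol_a ** 2, vol_b ** 2
    cov = corr * vol_a * vol_b
    denom = var_a + var_b - 2 * cov  # variance of the spread between the two assets
    if denom <= 0:
        raise ValueError("no unique solution: the two assets are indistinguishable")
    w_a = (var_b - cov) / denom
    return w_a, 1.0 - w_a


def portfolio_volatility(w_a: float, w_b: float, vol_a: float, vol_b: float, corr: float) -> float:
    """Volatility of a two-asset portfolio with the given weights."""
    cov = corr * vol_a * vol_b
    variance = (w_a * vol_a) ** 2 + (w_b * vol_b) ** 2 + 2 * w_a * w_b * cov
    return math.sqrt(variance)


# Assumed inputs: annualized volatilities of 20% and 10%, correlation of 0.2.
vol_a, vol_b, corr = 0.20, 0.10, 0.2
w_a, w_b = min_variance_weights(vol_a, vol_b, corr)
vol_p = portfolio_volatility(w_a, w_b, vol_a, vol_b, corr)
print(f"weights: {w_a:.3f}, {w_b:.3f}")
print(f"portfolio volatility: {vol_p:.4f}")
```

With these inputs, the blended portfolio ends up less volatile than either asset alone, which is the diversification effect that optimization techniques aim to exploit.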

Another powerful feature of Palmyra Fin assistant is its ability to handle open-ended questions without making things up. For example, if you ask about the top three growth equity mutual funds, the model won’t just spit out a fabricated list.

Figure: Palmyra Fin assistant example output

“What Writer is doing here is actually telling me that in order to provide a detailed analysis of these top three growth mutual funds, it would actually need to get access to the most recent data,” Yergasheva explains. “So, instead of hallucinating an answer for you, it actually is clearly telling you that, ‘Hey, I can tell you what data I’ve been trained on, and I can tell you it looks like the most recent information was as of 2023.’ It knows what top funds are, but it also is sharing the fact that it needs some information to give you the most recent data.”

SEC analysis assistant

SEC analysis assistant is another example of an agent built in AI Studio. Because it’s connected directly to SEC filings, it’s like having a personal analyst at your disposal.

“You can tailor it: if you’re a growth portfolio manager and when you read a 10-K, all you care about is forward projections, that’s what you do,” Yergasheva says. “If you’re a value portfolio manager, you can prompt it accordingly and make that summary your own.”

Users can dive deeper with an insights chat, getting even more detailed information. And for added transparency, the agent always shows the source documents, so you can verify the data and ensure everything is correct. This level of customization and transparency means you get the insights you need, exactly when you need them, without any fluff or guesswork.

Writer empowers financial decisions with precision

Writer allows financial firms to rebuild their core operations to be AI-first. Our AI Agent Library has solutions tailored to different financial services use cases. Whether you’re a portfolio manager, a risk analyst, or a compliance officer, we’ve got you covered with solutions that are not only smart but also transparent and trustworthy.

As highlighted in our FailSafeQA benchmark, Writer provides AI-driven insights that are grounded in the source material. Watch the full webinar to see how Writer can transform your investment strategy and help you generate alpha in a world where every second counts.