- Key insight: Goldman pairs Claude agents with rules systems and human oversight to resolve exception-heavy workflows.
- What’s at stake: Potential risks include vendor concentration, regulatory accountability, and operational outages.
- Forward look: Expect AI agents to scale back-office throughput and reshape onboarding and sales enablement.
Source: Bullets generated by AI with editorial review
Many banks already use generative AI models such as Claude to assist with their work.
Having generative AI handle operations tasks, however, is a newer frontier. Trade accounting and know-your-customer compliance are rules-based processes – collect this data, validate it against this other database, gather these documents and so on. That makes them straightforward tasks for any computer program. Why use a high-powered generative AI program such as Claude?
“If that was the case, why would we have thousands of people doing it?” said Marco Argenti, Chief Information Officer at Goldman Sachs.
The best rules-based systems can automate a high percentage of this work, Argenti said. But if a bank is processing hundreds of millions of transactions per day, even a 1% exception rate produces millions of cases that don’t fit neatly within the rules.
For instance, one step in know-your-customer compliance is checking an applicant’s government-issued ID such as a driver’s license. But it might be close to expiring, or there could be some other discrepancy — there are myriad things that can go wrong, he said.
“The beauty of neural networks is the fact that they can reason like a human in those micro use cases, and therefore they complement a rules-based system into something that actually can get you really close to 100%,” Argenti said.
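The hybrid pattern Argenti describes can be sketched in a few lines: a deterministic rules pass handles the clear cases, and only the ambiguous slice is routed to an LLM-backed reviewer. All names, thresholds and data shapes below are invented for illustration, not Goldman’s actual systems.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class IdDocument:
    holder: str
    expires: date

def rules_check(doc: IdDocument, today: date) -> str:
    """Deterministic pass: returns 'pass', 'fail', or 'exception'."""
    if doc.expires < today:
        return "fail"                      # clearly expired: no judgment needed
    if (doc.expires - today).days < 30:
        return "exception"                 # near expiry: needs reasoning
    return "pass"

def route(doc: IdDocument, today: date, llm_review) -> str:
    """Send only the ambiguous slice to the (expensive) LLM reviewer."""
    verdict = rules_check(doc, today)
    if verdict == "exception":
        return llm_review(doc)             # e.g. a model API call in practice
    return verdict
```

The point of the split is economic as much as technical: the rules engine disposes of the bulk of the volume cheaply, and the model’s reasoning is reserved for the exceptions that would otherwise land in a human queue.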
How Goldman has been working with Anthropic
Two years ago, Goldman Sachs began working with Anthropic, initially using Claude to help its developers write code and to carry out autonomous coding tasks.
“That gave us the confidence that the reasoning capabilities of Claude were extremely effective in not only suggesting how to code to our developers, but also in executing autonomous tasks,” Argenti said.
A human developer sets the specs, the regulatory parameters and the intended outcome, he said. A coding agent executes on those instructions, tests and validates its own output, and the resulting code is reviewed by humans.
“It’s a completely different way of working, where you have the concept of agency — the agent understands instructions and can go on its own to create an artifact,” Argenti said.
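The shape of that workflow can be sketched abstractly; the function and its callbacks below are illustrative stand-ins, not Goldman’s tooling or any real agent API.

```python
def agentic_build(spec: str, agent, run_tests, human_approves) -> dict:
    """Human writes the spec; agent builds; tests and a human gate the result."""
    code = agent(spec)                     # agent generates an artifact from the spec
    tests_pass = run_tests(code)           # automated validation of the artifact
    # Human review stays in the loop: nothing ships on the agent's say-so alone.
    approved = tests_pass and human_approves(code)
    return {"code": code, "tests_pass": tests_pass, "approved": approved}
```

The design choice worth noting is that the human contribution shifts to the two ends of the pipeline, specification and sign-off, while generation and validation in the middle are automated.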
To create the new agents for trade accounting and client onboarding, engineers at Goldman Sachs started by studying how the work is done today. They observed the way people use their computers, how the processes work in practice and where the bottlenecks are, then identified areas where the agent could improve the work.
“We ended up with these agents that are reviewing documents, extracting entities, determining, for example, whether you need to ask for another document or determining if you have an ownership on a certain company and your spouse also has an ownership, then you need to do a separate KYC for that, because they’re co-owners,” Argenti said. “There are a lot of micro decisions within boundaries, and those micro decisions are not rules, but based on reasoning, on a chain of thought.”
Why Claude
Trade accounting and client onboarding are data-intensive reasoning problems, said Jonathan Pelosi, head of financial services at Anthropic. High volumes of structured and unstructured inputs need to be parsed, layered rules need to be applied and judgment needs to be exercised where the rules run out.
“That’s the kind of work Claude is designed for,” he said. Anthropic has built integrations into some commonly used tools such as Excel and PowerPoint (the PowerPoint integration is in research preview), Pelosi said.
Indranil Bandyopadhyay, principal analyst at Forrester, also said Claude is good for work like trade accounting and client onboarding, with caveats.
“In trade accounting, much of the operational burden sits in reconciliation: comparing fragmented data across internal ledgers, counterparty confirmations and external bank statements,” Bandyopadhyay told American Banker. “This is extraction and matching work across large volumes of documents, and Claude’s large context window and strong instruction-following capabilities make it effective at processing these document-heavy workflows with precision.”
Claude is also good at client onboarding work, he said, because know-your-customer document parsing and anti-money-laundering screening are labor-intensive, requiring analysts to review passports, corporate registrations and beneficial-ownership documents and cross-reference them against regulatory requirements, Bandyopadhyay said. Claude can accelerate this by extracting structured data from unstructured documents and flagging inconsistencies, thereby reducing cycle times, he said.
“But the critical point is this: Claude is not acting as the system of record,” he said. “It sits in the workflow layer, automating and augmenting the manual, judgment-intensive steps that currently slow these processes down. The accounting system, the compliance platform, those remain the authoritative systems. Claude augments human analysts by handling the repetitive extraction and comparison work, so they can focus on exceptions and final decisions. That’s where the real operational value lies in a regulated environment like financial services.”
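The extract-and-flag step Bandyopadhyay describes can be sketched simply: fields pulled from two documents are cross-checked, and mismatches are queued for an analyst rather than auto-resolved. The document types and field names below are assumptions for illustration.

```python
def flag_inconsistencies(passport: dict, registration: dict) -> list[str]:
    """Cross-check extracted fields from two KYC documents; return mismatches."""
    flags = []
    for field in ("name", "date_of_birth"):
        # Only compare fields that were successfully extracted from both sides.
        if field in passport and field in registration \
                and passport[field] != registration[field]:
            flags.append(field)
    return flags                            # non-empty list -> route to an analyst
```

This keeps the model in the workflow layer, as Bandyopadhyay stresses: it accelerates the comparison work, while the decision on a flagged record stays with a human and the authoritative systems.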
The risk of hallucinations and errors
One ever-present concern of using large language models such as Claude is that they can hallucinate and make mistakes.
“No AI system can guarantee a zero-error rate, but we’ve engineered Claude to reduce that risk and make it manageable in practice,” Pelosi said. “Claude is trained to surface uncertainty rather than confabulate — when the underlying data is ambiguous or incomplete, it’s built to say so rather than produce a confident-sounding but unsupported answer.”
Every Claude output carries full source attribution, creating an audit trail that ties conclusions back to the data, he said.
“That’s not just useful for review — in regulated environments, it’s often a prerequisite for deploying AI.”
Bandyopadhyay warned that no large language model is infallible.
“Banks should not deploy Claude in these workflows without multiple layers of safeguards,” he said. They can keep a human in the loop, as Goldman Sachs does.
They can create programmatic validation layers, Bandyopadhyay said.
“In trade accounting specifically, many outputs are deterministically verifiable,” he said. “If Claude extracts a settlement amount, you can validate it against the ledger entry programmatically. That’s a significant advantage over use cases where correctness is subjective or harder to verify.”
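A deterministic check of that kind might look like the sketch below; the parsing rules and tolerance are assumptions, not any bank’s actual logic.

```python
from decimal import Decimal

def verify_settlement(extracted: str, ledger_amount: Decimal,
                      tolerance: Decimal = Decimal("0.00")) -> bool:
    """Compare a model-extracted amount to the authoritative ledger entry."""
    try:
        value = Decimal(extracted.replace(",", "").lstrip("$"))
    except ArithmeticError:
        return False                       # unparseable extraction fails closed
    return abs(value - ledger_amount) <= tolerance
```

Because the ledger is the system of record, a wrong extraction is caught mechanically before it can propagate, which is exactly the advantage Bandyopadhyay points to over use cases where correctness is subjective.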
Beyond that, grounding techniques like retrieval augmented generation ensure the model is reasoning using a bank’s actual data rather than relying on general training knowledge. Chain-of-verification prompting adds another layer by having the model cross-check its own outputs before presenting them.
“So the honest answer is: you don’t trust the model blindly,” Bandyopadhyay said. “You design the system around it so that errors are caught before they matter.”
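A minimal sketch of those two safeguards together, assuming a generic `complete(prompt) -> str` model client and a `retrieve` function rather than any specific vendor SDK: retrieval grounds the model in the bank’s own records, and a second pass asks the model to audit its first answer before anything is surfaced.

```python
def grounded_answer(question: str, retrieve, complete) -> str:
    docs = retrieve(question)              # RAG: pull the bank's actual data
    context = "\n".join(docs)
    draft = complete(f"Context:\n{context}\n\nQuestion: {question}")
    # Chain-of-verification: have the model cross-check its own draft.
    verdict = complete(
        f"Context:\n{context}\n\nDraft answer: {draft}\n"
        "Does every claim in the draft appear in the context? "
        "Reply CONFIRMED or REVISE."
    )
    # Anything the model won't confirm against its sources goes to a human.
    return draft if "CONFIRMED" in verdict else "ESCALATE_TO_HUMAN"
```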
“The use cases vary, but the pattern is consistent: document-heavy, judgment-intensive workflows where AI can accelerate human review,” Bandyopadhyay said.
Data governance is always important when using generative AI, especially when it’s helping with operations, Bandyopadhyay said.
“Back-office workflows involve highly sensitive information including trade details, client PI, and counterparty data, so banks need to be extremely deliberate about how that data flows when interacting with an LLM, validating whether the model is deployed on-premises or through an API and what data-retention policies the vendor maintains,” he said.
Equally important is regulatory accountability. “Regulators are increasingly clear that adopting AI does not transfer accountability, meaning institutions need robust model risk management frameworks that treat LLMs with the same rigor applied to any quantitative model in production,” Bandyopadhyay said.
Vendor concentration is another risk. “If critical workflows are built around a single model provider, an outage, pricing change, or model behavior update could have significant operational impact, so banks should be designing for model portability with abstraction layers that allow them to swap underlying models without re-engineering workflows,” he said.
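The abstraction layer Bandyopadhyay recommends can be as simple as a narrow shared interface with a thin adapter per vendor, so workflow code never touches a vendor SDK directly. The adapter classes below are placeholders, not real client libraries.

```python
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class VendorAAdapter:
    def complete(self, prompt: str) -> str:
        return f"vendor-a:{prompt}"        # would wrap vendor A's SDK call

class VendorBAdapter:
    def complete(self, prompt: str) -> str:
        return f"vendor-b:{prompt}"        # would wrap vendor B's SDK call

def summarize(model: ChatModel, document: str) -> str:
    # Workflow logic sees only the interface, never the vendor.
    return model.complete(f"Summarize: {document}")
```

Swapping providers then means writing one new adapter, not re-engineering every workflow that calls the model.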
Fraud, security concerns
Another concern about letting AI agents handle client onboarding is the idea that computers can be easier to fool than people because they lack a gut sense that something just feels off. But Argenti strongly countered this argument.
“If you look at the history of fraud and security in general, you know that the most effective way to trick a system or a company is actually to trick people,” Argenti said. “People are the most vulnerable link in phishing and social engineering. Humans are very easy to fool. And in fact, it’s very unfair to have the bad guys using AI to generate stuff, and then having humans on the other side. Maybe there are a lot of people with great instincts, but there’s also a lot of people without great instincts. And that instinct can be higher when you’re fully awake in the morning, but towards the end of your shift or after lunch, you might have a moment of weakness.”
Fraud and security work will always call for a combination of software and people, Argenti said.
“Humans are going to be absolutely fundamental,” he said. “But then the complementary AI is going to tell you, hey, I’m really paying attention to this, you might have overlooked that, and I think the best defense is the combination of powerful humans and powerful AIs – human judgment and AI obsessiveness. AIs are really good at spotting very, very subtle things.”
With AI agents and humans working together, “now you can do it in a way that scales so the same group of people can handle five to ten times the cases, or in less time, because a lot of this machinery of work is actually automated,” Argenti said.
Will there be layoffs?
An existential worry about AI is that if software is doing the work of humans, this will lead to layoffs.
“There are going to be job shifts, the same way we don’t see too many paper accountants around anymore,” Argenti said. “We have a lot of people that use Excel. The real power here is that it gives you the optionality: You can do more work. You can accelerate your timelines. Or you can decide, for example, to take those efficiencies. It really depends on the company, it depends on the job.”
People ask him all the time if, since his developers are now more efficient, he’s going to reduce the number of developers.
The answer is generally no, Argenti said, because there are always new projects he’d like to be able to do.
“As a competitive company that’s in a competitive environment and is focusing on growth, that extra capacity is what you want,” Argenti said. “Most of our focus is on trying to be faster, to be leaner, and to service our clients better, and I think AI really injects this accelerant throughout the entire organization that supercharges the organization to go even faster, to be even leaner and to be even more efficient in serving our clients and growing our business.”
People who work in trade accounting and client onboarding are being trained and reskilled, he said. For instance, prompt engineering is an important skill for bankers to learn, he said, because AI models are very sensitive to how they are prompted.
Argenti plans to continue to unleash AI agents throughout the back office, including in areas like vendor management, procurement and lending, where there are a lot of manual processes, like analyzing complex agreements that are sometimes hundreds of pages long and then turning them into formats that can go to an investment committee or a credit committee.
“But the biggest unlock is going to be on sales enablement,” he said. “How do we enable our client relationship people to have the answers quicker? Maybe instead of answering you in two hours, I’m anticipating your question an hour before you ask, because maybe there is a world event that could trigger certain consequences for certain companies that you’re investing in.” AI agents could scout the market all day and all night, giving salespeople relevant information.
Many proofs of concept are in the works throughout Goldman Sachs.
“We got the developers in a good place to be more productive,” Argenti said. “And we continue to expand it to the operational types of jobs. And then eventually, we will get into enhancing the client experience, empowering people that are client facing, so the story really becomes both about efficiency and growth at the same time.”