
Why CSAT Fails in Evaluating AI-Powered Customer Support Performance

Guillaume Luccisano
Friday, May 30, 2025
5 min read
CSAT can be misleading when used to evaluate AI in customer support, because AI takes the easy tickets and leaves human agents with the tougher cases, skewing the scores. A new scoring system designed with AI in mind might offer a fairer assessment of both AI and human contributions.

Yes, CSAT is not fit to evaluate your AI's performance. It was arguably adequate before AI, but with AI in the mix, it's now an outdated tool. Let me explain why below.

Some background on CSAT first

Despite its limitations, CSAT is widely used in the customer service industry as a key metric for gauging the quality of a support department. Its simplicity and widespread adoption have made it a universally recognized standard.

CSAT is a score provided by a customer to rate the quality of support received, usually measured on a 1-5 scale in the e-commerce industry.

However, CSAT has known biases, primarily response bias and temporal bias. Often, only dissatisfied customers take the time to grade your service. Additionally, the score is typically collected shortly after an interaction and may not reflect the customer's entire journey. Add to this the fact that CSAT is highly dependent on the business model and the products being sold: it can vary widely between merchants, and not always because of the quality of their support team.
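
To see how strong the response bias can be, here is a small sketch with invented numbers: even when 90% of customers are genuinely satisfied, a higher response rate among the unhappy ones drags the measured score far down.

```python
# Hypothetical illustration of response bias in CSAT (all numbers invented).
happy, unhappy = 900, 100              # true mix: 90% of customers satisfied
happy_rate, unhappy_rate = 0.05, 0.30  # unhappy customers respond far more often

happy_responses = happy * happy_rate        # 45 responses, each scoring 5
unhappy_responses = unhappy * unhappy_rate  # 30 responses, each scoring 1

measured = (happy_responses * 5 + unhappy_responses * 1) / (
    happy_responses + unhappy_responses
)
print(f"Measured CSAT: {measured:.2f} / 5")  # ~3.40, despite 90% true satisfaction
```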

With that in mind and despite those flaws, CSAT can still be considered a good general indicator to monitor the health of your support organization.

AI to the rescue

The introduction of AI is a significant shift, offering numerous benefits to your customers.

To state the obvious, AI ensures 24/7 support, faster response times, higher overall ticket quality thanks to a shared knowledge base, and streamlined centralized procedures.

While AI is a major boost to your support organization, it's crucial not to rely solely on CSAT to gauge its efficiency. Even though some AI tools out there are touting their AI CSAT, here's why you shouldn’t take those scores at face value.

AI vs Humans

By default, your AI will begin by handling the simplest cases and answering them quickly, which often results in higher CSAT scores. It's relatively easy for any good AI to achieve a good CSAT, especially since it tends to handle cases whose outcomes align with customer expectations (an order-status lookup, say, rather than a denied refund). A good CSAT for your AI is basically a baseline requirement, and an easy one to hit.

However, what happens next? Your human team is left with the more complex cases: tickets that may go against the customer's wishes, or genuine problems such as a lost package, which are far more likely to result in lower CSAT scores.

Consequently, when you split your CSAT scores into human vs. AI, you're comparing two very different datasets, and the comparison is inherently biased. Worse, as your AI scales, its CSAT remains high and steady, while your human CSAT may keep decreasing as your team is left with the more arduous cases.
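
This mix effect is the classic Simpson's paradox pattern. A toy sketch with invented numbers shows how an AI can trail humans on every single intent and still post the higher blended CSAT, simply because its ticket mix skews easy:

```python
# Invented per-intent CSAT averages and ticket volumes (illustrative only).
# On *every* intent below, humans score higher than the AI...
data = {
    #  intent:      (ai_csat, ai_volume, human_csat, human_volume)
    "order_status": (4.8,     900,       4.9,        100),
    "lost_package": (3.6,      50,       3.8,        400),
}

def overall(side):
    score_idx, vol_idx = (0, 1) if side == "ai" else (2, 3)
    total = sum(v[vol_idx] for v in data.values())
    return sum(v[score_idx] * v[vol_idx] for v in data.values()) / total

print(f"AI overall:    {overall('ai'):.2f}")     # ~4.74
print(f"Human overall: {overall('human'):.2f}")  # ~4.02
# ...yet the AI's blended CSAT looks far better, purely from the ticket mix.
```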

This is unfair to your human team and might give you a misleadingly positive impression of your AI. The AI is simply handling the easier tasks and might not actually be doing the hard work.

If you still want to use CSAT, at least compare tickets with similar intents (filter by tag or ticket fields, for example). This gives a much more accurate picture of how your AI is truly performing. It also goes without saying: pick an AI tool that can truly automate your support. You want autonomous AI agents that can fetch information from external services and take actions in them, i.e., truly automating, not just answering simple Q&A about your business. That means handling L2 and L3 tickets, not just L1.
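
For that intent-level comparison, here is a minimal sketch, assuming each ticket record exposes an intent tag, who resolved it, and a CSAT score (the field names are hypothetical; map them to whatever your helpdesk exports):

```python
from collections import defaultdict

# Hypothetical ticket records; adapt field names to your helpdesk's export.
tickets = [
    {"intent": "order_status", "handled_by": "ai",    "csat": 5},
    {"intent": "order_status", "handled_by": "human", "csat": 4},
    {"intent": "lost_package", "handled_by": "ai",    "csat": 3},
    {"intent": "lost_package", "handled_by": "human", "csat": 4},
]

# (intent, side) -> [total score, response count]
sums = defaultdict(lambda: [0, 0])
for t in tickets:
    key = (t["intent"], t["handled_by"])
    sums[key][0] += t["csat"]
    sums[key][1] += 1

# Compare AI vs. human within each intent, never across the whole queue.
for intent in sorted({t["intent"] for t in tickets}):
    row = [intent]
    for side in ("ai", "human"):
        total, count = sums[(intent, side)]
        row.append(f"{side}: {total / count:.2f}" if count else f"{side}: n/a")
    print("  ".join(row))
```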

A New Industry-Wide Scoring with AI in Mind?

Clearly, as merchants everywhere adopt AI to improve the quality and efficiency of their support, we need to rethink how we track the quality of each interaction. This likely means creating a new score ready for the AI age, one that isn't biased by policy enforcement, speed, or mistakes beyond the control of the support agents.

At Yuma, we're developing an alternative scoring system that we plan to release this coming June. Our goal is to create a system that's fair to humans, and that can assess both the quality of overall interactions and adherence to policies. If you have any ideas for what we should include in this new scoring system, please share your insights! What would be the perfect scoring mechanism for you? Can a single score actually be perfect?

To conclude, while CSAT is still a reasonably good proxy overall, please avoid using it to distinguish between Humans and AI. Or if you still do, do it while being fully aware of all the biases in that split :)

#ai
#automation
#customersupport
#e-commerce
