BACKGROUND: Airline customer service is hard. Agents need to understand thousands of complicated policies across hundreds of locations. Basic training takes nine months or more, and frustrated travelers endure long wait times. Air France/KLM identified this as a prime use case for generative AI, but only if its accuracy could match or exceed human levels.
Their goal was an AI chatbot, trained on thousands of policy documents, that could help customer service agents answer traveler questions in seconds.
THE CHALLENGE: The airline’s knowledge base is filled with documents that create problems for AI. The docs are visually complex, containing tables, diagrams, decision trees and software screenshots. The information is also complicated. A question as seemingly simple as "How much will I pay for luggage?" involves nearly a dozen rules, including class, fare, location and bag weight. Both of these problems can cause hallucinations in gen AI systems, so the airline targeted a modest 60% accuracy for its first proof of concept (POC).
METHODOLOGY: EyeLevel.ai conducted a three-month test using its ChatGPT-like virtual assistant. PDF documents about KLM and Air France services were ingested into EyeLevel.ai's proprietary retrieval-augmented generation (RAG) platform to create the assistant's knowledge base.
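For readers unfamiliar with RAG, the general idea can be sketched in a few lines of Python. This is a toy illustration only, not EyeLevel.ai's proprietary platform: the function names, the word-overlap scoring, and the sample passages are all assumptions made up for this example, and a production system would typically use far more capable parsing and retrieval.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(text, size=40):
    """Split a document into passages of at most `size` words (hypothetical size)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def overlap_score(query_tokens, passage):
    """Count how often the query's distinct words occur in the passage."""
    counts = Counter(tokenize(passage))
    return sum(counts[t] for t in set(query_tokens))


def retrieve(query, passages, k=2):
    """Return the k passages with the highest word overlap with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(passages, key=lambda p: overlap_score(query_tokens, p), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble the text a language model would receive: retrieved context plus the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"


# Made-up placeholder passages, not real airline policy text.
knowledge_base = [
    "Checked bags over the weight limit incur a fee.",
    "Lounge access is available to premium cabin passengers.",
]
question = "What is the fee for a heavy checked bag?"
top_passages = retrieve(question, knowledge_base, k=1)
prompt = build_prompt(question, top_passages)
```

The point of the sketch is the order of operations: documents are split into passages ahead of time, the most relevant passages are looked up when a question arrives, and only those passages are handed to the language model alongside the question.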
EVALUATION: Air France/KLM experts evaluated EyeLevel.ai's performance. They created questions, reviewed responses, and scored accuracy based on how closely the bot's answers matched the source documents.
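The case study does not say how individual scores were combined into an overall accuracy figure. One simple possibility, sketched below, is to average expert-assigned scores and express the result as a percentage. The `GradedAnswer` type, the 0.0 to 1.0 scale, and the sample scores are assumptions for illustration, not data from the test.

```python
from dataclasses import dataclass


@dataclass
class GradedAnswer:
    question: str
    score: float  # assumed scale: 0.0 (contradicts the source) to 1.0 (fully matches it)


def accuracy_percent(graded):
    """Average the expert scores and express the result as a percentage."""
    if not graded:
        raise ValueError("no graded answers to average")
    return 100.0 * sum(g.score for g in graded) / len(graded)


def meets_target(graded, target_percent=60.0):
    """Check the averaged accuracy against a target such as the 60% POC goal."""
    return accuracy_percent(graded) >= target_percent


# Made-up scores for four placeholder questions.
sample = [
    GradedAnswer("question 1", 1.0),
    GradedAnswer("question 2", 1.0),
    GradedAnswer("question 3", 0.5),
    GradedAnswer("question 4", 1.0),
]
sample_accuracy = accuracy_percent(sample)
sample_passes = meets_target(sample)
```

With partial credit allowed, an answer that is mostly right still contributes to the total, which is one way a "how closely" judgment could turn into a single percentage.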
OPTIMIZATION PROCESS: Guided by feedback and recommendations from Air France/KLM test participants, the bot's responses were customized to match the airline's style, formatting, and tone. This optimization raised the response scores further.
CONCLUSION: EyeLevel.ai hit an astonishing 96.2% accuracy rate, far exceeding the airline's 60% target. The customer service team concluded that EyeLevel.ai dramatically outperformed both its expectations and the other solutions it evaluated.