GPT-2 and Customer Service - Promising Results
GPT, the AI text generator, has captured the imagination of many; supposedly it can even write novels. Cool as that sounds, it is designed for, and performs much better on, shorter pieces of text. So, could we use GPT for customer service? The first results are promising.
At Deepdesk, we use AI to help agents in contact centers. To put it simply: we make them type less. We offer real-time response recommendations and automation of repetitive conversations. We do this by analyzing millions of real dialogs and training neural networks to come up with the best possible answer. We have reduced typed text by up to 20% on average, and by up to 50% for our top agents.
However, we believe we can do better still. Could GPT boost our performance? GPT is a language model trained with a simple objective: predict the next word, given all of the previous words in a text. Previous experiments with GPT by others have shown great results.
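To make that objective concrete: GPT learns it with a huge transformer network over long contexts, but the same idea can be illustrated at toy scale with a bigram model that predicts the next word from counts alone. This is a minimal sketch, not how GPT works internally; the corpus and function names are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """For each word, count which word follows it and how often."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def predict_next(model, word):
    """Return the most frequent continuation seen in training, if any."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]

# A tiny, hypothetical "corpus" of agent utterances:
corpus = [
    "how can i help you",
    "how can i assist you",
    "can i help you today",
]
model = train_bigram_model(corpus)
print(predict_next(model, "help"))  # "you": the only word ever seen after "help"
```

GPT's advantage over a counting model like this is that it conditions on the entire preceding conversation rather than a single word, which is exactly what makes its suggestions feel context-aware.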
Big, slow, and expensive
But hold your horses (or unicorns). GPT is big, resource-intensive, and slow, while our recommendations have to show up instantly for hundreds of agents simultaneously. And yes, we know, GPT-3 is already out and is the New Hot Thing✨, but it seemed a better fit to train one of the smaller GPT-2 models. (A 'model', by the way, is the AI term for a language representation that machines can understand.)
Training a Dutch model
Our customers communicate mainly in Dutch. We initially customized a pre-trained English base model, but we were not fully happy with the results, so we went off the beaten track: training a Dutch model from scratch had not been done yet, but should deliver better results. We trained the base model on generic Dutch text and then fine-tuned it on real chat conversations from one of our customers.
We put the model into production last December. To save time and money, we only use GPT-2 when our own recommendation engine no longer has suggestions. The chart below shows the use of GPT-2 after its launch in July (English model). The Dutch base model gave much better results.
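The fallback strategy described above can be sketched in a few lines: try the cheap in-house engine first, and only call the expensive model when it comes up empty. The function and the two stand-in suggesters below are hypothetical, for illustration only.

```python
def suggest_reply(conversation, engine_suggest, gpt2_suggest):
    """Prefer the in-house recommendation engine; fall back to GPT-2
    only when the engine has nothing, so the slow model runs rarely."""
    suggestions = engine_suggest(conversation)
    if suggestions:
        return suggestions
    return gpt2_suggest(conversation)

# Hypothetical stand-ins for the two systems:
def engine_suggest(conversation):
    """A toy template matcher on the customer's last message."""
    templates = {"track my order": ["Sure, what is your order number?"]}
    return templates.get(conversation[-1].lower(), [])

def gpt2_suggest(conversation):
    """Placeholder for a call to the (slow, expensive) GPT-2 service."""
    return ["(GPT-2 generated reply)"]

print(suggest_reply(["Track my order"], engine_suggest, gpt2_suggest))
print(suggest_reply(["Something unusual"], engine_suggest, gpt2_suggest))
```

The design choice here is cost control: the generative model only handles the long tail of conversations the template engine cannot cover.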
The curse of GPT-2
There is a big caveat with GPT-2 for customer service: the suggestions it gives are hard to control. GPT-2 may suggest foul language to the agent, and it did precisely that one time: cursing in perfect Dutch. You could argue the curse was semantically correct, and even fitting given the conversation up to that point, but it was still an unwanted response. ;-) We are happy to work with human agents. You can only imagine what would have happened with an unmoderated chatbot… We are looking into blocklisting certain words to prevent this from happening in the future.
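A simple version of the blocklist idea is to drop any generated suggestion containing a listed word before it ever reaches the agent. This is a minimal sketch with a made-up blocklist; a production filter would also need to handle spelling variants and obfuscations.

```python
import re

# Hypothetical blocklist; in production this would hold real curse words.
BLOCKLIST = {"verdorie", "damn"}

def is_clean(suggestion, blocklist=BLOCKLIST):
    """Reject a suggestion if any of its words is on the blocklist."""
    words = re.findall(r"\w+", suggestion.lower())
    return not any(word in blocklist for word in words)

def filter_suggestions(suggestions, blocklist=BLOCKLIST):
    """Keep only suggestions that pass the blocklist check."""
    return [s for s in suggestions if is_clean(s, blocklist)]

print(filter_suggestions(["Damn, that is annoying!", "How can I help you?"]))
# Only the polite suggestion survives the filter.
```

Filtering after generation is cheap and keeps the model untouched, at the cost of occasionally discarding an otherwise good suggestion.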
Generic model per industry
We are impressed by what GPT-2 can do. We think it should be possible to develop a generically trained model per industry, which would make training on customer data potentially unnecessary. Let's see.
Many thanks to our CTO Lukas Batteau for his article Serving GPT-2 in production on Google Cloud Platform, and to our Chief Data Scientist Geert Jonker for Training a Dutch GPT-2 base model.