Chatbot Testing: Getting It Right The First Time – Part 1

Chatbots are conversational agents made available through a website or another interface to assist customers. Because they use the most natural form of human interaction, a dialogue, chatbots are increasingly taking over customer interactions traditionally handled by customer service representatives. With the adoption of Machine Learning and Deep Learning, Chatbots now come with a wide range of features such as text-to-speech conversion, speech-to-text conversion and Natural Language Processing.

Chatbots are not a recent development; they have existed for quite a while. However, with the advent of the internet and rapid advances in computer processing power, programming capabilities, versatility and ease of implementation, businesses are increasingly embracing them. It has also become feasible and beneficial to integrate Chatbots with various applications. Such an integration, however, comes with its own set of challenges.

Chatbot testing is not limited to checking that the Chatbot delivers correct responses; it also means checking that its responses are intelligent. Here is a guide to Chatbot testing and the key factors that should be considered during the testing process.

What is a Chatbot?

A Chatbot, also commonly known as an agent, is an application, web-based or accessible over other interfaces, whose primary purpose is to answer a user’s queries about a certain domain. Before Chatbots, a prospective buyer or customer had to visit the physical store or write an email to the company to get queries about a product or service answered. That meant spending time travelling to the store, or drafting an email and then waiting for a response. With a Chatbot, the user simply submits the query and receives a response almost instantaneously.

This also reduces the time a company needs to get to market and to identify potential buyers or users of its services.

This does not mean that Chatbots have replaced, or can completely replace, human beings: in some cases, after the user submits a query, back-end customer service representatives are still required to act on and process it. Even so, given their efficiency and effectiveness in answering queries, integrating Chatbots is a capable and productive alternative.

Advantages and disadvantages of Chatbots

Chatbots are not humans, and that is precisely both their advantage and their disadvantage.

Since most commonly known Chatbots are not programmed to respond like humans, they have no feelings or emotions. As a result, their responses are straightforward and non-judgmental. There are, however, Chatbots that can sense and understand the user’s feelings or emotions and respond accordingly. One example is a Chatbot that answers the queries of psychiatric patients by assessing their emotional state and helps them cope with their condition.

Human agents may let their emotions or a bad day affect their work, and that may get in the way of what they can do for a customer. Chatbots do not have that problem.

Most existing Chatbots treat every user in the same way, and their responses do not vary with the situation. This can be a disadvantage, because such Chatbots cannot become more friendly or courteous to repeat customers or frequent users, or alter their responses to suit the situation. It does not always hold true, however: a Chatbot can be programmed to take the user’s context into account, and can therefore be made intelligent enough to be more friendly or courteous when the situation demands it.

One of the reasons for the widespread use of Chatbots is that they have proven to be money savers. Existing Chatbots account for business cost savings of $20 million globally, and research shows that Chatbots are expected to trim business costs by more than $8 billion per year by 2022. Chatbots are most efficient and effective when they augment human agents rather than replace them: the Chatbot answers customers’ frequently asked queries, and the more complex queries are left to humans.

Hiring an employee generally means bearing the costs of hiring, onboarding, training, hourly pay and benefits per agent. By contrast, even after the initial installation and programming costs are taken into account, a good-quality Chatbot often costs less than the annual salary of a single employee.

Moreover, Chatbots are available 24/7, 365 days a year. By using Chatbots to handle after-hours queries, companies can give customers the support they need, when they need it, without hiring additional agents to cover odd working hours. Also, unlike human agents, Chatbots never take sick leave or paid holidays and remain available even during natural disasters or calamities.

Chatbot Testing

So, how would one go about Chatbot testing? What are the key areas and functionalities one should address while testing a Chatbot?

Intent Testing:

Chatbots are based on intents. An intent is what the user means by a query; the Chatbot has to recognize it and respond accordingly.

For example: If the user asks for a company’s address, the Chatbot must ask the user which location the address is wanted for. A developer has to program the Chatbot to map this query to the intent of asking for the location. If the intent isn’t implemented, the Chatbot would give an incorrect answer containing the address of only one location found in its database (say Pune, India), fall back to a pre-programmed answer, or not know what to answer at all.

Hence, intent testing is one of the important factors in Chatbot testing. It is advisable to maintain a comprehensive list of pre-constructed queries along with their expected responses. These should be run as regression tests on each new build to ensure that the intents matched, and therefore the responses, do not change.
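A regression suite of this kind could be sketched as follows. This is only a minimal illustration under stated assumptions: `classify_intent` is a hypothetical keyword-based stand-in for the call into the Chatbot under test, and the intent names and test cases are made up for the example rather than taken from any real bot or tool.

```python
# Hypothetical stand-in for the Chatbot under test. In a real suite this
# function would send the query to the bot and return the intent it matched.
def classify_intent(query: str) -> str:
    text = query.lower()
    if "address" in text or "located" in text:
        # A known city means the bot can answer directly; otherwise it must
        # first ask the user which location is meant.
        return "get_address" if "pune" in text else "ask_location"
    return "fallback"


# Pre-constructed queries paired with the intent each one is expected to hit.
# The intent names are illustrative assumptions.
REGRESSION_CASES = [
    ("What is the company's address?", "ask_location"),
    ("What is Company's Pune address?", "get_address"),
    ("May I know the Pune address of the company?", "get_address"),
    ("Where is the company located at Pune?", "get_address"),
    ("asdf qwerty", "fallback"),
]


def run_intent_regression(cases=REGRESSION_CASES, classify=classify_intent):
    """Return (query, expected, actual) for every case whose intent changed."""
    failures = []
    for query, expected in cases:
        actual = classify(query)
        if actual != expected:
            failures.append((query, expected, actual))
    return failures


if __name__ == "__main__":
    failed = run_intent_regression()
    for query, expected, actual in failed:
        print(f"FAIL: {query!r}: expected {expected}, got {actual}")
    passed = len(REGRESSION_CASES) - len(failed)
    print(f"{passed} of {len(REGRESSION_CASES)} cases passed")
```

Keeping the cases in a plain list makes it easy to add a new query whenever a defect is found, so that the same misunderstanding is caught on every later build.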

Other than the above, the following are the key areas and primary features that should be tested on a Chatbot.

  1. Chatbot’s way of introduction: Chatbot testing should start with the Bot’s introduction to the users. The Bot should introduce itself in a user-friendly manner, so that users feel as if they are talking to a friend who genuinely intends to address their queries. This is necessary for a good user experience with the Bot. The introduction should suit the geographical region, demographics and age group of the intended users, and the Bot needs to let the user know how it can help.
  2. Chatbot’s way of recognizing and responding to salutations: Chatbot testing must include the interpretation of salutations. The Bot should be programmed to recognize salutations like “Hi”, “Hello”, “Hey There”, “Thank You”, “Good Morning”, “Good Bye” etc. For different geographical regions, the Bot should also understand salutations in the respective languages. The Bot should also be able to understand and respond to slang like “Hiya there”, “What’s up?”, “How’s it going?” etc.
  3. Conversational flow: This is a very important factor in Chatbot testing.
    For example: A user asks the Chatbot for the phone number of the company’s Admin department. In this situation, the Bot should be intelligent enough to ask which office location’s Admin contact the user wants and then provide the relevant information.
    Thus, the tester should work out the conversational flow in situations where a question can have many options, which in turn can have sub-options, and then inform the developer and make appropriate suggestions.
    Google Dialogflow Chatbot Testing: It is often found that Chatbots fail to understand and follow the sequence in the conversational flow.
    For example: In the conversational flow, the user first has to enter a zip code and then an amount. If the zip code entered does not comply with the expected format, the Bot interprets the value as the amount and fills the amount field with it. Consequently, the Chatbot never asks the user for the amount because it thinks that it already has it. This should not be mistaken for a fault of the tool used; issues of this kind are known to occur because of inadequate training or inadequately defined intents. They do have suitable fixes that developers can implement, and it is the tester’s duty to test these scenarios adequately, report the issues to the developers and convey that they are indeed fixable.
    Figure: A Chatbot designed using Google Dialogflow.
    In addition, Chatbots often fail to understand the intent behind a certain query, which is known to happen because of bad programming or incorrect handling. The tester has to test these scenarios thoroughly and report such defects to the developer, noting that a fix does exist. Ideally, when the Chatbot cannot understand a query, it should simply reply with something like: “I am sorry; I could not understand your question. Please try again with a valid input.”
    So, the onus is on the tester to ensure that the Bot strictly understands the conversational flow, follows the sequence faithfully and does not misinterpret the entered information.
  4. Adding a human touch to the response (user-friendliness): During Chatbot testing, a tester should ensure that the Bot’s responses bring friendliness and warmth to the conversation. Each response should be constructed so that the user feels at home and at ease while interacting with the Bot.
    For example: If the user wants to know the address of the company’s Pune location, the Chatbot may respond: “Sure, let me help you, here is the address of the company’s Pune location”. Words like “Sure, let me help you” add a feeling of friendliness to the conversation.
  5. Ability to understand Natural Language: The Chatbot should not be confined to understanding queries in one grammatical format. Since each user may pose the same query in a different grammatical construct and with different English words, the Bot must be trained to understand the intent in every grammatical variation of the same query.
    For example: The Chatbot should understand that the queries “What is Company’s Pune address?”, “May I know the Pune address of the company?” and “Where is the company located at Pune?” all express the same intent, and each of them should be answered with the same response.
    Natural language understanding is an important factor because it gives the Chatbot the flexibility to understand different variations of the same query, both in the words used and in the grammatical construct.
  6. Accuracy of Chatbot Response: The accuracy of Chatbot responses is the most important factor in Chatbot testing. A wrong response defeats the Chatbot’s purpose.
    For example: If the user wants to know the maximum expense limit per individual during a team outing, the Chatbot may respond with a hyperlink to a PDF document that contains the team outing policy. However, a more accurate response would state the calculated amount directly, saving the user the time and effort of opening the document and finding the required information. Testers must ensure that the Chatbot doesn’t digress in its response.
  7. Minimum Latency in response retrieval: Latency is the time the Chatbot takes to retrieve a response after the user has submitted a query. It is widely observed that when a user interacts with a website, even a delay of 4 to 5 seconds while opening the site or following a link causes considerable frustration, and in almost all cases the user abandons the action and looks for an alternative or a competitor’s site. Similarly, for Chatbots, depending on the response’s content, a latency of up to 2 seconds is acceptable, although about 1 second is better and recommended. During Chatbot testing, the tester should pose different queries that produce large as well as small responses and ensure that latency stays minimal.
  8. Hyperlinks in the response direct to the correct site: The tester has to ensure that all hyperlinks or suggestion chips open the relevant and intended web page. A common scenario is that the link to an online PDF policy document has been updated, but the Bot’s response still shows the old link. So, the tester must ensure that all links returned for a specific query are up to date and clickable, leading the user to the intended web page.
  9. Avoid cyclic loop: Because of poor implementation, the Chatbot may go into a cyclic loop; i.e., if it does not get an appropriate answer to a certain question, it keeps repeating the same question to the user even when the conversational flow has been purposefully broken.
    For example: If the Bot has to accept an amount after a valid zip code is entered, it should not erroneously ask for the zip code again. If the user was expected to enter a US zip code and entered a valid five-digit US zip code, the Bot should proceed to the next input in the conversational flow and not repeat the same question.
    However, if the value entered is invalid, the Bot can ask the same question again in a different manner. Finally, if the user enters an invalid zip code several times in succession and reaches the acceptable maximum number of failed attempts, the conversation should be aborted and the user should exit the application gracefully with a courteous message.

These are the common scenarios in Chatbot testing that a tester must consider. Read the upcoming blog, Chatbot Testing: Getting It Right The First Time – Part 2, for further test scenarios.
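The zip code and amount scenario from the conversational flow and cyclic loop items can be sketched in a few lines. This is a simplified, hypothetical flow written for illustration, not the API of Google Dialogflow or any other tool; the prompts, the validation patterns and the `MAX_ATTEMPTS` limit are all assumptions. It shows the three behaviours a tester would check: an invalid zip code is never carried over into the amount field, a re-asked question is worded differently, and the conversation ends courteously after repeated failures.

```python
import re

MAX_ATTEMPTS = 3  # assumed limit; the acceptable number is a product decision

# Each step: (slot name, first prompt, re-worded prompt, validation pattern).
# Prompts and patterns are illustrative assumptions.
STEPS = [
    ("zip_code", "Please enter your zip code.",
     "That does not look like a five-digit US zip code. Could you re-enter it?",
     re.compile(r"\d{5}")),
    ("amount", "Please enter the amount.",
     "Please enter the amount as a number, for example 250 or 99.50.",
     re.compile(r"\d+(\.\d{1,2})?")),
]


def run_flow(user_inputs):
    """Drive the zip-code-then-amount flow over a scripted list of inputs.

    Returns (slots, transcript): the filled slots, or None if the flow was
    aborted, plus every message the bot sent.
    """
    inputs = iter(user_inputs)
    slots, transcript = {}, []
    for name, prompt, reprompt, pattern in STEPS:
        for attempt in range(MAX_ATTEMPTS):
            # Ask the original question once, then switch to the re-worded one.
            transcript.append(prompt if attempt == 0 else reprompt)
            answer = next(inputs, "").strip()
            if pattern.fullmatch(answer):
                slots[name] = answer  # valid: store it and move to the next step
                break
            # Invalid: the value is discarded, never carried into another slot.
        else:
            # Too many failed attempts: exit gracefully instead of looping.
            transcript.append("Sorry, I could not process your request. Goodbye!")
            return None, transcript
    return slots, transcript
```

For instance, `run_flow(["1234", "12345", "250"])` re-asks for the zip code once and then fills both slots, while three invalid zip codes in a row end the conversation with the closing message.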