POSTED : September 30, 2019
BY : Ram Sathia

With 9 out of 10 companies seeking to implement Robotic Process Automation (RPA) by 2020, automation has become table stakes in nearly every industry. Some companies see it as a direct path to reducing operating costs, while others view it as a way to enable consistent quality and a better customer experience. But implementing an RPA program isn’t as simple as throwing a few bots at a manual task. 

RPA is increasingly seen as a silver bullet for many of the complexities businesses face, such as the need for improved efficiency due to mergers and acquisitions, global trade pressures, gaps in the workforce, and the desire to scale effectively. Bots themselves must be operationalized in order to be effective – a step that many companies aren’t yet taking. When done correctly, operationalization, which is the deployment and monitoring of your bots, can help businesses reduce time to market, increase product quality and better meet evolving customer needs.  

Many RPA vendors have tools integrated into their product lifecycle that allow for basic bot monitoring. Visibility into key indicators, like bot process execution results and memory and disk utilization metrics, is critical to ensuring day-to-day operations. But as bots and RPA technologies continue to multiply, a basic bot operationalization routine has significant blind spots. Think about the unintended situations that could occur when using bots within bots (i.e., microbots) as microservices and when bots are interacting with other bots. You need to lock down automation and prevent complete bot chaos by developing the tools for event logging, visualization, cross-bot telemetry and data ingestion.

To help clients effectively do so and achieve high performance in a wide range of automation use cases, our team has introduced a next generation RPA monitoring methodology called BotRE™ (Bot Reliability Engineering). Since businesses rely on bots to guarantee both a high-quality customer and employee experience, bots must flawlessly execute thousands of processes. Those automated actions could include previously manual activities such as the closure of a customer’s account or the transfer of an employee’s information to another department. When there’s a bot failure, the ability to track and manage the fix can be the difference between an upset customer or employee and smooth operations. 

Event logging: The logging of events, particularly failures, is critical to bot monitoring because without that baseline of information, you’ll be operating in the dark. Understanding historical events as they relate to your bots is vital to monitoring, tracing, and alerting on bot failures, as well as to anomaly and failure detection. Knowing where a bot is failing, whether in initiation, during execution, or at the end of execution, allows for surgical corrections. Not only can you intervene when bots fail, but with enough historical data you can also set up proactive alerts that let you know when a bot failure is imminent. To achieve this, you’ll need data ingestion, a logging strategy and structure, and a user interface to visualize the logging, tracing and reporting of failures.
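To make the idea of a logging structure concrete, here is a minimal sketch in Python. It is not how BotRE or any RPA vendor actually logs events; the `BotEvent` fields, the phase names and the helper functions are all hypothetical, chosen only to mirror the three failure points named above (initiation, execution, end of execution).

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

# Hypothetical phase names mirroring the three failure points in the text.
PHASES = ("initiation", "execution", "end_of_execution")


@dataclass
class BotEvent:
    """One structured log record for a single bot run (illustrative schema)."""
    bot_id: str
    phase: str
    status: str      # "ok" or "failed"
    timestamp: str   # ISO 8601, UTC
    message: str = ""


def make_event(bot_id, phase, status, message="", when=None):
    """Build an event, rejecting phases outside the agreed vocabulary."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    when = when or datetime.now(timezone.utc)
    return BotEvent(bot_id, phase, status, when.isoformat(), message)


def to_log_line(event):
    """Serialize an event as one JSON line, ready for ingestion downstream."""
    return json.dumps(asdict(event), sort_keys=True)


def failures_by_phase(events):
    """Count failed events per phase to show where bots are breaking."""
    return Counter(e.phase for e in events if e.status == "failed")


events = [
    make_event("invoice-bot", "initiation", "ok"),
    make_event("invoice-bot", "execution", "failed", "target app timed out"),
    make_event("hr-bot", "execution", "failed", "missing input file"),
]
for event in events:
    print(to_log_line(event))
print(failures_by_phase(events))
```

Keeping every record in one agreed shape is what makes the later steps (dashboards, correlation across bots, alerting) possible, since each consumer can rely on the same fields being present.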

At PK, we applied deep craftsmanship to create a simplified view of what is a complex and multi-dimensional visualization of bot monitoring. As a result, we developed a Unified Operations Management Dashboard (UOMD) for easy operationalization of bots.

Visualization: Presenting bot operations and the flow of transactions in an easy-to-interpret dashboard helps executives in business, technology, and operations gain more visibility, thereby improving decision-making. Bot observability empowers executives with context and an aerial view of their operations. To perform at their best, bots require human input to optimize their workflows. A dashboard gives those executives keener insights into their digital workforce.

With UOMD, we provide multiple views of overall bot performance, including a business view, a department-specific view, an infrastructure view, and a bot design view, with a color-coded legend to quickly appraise problems and prescribe solutions. Without UOMD, tracing a bot failure is both time-consuming and costly because it interrupts business operations and can degrade the customer experience.

Cross-bot Telemetry: As bot interactions have grown in complexity, with bots now involved with their own microbots and connecting with other bots, the opportunities for failure points have also proliferated. We use cross-bot telemetry correlation, which is just a fancy term for measuring how bots are interacting, to provide detailed views into where bots are failing. We then can distill that information for bot operators and moderators who are in charge of getting those bots to function smoothly. 

Our BotRE methodology helps manage cross-bot dependencies and control the cascading nature of failures by correlating relevant bot telemetry based on time- and rule-based events as well as errors.
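As an illustration of time-based correlation across dependent bots, the following Python sketch walks from a failed bot down through the bots it calls and returns the deepest one that also failed shortly beforehand. It is a simplified assumption of how such correlation could work, not the BotRE implementation; the function name, the `calls` and `failures` structures, the bot names and the five-minute window are all hypothetical.

```python
from datetime import datetime, timedelta


def find_root_failure(failed_bot, failures, calls, window=timedelta(minutes=5)):
    """Trace a failure to the deepest callee that failed just before it.

    failures: dict of bot id -> datetime of its most recent failure
    calls:    dict of bot id -> list of bot ids it invokes
    Raises KeyError if failed_bot has no recorded failure.
    """
    current = failed_bot
    visited = {current}  # guards against cycles in the call graph
    while True:
        failed_at = failures[current]
        # Callees that failed at or before the caller, within the window.
        candidates = [
            callee
            for callee in calls.get(current, [])
            if callee in failures
            and callee not in visited
            and timedelta(0) <= failed_at - failures[callee] <= window
        ]
        if not candidates:
            return current
        # Follow the earliest failure, as the most likely trigger.
        current = min(candidates, key=lambda callee: failures[callee])
        visited.add(current)


# Hypothetical call graph: a bot, the bots it calls, and one microbot.
calls = {
    "onboarding-bot": ["hr-lookup-bot", "badge-bot"],
    "hr-lookup-bot": ["directory-microbot"],
}
failures = {
    "onboarding-bot": datetime(2019, 9, 30, 10, 0, 30),
    "hr-lookup-bot": datetime(2019, 9, 30, 10, 0, 20),
    "directory-microbot": datetime(2019, 9, 30, 10, 0, 10),
    "badge-bot": datetime(2019, 9, 30, 8, 0, 0),  # too old to be related
}
print(find_root_failure("onboarding-bot", failures, calls))
```

In this example the operator is pointed at the microbot at the bottom of the chain rather than at the bot where the failure first became visible. A rule-based variant would add conditions beyond the time window, such as matching error codes.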

Data Ingestion: But to really understand why your bots are failing, you need greater insight into exactly when they’re failing. Identifying the pattern and the specific steps leading up to the failures, as well as other temporal correlations, can help in troubleshooting. BotRE helps ingest data from the bot infrastructure, RPA tool infrastructure, application logs, and infrastructure logs. With this data, you can intelligently detect bot anomalies, then alert and notify your teams before the failures cascade.
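One simple way to detect anomalies in ingested data is to compare each new measurement against a rolling baseline. The Python sketch below flags bot run durations that deviate sharply from the preceding runs. The choice of run duration as the metric, the z-score technique, and the window and threshold values are assumptions made for illustration; the post does not specify which detection method BotRE uses.

```python
from statistics import mean, stdev


def detect_anomalies(durations, window=5, threshold=3.0):
    """Return indexes of runs whose duration is far from the recent baseline.

    Each value is compared with the mean and sample standard deviation of
    the `window` values before it; a z-score above `threshold` is flagged.
    """
    anomalies = []
    for i in range(window, len(durations)):
        history = durations[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            # Perfectly flat history gives no scale to judge against; skip.
            continue
        if abs(durations[i] - baseline) / spread > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical run durations in seconds for one bot, oldest first.
durations = [10, 11, 9, 10, 10, 11, 30, 10]
print(detect_anomalies(durations))  # [6]
```

Here only the 30-second run is flagged. In practice a flagged index would feed an alert to the operations team, and the same check could be applied to other ingested signals such as memory use or error counts.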

Bot failure is inevitable. But damage to customer and employee relationships from bot failure doesn’t have to be. Being able to track and monitor bot performance through event logging and a dashboard visualizing bot performance enterprise-wide is only the first step to ensuring you’re able to get in front of a bot issue before it cascades into a real-world problem. To reduce the risk associated with the rise of inter- and intra-bot relationships where there’s no human stopgap, you’ll need to take your interventions even further and be able to predict bot breakdowns before they happen. Having a unified system like UOMD and a methodology like BotRE that can ingest the data, help make the right inferences, and recommend the necessary interventions will elevate automation to its intended purpose and make your life a little easier.

Learn more about how PK helps companies overcome bot failures.