Performance Debugging with Choreo's AI-Assisted Capabilities

  • By Nadheesh Jihan
  • 22 Nov, 2022

Why is Performance Debugging Important?

Public APIs that expose organizational capabilities number in the tens of thousands, and more appear daily. Most current applications combine these APIs to create new user experiences. This approach, integration-based development, is the process of combining multiple services and APIs into new applications that provide unique user experiences.

Understanding API performance is key to meeting the service level agreements (SLAs) promised to the consumers of those services. Analyzing the performance of these APIs therefore provides valuable insight into their scalability and shows where to optimize.

Integration-based development increases the risk of performance mistakes in the code (e.g., a service call inside a loop whose iteration count varies) compared to traditional programs that do not depend on external services. Because developers combine multiple services or APIs with unknown performance characteristics, these mistakes are usually missed during development. Therefore, performance testing is a crucial aspect of integration-based development.
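To make the loop example concrete, here is a minimal sketch of why a service call inside a variable-length loop is risky. The latency figures, the function names, and the existence of a batch endpoint are all assumptions made for illustration; they do not describe any particular service.

```python
def sequential_latency_ms(item_count: int, per_call_ms: float) -> float:
    """Total latency when the service is called once per item, one after another."""
    return item_count * per_call_ms


def batched_latency_ms(item_count: int, base_ms: float, per_item_ms: float) -> float:
    """Total latency when all items travel in one (hypothetical) batch request."""
    return base_ms + item_count * per_item_ms


# Assumed figures: 50 ms per remote call, 1 ms of extra work per batched item.
for n in (1, 10, 100):
    loop = sequential_latency_ms(n, per_call_ms=50.0)
    batch = batched_latency_ms(n, base_ms=50.0, per_item_ms=1.0)
    print(f"{n:>3} items: loop={loop:.0f} ms, batch={batch:.0f} ms")
```

With a single item the two approaches cost the same, so the mistake is invisible in a quick test. The gap only appears when the loop length grows with real data.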

A system's performance can significantly affect its cost and feasibility. Consider a scenario where an organization must support a load of 100 transactions per second (TPS). It has two options:

  1. An implementation with best performance practices that supports 100 TPS with a single server.
  2. A scaled implementation with poor performance that handles 20 TPS per node, meaning they require at least five servers to achieve the 100 TPS load.
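The node counts in the two options follow from a simple ceiling division. The sketch below only restates that arithmetic; the function name is illustrative.

```python
import math


def nodes_required(target_tps: float, tps_per_node: float) -> int:
    """Smallest number of identical nodes whose combined capacity meets the target."""
    return math.ceil(target_tps / tps_per_node)


print(nodes_required(100, 100))  # option 1: a single node
print(nodes_required(100, 20))   # option 2: five nodes
```

This assumes capacity scales linearly with node count, which is the optimistic case. Coordination overhead between nodes would push the real number higher.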

The following table explains the differences between the two implementations:

Characteristics | 100 TPS single node | 20 TPS x 5 nodes
--------------- | ------------------- | ----------------
Complexity | Simple. Requires a single server application. | Highly complex. Need to handle scalability and data consistency.
Cost | Cheap. No need to worry about additional complexities and needs fewer computational resources. | Expensive. Need a special mechanism and additional resources to handle the complexities and overheads.
Errors | Fewer errors. Only application-level errors. | More errors. Errors due to data inconsistencies, concurrency handling, and inter-node communications.
Feasibility | Highly feasible. | Scaling of certain services may not be feasible (e.g., scaling may not be possible with third-party services).

While scaling may be necessary at some point, scaling an application with poor performance practices can unnecessarily increase the complexity, cost, and errors while reducing the feasibility of achieving performance goals. The additional cost of scaling can be as high as 100x compared to the version of that application that follows best practices.

These costs come from additional resources and advanced technologies, more development hours, expert hires, and additional testing. Moreover, scaling the API alone may not reach the performance goals if the bottleneck is an external service that is beyond the developer's control.

Why is Traditional Performance Debugging Expensive?

Since integration platforms connect multiple web services while creating new applications, it’s difficult to anticipate the performance of such applications during the development stage until the APIs are tested and deployed. Moreover, it is challenging to understand the performance characteristics of APIs when changing the code or external service calls without rerunning performance experiments after each change. These experiments require additional time and resources and can cost more and extend deadlines. 

Even after running experiments, finding the root cause is mostly educated guesswork, so several more experiments are needed to verify the suspected root cause. An alternative is to avoid performance mistakes by hiring developers with experience in writing performant code. But such developers are difficult to find and expensive to hire.

Modern architectures are forcing developers to play dual roles as programmers and system engineers. Developers can’t just write code; they must also be responsible for the design, integration, and management of complex systems. New technologies such as graphical code development have reduced the coding effort while expanding the developer base and lowering costs.

Enter Choreo

Choreo is a full lifecycle cloud native developer platform that allows developers to rapidly develop and deploy cloud native enterprise applications. Its AI-assisted graphical code development platform expedites development workflows while bringing in best practices for deployment and operations. It also includes cloud integration tools to integrate your application with other apps, internal or external, as well as proven API integration tools and techniques to define and govern the APIs of your application.

Helping Everyone to be a 10X Effective Engineer

With the help of artificial intelligence (AI), integrated development environments (IDEs) can take on the burden of helping engineers write performant code. Choreo does this at two levels for services developed using the Choreo graphical code editor.

Figure 1: Performance feedback in Choreo's graphical editor

As shown in Figure 1, Choreo provides real-time feedback on the status of the “performance-critical path” in the service being developed. A request can flow through multiple execution paths. The performance-critical path is the one estimated to incur the highest response time for a single request. This does not require deploying the service, as Choreo uses historical metrics collected from deployed applications to estimate the new service’s performance.
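The definition of the performance-critical path can be sketched in a few lines. This is an illustration under simplifying assumptions, not Choreo's implementation: the API names, path names, and latency values are hypothetical, and a path's response time is approximated as the sum of its sequential API call latencies.

```python
from typing import Dict, List

# Hypothetical estimated latency (ms) for each external API call.
estimated_latency_ms: Dict[str, float] = {
    "auth": 20.0,
    "cache": 5.0,
    "inventory": 120.0,
    "pricing": 45.0,
}

# Hypothetical execution paths through one service, as ordered API calls.
execution_paths: Dict[str, List[str]] = {
    "cache_hit": ["auth", "cache"],
    "cache_miss": ["auth", "inventory", "pricing"],
}


def path_latency_ms(calls: List[str], latency: Dict[str, float]) -> float:
    """Approximate a path's response time as the sum of its call latencies."""
    return sum(latency[call] for call in calls)


def performance_critical_path(
    paths: Dict[str, List[str]], latency: Dict[str, float]
) -> str:
    """Return the name of the path with the highest estimated response time."""
    return max(paths, key=lambda name: path_latency_ms(paths[name], latency))


critical = performance_critical_path(execution_paths, estimated_latency_ms)
print(critical, path_latency_ms(execution_paths[critical], estimated_latency_ms))
```

Summing latencies only holds for calls made one after another. Calls issued in parallel would need a different combination rule, such as taking the slowest branch.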

The estimated performance metrics show how latency and throughput vary as the number of users accessing the service increases from a single user to a certain limit. This limit is determined by the confidence of the performance estimations. For example, if the estimations are assigned low confidence due to a lack of historical data, the limit automatically drops to a lower user count. The ranges of user count, latency, and throughput for the performance-critical path are displayed in the top banner, as shown in Figure 1. With these estimated metrics, Choreo developers can understand the performance (throughput, average latency, etc.) of the application after each change made to the performance-critical path.
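For intuition about how latency and throughput change with user count, here is a sketch based on a textbook approximation (Little's law with zero think time and a fixed capacity cap). It is not the model Choreo uses, and the base latency and capacity values are assumptions chosen for the example.

```python
def estimate_throughput_rps(
    users: int, base_latency_ms: float, max_throughput_rps: float
) -> float:
    """Throughput grows linearly with users until it hits the capacity cap."""
    unsaturated_rps = users * 1000.0 / base_latency_ms
    return min(unsaturated_rps, max_throughput_rps)


def estimate_latency_ms(
    users: int, base_latency_ms: float, max_throughput_rps: float
) -> float:
    """Little's law with zero think time: latency = users / throughput."""
    throughput = estimate_throughput_rps(users, base_latency_ms, max_throughput_rps)
    return users * 1000.0 / throughput


# Assumed service: 200 ms unloaded latency, saturating at 50 requests per second.
for users in (1, 10, 20, 40):
    rps = estimate_throughput_rps(users, 200.0, 50.0)
    ms = estimate_latency_ms(users, 200.0, 50.0)
    print(f"{users:>2} users: {rps:.0f} req/s, {ms:.0f} ms")
```

In this model, throughput rises while latency stays flat until the service saturates. Past that point throughput stays at the cap and latency grows in proportion to the number of users.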

Apart from the overall estimated performance, the estimated latency ranges for the individual API invocations appear next to those calls in the graphical editor, as shown in Figure 1. With these per-call estimates, Choreo developers can identify the bottleneck API calls that govern the processing rate of the whole service. Whichever API invocation has the highest latency becomes the bottleneck for a particular execution path. Moreover, this information is combined with the overall performance estimations to help developers identify any performance anti-patterns in the performance-critical path. Because developers receive constant feedback with each change they make to the source code of the service, they can mitigate performance bottlenecks and anti-patterns efficiently.
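Picking the bottleneck from per-call latency ranges can be sketched as follows. The call names and ranges are hypothetical, and comparing calls by the upper bound of their range is an assumption made here for simplicity; the article does not say how Choreo ranks ranges.

```python
from typing import Dict, Tuple

LatencyRange = Tuple[float, float]  # (lower bound, upper bound) in milliseconds

# Hypothetical latency ranges for the API calls on one execution path.
path_call_latency_ms: Dict[str, LatencyRange] = {
    "getCustomer": (15.0, 30.0),
    "getOrders": (80.0, 140.0),
    "sendEmail": (40.0, 60.0),
}


def find_bottleneck(call_ranges: Dict[str, LatencyRange]) -> str:
    """Return the call with the highest upper-bound latency (assumed ranking rule)."""
    return max(call_ranges, key=lambda call: call_ranges[call][1])


print(find_bottleneck(path_call_latency_ms))
```

Optimizing any call other than the one returned here would leave the path's slowest step unchanged.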

Diving Into Performance Feedback

Although performance feedback is given for the performance-critical path, that path is not highlighted in the graphical code editor by default. A service can have multiple execution paths, each of which contributes to the performance of the service. As shown in Figure 2, clicking the link on the top banner highlights the performance-critical path in the graphical code editor and opens a panel on the right side of the editor with controls to select other execution paths.

Figure 2: Exploring the performance of other execution paths via the performance analyzer panel

With the new panel, developers can investigate the performance of each execution path that has at least one API invocation. The latency and throughput plots below the path selector show how the selected path behaves as the number of users accessing the system changes. This information supports capacity planning by showing the volume of users the service can support while making external API calls. Finally, through performance feedback, Choreo trains developers in a programming style built on well-established Enterprise Integration Patterns (EIPs) that naturally yield good performance. This is known as AI-assisted development.

These types of analysis with the help of AI-powered performance feedback in Choreo can help both inexperienced and expert-level (e.g., enterprise integration architects) developers to identify and resolve performance mistakes efficiently during development. It can also reduce the cost of performance testing while enabling them to develop high-quality APIs.

How Does it Work?

AI is already used in many domains to model already-established behaviors and forecast new behaviors based on previously observed patterns. AI and machine learning (ML) have shown significant success in the domain of performance forecasting of applications.

Choreo has taken this knowledge into account and combined theoretical performance models to build an algorithm to estimate the performance of service-oriented architectures. 

The idea is simple: use historical data from different web service calls to build performance characteristic models for services and APIs that are already in use. Using these characteristic models, Choreo tries to satisfy a theoretical model that represents each execution path with service invocations from the considered application. 

The performance values that satisfy the theoretical model for the application are the possible performance numbers for a particular execution path. 

Therefore, with the above approach, Choreo only needs to build the theoretical model for each change developers make to the application, and within milliseconds, Choreo can produce an accurate performance forecast of that application without running any new performance experiments.
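The general shape of this approach, building per-service characteristics from historical data and then combining them for a new execution path, can be illustrated with a deliberately simple stand-in. The article does not give Choreo's actual models, so everything below is an assumption: the service names and samples are hypothetical, the "characteristic model" is just a mean latency, and the path forecast is a plain sum.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical historical latency samples (ms) collected from deployed services.
historical_latency_ms: Dict[str, List[float]] = {
    "auth": [18.0, 22.0, 20.0],
    "inventory": [110.0, 130.0, 120.0],
}


def build_characteristic_model(history: Dict[str, List[float]]) -> Dict[str, float]:
    """Summarize each service by its mean observed latency (a stand-in model)."""
    return {service: mean(samples) for service, samples in history.items()}


def forecast_path_latency_ms(path: List[str], model: Dict[str, float]) -> float:
    """Forecast a new path's latency by summing the modeled latency of each call."""
    return sum(model[service] for service in path)


model = build_characteristic_model(historical_latency_ms)
print(forecast_path_latency_ms(["auth", "inventory"], model))
```

The point of the sketch is the workflow. The expensive step, learning from history, happens once per service, while the forecast for an edited path is a cheap recomputation that needs no new performance experiment.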

Finally, What a Relief!

Writing performant code has become a challenge. Developers need to run multiple performance tests to tune the performance of an application. In integration-based applications, performance anti-patterns can reduce performance drastically. In such cases, scaling the application can be complex, expensive, and less feasible: scaling an application with performance mistakes can increase the cost by as much as 100x, while testing applications for performance mistakes is itself expensive and time-consuming.

With Choreo, developers can use feedback from the performance forecast tools to develop performant code efficiently while identifying performance anti-patterns. This saves both money and time while ensuring the quality of the code. Also, inexperienced developers can create better applications using AI-based performance feedback provided by the Choreo editor. This opens the door for inexperienced developers to use graphical code editors with confidence while minimizing the likelihood of mistakes.

Choreo offers a graphical code editor specially designed for cloud native development processes. So, any developer who wants to use this platform can get the benefits of the performance debugging tools. 

Sign up to Choreo for free and try out this feature today!
