Queuing Theory for evaluating system performance in Event Driven Architecture – Part 1

Event Driven Architecture (EDA) is gaining popularity because it lets an enterprise connect multiple disparate systems easily. Instead of creating direct, point-to-point communication links between the systems that need to talk to each other, the communication can be decoupled by having the systems exchange events in a programming-language-neutral format (XML, JSON or delimited text).

Service Oriented Architecture (SOA) has been widely adopted by enterprises: loosely coupled services interact with each other over an agreed protocol (SOAP, REST) to offer different business capabilities. EDA is not a replacement for SOA; rather, it complements the existing SOA infrastructure by providing a means by which a system can interact asynchronously, without knowledge of the other systems in the enterprise.

Building Blocks of Event Driven Architecture

Events

An event can be defined as a user-driven or system-driven action that the system captures to signify an important interaction with a customer, an opportunity, an error condition, or a deviation from a service level or threshold. Depending on its outcome, the event may trigger other services or business processes that offer value-added services to the customer, enhance the customer experience, cross-sell, or prevent a bad customer experience caused by system errors. It is good practice to standardize the event payload structure by following a single event specification throughout the organization. Every event should have an event header and an event body. The event header provides the context for the event, i.e. the type of event, the event generation time, the source of the event, and message properties that enable filtering of events. The event body contains the event data, which should conform to the schema definition.
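As a rough illustration of the header/body split described above, here is a minimal Python sketch. The class names, field names and the `matches` helper are my own assumptions for illustration; they are not part of any standard, and a real event specification would also pin down the body schema.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class EventHeader:
    event_type: str        # the type of event
    source: str            # the system that generated the event
    generated_at: str      # event generation time, ISO 8601 in UTC
    properties: dict = field(default_factory=dict)  # used for filtering


@dataclass
class Event:
    header: EventHeader
    body: dict             # event data; schema validation is not shown here


def make_event(event_type, source, body, **properties):
    """Wrap event data in the (hypothetical) standard envelope."""
    header = EventHeader(
        event_type=event_type,
        source=source,
        generated_at=datetime.now(timezone.utc).isoformat(),
        properties=properties,
    )
    return Event(header=header, body=body)


def to_json(event):
    """Serialize the event to a language-neutral format."""
    return json.dumps(asdict(event))


def matches(event, **wanted):
    """Filter on header properties only, without inspecting the body."""
    return all(event.header.properties.get(k) == v for k, v in wanted.items())


event = make_event("ACCOUNT_CLOSED", "core-banking", {"account_id": "A1"}, region="US")
print(to_json(event))
```

Keeping the filterable properties in the header means a subscriber or broker can decide whether an event is relevant without parsing the body.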

Event Processors and Publishers

Generally, the events generated by a system are in a raw format that might not be of much use to the other systems in the enterprise. Raw events might need to be enriched with additional customer or product information before other services or systems can consume them. A raw event also needs to be converted to the enterprise-standard event definition for consistency and easy consumption by multiple subscribers.

Simple event processors can be developed in house by the application teams supporting the system that generates the events; however, there are scenarios where Complex Event Processors (CEP) might be needed. CEPs are specialized event processing products offered by various vendors (TIBCO, Oracle, SAP) that help in aggregating and correlating multiple events from different systems to derive new events, which could otherwise be challenging with custom-built simple event processors. As an example, a customer might prematurely close a certificate of deposit (CD) account held with a bank by withdrawing the money by official check. Since this is an early withdrawal from a CD account, the bank might charge the customer an early withdrawal fee. Here the customer's single action of closing the CD account generates a raw account closure event, which the event processors can then turn into multiple derived events: account closure, official check issued, and/or early withdrawal fee.
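The CD account example can be sketched as a simple, custom-built event processor. This is only an illustration: the event type names and the raw event fields (`payout_method`, `closed_on`, `maturity_date`, `balance`) are assumptions of mine, and a real processor would also enrich the events and apply the bank's actual fee rules.

```python
from datetime import date


def derive_events(raw):
    """Turn one raw account-closure event into a list of derived events."""
    if raw.get("type") != "RAW_ACCOUNT_CLOSURE":
        return []

    account_id = raw["account_id"]
    derived = [{"type": "ACCOUNT_CLOSED", "account_id": account_id}]

    # The money left the bank by official check.
    if raw.get("payout_method") == "OFFICIAL_CHECK":
        derived.append({
            "type": "OFFICIAL_CHECK_ISSUED",
            "account_id": account_id,
            "amount": raw["balance"],
        })

    # Closing before the maturity date counts as an early withdrawal.
    if raw["closed_on"] < raw["maturity_date"]:
        derived.append({
            "type": "EARLY_WITHDRAWAL_FEE_CHARGED",
            "account_id": account_id,
        })

    return derived


raw_event = {
    "type": "RAW_ACCOUNT_CLOSURE",
    "account_id": "CD-1001",
    "balance": 5000.0,
    "payout_method": "OFFICIAL_CHECK",
    "closed_on": date(2024, 3, 1),
    "maturity_date": date(2025, 1, 1),
}
for event in derive_events(raw_event):
    print(event["type"])
```

Here all three derived events come from a single raw event. Correlating events that originate in different systems is where a CEP product becomes more attractive than this kind of hand-written logic.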

Core Messaging Infrastructure

Once the raw event has been processed, enriched and transformed into an enterprise-standard event, the event needs to be published to a Topic provided by the messaging infrastructure. There are many options for building the enterprise-wide messaging infrastructure (a.k.a. Message Bus or Event Bus), such as WebSphere MQ, ActiveMQ, RabbitMQ or Kafka. Kafka is a highly scalable, distributed messaging system that is gaining a lot of popularity among enterprises as Message Oriented Middleware (MOM) infrastructure.

Event Subscribers

Once the event is published to a specific Topic on the Message/Event Bus, all the subscribers to that Topic receive the event and can take further action: call other services, or generate additional events after applying some business rules.
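The publish/subscribe flow can be illustrated with a toy, in-process bus. To be clear, `InMemoryBus` is a hypothetical class written for this post; it is not the API of Kafka, RabbitMQ or any other broker, and it delivers events synchronously, which a real message bus would not do.

```python
class InMemoryBus:
    """Toy stand-in for a message/event bus with named topics."""

    def __init__(self):
        self._subscribers = {}  # topic name -> list of handler callables

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, event):
        """Deliver the event to every subscriber of the topic.

        Returns the number of subscribers that received it.
        """
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = InMemoryBus()
alerts, audit_log = [], []

# Two independent subscribers on the same topic; neither knows about the other.
bus.subscribe("account.closed", lambda e: alerts.append("notify " + e["account_id"]))
bus.subscribe("account.closed", audit_log.append)

bus.publish("account.closed", {"account_id": "CD-1001"})
print(alerts, audit_log)
```

The publisher only knows the topic name. Adding a third subscriber requires no change to the publisher, which is the decoupling that the messaging infrastructure provides.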

System Performance

It is important to decouple event generation and event processing from the customer-facing business processes, so that the additional overhead of event handling does not delay their response time. One way to avoid this delay is to asynchronously store the raw events in event/audit tables and then send them to a message queue, or to write the raw events directly to a message queue for processing. Putting the raw events in a message queue before they are processed and published decouples the business process that generates the events from the actual processing of the events; however, it adds complexity to the overall architecture. Event processing and publishing will be delayed if raw events are generated at a higher rate than the event processors can process and publish them. There could be a business requirement to process events within X seconds, and any delay could result in a sub-optimal customer experience or even undesired consequences. As an example, sending a debit card transaction alert to the customer 5 minutes after the card is swiped might not be a good customer experience, and a 15-minute delay in alerting the customer could allow a fraud to go unnoticed, at significant cost to the bank.
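To make the rate mismatch concrete, here is a back-of-the-envelope sketch. It treats arrivals and processing as steady, deterministic rates, which is a simplifying assumption of mine and not a queuing model; the function names and the numbers are hypothetical.

```python
def backlog_after(arrival_rate, service_rate, seconds, initial_backlog=0.0):
    """Events waiting in the queue after `seconds` at constant rates.

    Rates are in events per second. The backlog cannot go below zero.
    """
    growth = (arrival_rate - service_rate) * seconds
    return max(0.0, initial_backlog + growth)


def drain_time(backlog, arrival_rate, service_rate):
    """Seconds needed to clear the backlog; infinite if we never catch up."""
    if service_rate <= arrival_rate:
        return float("inf")
    return backlog / (service_rate - arrival_rate)


# Hypothetical peak: 120 raw events/s arrive for 5 minutes,
# while the event processors handle 100 events/s.
peak_backlog = backlog_after(arrival_rate=120, service_rate=100, seconds=300)

# The last event queued at the end of the peak waits for the whole backlog.
worst_wait = peak_backlog / 100

# After the peak, arrivals drop to 50 events/s and the queue drains.
recovery = drain_time(peak_backlog, arrival_rate=50, service_rate=100)

print(peak_backlog, worst_wait, recovery)
```

With these made-up numbers, a 5-minute peak leaves 6,000 events queued, the last of them waits about 60 seconds, and the queue takes a further 120 seconds to drain. If the business requirement were to process every event within 30 seconds, this configuration would already miss it.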

To avoid delays in event processing, it is important to run performance tests that evaluate the system under peak load. We can also estimate system performance before actually running the performance tests by applying queuing theory. This gives us a theoretical benchmark against which to compare the performance test results for a system that involves messaging infrastructure.

In my next post on Queuing Theory, we’ll take a look at the queuing model and discuss how it can be applied to the raw event processors for evaluating the system performance before running the load tests on the system.
