ServiceStarter Tutorial: Controlling Services

ServiceStarter is a utility that manages the lifecycle of independent services, executing start and stop tasks deterministically and at the correct time.

Greg Higgins

Jan. 12, 22 · Tutorial

This article is a tutorial for a new utility, ServiceStarter. The purpose of ServiceStarter is to control the lifecycle of independent services, ensuring that services start and stop in the correct order and degrade predictably if a required service fails at runtime. The tutorial is based on a notional order processing system connected to an exchange.

As background, this is my Christmas project. Due to a COVID-cancelled skiing holiday, I was left with time on my hands. A friend, who is building an event-driven high-frequency trading (HFT) system, asked me to help solve the problem of starting and stopping trading services in a predictable order. The solution morphed into the ServiceStarter project.

The Problem ServiceStarter Solves

In many event-driven systems, services execute independently but globally need to coordinate their lifecycle with each other. A service may require all downstream services to be started before starting and becoming available to accept upstream requests.

Similarly, if a downstream service becomes unavailable, all upstream services need to be notified and take appropriate action, gracefully shutting down services starting with the external-facing services and working back towards the failed service.

As systems grow, a complex graph of interdependent services quickly arises. Managing the lifecycle correctly becomes too difficult for a handwritten solution, which tends to produce brittle, non-deterministic behavior that other services come to rely upon, or to introduce transient failures.

ServiceStarter is a utility that manages the lifecycle of independent services, executing start and stop tasks associated with a particular service deterministically and at the correct time.

ServiceStarter Order Processing Example

Find the example here on GitHub.

A simulated order processing system forms the requirements for this example. An order gateway connects to an exchange and processes orders from market participants. Orders must make it to a running instance of the risk manager, and be recorded for audit.

If the services downstream of the gateway are not running, the gateway must be prevented from accepting new orders.

The goal of this example is to control the lifecycle of the independent services:

Start all services in the correct order. The order gateway must be the last component started.
Stop all services in the correct order. The order gateway must be the first component stopped.
Stop or start a service interactively, ensuring its dependencies are stopped or started in the correct order.
React to a random start/stop of a service and start/stop required dependencies.
Optionally start/stop a service without causing other services to start or stop.

An internal order source submits orders that are not subject to the same PnL checks as external orders. The internal order source is controlled independently to the order gateway, although they share some common downstream components.

Running the Example

Start by cloning the project, then run the executable jar in the dist directory.

    Shell
   
   java -jar .\dist\servicestarter-orderprocessing-example.jar

Use the GUI to trigger start/stop of services. The triggering checkboxes control the automated response of ServiceManager to task completion and service status updates.

Alternatively, build with Maven and execute Main from your IDE to run the example GUI.

Services Description

Service name	Description	Requires started services
orderGateway	Connects to the exchange and receives orders	pnlCheck
limitReader	Publishes limits for valid maximum/minimum order sizes
marketDataGateway	Publishes current market price for assets
pnlCheck	Validates that an order is within the size limit and performs an off-market rate check	limitReader, marketDataGateway, orderProcessor
orderProcessor	Validates order details	orderAudit, riskManager
internalOrderSource	Order from internal customers, no pnl check required	orderProcessor
orderAudit	Records all valid orders for audit
riskManager	Manages risk
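
The "requires started services" column is enough to derive a valid start order by hand, which helps when checking what the GUI shows. The sketch below is a plain depth-first ordering over that column. It is not how ServiceStarter computes its tasks, and the class and method names (StartOrderSketch, requirements, startOrder) are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of ServiceStarter: derives one valid start order
// from the "requires started services" column of the table above.
class StartOrderSketch {

    // service name -> services that must be started first
    static Map<String, List<String>> requirements() {
        Map<String, List<String>> requires = new LinkedHashMap<>();
        requires.put("orderGateway", List.of("pnlCheck"));
        requires.put("limitReader", List.of());
        requires.put("marketDataGateway", List.of());
        requires.put("pnlCheck", List.of("limitReader", "marketDataGateway", "orderProcessor"));
        requires.put("orderProcessor", List.of("orderAudit", "riskManager"));
        requires.put("internalOrderSource", List.of("orderProcessor"));
        requires.put("orderAudit", List.of());
        requires.put("riskManager", List.of());
        return requires;
    }

    // Depth-first walk: a service is appended only after everything it requires.
    static List<String> startOrder(Map<String, List<String>> requires) {
        Set<String> ordered = new LinkedHashSet<>();
        Set<String> inProgress = new LinkedHashSet<>();
        for (String service : requires.keySet()) {
            visit(service, requires, ordered, inProgress);
        }
        return new ArrayList<>(ordered);
    }

    private static void visit(String service, Map<String, List<String>> requires,
                              Set<String> ordered, Set<String> inProgress) {
        if (ordered.contains(service)) {
            return;
        }
        if (!inProgress.add(service)) {
            throw new IllegalStateException("Circular dependency at " + service);
        }
        for (String required : requires.getOrDefault(service, List.of())) {
            visit(required, requires, ordered, inProgress);
        }
        inProgress.remove(service);
        ordered.add(service);
    }

    public static void main(String[] args) {
        List<String> start = startOrder(requirements());
        System.out.println("start order: " + start);
        // Reversing a valid start order gives a valid stop order.
        List<String> stop = new ArrayList<>(start);
        Collections.reverse(stop);
        System.out.println("stop order:  " + stop);
    }
}
```

Any order in which each service appears after the services it requires is acceptable, so the output is one of several valid answers. Services with no path between them, such as limitReader and riskManager, can be started in either order or in parallel.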

Programming the Example

A single file Main defines the dependency relationship between services in accordance with the table above and builds a ServiceManager that will control the services. The GUI is a sample built for this tutorial and is not general purpose. The GUI registers as a status update listener and invokes methods on the ServiceManager.

Defining a Service

Service definitions map an external service, start task, stop task, and dependencies into the ServiceManager. (See here for documentation describing a service.) A builder pattern is used to construct service definitions.

An extract from Main demonstrates building some of the service definitions.

    Java
   
 

   Service orderGateway = Service.builder(ORDER_GATEWAY)
           .startTask(Main::emptyTask)
           .stopTask(Main::emptyTask)
           .build();
   // pushes order limits to pnlCheck
   Service limitReader = Service.builder(LIMIT_READER)
           .startTask(Main::emptyTask)
           .stopTask(Main::emptyTask)
           .build();
   // pushes market data to pnlCheck
   Service marketDataGateway = Service.builder(MARKET_DATA_GATEWAY)
           .startTask(Main::emptyTask)
           .stopTask(Main::emptyTask)
           .build();
   // carries out size and off-market checks on incoming orders
   // has a complex dependency relationship
   Service pnlCheck = Service.builder(PNL_CHECK)
           .requiredServices(limitReader, marketDataGateway)
           .servicesThatRequireMe(orderGateway)
           .stopTask(Main::emptyTask)
           .startTask(Main::emptyTask)
           .build();
   // processes orders and checks their validity
   Service orderProcessor = Service.builder(ORDER_PROCESSOR)
           .servicesThatRequireMe(pnlCheck)
           .stopTask(Main::emptyTask)
           .startTask(Main::emptyTask)
           .build();
  

The code above creates service definitions:

Each service is provided with a unique name
The pnlCheck service defines the services it requires: limitReader and marketDataGateway
The pnlCheck service defines the services that require it: orderGateway
The orderProcessor service defines the services that require it: pnlCheck
The start and stop tasks are set to Main::emptyTask for all services
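
The two declaration styles describe the same relationship from opposite ends: saying that orderGateway requires pnlCheck is equivalent to saying that pnlCheck is required by orderGateway. The sketch below models that equivalence with a small map of edges. It is a hypothetical model for illustration, not the ServiceStarter builder, and the class DependencyEdgesSketch and its requirementsOf method are assumed names.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not the ServiceStarter implementation: both declaration
// styles end up as the same "dependent requires dependency" edge.
class DependencyEdgesSketch {

    // service name -> names of the services it requires
    private final Map<String, Set<String>> requires = new LinkedHashMap<>();

    // Mirrors the intent of requiredServices(...) declared on 'service'.
    DependencyEdgesSketch requiredServices(String service, String... dependencies) {
        for (String dependency : dependencies) {
            addEdge(service, dependency);
        }
        return this;
    }

    // Mirrors the intent of servicesThatRequireMe(...) declared on 'service'.
    DependencyEdgesSketch servicesThatRequireMe(String service, String... dependents) {
        for (String dependent : dependents) {
            addEdge(dependent, service);
        }
        return this;
    }

    private void addEdge(String dependent, String dependency) {
        requires.computeIfAbsent(dependent, key -> new LinkedHashSet<>()).add(dependency);
        requires.computeIfAbsent(dependency, key -> new LinkedHashSet<>());
    }

    // Everything the given service needs started before it can start.
    Set<String> requirementsOf(String service) {
        return requires.getOrDefault(service, Set.of());
    }

    public static void main(String[] args) {
        // Same declarations as the extract from Main, expressed against the model.
        DependencyEdgesSketch graph = new DependencyEdgesSketch()
                .requiredServices("pnlCheck", "limitReader", "marketDataGateway")
                .servicesThatRequireMe("pnlCheck", "orderGateway")
                .servicesThatRequireMe("orderProcessor", "pnlCheck");

        System.out.println("pnlCheck requires " + graph.requirementsOf("pnlCheck"));
        System.out.println("orderGateway requires " + graph.requirementsOf("orderGateway"));
    }
}
```

Under this model the three declarations reproduce the pnlCheck and orderGateway rows of the services table, even though orderGateway and orderProcessor never declare a requirement themselves.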

Main::emptyTask always succeeds after a timed delay of 1_500 milliseconds.
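
The body of emptyTask is not shown in this article. A task with the described behavior could look something like the sketch below, which assumes a task is a plain no-argument method that signals success by returning normally. The class name EmptyTaskSketch and the pause helper are hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for Main::emptyTask; the real method body is not shown
// in the article, so this only illustrates the described behavior.
class EmptyTaskSketch {

    static final long DELAY_MILLIS = 1_500;

    // Simulates a start or stop task that always succeeds after a fixed delay.
    static void emptyTask() {
        pause(DELAY_MILLIS);
    }

    static void pause(long millis) {
        try {
            TimeUnit.MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the calling thread can observe it.
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long begin = System.nanoTime();
        emptyTask();
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
        System.out.println("emptyTask returned after about " + elapsedMillis + " ms");
    }
}
```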

Building the ServiceManager

Once all the services are defined, the ServiceManager can be built with the list of services. A ServiceManager is the controller for managing services and publishing the set of tasks to execute in response to a service event. See here for information about building a ServiceManager and here for how services are controlled.

    Java
   
 

   ServiceManager svcManager = ServiceManager.build(
           orderGateway,
           limitReader,
           marketDataGateway,
           pnlCheck,
           orderProcessor,
           internalOrderSource,
           orderAudit,
           riskManager
   );
  

Threading Model

All the requests take place on the GUI thread, and the default behavior is to execute the task on the calling thread. The example task sleeps for 1_500 milliseconds, which would lock the GUI. For this example, the ServiceManager is configured to execute tasks with an AsynchronousTaskExecutor. Tasks run on worker threads, so the GUI is not locked during task execution.

The tasks published by the ServiceManager are independent and can safely be executed in parallel by worker threads created by the AsynchronousTaskExecutor. The default SynchronousTaskExecutor does not parallelize the execution of tasks; they run serially.
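
The difference between the two execution styles can be seen with nothing more than the standard java.util.concurrent classes. The sketch below is not the AsynchronousTaskExecutor or SynchronousTaskExecutor implementation. It only contrasts running independent sleeping tasks on the calling thread with running them on a worker pool, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical comparison, not ServiceStarter code: serial execution on the
// calling thread versus parallel execution on worker threads.
class TaskExecutionSketch {

    static Runnable sleepingTask(String name, long millis) {
        return () -> {
            try {
                TimeUnit.MILLISECONDS.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println(name + " finished on " + Thread.currentThread().getName());
        };
    }

    // Runs every task on the calling thread, one after another.
    static long runSerially(List<Runnable> tasks) {
        long begin = System.nanoTime();
        tasks.forEach(Runnable::run);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
    }

    // Runs each task on its own worker thread and waits for all of them.
    static long runInParallel(List<Runnable> tasks) {
        ExecutorService workers = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        try {
            long begin = System.nanoTime();
            CompletableFuture<?>[] running = tasks.stream()
                    .map(task -> CompletableFuture.runAsync(task, workers))
                    .toArray(CompletableFuture[]::new);
            CompletableFuture.allOf(running).join();
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - begin);
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        // Start tasks for the four services that require nothing else.
        List<Runnable> tasks = List.of(
                sleepingTask("limitReader start", 1_500),
                sleepingTask("marketDataGateway start", 1_500),
                sleepingTask("orderAudit start", 1_500),
                sleepingTask("riskManager start", 1_500));
        System.out.println("serial:   " + runSerially(tasks) + " ms");
        System.out.println("parallel: " + runInParallel(tasks) + " ms");
    }
}
```

The sketch blocks the caller in both cases purely so the elapsed time can be measured. A GUI would hand the tasks to the workers and return immediately, relying on a completion callback instead of join.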

    Java
   
   svcManager.registerTaskExecutor(new AsynchronousTaskExecutor());

Notifications from the tasks to the ServiceManager are on a worker thread. ServiceManager is thread-safe, preventing race conditions if the GUI thread tries to update the ServiceManager at the same time.

Automatic Triggering on Successful Task Execution

When a service is started or stopped, tasks are executed as specified in the service definition. If a task completes without an exception, the status of the service is updated to STARTED or STOPPED respectively. The change of service state triggers the ServiceManager to generate the next set of tasks to execute.

    Java
   
   svcManager.triggerNotificationOnSuccessfulTaskExecution(true);

It is possible to disable this behavior by changing the triggerNotificationOnSuccessfulTaskExecution flag to false. That way, after an execution, there is no automatic status update and no triggering of sub-tasks. This allows the developer to debug the events passing into the ServiceManager, using the notification buttons to progress the execution.

The checkbox on the GUI is connected to the relevant flag in ServiceManager.

Automatic Task Triggering on State Changes

Sometimes services start or stop unexpectedly, and dependent services should be controlled to maintain the integrity of the system. If a service monitoring solution is in place, notifications can be sent to the ServiceManager via API calls. When the ServiceManager receives a notification of a service status change, tasks are executed as required to control the starting and stopping of dependent services.

It is possible to disable this behavior by changing the triggerDependentsOnNotification flag to false so that a status update will not trigger the execution of sub-tasks.

    Java
   
   svcManager.triggerDependentsOnNotification(true);

The checkbox on the GUI is connected to the relevant flag in ServiceManager.

Registering the GUI as a Status Listener

When a service state changes, the ServiceManager publishes the list of ServiceStatusRecords for all services to a registered listener. The GUI registers for status updates with:

    Java
   
   svcManager.registerStatusListener(new ServiceManagaerFrame(svcManager)::logStatus);

When the GUI is notified with a status update, the screen is repainted reflecting the new state of the services according to the ServiceManager.
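
The listener hand-off follows the familiar callback pattern: the publisher holds a reference to a method and invokes it with a fresh snapshot on every change. The sketch below illustrates that pattern with java.util.function.Consumer. It does not reproduce the ServiceManager API, and StatusListenerSketch, StatusRecord and ConsoleView are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical publisher, not the ServiceManager API: shows only how a method
// reference can be registered and called back with a full status snapshot.
class StatusListenerSketch {

    // Simplified stand-in for a status record: service name plus status text.
    static final class StatusRecord {
        final String serviceName;
        final String status;

        StatusRecord(String serviceName, String status) {
            this.serviceName = serviceName;
            this.status = status;
        }

        @Override
        public String toString() {
            return serviceName + " -> " + status;
        }
    }

    // Stand-in for the GUI frame: it only needs a method matching the callback shape.
    static final class ConsoleView {
        void logStatus(List<StatusRecord> records) {
            System.out.println("repaint with " + records);
        }
    }

    private final Map<String, String> statuses = new LinkedHashMap<>();
    private Consumer<List<StatusRecord>> listener = records -> { };

    void registerStatusListener(Consumer<List<StatusRecord>> listener) {
        this.listener = listener;
    }

    // Records the change, then publishes the status of every known service.
    void updateStatus(String serviceName, String status) {
        statuses.put(serviceName, status);
        List<StatusRecord> snapshot = new ArrayList<>();
        statuses.forEach((name, value) -> snapshot.add(new StatusRecord(name, value)));
        listener.accept(List.copyOf(snapshot));
    }

    public static void main(String[] args) {
        StatusListenerSketch publisher = new StatusListenerSketch();
        publisher.registerStatusListener(new ConsoleView()::logStatus);
        publisher.updateStatus("orderAudit", "STARTED");
        publisher.updateStatus("riskManager", "STARTED");
        publisher.updateStatus("riskManager", "STOPPED");
    }
}
```

Publishing an immutable copy of the whole list, rather than only the changed record, lets a listener repaint from scratch without tracking earlier updates.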

Service States

Each service is effectively a state machine transitioning to a new state in response to events posted by the ServiceManager. The state definitions for a service are described here.

Experiment With Stopping the riskManager

Ensure all services are running, un-tick the "trigger on task complete" checkbox, then stop the risk manager. With the checkbox un-ticked, the developer can step through the states manually using the notify buttons.

Push Stop Service for the riskManager

Service name	Status	Stop task executing
orderGateway	STOPPING	yes
limitReader	STARTED
marketDataGateway	STARTED
pnlCheck	WAITING_FOR_PARENTS_TO_STOP
orderProcessor	WAITING_FOR_PARENTS_TO_STOP
internalOrderSource	STOPPING	yes
orderAudit	STARTED
riskManager	WAITING_FOR_PARENTS_TO_STOP

Push Notify Stopped for orderGateway

Service name	Status	Stop task executing	Stop task completed
orderGateway	STOPPED		yes
limitReader	STARTED
marketDataGateway	STARTED
pnlCheck	STOPPING	yes
orderProcessor	WAITING_FOR_PARENTS_TO_STOP
internalOrderSource	STOPPING	yes
orderAudit	STARTED
riskManager	WAITING_FOR_PARENTS_TO_STOP

Push Notify Stopped for internalOrderSource and pnlCheck

Service name	Status	Stop task executing	Stop task completed
orderGateway	STOPPED		yes
limitReader	STARTED
marketDataGateway	STARTED
pnlCheck	STOPPED		yes
orderProcessor	STOPPING	yes
internalOrderSource	STOPPED		yes
orderAudit	STARTED
riskManager	WAITING_FOR_PARENTS_TO_STOP

Push Notify Stopped for orderProcessor and riskManager

Service name	Status	Stop task completed
orderGateway	STOPPED	yes
limitReader	STARTED
marketDataGateway	STARTED
pnlCheck	STOPPED	yes
orderProcessor	STOPPED	yes
internalOrderSource	STOPPED	yes
orderAudit	STARTED
riskManager	STOPPED	yes
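
The tables above can be reproduced with a small model built from the services table. The sketch below applies one rule: a service that has been asked to stop waits until every service that requires it is STOPPED, then moves to STOPPING. This is an illustration of the observed behavior, not the ServiceStarter state machine. The class StopWalkthroughSketch is hypothetical, and its Status enum lists only the states that appear in this walkthrough.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the walkthrough, not the ServiceStarter state machine.
class StopWalkthroughSketch {

    // Only the states shown in the tables above.
    enum Status { STARTED, WAITING_FOR_PARENTS_TO_STOP, STOPPING, STOPPED }

    private final Map<String, List<String>> requires = new LinkedHashMap<>();
    private final Map<String, Status> status = new LinkedHashMap<>();

    StopWalkthroughSketch() {
        define("orderGateway", "pnlCheck");
        define("limitReader");
        define("marketDataGateway");
        define("pnlCheck", "limitReader", "marketDataGateway", "orderProcessor");
        define("orderProcessor", "orderAudit", "riskManager");
        define("internalOrderSource", "orderProcessor");
        define("orderAudit");
        define("riskManager");
    }

    private void define(String service, String... required) {
        requires.put(service, List.of(required));
        status.put(service, Status.STARTED); // every service starts out running
    }

    // Services that directly require the given service (its "parents").
    private Set<String> parentsOf(String service) {
        Set<String> parents = new LinkedHashSet<>();
        requires.forEach((name, required) -> {
            if (required.contains(service)) {
                parents.add(name);
            }
        });
        return parents;
    }

    // Marks the service and everything that transitively requires it for stopping.
    void stopService(String service) {
        Set<String> affected = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(List.of(service));
        while (!pending.isEmpty()) {
            String next = pending.pop();
            if (affected.add(next)) {
                pending.addAll(parentsOf(next));
            }
        }
        for (String name : affected) {
            if (status.get(name) == Status.STARTED) {
                status.put(name, Status.WAITING_FOR_PARENTS_TO_STOP);
            }
        }
        releaseWaitingServices();
    }

    // Equivalent of pushing "Notify Stopped" for one service.
    void notifyStopped(String service) {
        status.put(service, Status.STOPPED);
        releaseWaitingServices();
    }

    // A waiting service may begin stopping once every parent is STOPPED.
    private void releaseWaitingServices() {
        for (Map.Entry<String, Status> entry : status.entrySet()) {
            boolean parentsStopped = parentsOf(entry.getKey()).stream()
                    .allMatch(parent -> status.get(parent) == Status.STOPPED);
            if (entry.getValue() == Status.WAITING_FOR_PARENTS_TO_STOP && parentsStopped) {
                entry.setValue(Status.STOPPING);
            }
        }
    }

    Status statusOf(String service) {
        return status.get(service);
    }

    public static void main(String[] args) {
        StopWalkthroughSketch model = new StopWalkthroughSketch();
        model.stopService("riskManager");
        System.out.println("after stop request:       " + model.status);
        model.notifyStopped("orderGateway");
        System.out.println("orderGateway stopped:     " + model.status);
        model.notifyStopped("internalOrderSource");
        model.notifyStopped("pnlCheck");
        System.out.println("pnlCheck/internal stopped: " + model.status);
        model.notifyStopped("orderProcessor");
        model.notifyStopped("riskManager");
        System.out.println("final:                    " + model.status);
    }
}
```

limitReader, marketDataGateway and orderAudit stay STARTED throughout because nothing they depend on is stopping. Only services that directly or transitively require riskManager are affected.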

Conclusion

I hope you enjoyed reading the article. I intend to write a follow-up piece to cover:

Monitoring and audit logs
Compiling a ServiceManager ahead of time
Fundamentals of ServiceStarter

My goal is to have developers actively use ServiceStarter, letting their feedback drive the development direction and iron out any bugs along the way.

So please try it out, comment, and ask questions.

Opinions expressed by DZone contributors are their own.
