ServiceStarter Tutorial: Controlling Services
ServiceStarter is a utility that manages the lifecycle of independent services, executing start and stop tasks deterministically and at the correct time.
This article is a tutorial for a new utility, ServiceStarter. The purpose of ServiceStarter is to control the lifecycle of independent services, ensuring that services start and stop in the correct order and degrade predictably if a required service fails at runtime. This tutorial is based on a notional order processing system connected to an exchange.
As background, this is my Christmas project. Due to a COVID-cancelled skiing holiday, I was left with time on my hands. A friend, who is building an event-driven HFT trading system, asked me to help solve the problem of starting and stopping trading services in a predictable order. The solution morphed into the ServiceStarter project.
The Problem ServiceStarter Solves
In many event-driven systems, services execute independently but globally need to coordinate their lifecycle with each other. A service may require all downstream services to be started before starting and becoming available to accept upstream requests.
Similarly, if a downstream service becomes unavailable, all upstream services need to be notified and take appropriate action: shutting down gracefully, starting with external-facing services and working back towards the failed service.
As systems grow, a complex graph of interdependent services quickly arises. The difficulty of correctly managing the lifecycle overwhelms a handwritten solution, which can result in brittle, non-deterministic behavior that services come to rely upon, or can introduce transient failures.
ServiceStarter is a utility that manages the lifecycle of independent services, executing start and stop tasks associated with a particular service deterministically and at the correct time.
ServiceStarter Order Processing Example
Find the example here on GitHub.
A simulated order processing system forms the requirements for this example. An order gateway connects to an exchange and processes orders from market participants. Orders must make it to a running instance of the risk manager, and be recorded for audit.
If the services downstream of the gateway are not running, the gateway must be prevented from accepting new orders.
The goal of this example is to control the lifecycle of the independent services:
- Start all services in the correct order. The order gateway must be the last component started.
- Stop all services in the correct order. The order gateway must be the first component stopped.
- Stop/start a service interactively, ensuring dependencies are stopped/started in the correct order.
- React to a random start/stop of a service and start/stop required dependencies.
- Optionally start/stop a service without causing other services to start or stop.
An internal order source submits orders that are not subject to the same PnL checks as external orders. The internal order source is controlled independently of the order gateway, although they share some common downstream components.
Running the Example
Start by cloning the project, then run the executable jar in the dist directory.
java -jar .\dist\servicestarter-orderprocessing-example.jar
Use the GUI to trigger start/stop of services. The triggering checkboxes control the automated response of ServiceManager to task completion and service status updates.
Service name | Description | Requires started services |
---|---|---|
orderGateway | Connects to exchange and receives orders | pnlCheck |
limitReader | Publishes limits for valid maximum/minimum order sizes | |
marketDataGateway | Publishes current market price for assets | |
pnlCheck | Validates an order is within limit size; off-market rate check | limitReader, marketDataGateway, orderProcessor |
orderProcessor | Validates order details | orderAudit, riskManager |
internalOrderSource | Order from internal customers, no pnl check required | orderProcessor |
orderAudit | Records all valid orders for audit | |
riskManager | Manages risk | |
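To make the ordering goals concrete, the sketch below encodes the table as a plain map and derives one valid start order with a topological sort (Kahn's algorithm); the stop order is simply the reverse. This is only an illustration of the constraint expressed by the table, not how ServiceStarter computes its tasks internally, and the class and method names are invented for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StartOrderSketch {

    // service name -> services that must be started first (copied from the table above)
    static Map<String, List<String>> requirements() {
        Map<String, List<String>> requires = new LinkedHashMap<>();
        requires.put("orderGateway", List.of("pnlCheck"));
        requires.put("limitReader", List.of());
        requires.put("marketDataGateway", List.of());
        requires.put("pnlCheck", List.of("limitReader", "marketDataGateway", "orderProcessor"));
        requires.put("orderProcessor", List.of("orderAudit", "riskManager"));
        requires.put("internalOrderSource", List.of("orderProcessor"));
        requires.put("orderAudit", List.of());
        requires.put("riskManager", List.of());
        return requires;
    }

    // Kahn's algorithm: a service is emitted once everything it requires has been emitted.
    static List<String> startOrder(Map<String, List<String>> requires) {
        Map<String, Integer> pending = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new LinkedHashMap<>();
        for (String svc : requires.keySet()) {
            pending.put(svc, requires.get(svc).size());
            dependents.put(svc, new ArrayList<>());
        }
        for (Map.Entry<String, List<String>> e : requires.entrySet()) {
            for (String required : e.getValue()) {
                dependents.get(required).add(e.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        pending.forEach((svc, count) -> { if (count == 0) ready.add(svc); });
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String svc = ready.poll();
            order.add(svc);
            for (String dependent : dependents.get(svc)) {
                if (pending.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (order.size() != requires.size()) {
            throw new IllegalStateException("cycle in service dependencies");
        }
        return order;
    }

    // Stopping walks the same graph in the opposite direction.
    static List<String> stopOrder(Map<String, List<String>> requires) {
        List<String> order = new ArrayList<>(startOrder(requires));
        Collections.reverse(order);
        return order;
    }

    public static void main(String[] args) {
        System.out.println("start: " + startOrder(requirements()));
        System.out.println("stop:  " + stopOrder(requirements()));
    }
}
```

With the dependencies from the table, orderGateway ends up last in the start order and first in the stop order, which matches the first two goals listed earlier.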
Programming the Example
A single file Main defines the dependency relationship between services in accordance with the table above and builds a ServiceManager that will control the services. The GUI is a sample built for this tutorial and is not general purpose. The GUI registers as a status update listener and invokes methods on the ServiceManager.
Defining a Service
Service definitions map an external service, start task, stop task, and dependencies into the ServiceManager. (See here for documentation describing a service.) A builder pattern is used to construct service definitions.
An extract from Main demonstrates building some of the service definitions.
Service orderGateway = Service.builder(ORDER_GATEWAY)
        .startTask(Main::emptyTask)
        .stopTask(Main::emptyTask)
        .build();
// pushes order limits to pnlCheck
Service limitReader = Service.builder(LIMIT_READER)
        .startTask(Main::emptyTask)
        .stopTask(Main::emptyTask)
        .build();
// pushes market data to pnlCheck
Service marketDataGateway = Service.builder(MARKET_DATA_GATEWAY)
        .startTask(Main::emptyTask)
        .stopTask(Main::emptyTask)
        .build();
// carries out size and off-market checks on incoming orders
// has a complex dependency relationship
Service pnlCheck = Service.builder(PNL_CHECK)
        .requiredServices(limitReader, marketDataGateway)
        .servicesThatRequireMe(orderGateway)
        .stopTask(Main::emptyTask)
        .startTask(Main::emptyTask)
        .build();
// processes orders and checks their validity
Service orderProcessor = Service.builder(ORDER_PROCESSOR)
        .servicesThatRequireMe(pnlCheck)
        .stopTask(Main::emptyTask)
        .startTask(Main::emptyTask)
        .build();
The code above creates service definitions:
- Each service is provided with a unique name
- The pnlCheck service defines services it requires: limitReader, marketDataGateway
- pnlCheck defines services that require it: orderGateway
- The orderProcessor service defines services that require it: pnlCheck
- The start and stop tasks are set to Main::emptyTask for all services
- Main::emptyTask always succeeds after a timed delay of 1_500 milliseconds.
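One point worth spelling out is that requiredServices and servicesThatRequireMe describe the same kind of relationship from opposite ends. The sketch below is a hypothetical, string-based stand-in (not the ServiceStarter builder) that records both styles of declaration as "dependent requires required" edges, using the three declarations from the extract.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class DependencyEdgesSketch {

    // dependent service -> services it requires
    private final Map<String, Set<String>> requires = new TreeMap<>();

    // "service requires each of the listed services"
    DependencyEdgesSketch requiredServices(String service, String... required) {
        for (String r : required) {
            addEdge(service, r);
        }
        return this;
    }

    // "each listed service requires me": the same edge, declared from the other end
    DependencyEdgesSketch servicesThatRequireMe(String service, String... dependents) {
        for (String d : dependents) {
            addEdge(d, service);
        }
        return this;
    }

    private void addEdge(String dependent, String required) {
        requires.computeIfAbsent(dependent, k -> new TreeSet<>()).add(required);
        requires.computeIfAbsent(required, k -> new TreeSet<>());
    }

    Map<String, Set<String>> requires() {
        return requires;
    }

    public static void main(String[] args) {
        DependencyEdgesSketch graph = new DependencyEdgesSketch()
                .requiredServices("pnlCheck", "limitReader", "marketDataGateway")
                .servicesThatRequireMe("pnlCheck", "orderGateway")
                .servicesThatRequireMe("orderProcessor", "pnlCheck");
        graph.requires().forEach((svc, deps) -> System.out.println(svc + " requires " + deps));
    }
}
```

After these three declarations, pnlCheck requires limitReader, marketDataGateway and orderProcessor, and orderGateway requires pnlCheck, which agrees with the corresponding rows of the table.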
Building the ServiceManager
Once all the services are defined, the ServiceManager can be built with the list of services. A ServiceManager is the controller for managing services and publishing the set of tasks to execute in response to a service event. See here for information about building a ServiceManager and here for how services are controlled.
ServiceManager svcManager = ServiceManager.build(
        orderGateway,
        limitReader,
        marketDataGateway,
        pnlCheck,
        orderProcessor,
        internalOrderSource,
        orderAudit,
        riskManager
);
Threading Model
All the requests take place on the GUI thread, and the default behavior is to execute tasks on the calling thread. The example task sleeps for 1_500 milliseconds, which would lock the GUI. For this example, the ServiceManager is configured to execute tasks with an AsynchronousTaskExecutor. Tasks run on worker threads, so the GUI is not locked during task execution.
The tasks published by the ServiceManager are independent and can safely be executed in parallel by worker threads created by the AsynchronousTaskExecutor. The default SynchronousTaskExecutor does not parallelize task execution; tasks run serially.
svcManager.registerTaskExecutor(new AsynchronousTaskExecutor());
Notifications from the tasks to the ServiceManager are on a worker thread. ServiceManager is thread-safe, preventing race conditions if the GUI thread tries to update the ServiceManager at the same time.
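The difference between the two execution styles can be shown with the standard java.util.concurrent classes alone. The sketch below is not the AsynchronousTaskExecutor implementation; it is a minimal illustration, with invented names, of running tasks on the calling thread versus handing them to a pool so the caller (the GUI thread in this tutorial) returns immediately.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class TaskExecutionSketch {

    // Serial style: every task runs on the calling thread, one after another.
    static void runSerially(List<Runnable> tasks) {
        tasks.forEach(Runnable::run);
    }

    // Asynchronous style: tasks are handed to the pool and the caller returns at once.
    // The returned future completes when every task has finished.
    static CompletableFuture<Void> runAsync(List<Runnable> tasks, Executor pool) {
        CompletableFuture<?>[] running = tasks.stream()
                .map(task -> CompletableFuture.runAsync(task, pool))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(running);
    }

    // A stand-in for a slow start/stop task; records which thread executed it.
    static Runnable sleepingTask(long millis, Set<String> threadNames) {
        return () -> {
            threadNames.add(Thread.currentThread().getName());
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
    }

    public static void main(String[] args) {
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<Void> done = runAsync(
                    List.of(sleepingTask(100, threadNames), sleepingTask(100, threadNames)), pool);
            System.out.println("caller is free while tasks run");
            done.join(); // only the demo waits; a GUI thread would not block here
            System.out.println("tasks ran on: " + threadNames);
        } finally {
            pool.shutdown();
        }
    }
}
```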
Automatic Triggering on Successful Task Execution
When a service is started or stopped, tasks are executed as specified in the service definition. If a task completes without an exception, the status of the service is updated to either STOPPED or STARTED. The change of service state triggers the ServiceManager to generate the next set of tasks to execute.
svcManager.triggerNotificationOnSuccessfulTaskExecution(true);
It is possible to disable this behavior by changing the triggerNotificationOnSuccessfulTaskExecution flag to false. That way, after an execution, there is no automatic status update and no triggering of sub-tasks. This allows the developer to debug the events passing into the ServiceManager, using the notification buttons to progress the execution.
The checkbox on the GUI is connected to the relevant flag in ServiceManager.
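The effect of the flag can be pictured with a toy single-service model. The class below is hypothetical and far simpler than the real ServiceManager: it only shows that with the flag on, a stop task that returns normally moves the service straight to STOPPED, while with the flag off the service stays in STOPPING until someone notifies it manually.

```java
class TaskTriggerSketch {

    enum Status { STARTED, STOPPING, STOPPED }

    private final boolean triggerOnSuccess;
    private Status status = Status.STARTED;

    TaskTriggerSketch(boolean triggerOnSuccess) {
        this.triggerOnSuccess = triggerOnSuccess;
    }

    // Runs the stop task; if it throws, the status is left at STOPPING.
    void stop(Runnable stopTask) {
        status = Status.STOPPING;
        stopTask.run();
        if (triggerOnSuccess) {
            notifyStopped(); // automatic status update on successful completion
        }
    }

    // Manual notification, the equivalent of pressing a notify button in the GUI.
    void notifyStopped() {
        status = Status.STOPPED;
    }

    Status status() {
        return status;
    }

    public static void main(String[] args) {
        TaskTriggerSketch automatic = new TaskTriggerSketch(true);
        automatic.stop(() -> { });
        System.out.println("flag on:  " + automatic.status());

        TaskTriggerSketch manual = new TaskTriggerSketch(false);
        manual.stop(() -> { });
        System.out.println("flag off: " + manual.status());
        manual.notifyStopped();
        System.out.println("after manual notify: " + manual.status());
    }
}
```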
Automatic Task Triggering on State Changes
Sometimes services unexpectedly start or stop, and dependent services should be controlled to maintain the integrity of the system. If a service monitoring solution is in place, notifications can be sent to the ServiceManager via API calls. When the ServiceManager receives a notification of a service status change, tasks are executed as required to control the starting and stopping of dependent services.
It is possible to disable this behavior by changing the triggerDependentsOnNotification flag to false so that a status update will not trigger the execution of sub-tasks.
svcManager.triggerDependentsOnNotification(true);
The checkbox on the GUI is connected to the relevant flag in ServiceManager.
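To see which services are affected when one stops unexpectedly, it helps to walk the dependency table upstream. The sketch below uses invented names and assumes the dependencies listed in the table earlier in this tutorial; it collects every service that directly or transitively requires a given service, which is the set that would need to stop if that service went down.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DependentsSketch {

    // service name -> services it requires (same data as the table in this tutorial)
    static Map<String, List<String>> requirements() {
        return Map.of(
                "orderGateway", List.of("pnlCheck"),
                "limitReader", List.<String>of(),
                "marketDataGateway", List.<String>of(),
                "pnlCheck", List.of("limitReader", "marketDataGateway", "orderProcessor"),
                "orderProcessor", List.of("orderAudit", "riskManager"),
                "internalOrderSource", List.of("orderProcessor"),
                "orderAudit", List.<String>of(),
                "riskManager", List.<String>of());
    }

    // Breadth-first walk from the stopped service to everything that depends on it.
    static Set<String> dependentsOf(String service, Map<String, List<String>> requires) {
        Set<String> found = new TreeSet<>();
        Deque<String> toVisit = new ArrayDeque<>(List.of(service));
        while (!toVisit.isEmpty()) {
            String current = toVisit.poll();
            for (Map.Entry<String, List<String>> e : requires.entrySet()) {
                if (e.getValue().contains(current) && found.add(e.getKey())) {
                    toVisit.add(e.getKey());
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println("riskManager down -> stop " + dependentsOf("riskManager", requirements()));
        System.out.println("limitReader down -> stop " + dependentsOf("limitReader", requirements()));
    }
}
```

Services outside the returned set, such as orderAudit when riskManager stops, are left running.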
Registering the GUI as a Status Listener
When a service state changes, the ServiceManager publishes a list of ServiceStatusRecords for all services to a registered listener. The GUI registers for status updates with:
svcManager.registerStatusListener(new ServiceManagaerFrame(svcManager)::logStatus);
When the GUI is notified with a status update, the screen is repainted reflecting the new state of the services according to the ServiceManager.
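The listener pattern described here can be sketched with a java.util.function.Consumer. Everything in the block below is a hypothetical stand-in: StatusRecord is a simplified placeholder for a ServiceStatusRecord, and the tiny manager class only demonstrates that each change publishes a snapshot of all services, which is why the GUI can repaint from a single callback.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class StatusListenerSketch {

    // Hypothetical stand-in for a ServiceStatusRecord: a service name plus its status.
    static final class StatusRecord {
        final String serviceName;
        final String status;

        StatusRecord(String serviceName, String status) {
            this.serviceName = serviceName;
            this.status = status;
        }

        @Override
        public String toString() {
            return serviceName + "=" + status;
        }
    }

    private final Map<String, String> statuses = new LinkedHashMap<>();
    private Consumer<List<StatusRecord>> listener = records -> { };

    void registerStatusListener(Consumer<List<StatusRecord>> listener) {
        this.listener = listener;
    }

    // Every change publishes a snapshot of all services, not just the one that changed.
    void updateStatus(String serviceName, String status) {
        statuses.put(serviceName, status);
        List<StatusRecord> snapshot = new ArrayList<>();
        statuses.forEach((name, value) -> snapshot.add(new StatusRecord(name, value)));
        listener.accept(List.copyOf(snapshot));
    }

    // Plays the role of the GUI's logStatus method: receives the full snapshot.
    static void logStatus(List<StatusRecord> records) {
        System.out.println("status update: " + records);
    }

    public static void main(String[] args) {
        StatusListenerSketch manager = new StatusListenerSketch();
        manager.registerStatusListener(StatusListenerSketch::logStatus);
        manager.updateStatus("orderGateway", "STARTED");
        manager.updateStatus("riskManager", "STOPPED");
    }
}
```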
Service States
Each service is effectively a state machine transitioning to a new state in response to events posted by the ServiceManager. The state definitions for a service are described here.
Experiment with stopping the riskManager
Ensure all services are running, un-tick the "trigger on task complete" checkbox, then stop the risk manager. Un-ticking "trigger on task complete" allows the developer to step through the states manually using the notify buttons.
Push Stop Service for the riskManager
Service name | Status | Stop task executing | Stop task completed |
---|---|---|---|
orderGateway | STOPPING | yes | |
limitReader | STARTED | ||
marketDataGateway | STARTED | ||
pnlCheck | WAITING_FOR_PARENTS_TO_STOP | ||
orderProcessor | WAITING_FOR_PARENTS_TO_STOP | ||
internalOrderSource | STOPPING | yes | |
orderAudit | STARTED | ||
riskManager | WAITING_FOR_PARENTS_TO_STOP | ||
Push Notify Stopped For orderGateway
Service name | Status | Stop task executing | Stop task completed |
---|---|---|---|
orderGateway | STOPPED | yes | |
limitReader | STARTED | ||
marketDataGateway | STARTED | ||
pnlCheck | STOPPING | yes | |
orderProcessor | WAITING_FOR_PARENTS_TO_STOP | ||
internalOrderSource | STOPPING | yes | |
orderAudit | STARTED | ||
riskManager | WAITING_FOR_PARENTS_TO_STOP | ||
Push Notify Stopped for internalOrderSource and pnlCheck
Service name | Status | Stop task executing | Stop task completed |
---|---|---|---|
orderGateway | STOPPED | yes | |
limitReader | STARTED | ||
marketDataGateway | STARTED | ||
pnlCheck | STOPPED | yes | |
orderProcessor | STOPPING | yes | |
internalOrderSource | STOPPED | yes | |
orderAudit | STARTED | ||
riskManager | WAITING_FOR_PARENTS_TO_STOP | ||
Push Notify Stopped for orderProcessor and riskManager
Service name | Status | Stop task executing | Stop task completed |
---|---|---|---|
orderGateway | STOPPED | yes | |
limitReader | STARTED | ||
marketDataGateway | STARTED | ||
pnlCheck | STOPPED | yes | |
orderProcessor | STOPPED | yes | |
internalOrderSource | STOPPED | yes | |
orderAudit | STARTED | ||
riskManager | STOPPED | yes | |
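The status columns in the four tables follow one simple rule, which the sketch below reproduces for the five services affected by stopping riskManager: a service is STOPPED once it has been notified as stopped, STOPPING when every service that requires it has stopped, and WAITING_FOR_PARENTS_TO_STOP otherwise. This is a simplified model with invented names, inferred from the tables rather than taken from the ServiceStarter source, and it does not model the task-executing columns.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class StopWalkthroughSketch {

    // service -> services that require it ("parents"), limited to the services
    // affected by stopping riskManager
    static Map<String, List<String>> parents() {
        Map<String, List<String>> p = new LinkedHashMap<>();
        p.put("orderGateway", List.of());
        p.put("internalOrderSource", List.of());
        p.put("pnlCheck", List.of("orderGateway"));
        p.put("orderProcessor", List.of("pnlCheck", "internalOrderSource"));
        p.put("riskManager", List.of("orderProcessor"));
        return p;
    }

    // Derives each service's status from the set of services already notified as stopped.
    static Map<String, String> statuses(Map<String, List<String>> parents, Set<String> stopped) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : parents.entrySet()) {
            String status;
            if (stopped.contains(e.getKey())) {
                status = "STOPPED";
            } else if (stopped.containsAll(e.getValue())) {
                status = "STOPPING"; // all parents are down, so the stop task may run
            } else {
                status = "WAITING_FOR_PARENTS_TO_STOP";
            }
            result.put(e.getKey(), status);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> parents = parents();
        System.out.println(statuses(parents, Set.of()));
        System.out.println(statuses(parents, Set.of("orderGateway")));
        System.out.println(statuses(parents, Set.of("orderGateway", "internalOrderSource", "pnlCheck")));
        System.out.println(statuses(parents, parents.keySet()));
    }
}
```

Each printed line corresponds to one of the four tables above, in the same order as the button presses.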
Conclusion
I hope you enjoyed reading the article. I intend to write a follow-up piece to cover:
- Monitoring and audit logs
- Compiling a ServiceManager ahead of time
- Fundamentals of ServiceStarter
My goal is to have developers actively use the ServiceStarter, letting the feedback drive the development direction and iron out any bugs along the way.
So please try it out, comment, and ask questions.