Configuration
For detailed security recommendations (HTTPS, etc.), see the Deployment page.
All configuration for the Load Balancer is managed through the .env file.
Environment Variables
Below are the main environment variables used by the Load Balancer. All variables are set in your .env file.
Variable | Description | Default |
---|---|---|
PORT | HTTP port for the Load Balancer API. | 4000 |
REDIS_HOST | Host for the Redis instance used by BullMQ. | localhost |
REDIS_PORT | Port for the Redis instance used by BullMQ. | 6379 |
LB_AUTH_KEY | Authentication key used for communication between the Load Balancer and Workers, and required to call the API endpoints. | default-lb-auth-key |
WORKER_DISCOVERY_INTERVAL | Interval (ms) for polling Workers for status. | 3000 |
WORKER_DISCOVERY_TIMEOUT | Timeout (ms) for Worker discovery requests. | 3000 |
WORKER_DISCOVERY_RETRIES | Number of retries for Worker discovery. | 10 |
WS_HEARTBEAT_INTERVAL | Interval (ms) for WebSocket heartbeat pings. | 3000 |
WS_HEARTBEAT_TIMEOUT | Timeout (ms) for WebSocket heartbeat. | 3000 |
DASHBOARD_ENABLED | Enable or disable the dashboard interface at /dashboard. | true |
BULL_BOARD_ENABLED | Enable or disable the BullMQ dashboard interface at /bull. | true |
LOG_LEVEL | Log verbosity (info, warn, error, etc.). | info |
MAX_LOGS | Number of log lines to keep in memory (for /logs). | 1000 |
TASK_TIMEOUT | Timeout (ms) for task execution. | 300000 |
TASK_ATTEMPTS | Number of attempts for a task before failure. | 3 |
Example .env
PORT=4000
REDIS_HOST=localhost
REDIS_PORT=6379
LB_AUTH_KEY=default-lb-auth-key
WORKER_DISCOVERY_INTERVAL=3000
WORKER_DISCOVERY_TIMEOUT=3000
WORKER_DISCOVERY_RETRIES=10
WS_HEARTBEAT_INTERVAL=3000
WS_HEARTBEAT_TIMEOUT=3000
DASHBOARD_ENABLED=true
BULL_BOARD_ENABLED=true
LOG_LEVEL=info
MAX_LOGS=1000
TASK_TIMEOUT=300000
TASK_ATTEMPTS=3
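
To illustrate how the defaults in the table combine with a partial .env file, here is a minimal sketch in Python. It is not the Load Balancer's actual loader, which may behave differently; the function names `parse_env`, `load_config`, and `uses_default_auth_key` are hypothetical, and only a subset of the variables is listed for brevity.

```python
from typing import Dict

# Defaults copied from the table above (subset, kept as strings).
DEFAULTS: Dict[str, str] = {
    "PORT": "4000",
    "REDIS_HOST": "localhost",
    "REDIS_PORT": "6379",
    "LB_AUTH_KEY": "default-lb-auth-key",
    "LOG_LEVEL": "info",
    "TASK_TIMEOUT": "300000",
    "TASK_ATTEMPTS": "3",
}


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_config(text: str) -> Dict[str, str]:
    """Overlay the values found in the .env text on the documented defaults."""
    config = dict(DEFAULTS)
    config.update(parse_env(text))
    return config


def uses_default_auth_key(config: Dict[str, str]) -> bool:
    """Return True if LB_AUTH_KEY was left at its documented default."""
    return config["LB_AUTH_KEY"] == DEFAULTS["LB_AUTH_KEY"]


config = load_config("PORT=8080\nLOG_LEVEL=warn\n")
print(config["PORT"], config["REDIS_PORT"], uses_default_auth_key(config))
```

Any variable missing from the file falls back to its default, so a .env file only needs to list the values you want to change.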
Worker List
General
- The list of available Workers is managed in the file workers.json (at the project root).
- Each entry is a URL (e.g., http://localhost:3000) pointing to a Worker instance.
- This file is read at startup and can be hot-reloaded for dynamic cluster management.
[
"http://localhost:3000"
]
It is strongly recommended to use HTTPS for your Workers if they are exposed to the Internet. This is not necessary when the Workers sit behind the Load Balancer, which can then be the only component exposed. When using HTTP, the Worker connects to the Load Balancer via ws://. When using HTTPS, it connects via wss://.
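
As a rough illustration of the file format and the HTTP/HTTPS to ws:// and wss:// pairing described above, the following Python sketch validates the contents of a workers.json file and derives the matching WebSocket scheme for a URL. The helper names `parse_workers` and `websocket_scheme` are hypothetical, and the validation rules are assumptions rather than the Load Balancer's own checks.

```python
import json
from typing import List
from urllib.parse import urlparse


def parse_workers(text: str) -> List[str]:
    """Parse workers.json content: a JSON array of http(s) URLs."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("workers.json must contain a JSON array")
    urls: List[str] = []
    for entry in entries:
        parsed = urlparse(entry) if isinstance(entry, str) else None
        if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"invalid Worker URL: {entry!r}")
        urls.append(entry.rstrip("/"))
    return urls


def websocket_scheme(url: str) -> str:
    """Return 'wss' for an https URL and 'ws' otherwise."""
    return "wss" if urlparse(url).scheme == "https" else "ws"


workers = parse_workers('["http://localhost:3000"]')
print(workers, websocket_scheme(workers[0]))
```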
Docker network
When using Docker on your Mac/Windows machine, your Load Balancer and Workers can communicate using the host.docker.internal domain. Example of a workers.json file when you're working on your machine:
[
"http://host.docker.internal:3000"
]
This works out of the box, as host.docker.internal is resolved by Docker.
That said, host.docker.internal does not resolve by default on a Linux production machine, which is why the Load Balancer and Andera Workers use an andera-net network. In this configuration, use the following syntax in production:
[
"http://service-name:3000"
]
Where service-name is the Docker Compose service name.
Worker Management
The Load Balancer supports dynamic, hot-reloadable management of Workers:
- Hot Reload: The Load Balancer periodically reloads the workers.json file (every WORKER_DISCOVERY_INTERVAL ms) and re-discovers all Workers. You can add or remove Workers at runtime without restarting the Load Balancer.
- Adding a Worker: Add its URL to workers.json. The Load Balancer will discover and include it in routing if it is healthy and contract-valid.
- Removing a Worker: Remove its URL from workers.json. The Load Balancer will stop routing new tasks to it as soon as it is no longer in the list.
- Best Practice: Before removing a Worker, set it to maintenance mode by calling its /off endpoint. This ensures the Worker finishes any tasks in progress and does not accept new ones. Once all tasks are complete, you can safely remove it from workers.json.
- Status and Health: The Load Balancer continuously monitors each Worker's status via /health and WebSocket. Workers that are in maintenance mode or unreachable are excluded from routing until they are back online or set to /on.
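
The add/remove behavior on each reload can be pictured as a set difference between the previously known Worker list and the freshly read one. The Python sketch below shows that idea only; `diff_workers` is a hypothetical name, and the Load Balancer's real reload logic is not shown in this page.

```python
from typing import Iterable, List, Tuple


def diff_workers(
    current: Iterable[str], reloaded: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Compare two Worker lists and return (added, removed) URLs."""
    current_set, reloaded_set = set(current), set(reloaded)
    added = sorted(reloaded_set - current_set)      # newly listed: discover them
    removed = sorted(current_set - reloaded_set)    # no longer listed: stop routing
    return added, removed


before = ["http://worker-a:3000", "http://worker-b:3000"]
after = ["http://worker-b:3000", "http://worker-c:3000"]
print(diff_workers(before, after))
# (['http://worker-c:3000'], ['http://worker-a:3000'])
```

A Worker that appears in both lists is left untouched, which is why editing workers.json does not interrupt Workers that stay in the cluster.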
Groups and Contracts
Workers behind the Load Balancer are organized by group and contract version:
- Group (GROUP): Each Worker declares a group (e.g., default, screenshot, translate). A group represents a logical set of Workers able to process the same type of tasks.
- Contract (CONTRACT): Each Worker declares a contract version (integer). A contract defines the interface (available functions and their parameters) for a group.
Consistency Constraints
- All Workers in the same group and contract version must expose exactly the same interface (functions and parameters).
- The Load Balancer fetches the interface of each Worker (via /health) and validates that all Workers for a given group/contract are consistent.
- If a Worker in a group/contract exposes a different interface, it is marked as invalid and excluded from routing.
- When submitting a task, the Load Balancer only routes to valid Workers matching the requested group and contract.
- If no valid Worker is available for a group/contract, the task is rejected with an error.
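
The routing rule in the last two points can be sketched as a simple filter. This Python example is illustrative only: the `Worker` fields, the `NoWorkerAvailable` exception, and `eligible_workers` are hypothetical names, not part of the Load Balancer's API.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Worker:
    url: str
    group: str
    contract: int
    valid: bool    # passed the interface consistency check
    online: bool   # reachable and not in maintenance mode


class NoWorkerAvailable(Exception):
    """Raised when no valid Worker matches the requested group/contract."""


def eligible_workers(workers: Iterable[Worker], group: str, contract: int) -> List[Worker]:
    """Return the Workers a task for (group, contract) may be routed to."""
    matches = [
        w for w in workers
        if w.group == group and w.contract == contract and w.valid and w.online
    ]
    if not matches:
        raise NoWorkerAvailable(f"no valid Worker for group={group!r} contract={contract}")
    return matches


pool = [
    Worker("http://w1:3000", "screenshot", 1, valid=True, online=True),
    Worker("http://w2:3000", "screenshot", 1, valid=False, online=True),
    Worker("http://w3:3000", "translate", 1, valid=True, online=True),
]
print([w.url for w in eligible_workers(pool, "screenshot", 1)])
# ['http://w1:3000']
```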
What happens if contracts mismatch?
- If Workers in the same group declare the same contract version but expose different functions or parameters, they are excluded from routing.
- This ensures that all tasks for a group/contract are processed consistently and safely.
- Best practice: When changing the interface of a group, increment the contract version (CONTRACT=2, etc.) on all relevant Workers.
Example
- Three Workers in group screenshot, contract 1, all expose the same function screenshot.
- One Worker is updated to add a parameter but keeps CONTRACT=1.
- The Load Balancer marks this Worker as invalid and does not route tasks to it until its interface matches the others or the contract version is incremented.
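
The scenario above can be reproduced with a small Python sketch. It makes one important assumption that this page does not confirm: the most common interface within a group/contract is treated as the reference, and Workers that differ from it are flagged. The function name `find_invalid`, the interface representation, and the parameter names `url` and `fullPage` are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, FrozenSet, List, Tuple

# An interface is modeled as a set of (function name, parameter names) pairs.
Interface = FrozenSet[Tuple[str, Tuple[str, ...]]]


def find_invalid(workers: Dict[str, Tuple[str, int, Interface]]) -> List[str]:
    """Return URLs of Workers whose interface differs from their group/contract peers.

    Assumption: the most common interface in each (group, contract) bucket
    is the reference. How ties are resolved is left unspecified here.
    """
    buckets = defaultdict(list)
    for url, (group, contract, interface) in workers.items():
        buckets[(group, contract)].append((url, interface))

    invalid: List[str] = []
    for members in buckets.values():
        counts = Counter(interface for _, interface in members)
        reference, _ = counts.most_common(1)[0]
        invalid.extend(url for url, interface in members if interface != reference)
    return sorted(invalid)


base: Interface = frozenset({("screenshot", ("url",))})
changed: Interface = frozenset({("screenshot", ("url", "fullPage"))})

cluster = {
    "http://w1:3000": ("screenshot", 1, base),
    "http://w2:3000": ("screenshot", 1, base),
    "http://w3:3000": ("screenshot", 1, changed),  # updated, but still CONTRACT=1
}
print(find_invalid(cluster))  # ['http://w3:3000']

# Incrementing the contract version puts the updated Worker in its own bucket.
cluster["http://w3:3000"] = ("screenshot", 2, changed)
print(find_invalid(cluster))  # []
```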