
Configuration

info

For detailed security recommendations (HTTPS, etc.), see the Deployment page.

All configuration for the Load Balancer is managed through the .env file.

Environment Variables

Below are the main environment variables used by the Load Balancer. All variables are set in your .env file.

| Variable | Description | Default |
| --- | --- | --- |
| PORT | HTTP port for the Load Balancer API. | 4000 |
| REDIS_HOST | Host of the Redis instance used by BullMQ. | localhost |
| REDIS_PORT | Port of the Redis instance used by BullMQ. | 6379 |
| LB_AUTH_KEY | Authentication key used for communication between the Load Balancer and Workers, and for calling the Load Balancer endpoints. | default-lb-auth-key |
| WORKER_DISCOVERY_INTERVAL | Interval (ms) for polling Workers for status. | 3000 |
| WORKER_DISCOVERY_TIMEOUT | Timeout (ms) for Worker discovery requests. | 3000 |
| WORKER_DISCOVERY_RETRIES | Number of retries for Worker discovery. | 10 |
| WS_HEARTBEAT_INTERVAL | Interval (ms) for WebSocket heartbeat pings. | 3000 |
| WS_HEARTBEAT_TIMEOUT | Timeout (ms) for WebSocket heartbeats. | 3000 |
| DASHBOARD_ENABLED | Enable or disable the dashboard interface at /dashboard. | true |
| BULL_BOARD_ENABLED | Enable or disable the BullMQ dashboard interface at /bull. | true |
| LOG_LEVEL | Log verbosity (info, warn, error, etc.). | info |
| MAX_LOGS | Number of log lines kept in memory (for /logs). | 1000 |
| TASK_TIMEOUT | Timeout (ms) for task execution. | 300000 |
| TASK_ATTEMPTS | Number of attempts for a task before it is marked as failed. | 3 |

Example .env

PORT=4000
REDIS_HOST=localhost
REDIS_PORT=6379
LB_AUTH_KEY=default-lb-auth-key
WORKER_DISCOVERY_INTERVAL=3000
WORKER_DISCOVERY_TIMEOUT=3000
WORKER_DISCOVERY_RETRIES=10
WS_HEARTBEAT_INTERVAL=3000
WS_HEARTBEAT_TIMEOUT=3000
DASHBOARD_ENABLED=true
BULL_BOARD_ENABLED=true
LOG_LEVEL=info
MAX_LOGS=1000
TASK_TIMEOUT=300000
TASK_ATTEMPTS=3

Worker List

General

  • The list of available Workers is managed in the file workers.json (at the project root).
  • Each entry is a URL (e.g., http://localhost:3000) pointing to a Worker instance.
  • This file is read at startup and can be hot-reloaded for dynamic cluster management.
[
"http://localhost:3000"
]
info

It is strongly recommended to use HTTPS for your Workers if they are exposed to the Internet (this is not necessary when the Load Balancer is the only component exposed). With HTTP, the Worker connects to the Load Balancer via ws://; with HTTPS, it connects via wss://.
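
For example, a Worker exposed on the Internet would be listed in workers.json with an https:// URL (the hostname below is illustrative), so that the WebSocket connection uses wss://:

[
"https://worker.example.com"
]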

Docker network

When running Docker on a Mac or Windows machine, the Load Balancer and Workers can communicate through the host.docker.internal hostname. Example of a workers.json file for local development on your machine:

[
"http://host.docker.internal:3000"
]

This works out of the box, as host.docker.internal is resolved by Docker.

However, this does not work on a Linux production machine, which is why the Load Balancer and Andera Workers share an andera-net Docker network. In that configuration, use the following syntax in production:

[
"http://service-name:3000"
]

Where service-name is the Docker Compose service name.
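
As an illustration, a minimal production docker-compose.yml using the andera-net network might look like the sketch below. The image names and the screenshot-worker service name are assumptions for the example; only the shared network and the service-name-based URL matter here.

networks:
  andera-net:

services:
  load-balancer:
    image: andera/load-balancer   # illustrative image name
    env_file: .env
    ports:
      - "4000:4000"
    networks:
      - andera-net

  screenshot-worker:
    image: andera/worker          # illustrative image name
    networks:
      - andera-net

With this layout, workers.json on the Load Balancer would contain "http://screenshot-worker:3000" (assuming the Worker listens on port 3000), and REDIS_HOST should point to your Redis service rather than localhost.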

Worker Management

The Load Balancer supports dynamic, hot-reloadable management of Workers:

  • Hot Reload: The Load Balancer periodically reloads the workers.json file (every WORKER_DISCOVERY_INTERVAL ms) and re-discovers all Workers. You can add or remove Workers at runtime without restarting the Load Balancer.
  • Adding a Worker: Add its URL to workers.json. The Load Balancer will discover and include it in routing if it is healthy and contract-valid.
  • Removing a Worker: Remove its URL from workers.json. The Load Balancer will stop routing new tasks to it as soon as it is no longer in the list.
  • Best Practice: Before removing a Worker, set it to maintenance mode by calling its /off endpoint. This ensures the Worker finishes any tasks in progress and does not accept new ones. Once all tasks are complete, you can safely remove it from workers.json (see the sketch after this list).
  • Status and Health: The Load Balancer continuously monitors each Worker's status via /health and WebSocket. Workers that are in maintenance mode or unreachable are excluded from routing until they are back online or re-enabled via /on.
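
Here is a minimal sketch of that drain-then-remove flow. It assumes the Worker's /off endpoint accepts a POST request authenticated with a Bearer token; the exact method and auth scheme of your Workers may differ.

// Maintenance-mode sketch: drain a Worker before removing it from workers.json.
// Assumptions (not confirmed above): /off accepts POST and is protected by the
// same auth key, sent as a Bearer token. Adjust to your actual setup.
const workerUrl = "http://localhost:3000"; // the Worker you want to retire
const authKey = process.env.LB_AUTH_KEY ?? "default-lb-auth-key";

async function drainWorker(): Promise<void> {
  // 1. Put the Worker in maintenance mode so it stops accepting new tasks.
  const res = await fetch(`${workerUrl}/off`, {
    method: "POST",
    headers: { Authorization: `Bearer ${authKey}` },
  });
  if (!res.ok) {
    throw new Error(`Failed to set maintenance mode: ${res.status}`);
  }
  console.log("Worker is draining; wait for in-flight tasks to finish.");
  // 2. Once the Worker has no tasks in progress, remove its URL from
  //    workers.json; the Load Balancer drops it on the next reload
  //    (every WORKER_DISCOVERY_INTERVAL ms).
}

drainWorker().catch(console.error);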

Groups and Contracts

Workers behind the Load Balancer are organized by group and contract version:

  • Group (GROUP): Each Worker declares a group (e.g., default, screenshot, translate). A group represents a logical set of Workers able to process the same type of tasks.
  • Contract (CONTRACT): Each Worker declares a contract version (integer). A contract defines the interface (available functions and their parameters) for a group.
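
For example, a Worker serving the screenshot group on its first contract version would declare, in its own environment (values illustrative):

GROUP=screenshot
CONTRACT=1

All Workers sharing these two values must expose exactly the same interface, as described below.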

Consistency Constraints

  • All Workers in the same group and contract version must expose exactly the same interface (functions and parameters).
  • The Load Balancer fetches the interface of each Worker (via /health) and validates that all Workers for a given group/contract are consistent.
  • If a Worker in a group/contract exposes a different interface, it is marked as invalid and excluded from routing.
  • When submitting a task, the Load Balancer only routes to valid Workers matching the requested group and contract.
  • If no valid Worker is available for a group/contract, the task is rejected with an error.

What happens if contracts mismatch?

  • If Workers in the same group declare the same contract version but expose different functions or parameters, they are excluded from routing.
  • This ensures that all tasks for a group/contract are processed consistently and safely.
  • Best practice: When changing the interface of a group, increment the contract version (CONTRACT=2, etc.) on all relevant Workers.

Example

  1. Three Workers in group screenshot, contract 1, all expose the same function screenshot.
  2. One Worker is updated to add a parameter but keeps CONTRACT=1.
  3. The Load Balancer marks this Worker as invalid and does not route tasks to it until its interface matches the others or the contract version is incremented.
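
To make step 2 concrete, here is a purely hypothetical sketch of the interfaces the Load Balancer might compare (the real /health response format is not shown here; the parameter names are invented for illustration):

Workers 1 and 2 (contract 1, valid):
{ "screenshot": ["url"] }

Worker 3 (contract 1, invalid: extra parameter under the same contract version):
{ "screenshot": ["url", "fullPage"] }

Because Worker 3's interface no longer matches its peers for group screenshot / contract 1, it is excluded from routing, exactly as in step 3.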