Enable Shared Elasticsearch¶
Grove supports deploying an Elasticsearch cluster that is shared between all instances.
The Open edX platform uses Elasticsearch (ES) in multiple ways: searching for courses, searching forum posts, course discovery, and so on. For each deployment, Tutor
adds an ES pod which uses about 1.3 GB of RAM. For operators running multiple instances,
this isn't ideal, as a large chunk of memory ends up dedicated to idle processes.
How it works¶
Elasticsearch indices are unique, and the edx-platform
assumes that its indices are dedicated to the currently running instance.
This means that in order to run multiple instances, we need to make sure there are no collisions between index names.
Grove does this by modifying the underlying platforms to use prefixed indices. For example, an instance will assume
it's writing to the courses
index, but the underlying code will route the request to {instance-name}-courses
instead.
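The prefixing behaviour can be sketched as follows. This is a minimal illustration only; the helper name and the instance name are hypothetical, not Grove's actual code:

```python
# Hypothetical sketch of index-name prefixing, not Grove's actual implementation.

def prefixed_index(instance_name: str, index: str) -> str:
    """Route a logical index name to the instance-specific physical index."""
    return f"{instance_name}-{index}"

# The platform believes it is writing to the "courses" index ...
physical = prefixed_index("my-instance", "courses")
# ... but the request is actually routed to "my-instance-courses".
print(physical)
```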
Turning it on¶
To enable the feature, add the following to your cluster.yml:
TF_VAR_enable_shared_elasticsearch: "true"
Then apply the changes with Terraform:
cd control
./tf plan
./tf apply
For each instance that should use the shared cluster, add the block below to its config.yml. The ELASTIC_*
variables will be overridden by Grove:
RUN_ELASTICSEARCH: false
GROVE_ENABLE_SHARED_ELASTICSEARCH: true
ELASTICSEARCH_HOST: set-via-environment
ELASTICSEARCH_HTTP_AUTH: set-via-environment
ELASTICSEARCH_INDEX_PREFIX: set-via-environment
ELASTICSEARCH_CA_CERT_PEM: set-via-environment
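At runtime, the set-via-environment placeholders above are replaced with values that Grove injects through the environment. As a rough sketch of how a service could assemble its connection settings from those variables (the helper function itself is hypothetical; only the variable names come from the config above):

```python
import os

# Hypothetical sketch: read the Grove-injected environment variables and
# assemble a plain connection-settings dict. The variable names mirror the
# config.yml keys; the helper and the auth format (user:password) are assumptions.
def es_config_from_env() -> dict:
    user, _, password = os.environ["ELASTICSEARCH_HTTP_AUTH"].partition(":")
    return {
        "host": os.environ["ELASTICSEARCH_HOST"],
        "http_auth": (user, password),
        "index_prefix": os.environ["ELASTICSEARCH_INDEX_PREFIX"],
        "ca_cert_pem": os.environ["ELASTICSEARCH_CA_CERT_PEM"],
    }
```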
Configuring¶
Grove exposes a shared_elasticsearch
configuration hash which has the following keys:

- heap_size: Defaults to 2g. The minimum and maximum heap size that will be used for each node in the cluster.
- replicas: Defaults to 2. The number of replicas to create for the cluster. Do not set this number higher than your number of nodes, as each pod is meant to run on a dedicated node.
- search_queue_size: Defaults to 5000. This number determines the throughput (and resource usage) of your cluster. It should be increased only if there's more CPU available.
- cpu_limit: Defaults to 2000m.
- memory_limit: Defaults to 4Gi.
Example configuration in cluster.yml:
TF_VAR_elasticsearch_config: |
heap_size: "5g"
memory_limit: "10Gi"