Elasticsearch

Index

An index is composed of one or more shards.
An index can be compared to a database: it contains a schema (called a mapping) and documents.

To optimize an index, there are two parameters to tune:

Mapping, to optimize index size
Shards, to improve disk usage and access time

N.B.: To delay shard reallocation when a node leaves the cluster (e.g. during maintenance):

PUT _all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}

Shards

There are two types of shards:

Primary
Replica

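Before Elasticsearch 7.0, an index is created by default with 5 primary shards and 1 replica (since 7.0 the default is 1 primary shard and 1 replica).
With 5 primary shards, the index is split into 5 parts across the cluster nodes.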

One replica means that each primary shard is replicated once, so you get 5 more shards.
The index therefore has 10 shards in total.
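
As a quick check of the arithmetic above, here is a minimal Python sketch. The function name total_shards is hypothetical; it only illustrates the count and does not query a cluster.

```python
def total_shards(primaries: int, replicas: int) -> int:
    """Total shard copies for one index: each primary plus its replicas."""
    if primaries < 1 or replicas < 0:
        raise ValueError("need at least 1 primary and a non-negative replica count")
    return primaries * (1 + replicas)


# 5 primaries with 1 replica each, as in the example above
print(total_shards(5, 1))  # 10
```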

Default values are not always the best choice; size them according to your needs.

Mapping

Mapping is a description of the index content (schema).
By default, Elasticsearch maps string fields dynamically as type text (with a keyword sub-field).
This is not optimal: text fields are analyzed for full-text search, so all content is indexed, which makes the index larger and searches much slower.
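
To avoid relying on dynamic mapping, fields can be declared explicitly. The sketch below builds such a request body in Python. It assumes Elasticsearch 7 or later (typeless mappings). The field names are borrowed from the Timelion query later in these notes; the float type for max_tot_cpu_usage_ps and the helper name build_mapping are assumptions made for illustration.

```python
import json


def build_mapping(keyword_fields, float_fields):
    """Return an index-creation body that maps the given fields explicitly."""
    properties = {}
    for name in keyword_fields:
        # keyword: stored as an exact value, not analyzed for full-text search
        properties[name] = {"type": "keyword"}
    for name in float_fields:
        properties[name] = {"type": "float"}
    return {"mappings": {"properties": properties}}


body = build_mapping(
    keyword_fields=["hostname", "metric_type", "instance"],
    float_fields=["max_tot_cpu_usage_ps"],  # ASSUMPTION: numeric metric
)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of the index-creation request. With such a mapping, the fields would be queried as hostname rather than hostname.keyword.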

Best practice

It is advised to keep primary shards at 10 GB maximum
The maximum number of shards per node should be between 600 and 750 (for a heap size of 30 GB)
Heap size should not exceed 32 GB
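
These guidelines can be turned into a rough sizing helper. The Python sketch below is only an illustration: the function names are hypothetical, data is assumed to spread evenly across shards, and the linear scaling of the 600 to 750 range with heap size is an assumption, not something the guideline states.

```python
import math
from typing import Tuple

MAX_PRIMARY_SHARD_GB = 10  # guideline above: primary shards of 10 GB maximum
MAX_HEAP_GB = 32           # guideline above: heap should not exceed 32 GB


def primary_shards_needed(index_size_gb: float) -> int:
    """Smallest number of primary shards keeping each one at or under 10 GB."""
    return max(1, math.ceil(index_size_gb / MAX_PRIMARY_SHARD_GB))


def shard_budget(heap_gb: float) -> Tuple[int, int]:
    """Low and high shard counts per node for a given heap size.

    ASSUMPTION: the 600-750 range at 30 GB scales linearly,
    i.e. 20 to 25 shards per GB of heap.
    """
    if heap_gb <= 0 or heap_gb > MAX_HEAP_GB:
        raise ValueError("heap_gb must be greater than 0 and at most 32")
    return int(heap_gb * 20), int(heap_gb * 25)


print(primary_shards_needed(45))  # 5
print(shard_budget(30))           # (600, 750)
```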

Maintenance

Disable reallocation:

curl -XPUT 'http://node:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}'

Then re-enable it:

curl -XPUT 'http://node:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}'
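
The two calls differ only in the value of cluster.routing.allocation.enable, so they can be generated by one helper. The Python sketch below builds the request without sending it. The helper name allocation_request is hypothetical, and it deliberately accepts only the two values used in these notes.

```python
import json
import urllib.request


def allocation_request(base_url: str, mode: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that sets shard allocation."""
    if mode not in ("all", "none"):
        raise ValueError("mode must be 'all' or 'none'")
    body = {"transient": {"cluster.routing.allocation.enable": mode}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


disable = allocation_request("http://node:9200", "none")  # before maintenance
enable = allocation_request("http://node:9200", "all")    # after maintenance
print(disable.get_method(), disable.full_url)
# To actually send a request: urllib.request.urlopen(disable)
```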

Timelion

Example query:

.es(q='hostname.keyword:* AND metric_type.keyword: KPI_CPU', offset=-1h, index=logstash-*, split=instance.keyword:105, metric='avg:max_tot_cpu_usage_ps').fit(average).label("$1", "^.* > instance.keyword:(\S+) > .*").title('oracle - metrics').legend(se)

Snapshot & restore

List the registered repositories:

GET _snapshot/

Register a new one (the location must be listed under path.repo in elasticsearch.yml):

PUT _snapshot/kpi_snapshot
{
    "type" : "fs",
    "settings" : {
      "location" : "/opt/elk/backups"
    }
}

Create a snapshot of specific indices:

PUT /_snapshot/kpi_snapshot/snapshot_1?wait_for_completion=true
{
  "indices": "index1,index2",
  "ignore_unavailable": true,
  "include_global_state": false
}

To restore it elsewhere, copy the kpi folder to the other server.
Register the same backup repository there, then restore the snapshot:

POST /_snapshot/kpi_snapshot/snapshot_1/_restore
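
Without a body, the restore request restores every index in the snapshot. The Python sketch below builds an optional request body that restores only selected indices and, if wanted, renames them so they do not collide with open indices of the same name. The helper name restore_body and the restored_ prefix are assumptions; the keys are those of the snapshot restore API.

```python
import json


def restore_body(indices, prefix=None):
    """Build a restore request body limited to the given indices."""
    body = {
        "indices": ",".join(indices),
        "ignore_unavailable": True,
        "include_global_state": False,
    }
    if prefix:
        # Capture the whole index name and prepend the prefix on restore
        body["rename_pattern"] = "(.+)"
        body["rename_replacement"] = prefix + "$1"
    return body


print(json.dumps(restore_body(["index1", "index2"], prefix="restored_"), indent=2))
```

The printed JSON could be sent as the body of the _restore request above.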