Elasticsearch
Index
An index is composed of one or more shards.
An index can be compared to a database: it contains a schema (called a mapping) and documents.
To optimize an index, there are two parameters to tune:
- Mapping, to reduce the index size
- Shards, which affect disk usage and access time
N.B.: To delay shard reallocation when a node leaves the cluster (e.g. during maintenance):
PUT _all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}
Shards
There are two types of shards:
- Primary
- Replica
Before Elasticsearch 7.0, an index is created by default with 5 primary shards and 1 replica (since 7.0, the default is 1 primary shard and 1 replica).
With 5 primaries, the index is split into 5 parts across the cluster nodes.
One replica means that each primary shard is replicated once, so you get 5 more shards.
Your index will therefore have 10 shards in total.
The default values are not always the best choice; pick them according to your needs.
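The shard count above is simple arithmetic and can be checked with a short sketch. The function name `total_shards` is hypothetical and exists only for illustration; the values 5 and 1 are the ones used in this section.

```python
def total_shards(primaries: int, replicas: int) -> int:
    """Return the total number of shards (primaries plus replica copies) of one index."""
    if primaries < 1 or replicas < 0:
        raise ValueError("primaries must be >= 1 and replicas must be >= 0")
    # Every primary shard exists once, plus `replicas` copies of it.
    return primaries * (1 + replicas)


print(total_shards(5, 1))  # 5 primaries + 5 replica shards -> 10
```

The same function makes it easy to see how quickly the shard count grows when either setting is raised.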
Mapping
Mapping is a description of the index content (schema).
By default, Elasticsearch maps string fields with the type text.
This is not optimal: the whole content of a text field is analyzed and indexed, which makes the index larger and searches much slower.
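One way to avoid analyzing every string is to declare exact-value fields as keyword in an explicit mapping. The sketch below only builds the JSON body of a create-index request and sends nothing. All field names and types are illustrative assumptions, and the body layout assumes Elasticsearch 7 or later, where mappings no longer contain a type name.

```python
import json


def build_mapping() -> dict:
    """Build an explicit mapping body (field names are hypothetical examples)."""
    return {
        "mappings": {
            "properties": {
                # Exact-value strings: stored as-is, not analyzed.
                "hostname": {"type": "keyword"},
                "metric_type": {"type": "keyword"},
                # Numeric metric value.
                "max_tot_cpu_usage_ps": {"type": "float"},
                # Free text that really needs full-text search stays as text.
                "message": {"type": "text"},
            }
        }
    }


print(json.dumps(build_mapping(), indent=2))
```

The printed body could then be sent with a PUT request on the index name when creating the index.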
Best practice
- It is advised to keep primary shards at 10 GB maximum
- The maximum number of shards per node should be between 600 and 750 (for a heap size of 30 GB)
- Heap size should not exceed 32 GB
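The 10 GB guideline translates into a small calculation of how many primary shards an index of a given expected size needs. This is a minimal sketch: `primaries_needed` is a hypothetical helper, and the 45 GB figure is an example value, not a measurement.

```python
import math


def primaries_needed(index_size_gb: float, max_shard_gb: float = 10.0) -> int:
    """Return the smallest number of primary shards that keeps each shard at or under max_shard_gb."""
    if index_size_gb < 0 or max_shard_gb <= 0:
        raise ValueError("index_size_gb must be >= 0 and max_shard_gb must be > 0")
    # Round up so that no shard exceeds the limit; an index always has at least one primary.
    return max(1, math.ceil(index_size_gb / max_shard_gb))


print(primaries_needed(45))  # 45 / 10 = 4.5, rounded up -> 5
```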
Maintenance
Disable shard reallocation:
curl -XPUT 'http://node:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}'
Then re-enable it:
curl -XPUT 'http://node:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}'
Timelion
Example query:
.es(q='hostname.keyword:* AND metric_type.keyword: KPI_CPU', offset=-1h,index=logstash-*, split=instance.keyword:105, metric='avg:max_tot_cpu_usage_ps').fit(average).label("$1", "^.* > instance.keyword:(\S+) > .*").title('oracle - metrics').legend(se)
Snapshot & restore
List the existing repositories:
GET _snapshot/
Create a new repository (the location must be listed under path.repo in elasticsearch.yml on every node):
PUT _snapshot/kpi_snapshot
{
  "type": "fs",
  "settings": {
    "location": "/opt/elk/backups"
  }
}
Create a snapshot of specific indices:
PUT /_snapshot/kpi_snapshot/snapshot_1?wait_for_completion=true
{
  "indices": "index1,index2",
  "ignore_unavailable": true,
  "include_global_state": false
}
To restore it elsewhere, copy the kpi folder to the other server.
Then create the same backup repository there and restore the snapshot:
POST /_snapshot/kpi_snapshot/snapshot_1/_restore