Elasticsearch Notes

Elasticsearch is made up of two components: Apache Lucene, the underlying search library, and Elasticsearch itself, the distributed layer built on top of it.

You need to understand how both work.

Lucene: #

Index Merges #

This video shows how index merges occur:
Indexing Mediawiki

Basically, when enough similarly sized segments accumulate, Lucene merges them into a single larger segment; the merge also vacuums out deleted documents and removes the old segments.

Source
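To watch merging happen on a live cluster, you can list an index's segments and watch many small segments collapse into fewer large ones as you index. A minimal example; the index name myindex is hypothetical:

curl -XGET 'localhost:9200/_cat/segments/myindex?v&h=shard,segment,docs.count,docs.deleted,size'

Run it before and after a heavy bulk indexing job: docs.deleted drops to zero for freshly merged segments, since merging is when deleted documents are actually purged.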

Memory Pressure/Heap: #

If you monitor the total memory used by the JVM, you will typically see a sawtooth pattern where the memory usage steadily increases and then drops suddenly.

Sawtooth
The reason for this sawtooth pattern is that the JVM continuously needs to allocate memory on the heap as new objects are created as part of normal program execution. Most of these objects, however, are short-lived and quickly become available for collection by the garbage collector. When the garbage collector finishes, you’ll see a drop on the memory usage graph. This constant state of flux makes the current total memory usage a poor indicator of memory pressure.

The JVM garbage collector is designed to draw on the fact that most objects are short-lived. There are separate pools in the heap for new objects and old objects, and these pools are garbage collected separately. After a collection of the new objects pool, surviving objects are moved to the old objects pool, which is garbage collected less frequently, since there is less likely to be any substantial amount of garbage there. If you monitor each of these pools separately, you will see the same sawtooth pattern, but the old pool is fairly steady while the new pool frequently moves between full and empty. This is why we have based our memory pressure indicator on the fill rate of the old pool.
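You can watch these pools on a running node; the JVM section of the node stats API breaks heap usage down into the young, survivor, and old pools:

curl -XGET 'localhost:9200/_nodes/stats/jvm?pretty'

In the response, compare jvm.mem.pools.old.used_in_bytes against jvm.mem.pools.old.max_in_bytes: a persistently high old-pool fill rate signals real memory pressure, while constant churn in the young pool is normal.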

This is bad, heap is too large:
Notice the gaps between the teeth
If the heap is too large, the application will be prone to infrequent long latency spikes from full-heap garbage collections. Infrequent long pauses impact end-user experience as these pauses increase the tail of the latency distribution; user requests will sometimes see unacceptably-long response times. Long pauses are especially detrimental to a distributed system like Elasticsearch because a long pause is indistinguishable from a node that is unreachable because it is hung, or otherwise isolated from the cluster. During a stop-the-world pause, no Elasticsearch server code is executing: it doesn’t call, it doesn’t write, and it doesn’t send flowers.

In the case of an elected master, a long garbage collection pause can cause other nodes to stop following the master and elect a new one. In the case of a data node, a long garbage collection pause can lead to the master removing the node from the cluster and reallocating the paused node’s assigned shards. This increases network traffic and disk I/O across the cluster, which hampers normal load. Long garbage collection pauses are a top issue for cluster instability.

This is bad, heap is too small:
Baaaaad
If the heap is too small, the application will be prone to out-of-memory errors. While that is the most serious risk from an undersized heap, there are additional problems that can arise from a heap that is too small. A heap that is too small relative to the application’s allocation rate leads to frequent small latency spikes and reduced throughput from constant garbage collection pauses. Frequent short pauses impact end-user experience as these pauses effectively shift the latency distribution and reduce the number of operations the application can handle. For Elasticsearch, constant short pauses reduce the number of indexing operations and queries per second that can be handled. A small heap also reduces the memory available for indexing buffers, caches, and memory-hungry features like aggregations and suggesters.
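A common rule of thumb from Elastic's sizing guidance: give the heap about half the machine's RAM (leaving the rest for the filesystem cache), and keep it below the ~32 GB threshold where the JVM loses compressed object pointers. A sketch, assuming the ES_HEAP_SIZE environment variable honoured by the Elasticsearch 1.x/2.x startup scripts:

# On a 16 GB machine, give Elasticsearch an 8 GB heap
export ES_HEAP_SIZE=8g

# Sanity-check that compressed oops are still in effect at a given heap size
java -Xmx8g -XX:+PrintFlagsFinal -version 2>/dev/null | grep UseCompressedOops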

Memory Pressure
Understanding HEAP
Sizing ES

Elasticsearch #

Node Types: #

Client Node:
node.data: false and node.master: false

Master Node:
node.data: false and node.master: true

Data Node:
node.data: true and node.master: false

These are set in elasticsearch.yml, as sketched below.
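A minimal elasticsearch.yml sketch for each role, assuming the node.master/node.data boolean settings of Elasticsearch 1.x and 2.x:

# Dedicated master node: eligible for election, holds no data
node.master: true
node.data: false

# Data node: holds shards, never elected master
node.master: false
node.data: true

# Client node: only routes requests and aggregates results
node.master: false
node.data: false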

Sizing of ES #

Reindexing: #

Performance: #

Monitoring: #

curl -XGET localhost:9200/_cat

Explore! You can get a subset of the data by listing the headings you want with the h parameter:

curl -XGET 'localhost:9200/_cat/nodeattrs?h=host,ip,attr'

And show the column headings with v:

curl -XGET 'localhost:9200/_cat/nodeattrs?v'

Of course, there is ?help too, which lists every heading an endpoint supports.
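A few _cat endpoints that are handy for day-to-day monitoring; these are standard endpoints, though the available columns vary a little by version:

# Cluster health at a glance: status, node count, unassigned shards
curl -XGET 'localhost:9200/_cat/health?v'

# Per-index doc counts and on-disk size
curl -XGET 'localhost:9200/_cat/indices?v'

# Where every shard lives, and whether any are unassigned
curl -XGET 'localhost:9200/_cat/shards?v'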

Recommended software:
Monitoring:

IDE:

Best Practices: #

 