Wednesday, 15 January 2014

alias - Faking Index per User: is having many aliases bad? -


On the Elasticsearch website it is recommended that, when you have many (thousands of) customers, you use a single index with filtered aliases to keep each customer's data invisibly separated.

I have heard someone say that this is not good practice because aliases are part of the cluster state.

Why does this matter? This is the first time I've heard of it.

There is nothing wrong with aliases per se. Aliases are very lightweight: when you create an alias, Elasticsearch looks up the index and puts an "alias tag" on that index.

When you run a search against an alias, if there is no index with a matching name, Elasticsearch checks the tagged aliases and uses the underlying index. The whole process is very lightweight, so from a search perspective there is really no problem with having many aliases.
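As a rough illustration of the per-user setup being discussed, the sketch below builds a request body for the `_aliases` endpoint that adds one filtered alias per user on a shared index. The index name `customers`, the field `user_id`, and the `user_<id>` naming scheme are assumptions made up for this example, not something the original post specifies.

```python
import json


def build_alias_actions(index, user_ids, field="user_id"):
    """Build a body for POST /_aliases with one filtered alias per user.

    The alias naming scheme and the field name are hypothetical.
    """
    actions = []
    for user_id in user_ids:
        actions.append({
            "add": {
                "index": index,
                "alias": f"user_{user_id}",
                # Searches through the alias only see this user's documents.
                "filter": {"term": {field: user_id}},
                # Optional: route a user's documents to a single shard.
                "routing": str(user_id),
            }
        })
    return {"actions": actions}


body = build_alias_actions("customers", [12, 34])
print(json.dumps(body, indent=2))
```

Sending `body` to the cluster would tag the `customers` index with both aliases in one request; every alias added this way becomes part of the cluster state.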

The note about cluster state, however, is valid (sorta). Millions of aliases (or millions of fields, etc.) will bloat the cluster state. Whenever there is a change, the cluster state is published to all the nodes, which guarantees that every node can respond to any request.

The problem is that if your cluster state becomes enormous (hundreds of megabytes, etc.), the physical work of publishing it to the cluster becomes non-negligible. Imagine publishing 800 MB to each of 100 nodes. There is also a fixed CPU cost on the master, which becomes a problem.
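To put a number on that, here is a back-of-the-envelope calculation. It assumes, purely for illustration, a cluster state of 800 MB sent in full and uncompressed to each of 100 nodes on every change; real clusters will differ.

```python
MB = 10 ** 6
GB = 10 ** 9


def publish_traffic_bytes(state_bytes, node_count):
    """Bytes the master sends if the full state goes to every node once."""
    return state_bytes * node_count


total = publish_traffic_bytes(800 * MB, 100)
print(f"{total / GB:.1f} GB sent per cluster state change")  # prints 80.0 GB ...
```

Under those assumptions a single change, such as adding one alias, moves about 80 GB across the network.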

In practice, there are lots of tricks to keep this down: diffs of the cluster state, compression, batching, etc. But fundamentally the cluster state represents a bottleneck that can become a problem if the state grows very large.

In the real world, few people ever run into this problem, because it takes a really large number of fields / aliases / indices / analyzers to actually bloat the cluster state to such a size.

If you are concerned about this, you can keep an eye on the pending tasks API. Pending tasks shows all cluster-level tasks that are queued for processing on the master node. It should almost always be empty, because the master is rarely backed up, but if you see this queue growing (and high load on your master), then you may have a cluster state problem.
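A minimal sketch of such a check follows. It calls `GET /_cluster/pending_tasks` and summarizes the `tasks` list in the response. The base URL, the helper names, and both thresholds are assumptions chosen for the example, not recommended values.

```python
import json
import urllib.request


def fetch_pending_tasks(base_url="http://localhost:9200"):
    """Call GET /_cluster/pending_tasks and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/_cluster/pending_tasks") as resp:
        return json.load(resp)


def summarize_pending_tasks(response, max_tasks=10, max_wait_millis=5000):
    """Summarize a pending-tasks response; the thresholds are arbitrary."""
    tasks = response.get("tasks", [])
    longest_wait = max(
        (task.get("time_in_queue_millis", 0) for task in tasks), default=0
    )
    return {
        "count": len(tasks),
        "longest_wait_millis": longest_wait,
        # Flag a queue that is long or whose oldest task has waited too long.
        "suspicious": len(tasks) > max_tasks or longest_wait > max_wait_millis,
    }


# Example with a hand-written response instead of a live cluster.
sample = {"tasks": [{"source": "put-mapping", "time_in_queue_millis": 9000}]}
print(summarize_pending_tasks(sample))
```

Running `summarize_pending_tasks(fetch_pending_tasks())` on a schedule and watching whether `count` keeps rising is one simple way to spot the growing queue described above.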

