How Elasticsearch Uses Apache Lucene: A Look Through the refresh/flush APIs

Goal: understand how Elasticsearch uses Apache Lucene.

 

The relationship between Elasticsearch and Apache Lucene

Apache Lucene is an open-source Java library for indexing and searching documents; it is a powerful text search engine. (https://lucene.apache.org/)

Elasticsearch, on the other hand, is a distributed search engine built on top of Lucene. To make Lucene easier to use, it exposes a JSON-based RESTful HTTP API for indexing and searching documents.

 

For example, you search with GET, index documents with POST/PUT, and delete documents with DELETE.
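
As a rough illustration of those verbs, the sketch below builds (but does not send) one request of each kind with Python's standard library. The node address `http://localhost:9200`, the index name `my-index`, and the helper `build_request` are assumptions made for this example only.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumed address of a local node
INDEX = "my-index"                  # hypothetical index name


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the Elasticsearch REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# PUT indexes a document, GET searches, DELETE removes the document.
index_req = build_request("PUT", f"/{INDEX}/_doc/1", {"title": "hello lucene"})
search_req = build_request("GET", f"/{INDEX}/_search", {"query": {"match": {"title": "lucene"}}})
delete_req = build_request("DELETE", f"/{INDEX}/_doc/1")

for req in (index_req, search_req, delete_req):
    print(req.get_method(), req.full_url)
```

To send one of these against a running cluster, pass it to `urllib.request.urlopen`.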

 

Beyond that, Elasticsearch provides high availability. It recommends building a cluster with at least three nodes, because even if some nodes fail, the replicas stored on other nodes let the service keep running without interruption.

 

It also scales horizontally with little effort. When there are more documents to index or more search requests to serve, you can scale out by adding nodes to the cluster. When a new node joins, Elasticsearch automatically handles copying or moving data from the existing nodes.

 

Plenty of posts you can find by googling cover the many advantages of Elasticsearch, so I will stop here and sum up briefly.

Elasticsearch automates cluster and node management to give clients a simple operating environment, while Lucene is responsible for the core search engine functionality.

 

That is why understanding Elasticsearch requires knowing Lucene.

 

Lucene's flush

  1. When a document indexing request arrives, Lucene analyzes the document and builds an inverted index.
  2. The inverted index is held temporarily in an in-memory buffer. Writing it to disk produces a Lucene index file called a segment, and these segments are what Lucene searches.
  3. Lucene then write()s these segments to disk. (At this point the OS places the segments not on the disk itself but in the page cache (filesystem cache), a part of memory that the OS manages.)

This process is called Lucene's flush.

💡 What is an inverted index?
A data structure that Lucene uses to turn documents into searchable data.
An "inverted index" is the data structure that Lucene uses to make data searchable
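
To make the idea concrete, here is a minimal sketch of an inverted index in Python. It is a toy, not Lucene's actual data structure: the whitespace tokenizer and the function name `build_inverted_index` are assumptions chosen for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the sorted ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():   # naive whitespace "analyzer"
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


docs = {
    1: "Lucene is a search library",
    2: "Elasticsearch is built on Lucene",
}
index = build_inverted_index(docs)
print(index["lucene"])         # [1, 2]
print(index["elasticsearch"])  # [2]
```

A search for a term becomes a dictionary lookup instead of a scan over every document.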

 

💡 What is the page cache?
The page cache, sometimes also called the filesystem cache, is memory that the operating system uses to temporarily hold disk data. In other words, the OS uses memory as a cache so that data can be read and processed faster, without disk I/O.

Elasticsearch relies heavily on the filesystem cache to improve search performance. (Elasticsearch heavily relies on the filesystem cache in order to make search fast)

https://www.elastic.co/blog/elasticsearch-caching-deep-dive-boosting-query-speed-one-cache-at-a-time

 

In other words, Lucene's flush writes data to the page cache, and searches then read that data from memory, which improves search performance.

 

For this reason, Elasticsearch recommends setting the JVM heap to no more than half of the total system memory.
The other half is left for the operating system to use as page cache. (Set JVM heap size)

 

In other words, setting the JVM heap to no more than half of the available memory does not waste memory; the rest is put to use as page cache.

 

Also, as the kernel documentation states, the page cache works on an LRU (Least Recently Used) basis, so data that has gone unused the longest is evicted first. (Elasticsearch Page Cache)
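
The LRU idea can be sketched in a few lines of Python. This is only a toy model of the eviction policy, not how the kernel implements its page cache; the class `ToyPageCache` and the fake `load` function are made up for the example.

```python
from collections import OrderedDict


class ToyPageCache:
    """Fixed-capacity cache that evicts the least recently used page."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()   # page id -> data, least recently used first

    def read(self, page_id, load_from_disk):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)      # mark as most recently used
            return self.pages[page_id], "hit"
        data = load_from_disk(page_id)           # slow path: simulated disk read
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)       # evict the least recently used page
        return data, "miss"


def load(page_id):
    return f"segment-bytes-{page_id}"            # stand-in for a real disk read


cache = ToyPageCache(capacity=2)
results = [cache.read(page, load)[1] for page in ["a", "b", "a", "c", "b"]]
print(results)   # ['miss', 'miss', 'hit', 'miss', 'miss']
```

Page "b" is evicted when "c" arrives because "a" was touched more recently, so the final read of "b" misses again.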

 

Elasticsearch's refresh

Elasticsearch's refresh uses Lucene's flush to index the requested documents and make them searchable.

Document indexing process

  1. Elasticsearch receives the document and passes it to Lucene.
  2. Lucene indexes the document into an in-memory buffer (part of Lucene's flush).
  3. At the same time, Elasticsearch records the operation in the translog (the translog is explained later).
  4. Up to this step, the document is not searchable.

When documents become searchable: refresh

  1. When an Elasticsearch refresh occurs,
  2. Lucene turns the indexed data held in memory into a readable segment and writes it to the page cache (filesystem cache). (Part of Lucene's flush.)
  3. From this point on, the data (documents) is searchable.
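
The two lists above can be sketched as a toy model in Python. `ToyShard` is a hypothetical class written for this post, not an Elasticsearch or Lucene API; it only shows that buffered documents are invisible to search until a refresh turns them into a segment.

```python
class ToyShard:
    """Toy model: documents sit in an in-memory buffer until refresh() makes a segment."""

    def __init__(self):
        self.buffer = []      # indexed but not yet searchable
        self.segments = []    # each segment is an immutable tuple of documents

    def index(self, doc):
        self.buffer.append(doc)

    def refresh(self):
        if self.buffer:
            self.segments.append(tuple(self.buffer))   # new searchable segment
            self.buffer = []

    def search(self, word):
        # Only segments are searched; the buffer is ignored.
        return [doc for seg in self.segments for doc in seg if word in doc.split()]


shard = ToyShard()
shard.index("hello lucene")
before = shard.search("lucene")   # nothing yet: the document is still in the buffer
shard.refresh()
after = shard.search("lucene")
print(before, after)              # [] ['hello lucene']
```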

By default the refresh interval is 1 second. It can be adjusted with the refresh_interval setting when creating an index, and because it is a dynamic index setting it can also be changed on an existing index.
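
As a sketch, these are request bodies that set `index.refresh_interval`, built as Python dictionaries and printed as JSON. The index name `my-index` and the values `30s` and `1s` are assumptions for the example; nothing is sent to a cluster here.

```python
import json

# Body for creating an index with a 30-second refresh interval (PUT /my-index).
create_index_body = {"settings": {"index": {"refresh_interval": "30s"}}}

# Body for changing the interval on an existing index (PUT /my-index/_settings).
update_settings_body = {"index": {"refresh_interval": "1s"}}

print("PUT /my-index")
print(json.dumps(create_index_body, indent=2))
print("PUT /my-index/_settings")
print(json.dumps(update_settings_body, indent=2))
```

A longer interval means fewer, larger segments and less refresh work, at the cost of newly indexed documents taking longer to show up in search.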

However, because refresh is an expensive operation, by default the periodic refresh runs only on indices that have received at least one search request in the last 30 seconds.

 

To sum up so far: for indexed or modified documents to be searchable, an Elasticsearch refresh is required, and internally that refresh uses Lucene's flush.

 

The risk of data loss and Lucene's commit

However, everything so far stores data only in memory (the page cache), so a power failure or system crash can lose it. To prevent this, Lucene provides commit. A Lucene commit uses the fsync system call to synchronize the OS page cache with the disk, guaranteeing that the data is durable.
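
The difference between write() and fsync can be seen at the OS level with Python's `os` module. This is a small sketch of the system calls only, not of Lucene's commit code; the file name `segment.bin` is made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "segment.bin")   # hypothetical file name

fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
try:
    os.write(fd, b"segment data")   # hands the bytes to the OS, which keeps them in the page cache
    # A power loss here could still lose the data: nothing has forced it to disk yet.
    os.fsync(fd)                    # asks the OS to push the file's data to the storage device
finally:
    os.close(fd)

with open(path, "rb") as f:
    content = f.read()
print(content)                      # b'segment data'
```

Reading the file back works with or without the fsync call, because reads are served from the page cache. The difference only shows up after a crash.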

 

Elasticsearch's flush triggers a Lucene commit so that data is written to disk rather than only to memory (the page cache). This is how data gets stored safely and permanently on disk.

This process is expensive because it causes disk I/O, so it cannot be done often. Yet if commits are infrequent and a lot of data is written to disk at once, whatever has accumulated since the last commit could be lost at any moment.

 

The translog's role in preventing data loss, and Elasticsearch's flush

To solve this problem, Elasticsearch provides the translog.

The translog is a kind of log used to prevent data loss. An operation is recorded in the translog right after Lucene has indexed the document, and only once that record is written does Elasticsearch return a success response (such as 200 OK) to the client that requested the indexing.

 

In other words, an Elasticsearch shard logs every operation in its translog, and that log is used for recovery after a failure. If a node fails and data is lost, shard recovery can use the translog to restore the data that was not included in the last Lucene commit.

 

However, if the translog grows too large, recovery can take a long time. To prevent this, Elasticsearch periodically runs a flush in the background to keep the translog at a reasonable size.

 

In short, an Elasticsearch flush

  1. triggers a Lucene commit to store data permanently on disk, and
  2. also clears the transaction log, because the translog entries for that data are no longer needed.
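
The interplay of translog, flush, and recovery described above can be sketched with a toy model. `ToyDurableShard` and its methods are hypothetical names invented for this post; the model assumes the translog itself survives a crash and ignores everything else a real shard does.

```python
class ToyDurableShard:
    """Toy model of translog + flush: 'committed' and 'translog' survive a crash, 'memory' does not."""

    def __init__(self):
        self.memory = []      # docs indexed since the last commit (buffer / page cache)
        self.committed = []   # docs persisted by a Lucene-style commit
        self.translog = []    # operation log, assumed to be durably written per request

    def index(self, doc):
        self.memory.append(doc)
        self.translog.append(("index", doc))   # logged before the client gets its response
        return "ok"

    def flush(self):
        self.committed.extend(self.memory)     # "commit": persist everything so far
        self.memory = []
        self.translog = []                     # these operations never need replaying again

    def crash(self):
        self.memory = []                       # uncommitted in-memory state is lost

    def recover(self):
        for op, doc in self.translog:          # replay operations missing from the last commit
            if op == "index":
                self.memory.append(doc)

    def all_docs(self):
        return self.committed + self.memory


shard = ToyDurableShard()
shard.index("doc-1")
shard.flush()                       # doc-1 committed, translog emptied
shard.index("doc-2")                # doc-2 lives only in memory and the translog
shard.crash()
after_crash = shard.all_docs()
shard.recover()
after_recovery = shard.all_docs()
print(after_crash, after_recovery)  # ['doc-1'] ['doc-1', 'doc-2']
```

The flush keeps the translog short: after it, only `doc-2` has to be replayed during recovery.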

 

Segment structure and how deletes are handled

In Elasticsearch, an UPDATE does not modify the existing document. Instead, the existing document is deleted and a new document containing the changes is reindexed.

 

To see why it works this way, you need to understand Lucene segments. When a Lucene flush occurs, the index data held in memory is stored on disk in the form of a segment. (Strictly speaking, the OS puts it in memory rather than writing it to disk right away.)
Segments created this way keep accumulating on disk, and Lucene performs searches segment by segment.

 

The important point is that a segment is immutable: once created, it cannot be changed.
So when a document is modified or deleted, Lucene does not actually remove it from the segment. It marks the document with a delete flag and excludes it at search time.

 

So, following Lucene's behavior, an Elasticsearch UPDATE sets the delete flag on the existing document and indexes a new document with the modified content into a new segment.

 

Segments containing documents marked with the delete flag are cleaned up later, during a segment merge: several small segments are combined into one larger segment, and the deleted documents are physically removed from disk at the same time.
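
Update-as-delete-plus-reindex and the later merge can be sketched with a toy model. The `Segment` class and the `update`, `live_docs`, and `merge` functions are hypothetical names for this example and do not mirror Lucene's real classes.

```python
class Segment:
    """Toy segment: its documents never change; deletes are tracked as separate marks."""

    def __init__(self, docs):
        self.docs = dict(docs)    # doc id -> content, never modified after creation
        self.deleted = set()      # ids flagged as deleted


def update(segments, doc_id, new_content):
    for seg in segments:
        if doc_id in seg.docs:
            seg.deleted.add(doc_id)                     # flag the old version
    segments.append(Segment({doc_id: new_content}))     # new version goes into a new segment


def live_docs(segments):
    # Flagged documents are skipped at search time.
    return {i: c for seg in segments for i, c in seg.docs.items() if i not in seg.deleted}


def merge(segments):
    return [Segment(live_docs(segments))]   # one segment, flagged documents dropped


segments = [Segment({1: "v1", 2: "other"})]
update(segments, 1, "v2")
print(len(segments), live_docs(segments))   # 2 {2: 'other', 1: 'v2'}
merged = merge(segments)
print(len(merged), merged[0].docs)          # 1 {2: 'other', 1: 'v2'}
```

Before the merge the old "v1" still occupies space in the first segment; after it, only live documents remain.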

 

Because a segment merge also reduces the number of segments to search, it has the added benefit of making document search faster. (It reminds me of denormalization in an RDBMS.)

 

* For those curious about the merge schedule, here is the link. That document also explains that merging is already optimized; for tuning options, look up the terms that appear in it. https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-merge.html

 

 

 

References

- What is the difference between Lucene and Elasticsearch

- 엘라스틱 바이블 (Elastic Bible)

- What is the internal structure of Elasticsearch when it is clustered?

- Elasticsearch Page Cache(https://www.elastic.co/blog/elasticsearch-caching-deep-dive-boosting-query-speed-one-cache-at-a-time)

- Elasticsearch Flush API(https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-flush.html)

- Elasticsearch Refresh API(https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html)

- https://stackoverflow.com/questions/15426441/understanding-segments-in-elasticsearch

- https://stackoverflow.com/questions/19963406/refresh-vs-flush