
I have a self-hosted Elasticsearch 6.2 cluster (2 master nodes, ~200 GB of data each).

I plan to move to the AWS Elasticsearch service, where it's not possible to SSH into the nodes.

What's the fastest way to move all indices from an old ES cluster to the cloud one?

On a self-hosted ES I could simply copy the indices folder to the new cluster and be done.

GTXBxaKgCANmT9D9

2 Answers


Use a tool for dumping and restoring Elasticsearch data, such as Elasticdump (https://www.npmjs.com/package/elasticdump).

With a little bash you can transfer all of the indices from one instance to the other, like so:

old_instance="http://old_address:9200"
new_instance="http://new_address:9200"

# Index names are the third column of the _cat/indices output
es_indexes=$(curl -s "${old_instance}/_cat/indices" | awk '{ print $3 }')

for index in $es_indexes; do
  # Copy the mapping first so the target index is created with the right schema
  elasticdump \
    --input="${old_instance}/${index}" \
    --output="${new_instance}/${index}" \
    --type=mapping

  # Then copy the documents themselves
  elasticdump \
    --input="${old_instance}/${index}" \
    --output="${new_instance}/${index}" \
    --type=data
done
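
If elasticdump isn't on the machine yet, it is distributed as an npm package, so this assumes a recent Node.js install on whichever host runs the transfer:

npm install -g elasticdump

For large indices, elasticdump's --limit option (documents per batch, 100 by default) can be raised to speed the transfer up.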
Wiesław Herr

I put together a shell script for this -

GitHub - https://github.com/vivekyad4v/aws-elasticsearch-domain-migration/blob/master/migrate.sh

#!/bin/bash

#### Make sure you have the Docker engine installed on the host ####
###### TODO - Support parameters ######

export AWS_ACCESS_KEY_ID=xxxxxxxxxx
export AWS_SECRET_ACCESS_KEY=xxxxxxxxx
export AWS_DEFAULT_REGION=ap-south-1
export AWS_DEFAULT_OUTPUT=json
export S3_BUCKET_NAME=my-es-migration-bucket
export DATE=$(date +%d-%b-%H_%M)

old_instance="https://vpc-my-es-ykp2tlrxonk23dblqkseidmllu.ap-southeast-1.es.amazonaws.com"
new_instance="https://vpc-my-es-mg5td7bqwp4zuiddwgx2n474sm.ap-south-1.es.amazonaws.com"

# List index names (third column of _cat/indices) and drop the .kibana system index
delete=".kibana"
es_indexes=$(curl -s "${old_instance}/_cat/indices" | awk '{ print $3 }')
es_indexes=${es_indexes//$delete/}
es_indexes=$(echo $es_indexes | tr -d '\n')

echo "indexes to be copied are - $es_indexes"

for index in $es_indexes; do

  # Export ES data to S3 (using s3urls)
  docker run --rm -ti taskrabbit/elasticsearch-dump \
    --s3AccessKeyId "${AWS_ACCESS_KEY_ID}" \
    --s3SecretAccessKey "${AWS_SECRET_ACCESS_KEY}" \
    --input="${old_instance}/${index}" \
    --output="s3://${S3_BUCKET_NAME}/${index}-${DATE}.json"

  # Import data from S3 into the new ES domain (using s3urls)
  docker run --rm -ti taskrabbit/elasticsearch-dump \
    --s3AccessKeyId "${AWS_ACCESS_KEY_ID}" \
    --s3SecretAccessKey "${AWS_SECRET_ACCESS_KEY}" \
    --input="s3://${S3_BUCKET_NAME}/${index}-${DATE}.json" \
    --output="${new_instance}/${index}"

  # Show what already exists on the target after each index is copied
  new_indexes=$(curl -s "${new_instance}/_cat/indices" | awk '{ print $3 }')
  echo $new_indexes
  curl -s "${new_instance}/_cat/indices"

done
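
Once the loop finishes, a quick sanity check is to compare document counts between the two clusters. A minimal sketch reusing the old_instance, new_instance and es_indexes variables from above (the count is the third column of _cat/count):

for index in $es_indexes; do
  old_count=$(curl -s "${old_instance}/_cat/count/${index}" | awk '{ print $3 }')
  new_count=$(curl -s "${new_instance}/_cat/count/${index}" | awk '{ print $3 }')
  echo "${index}: source=${old_count} target=${new_count}"
done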
vivekyad4v
    Used this script without the Docker part, to copy data to S3 at night and back again in the morning. Thank you. :-) – We are Borg Dec 15 '20 at 11:27