I'm experimenting with Elasticsearch in relation to backups and restoring data.

I can back up data into a snapshot using Curator without any problems.

I then physically delete the files related to the index (to somewhat simulate a HD crash etc.)

I restart Elasticsearch and verify in Kibana that the data is no longer there.

If I then restore the latest snapshot I made, any data stored in Elasticsearch between that snapshot and the time of the restore is lost.

The restore doesn't seem to merge with newer data in the indices. I can't find any references to this problem online, but surely restoring a backup doesn't just throw out newer data, so I must be missing something?

To summarise:

A sample of my snapshots in the backup directory:

snapshot-curator-20150830191221
snapshot-curator-20150901225612
snapshot-curator-20150902090327

which were generated by the following command:

curator snapshot --repository es_backup indices --all-indices

I then delete the files for an index of a specific day:

rm -rf /mnt/storage/var/lib/elasticsearch/elasticsearch/nodes/0/indices/logstash-production-media-2015.09.02

Restart Elasticsearch. (I initially didn't do this and the data was always still there; it seems a Java or machine-level buffer held onto the data and Elasticsearch didn't realise it was gone!)

Verify in Kibana that that date's data is all gone.

Close all indices:

curator close indices --all-indices

Restore the latest snapshot:

curl -XPOST http://localhost:9200/_snapshot/es_backup/curator-20150729133045/_restore

The deleted data is back when looking in Kibana, but any data put into Elasticsearch between the snapshot being taken and the time of the restore is gone.

For example: the last snapshot is taken at 10am and the restore is done at 1pm. Data from 10am to 1pm disappears after the restore.

So what am I doing wrong? How do I do a restore with a merge of current newer data that has been stored in Elasticsearch since the previous snapshot was taken?

Thanks!

Iain

1 Answer

Well, unfortunately it seems merging isn't possible.

Here is the answer I got after also posting the question on the ES forum, since I didn't get any responses here:

https://discuss.elastic.co/t/restore-from-backup-and-merge-with-newer-data/28760

Snapshot is a point in time copy of the data. When you restore you restore things to that point in time.

There is currently no way to merge like this.

One suggestion was to restore to a different index name and then use an alias that points to both indices for search. That could work, but I would think it would lead to duplicate data being returned for searches.
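
A rough sketch of that suggestion, using the restore API's `indices`, `rename_pattern` and `rename_replacement` options plus the `_aliases` endpoint. I haven't run this against my cluster. The snapshot name is a guess based on the directory listing in the question, and the `restored-` prefix, the alias name `logstash-media-merged` and the `RUN_AGAINST_CLUSTER` switch are made up for illustration:

```shell
#!/bin/sh
# Sketch only: restore one index under a new name, then search both via an alias.
ES_URL="${ES_URL:-http://localhost:9200}"
REPO="es_backup"
SNAPSHOT="curator-20150902090327"   # assumed snapshot name, adjust to a real one
INDEX="logstash-production-media-2015.09.02"
ALIAS="logstash-media-merged"       # hypothetical alias name

# Restore body: only the given index, renamed with a "restored-" prefix
# so it does not replace the live index of the same name.
restore_body() {
  printf '{"indices":"%s","rename_pattern":"(.+)","rename_replacement":"restored-$1"}' "$1"
}

# Alias body: point one alias at both the live and the restored index.
alias_body() {
  printf '{"actions":[{"add":{"index":"%s","alias":"%s"}},{"add":{"index":"restored-%s","alias":"%s"}}]}' "$1" "$2" "$1" "$2"
}

# Only talk to the cluster when explicitly asked to.
if [ "${RUN_AGAINST_CLUSTER:-0}" = "1" ]; then
  curl -XPOST "$ES_URL/_snapshot/$REPO/$SNAPSHOT/_restore?wait_for_completion=true" \
    -H 'Content-Type: application/json' -d "$(restore_body "$INDEX")"
  curl -XPOST "$ES_URL/_aliases" \
    -H 'Content-Type: application/json' -d "$(alias_body "$INDEX" "$ALIAS")"
fi
```

Searching the alias then hits both indices, so any document that exists in both the live and the restored index would come back twice, which is the duplicate problem mentioned above.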

Maybe the solution is to have two nodes, each holding a copy of the data, although with a large data store that could obviously take up a lot more space.
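
If going the two-node route, keeping a copy on each node would presumably be done with the `number_of_replicas` index setting. A minimal sketch, assuming a second node has already joined the cluster (the `RUN_AGAINST_CLUSTER` switch is again just for illustration):

```shell
#!/bin/sh
# Sketch only: ask for one replica of every shard, so each of two nodes holds a full copy.
ES_URL="${ES_URL:-http://localhost:9200}"

# Settings body for the desired replica count.
replica_body() {
  printf '{"index":{"number_of_replicas":%d}}' "$1"
}

# Only talk to the cluster when explicitly asked to.
if [ "${RUN_AGAINST_CLUSTER:-0}" = "1" ]; then
  curl -XPUT "$ES_URL/_settings" \
    -H 'Content-Type: application/json' -d "$(replica_body 1)"
fi
```

This covers losing one node's disk, but a delete issued through the API is applied to the replica as well, so snapshots are still needed.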

Iain