Situation: I have an index with a strict mapping, and I want to delete an old field from it that is no longer used. So I create a new index with a mapping that doesn't include that field, and I try to reindex the data into the new index.
Problem: When I reindex, I get an error because I'm trying to index data into a field that is not defined in the new mapping. To solve this, I want to remove that field from all documents in the original index before I reindex.
PUT old_index/_doc/1
{
  "field_to_delete": 5
}

PUT old_index/_doc/2
{
  "field_to_delete": null
}

POST _reindex
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "new_index"
  }
}
"reason": "mapping set to strict, dynamic introduction of [field_to_delete] within [new_index] is not allowed"
1. Some places I found suggest doing:
POST old_index/_update_by_query
{
  "script": "ctx._source.remove('field_to_delete')",
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "field_to_delete"
          }
        }
      ]
    }
  }
}
However, the exists query only matches documents with an indexed (non-null) value, so it doesn't match documents that have an explicit value of null, and reindexing still fails after this update.
2. Others (like members of the Elastic team in their official forum) suggest doing something like:
POST old_index/_update_by_query
{
  "script": {
    "source": """
      if (ctx._source.field_to_delete != null) {
        ctx._source.remove("field_to_delete");
      } else {
        ctx.op = "noop";
      }
    """
  },
  "query": {
    "match_all": {}
  }
}
However, this has the same problem: the null check cannot tell a missing field from one that is explicitly null, so it doesn't remove the field from the second document, which has an explicit value of null.
3. In the end I could just do:
POST old_index/_update_by_query
{
  "script": {
    "source": "ctx._source.remove('field_to_delete');"
  },
  "query": {
    "match_all": {}
  }
}
But this updates every document, which for a large index could mean additional downtime during deployment.
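One possible way around that trade-off, offered as a sketch that I have not tested on a large index: Painless exposes ctx._source as a map, so containsKey should be able to tell a key that is present with a null value apart from a key that is absent. Under that assumption, only documents that actually carry the key are rewritten, and the rest are skipped with noop:

```
POST old_index/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": """
      // containsKey is true even when the stored value is an explicit null
      if (ctx._source.containsKey('field_to_delete')) {
        ctx._source.remove('field_to_delete');
      } else {
        // leave documents without the key untouched
        ctx.op = 'noop';
      }
    """
  },
  "query": {
    "match_all": {}
  }
}
```

Alternatively, assuming old_index does not need to be cleaned up at all, the field could be dropped in flight by attaching a script to the reindex request itself. Removing a key that is not in the map does nothing, so no check should be needed:

```
POST _reindex
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "new_index"
  },
  "script": {
    "lang": "painless",
    "source": "ctx._source.remove('field_to_delete')"
  }
}
```

The second variant would avoid the extra update pass over old_index entirely, at the cost of running the script for every document during the reindex.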