Converting or changing the type of a column in a database is easy. Unfortunately, the type of a field in an Elastic mapping cannot be changed. If documents are already indexed, the mapping of that field stays as it is, and the data has to be reindexed.
Dynamic Mapping Failed Grafana
Elastic can be used as a data source for Grafana. But if the index has no field of date type to use as the Time field name, the data source cannot be configured properly.
In my case, unfortunately, the timestamp field was mapped as text during the dynamic mapping process because the time format was not recognized (it requires yyyy-MM-dd'T'HH:mm:ss.SSSSSSZ but the data has no "T" as a delimiter).
As explained in the official documentation:
Except for supported mapping parameters, you can’t change the mapping or field type of an existing field. Changing an existing field could invalidate data that’s already indexed.
If you need to change the mapping of a field in other indices, create a new index with the correct mapping and reindex your data into that index.
Elasticsearch does not provide the functionality to change types for existing fields. A workaround is to import the data to a new index with correct types:
- create a new index
- put the updated mapping to the new index (manually create a mapping in which the fields have the types you need, before sending the first data)
- use reindex API to copy data from the old index to the newly created one
- (optionally) create an alias with the old index name pointing to the new index
Construct Explicit Mapping Request
First we should know the old index mapping and find out what is wrong:
$ curl "$ES_URL/$INDEX/_mapping/field/timestamp?pretty"
{
  "testrun": {
    "mappings": {
      "timestamp": {
        "full_name": "timestamp",
        "mapping": {
          "timestamp": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
We can clearly see a timestamp field with text type. This is not what we want, as Grafana requires a date type. So we can construct a payload.json with an explicit mapping that declares timestamp as a date field.
The mapping needs a custom date format because our raw data is in a non-standard format like 2017-11-30 11:59:19.717648+00:00.
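The payload for this step is not reproduced here, so the following is only a sketch of what it could look like. It assumes Elasticsearch 7 or later, where PUT _mapping takes a typeless body and date formats follow java.time patterns. The pattern string itself is my assumption for values like 2017-11-30 11:59:19.717648+00:00 and should be checked against your version (older Joda-based releases would likely need ZZ instead of xxx for the +00:00 offset).

```shell
# Write a hypothetical explicit mapping for the timestamp field to payload.json.
# The quoted 'EOF' keeps the shell from expanding anything inside the JSON.
cat > payload.json <<'EOF'
{
  "properties": {
    "timestamp": {
      "type": "date",
      "format": "yyyy-MM-dd HH:mm:ss.SSSSSSxxx"
    }
  }
}
EOF
```

Fields not listed in the payload are still mapped dynamically when documents arrive, so only the problematic field needs to be declared.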
Then we will create a new index and put an explicit mapping into it.
$ curl -X PUT "$ES_URL/$INDEX_NEW"
$ curl -X PUT "$ES_URL/$INDEX_NEW/_mapping" -H "Content-Type: application/json" -d @payload.json
Reindex from Old Index to New
The copy itself is done with the Reindex API. Again we can create a payload.json, this time naming the source index and the destination index.
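A minimal sketch of the reindex payload, assuming only the two required parts of a Reindex API body, source.index and dest.index. The default index names (testrun, taken from the mapping output earlier, and a hypothetical testrun_v2) are placeholders used when $INDEX and $INDEX_NEW are not set.

```shell
# Fall back to placeholder index names when the variables are not set.
INDEX="${INDEX:-testrun}"
INDEX_NEW="${INDEX_NEW:-testrun_v2}"

# The unquoted EOF lets the shell substitute the index names into the JSON.
cat > payload.json <<EOF
{
  "source": { "index": "$INDEX" },
  "dest": { "index": "$INDEX_NEW" }
}
EOF
```

This overwrites the mapping payload from the previous step, which is no longer needed once the mapping has been put.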
$ curl -X POST "$ES_URL/_reindex?pretty" -H 'Content-Type: application/json' -d @payload.json
If no errors are returned, the reindex process has started. It can take from minutes to hours, depending on the amount of data. But by now Grafana should recognize the field and show
Index OK. Time field name OK.
The data source is now successfully configured.
Reindex Asynchronously
Reindexing a huge index can take hours or even days, and a synchronous HTTP request will time out long before that. To make sure the reindex keeps running regardless, start it asynchronously. From the documentation:
If the request contains wait_for_completion=false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at _tasks/<task_id>.
$ curl -X POST "$ES_URL/_reindex?wait_for_completion=false" -H 'Content-Type: application/json' -d @payload.json
This returns a task, represented by its ID:
{ "task": "Sb_1wXmiTWWY2uFEczPhIQ:868863502" }
This task ID can be used to query the progress:
$ export TASK_ID="Sb_1wXmiTWWY2uFEczPhIQ:868863502"
$ curl "$ES_URL/_tasks/$TASK_ID?pretty"
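Checking the task by hand gets tedious for long runs. A rough sketch of a polling loop, relying on the completed flag that the task status response carries. The function names task_completed and wait_for_task and the 30-second interval are my own choices, not part of any API.

```shell
# Succeed when the task status JSON on stdin reports "completed": true.
# The pattern tolerates the extra spaces that ?pretty output adds.
task_completed() {
  grep -Eq '"completed"[[:space:]]*:[[:space:]]*true'
}

# Poll the task given as $1 until it is reported as completed.
# Assumes ES_URL is set; there is no timeout, so interrupt it manually if needed.
wait_for_task() {
  until curl -s "$ES_URL/_tasks/$1" | task_completed; do
    sleep 30
  done
}
```

Calling wait_for_task "$TASK_ID" blocks until the reindex finishes. It only looks at the completed flag, so the final task document should still be inspected for failures afterwards.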
Any warning or error will be included in the query result. Once you confirm the process is done and the task record is no longer needed, the task document should be deleted. Tasks are stored in an internal index, .tasks, and that is where the task document should be deleted from.
$ curl -X DELETE "$ES_URL/.tasks/task/$TASK_ID"
How About the Old Index?
There is no index renaming operation in Elastic. But with the experience we've gained, it is easy to emulate a rename by deleting, creating and reindexing.
Another option is to create an alias for the index.
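A sketch of the alias route, assuming the data has been fully reindexed and the old index can be dropped. An alias cannot share a name with an existing index, so the request body below removes the old index and adds the alias in one atomic step. The file name aliases.json and the default index names are placeholders. Note that remove_index deletes the old index for good.

```shell
# Fall back to placeholder index names when the variables are not set.
INDEX="${INDEX:-testrun}"
INDEX_NEW="${INDEX_NEW:-testrun_v2}"

# Body for the _aliases API: drop the old index, then expose the new index
# under the old name so existing clients keep working.
cat > aliases.json <<EOF
{
  "actions": [
    { "remove_index": { "index": "$INDEX" } },
    { "add": { "index": "$INDEX_NEW", "alias": "$INDEX" } }
  ]
}
EOF
```

Sending it works like the other requests:
$ curl -X POST "$ES_URL/_aliases" -H 'Content-Type: application/json' -d @aliases.json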