I would like to store some mobile app version data on ElasticSearch and visualize it on Kibana/Grafana. The purpose is to know the usage of the app version.
Say I have these "mobile app init" event entries in ElasticSearch (simplified):
* clientId: ABC, clientVersion: 1.2.3, time: 2018-01-01
* clientId: DEF, clientVersion: 1.2.3, time: 2018-01-02
* clientId: GHI, clientVersion: 1.2.3, time: 2018-01-03
* clientId: DEF, clientVersion: 1.2.3, time: 2018-01-04
* clientId: GHI, clientVersion: 1.2.4, time: 2018-01-05
And I would like to have a visualization in Kibana/Grafana like:
* version 1.2.3: 2 installations
* version 1.2.4: 1 installation
According to data above, version 1.2.3 is currently in use by ABC and DEF. Version 1.2.4 is in use by 1.2.4.
- ABC has only 1 entry with 1.2.3.
 - DEF has 2 entries, both with 1.2.3. Since I would like to see the usage of the app version per clients, duplicate data should be ignored.
 - GHI has 2 entries, 1 with 1.2.3 and 1 with 1.2.4. But since the latest version entry is 1.2.4, 1.2.3 entry is ignored.
 
If I was doing this with some code for in memory data without ElasticSearch, the algorithm would be:
- get the latest entry for each client
 - group them by the version and count each entry
 
Questions:
- How to get the desired output with ElasticSearch queries?
 - If the thing asked in question#1 is not possible, how to create visualizations on Kibana/Grafana?
 
Please note that, I don't want to use update/upsert operations. I just would like to simply add documents to ElasticSearch.
UPDATE: more info on this question.
From this question, I can see the de-duplication of data is possible using top_hits aggregation:
Remove duplicate documents from a search in Elasticsearch
However, de-duplicating by clientId using top_hits and then grouping by clientId using terms is not possible. This is because top_hits aggregation doesn't accept any sub aggregations. Google for [top_hits] cannot accept sub-aggregations"
This is how far I came:
GET /metric/_search
{
  "aggs" : {
    "latestEntriesPerClients" : {
        "terms" : { "field" : "clientid" },
        "aggs": {
        "1": {
          "top_hits": {
            "sort": [{
                "date": {"order": "desc"}
            }],
            "size": 1
          }
          //, THIS WON'T WORK
          // "aggs": {
          //  "NAME": {
          //    "terms": {"field": "clientVersion"}
          //  }
          //}
        }
      }
    }
  }
}
Simplified output:
* clientId: ABC, clientVersion: 1.2.3, time: 2018-01-01
* clientId: DEF, clientVersion: 1.2.3, time: 2018-01-04
* clientId: GHI, clientVersion: 1.2.4, time: 2018-01-05
Now, how can I pipe this into some other aggregation? I tried pipeline aggregations but failed to get these values above grouped and counted.