I have a very large file (20GB+ compressed) called input.json containing a stream of JSON objects as follows:
{
    "timestamp": "12345",
    "name": "Some name",
    "type": "typea"
}
{
    "timestamp": "12345",
    "name": "Some name",
    "type": "typea"
}
{
    "timestamp": "12345",
    "name": "Some name",
    "type": "typeb"
}
I want to split this file into separate files based on the type property: typea.json, typeb.json, etc., each containing its own stream of JSON objects with the matching type.
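From the sample above, the expected result would be typea.json holding the first two objects and typeb.json the third, e.g. (compacted here for brevity):

typea.json:
{"timestamp": "12345", "name": "Some name", "type": "typea"}
{"timestamp": "12345", "name": "Some name", "type": "typea"}

typeb.json:
{"timestamp": "12345", "name": "Some name", "type": "typeb"}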
I've managed to solve this problem for smaller files; however, with a file this large I run out of memory on my AWS instance. Since I want to keep memory usage down, I understand I need to use --stream, but I'm struggling to see how to achieve this.
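For context, the kind of output I'm after can be produced on small files with something like this (a rough sketch, not my exact script; the type names are hard-coded purely for illustration):

# Rough sketch: one filtering pass per type, writing matching objects out.
for t in typea typeb; do
    jq -c --arg t "$t" 'select(.type == $t)' input.json > "$t.json"
done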
cat input.json | jq -c --stream 'select(.[0][0]=="type") | .[1]' returns the value of each type property, but how do I use this to then filter out the full objects?
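For reference, this is what the underlying stream events look like for the first sample object (via jq -c --stream .):

[["timestamp"],"12345"]
[["name"],"Some name"]
[["type"],"typea"]
[["type"]]

(Note that the length-1 closing event [["type"]] also passes the select above, since its .[0][0] is "type", and .[1] on it yields null, so the filter actually emits a null after each type value.)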
Any help would be greatly appreciated!