i have 2 collections : words and phrases Each word document has an array of phrases id's. And each phrase can be active or inactive.
For example :
words : 
{"word" => "hello", phrases => [1,2]}
{"word" => "table", phrases => [2]}
phrases :
{"id" => 1, "phrase" => "hello world!", "active" => 1}
{"id" => 2, "phrase" => "hello, i have already bought new table", "active" => 0}
I need to get count of active phrases for each word.
In php i do it like this:
 1. get all words
 2. for each word get count of active phrases with condition ['active' => 1]
Question: How can i get words with active phrases count in one request? I tried to use MapReduce, but i need to make a request for each word to get count of active phrases.
UPD:
In my test collection there are 92 000 phrases and 23 000 words.
I have already tested both variant: with php loop for each word in which i get phrases count and aggreagation function in mongo.
But i changed aggregation pipeline in commets below because of phrases_data. It is array, so i can't use $match on it. I use $unwind after $lookup.
[ '$unwind'  =>  '$5'],
    [
        '$lookup' =>  [
        'from' =>  'phrases_926ee3bc9fa72b029e028ec90e282072ea0721d1',
            'localField' =>  '5',
            'foreignField' =>  '0',
            'as' =>  'phrases_data'
        ]
    ],
    [ '$unwind'  =>  '$phrases_data'],
    [ '$match'  =>  [ 'phrases_data.3'  =>  77] ], //phrases_data.3 => 77 it is similar to phrases_data.active => 1
    [ '$group'  =>  
        [
            '_id'  =>  ['word'  =>  '$1', 'id'  =>  '$0'],
            'active_count'  =>  [ '$sum'  =>  1]
        ]
    ],
    [ '$match'  =>  [ 'active_count'  =>  ['$gt' => 0]] ],
    [ '$sort'  =>
        [
            'active_count'  => -1
        ]
    ]
The problem is that $group command take 80% of process time. And it is much slower than php loop. Here is my results for test collection:
1. Php loop (get words-> get phrases count for each word): 10 seconds
2. Aggregation function : 20 seconds
 
     
    