Map/Reduce [map-reduce]

Mongoid provides a DSL around MongoDB's map/reduce framework, for performing custom map/reduce jobs or simple aggregations.

Note

The map-reduce operation is deprecated. The aggregation framework <aggregation-pipeline> provides better performance and usability than map-reduce operations, and should be preferred for new development.

Execution

To run a map/reduce job, call map_reduce on a model class or on a criteria, passing the map and reduce JavaScript functions as strings.

map = %Q{
  function() {
    emit(this.name, { likes: this.likes });
  }
}

reduce = %Q{
  function(key, values) {
    var result = { likes: 0 };
    values.forEach(function(value) {
      result.likes += value.likes;
    });
    return result;
  }
}

Band.where(:likes.gt => 100).map_reduce(map, reduce).out(inline: 1)
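To see what the map and reduce functions above compute, the job can be traced in plain Ruby without a database. This is only a sketch of the semantics, not how Mongoid or MongoDB runs the job: SAMPLE_BANDS is made-up data, and map_band, reduce_likes and simulate_map_reduce are hypothetical helper names.

```ruby
# Pure-Ruby simulation of the map/reduce job above; no database is involved.
# SAMPLE_BANDS is made-up data for illustration.
SAMPLE_BANDS = [
  { name: "Tool", likes: 150 },
  { name: "Tool", likes: 200 },
  { name: "Placebo", likes: 120 },
  { name: "Nirvana", likes: 50 } # excluded by the likes > 100 criteria
].freeze

# Mirrors the JavaScript map function: emit(this.name, { likes: this.likes }).
def map_band(band)
  [band[:name], { likes: band[:likes] }]
end

# Mirrors the JavaScript reduce function: sum the likes emitted for one key.
def reduce_likes(_key, values)
  values.each_with_object({ likes: 0 }) { |value, result| result[:likes] += value[:likes] }
end

def simulate_map_reduce(bands)
  selected = bands.select { |band| band[:likes] > 100 } # Band.where(:likes.gt => 100)
  emitted = selected.map { |band| map_band(band) }      # one [key, value] pair per document
  grouped = emitted.group_by(&:first)                   # key => [[key, value], ...]
  grouped.map do |key, pairs|
    { "_id" => key, "value" => reduce_likes(key, pairs.map(&:last)) }
  end
end

simulate_map_reduce(SAMPLE_BANDS).each { |row| p row }
```

The sketch calls the reduce step for every key. MongoDB may skip reduce for a key with a single emitted value, which gives the same result here because the emitted value and the reduced value have the same shape.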

Like criteria, map/reduce calls are lazily evaluated: nothing is sent to the database until you iterate over the results or call a method on the wrapper that requires a database hit.

Band.map_reduce(map, reduce).out(replace: "mr-results").each do |document|
  p document # { "_id" => "Tool", "value" => { "likes" => 200 }}
end
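The lazy behavior can be illustrated with a small stand-in object. LazyJob below is a hypothetical class written for this sketch, not Mongoid's implementation; the block passed to it stands in for the database round trip.

```ruby
# Hypothetical stand-in for a lazily evaluated job; not Mongoid's implementation.
class LazyJob
  include Enumerable

  attr_reader :run_count

  def initialize(&work)
    @work = work   # the block stands in for the database round trip
    @run_count = 0 # number of times the work has actually been executed
  end

  # Iterating is what forces execution.
  def each(&block)
    @run_count += 1
    @work.call.each(&block)
  end
end

job = LazyJob.new { [{ "_id" => "Tool", "value" => { "likes" => 200 } }] }
puts job.run_count # building the job has not executed anything yet
job.each { |document| p document }
puts job.run_count # iteration triggered one execution
```

Building the object only records what to do; the work runs when each (or any Enumerable method built on it) is called.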

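The only required option for a map/reduce is where to output the results; if you do not specify it, an error is raised. Valid options to #out are inline: 1 (return the results without storing them in a collection), replace: "name" (store the results in the named collection, overwriting any documents already in it), merge: "name" (store the results in the named collection, merging them with the existing documents), and reduce: "name" (store the results in the named collection, reducing them with the existing results in that collection).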

Raw Results

Results of Map/Reduce execution can be retrieved via the execute method or its aliases raw and results:

mr = Band.where(:likes.gt => 100).map_reduce(map, reduce).out(inline: 1)

mr.execute
# => {"results"=>[{"_id"=>"Tool", "value"=>{"likes"=>200.0}}],
      "timeMillis"=>14,
      "counts"=>{"input"=>4, "emit"=>4, "reduce"=>1, "output"=>1},
      "ok"=>1.0,
      "$clusterTime"=>{"clusterTime"=>#<BSON::Timestamp:0x00005633c2c2ad20 @seconds=1590105400, @increment=1>, "signature"=>{"hash"=><BSON::Binary:0x12240 type=generic data=0x0000000000000000...>, "keyId"=>0}},
      "operationTime"=>#<BSON::Timestamp:0x00005633c2c2aaf0 @seconds=1590105400, @increment=1>}
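As a sketch of working with the raw hash, the following plain-Ruby snippet turns the "results" array into a key-to-likes lookup. RAW_RESULT is a trimmed copy of the output above, likes_by_key is a hypothetical helper, and the code assumes inline output, where the documents appear under the "results" key.

```ruby
# Trimmed copy of the raw result shown above (cluster metadata omitted).
RAW_RESULT = {
  "results" => [{ "_id" => "Tool", "value" => { "likes" => 200.0 } }],
  "timeMillis" => 14,
  "counts" => { "input" => 4, "emit" => 4, "reduce" => 1, "output" => 1 },
  "ok" => 1.0
}.freeze

# Build a { key => likes } lookup from the "results" array.
# fetch raises KeyError if the hash has no "results" key.
def likes_by_key(raw)
  raw.fetch("results").to_h { |row| [row["_id"], row.dig("value", "likes")] }
end

likes_by_key(RAW_RESULT).each { |name, likes| puts "#{name}: #{likes}" }
```

In an application the hash would come from a single call to mr.execute rather than from a literal.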

Statistics

MongoDB servers 4.2 and lower provide Map/Reduce execution statistics. As of MongoDB 4.4, Map/Reduce is implemented via the aggregation pipeline and statistics described in this section are not available.

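The following methods are provided on the MapReduce object: counts (a hash of the number of documents read, emitted, reduced, and output by the pipeline), input, emitted, reduced, and output (the individual counts), and time (the execution time in milliseconds).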

The following code illustrates retrieving the statistics:

mr = Band.where(:likes.gt => 100).map_reduce(map, reduce).out(inline: 1)

mr.counts
# => {"input"=>4, "emit"=>4, "reduce"=>1, "output"=>1}

mr.input
# => 4

mr.emitted
# => 4

mr.reduced
# => 1

mr.output
# => 1

mr.time
# => 14

Note

Each call to a statistics method re-executes the Map/Reduce pipeline, because Mongoid does not store the results of execution. If you need more than one statistic, consider calling execute once and reading the statistics from the raw results.
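Reading several statistics from one raw result might look like the following sketch. RAW_STATS copies the counts and timing from the raw output shown earlier in this section, and statistics_from is a hypothetical helper, not part of Mongoid. It assumes a server that still reports "counts" (MongoDB 4.2 and lower).

```ruby
# Counts and timing copied from the raw result shown earlier in this section.
RAW_STATS = {
  "timeMillis" => 14,
  "counts" => { "input" => 4, "emit" => 4, "reduce" => 1, "output" => 1 }
}.freeze

# Hypothetical helper: map the raw keys onto the statistics method names.
# The raw keys "emit" and "reduce" correspond to #emitted and #reduced.
def statistics_from(raw)
  counts = raw.fetch("counts") # raises KeyError when the server omits counts
  {
    input: counts["input"],
    emitted: counts["emit"],
    reduced: counts["reduce"],
    output: counts["output"],
    time: raw["timeMillis"]
  }
end

statistics_from(RAW_STATS).each { |name, value| puts "#{name}: #{value}" }
```

With a real job, the argument would be the hash returned by one call to mr.execute, so the pipeline runs once regardless of how many statistics are read.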