Environmental Science
mongos 3.0.14
aggregate
Using aggregate can realize more complex data aggregation operations, such as count, distinct count, group having, etc.
The result returned by aggregate is an array. It should be noted that the data size cannot exceed 16m.
For example:
$pipeline = [
['$match' => $tmpCondition],
['$group' => [
'_id' => ['user_id'=>'$user_id']
]],
['$group' => [
'_id' => '_id.user_id',
'number' => ['$sum'=>1]
]]
];
$options = [
'allowDiskUse'=>true,
'cursor'=>['batchSize'=>1]
];
$data = MongoSvc::get('user')->user_info->aggregate($pipeline,$options);
aggregateCursor
For the aggregation of a large number of returned results, the aggregate cursor can be used to return cursors to avoid data size overrun.
The return result of aggregatecursor is cursor, which can retrieve data circularly.
For example:
$pipeline = [
['$match' => $matchArr],
['$project' => ['id'=>1,'_id'=>0]],
['$group' => [
'_id' => '$id',
'count' => ['$sum' => 1]
]],
['$match' => [
'count' => ['$gt' => 1]
]]
];
//Here it is changed to aggregate cursor, which is obtained by cursor loop
$data = MongoSvc::get('user')->user_info->aggregateCursor($pipeline);
Pipeline parameters
- $match
Condition matching. - $addFields
Add a new field. - $count
The total number of documents in this stage. - $group
grouping. - $limit
Limit the quantity. - $skip
Jump. - $sort
Sort. - $out
Output the result to the collection. - $project
Filter fields.
https://docs.mongodb.com/manu…
Options parameter
- explain boolean
Processing information. - allowDiskUse boolean
True to write temporary data to disk. - cursor
cursor: { batchSize: <int> }
Sets an initial size for the return collection. - hint string or document
Force the index to be specified.
https://docs.mongodb.com/manu…
Query examples
Count value of a field (such as’ sum ‘) in the summary statistics document:
$pipeline = [
['$match' => $tmpCondition],
['$group' => [
'_id' => ['sum'],
'sum_value' => ['$sum' => '$money']
]]
];
Data after de duplication of a column:
$pipeline = [
['$match' => $tmpCondition],
['$group' => [
'_id' => ['user_id' => '$user_id']
]]
];
Count a column (e.g.’user ‘_ Count value after de duplication:
$pipeline = [
['$match' => $tmpCondition],
['$group' => [
'_id' => ['user_id'=>'$user_id']
]],
['$group' => [
'_id' => '_id.user_id',
'number' => ['$sum'=>1]
]]
];
$pipeline = [
['$match' => $tmpCondition],
['$group' => [
'_id' => ['qid' => '$qid'],
'max_number' => ['$max' => '$days']
]],
['$group' => [
'_id' => ['number' => '$max_number'],
'total' => ['$sum' => 1]
]]
];
After statistical grouping, the total value of a column in each group is as follows:
$pipeline = [
['$match' => $tmpCondition],
['$group' => [
'_id' => ['type' => '$type'],
'sum_value' => ['$sum' => '$number']
]]
];