It takes around 6 minutes to get the result from MongoDB when I use the following aggregate query.
db.barcodes.aggregate([
  {
    $lookup: {
      from: 'company',
      localField: 'company',
      foreignField: '_id',
      as: 'company'
    }
  },
  {
    $match: {
      'company.name': 'ABCd'
    }
  }
]);
I have two collections in my DB, company and barcodes. If I search for 'ABC' instead of 'ABCd' (a company named 'ABC' already exists in the DB), the query completes in only 0.05 seconds.
There are 4,214,301 documents in the barcodes collection and 2 documents in the company collection.
Sample documents
Company
{
  "_id" : ObjectId("615dd7873c4f710b71438772"),
  "name" : "ABC",
  "isActive" : true
}
Barcode
{
  "_id" : ObjectId("615dd8ff3c4f710b71438773"),
  "barcode" : "1",
  "company" : ObjectId("615dd7873c4f710b71438772"),
  "comment" : "text 1"
}
Indexed fields
- company._id
- company.name
- company.isActive
- barcode.company
- barcode._id
Mongo clients used: Studio 3T and the MongoDB CLI
Output of explain
{
  "stages" : [
    {
      "$cursor" : {
        "query" : {},
        "queryPlanner" : {
          "plannerVersion" : 1.0,
          "namespace" : "diet.barcodes",
          "indexFilterSet" : false,
          "parsedQuery" : {},
          "winningPlan" : {
            "stage" : "COLLSCAN",
            "direction" : "forward"
          },
          "rejectedPlans" : []
        }
      }
    },
    {
      "$lookup" : {
        "from" : "company",
        "as" : "company",
        "localField" : "company",
        "foreignField" : "_id"
      }
    },
    {
      "$match" : {
        "company.name" : {
          "$eq" : "ABCd"
        }
      }
    }
  ],
  "ok" : 1.0
}
2 Answers
The winning plan of your query runs against diet.barcodes and does a COLLSCAN because the collection doesn't have an index.
Create an index on that collection.
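As a minimal sketch, the existing indexes can be checked before creating a new one. This assumes the mongosh collection methods getIndexes() and createIndex(); the helper name ensureSingleFieldIndex is hypothetical and does not come from the question or this answer.

```javascript
// Hypothetical helper (the name is illustrative): create a single-field
// ascending index only when no existing index already starts with that field.
// `collection` is assumed to be a mongosh collection handle, e.g. db.barcodes.
function ensureSingleFieldIndex(collection, field) {
  const alreadyIndexed = collection
    .getIndexes() // array of index descriptions, each with a `key` pattern
    .some((index) => Object.keys(index.key)[0] === field);
  if (!alreadyIndexed) {
    collection.createIndex({ [field]: 1 });
  }
  return alreadyIndexed; // true when a suitable index was already present
}

// In mongosh: ensureSingleFieldIndex(db.barcodes, 'company');
```

The return value tells whether the index was already there, which makes it easy to see if a slow query was really missing one.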
-
How do I create an index for a collection? I created an index for company: db.barcodes.createIndex({company:1}) – Albert, Oct 11, 2021 at 19:29
-
Please share the output of db.barcodes.findOne() and db.company.findOne(). – zelmario, Oct 11, 2021 at 19:47
-
db.barcodes.findOne(): { "_id" : ObjectId("615dd8ff3c4f710b71438773"), "barcode" : "1", "company" : ObjectId("615dd7873c4f710b71438772"), "comment" : "text 1" }; db.company.findOne(): { "_id" : ObjectId("615dd7873c4f710b71438772"), "name" : "ABC", "isActive" : true } – Albert, Oct 12, 2021 at 4:29
-
Try creating an index on barcodes._id and run the query again, if possible with the explain argument. – zelmario, Oct 12, 2021 at 9:58
-
The index for barcodes._id is already added. – Albert, Oct 12, 2021 at 11:05
Consider the aggregate query from the question. The following is based on MongoDB version 7.
$lookup Performance Considerations says:
Equality Match with a Single Join: $lookup operations that perform equality matches with a single join typically perform better when the source collection contains an index on the foreignField.
So, the foreign field is _id of the company collection. This index already exists (the _id field has a unique index by default). Though the query plan output doesn't show this explicitly, the index optimization is there in the $lookup stage.
In this case there are only two documents in the company collection, and the index is not of much use.
Generate a query plan for the aggregate query:
db.barcodes.explain().aggregate([ ... ])
The explain plan output shows the following (partial output shown here):
...winningPlan: {
  queryPlan: {
    stage: 'EQ_LOOKUP',
    planNodeId: 2,
    foreignCollection: 'test.company',
    localField: 'company',
    foreignField: '_id',
    asField: 'r_company',
    strategy: 'IndexedLoopJoin',
    indexName: '_id_',
    indexKeyPattern: { _id: 1 },
    inputStage: {
      stage: 'COLLSCAN',
      planNodeId: 1,
      filter: {},
      direction: 'forward'
    }
  },
  slotBasedPlan: { ...
Some points to note from the winningPlan of the explain output:
- stage: 'EQ_LOOKUP' - EQ_LOOKUP means "equality lookup"
- winningPlan.slotBasedPlan - slot-based query engine usage
To find and return query results, MongoDB uses one of the following query engines:
- The classic query engine
- The slot-based query execution engine (MongoDB v5.1 or higher)
MongoDB automatically selects the engine to execute the query. You cannot manually specify an engine for a particular query.
MongoDB can use the slot-based query execution engine for a subset of queries which are eligible and provided certain conditions are met. In most cases, the slot-based execution engine provides improved performance and lower CPU and memory costs compared to the classic query engine.
The winningPlan.slotBasedPlan field in the above plan output and the 'EQ_LOOKUP' stage indicate the usage of the slot-based query engine.
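These two indicators can also be checked programmatically. The following is only a sketch: it assumes a winningPlan object shaped like the excerpt above (a queryPlan whose stages are chained through inputStage, plus an optional slotBasedPlan field), and the function name summarizeWinningPlan is hypothetical.

```javascript
// Sketch: summarise a winningPlan object shaped like the explain excerpt.
// Assumption: stages are chained through `inputStage`, and `slotBasedPlan`
// is present only when the slot-based engine was used.
function summarizeWinningPlan(winningPlan) {
  const stages = [];
  // Fall back to the plan itself when there is no nested queryPlan.
  let node = winningPlan.queryPlan || winningPlan;
  while (node && node.stage) {
    stages.push(node.stage);
    node = node.inputStage;
  }
  return {
    usesSlotBasedEngine: 'slotBasedPlan' in winningPlan,
    stages, // outermost stage first
    lookupStrategy: (winningPlan.queryPlan || {}).strategy || null,
  };
}

// Trimmed copy of the plan shown in this answer.
const summary = summarizeWinningPlan({
  queryPlan: {
    stage: 'EQ_LOOKUP',
    strategy: 'IndexedLoopJoin',
    inputStage: { stage: 'COLLSCAN', direction: 'forward' },
  },
  slotBasedPlan: {},
});
console.log(summary);
```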
Reference: $lookup Optimization (MongoDB v6.0 or higher).
The $lookup stage in the query already benefits from this. The $match stage after the initial $lookup cannot use any indexes in this case.
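One possible way around this, offered as a sketch and not as something tested against the asker's data: since the filter is on the company name, the company can be resolved first and the barcodes then queried by its _id, so that the indexes on company.name and barcodes.company listed in the question can be used. The helper name findBarcodesByCompanyName is hypothetical, `db` is assumed to be the mongosh database handle, and company names are assumed to be unique.

```javascript
// Hypothetical helper: filter first, then fetch, instead of joining every
// barcode to its company and filtering afterwards.
// Assumptions: `db` is the mongosh database handle; company names are unique.
function findBarcodesByCompanyName(db, companyName) {
  // Step 1: look up the company by name; only its _id is needed.
  const company = db.company.findOne({ name: companyName }, { _id: 1 });
  if (company === null) {
    return null; // no such company, so the barcodes collection is not queried
  }
  // Step 2: return a cursor over the barcodes that reference this company.
  return db.barcodes.find({ company: company._id });
}

// In mongosh: findBarcodesByCompanyName(db, 'ABCd');
```

For a name that does not exist, such as 'ABCd', this returns after a single lookup in the two-document company collection.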
More details from the explain can be produced and studied by using the "executionStats" option of explain().
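For instance, the call could look like the sketch below. It assumes the mongosh form db.collection.explain(verbosity).aggregate(pipeline); the wrapper name explainWithExecutionStats is hypothetical, and the pipeline is the one from the question.

```javascript
// The pipeline from the question, unchanged.
const pipeline = [
  {
    $lookup: {
      from: 'company',
      localField: 'company',
      foreignField: '_id',
      as: 'company',
    },
  },
  { $match: { 'company.name': 'ABCd' } },
];

// Hypothetical wrapper; `db` is assumed to be the mongosh database handle.
// With "executionStats" the query is actually executed, so on a large
// collection this takes about as long as the query itself.
function explainWithExecutionStats(db) {
  return db.barcodes.explain('executionStats').aggregate(pipeline);
}

// In mongosh: explainWithExecutionStats(db);
```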