Updating Inside a Nested Array with the MongoDB Positional Operator in C#
Sometimes you have a document in MongoDB with a property that is an array of nested objects. You’d like to update one of those objects. What’s the best way to go about that in C#?
Let’s say you have a simple document representing a class with students:
{
"_id" : ObjectId("583ee54339eddb19c03f2bf5"),
"Name" : "Introduction to Databases",
"Code" : "CS321",
"Students" : [
{
"Name" : "Alice",
"Grade" : 92
},
{
"Name" : "Bob",
"Grade" : 87
},
{
"Name" : "Charlie",
"Grade" : 76
}
]
}
I’d like to update Bob’s grade to a 93 since he passed his last exam.
One simple way to do it is to load the whole Class document into memory, update the grade in C#, and then put the document back into the database.
That would look something like this:
var client = new MongoClient("mongodb://localhost:27017")
var db = client.GetDatabase("school");
var classes = db.GetCollection<Class>("classes");
// Bring the whole Class document into memory
var classToUpdate = classes.Find(c => c.Code == "CS321").First();
// Find bob in the Students array
var bob = classToUpdate.Students.First(s => s.Name == "Bob");
// Update Bob's grade
bob.Grade = 93;
// Save the entire document back to the database
classes.ReplaceOne(c => c.Id == classToUpdate.Id, classToUpdate);
There is a problem with this approach though.
For one thing, its pretty inefficient, especially for a large document. It’s probably not much of a problem for this small sample. but imagine this was a seminar class with 200 students. And that the document also included individual paper grades for each student. This is a lot of data to bring across the wire, instantiate C# objects for, send back across the wire, then eventually garbage collect the whole document in .NET land. It can add up pretty quick, especially in a high volume environment.
A more insidious problem is the race condition involved in updating the whole document. There’s no such thing as a lock in MongoDB, so its possible that two threads could pull down the whole document, make changes and race to get their version saved last.
For example, imagine initially Alice’s score is 95 and Bob’s is 85. Two threads pull down that document.
Thread 1 updates Alice’s score to 92, and Thread 2 update’s Bob’s score to 89. Then they both write their version of the document back to the database. Will Alice have her 92? Will Bob have his 89?
Who knows! Since each thread is replacing the entire document, who ever saves last wins!
Undefined behavior is not something you want in your database layer.
Luckily, MongoDB offers a findAndUpdate operation that can atomically change a subset of the fields in a document. Since the entire document is the unit of atomicity in MongoDB, the two threads can each issue a findAndUpdate at the same time and it will work out OK, provided they are each editing separate students.
If both threads are attempting to modify the same student, than just like before, all bets are off.
Anyway, the magic incantation for findAndModify
looks like this:
db.classes.findAndModify({
query: { Code: "CS321", Students: { $elemMatch: { Name: "Bob" } } },
update: { $set: { "Students.$.Grade": NumberInt(89) } }
})
Couple things going on here.
In the query
description, we’re using an $elemMatch
projection which tells
MongoDB to find a document with an array called Students
with an element with
the Name
“Bob”.
The update
description includes that curious string, "Students.$.Grade"
.
That dollar sign is called the Positional Operator.
It refers to the first matching element in the array.
We also use NumberInt(89)
so that Mongo writes an integer instead of a
floating point number, which is the default.
OK, this is all great, but how do we coerce the C# LINQ provider into executing
this query? C# doesn’t have a positional operator! And whatabout the $elemMatch
projection? There’s no LINQ method called ElementMatch
or anything…
It turns out the Mongo Driver LINQ provider translates the LINQ Any
method
into $elemMatch
. So the expression cls => cls.Students.Any(s => s.Name == "Bob")
will give us that query.
That makes a bit of sense: it lines up with how you might query an in memory list of Classes using LINQ to Objects.
But what about the Positional Operator? Unfortunately, since there’s no C#
language support for a custom operator like this, the authors of the C# driver
have special-cased an index of -1
to mean “use the positional operator”.
So the whole thing winds up looking like this:
var client = new MongoClient("mongodb://localhost:27017")
var db = client.GetDatabase("school");
var classes = db.GetCollection<Class>("classes");
classes.FindOneAndUpdate(
c => c.Code == "CS321" && c.Students.Any(s => s.Name == "Bob"), // find this match
Builders<Class>.Update.Set(c => c.Students[-1].Grade, 72)); // -1 means update first matching array element
I think the hard-coded -1
index1 is confusing. You have to know that the
LINQ provider handles that in a different way than it would work in LINQ to
objects. You could use a constant, like PositionalOperator
to bring clarity,
but I’d rather see an extension method provided by Mongo called something like
FirstMatching()
. Then it might look like:
// NOT SUPPORTED SYNTAX
classes.FindOneAndUpdate(
c => c.Code == "CS321" && c.Students.Any(s => s.Name == "Bob"),
Builders<Class>.Update.Set(c => c.Students.FirstMatching().Grade, 72));
Anyway, good luck.
- 1.Note that if you've modeled the
Students
property asIEnumerable<Student>
you won't be able to apply the indexer ([]
) in your LINQ expression. In that case you can usec.Students.ElementAt(-1)
for the same effect. ↩