How to count documents in Google Cloud Firestore?
Counting documents in a collection is a well-known pain point of the Cloud Firestore service.
Issue Description
Almost all relational databases and many NoSQL databases have a simple interface for getting the total number of rows, documents, or tables. Such interfaces are typically supported by the DB engine out of the box.
Most developers who start working with Firestore for the first time expect the same from the Cloud Firestore SDK. But there is no built-in function for it.
The official Firestore GitHub repository has a few feature requests for a count() documents function that were created a few years ago. Judging by the comments, the team does not plan to implement this feature in upcoming releases.
Searching for a way to count the total number of documents in a Firestore collection, you will find a huge number of StackOverflow questions. All of them offer various hacks and workarounds, each with its own limitations and bugs.
Possible solutions
I will go through all the solutions that I found and analyze their weaknesses.
Snapshot Size
The first solution is pretty simple and straightforward. The idea is to fetch all documents and count them.
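A minimal sketch of this approach, assuming the Node.js Admin SDK and an example collection called items:

```ts
import * as admin from "firebase-admin";

admin.initializeApp();
const db = admin.firestore();

// Fetch every document in the collection and count the snapshot size.
// NOTE: this reads the whole collection, so it costs N read operations.
async function countItems(): Promise<number> {
  const snapshot = await db.collection("items").get();
  return snapshot.size;
}
```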
✅ Simple to implement.
✅ Fine for small collections (10–250 documents).
❌ Returns the wrong value for huge collections (1000+ documents).
❌ Increases latency for a simple operation (all documents have to be fetched before counting).
❌ Overuses Firestore read limits — each count consumes N read operations instead of ~1, which will rapidly increase your bill.
Write On Create
Probably the first idea that comes to mind is to store the counter in a separate collection and increase it each time we create a new item.
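A minimal sketch of that idea, assuming an items collection, a counters/items counter document, and the Admin SDK's FieldValue.increment:

```ts
import * as admin from "firebase-admin";

admin.initializeApp();
const db = admin.firestore();

// Hypothetical helper: create a document and bump the counter in one batch.
// The "counters/items" path is an assumption of this sketch.
async function createItem(data: admin.firestore.DocumentData): Promise<void> {
  const batch = db.batch();
  batch.set(db.collection("items").doc(), data);
  batch.set(
    db.collection("counters").doc("items"),
    { count: admin.firestore.FieldValue.increment(1) },
    { merge: true }
  );
  await batch.commit();
}
```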
✅ No need to fetch all documents to get the count. Saves your Firestore budget.
❌ You have to add counter-update code everywhere you create or delete documents. It is hard to handle errors in transactions or batch creates/deletes.
❌ Does not cover documents created or removed through the Firestore Web UI or other Firestore clients.
Write Listener
Google Cloud Functions / Firebase Functions give us the ability to create serverless functions that are triggered by specific events.
Firestore emits events for write operations on collections and documents, so a trigger-based implementation looks like a natural fit for this problem.
There are a lot of references to such a solution across the internet.
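A typical version of this widely shared trigger looks roughly like the sketch below; the items path and the counters/items document are assumptions:

```ts
import * as functions from "firebase-functions";
import * as admin from "firebase-admin";

admin.initializeApp();
const db = admin.firestore();

// Naive trigger: bump the counter on every create/delete in "items".
export const countItems = functions.firestore
  .document("items/{itemId}")
  .onWrite(async (change) => {
    const counterRef = db.collection("counters").doc("items");
    if (!change.before.exists && change.after.exists) {
      // Document created.
      await counterRef.set(
        { count: admin.firestore.FieldValue.increment(1) },
        { merge: true }
      );
    } else if (change.before.exists && !change.after.exists) {
      // Document deleted.
      await counterRef.set(
        { count: admin.firestore.FieldValue.increment(-1) },
        { merge: true }
      );
    }
  });
```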
❌ It looks like a perfect solution, but it does not work properly. If you run this function and then create documents (100, for example), the final counter value will be greater than 100.
Let's investigate what is wrong with this solution and why it does not work as expected.
Firestore Trigger Limitations
The last point tells us that each trigger function will be executed at least once. This means it can fire several times for a single write in case of infrastructure issues, instance replication, etc.
This is the main point we need to keep in mind to build a proper solution.
Final Solution
The final solution is based on the Write Listener approach, but we need to eliminate duplicate counter writes and extend it to support multiple counters.
Each Firestore event has a context ID that is guaranteed to be unique for each create/delete operation.
Let's first create a separate collection to store events by ID. Each event is a separate document with a few fields: timestamp, collectionName, and value.
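A sketch of such a trigger; the events collection name and the items path are assumptions:

```ts
import * as functions from "firebase-functions";
import * as admin from "firebase-admin";

admin.initializeApp();
const db = admin.firestore();

// Record every create/delete as an event document keyed by context.eventId,
// which is unique per operation. If the trigger is re-delivered, it simply
// overwrites the same event document, so duplicates collapse into one.
export const recordItemEvent = functions.firestore
  .document("items/{itemId}")
  .onWrite(async (change, context) => {
    const created = !change.before.exists && change.after.exists;
    const deleted = change.before.exists && !change.after.exists;
    if (!created && !deleted) return; // ignore plain updates

    await db.collection("events").doc(context.eventId).set({
      timestamp: admin.firestore.FieldValue.serverTimestamp(),
      collectionName: "items",
      value: created ? 1 : -1,
    });
  });
```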
Now deploy this trigger and create an item to check that events are being created correctly.
The next step is to count these events and write the number to a separate counters collection, and, as an improvement, to clean up the events collection since we do not need those documents anymore. (The cleanup can be skipped for a low-load system with fewer than ~100 events per day.)
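Here is one possible implementation as a scheduled function. The schedule, the events and counters collection names, and the 5-minute window are assumptions of this sketch:

```ts
import * as functions from "firebase-functions";
import * as admin from "firebase-admin";

admin.initializeApp();
const db = admin.firestore();

// Periodically fold events into the counters collection and delete them.
export const aggregateEvents = functions.pubsub
  .schedule("every 5 minutes")
  .onRun(async () => {
    // Only process events older than 5 minutes so that in-flight triggers
    // are not missed (see the note on the limitDate below).
    const limitDate = new Date(Date.now() - 5 * 60 * 1000);

    const snapshot = await db
      .collection("events")
      .where("timestamp", "<", limitDate)
      .get();

    // Sum the deltas per tracked collection.
    const deltas: Record<string, number> = {};
    snapshot.forEach((doc) => {
      const { collectionName, value } = doc.data();
      deltas[collectionName] = (deltas[collectionName] || 0) + value;
    });

    const batch = db.batch();
    for (const [name, delta] of Object.entries(deltas)) {
      batch.set(
        db.collection("counters").doc(name),
        { count: admin.firestore.FieldValue.increment(delta) },
        { merge: true }
      );
    }
    // Clean up the processed events.
    // NOTE: a Firestore batch is limited to 500 operations; split it
    // into several batches for high-volume systems.
    snapshot.forEach((doc) => batch.delete(doc.ref));
    await batch.commit();
  });
```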
In the sketch, limitDate is set to currentTime - 5 minutes. This is because of the point in the trigger limitations document that says each function execution can take up to 10 seconds, so very recent events may still be in flight and should not be aggregated yet.
Finally, we need a way to get the exact number of documents, even when some events have not yet been folded into the counter.
We can do this with a simple script that takes the last saved counter value and adds the pending events for the current collection, as sketched below.
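For illustration, assuming the counters and events collections from the previous sketches, such a script could look like this:

```ts
import * as admin from "firebase-admin";

admin.initializeApp();
const db = admin.firestore();

// Exact count = last aggregated counter value + events not yet folded in.
async function getExactCount(collectionName: string): Promise<number> {
  const counterDoc = await db.collection("counters").doc(collectionName).get();
  const counted = counterDoc.exists ? counterDoc.data()!.count : 0;

  const pending = await db
    .collection("events")
    .where("collectionName", "==", collectionName)
    .get();
  const pendingDelta = pending.docs.reduce(
    (sum, doc) => sum + doc.data().value,
    0
  );

  return counted + pendingDelta;
}
```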
✅ Works properly.
❌ Doubles write operations (2N) and adds N delete operations. On the other hand, reading the counter costs ~1 read operation (the first solution costs N reads on every count).
❌ Complex setup. It would be good to have a simpler solution, but there isn't one.
I have tested this solution by bulk-creating and removing over 2k documents within a few seconds. Everything has been working properly for me for a long time.
👌 Thank you for reading. I hope my article is helpful to somebody facing the same problem.
👏 Clap in case it was helpful for you.
🙌 Share your own experience, so we can discuss it and improve the solution.