The Hacker News community that contributed to the adoption of MongoDB is showing dissent, dismay and desertion of the quintessential rainbows-and-unicorns NoSQL database. The fire was set off last week by an anonymous post, ‘Don’t use MongoDB’, and, during the same period, the ‘Failing with MongoDB’ post. These posts triggered all sorts of interesting discussions on the Hacker News threads. There was some trolling, but most of the comments came from people experienced in MongoDB, scalability, databases, open source and startups. Taken together, the threads are a good sampling of informed opinion on MongoDB, both technical and not.
The basis of the trouble is that MongoDB, under certain load conditions, has a tendency to fall over and, crucially, lose data. That raises questions about the quality of the code, the involvement of 10gen and whether things will improve over time, or at all. Added to the specific MongoDB concerns, this seems to have cast a broad shadow over NoSQL databases in general.
Below are the links to the relevant posts, with the Hacker News comment thread (in brackets). I urge you to scan through the comment threads as there are some useful nuggets in there.
I have little to add to the overall discussion (there are some detailed and insightful comments in the threads), but would make the following brief observations.
- MongoDB 1.8 and earlier was unashamedly fast, and the compromise was that the performance gain came from being memory based (commits happen in memory rather than on disk). It was more like a persistent cache than a database for primary data.
- If you absolutely have to have solid, consistent, reliable, error-free, recoverable, transactional and similar guarantees on your data, then MongoDB is probably not a good choice and it would be safer to go with one of the incumbent SQL RDBMSs.
- Not all data has to be so safe and MongoDB has clear and definite use cases.
- However, unexpected and unexplained data loss is a big deal for any data store, even if it is text files sitting in a directory. MongoDB could ‘throw away’ data in a managed fashion and get away with it (say, by giving up in a replication deadlock), but for it to happen mysteriously is a big problem.
- Architects assessing and implementing MongoDB should be responsible. Test it under your own load to make sure that it works, and manage the (by now well known) issues around MongoDB.
- Discussions about NoSQL in general should not be lumped in with MongoDB at all. Amazon SimpleDB is also NoSQL, but doesn’t suffer from data loss issues. (It has others, such as latency, but there is no compromise on data loss.)
- The big problem that I have with using MongoDB properly is that it is beginning to require knowledge of ‘data modelling’ (whatever that means in a document database context), detailed configuration and an understanding of the internals of how it works. NoSQL is supposed to take a lot of that away for you, and if you need to worry about that detail, then going old school may be better. In other words, the benefits of using MongoDB over, say, MySQL have to be significant enough to justify not simply using MySQL from the beginning.
- Creating databases is arguably hard work, and MongoDB is going to run up against problems that Oracle faced, and solved, thirty years ago. It will be interesting to see where this ends up: a better positioned lean database (in terms of its use case) or a bloated, high quality one. MongoDB is where MySQL was ten years ago, and I’m keen to see what the community does with it.
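
To make the first observation concrete, here is a minimal sketch of the trade-off between acknowledging a write once it is in memory and acknowledging it only after it has been forced to disk. It is a toy model and not MongoDB's implementation: `ToyStore`, `surviving_after_crash`, the `durable` flag and the `flush_every` interval are all hypothetical names invented for this illustration, and a "crash" is simulated simply by discarding the in-memory buffer.

```python
import os
import tempfile


class ToyStore:
    """Hypothetical append-only store used only to illustrate the trade-off."""

    def __init__(self, path, durable):
        self.path = path
        self.durable = durable  # True: force each write to disk before acknowledging
        self.buffer = []        # acknowledged writes that exist only in memory

    def write(self, record):
        if self.durable:
            # Slow path: the record is on disk before the caller is told "ok".
            with open(self.path, "a") as f:
                f.write(record + "\n")
                f.flush()
                os.fsync(f.fileno())
        else:
            # Fast path: acknowledge immediately, persist later.
            self.buffer.append(record)
        return True

    def flush(self):
        """Persist everything currently buffered in memory."""
        with open(self.path, "a") as f:
            for record in self.buffer:
                f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.buffer.clear()

    def crash(self):
        """Simulate a process crash: whatever was only in memory is gone."""
        self.buffer.clear()


def surviving_after_crash(durable, n=100, flush_every=30):
    """Write n records, crash, and count how many acknowledged writes survive."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.log")
        store = ToyStore(path, durable)
        for i in range(n):
            store.write(f"record-{i}")
            if not durable and (i + 1) % flush_every == 0:
                store.flush()  # periodic background-style flush
        store.crash()
        if not os.path.exists(path):
            return 0
        with open(path) as f:
            return sum(1 for _ in f)
```

Calling `surviving_after_crash(True)` and `surviving_after_crash(False)` shows the difference: in the fast mode, every write acknowledged since the last flush is silently lost. That is acceptable for cache-like data, and it is exactly the sort of behaviour an architect should measure before trusting a store with primary data.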