Meteor - MongoDB - Designing Structures


How should we design our schema in MongoDB?

Now that you are familiar with the basic API of Simple Schema, it’s worth considering a few of the constraints of the Meteor data system that can influence the design of your data schema. Although generally speaking you can build a Meteor data schema much like any MongoDB data schema, there are some important details to keep in mind.

The most important consideration is related to the way DDP, Meteor’s data loading protocol, communicates documents over the wire. The key thing to realize is that DDP sends changes to documents at the level of top-level document fields. What this means is that if you have large and complex subfields on a document that change often, DDP can send unnecessary changes over the wire.

For instance, in “pure” MongoDB you might design the schema so that each list document had a field called todos which was an array of todo items:

Lists.schema = new SimpleSchema({
  name: {type: String},
  todos: {type: [Object]}
});

The issue with this schema is that, due to the DDP behavior just mentioned, each change to any todo item in a list requires sending the entire set of todos for that list over the network. This is because DDP has no concept of “change the text field of the 3rd item in the field called todos”, only “change the field called todos to a totally new array”.
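The effect can be illustrated with a small sketch in plain JavaScript. This is not Meteor's actual DDP implementation; `changedTopLevelFields` is a hypothetical helper that only mimics the field-level granularity described above, and the sample documents are invented for illustration.

```javascript
// Hypothetical helper (not a Meteor API): reports which top-level fields
// differ between two versions of a document, mimicking DDP's granularity.
function changedTopLevelFields(oldDoc, newDoc) {
  const changed = {};
  const keys = new Set([...Object.keys(oldDoc), ...Object.keys(newDoc)]);
  for (const key of keys) {
    if (key === '_id') continue;
    // Whole field values are compared, so any nested change marks the
    // entire field as changed.
    if (JSON.stringify(oldDoc[key]) !== JSON.stringify(newDoc[key])) {
      changed[key] = newDoc[key];
    }
  }
  return changed;
}

const before = {
  _id: 'list1',
  name: 'Groceries',
  todos: [
    {text: 'Milk', checked: false},
    {text: 'Eggs', checked: false},
    {text: 'Bread', checked: false}
  ]
};

// Edit only the text of the third todo.
const after = JSON.parse(JSON.stringify(before));
after.todos[2].text = 'Sourdough bread';

const payload = changedTopLevelFields(before, after);
console.log(Object.keys(payload));  // [ 'todos' ]
console.log(payload.todos.length);  // 3
```

Although a single string changed, the payload contains all three todo items, because the smallest unit that can be reported as changed is the whole todos field.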

The implication of the above is that we need to create more collections to contain sub-documents. In the case of the Todos application, we need both a Lists collection and a Todos collection to contain each list’s todo items. Consequently we need to do some things that you’d typically associate with a SQL database, like using foreign keys (todo.listId) to associate one document with another.
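As a rough sketch of the two-collection layout, the following uses plain arrays as stand-ins for the Lists and Todos collections. The sample data and the `todosForList` helper are assumptions made for illustration; with real Meteor collections the lookup would be roughly `Todos.find({listId: listId})`.

```javascript
// In-memory stand-ins for two collections (assumption: plain arrays,
// not real Mongo.Collection instances).
const lists = [
  {_id: 'list1', name: 'Groceries'},
  {_id: 'list2', name: 'Chores'}
];

// Each todo is its own document and points at its list via listId,
// much like a foreign key in a SQL database.
const todos = [
  {_id: 'todo1', listId: 'list1', text: 'Milk', checked: false},
  {_id: 'todo2', listId: 'list1', text: 'Eggs', checked: false},
  {_id: 'todo3', listId: 'list2', text: 'Vacuum', checked: false}
];

// Hypothetical helper: fetch the todos that belong to one list.
function todosForList(listId) {
  return todos.filter(todo => todo.listId === listId);
}

// Changing one todo now touches a single small document; the list
// document and the sibling todos are unaffected.
const edited = todos.find(todo => todo._id === 'todo2');
edited.text = 'Free-range eggs';

console.log(todosForList('list1').map(todo => todo.text));
```

With this layout, a change to one todo's text only requires sending that todo's text field, rather than every todo in the list.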

In Meteor, it’s often less of a problem doing this than it would be in a typical MongoDB application, as it’s easy to publish overlapping sets of documents (we might need one set of users to render one screen of our app, and an intersecting set for another), which may stay on the client as we move around the application. So in that scenario there is an advantage to separating the subdocuments from the parent.

However, given that MongoDB prior to version 3.2 didn’t support queries over multiple collections (“joins”), we typically end up having to denormalize some data back onto the parent collection. Denormalization is the practice of storing the same piece of information in the database multiple times (as opposed to a non-redundant “normal” form). MongoDB encourages denormalization and is optimized for it.

In the case of the Todos application, as we want to display the number of unfinished todos next to each list, we need to denormalize list.incompleteTodoCount. This is an inconvenience but typically reasonably easy to do as we’ll see in the section on abstracting denormalizers below.
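To make the idea concrete, here is a minimal in-memory sketch of keeping list.incompleteTodoCount in sync. It is not the denormalizer from the Todos application; the `Map`-based storage and the `insertTodo`, `setChecked`, `removeTodo`, and `recountIncomplete` helpers are hypothetical names chosen for this example.

```javascript
// In-memory stand-ins for the Lists and Todos collections (assumption).
const lists = new Map([
  ['list1', {_id: 'list1', name: 'Groceries', incompleteTodoCount: 0}]
]);
const todos = new Map();

// Hypothetical denormalizer: recompute the count stored on the list.
function recountIncomplete(listId) {
  let count = 0;
  for (const todo of todos.values()) {
    if (todo.listId === listId && !todo.checked) count += 1;
  }
  lists.get(listId).incompleteTodoCount = count;
}

// Every write path that can affect the count calls the denormalizer.
function insertTodo(todo) {
  todos.set(todo._id, {...todo});
  recountIncomplete(todo.listId);
}

function setChecked(todoId, checked) {
  const todo = todos.get(todoId);
  todo.checked = checked;
  recountIncomplete(todo.listId);
}

function removeTodo(todoId) {
  const todo = todos.get(todoId);
  todos.delete(todoId);
  recountIncomplete(todo.listId);
}

insertTodo({_id: 'todo1', listId: 'list1', text: 'Milk', checked: false});
insertTodo({_id: 'todo2', listId: 'list1', text: 'Eggs', checked: false});
setChecked('todo1', true);
console.log(lists.get('list1').incompleteTodoCount);  // 1
```

Recounting on every write is the simplest way to keep the value correct. Incrementing or decrementing the stored count would be cheaper, but it can drift if any write path forgets to adjust it.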

Another denormalization that this architecture sometimes requires is from the parent document onto sub-documents. For instance, in Todos, as we enforce privacy of the todo lists via the list.userId attribute, but we publish the todos separately, it might make sense to denormalize todo.userId also. To do this, we’d need to be careful to take the userId from the list when creating the todo, and to update all relevant todos whenever a list’s userId changed.
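The two rules in the last sentence might look like the following sketch. Again this uses in-memory `Map`s rather than real collections, and `insertTodo` and `setListOwner` are hypothetical helper names, not part of the Todos application.

```javascript
// In-memory stand-ins for the Lists and Todos collections (assumption).
const lists = new Map([
  ['list1', {_id: 'list1', name: 'Private list', userId: 'alice'}]
]);
const todos = new Map();

// Rule 1: a new todo copies userId from its list; callers never supply it.
function insertTodo({_id, listId, text}) {
  const list = lists.get(listId);
  if (!list) throw new Error(`Unknown list: ${listId}`);
  todos.set(_id, {_id, listId, text, checked: false, userId: list.userId});
}

// Rule 2: changing the list's owner updates every todo in that list.
function setListOwner(listId, userId) {
  lists.get(listId).userId = userId;
  for (const todo of todos.values()) {
    if (todo.listId === listId) todo.userId = userId;
  }
}

insertTodo({_id: 'todo1', listId: 'list1', text: 'Milk'});
insertTodo({_id: 'todo2', listId: 'list1', text: 'Eggs'});
setListOwner('list1', 'bob');

console.log([...todos.values()].map(todo => todo.userId));  // [ 'bob', 'bob' ]
```

Routing all writes through helpers like these keeps the copied field from drifting away from its source on the list.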

An application, especially a web application, is rarely finished, and it’s useful to consider potential future changes when designing your data schema. As in most things, it’s rarely a good idea to add fields before you actually need them (often what you anticipate doesn’t actually end up happening, after all).

However, it’s a good idea to think ahead to how the schema may change over time. For instance, you may have a list of strings on a document (perhaps a set of tags). Although it’s tempting to leave them as a subfield on the document (assuming they don’t change much), if there’s a good chance that they’ll end up becoming more complicated in the future (perhaps tags will have a creator, or subtags later on?), then it might be easier in the long run to make a separate collection from the beginning.

The amount of foresight you bake into your schema design will depend on your app’s individual constraints, and will need to be a judgement call on your part.
