Object-Relational Impedance Mismatch
Active Record versus Data Mapper
Does "Active Record" in Rails use an Identity Map? Even if we use an Identity Map, should we cache the objects on our web servers, or should we keep them in a bank of Redis / memcache servers? If we use Redis / memcache, there is complexity involved in setting up, administering, and maintaining these cache servers, and of course it takes time to fetch data from them. If we cache the objects on the web servers, we need to make sure that the web servers have enough memory.
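To make the Identity Map idea concrete, here is a minimal sketch of an in-process map that hands back the same object for the same primary key. It is not how Rails or any particular framework implements it; the `IdentityMap` class, the `users` table, and the `queries` counter are all assumptions made up for illustration, and SQLite stands in for the real database.

```python
import sqlite3


class IdentityMap:
    """Hypothetical in-process cache: one object per primary key."""

    def __init__(self, conn):
        self.conn = conn
        self._loaded = {}   # primary key -> already loaded object
        self.queries = 0    # database hits, counted for illustration only

    def find_user(self, user_id):
        # Return the cached object if this key was loaded before.
        if user_id in self._loaded:
            return self._loaded[user_id]
        self.queries += 1
        row = self.conn.execute(
            "SELECT id, email FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        if row is None:
            return None
        user = {"id": row[0], "email": row[1]}
        self._loaded[user_id] = user
        return user


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (id, email) VALUES (1, 'a@example.com')")

identity_map = IdentityMap(conn)
first = identity_map.find_user(1)
second = identity_map.find_user(1)   # served from memory, no second query
```

A map like this lives in the memory of one web server process, which is exactly the trade-off raised above: it is fast and simple, but every server holds its own copy, and nothing here handles invalidation when another process changes the row.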
Like everything else, Active Record may not be the right tool for the job, depending on the requirements of the ERM (Entity Relationship Model) or on how you want to store the data in your database. You have to evaluate your requirements and choose the right tool for the job. AR trades flexibility for simplicity. For complex requirements, we may have to roll our own solution or go with the bare minimum. If your application does not need web-scale scalability, or does not have complex requirements, using AR will get the job done faster and keep the code simple and clean.
Relying on an ORM can become a crutch. It makes it easy to write poorly performing applications, because by default it is greedy in how it selects and updates records. If you have a record with 50 fields but only need to access 3 of them, a typical Active Record implementation will still select all 50 unless told otherwise. If you only need to update the email, an implementation without change tracking will write all 50 fields. Blindly relying on the ORM to do the work for you is bad.
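One way an implementation can avoid writing every field is to track which attributes changed and update only those. The sketch below illustrates that idea under stated assumptions: the `User` class, its `set` and `save` methods, and the `users` table are hypothetical, not the API of any real ORM.

```python
import sqlite3


class User:
    """Hypothetical record object that remembers which attributes changed."""

    def __init__(self, conn, row_id, **attrs):
        self._conn = conn
        self._id = row_id
        self._attrs = dict(attrs)
        self._dirty = set()

    def set(self, name, value):
        # Only known columns are accepted, so column names used in the SQL
        # below never come from arbitrary input.
        if name not in self._attrs:
            raise KeyError(name)
        if self._attrs[name] != value:
            self._attrs[name] = value
            self._dirty.add(name)

    def save(self):
        """Write only the changed columns; return the SQL issued, or None."""
        if not self._dirty:
            return None
        columns = sorted(self._dirty)
        assignments = ", ".join(f"{column} = ?" for column in columns)
        sql = f"UPDATE users SET {assignments} WHERE id = ?"
        values = [self._attrs[column] for column in columns] + [self._id]
        self._conn.execute(sql, values)
        self._dirty.clear()
        return sql


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ann', 'old@example.com')")

user = User(conn, 1, name="Ann", email="old@example.com")
user.set("email", "new@example.com")
issued_sql = user.save()   # touches only the email column
```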
In the SQL world, it is recommended that we select only the columns that we need (less traffic on the network). Also, if all the columns that we need are in an index, then only the index is read (the data table is not read at all), which is a performance gain. In the NoSQL world (big web-scale distributed databases), it is recommended that we select all columns. Then there are also columnar databases. We need to measure AR performance for each of these scenarios.
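The index-only read can be observed directly. The following is a small sketch using SQLite, chosen only because it ships with Python; the `users` table and the `idx_users_email` index are made-up names, and other databases report their plans differently. It compares the query plan of a narrow select against `SELECT *`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT, bio TEXT)"
)
conn.execute("CREATE INDEX idx_users_email ON users (email)")


def query_plan(sql, params=()):
    # Each EXPLAIN QUERY PLAN row ends with a human-readable detail column.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Every requested column lives in the index, so the table need not be read.
narrow_plan = query_plan(
    "SELECT email FROM users WHERE email = ?", ("a@example.com",)
)
# SELECT * needs name and bio too, so the table rows must be visited.
wide_plan = query_plan(
    "SELECT * FROM users WHERE email = ?", ("a@example.com",)
)

print(narrow_plan)
print(wide_plan)
```

In SQLite the narrow query should be reported as using a covering index, while the wide one uses the index only to locate rows and then reads the table.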
With regard to fetching too many or too few columns, this is a problem of modular app design in general. If you don't know ahead of time which components will be on a page (and consequently which columns will be used), you may as well fetch all the fields at once, since the time has already been spent finding the row, unless the fields are huge TEXT or BLOB fields. It's developer time versus network performance, and we should not take this consideration lightly on either side. We can look into using lazy loaders. Perhaps a particular Active Record implementation can intelligently cache the database schema, and therefore know which columns should be lazy loaded.
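A lazy loader for a large column could look roughly like the sketch below. The `Article` class, the `articles` table, and the choice of `body` as the lazily loaded column are assumptions for illustration; a real implementation might decide this from cached schema information, as suggested above.

```python
import sqlite3


class Article:
    """Hypothetical record: small columns load eagerly, `body` loads lazily."""

    _UNLOADED = object()   # sentinel, so a NULL body is still cached

    def __init__(self, conn, article_id, title):
        self._conn = conn
        self.id = article_id
        self.title = title
        self._body = Article._UNLOADED

    @classmethod
    def find(cls, conn, article_id):
        # The large TEXT column is deliberately left out of this query.
        row = conn.execute(
            "SELECT id, title FROM articles WHERE id = ?", (article_id,)
        ).fetchone()
        return cls(conn, row[0], row[1]) if row else None

    @property
    def body(self):
        # First access fetches the column; later accesses reuse the value.
        if self._body is Article._UNLOADED:
            row = self._conn.execute(
                "SELECT body FROM articles WHERE id = ?", (self.id,)
            ).fetchone()
            self._body = row[0] if row else None
        return self._body


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.execute("INSERT INTO articles VALUES (1, 'Hello', 'a very long text')")
conn.commit()

selects = []   # SELECT statements executed from here on
conn.set_trace_callback(
    lambda sql: selects.append(sql) if sql.lstrip().upper().startswith("SELECT") else None
)

article = Article.find(conn, 1)
title_only_queries = len(selects)    # body has not been touched yet
text = article.body
text_again = article.body            # cached, no further query
```

The cost is a second round trip for pages that do need the large column, so this only pays off when most accesses skip it.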
Active Record: An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data.
- Active Record is a good choice for domain logic that isn't too complex, such as creates, reads, updates, and deletes. Derivations and validations based on a single record work well in this structure.
- The pattern could provide a great productivity boost and value when business logic is simple and does not require us to work on multiple rows at once or in a loop.
- In a typical implementation, you will have the following set of methods and properties in every class:
  - Getting the data from the database
  - Instantiating a new instance in memory for inserting into the database
  - Saving changes to the database
  - Loading related entities
  - Usually loads of methods (inherited from the base framework class) to deal with all the complexity involved with the above-mentioned methods
  - Column-related properties: there will be at least one property generated per column
- Also, the frameworks usually provide several overloads of each method so that you can handle every possible scenario. And then, of course, there is the business logic that you put into this class.
- The database is very nicely abstracted away, which is a good thing, but this also means that accessing a property can cause a database hit. In fact, because of this simplicity, a lot of developers tend to forget that they are working with a row in a database. When working with an ORM or Active Record, we have to be especially conscious of loops. If we use the ORM in a loop (say 50 iterations), we may be executing 50 queries to fetch data from the database. ORM frameworks should offer a way to bulk load data and to batch updates. An ORM should be used where it adds value. We should continue to use stored procedures and SQL wherever it makes sense.
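The loop problem described in the last point can be reproduced in a few lines. This is a sketch under stated assumptions, not any framework's actual behaviour: the `Post` class, its `author_name` property, and the `authors` / `posts` tables are hypothetical, and SQLite's statement trace is used only to count queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
""")
conn.executemany("INSERT INTO authors VALUES (?, ?)",
                 [(i, f"author{i}") for i in range(1, 51)])
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                 [(i, i, f"post{i}") for i in range(1, 51)])
conn.commit()

statements = []                       # every SQL statement executed from here on
conn.set_trace_callback(statements.append)


class Post:
    """Hypothetical Active Record style class wrapping one row of `posts`."""

    def __init__(self, post_id, author_id, title):
        self.id, self.author_id, self.title = post_id, author_id, title

    @classmethod
    def all(cls):
        rows = conn.execute("SELECT id, author_id, title FROM posts ORDER BY id")
        return [cls(*row) for row in rows]

    @property
    def author_name(self):
        # Looks like a plain attribute, but each access is a database hit.
        row = conn.execute(
            "SELECT name FROM authors WHERE id = ?", (self.author_id,)
        ).fetchone()
        return row[0]


def count_selects(action):
    statements.clear()
    result = action()
    selects = [s for s in statements if s.lstrip().upper().startswith("SELECT")]
    return result, len(selects)


def names_via_loop():
    return [post.author_name for post in Post.all()]


def names_via_bulk_load():
    posts = Post.all()
    ids = sorted({post.author_id for post in posts})
    placeholders = ", ".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, name FROM authors WHERE id IN ({placeholders})", ids
    )
    names = dict(rows)                # author id -> name, loaded in one query
    return [names[post.author_id] for post in posts]


loop_names, loop_queries = count_selects(names_via_loop)
bulk_names, bulk_queries = count_selects(names_via_bulk_load)
print(loop_queries, bulk_queries)
```

Both functions return the same names, but the loop issues one query for the posts plus one per post, while the bulk version issues two queries in total regardless of how many posts there are.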