The past couple of weeks I have been doing some research because I am planning on working on a FastCGI Model-View-Controller framework for developing web application services in C++. I am a C++ engineer by day and have been working with the language for nigh on a decade. The brief research I was able to do has shown me that there are no DataMapper or Active Record open source libraries available.
What Exactly Is The N+1 Query Problem?
When using an abstract framework to build a data model for use with an SQL-esque language you generally need to build a set of classes to model the schema in said language. For Type Aloud, lets say that I want to display all of the stories in a set of categories after you get them back based upon two slugs taken from the site’s URL.
SELECT id FROM categories WHERE slug='sci-fi' OR slug='fantasy' SELECT id, name FROM stories WHERE category_id = 0 OR category_id = 1
How many queries will be run? At most there will be two queries, but what about if none of the categories match the two slugs provided? That first dangling select query will be executed each and every time even if there are no matches to search against. How can we fix this?
SELECT * FROM stories INNER JOIN categories ON stories.category_id = categories.id AND (categories.slug = 'sci-fi' OR categories.slug = 'fantasy')
This type of explicit join will allow you to execute a single SQL query and, in the long run, will save you thousands of CPU cycles with your web applications. This type of data mapper already exists in Ruby on Rails using the Data Mapper gem which is freely available. But as of right now there is no such framework available (unless I am mistaken) in C++ which gives you the power and flexibility to build your data models and eat your cake too. My first plan in developing an MVC framework is to work on architecting a proper data mapper library for C++.
Anyone else game?
|
Dan Kubb |
I'm not a C++ dev, but I am the maintainer of the DataMapper ruby gem. I'm not sure what help I'd be, but if you ever need someone to bounce ideas off, or wonder how we solved a specific problem, feel free to ping me anytime — I'm "dkubb" on the #datamapper freenode.net IRC channel.
One suggestion I'd make to anyone seriously considering writing an ORM is to pick up Martin Fowler's book Patterns of Enterprise Application Architecture (PoEAA). You won't find anything more densely packed with ideas on ORM design. Most importantly it gives you the names for things and provides the vocabulary to use when searching for more information, or discussing ORM design with other devs inside and outside your community.
The N+1 problem can actually be solved using two approaches: Lazy Loading and Eager Loading. Lazy Loading means that you wait to perform the query until the client requests the actual information, and it's at that time that you fetch only what is required in the most efficient way. Eager Loading is when the client specifies up-front what the object graph should look like that they want to load. An ORM can implement both, but in the ruby world DataMapper only implements Lazy Loading, while ActiveRecord only implements Eager Loading.
In retrospect, I think this was probably a design mistake for DM. If I had to do it over again (and I am working on it), I would probably provide both options. Lazy Loading associations is ridiculously difficult to get working properly, and it would've been nice for the cases where it didn't work to have Eager Loading (which is easier to test IMHO) as a fallback. Also, it might be possible to design the code so that both approaches share some (most?) of the code behind the scenes, and once you get all the edge cases working with Eager Loading, anyone using Lazy Loading reaps the same benefits.
johnbellone |
Thanks Dan for the comment. I was actually considering purchasing that book the first time I saw it, and you definitely just sold me on it. I actually wrote a reusable lazy loading C++ class that I was going to attempt to get into the boost project. It is nice to hear from someone that already went through the process.
I may be hitting you up when I start the project for some design advice.