Another LinQ in the Chain of O/R Mappers
Microsoft is talking a lot about its forthcoming LINQ to SQL (no longer called DLinq!) and ADO.NET Entity Framework technologies, and they really look pretty sharp (despite the confusion over there being two mapping technologies).
But I have a reservation. Here's a quote from the Entity Framework doc:
No plumbing. The code is very database intensive, yet there are no database connection objects, no external language such as SQL for query formulation, no parameter binding, no configuration embedded in code. In this sense, you could say this code is “pure business logic”.
(No time to grab an image just now. Go to the doc to see the code it's talking about, but if you've seen DLinQ samples before, you've seen it.)
So, is that a good thing? Pure business logic that generates the appropriate SQL and parameter binding and config logic? Or is that an unfortunate merging of domain concerns with implementation concerns?
For a look at why this may not be a good thing, let me introduce iBATIS. When Paul Gielens and others posted about iBATIS, I checked it out. It uses more of a pure Fowler-style Data Mapper than an O/R mapper, and the difference is subtle at first but hugely important. From the iBATIS site:
This framework maps classes to SQL statements using a very simple XML descriptor.
So, what iBATIS does for you is this: you still write SQL, but you don't have to write all the glue that gets the SQL data into your objects and back. You just put the SQL into their simple XML format and it just works. Slick.
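To make the idea concrete, here is a minimal sketch of the "map a class to a SQL statement" approach. It is written in Python against the standard sqlite3 module purely for illustration; it is not iBATIS code, and the `STATEMENTS` table, the `query` helper, the `Customer` class, and the `customer` table are all hypothetical names standing in for what iBATIS would read from its XML descriptor.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    id: int
    name: str


# Hypothetical statement map: hand-written SQL keyed by a statement id,
# paired with the class each result row should become.
STATEMENTS = {
    "Customer.byId": ("SELECT id, name FROM customer WHERE id = :id", Customer),
    "Customer.all": ("SELECT id, name FROM customer ORDER BY id", Customer),
}


def query(conn, statement_id, params=None):
    """Run a named SQL statement and map each row onto its result class."""
    sql, result_class = STATEMENTS[statement_id]
    cursor = conn.execute(sql, params or {})
    # Column names come from the SQL itself, so the mapping follows the query.
    columns = [description[0] for description in cursor.description]
    return [result_class(**dict(zip(columns, row))) for row in cursor]


def demo_connection():
    """Build a throwaway in-memory database for trying the mapper out."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customer (id, name) VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    return conn
```

Calling `query(demo_connection(), "Customer.byId", {"id": 2})` hands back a list of `Customer` objects. The SQL stays visible and hand-tuned; only the binding and row-to-object glue is shared.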
This avoids the problem of the leaky abstraction, as explained in Clemens Vasters's post. Basically, an O/R mapper needs to know everything about how the data can be accessed, and there are a lot of special cases that can come up pretty quickly. So I think the specific queries and joins and when to load really are domain decisions. Of course, I don't mean domain decisions in the sense of being integrated with the business logic. An Adapter, or a Repository, if you will, should be as simple as possible, but no simpler. Generating it automatically from metadata removes too many decision points where implementations may need variation.
So use those stored procedures, temp tables, views, or straight SQL for table access. Use the right relational tool for the job to manage duplication and efficiency in the database queries. We don't need new ways to get data out of a relational database. We need simpler ways to call existing queries.
The problem of data transfer isn't SQL. It's the extra lines that it takes to add a parameter, and get the parameter values from the domain objects. It's the lines that it takes to loop through the values in a SQL statement and put them in the right place in the domain object. It's tracking changes and knowing how to commit the changes cleanly. It's knowing that all your client-side data was loaded as a unit and your relational invariants can be tested successfully.
iBATIS solves the first two problems, for either Java or C#. Recently, I threw together some code to use iBATIS, but to add some Unit of Work capabilities. I showed the result to the new guy at my main client and he asked some questions that made me think about what iBATIS actually did for us. Well, I wasn't thrilled with the XML to begin with, open source is a hard sell in this client's environment, and the only thing I could come up with was that it simplified the process of binding the SQL to the class and vice versa. So I took another look to see if there was an even simpler, more strongly typed, no-XML way to accomplish iBATIS's goals. And there surely is. I've now got the first three problems solved for C# 2.0, in less code than iBATIS, with support for nullables, and the upgraded 2.0 design-time WinForms binding. I'll post some code if anybody's interested. It's only a dozen or so short classes.
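For a rough idea of what the third problem (tracking changes and committing them cleanly) involves, here is a minimal Python sketch. It is not the C# code mentioned above, just one simple way to do it under stated assumptions: the `UnitOfWork` class, the `Customer` class, and the `customer` table are hypothetical, and change detection is done by comparing each object against a snapshot taken when it was loaded.

```python
import sqlite3
from dataclasses import asdict, dataclass


@dataclass
class Customer:
    id: int
    name: str


class UnitOfWork:
    """Remembers each object's loaded state and writes back only what changed."""

    # Hand-written SQL; the object's fields are bound to it by name.
    UPDATE_SQL = "UPDATE customer SET name = :name WHERE id = :id"

    def __init__(self, conn):
        self.conn = conn
        self._snapshots = []  # pairs of (object, copy of its fields at load time)

    def track(self, obj):
        """Register a freshly loaded object and snapshot its current fields."""
        self._snapshots.append((obj, asdict(obj)))
        return obj

    def dirty(self):
        """Return the tracked objects whose fields differ from their snapshot."""
        return [obj for obj, snapshot in self._snapshots if asdict(obj) != snapshot]

    def commit(self):
        """Write every changed object in one transaction; return how many were written."""
        changed = self.dirty()
        with self.conn:  # commits on success, rolls back if an update raises
            for obj in changed:
                self.conn.execute(self.UPDATE_SQL, asdict(obj))
        # Re-snapshot so the next commit only sees later changes.
        self._snapshots = [(obj, asdict(obj)) for obj, _ in self._snapshots]
        return len(changed)
```

Snapshot comparison costs a copy of every tracked object, but it keeps the domain classes free of any "dirty flag" plumbing, which is the point of keeping the mapper simple.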
My next challenge is to see if I can handle Aggregates cleanly. The above post I quoted on lazy loading really crystallized a lot of my thoughts in this area, because it had very much already occurred to me what a big gaping hole was created in Lhotka's otherwise fantastic mobile object model if the data was inconsistent. Okay, in practice, it's a very small hole for most projects... but I have been known to get hung up on theory instead of practice.
Inconsistent Aggregates Are Not the Problem with Lazy Loading
If one has an explicit Aggregate concept, then one can, even with Lazy Loading, still do some version checking to see if the whole Aggregate shares the same version. As long as one is using the Aggregate concept, it really doesn't matter when the data is loaded. Lazy Loading is back.
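Here is a minimal Python sketch of that version check, under one assumed scheme: every save of the Aggregate bumps the root's version and stamps the same number on each child row. The `Order`, `OrderLine`, and `StaleAggregateError` names and the `load_lines` callback are hypothetical; other versioning schemes (for example, re-reading the root's version at lazy-load time) would work just as well.

```python
from dataclasses import dataclass


class StaleAggregateError(Exception):
    """Raised when lazily loaded children belong to a different aggregate version."""


@dataclass
class OrderLine:
    sku: str
    aggregate_version: int  # version of the whole aggregate when this row was saved


class Order:
    """Aggregate root whose lines are loaded on first access."""

    def __init__(self, order_id, version, load_lines):
        self.id = order_id
        self.version = version          # version seen when the root was loaded
        self._load_lines = load_lines   # callback that fetches lines by order id
        self._lines = None

    @property
    def lines(self):
        if self._lines is None:
            loaded = self._load_lines(self.id)
            # The check happens at lazy-load time: every child must carry
            # the same version the root had when it was read.
            if any(line.aggregate_version != self.version for line in loaded):
                raise StaleAggregateError(
                    f"order {self.id} changed since version {self.version} was loaded"
                )
            self._lines = loaded
        return self._lines
```

If someone else saved the order between loading the root and touching `lines`, the mismatch surfaces as an exception instead of a silently inconsistent object graph.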
On the other hand, I'm still not sure how I feel about lazy loading from the interface perspective. In a GUI app, there is a benefit to making sure user-perceptible delays are only in the places where they're expected by the user. So I probably won't use it much.