LINQ-2-SQL – The Good and the Bad

It’s been a few weeks since I’ve posted an update on here, and I do have a good excuse! My brother is getting married (today – July 17th), I ended up going to Vegas for his bachelor party, and I’m preparing to go on vacation myself on Sunday. Before I go I did want to make another post though, and so here it is.

 

Since the release of LINQ in .NET 3.5 I had been a little skeptical of it. I’ve never been a fan of code that does things automagically (yes, I said automagically). When I took my latest job I started working on a project which made use of LINQ-2-SQL very highly, and so I was forced to embrace it and begin working with it. My overall impression of it is that it’s not quite as bad as I originally thought. Selecting objects from a list with SQL-like syntax actually seems very natural, and the selection can replace a lot of lines of messy code when you’re searching for a certain subset of items within a larger collection. On the other hand, I still feel a foreach loop iterating over a collection makes your code more readable in the long run.

 

There are also some nasty pitfalls when using LINQ-2-SQL which most developers won’t ever notice unless they’re using large datasets, which we just so happen to be doing. The system I was working on as mentioned earlier has a database which is between 10-20GB at any point in time. This is by no means extremely large, but it was big enough for us to start seeing major performance issues with sloppy LINQ-2-SQL code.

 

No matter what the queries LINQ generated for selections were fairly efficient; I have no complaints about that at all. Where it breaks down, and horribly so if you don’t know what you’re doing, is when doing insert, update and delete operations. The default behavior of LINQ-2-SQL is to open the connection to the data store execute one data modification query for one record, close the connection, and then move to the next record. Like I said before, unless you’re working with big datasets this probably isn’t an issue. On the other hand our system was working with datasets on the size of hundreds of thousands to millions of rows at a time, working one row at a time. Can you imagine how much overhead was being generated for data modifications?? It was utter ridiculousness.

 

The bright side of this whole situation is you can make LINQ-2-SQL more efficient in this regard, but you do need to be explicit about it. You must make sure to open your connection to the database before you begin your updates. If you want your whole update wrapped in a transaction you must tell LINQ this or each update will be its own transaction. Now, and not before, you ask LINQ to update your database. (There may be some inconsistencies here, I’m writing this from memory and I haven’t touched the offending project in months.)

 

How does being explicit with LINQ fare compared to trusting the default behavior? Quite well I must say. Our modifications which were taking hours upon hours before could be done in as 15 minutes or less. We’re talking an order of magnitude of difference here. Personally I couldn’t believe how inefficient it was off the bat and how well it performed before. A couple little steps before any data modifications saved me the trouble of having to rip out the guts of the data access layer for this project and rewrite it.

 

Anyway, I hope this helps someone else in the future. I’ll update with some code examples later when I get back to work.

About Michael Carpenter

I'm a software developer for a Kitchener based software company, and do custom work on the side. I also enjoy creating my own projects to solve problems which I encounter, if a solution doesn't already exist.
This entry was posted in Development. Bookmark the permalink.

Leave a Reply