entity framework | Tim Anderson's IT Writing

I have been working on a project which I thought would be simpler than it turned out to be – nothing new there, most software projects are like that.

The project involves upload and download of large files from Azure storage. There is a database as part of the application, nothing too demanding, but requiring some typical CRUD (Create, Retrieve, Update, Delete) functionality. I had to decide how to implement this.

First, a confession. I am comfortable using SQL and my normal approach to a database application is to use ADO.NET DataReaders to read data. They are brilliant; you just send some SQL to the database and back comes the data in a format that is easy to read back in C# code.

When I need to update the data, I use SqlCommand.ExecuteNonQuery which executes arbitrary SQL. It is easy to use parameters and transactions, and I get full control over how many connections are open and so on.

This approach has always worked well for me and I get excellent performance and complete flexibility.

However, when coding in ASP.NET MVC and Visual Studio you are now steered firmly towards Entity Framework (EF), Microsoft’s object-relational mapping library. You can use a code-first approach. Simply create a C# class for the object you want to store, and EF handles all the drudgery of creating tables and building SQL queries, letting you concentrate on the unique features of your application.

In addition, you can right-click in the Solution Explorer, choose Add Controller, and a wizard will generate all the code for listing, creating, editing and deleting those objects.

Well, that is the idea, and it does work, but I soon ran into issues that made me wonder if I had made the right decision.

One of the issues is what happens when you change your mind. Maybe that field should be an Int rather than a String. Maybe you need a second phone number field. Maybe you need to create new tables. How do you keep the database in synch with your classes?

This is called Code First Migrations and involves running commands that work out how the database needs to change and generates code to update it. It’s clever stuff, but the downside is that I now have a bunch of generated classes and a generated _MigrationHistory table which I did not need before. In addition, something when slightly wrong in my case and I ended up having to comment out some of the generated code in order to make the migration work.

At this point EF is creating work for me, rather than saving it.

Another issue I encountered was puzzling out how to do stuff beyond the most trivial. How do you replace an HTML edit box with a dropdown list? How do you exclude fields from being saved when you call dbContext.SaveChanges? What is the correct way to retrieve and modify data in pure code, without data binding?

I am not the first to have questions. I came across this documentation: an article promisingly entitled How to: Add, Modify, and Delete Objects which tells you nothing of value. Spot how many found it helpful:

You should probably start here instead. Still, be aware that EF is by no means straightforward. Instead of having to know SQL and the basics of ADO.NET commands and DataReaders, you now have to know EF, and I am not sure it is any less intricate. You also need to be comfortable with data binding and LINQ (Language Integrated Query) to make sense of it all, though I will add that strong data binding support is one reason whey EF is a good fit for ASP.NET MVC.

Should you use Entity Framework? It remains, as far as I can tell, the strategic direction for data access on Microsoft’s platform, and once you have worked out the basics you should be able to put together simple database applications more quickly and more naturally than with manually coded SQL.

I am not sure it makes sense for heavy-duty data access, since it is harder to fine-tune performance and if you hit subtle bugs, you may end up in the depths of EF rather than debugging your own code.

I would be interested in hearing from other developers. Do you love EF, avoid it, or is it just about OK?

Microsoft has announced the release candidate of Entity Framework 4.1, the data persistence library for .NET, with a go-live licence. The final release to the web is expected in around one month’s time.

The big new feature is code-first, where you do not need to define a database schema or even a database model. You simply write classes that define objects you want to store, and the framework handles the work of defining the database for you.

Note that according to this article on MSDN:

The Entity Framework is Microsoft’s recommended data access technology for most types of applications.

Of course Microsoft has a long history of data access APIs and keeping up with the latest recommendation over the years has been a challenge. That said, the low-level ADO.NET data API has been in place since the first release of the .NET framework and has evolved rather than been replaced. There has been some confusion over LINQ to SQL versus Entity Framework; but note that LINQ (Language Integrated Query) works with Entity Framework as well.

So what is code-first? A good starting point is VP Scott Guthrie’s post from July last year, where he walks through a complete example using his Nerd Dinner theme. He writes classes to define two entities, Dinner and RSVP. Then he writes the following code:

Having defined this NerdDinners class inheriting from DbContext, he can go ahead and write a complete database application.

At this point there is still no database. In the simplest case though, you can just add a database connection to the project with the same name as the DbContext class – in this case, “NerdDinners”. The Entity Framework will use this connection, define a database schema for you, and save and retrieve objects accordingly.

The magic under the hood is an example of convention over configuration. That is, the framework will generate code and schema according to assumptions it makes based on the names used in your classes. For example, it picks up the field named DinnerID and makes it a primary key; and seeing a collection of RSVP objects called RSVPs in the Dinner class, the framework creates a relationship between the two generated tables. You can override the default behaviour with code mapping rules. There is also provision for updating the schema if you need to add or modify the fields, though this is a point of uncertainty in Guthrie’s post.

It looks fantastic; though there are a few caveats. One is that Microsoft tends to assume use of its own database managers, SQL Server or for simple cases, SQL Server CE. That said, there are drivers for other databases; for example devart has code-first drivers for Oracle, MySQL, PostgreSQL and SQLite.

Another point is that there is a trade-off when working at such a high level of abstraction. There is less code for you to write, but a large amount of generated code, which can make debugging or optimizing an application harder. This is a familiar trade-off though; and you could say that hand-rolled SQL is no different in principle from hand-rolled assembly code; you can get fantastic results but the amount of effort and skill required is greater, as is the risk of errors if you get it slightly wrong.

The Mono team has said that it does not intend to implement Entity Framework currently; there is a summary of work needed here. If you want to write .NET code that ports easily to Mono it is best avoided.

Are you using Entity Framework in new .NET projects? I would be interested to hear from .NET developers what approach you take to data persistence.

Tim Anderson's IT Writing

Tag Archives: entity framework

Should you use Entity Framework for .NET applications?

Microsoft’s code-first Entity Framework 4.1 nearly done

Tech Writing