Microsoft has announced the release candidate of Entity Framework 4.1, the data persistence library for .NET, with a go-live licence. The final release to the web is expected in around one month’s time.
The big new feature is code-first, where you do not need to define a database schema or even a database model. You simply write classes that define objects you want to store, and the framework handles the work of defining the database for you.
Note that according to this article on MSDN:
The Entity Framework is Microsoft’s recommended data access technology for most types of applications.
Of course Microsoft has a long history of data access APIs and keeping up with the latest recommendation over the years has been a challenge. That said, the low-level ADO.NET data API has been in place since the first release of the .NET framework and has evolved rather than been replaced. There has been some confusion over LINQ to SQL versus Entity Framework; but note that LINQ (Language Integrated Query) works with Entity Framework as well.
So what is code-first? A good starting point is VP Scott Guthrie’s post from July last year, where he walks through a complete example using his Nerd Dinner theme. He writes classes to define two entities, Dinner and RSVP. Then he writes the following code:
Having defined this NerdDinners class inheriting from DbContext, he can go ahead and write a complete database application.
At this point there is still no database. In the simplest case though, you can just add a database connection to the project with the same name as the DbContext class – in this case, “NerdDinners”. The Entity Framework will use this connection, define a database schema for you, and save and retrieve objects accordingly.
The magic under the hood is an example of convention over configuration. That is, the framework will generate code and schema according to assumptions it makes based on the names used in your classes. For example, it picks up the field named DinnerID and makes it a primary key; and seeing a collection of RSVP objects called RSVPs in the Dinner class, the framework creates a relationship between the two generated tables. You can override the default behaviour with code mapping rules. There is also provision for updating the schema if you need to add or modify the fields, though this is a point of uncertainty in Guthrie’s post.
It looks fantastic; though there are a few caveats. One is that Microsoft tends to assume use of its own database managers, SQL Server or for simple cases, SQL Server CE. That said, there are drivers for other databases; for example devart has code-first drivers for Oracle, MySQL, PostgreSQL and SQLite.
Another point is that there is a trade-off when working at such a high level of abstraction. There is less code for you to write, but a large amount of generated code, which can make debugging or optimizing an application harder. This is a familiar trade-off though; and you could say that hand-rolled SQL is no different in principle from hand-rolled assembly code; you can get fantastic results but the amount of effort and skill required is greater, as is the risk of errors if you get it slightly wrong.
The Mono team has said that it does not intend to implement Entity Framework currently; there is a summary of work needed here. If you want to write .NET code that ports easily to Mono it is best avoided.
Are you using Entity Framework in new .NET projects? I would be interested to hear from .NET developers what approach you take to data persistence.