I have been working on a project which I thought would be simpler than it turned out to be – nothing new there, most software projects are like that.
The project involves upload and download of large files from Azure storage. There is a database as part of the application, nothing too demanding, but requiring some typical CRUD (Create, Retrieve, Update, Delete) functionality. I had to decide how to implement this.
First, a confession. I am comfortable using SQL and my normal approach to a database application is to use ADO.NET DataReaders to read data. They are brilliant; you just send some SQL to the database and back comes the data in a format that is easy to read back in C# code.
When I need to update the data, I use SqlCommand.ExecuteNonQuery which executes arbitrary SQL. It is easy to use parameters and transactions, and I get full control over how many connections are open and so on.
This approach has always worked well for me and I get excellent performance and complete flexibility.
However, when coding in ASP.NET MVC and Visual Studio you are now steered firmly towards Entity Framework (EF), Microsoft’s object-relational mapping library. You can use a code-first approach. Simply create a C# class for the object you want to store, and EF handles all the drudgery of creating tables and building SQL queries, letting you concentrate on the unique features of your application.
In addition, you can right-click in the Solution Explorer, choose Add Controller, and a wizard will generate all the code for listing, creating, editing and deleting those objects.
Well, that is the idea, and it does work, but I soon ran into issues that made me wonder if I had made the right decision.
One of the issues is what happens when you change your mind. Maybe that field should be an Int rather than a String. Maybe you need a second phone number field. Maybe you need to create new tables. How do you keep the database in synch with your classes?
This is called Code First Migrations and involves running commands that work out how the database needs to change and generates code to update it. It’s clever stuff, but the downside is that I now have a bunch of generated classes and a generated _MigrationHistory table which I did not need before. In addition, something when slightly wrong in my case and I ended up having to comment out some of the generated code in order to make the migration work.
At this point EF is creating work for me, rather than saving it.
Another issue I encountered was puzzling out how to do stuff beyond the most trivial. How do you replace an HTML edit box with a dropdown list? How do you exclude fields from being saved when you call dbContext.SaveChanges? What is the correct way to retrieve and modify data in pure code, without data binding?
I am not the first to have questions. I came across this documentation: an article promisingly entitled How to: Add, Modify, and Delete Objects which tells you nothing of value. Spot how many found it helpful:
You should probably start here instead. Still, be aware that EF is by no means straightforward. Instead of having to know SQL and the basics of ADO.NET commands and DataReaders, you now have to know EF, and I am not sure it is any less intricate. You also need to be comfortable with data binding and LINQ (Language Integrated Query) to make sense of it all, though I will add that strong data binding support is one reason whey EF is a good fit for ASP.NET MVC.
Should you use Entity Framework? It remains, as far as I can tell, the strategic direction for data access on Microsoft’s platform, and once you have worked out the basics you should be able to put together simple database applications more quickly and more naturally than with manually coded SQL.
I am not sure it makes sense for heavy-duty data access, since it is harder to fine-tune performance and if you hit subtle bugs, you may end up in the depths of EF rather than debugging your own code.
I would be interested in hearing from other developers. Do you love EF, avoid it, or is it just about OK?
EF was aweful until EF 5 on .NET 4.5
As of EF 6 it’s great. The only thing missing is table valued functions that can be used in where clauses (You can use them in selects, but then why would you bother and not just use a stored procedure?)
Once you get the hang of how it all works, and define your relationships explicitly using fluent or attributes instead of relying on convention, it’s great to use, and honestly strongly typed is the only way to build a large scale app that doesn’t end up with bugs that you find out only when the user reports it to you. (yes, I know, unit testing, but your unit tests will always miss things)
The trick is to skip repository pattern. It’s not useful when you consider Web Api and how it’s constructed.
And honestly, I can’t fathom why anyone would be creating new mvc.net applications and not going to Web API for data, and Durandal + Knockout or Angular with Typescript and doing everything client side and just using the server side for routing and some basic security functions.
I use NHibernate, and haven’t moved onto Entity Framework mostly out of laziness, I think; it works fine for what I need it to do, and I’m used to it. Based on what I’ve read, if I actually moved to EF, it would be about the same.
In my workflow, a few copy-and-pasted XML files, and all of the CRUD stuff is taken care of for me, which saves a lot of time, and what and how it is persisted is clearly defined.
I do get frustrated with getting data back from NHibernate, though — I’ll use the QueryOver API or HQL for relatively simple queries, but if there’s a query that becomes even moderately complex, I sidestep the ORM and objects entirely and create a “Query” class, which executes SQL code with ADO.NET DataReaders and returns the results as a specific DTO. I ported a bit of MyBatis (http://mybatis.github.io/mybatis-3/dynamic-sql.html) to C# to help with this last part, so that I can declaratively assemble the SQL statements based on the parameters in the query class. (Imagine a screen where different filters can be turned on, and the Query class helps with the messy inserting and deleting extra ANDs and ORs and parentheses in the actual SQL statement as is appropriate.)
As for schema changes, that’s not fancy, either — I have a console application with a bunch of SQL scripts embedded as resources, numbered things like “128-AddAgeColumnToCheesesTable.sql”, and it just loops through them all. The scripts are coded so that they insert a record into a DatabasePatches table when they’re done, and they don’t run themselves if the corresponding patches record already exists. That way I can explicitly control when a schema update is going to happen. And while there are frameworks that can sort of express this in code, I find it easier just to write the SQL and be done with it.
I’m not sure if this is the best way to do things, but it’s the way I settled into, and I’m very productive with it.
We have used Dapper.NET (from one of the StackOverflow guys) very successfully. I am not that great a fan of ORM frameworks like NHibernate or EF because they are very very “heavy”. Dapper adds some ORM bits bit is almost as lightweight is straight DataReaders and DataWriters. It is well worth checking out.
https://github.com/StackExchange/dapper-dot-net
I’ve been bitten by ORMs too many times. For me the main problem is that you create a maintenance nightmare down the road. For example, let’s say you’ve used EF2 for a large application. Now you bring in a new person to work on that code. Which means that the new guy needs to learn a huge old framework that doesn’t do him any good in his future career. How motivating is that? I hate boilerplate code as much as anyone. But to rely on magic that changes every other year is dangerous for a longlived product.
Putting aside the question of EF’s technical desirability, it appears to be doing well today as a requested job-skill.
Job trends graph from Indeed.com: http://goo.gl/R8rrdL
Vic
Hi, here’s a Brazilian developer. Since I started coding for Windows using Delphi, first using BDE then DBX, with a short incursion into ECO – my first and very frustrating ORM experience, I’m kind of distrustful in this kind of approach. But since the very begining in .Net/C# It’s clear that’s a nightmare to control all the data access classes without a tool. So we made one that read our DDL SQL’s and generate classes, access methods, deal with collections, control memory allocation…
Provided that it works based on models, it gave us flexibility to generate classes in objective-c when we started to code for iOS. Of course it’s another tool to maintain but it’s worth it.
For my projects, I ran into CodeFluent Entities. This a great alternative to Entity Framework. It generates a lot of stuff (Code, database scripts, UI components) based on a model. This Model-Driven tool is integrated into Visual Studio just like Entity Framework. You should try it.
Good discussion here…
I’m not a big fan of EF and ORM’s in general for a number of the caveats stated above. Perhaps one of the biggest gripes I have is performance tuning to for optimizing queries. There’s probably a way to do this with something like views and mapping on to that, and suing programmability built into the database through store procedures and the like, but then there is the question of coupling between the database and the code…
EF also has burned me in the passed, so I’m hesitant to use it know. I’ve heard it is much more mature, but still, I haven’t soon forgot.
Anyways, there are lots of great alternatives out there… Maybe its time to check these out.
My biggest problem with Entity Framework is that it tries to do too much. It’s a great idea to have simple classes that can handle CRUD, but when you start using LINQ, forget about performance. It’s just terrible. Nothing can automatically generate complex queries that perform as well as a handwritten query. Inserting and updating large volumes of data is also VERY slow. I had a process that would take 2.5 hours to run. It would take about 10 minutes to read in the data it needed to do a conversion on and then the other 2 hours and 20 minutes to update the database. Just by putting all of those updates into a single SQL statement, I was able to reduce the write part to less than 30 seconds.
It also handles inheritance poorly. If you use Code First and have a Parent and Child class that gets written to the DB, it will create a Parent table that also includes the columns from the Child class and just add a column called Discriminator that identifies what the row of data represents.
EF seems like it was created to show off the kind of wizardry you can pull of using reflection rather than something that can be used to develop well-performing, easy to read, and easy to understand applications.
I’m not saying that it shouldn’t be used. I’m just saying that you should use it sparingly. When you’re just reading and writing a handful of records, use it. It’s convenient. Migrations are a cool feature once you get the hang of it. When doing large, complex queries or updating large volumes of data, don’t use it. The Database property of a DbContext does have some great methods such as SqlQuery and ExecuteNotQuery, that allow you to use a hybrid approach that will save you in the long run.
Hi Jim
Yes, I came to similar conclusions. Fine for simple applications and small databases. Not so good at scale or for more intricate database schemas.
Tim