Friday, 19 December 2008

Astoria and LINQ-to-SQL; batching and replacing

So far, we have been doing individual updates - each change is sent to the server as a separate request. We can do better; if the server supports it (and ADO.NET Data Services does), we can use batch-mode to send multiple requests at once, and to commit all the changes in a single transaction at the server. Note that this is far preferable to using a distributed transaction to span multiple requests - an option which, fortunately, ADO.NET Data Services doesn't support: it simply doesn't scale, and besides, there is no way to convey that intent over a simple REST request, which follows the YAGNI approach of keeping things simple.

Using Batches

The decision to use batches is taken at the client - for example, a non-batched update might be written:

var emp = ctx.Employees.Where(x => x.EmployeeID == empId).Single();
var boss = ctx.Employees.Where(x => x.EmployeeID == bossId).Single();

emp.BirthDate = boss.BirthDate = DateTime.Now;
ctx.UpdateObject(emp);
ctx.UpdateObject(boss);
ctx.SaveChanges();

Saving this performs two separate update requests, each involving its own server-side data-context / SubmitChanges - i.e. adding some logging shows:

GET: http://localhost:28601/Restful.svc/Employees(1)
GET: http://localhost:28601/Restful.svc/Employees(2)
MERGE: http://localhost:28601/Restful.svc/Employees(2)
3: D 0, I 0, U 1 ## the 3rd data-context usage, doing a single update
MERGE: http://localhost:28601/Restful.svc/Employees(1)
4: D 0, I 0, U 1 ## the 4th data-context usage, doing a single update

Changing the last line at the client - passing the batch option to SaveChanges - makes a big difference:

ctx.SaveChanges(SaveChangesOptions.Batch); // rather than plain SaveChanges()

with the trace output:

GET: http://localhost:28601/Restful.svc/Employees(1)
GET: http://localhost:28601/Restful.svc/Employees(2)
POST: http://localhost:28601/Restful.svc/$batch ## note different endpoint and POST not MERGE
3: D 0, I 0, U 2 ## 3rd data-context usage, doing both updates

This usage can make the API far less chatty. However, it raises the issue of what to do if the second update fails (concurrency, blocking, etc). The answer lies in one of our few remaining IUpdatable methods. In batch mode (but not in single-update mode), errors in SaveChanges normally cause the ClearChanges method to be invoked, giving us a chance to cancel our changes. As it happens, LINQ-to-SQL uses transactions by default anyway, so our database should still be in a good state. Entity Framework, at this point, detaches all the objects in the change-set, but LINQ-to-SQL doesn't offer this option; as a substitute, we can take this opportunity to prevent any further usage of the data-context in an unexpected state:

public static void ClearChanges(DataContext context)
{ // prevent any further usage
    context.Dispose();
}

In other scenarios, this would be an opportunity to undo any changes.

Replacing Records

ADO.NET Data Services also provides another option for submits - replace-on-update. Normally the data uploaded from the client is merged with the existing row, whereas with replace-on-update the uploaded row effectively *replaces* the existing one. Importantly, any values not sent in the upload are reset to their default values (which might make a difference if the server knows about more columns than the client). At the REST level, the http verb is "PUT" instead of the "MERGE" or "POST" we saw earlier.

At the client, the only change here is our argument to SaveChanges:

ctx.SaveChanges(SaveChangesOptions.ReplaceOnUpdate);

Note that this mode is not compatible with batching; a request is made per-record. At the server, this maps to the final IUpdatable method - ResetResource. This method is responsible for clearing all data fields except the identity / primary key / etc (since we still want to write over the same database object). Since LINQ-to-SQL supports multiple mapping schemes (file-based vs attribute-based) we need to ask the MetaType for help here; we'll create a new vanilla object, and use its default property values to reset the actual object:

public static object ResetResource(DataContext context, object resource)
{
    Type type = resource.GetType();
    object vanilla = Activator.CreateInstance(type);
    MetaType metaType = context.Mapping.GetMetaType(type);
    var identity = metaType.IdentityMembers;
    foreach (MetaDataMember member in metaType.DataMembers)
    {
        if (member.IsPrimaryKey || member.IsVersion || member.IsDeferred
            || member.IsDbGenerated || !member.IsPersistent || identity.Contains(member))
        { // exclusions
            continue;
        }
        MetaAccessor accessor = member.MemberAccessor;
        accessor.SetBoxedValue(ref resource, accessor.GetBoxedValue(vanilla));
    }
    return resource;
}


Breathe a sigh; we have now completely implemented IUpdatable, completing all of the raw CRUD operations required to underpin the framework - and all without too much inconvenience. Hopefully it is fairly clear how this could be applied to other implementations. I promised to return to inheritance when creating records (and I will), and I'll also explore some of the other features of ADO.NET Data Services.

Saturday, 13 December 2008

Astoria and LINQ-to-SQL; associations

We can now manipulate flat data - but we also need to model associations between data (read: foreign keys in RDBMS terms).

Let's start by modifying our client to pick two employees: we'll add one of the records as a subordinate of the other, verify, and then remove the link. Note that the ADO.NET Data Services client code doesn't do much to automate this - as with "UpdateObject", we need to keep nudging it to mirror the changes we make. We also need to manually load child collections.

Implementing Collection (/Child) Associations

Here's our new client test code:

ctx.LoadProperty(boss, "Employees");

ctx.AddLink(boss, "Employees", grunt);
ctx.SaveChanges();

ctx.DeleteLink(boss, "Employees", grunt);
ctx.SaveChanges();

Here, a CountSubordinates helper (not shown) creates a separate query context to validate the data independently. When we execute this, we immediately get errors in the data-service about "AddReferenceToCollection", then "RemoveReferenceFromCollection". To implement these methods, note that collection associations in LINQ-to-SQL are EntitySet instances, which implement IList:

public static void AddReferenceToCollection(
    DataContext context, object targetResource,
    string propertyName, object resourceToBeAdded)
{
    IList list = (IList) GetValue(context,
        targetResource, propertyName);
    list.Add(resourceToBeAdded);
}

public static void RemoveReferenceFromCollection(
    DataContext context, object targetResource,
    string propertyName, object resourceToBeRemoved)
{
    IList list = (IList) GetValue(context,
        targetResource, propertyName);
    list.Remove(resourceToBeRemoved);
}

Here we are using our existing "GetValue" method with a simple cast, then just adding (or removing) the item.

Implementing Single (/Parent) Associations

Individual associations are handled differently to collection associations; this time, we'll set the manager of the subordinate directly on the subordinate:


grunt.Manager = boss;
ctx.SetLink(grunt, "Manager", grunt.Manager);
ctx.SaveChanges();

grunt.Manager = null;
ctx.SetLink(grunt, "Manager", grunt.Manager);
ctx.SaveChanges();


The "SetLink" method reminds the data-context that we care... This time, the data-service breaks on the "SetReference" method. Fortunately, since we are treating resources and instances as interchangeable, this is trivial to implement with our existing "SetValue" method:

public static void SetReference(
    DataContext context, object targetResource,
    string propertyName, object propertyValue)
{
    SetValue(context, targetResource,
        propertyName, propertyValue);
}

With this in place, our Manager is updated correctly, which can be validated in the database.

Summary - and Why oh Why...

So; we've got associations working in either direction. But the "SetLink" etc nags at me. It seems very unusual to have to do this so manually. And annoyingly, the standard client-side code:

  • doesn't implement INotifyPropertyChanged - which means the objects don't work well in an "observer" setup
  • doesn't provide an OnPropertyChanged (or similar) partial method - which means we can't conveniently add the missing functionality

If either of these were done, I'm pretty certain we could hook some things together so that you (as the consumer) don't need to do all this work, perhaps using events to talk to the context (rather than referencing the context directly from the object, which has some POCO issues). There are lots of On{Foo}Changed partial methods, but you'd have to do them all manually.
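For example (purely hypothetical - the Changed event below is not part of the generated code), with such a hook we could wire the "nudging" up once instead of sprinkling UpdateObject calls everywhere:

```csharp
// hypothetical sketch: implementing the generated On{Foo}Changed partial
// methods to funnel every property change into one event - possible today,
// but you must write one stub per property, which is the pain-point
partial class Employee
{
    public event EventHandler Changed; // illustrative, not generated

    partial void OnFirstNameChanged() { OnChanged(); }
    partial void OnLastNameChanged() { OnChanged(); }
    // ... and so on, manually, for every property

    private void OnChanged()
    {
        var handler = Changed;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}
// wire-up, once: emp.Changed += (s, e) => ctx.UpdateObject(s);
```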

Maybe PostSharp would help here. Or maybe ADO.NET Data Services should include these features; I might look at this later...

Friday, 12 December 2008

Astoria and LINQ-to-SQL; modifying data

Our ADO.NET Data Service can now act to create records... but how about changing existing data? Starting simple, we'll now look at removing records from the database. We'll update the test client:

// create it
Employee newEmp = new Employee {
    FirstName = "Fred",
    LastName = "Jones"};
ctx.AddToEmployees(newEmp);
ctx.SaveChanges();

// delete it
ctx.DeleteObject(newEmp);
ctx.SaveChanges();

Implementing Delete

This time, our DataService<NorthwindDataContext> code (from the previous article) explodes first on the mysterious "GetResource" method. This curious beast is used to resolve a resource from a query. Fortunately, since we are treating "resource" and "instance" as interchangeable, we can simply offload most of this work to the regular provider (LINQ-to-SQL in this case):

public static object GetResource(
    DataContext context, IQueryable query,
    string fullTypeName)
{
    object execResult = query.Provider
        .Execute(query.Expression);
    return ((IEnumerable) execResult)
        .Cast<object>().Single();
}

Here, "Provider.Execute" uses LINQ-to-SQL to build the query; we can't easily predict what data we will get back, but we ultimately expect it to be an enumerable set, with exactly one item - so LINQ-to-Objects can do the rest for us via Cast<T>() and Single().

With this implemented, we now get the largely anticipated error on "DeleteResource"; again this is fairly simple to implement:

public static void DeleteResource(
    DataContext context, object targetResource)
{
    ITable table = context.GetTable(
        targetResource.GetType());
    table.DeleteOnSubmit(targetResource);
}

And sure enough, it works; we can verify that the data is removed from the database.

When abstraction attacks

As a minor observation (yet more LOLA), I also notice that attempting (at the client) to query by primary key for data that has been removed doesn't behave quite as we might expect:

var ctx = new NorthwindDataContext(BaseUri);
Employee emp = ctx.Employees
    .Where(x => x.EmployeeID == id)
    .Single();
With an invalid id, this breaks with "Resource not found for the segment 'Employees'."; this applies for any combination of First/Single, with/without OrDefault, and with/without AsEnumerable() - basically, ADO.NET Data Services sees the primary key, and asserts that it must exist. Something else to remember!

Implementing Update

Interestingly, the combination of "GetResource" (which we implemented for delete) and "SetValue" (which we implemented for create) provides everything we need for simple updates. And sure enough, we can test this from the client. Note that the ADO.NET Data Services client requires us to be explicit about which records we want to update (there is no automatic change tracking):

newEmp.BirthDate = new DateTime(1980,1,1);
ctx.UpdateObject(newEmp); // mark for send
ctx.SaveChanges();


We're getting there; we now have our 4 fundamental CRUD operations. Next, we'll look at associations between data. We are whittling through the operations, though - the following remain (not implemented):

  • Association 
    • AddReferenceToCollection
    • RemoveReferenceFromCollection
    • SetReference
  • Batch (rollback) 
    • ClearChanges
    • ResetResource

Astoria and LINQ-to-SQL; creating data

Previously, we looked at how to get LINQ-to-SQL talking to ADO.NET Data Services to query data - but how about editing data? This again just works for Entity Framework, via the metadata model. So what about LINQ-to-SQL? Let's give it a whirl... we'll start with some simple code at the client to create a new Employee:

Employee newEmp = new Employee {
    FirstName = "Fred",
    LastName = "Jones"};
ctx.AddToEmployees(newEmp);
ctx.SaveChanges();

Give that a go, and boom! "The data source must implement IUpdatable to support updates.". IUpdatable is the ADO.NET Data Services interface that allows a REST service to talk back to the model... and it isn't trivial:
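The interface lives in System.Data.Services; its shape is reproduced here from the members we work through over this series, so treat the exact signatures as approximate:

```csharp
public interface IUpdatable
{
    object CreateResource(string containerName, string fullTypeName);
    object GetResource(IQueryable query, string fullTypeName);
    object ResetResource(object resource);
    void SetValue(object targetResource, string propertyName, object propertyValue);
    object GetValue(object targetResource, string propertyName);
    void SetReference(object targetResource, string propertyName, object propertyValue);
    void AddReferenceToCollection(object targetResource, string propertyName, object resourceToBeAdded);
    void RemoveReferenceFromCollection(object targetResource, string propertyName, object resourceToBeRemoved);
    void DeleteResource(object targetResource);
    void SaveChanges();
    object ResolveResource(object resource);
    void ClearChanges();
}
```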


Hmm... lots to do. We should also note that the IUpdatable model uses two types of object:

  1. resources - an opaque, implementation-defined notion of an instance (anything you want that allows you to map to an instance)
  2. instances - actual entity objects (such as Customer)

LINQ-to-SQL is already highly object oriented (and to do anything useful with the instance we'll need the actual object from the DataContext), so for simplicity I'll simply treat these as one-and-the-same; our opaque resource will be the instance itself.

Defining our own Data Context

This is one of those cases where the simplest approach is to use inheritance - i.e. create a WebDataContext class (that uses explicit implementation to satisfy the interface) from which we can inherit our own DataContext. Note that since people may already have their own inheritance trees, I'll keep the implementation code separate, allowing us to use the code even if we haven't subclassed. Here's a very small part of it:
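A minimal sketch of the shape (the exact listing doesn't matter; only the forwarding pattern does):

```csharp
// sketch: the base-class satisfies IUpdatable via explicit implementation,
// forwarding each member to a static helper class (WebUpdatable) so the
// same code can be used even without subclassing
public abstract class WebDataContext : DataContext, IUpdatable
{
    protected WebDataContext(string fileOrServerOrConnection)
        : base(fileOrServerOrConnection) { }

    object IUpdatable.CreateResource(string containerName, string fullTypeName)
    {
        return WebUpdatable.CreateResource(this, containerName, fullTypeName);
    }
    // ... the remaining IUpdatable members forward in exactly the same way
}
```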


Luckily, LINQ-to-SQL allows you to control the base-class of the data-context, so change the dbml to point at our new WebDataContext (using the namespace-qualified name), and it will update the generated code accordingly. In the designer this is the "Base Class" when you don't have any table selected. In the xml, it is the "/Database/@BaseType" property.

Remembering that we haven't actually coded any of it yet, fire it up. This time, we get an error at the client: "The method or operation is not implemented.", which we expected; the debugger conveniently breaks at the offending methods, letting us locate them.

Implementing Create

The first thing that breaks is CreateResource; we need to implement this to create the new object to use. So we'll add a static method to our WebUpdatable class, and point our base-class at it. CreateResource supplies a "containerName" - this is essentially a property name for the data we are adding. Creating an object of a known type is easy enough - but what type? We could use the provided "fullTypeName" (and I'll come back to this later), but a simpler approach is to look at the container itself: if the container is an IQueryable<T>, we probably want to create a T. I say probably because LINQ-to-SQL supports discriminator-based inheritance, but we'll park that issue for now...

For re-use, it is useful to write a method that uses reflection to identify the type from IQueryable<T>:

static Type GetContainedType(
    DataContext context, string containerName)
{
    PropertyDescriptor property = TypeDescriptor
        .GetProperties(context)[containerName];
    if(property == null) throw new ArgumentException(
        "Invalid container: " + containerName);

    return (
        from i in property.PropertyType.GetInterfaces()
        where i.IsGenericType
            && i.GetGenericTypeDefinition() ==
                typeof (IQueryable<>)
        select i.GetGenericArguments()[0]
        ).Single();
}
With this in place, we can now create objects, and (importantly) add them to the table; fortunately, DataContext allows you to work with tables by type:

public static object CreateResource(
    DataContext context, string containerName,
    string fullTypeName)
{
    // TODO: anything involving inheritance
    Type type = GetContainedType(context, containerName);
    ITable table = context.GetTable(type);
    object obj = Activator.CreateInstance(type);
    table.InsertOnSubmit(obj);
    return obj;
}

Then all we need to do is hook that into our WebDataContext base-class:

object IUpdatable.CreateResource(
    string containerName, string fullTypeName)
{
    return WebUpdatable.CreateResource(
        this, containerName, fullTypeName);
}

Note that all of the interface implementations in WebDataContext will be shallow methods that forward to the static methods in WebUpdatable, so I won't repeat them.

Run that, and this time it breaks at SetValue; ADO.NET Data Services wants us to update the object-model with the values it supplies. Fortunately, this is very simple via reflection (or TypeDescriptor), so I'll show the implementation and the interface implementation together (and we'll cover GetValue at the same time):

public static void SetValue(
    DataContext context, object targetResource,
    string propertyName, object propertyValue)
{
    TypeDescriptor.GetProperties(targetResource)
        [propertyName].SetValue(targetResource, propertyValue);
}

public static object GetValue(
    DataContext context, object targetResource,
    string propertyName)
{
    return TypeDescriptor.GetProperties(targetResource)
        [propertyName].GetValue(targetResource);
}

The next time we try the service, it breaks at SaveChanges; again, this is simple to implement with LINQ-to-SQL:

public static void SaveChanges(
    DataContext context)
{
    context.SubmitChanges();
}

After this, ResolveResource breaks; for data inserts, this can be implemented trivially:

public static object ResolveResource(
    DataContext context, object resource)
{
    return resource;
}


Finally, we can validate (at the client) that our data is saved:


And hoorah! It did indeed save; we can look in the database to see the new records.


We've now truly dipped our toe in the ADO.NET Data Services pool; we can create data! Always useful. Next, we'll look at modifying the records we have just added.

Brute force (but lazily)

Seemingly an oxymoron, but not in the world of code...

As part of my current mini-project, I recently needed to write some code to use brute force to solve a problem (i.e. test every combination to see what works). There might be 0, 1, or many answers to the problem, and the caller might:

  • just care that there is a solution, but not what it is
  • want any solution
  • want every solution
  • want the only solution
  • etc

Now, first off - brute force is a blunt approach; you need to understand your data and your problem to know whether such an approach will ever work, and what the performance will be. In this case, the candidates were in the crunchable range; it might take a bit of CPU, but it isn't going to be terrible, and we aren't doing it in a tight loop.

I was very happy with my approach to the problem, so I thought I'd share... in short: iterator blocks.

Basically, rather than write a method of the form:

// NOT actual C#!!!
Answer[] SolveProblem(question)
List<Answer> answers = new List()
for(lots of crunching, looping, etc)
if(answer is good) answers.Add(answer);
return answers.ToArray();

(or anything more complex involving the number of answers needed, etc), I wrote it as:

// NOT actual C#!!!
IEnumerable<Answer> SolveProblem(question)
for(lots of crunching, looping, etc)
if(answer is good) yield return answer;

This gives us lazy (on-demand) processing for free; and the caller can use regular LINQ extension methods to handle the results in the appropriate way; so taking the previous bullets in order (where "qry" is our IEnumerable<Answer>):

  • qry.Any()
  • qry.First()
  • qry.ToArray()
  • qry.Single()
  • everything else you can think of... Skip(), Take(), Where(), FirstOrDefault(), etc

And job done! Every query will read just as much data as it wants; no more, no less. So First() will stop crunching once the first answer is found, whereas ToArray() will run the process to exhaustion. Very tidy.
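To make that concrete with actual C# (a trivial stand-in problem, not the one from the project):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class BruteForceDemo
{
    // trivial stand-in "problem": brute-force all pairs (a, b) from the
    // candidates where a + b == target; yield return keeps it lazy
    static IEnumerable<KeyValuePair<int, int>> SolveProblem(int[] candidates, int target)
    {
        for (int i = 0; i < candidates.Length; i++)
            for (int j = i + 1; j < candidates.Length; j++)
                if (candidates[i] + candidates[j] == target)
                    yield return new KeyValuePair<int, int>(candidates[i], candidates[j]);
    }

    static void Main()
    {
        int[] data = { 1, 2, 3, 4, 5 };
        var qry = SolveProblem(data, 6);

        Console.WriteLine(qry.Any());  // True - stops crunching at the first hit
        var first = qry.First();       // likewise, minimal work
        var all = qry.ToArray();       // runs the search to exhaustion
        Console.WriteLine(all.Length); // 2 - the pairs (1,5) and (2,4)
    }
}
```

Any() and First() abandon the nested loops as soon as the iterator yields; ToArray() drains it completely.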

As a footnote, I should note that my particular problem was recursive (but not massively so); I solved this by simply injecting an extra loop:

foreach(Answer subAnswer
    in SolveProblem(different question))
{
    yield return subAnswer;
}

Wednesday, 10 December 2008

Astoria and LINQ-to-SQL; getting started

ADO.NET Data Services (previously known as "Astoria") is one of the new .NET 3.5 SP1 tools for offering CRUD data access over the web.

In this article, I'll discuss using ADO.NET Data Services, looking first at Entity Framework, but (since I'm a firm believer in parity across implementations) more specifically, looking at LINQ-to-SQL.

Although ADO.NET Data Services uses WCF, it is at heart a RESTful stack (not a SOAP operation stack), that offers:

  • Simple (REST/CRUD) platform-independent access to data, including filtering etc, but limited to homogeneous data and full object representations [no custom projections at the point of query]
  • A .NET client and object model, including change-tracking support, making it easy to make complex changes
  • Support for LINQ at the client, which is translated into requests that the REST layer understands

A very powerful tool!

Exposing data via Entity Framework

So: how can we use it? Unsurprisingly, it plays very nicely with Entity Framework: you simply create a data-service endpoint (using the "ADO.NET Data Service" template in VS 2008), and tell it about your object-context:

public class EntityService : DataService<NorthwindEntities> // your object-context here...
{
    public static void InitializeService(IDataServiceConfiguration config)
    {
        // specify permissions for entity-sets and operations
        config.SetEntitySetAccessRule("*", EntitySetRights.All); // debug only; don't do this!
    }
}

And that is it! Navigate to the .svc in a browser, and you should see the REST/ATOM-based response:

<service [snip]>
<collection href="Categories">
<collection href="CustomerDemographics">

This shows the navigable entity-sets - so adding /Categories to our request would return the categories, etc. So: that's Entity Framework sorted... how about LINQ-to-SQL?

Exposing data via LINQ-to-SQL

You might be slightly surprised that things aren't nearly so tidy. It transpires that ADO.NET Data Services will either respect a metadata-model (a la Entity Framework), or otherwise fall back to a per-implementation convention-based model. And since LINQ-to-SQL pre-dates ADO.NET Data Services, it doesn't qualify (out of the box) for either. We need to wire a few things together... we'll look at exposing data in this article, and then consider updating data as a separate piece of work.

So - let's be optimistic, and change NorthwindEntities (our Entity Framework object-context) to NorthwindDataContext (our LINQ-to-SQL data-context). Boom! We get a big fat "Request Error": "The server encountered an error processing the request. See server logs for more details.":


So the first thing we need is some better debugging... as a debugging aid (i.e. not on a production server), we can get a lot more information by tweaking our DataService:

[ServiceBehavior(IncludeExceptionDetailInFaults = true)] // ### show error details in the response
public class EntityService : DataService<NorthwindDataContext> // your data-context here...
{
    public static void InitializeService(IDataServiceConfiguration config)
    {
        config.UseVerboseErrors = true; // ### and tell us as much as you can
        // specify permissions for entity-sets and operations
        config.SetEntitySetAccessRule("*", EntitySetRights.All); // debug only; don't do this!
    }
}

We've enabled exception-details in the response, and enabled verbose errors. It still doesn't work, but at least we get a clue why: "On data context type 'NorthwindDataContext', there is a top IQueryable property 'CustomerDemographics' whose element type is not an entity type. Make sure that the IQueryable property is of entity type or specify the IgnoreProperties attribute on the data context type to ignore this property.":


Hmm... we don't want to ignore the property (or we can't get at the data), and it isn't an EntityObject in the Entity Framework sense, so what do we do?

It turns out that what this message actually means here is "I can't find the primary key"; outside of metadata-models, it supports a few ways of doing this:

  • via the new [DataServiceKey] attribute
  • by looking for a "FooID" property, where the type is called "Foo"
  • by looking for an "ID" property

As it happens, CustomerDemographic uses CustomerTypeID for the primary key, so it is no surprise that it fails. Fortunately, the designers seem to have considered the (likely) scenario of generated code, and so we can specify [DataServiceKey] at the type-level in a partial class - which is just as well, since you can't use a partial class to add an attribute to a member (such as a property) defined in a separate portion of a partial class. So we add (in our data-context namespace):

[DataServiceKey("CustomerTypeID")]
partial class CustomerDemographic {}

And likewise for any other types that it isn't happy with. In general, the "FooID" pattern is so ubiquitous that it catches most use-cases. One minor gripe here is that it is case-sensitive: "FooId" won't work without a [DataServiceKey]. Something else to remember for that monthly "ID vs Id" squabble...

Interestingly, note that we haven't done anything LINQ-to-SQL specific yet; everything is just hanging off the exposed IQueryable<T> properties on our data-context, so everything here would also work with LINQ-to-Objects.

Consuming the Data

Our service now shows in a browser without a big error - so we can look at creating a client. We can do this in VS2008 simply using the "Add Service Reference..." dialog; it recognises REST/ATOM endpoints, and behaves accordingly:


Alternatively, you can use "datasvcutil" at the command-line, but (unusually, compared to wsdl.exe and svcutil.exe) you get a very limited set of command-line options here. After generating the client, you can use it very similarly to a LINQ-to-SQL data-context:

static void Main()
{
    var ctx = new NorthwindDataContext(BaseUri); // note not IDisposable
    // enable crude logging
    ctx.SendingRequest += (sender, e) => Debug.WriteLine(
        e.Request.Method + ": " + e.Request.RequestUri);

    ShowEmployees(ctx.Employees.Take(20));
    ShowEmployees(from emp in ctx.Employees
                  where emp.FirstName.StartsWith("M")
                  select emp);
    ShowEmployees(from emp in ctx.Employees
                  orderby emp.FirstName descending
                  select emp);
    Employee tmp = ctx.Employees.Where(x => x.EmployeeID == 9).Single();
}
static void ShowEmployees(IQueryable<Employee> query)
{
    foreach (var emp in query)
    {
        Console.WriteLine("{0}: {1}, {2}",
            emp.EmployeeID, emp.LastName, emp.FirstName);
    }
}

Which displays the correct employees, and also shows the HTTP request uris:

GET: http://localhost:28601/Restful.svc/Employees()?$top=20
GET: http://localhost:28601/Restful.svc/Employees()?$filter=startswith(FirstName,'M')
GET: http://localhost:28601/Restful.svc/Employees()?$orderby=FirstName desc
GET: http://localhost:28601/Restful.svc/Employees(9)

These queries give a lot of insight into how it all hangs together. Some points:

  • Count() is not supported, which is a shame as it is so universally useful.
  • Single(predicate) is not supported - you must use Where(predicate).Single(); this is doubly LOLA, since it contrasts with a bug in LINQ-to-SQL, where the identity-manager can only short-circuit primary-key lookups (avoid hitting the database) if you use .Single(predicate). Oh well; in Entity Framework it doesn't work *at all*, so perhaps I should be grateful.
  • We are limited to homogeneous results - i.e. we can't do complex projections over the wire. We can, of course, do simple queries, then use LINQ-to-Objects (with .AsEnumerable() if necessary) to do the complex bits at the client - just remember that all the entity properties are coming back, even if you ignore most of them.


Exposing LINQ-to-SQL isn't particularly hard; with a few simple tweaks, it generally works. The client tools provide a rich (but not quite complete) set of query tools for consuming ADO.NET Data Services.

Next, we'll see how deep the hole goes, by trying to submit changes to our LINQ-to-SQL ADO.NET Data Service.

Entity Framework in reality

I'm currently working on a mini-project with Microsoft (more news on this next year) - and the project team are keen for this project to be an exemplar of recommended MS technologies. So (where appropriate), the system makes use of Entity Framework (EF) and ADO.NET Data Services - two technologies that I know a bit about, but have only previously toyed with (although I've done a lot with their cousins, LINQ-to-SQL and WCF).

Now, I've commented a few times in the past about the short-comings of EF, but this project has been a good opportunity to do something non-trivial start-to-end with it. And while I'm still not an EF evangelist, I'm now not quite so hostile towards it.

The truth is: if I hadn't seen LINQ-to-SQL, I'd probably be delighted with EF as a LINQ back-end. Forgetting about architectural purity concerns (POCO, etc), there are still a number of annoying glitches with EF, including:

LOLA - things that work with most other IQueryable<T> implementations, but not EF:

  • No support for Single()
  • No support for expression-composition (Expression.Invoke)
  • Skip()/Take() demand an ordered sequence (IOrderedQueryable<T>)

General annoyances:

  • Limited support for composable UDF usage
  • A few niggles with the detach mechanism
  • Some lazy loading issues
  • Lack of "delete all" - a pain for reset scripts, but probably not very helpful in production code ;-p

On the plus side, the ability to remove trivial many-to-many link tables from the conceptual model makes the programming experience much nicer - but this is the only thing I've used in EF that I couldn't have done just as easily with LINQ-to-SQL.

But: it gets the job done. If anything, the exercise has gone a long way to making me feel warm and cosy about the future of EF: if the team can take the flexibility of the EF model, but squeeze in the bits that LINQ-to-SQL gets so right, I'll be a happy coder.

(In my next post, I hope to discuss my experiences with ADO.NET Data Services)

Wednesday, 26 November 2008

Dynamic Operators

One of the interesting proposed features mentioned in the C# 4.0 CTP documents is operator support on "dynamic". I've done a lot of looking at generic operators in the past, but this potentially allows for a new answer for operators with generics - something of the type:

public static T Add<T>(T x, T y)
{
    dynamic xDynamic = x, yDynamic = y;
    return xDynamic + yDynamic;
}
Unfortunately, the support for operators on dynamic didn't make it into the current CTP bits, so we can't try this yet. But certainly one to watch in the next CTP.

However, looking at the complexity involved behind the scenes with dynamic, I suspect that this isn't going to be nearly as efficient as some of the things you can already do in .NET 3.5 by pre-compiling an Expression to a Func<T,T,T>, like here. A heavily-cut down version of this can be written as follows:

static class Operator<T>
{
    private static readonly Func<T, T, T> add;
    public static T Add(T x, T y) { return add(x, y); }
    static Operator()
    {
        var x = Expression.Parameter(typeof(T), "x");
        var y = Expression.Parameter(typeof(T), "y");
        add = Expression.Lambda<Func<T, T, T>>(
            Expression.Add(x, y), x, y).Compile();
    }
}

In terms of expression-trees, that is about as simple as they get - but very useful for providing things like operator support. Obviously the code in MiscUtil allows a lot more flexibility ;-p
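Usage of the cut-down class above is then simply (for any T with a suitable + operator):

```csharp
int i = Operator<int>.Add(3, 4);               // 7
decimal d = Operator<decimal>.Add(1.25M, 2M);  // 3.25M
```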

But! In many cases you aren't going to be doing heavy number crunching... so maybe an occasional dynamic call is acceptable if it reduces some external code. Once the CTP supports it, I will be sure to run a performance test to see where the split lies.

Monday, 24 November 2008

const decimal Tax = 0.175M;

If you're based in the UK, and work on bespoke corporate systems, you are probably having "fun" at the moment. If you are a consultant, you're probably rubbing your hands - but if (like me) you are a salaried employee, you are probably hastily running searches over every piece of code and customer-facing document looking for anything that looks like 17.5, 1.175, 0.175, etc.

Why? A number (essentially: sales tax) that has been constant since 1991 is changing from 17.5% to 15%... in one week's time (correction: 5 days).

Of course, in a perfect world, every system will get this number from a configuration file, database, etc - but I'm happy to bet that it has snuck into more places than we care to imagine. Time to start paying back that code debt, people...

don't(don't(use using))

I was at DDD day on Saturday, and I was a little surprised to hear a recommendation "don't use using". Now, to be fair: I'm very familiar with the context of this quote - specifically, a handful of classes that do not provide a clean (non-throwing) implementation of IDisposable. But I still don't agree with the advice...

For example, it is fairly well known that WCF makes a bit of a mess in this area... and a complicated mess too. The core ClientBase class throws on Dispose() under a range of fault conditions - with the side effect that you can very easily end up accidentally losing your original exception to a using block.

The recommended workaround here was to not use "using", but to wrap the code manually - something like:

Proxy proxy = new Proxy();
try {
    // use the proxy...
} finally {
    // lots of code based on state of proxy
}
Now, we could move all the "lots of code" to a static method, but it still makes us do more work than we would like. With WCF, the scenario is further complicated because of the way that proxies work in terms of code-generation - it would be very messy to have to do any subclassing, since you would be subclassing an unknown subclass of ClientBase.

So in this scenario, I would greatly favor encapsulation... and in particular, C# 3.0 extension methods rescue us hugely. Suppose we create a common base-class / interface to handle "something we want to dispose but which might explode on us":
public interface IDisposableWrapper<T> : IDisposable {
    T BaseObject { get; }
}
public class DisposableWrapper<T> : IDisposableWrapper<T> where T : class, IDisposable {
    public T BaseObject { get; private set; }
    public DisposableWrapper(T baseObject) { BaseObject = baseObject; }
    protected virtual void OnDispose() {
        BaseObject.Dispose();
    }
    public void Dispose() {
        if (BaseObject != null) {
            try {
                OnDispose();
            } catch { } // swallow...
        }
        BaseObject = null;
    }
}
Now we can create some extension methods to wrap that for us...

    // core "just dispose it without barfing"
public static IDisposableWrapper<T> Wrap<T>(this T baseObject)
    where T : class, IDisposable {
    if (baseObject is IDisposableWrapper<T>) return (IDisposableWrapper<T>)(object)baseObject;
    return new DisposableWrapper<T>(baseObject);
}
But more - because extension methods support overloading, we can introduce a WCF-specific extension overload to support clean disposal of WCF clients:

public class ClientWrapper<TProxy, TService> : DisposableWrapper<TProxy>
    where TProxy : ClientBase<TService>
    where TService : class {
    public ClientWrapper(TProxy proxy) : base(proxy) { }
    protected override void OnDispose() {
        // lots of code per state of BaseObject
        // specific handling for service-model
    }
}
public static IDisposableWrapper<TProxy> Wrap<TProxy, TService>(
    this TProxy proxy)
    where TProxy : ClientBase<TService>
    where TService : class {
    return new ClientWrapper<TProxy, TService>(proxy);
}
It might not be obvious, but the generic type-inference actually does all the heavy lifting here; we have a fully wrapped WCF client, regardless of the specific subclass of ClientBase - so we can use:

using (var client = new Proxy().Wrap()) {
    // talk to client.BaseObject here
}
Safe, robust, easy to call, and no code duplication. And extensible to other special cases too.

Thursday, 20 November 2008

CF woes

Grumble grumble.

I spent some time on the train this morning scratching my head wondering why some perfectly innocent code was failing in protobuf-net; a user had (quite reasonably) reported the issue, and I was investigating...

Now, I'm not usually a mobile developer, so this might be well known to CF developers - but it was news to me! Here's the setup (after taking out lots of stuff that doesn't matter) - see if you can spot the problem:

static void Foo<T>(T[] values) {
    List<Wrapper<T>> list = new List<Wrapper<T>>();
    foreach (T value in values) {
        list.Add(new Wrapper<T>(value));
    }
    list.Sort(delegate(Wrapper<T> x, Wrapper<T> y) {
        return x.SomeProp.CompareTo(y.SomeProp);
    }); // BOOM!!!
    // or even just list.Sort();
}

All it does is wrap the values in a wrapper, add them to a list, and sort them by a property of the wrapper (a simple, field-based property - nothing fancy). As it happens, the same thing happens even with just the regular Sort() [assuming you implement IComparable or the generic counterpart].

Looks innocent enough, yes? And it will work in every framework except CF 2.0 - so: can you predict the problem?

It turns out that the CF 2.0 security model is severely, erm.... "interesting" when combined with generics. Even though we don't attempt anything evil via reflection, the above code can throw a MethodAccessException. Well, it was news to me!

The problem is that in CF 2.0, it really, really wants T to be accessible to this method... which means that if Foo and T are in different assemblies, T must be public. Otherwise, even if we don't so much as look sideways at T, the code will fail. And if the type is nested in another type, the containing type must also be public. And so on. Does anybody else smell a fairly large runtime bug here? This pretty much defeats the whole point of "You give me a T (I don't care what), and I know what to do" - with CF 2.0 it is "You give me a T (but please, pretty please, do something that I can't enforce on a constraint), and I might maybe sometimes work without crashing".

The good news is that this is fixed in CF 3.5...

But there you go: if you want to work with generics and library dlls in CF 2.0, then you are probably going to have to make your classes public.

Tuesday, 18 November 2008

Dynamic Objects, part1

(caveat: all based on the October 2008 public CTP; for all I know, this will be fixed in RTM - but either way, it shows how to work with IDynamicObject)

C# 4.0 / .NET 4.0 introduce DLR concepts into the core framework and language. However, you may-or-may-not know that actually we've had *a* form of dynamic object right from 1.1, in the form of ICustomTypeDescriptor (and TypeDescriptionProvider in .NET 2.0). These are used to provide custom data-binding to .NET objects, and are how (for example) a DataView displays columns in a grid - each column in the DataTable pretends to be a property on the object (each row).

Since C# 4.0 doesn't directly allow you to write dynamic objects, I've been playing with Tobias Hertkorn's dynamic class here.

I was a little surprised to see that this type of custom data-binding doesn't automatically work for dynamic objects (IDynamicObject) - i.e. the following fails:

     dynamic duck = new Duck();
duck.Value = "abc";
Application.Run(new Form { DataBindings = {
{ "Text", duck, "Value" } } });

It can't find a property called "Value" on our duck. Vexing. So; what can we do about this? Well, we could try exposing our dynamic object via the standard data-binding API. Of course, we could extend the Duck class to support ICustomTypeDescriptor directly, but I thought it would be more fun (read: challenging) to look at the more general problem of talking to an unknown dynamic object ;-p

To get data-binding working (on objects, rather than types), we need to use ICustomTypeDescriptor:

  • IDynamicObject - the dynamic object, which will be wrapped with... 
    • ICustomTypeDescriptor - which provides a... 
      • PropertyDescriptorCollection - which (for each named property) can provide a... 
        • PropertyDescriptor - which exposes... 
          • GetValue
          • SetValue
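The plumbing in that chain might look something like the following heavily-simplified sketch (all names here are mine, not from the real code; a dictionary stands in for the dynamic object, where the real version routes GetValue/SetValue through DLR call-sites as shown below):

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// a CustomTypeDescriptor that exposes one PropertyDescriptor per member name
class SimpleDynamicDescriptor : CustomTypeDescriptor
{
    private readonly string[] names;
    public SimpleDynamicDescriptor(params string[] names) { this.names = names; }
    public override PropertyDescriptorCollection GetProperties()
    {
        var props = new PropertyDescriptor[names.Length];
        for (int i = 0; i < names.Length; i++)
            props[i] = new SimplePropertyDescriptor(names[i]);
        return new PropertyDescriptorCollection(props);
    }
}

class SimplePropertyDescriptor : PropertyDescriptor
{
    public SimplePropertyDescriptor(string name) : base(name, null) { }
    public override Type ComponentType { get { return typeof(IDictionary<string, object>); } }
    public override Type PropertyType { get { return typeof(object); } }
    public override bool IsReadOnly { get { return false; } }
    public override bool CanResetValue(object component) { return false; }
    public override void ResetValue(object component) { }
    public override bool ShouldSerializeValue(object component) { return false; }
    public override object GetValue(object component)
    {   // real version: a DLR call-site instead of a dictionary lookup
        return ((IDictionary<string, object>)component)[Name];
    }
    public override void SetValue(object component, object value)
    {
        ((IDictionary<string, object>)component)[Name] = value;
    }
}
```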

Most of the code involved here isn't very interesting (I will post it when I find somewhere suitable)... but it presents a few challenges. The main trick here is: given an IDynamicObject and a property name as a string, how do you get/set the value? ILDASM to the rescue, and it turns out to be something like the following (where "getter" and "setter" are used to cache the call-site):

    CallSite<Func<CallSite, object, object>> getter;
public override object GetValue(object component) {
    if (getter == null) {
        var binder = Microsoft.CSharp.RuntimeBinder.RuntimeBinder.GetInstance();
        var payload = new Microsoft.CSharp.RuntimeBinder.CSharpGetMemberPayload(binder, Name);
        getter = CallSite<Func<CallSite, object, object>>.Create(payload);
    }
    return getter.Target(getter, Unwrap(component));
}
CallSite<Func<CallSite, object, object, object>> setter;
public override void SetValue(object component, object value) {
    if (setter == null) {
        var binder = Microsoft.CSharp.RuntimeBinder.RuntimeBinder.GetInstance();
        var payload = new Microsoft.CSharp.RuntimeBinder.CSharpSetMemberPayload(binder, true, null, Name);
        setter = CallSite<Func<CallSite, object, object, object>>.Create(payload);
    }
    setter.Target(setter, Unwrap(component), value);
}

So how do we work with this? I've written a .AsBindable() extension method that wraps an IDynamicObject in my custom type descriptor. We can then bind to the dynamic object as below:

       var duck = new Duck();
dynamic dynamicDuck = duck;
dynamicDuck.Value = "abc";
var bindable = duck.AsBindable();
Application.Run(new Form { DataBindings = {
{ "Text", bindable, "Value" } } });

Cool ;-p

Next time, I might try and reverse things - i.e. rather than an existing IDynamicObject wrapped in a custom descriptor, I'll try and wrap a regular object (with custom 2.0 data-binding) with IDynamicObject to simplify data-access. Of course, HyperDescriptor would still be much quicker! This is just to play with the objects to see how it all works...

Wednesday, 5 November 2008

Immutability and optional parameters

(caveat: based on the October 2008 public VS2010 CTP; everything subject to change)

C# 3.0 made it very easy to initialize objects, in particular via "object initializer" syntax. This allows us to set the properties for an object (and child objects) very simply.

I'll give a simple example - but note that this makes a lot more sense where the class has lots of properties... obviously that isn't conducive to blogging, so you'll have to use your imagination ;-p

public sealed class Person {
    public string Forename { get; set; }
    public string Surname { get; set; }
    // snip: other properties
}
static void Main() {
    Person p = new Person {
        Forename = "Fred",
        Surname = "Flintstone"
    };
}
In the Main method, we create and initialize the Person via the object initializer, setting the Forename and Surname properties.

OK; all good - but what about immutability? Unfortunately, it starts to break down.

public sealed class Person {
    private readonly string forename, surname;
    public string Forename { get { return forename; } }
    public string Surname { get { return surname; } }
    // snip: other properties
}

static void Main() {
    Person p = new Person {
        Forename = "Fred",
        Surname = "Flintstone"
    };
}
Error 1 Property or indexer 'Forename' cannot be assigned to -- it is read only
Error 2 Property or indexer 'Surname' cannot be assigned to -- it is read only

The problem is that object initializers work by invoking the setter, which can't happen if there *is* no setter.
Obviously we could add a constructor that sets both the forename and surname, but keep in mind that we might have 10 properties, and different callers might want to set different permutations of properties. We don't really want to add large numbers of constructors to support each caller's whim.

So what about C# 4.0? This introduces optional and named parameters... as well as making COM a lot more friendly, we can use this on our constructors to provide something close to an object initializer:

public sealed class Person {
    // readonly fields/properties
    private readonly string forename, surname;
    public string Forename { get { return forename; } }
    public string Surname { get { return surname; } }
    // snip: other properties

    // constructor with optional arguments
    public Person(
        string forename = "", string surname = ""
        /* snip: other parameters */) {

        this.forename = forename;
        this.surname = surname;
        // snip: other fields
    }
}
static void Main() {
    Person p = new Person(
        forename: "Fred",
        surname: "Flintstone");
}

Here, the Main method creates a new immutable Person, specifying just the properties that they want to assign. The rest are defaulted to whatever the "=foo" says in the constructor declaration.

Simple; easy.

Tuesday, 4 November 2008

Future Expressions

Last post, I looked at what Expression offered in .NET 3.5; to re-cap, very versatile at building simple queries, but not useful for writing manipulation code - in particular, the only in-built assignment is via Expression.MemberInit etc, which is useful only when creating new objects - or at a push you can perform a single update by obtaining the setter and using Expression.Call.

The interesting news is that this looks very different in .NET 4.0 (based on the public October 2008 CTP). It appears that Expression underpins much of the DLR / dynamic work - so there are all sorts of interesting new options - essentially, Expression is the new CodeDOM... personally I am very excited about this; I've always found Expression a far more intuitive and workable abstraction than the counterparts.

In .NET 3.5, there are 46 values in the ExpressionType enum; in .NET 4.0 (CTP), this jumps to 69, with options for loop constructs (loop/dowhile/break/continue/yield), member assignment, composite statements (block/comma?), and exception handling.

The new entries (again, based on the CTP) are:

  • ActionExpression
  • Assign
  • Block
  • BreakStatement
  • Generator
  • ContinueStatement
  • Delete
  • DoStatement
  • EmptyStatement
  • Extension
  • IndexedProperty
  • LabeledStatement
  • LocalScope
  • LoopStatement
  • OnesComplement
  • ReturnStatement
  • Scope
  • SwitchStatement
  • ThrowStatement
  • TryStatement
  • Unbox
  • Variable
  • YieldStatement

I won't pretend to understand them much yet, but I will be investigating! I also understand that the C# 4.0 compiler doesn't support expressions with statement bodies at the moment (although you can build things by hand - which isn't trivial). It will be interesting to see if this changes (not least because reflector is such a good tool for discovering how you are meant to construct these expressions!).

[update] As an example, I previously gave the example of something you can't do in .NET 3.5:

    Expression<Action<Foo>> action = foo =>
foo.Bar = "def";

Well, here it is in .NET 4.0:

   class Foo {
    public string Bar { get; set; }
    public override string ToString() {
        return Bar;
    }
}
static void Main() {
    var param = Expression.Parameter(typeof(Foo), "foo");
    var assign = Expression.AssignProperty(
        param, typeof(Foo).GetProperty("Bar"),
        Expression.Constant("def", typeof(string)));
    var writeLine = Expression.Call(
        typeof(Console).GetMethod("WriteLine", new[] { typeof(object) }),
        param);

    var action = Expression.Lambda<Action<Foo>>(Expression.Comma(assign, writeLine), param);
    var actionDel = action.Compile();
    Foo foo = new Foo();
    actionDel(foo);
}

And of course, there are obvious issues with existing LINQ implementations: if you use some of the new constructs, you should expect your legacy LINQ provider to barf; and indeed, most of the new options simply don't make sense for a database call, so we shouldn't expect too much to change in LINQ-to-SQL/EF.

Monday, 20 October 2008

Express yourself

I thought I'd start by having a look at expressions - in particular System.Linq.Expressions.Expression.

Expression is an important part of the glue that makes LINQ work, but it has been almost completely hidden away by the spectacular job that C# 3.0 makes of lambda statements and LINQ.

For example, consider the following C# 3.0 query:

    var qry = from foo in source
where foo.Bar == "abc"
select foo;

Because of how query syntax is interpreted, this is identical to the following:

    var qry = source.Where(foo => foo.Bar == "abc"); // Sample A

But what does this mean? Well, first it depends on what our "source" is, or more accurately, whether our Where() method accepts a delegate or an expression tree. For LINQ-to-Objects, the Where will accept a delegate (such as Func<T, bool>), and so (keeping the extension method syntax) this is identical to:

    var qry = source.Where(delegate(Foo foo) {
return foo.Bar == "abc"; });

However, for most other LINQ providers, it is likely that the Where method accepts an expression tree instead (such as Expression<Func<T,bool>>). So what is an expression tree? The short and imprecise answer is that it is an object representation of code - similar in some (very limited) ways to things like CodeDom - but aimed squarely at providing LINQ support - i.e. all the operations you might need for every-day queries, and the ability to point to framework methods for things that can't be expressed purely in expressions.

Doing It The Hard Way

So let's dissect our example: looking at the delegate version, you can see that the function (the argument to Where) accepts an argument called "foo" (of type Foo), performs member-access (.Bar), and performs an equality test with the string constant "abc". So let's try writing that (as an expression) ourselves:

    ParameterExpression foo = Expression.Parameter(typeof(Foo), "foo");
MemberExpression bar = Expression.PropertyOrField(foo, "Bar");
ConstantExpression abc = Expression.Constant("abc", typeof(string));
BinaryExpression test = Expression.Equal(bar, abc);

Expression<Func<Foo, bool>> lambda =
Expression.Lambda<Func<Foo, bool>>( test, foo);
var qry = source.Where(lambda);

Yikes! That looks like a lot of work! And you'd be right... but take it step-by-step and it makes sense:

  • We declare a parameter "foo" of type Foo
  • We perform member-access to get foo.Bar
  • We define a string constant "abc"
  • We test the value of .Bar against this constant
  • Finally, we wrap the whole thing up in a lambda; this is just our way of baking the expression into something usable (and note that we need to explicitly re-state the parameter we expect).
[update] Note that the expression is typed using Expression<T>, where T is a delegate-type that reflects what we want. In this case we want to use the method as a predicate - i.e. to accept a Foo and return a bool - so we use Expression<Func<Foo,bool>>.

Doing It The Easy Way

The good news is that in most cases, we simply don't need to do any of this; the compiler does all the hard work for us; in fact, it can use a number of tricks that can't be done conveniently in regular C#, such as pointing directly at a MethodInfo for the "get" (.Bar); here's what reflector shows for our original "Sample A":

    ParameterExpression CS$0$0000;
IQueryable<Foo> qry = source.Where<Foo>(Expression.Lambda<Func<Foo, bool>>(Expression.Equal(Expression.Property(CS$0$0000 = Expression.Parameter(typeof(Foo), "foo"), (MethodInfo) methodof(Foo.get_Bar)), Expression.Constant("abc", typeof(string)), false, (MethodInfo) methodof(string.op_Equality)), new ParameterExpression[] { CS$0$0000 }));

OK, that isn't the easiest code to read... it is mainly the same as our previous example (but with explicit generics and "params" values). The parameter usage (CS$0$0000) looks tricky, but is still the same - just initialised and used in the same code block.

The two big differences are in how ".Bar" and "==" are performed; note the use of "methodof(Foo.get_Bar)" and "methodof(string.op_Equality)". No, this isn't a C# keyword you didn't know about - it is simply reflector doing the best job it can of displaying something that isn't valid C#, but is valid IL - using a MethodInfo directly instead of via reflection. In our hand-cranked version, the Expression API will get the same answer, but needs to jump through a few more hoops first.

So Why Bother?

It would be valid, at this point, to ask: "If the compiler can write expressions for me, and do it better, why should I care?". And in most day-to-day coding you would be right: please don't start re-writing your LINQ queries with hand-cranked expressions! And in particular, they are pretty-much a nightmare to use (manually) with anonymous types. However, they do have some interesting possibilities. For regular LINQ (even LINQ-to-Objects via .AsQueryable()), this can be useful for things like:

  • Dynamic filters / sorts / etc based on properties that cannot be known at compile-time
  • General expression-tree manipulation: merging trees (at InvocationExpression nodes), re-forming trees, etc

But for me, a more common-place usage is in using expressions as the basis of a micro-compiler: a hidden gem of an Expression<T> is the Compile() method; this lets us turn our lambda into a typed delegate, ready for use. But this isn't using reflection (when executed): it creates regular IL in a dynamic assembly that can be invoked very efficiently. This is as simple as:

    Func<Foo, bool> predicate = lambda.Compile();
Foo obj = new Foo { Bar = "abc" };
bool isAbc = predicate(obj);

Now, obviously there is some cost to this:

  • we need to build an expression tree (with the reflection lookups etc)
  • the library code needs to parse the expression tree (including validation), and emit IL

So this isn't something we'd want to do every time we call the method - but stick the delegate creation in some start-up code (perhaps in a static constructor / type initializer, storing it in a static readonly field) and you have a powerful tool.
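That start-up pattern might look like this (re-using Foo and the "abc" test from earlier; the expensive Compile happens exactly once, in the type initializer):

```csharp
using System;
using System.Linq.Expressions;

class Foo { public string Bar { get; set; } }

static class FooPredicates
{
    // built and compiled once, on first use of the type; cheap thereafter
    public static readonly Func<Foo, bool> IsAbc = BuildIsAbc();

    static Func<Foo, bool> BuildIsAbc()
    {
        var foo = Expression.Parameter(typeof(Foo), "foo");
        var body = Expression.Equal(
            Expression.PropertyOrField(foo, "Bar"),
            Expression.Constant("abc", typeof(string)));
        return Expression.Lambda<Func<Foo, bool>>(body, foo).Compile();
    }
}
```

Usage: `FooPredicates.IsAbc(new Foo { Bar = "abc" })` returns true, at roughly the cost of an ordinary delegate call.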

So What Can We Do?

So far, I've only given a very small glimpse of the expression capabilities available, but the list is quite comprehensive, with no fewer than 60 different methods. This covers most of the common things you want to do, including:

  • all common operators
  • branching (conditional)
  • object construction
  • collection / array construction
  • member access
  • invocation (to both regular .NET methods and to pre-existing lambda expressions)
  • casting / conversion

The biggest thing to note is that expression-trees are designed to encapsulate a functional statement; in particular:

  • it doesn't lend itself to mutating existing instances
  • it doesn't lend itself to writing a series of statements (unless you can chain them via a fluent API or an operator such as +)
  • it should return something (i.e. not Action<T>)

What I mean is that you can't write the following (either in C# or manually):

    Expression<Action<Foo>> action = foo =>
foo.Bar = "def";

since it fails all 3 conditions.
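For the single-assignment case there is an escape hatch, though: build the tree by hand and invoke the property-setter via Expression.Call (this can't be expressed as a C# lambda, but the hand-built tree compiles fine):

```csharp
using System;
using System.Linq.Expressions;

class Foo { public string Bar { get; set; } }

static class Program
{
    static void Main()
    {
        var foo = Expression.Parameter(typeof(Foo), "foo");
        // call set_Bar("def") directly, since Expression has no assignment node in 3.5
        var setBar = Expression.Call(
            foo,
            typeof(Foo).GetProperty("Bar").GetSetMethod(),
            Expression.Constant("def", typeof(string)));
        var action = Expression.Lambda<Action<Foo>>(setBar, foo).Compile();

        var obj = new Foo();
        action(obj);
        Console.WriteLine(obj.Bar); // def
    }
}
```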


So far, I've just tried to give a flavor of what you can do with expression trees. I've covered some more of this in various places that might be interesting:

  • Generic operators [overview | usage] - i.e. +-/* with generic T; this is used in MiscUtil as part of the Push LINQ code
  • Object cloning [here] - efficient object cloning without the pain
  • LINQ object construction [here]
  • Partial LINQ projections [here]
I hope that has given you a taste for expressions, and a few ideas for when to use them.


Blog.Write("Hello, World!")

Hello and welcome, existing and new friends alike...

About me

I'm a C#/.NET/TSQL/etc developer working for RM in Abingdon, UK. I'm also a keen community supporter, mainly in the C#/.NET arenas. Oh, and I spend a lot of time (probably too much time) dabbling with code ;-p

On Oct 1st, Microsoft decided to make me a Visual C# MVP, so it seemed timely to start blogging properly.

Back in the real world, I'm blessed with a fantastic wife, son, and another baby in the works.

About the blog

My plan is to use this blog as a dumping ground - sharing knowledge, writing up ideas (hopefully some good), and just discussing various thoughts on development. If somebody out there reads it, even better ;-p

So, things like:

  • .NET/C# - general thoughts on the language and runtimes
  • System.ComponentModel - custom binding ahoy
  • LINQ - in various guises
  • Expression - the .NET 3.5 Swiss army API
  • protocol buffers, HyperDescriptor, Push LINQ or any of the other mini-projects I've contributed to
  • Anything else that takes my interest

So without further ado:

goto blog;