Friday, 12 December 2008

Brute force (but lazily)

Seemingly an oxymoron, but not in the world of code...

As part of my current mini-project, I recently needed to write some code to use brute force to solve a problem (i.e. test every combination to see what works). There might be 0, 1, or many answers to the problem, and the caller might:

  • just care that there is a solution, but not what it is
  • want any solution
  • want every solution
  • want the only solution
  • etc

Now, first off - brute force is a mixed approach; you need to understand your data and your problem to know whether such an approach will ever work, and what the performance will be. In this case, the candidates were in the crunchable range; it might take a bit of CPU, but it isn't going to be terrible, and we aren't doing it in a tight loop.

I was very happy with my approach to the problem, so I thought I'd share... in short: iterator blocks.

Basically, rather than write a method of the form:

// NOT actual C#!!!
Answer[] SolveProblem(question)
{
List<Answer> answers = new List()
for(lots of crunching, looping, etc)
{
if(answer is good) answers.Add(answer);
}
return answers.ToArray();
}

(or anything more complex involving the number of answers needed, etc), I wrote it as:

// NOT actual C#!!!
IEnumerable<Answer> SolveProblem(question)
{
for(lots of crunching, looping, etc)
{
if(answer is good) yield return answer;
}
}

This gives us lazy (on-demand) processing for free; and the caller can use regular LINQ extension methods to handle the results in the appropriate way; so taking the previous bullets in order (where "qry" is our IEnumerable<Answer>):

  • qry.Any()
  • qry.First()
  • qry.ToArray()
  • qry.Single()
  • everything else you can think of... Skip(), Take(), Where(), FirstOrDefault(), etc

And job done! Every query will read just as much data as it wants; no more, no less. So First() will stop crunching once the first answer is found, where-as ToArray() will run the process to exhaustion. Very tidy.

As a footnote, I should note that my particular problem was recursive (but not massively so); I solved this by simply injecting an extra loop:

foreach(Answer subAnswer
in SolveProblem(different question)
{
yield return subAnswer;
}