Tuesday, July 26, 2011

Emitting Unit Tests from Assemblies Part 1


In the pursuit of correct software, the team and I have written several tools to "discover" the features of various Everest components and create unit tests for them. Here are just a few of the tools we developed:




  • A nifty little gem that reflects the assemblies generated by GPMR, populates as many properties as it can find, serializes the result and compares it to the XSDs
  • The same gem also generates a round-trip test: populate as many properties as possible, serialize, de-serialize, and then compare (to make sure they're the same)
  • A program that reflects the DataTypes library, populates all the values in two instances, then changes one and compares them (to make sure they're not equal)
  • A program that serializes just datatypes and compares equality (datatype round-tripping)
Now, that seems pretty boring, and you might be asking why the heck is he posting this? I'm bored. Well, I'll show you how I wrote the equality generator so you can do the same for your projects.

In order to generate a program that tests equality, we first need an algorithm for it. If we have a class and want to test its .Equals method, we'll need to generate a unit test class with (n + 1) tests, where n is the number of public properties in the type: one inequality test per property, plus one equality test. For example, if our type looks like this:
MyClass
+ Name : String
- Age : Int
+ DOB : DateTime


We'll need to generate 3 unit tests (Age is private, so only Name and DOB count): two testing inequality and one testing equality. The tests generated should look like this:
Test #  Property  A Instance   B Instance   Assertion
1       Name      "Bob"        "Bill"       A Not Equal B
        DOB       01-01-1990   01-01-1990
2       Name      "Bob"        "Bob"        A Not Equal B
        DOB       01-01-1990   01-01-1991
3       Name      "Bob"        "Bob"        A Equal B
        DOB       01-01-1990   01-01-1990
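
The (n + 1) rule can be sketched with a little reflection. This is only an illustration of the counting, not the generator itself: TestPlanner, PlanTests and the sample Person class are hypothetical names made up for this example, and the sketch assumes that only public instance properties count towards n.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical sample type: two public properties, so n = 2 and we expect 3 tests.
public class Person
{
    public string Name { get; set; }
    public DateTime DOB { get; set; }
}

public static class TestPlanner
{
    // Returns one description per planned test: one inequality test per
    // public instance property, followed by a single equality test.
    public static List<string> PlanTests(Type type)
    {
        var plan = new List<string>();
        PropertyInfo[] properties = type.GetProperties(BindingFlags.Public | BindingFlags.Instance);
        foreach (PropertyInfo property in properties)
            plan.Add(String.Format("Test{0}Inequality: only {0} differs, assert A != B", property.Name));
        plan.Add("TestEquality: all properties match, assert A == B");
        return plan;
    }
}
```

Calling TestPlanner.PlanTests(typeof(Person)) yields three entries: one inequality test for each of Name and DOB, and one equality test.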


In our final output, the generated code should look like this:

[TestMethod]
public void TestNameInequality()
{
    MyClass aInstance = new MyClass(), bInstance = new MyClass();
    aInstance.Name = "Bob";
    aInstance.DOB = DateTime.Parse("01-01-1990");
    bInstance.Name = "Bill";
    bInstance.DOB = DateTime.Parse("01-01-1990");
    Assert.AreNotEqual(aInstance, bInstance);
}

First, let's define a structure for holding our test data. Since I couldn't think of a better name, we'll call it TestData. I'm feeling a little lazy right now, so we'll just make it a simple structure:

struct TestData
{
   public String TestName { get; set; }
   public List<KeyValuePair<String, String>> AValues { get; set; }
   public List<KeyValuePair<String, String>> BValues { get; set; }
   public bool IsTrue { get; set; }
}

In this structure, we'll store the name of the test as well as the expected outcome (IsTrue answers the question: are the two instances equal?). The structure also defines the A instance and B instance values as lists of string key/value pairs. Since the program we're writing is going to emit code, and code is ... well, for the most part, a text file, each key/value pair holds the name of a property and the code necessary to set its value.
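
To make that concrete, here is a sketch of what a populated TestData might look like for the first test in the table (Name differs, DOB matches). The TestData structure is repeated so the sketch compiles on its own; TestDataExample and NameInequality are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public struct TestData
{
    public String TestName { get; set; }
    public List<KeyValuePair<String, String>> AValues { get; set; }
    public List<KeyValuePair<String, String>> BValues { get; set; }
    public bool IsTrue { get; set; }
}

public static class TestDataExample
{
    // Builds the data for test 1: the key is the property name,
    // the value is the C# source text that will be emitted to set it.
    public static TestData NameInequality()
    {
        return new TestData
        {
            TestName = "TestNameInequality",
            AValues = new List<KeyValuePair<String, String>>
            {
                new KeyValuePair<String, String>("Name", "\"Bob\""),
                new KeyValuePair<String, String>("DOB", "DateTime.Parse(\"01-01-1990\")")
            },
            BValues = new List<KeyValuePair<String, String>>
            {
                new KeyValuePair<String, String>("Name", "\"Bill\""),
                new KeyValuePair<String, String>("DOB", "DateTime.Parse(\"01-01-1990\")")
            },
            IsTrue = false // the two instances are expected to be unequal
        };
    }
}
```

Note that the values are code fragments, not runtime values: "\"Bob\"" is the text of a string literal, quotes included.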

Next, we'll create a utility function that takes those nice System.Type classes and formats them as a C# type reference. This is a relatively simple process. We start with the method signature:

private static string CreateClassRef(Type t)
{

This utility function will expect a System.Type parameter and will generate the C# datatype reference (as appropriate for code). Next, we'll determine the actual name of the class. In .NET, generic types are represented with a back-tick, so CE<T> is actually CE`1[[.... when you call System.Type.FullName. Since the emitted code won't work with the back-tick, we need to turn CE`1[[ back into CE< ...

String className = t.FullName;
if (className.Contains("`"))
   className = className.Substring(0, className.IndexOf("`"));

Next, the code needs to determine if the type is indeed a generic, and if so, it needs to generate those funky angle brackets. In .NET each generic type definition (ie: CE<T>) can be used to generate one or more generic types (ie: CE<String>). Since .NET doesn't do nasty type erasure like some languages who shall remain nameless (*cough* Java) it is possible to generate a code signature from a System.Type:

if (t.IsGenericType)
{
   className += "<";
   foreach (var genParm in t.GetGenericArguments())
     className += String.Format("{0},", CreateClassRef(genParm));
   className = className.Remove(className.Length - 1);
   className += ">";
}

This little snippet of code will turn CE`1[[System.String, mscorlib... into CE<System.String>. Notice that it calls itself recursively. This handles nested generic types, such as LIST<CE<String>>. Finally, we'll finish the method by returning the generated reference:

   return className;
}
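
Here are the fragments above assembled into one method, with a few sample calls. This is a sketch for experimenting: the method is made public and wrapped in a hypothetical ClassRefDemo class, and Samples is a helper invented for this example. The framework collection types stand in for the Everest types.

```csharp
using System;
using System.Collections.Generic;

public static class ClassRefDemo
{
    // Formats a System.Type as a C# type reference, e.g. List`1[[...]] becomes List<...>.
    public static string CreateClassRef(Type t)
    {
        String className = t.FullName;
        // Strip the back-tick arity suffix and everything after it.
        if (className.Contains("`"))
            className = className.Substring(0, className.IndexOf("`"));
        if (t.IsGenericType)
        {
            className += "<";
            // Recurse so nested generic arguments are formatted as well.
            foreach (var genParm in t.GetGenericArguments())
                className += String.Format("{0},", CreateClassRef(genParm));
            className = className.Remove(className.Length - 1); // drop trailing comma
            className += ">";
        }
        return className;
    }

    public static string[] Samples()
    {
        return new[]
        {
            CreateClassRef(typeof(int)),
            CreateClassRef(typeof(List<string>)),
            CreateClassRef(typeof(Dictionary<string, List<int>>))
        };
    }
}
```

The three samples come out as System.Int32, System.Collections.Generic.List<System.String> and System.Collections.Generic.Dictionary<System.String,System.Collections.Generic.List<System.Int32>>. One caveat: nested types use a "+" in FullName, so they would need extra handling that this sketch does not attempt.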

Our next utility function will be used to populate those nice (albeit scary looking) AValues/BValues properties on the TestData structure. If we think about our outputs, this method will have to generate code that populates all the data in AValues and BValues. The approach I've taken is to write a utility function that returns one key/value pair for a single property. In order for the function to know what it is generating, it needs to know whether you want option 1 or option 2 (i.e. which initializer data) assigned to the property.

/// <summary>
/// Create property value setter
/// </summary>
private static KeyValuePair<string, string> CreatePropertyValue(PropertyInfo info, int p)
{
   string initializer = GetInitializer(info.PropertyType, p);

   return new KeyValuePair<string, string>(info.Name, initializer);
}

I use yet another utility function to actually generate the initializer code. Why? So I can emit code for constructors with parameters, or any .Add methods the program might find. Let's take a look, first my method signature:

private static string GetInitializer(Type type, int p)
{

As you might expect, in order to generate code that initializes something, the function needs to know the type of that something. I'm also propagating the initial value choice (parameter p). Next, I determine whether the type is something I can assign directly:

string initializer = "";
if (type == typeof(byte) || type == typeof(byte?))
    initializer = p.ToString();

This little snippet will generate initialization code for either a byte or a nullable byte. Since p is an integer, my code can simply use the value passed to it (i.e. in code it is ok to write: byte myByte = 3;). Notice I don't append a ";" to the end of the initializer string. Remember, this method can be called to set constructor parameters as well. For all of the simple datatypes I follow the same pattern:

else if (type == typeof(int) || type == typeof(int?))
  initializer = p.ToString();
else if (type == typeof(double) || type == typeof(double?))
  initializer = p.ToString("0.0f");
else if (type == typeof(decimal) || type == typeof(decimal?))
  initializer = String.Format("(decimal){0}", p);
else if (type == typeof(String))
  initializer = String.Format("\"{0}\"", p);
else if (type == typeof(bool) || type == typeof(bool?))
  initializer = Convert.ToBoolean(p).ToString().ToLower();

Next, my function may encounter a DateTime. Since I can't just assign my p value to a DateTime, I have to do something a little more complex. I've chosen to create a date on January 1, 201x, where the year depends on p.

else if (type == typeof(DateTime) || type == typeof(DateTime?))
   initializer = String.Format("DateTime.Parse(\"201{0}-01-01\")", p + 1);
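
To see what these branches emit, here is a standalone sketch that gathers the simple cases into one method. SimpleInitializer and For are hypothetical names for this example. Two things are assumptions rather than part of the code above: the sketch formats numbers with the invariant culture so the emitted literals do not depend on the machine's locale, and the date branch uses the {0} placeholder with the January 1 date described in the text.

```csharp
using System;
using System.Globalization;

public static class SimpleInitializer
{
    // Returns C# source text that initializes a value of the given type,
    // or an empty string when the type is not one of the simple cases.
    public static string For(Type type, int p)
    {
        if (type == typeof(byte) || type == typeof(byte?) ||
            type == typeof(int) || type == typeof(int?))
            return p.ToString(CultureInfo.InvariantCulture);
        if (type == typeof(double) || type == typeof(double?))
            return p.ToString("0.0", CultureInfo.InvariantCulture); // a double literal such as 1.0
        if (type == typeof(decimal) || type == typeof(decimal?))
            return String.Format(CultureInfo.InvariantCulture, "(decimal){0}", p);
        if (type == typeof(String))
            return String.Format(CultureInfo.InvariantCulture, "\"{0}\"", p);
        if (type == typeof(bool) || type == typeof(bool?))
            return Convert.ToBoolean(p).ToString().ToLowerInvariant(); // 0 is false, anything else true
        if (type == typeof(DateTime) || type == typeof(DateTime?))
            return String.Format(CultureInfo.InvariantCulture, "DateTime.Parse(\"201{0}-01-01\")", p + 1);
        return String.Empty;
    }
}
```

With p = 1, a String property gets "1" (quotes included), a bool gets true, and a DateTime gets DateTime.Parse("2012-01-01"). With p = 0, the same properties get "0", false and DateTime.Parse("2011-01-01"), which is what makes the A and B instances differ.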

If my type is an array, I can emit code that generates an array. So, for example, if my type is byte[] I can emit the code new byte[] { 1 } (again it is perfectly legal to emit: byte[] myByte = new byte[] { 1 };). However, because my class might be a more complex object, I recursively call the GetInitializer method (example emitting: DateTime[] myDates = new DateTime[] { DateTime.Parse("2012-01-01") };)

else if (type.IsArray)
{
   initializer = String.Format("new {0} {{ {1} }}", CreateClassRef(type), GetInitializer(type.GetElementType(), p));
}


If my type is an enumeration, well, that is easy: we can just pick one of its values to assign (the first named value when p is 0, the last one otherwise):

else if (type.IsEnum)
{
    initializer = String.Format("{0}.{1}", type.FullName, type.GetFields()[p == 0 ? 1 : type.GetFields().Count() - 1].Name);
}
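
The index arithmetic in that snippet appears to rely on GetFields() returning the enum's hidden value__ instance field at position 0, which is why it starts at index 1. The ordering of GetFields() is not something the documentation promises, so here is a sketch of an alternative that asks for the names directly with Enum.GetNames. EnumInitializer, For and the Flavor enum are hypothetical names made up for this example.

```csharp
using System;

// Hypothetical sample enumeration.
public enum Flavor { Vanilla, Chocolate, Mint }

public static class EnumInitializer
{
    // Emits "Namespace.EnumType.Member": the first named constant
    // when p == 0, otherwise the last one.
    public static string For(Type enumType, int p)
    {
        string[] names = Enum.GetNames(enumType);
        if (names.Length == 0)
            return String.Empty; // nothing to pick from
        string chosen = p == 0 ? names[0] : names[names.Length - 1];
        return String.Format("{0}.{1}", enumType.FullName, chosen);
    }
}
```

For the sample enum, p = 0 picks Vanilla and any other p picks Mint, so the two instances under test still receive different values.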

Finally, we come to complex classes. Let's say I call the GetInitializer method and the type parameter is MyClass, well I'd have to emit code to construct a MyClass object (example: MyClass aInstance = new MyClass();). But, in order to make sure I generate good code, I first have to ensure I'm not constructing an abstract class:

else if (!type.IsAbstract)
{

Then, I have to look for a parameterized constructor for which I can create an initializer for every parameter. I wanted to be adventurous and always call a parameterized constructor. If you're more cautious, you can remove the o.GetParameters().Length > 0 part and call parameterless constructors as well.

var ctor = Array.Find<ConstructorInfo>(type.GetConstructors(), o => o.GetParameters().Length > 0 && !Array.Exists<ParameterInfo>(o.GetParameters(), pa => String.IsNullOrEmpty(GetInitializer(pa.ParameterType, p))));

Then it's easy: if we didn't find a suitable constructor, we can't create the type, so we just return nothing:

if (ctor == null)
   return string.Empty;

Otherwise, we start constructing our constructor call. First, we have to emit the appropriate constructor call, ie: new typeX( where typeX is our appropriate C# code reference to the type (for which I can call CreateClassRef):

else
{
   StringBuilder initBuilder = new StringBuilder(String.Format("new {0}(", CreateClassRef(type)));

Then, the code iterates over the parameters, and appends the parameter data to the function.

   foreach (ParameterInfo pi in ctor.GetParameters())
     initBuilder.AppendFormat("{0},", GetInitializer(pi.ParameterType, p));
   if (initBuilder[initBuilder.Length - 1] == ',') initBuilder.Remove(initBuilder.Length - 1, 1);
   initBuilder.Append(")");
   initializer = initBuilder.ToString();
}
}
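
Here is the constructor branch in isolation, as a sketch you can run on its own. CtorEmitter, Emit, ArgumentFor and the Money class are hypothetical names invented for this example. ArgumentFor is a deliberately tiny stand-in for GetInitializer that only understands int and string, and the sketch uses Type.FullName where the real code calls CreateClassRef.

```csharp
using System;
using System.Reflection;
using System.Text;

// Hypothetical sample type with only a parameterized constructor.
public class Money
{
    public Money(int amount, string currency) { }
}

public static class CtorEmitter
{
    // Stand-in for GetInitializer: handles only int and string.
    private static string ArgumentFor(Type type, int p)
    {
        if (type == typeof(int)) return p.ToString();
        if (type == typeof(string)) return String.Format("\"{0}\"", p);
        return String.Empty;
    }

    public static string Emit(Type type, int p)
    {
        // First public constructor that has parameters, all of which we can initialize.
        ConstructorInfo ctor = Array.Find(type.GetConstructors(),
            c => c.GetParameters().Length > 0 &&
                 !Array.Exists(c.GetParameters(),
                     pa => String.IsNullOrEmpty(ArgumentFor(pa.ParameterType, p))));
        if (ctor == null)
            return String.Empty;

        StringBuilder builder = new StringBuilder(String.Format("new {0}(", type.FullName));
        foreach (ParameterInfo pi in ctor.GetParameters())
            builder.AppendFormat("{0},", ArgumentFor(pi.ParameterType, p));
        // StringBuilder has no EndsWith, so inspect the last character directly.
        if (builder[builder.Length - 1] == ',')
            builder.Remove(builder.Length - 1, 1);
        builder.Append(")");
        return builder.ToString();
    }
}
```

For the sample class, Emit(typeof(Money), 1) produces a call of the form new Money(1,"1"), with the type name fully qualified. A type with no usable constructor, such as object here, produces an empty string.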

Finally, if my type has an Add method, I can append a collection initializer. So, if my type reference is List<Int32>, I can emit the code: List<Int32> d = new List<Int32>() { 0 };. That is what this snippet does:

if (type.GetMethod("Add") != null && type.IsGenericType)
{
   initializer += String.Format(" {{ {0} }}", GetInitializer(type.GetGenericArguments()[0], p));
}
return initializer;
}
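
One thing to watch with that last check: Type.GetMethod(string) throws an AmbiguousMatchException when a type has more than one public method called Add. Here is a sketch of a more defensive variant that scans GetMethods() instead, and that also leaves an empty initializer alone. CollectionInitializer and Append are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollectionInitializer
{
    // Appends " { element }" to a constructor call when the type is generic
    // and exposes at least one public Add method; otherwise returns the call unchanged.
    public static string Append(Type type, string constructorCall, string elementInitializer)
    {
        bool hasAdd = type.GetMethods().Any(m => m.Name == "Add");
        if (hasAdd && type.IsGenericType && !String.IsNullOrEmpty(constructorCall))
            return String.Format("{0} {{ {1} }}", constructorCall, elementInitializer);
        return constructorCall;
    }
}
```

For example, Append(typeof(List<int>), "new List<int>()", "0") gives new List<int>() { 0 }, while a type without an Add method gets its initializer back untouched.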

So, where does that leave the program? Well, we have a structure that describes our unit tests, a function that creates fancy C# code declarations for classes (CreateClassRef), a function that populates the AValues/BValues' KeyValuePair, and a function that creates the necessary code to initialize an object.

In my next post, I'll show how it all comes together to generate unit tests.
