Tuesday, July 26, 2011

Emitting Unit Tests from Assemblies Part 1


In the pursuit of correct software, the team and I have written several tools to "discover" the features of various Everest components and create unit tests for them. Here are just a few of the tools we developed:




  • A nifty little gem that reflects the generated assemblies created by GPMR and populates as many properties as it can find, serializes and compares to XSDs
  • The same gem also generates a round-trip test, that is, populate as many properties as possible, serialize then de-serialize and then compare (to make sure they're the same)
  • A program that reflects the DataTypes library, populates all the values in two instances, then changes one and compares them (to make sure they're not equal)
  • A program that serializes just datatypes and compares equality (datatype round-tripping)

Now, that seems pretty boring, and you might be asking, "Why the heck is he posting this?" I'm bored. Well, I'll show you how I wrote the equality generator so you can do the same for your projects.

In order to generate a program to test equality, we need to think of an algorithm for such a program. If we have two instances of a class and we want to test the .Equals method, we'll need to generate a unit test class with (n + 1) tests, where n is the number of public (+) properties in the type: one inequality test per property, plus one equality test. For example, if our type looks like this:
MyClass
+ Name : String
- Age : Int
+ DOB : DateTime


We'll need to generate 3 unit tests, two testing inequality and one testing equality. The tests generated should look like this:
Test #  Property  A Instance   B Instance   Assertion
1       Name      "Bob"        "Bill"       A Not Equal B
        DOB       01-01-1990   01-01-1990
2       Name      "Bob"        "Bob"        A Not Equal B
        DOB       01-01-1990   01-01-1991
3       Name      "Bob"        "Bob"        A Equal B
        DOB       01-01-1990   01-01-1990


In our final output, the generated code should look like this:

[TestMethod]
public void TestNameInequality()
{
    MyClass aInstance = new MyClass(), bInstance = new MyClass();
    aInstance.Name = "Bob";
    aInstance.DOB = DateTime.Parse("01-01-1990");
    bInstance.Name = "Bill";
    bInstance.DOB = DateTime.Parse("01-01-1990");
    Assert.AreNotEqual(aInstance, bInstance);
}

First, let's define a structure for holding our test data. Since I couldn't think of a better name, we'll call it TestData. I'm feeling a little lazy right now, so we'll just make it a simple structure:

struct TestData
{
   public String TestName { get; set; }
   public List<KeyValuePair<String, String>> AValues { get; set; }
   public List<KeyValuePair<String, String>> BValues { get; set; }
   public bool IsTrue { get; set; }
}

In this structure, we'll store the name of the test as well as the expected outcome (i.e., IsTrue means "the two instances should be equal"). The structure also defines the A Instance and B Instance values as lists of string key/value pairs. Since the program we're writing is going to emit code, and code is ... well, for the most part, a text file, each key/value pair holds the name of a property and the code needed to set its value.
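
To make the shape of TestData concrete, here is a minimal sketch that fills in test #1 from the table by hand and renders it as a test method. TestDataExample, NameInequality and RenderTest are hypothetical names of mine, not part of the generator described in this post, and the real emitter may well be structured differently.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

struct TestData
{
    public String TestName { get; set; }
    public List<KeyValuePair<String, String>> AValues { get; set; }
    public List<KeyValuePair<String, String>> BValues { get; set; }
    public bool IsTrue { get; set; }
}

static class TestDataExample
{
    // Hand-built row for test #1: Name differs, DOB matches.
    public static TestData NameInequality()
    {
        return new TestData
        {
            TestName = "TestNameInequality",
            AValues = new List<KeyValuePair<String, String>>
            {
                new KeyValuePair<String, String>("Name", "\"Bob\""),
                new KeyValuePair<String, String>("DOB", "DateTime.Parse(\"01-01-1990\")")
            },
            BValues = new List<KeyValuePair<String, String>>
            {
                new KeyValuePair<String, String>("Name", "\"Bill\""),
                new KeyValuePair<String, String>("DOB", "DateTime.Parse(\"01-01-1990\")")
            },
            IsTrue = false
        };
    }

    // Hypothetical renderer: key = property name, value = code that sets it.
    public static string RenderTest(string typeName, TestData td)
    {
        var sb = new StringBuilder();
        sb.AppendLine("[TestMethod]");
        sb.AppendLine(String.Format("public void {0}()", td.TestName));
        sb.AppendLine("{");
        sb.AppendLine(String.Format("    {0} aInstance = new {0}(), bInstance = new {0}();", typeName));
        foreach (var kv in td.AValues)
            sb.AppendLine(String.Format("    aInstance.{0} = {1};", kv.Key, kv.Value));
        foreach (var kv in td.BValues)
            sb.AppendLine(String.Format("    bInstance.{0} = {1};", kv.Key, kv.Value));
        // IsTrue selects the assertion: equal or not equal.
        sb.AppendLine(String.Format("    Assert.{0}(aInstance, bInstance);",
            td.IsTrue ? "AreEqual" : "AreNotEqual"));
        sb.AppendLine("}");
        return sb.ToString();
    }
}
```

Calling TestDataExample.RenderTest("MyClass", TestDataExample.NameInequality()) produces text along the lines of the TestNameInequality method shown earlier.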

Next, we'll create a utility function that can take those nice System.Type classes and format them as a C# type reference. This is a relatively simple process. We start with the method signature:

private static string CreateClassRef(Type t)
{

This utility function will expect a System.Type parameter and will generate the C# datatype reference (as appropriate for code). Next, we'll determine the actual name of the class. In .NET, generic types are represented with a back-tick, so CE<T> is actually CE`1[[.... when you call System.Type.FullName. Since the emitted code won't work with the back-tick, we need to turn CE`1[[ back into CE< ...

String className = t.FullName;
if (className.Contains("`"))
   className = className.Substring(0, className.IndexOf("`"));

Next, the code needs to determine if the type is indeed a generic, and if so, it needs to generate those funky angle brackets. In .NET each generic type definition (ie: CE<T>) can be used to generate one or more generic types (ie: CE<String>). Since .NET doesn't do nasty type erasure like some languages who shall remain nameless (*cough* Java) it is possible to generate a code signature from a System.Type:

if (t.IsGenericType)
{
   className += "<";
   foreach (var genParm in t.GetGenericArguments())
     className += String.Format("{0},", CreateClassRef(genParm));
   className = className.Remove(className.Length - 1);
   className += ">";
}

This little snippet of code will turn CE`1[[System.String, mscorlib... into CE<System.String>. Notice that it calls itself recursively. This is to handle nested generic types, such as LIST<CE<String>>. Finally, we'll finish the method by returning the generated reference:

   return className;
}
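
Put together, the pieces above can be exercised as follows. This is just the same function assembled into one self-contained block, made public so it can be called from outside; the wrapper class name ClassRefExample is my own. As written it does not handle nested types (whose FullName contains a "+") or open generic parameters (whose FullName is null), so treat it as a sketch for closed, top-level types.

```csharp
using System;
using System.Collections.Generic;

static class ClassRefExample
{
    public static string CreateClassRef(Type t)
    {
        // Strip the back-tick arity suffix, e.g. "List`1[[...]]" becomes "List"
        String className = t.FullName;
        if (className.Contains("`"))
            className = className.Substring(0, className.IndexOf("`"));

        // Re-create the angle-bracket argument list, recursing for nested generics
        if (t.IsGenericType)
        {
            className += "<";
            foreach (var genParm in t.GetGenericArguments())
                className += String.Format("{0},", CreateClassRef(genParm));
            className = className.Remove(className.Length - 1);
            className += ">";
        }
        return className;
    }
}
```

For example, ClassRefExample.CreateClassRef(typeof(List<KeyValuePair<string, int>>)) yields System.Collections.Generic.List<System.Collections.Generic.KeyValuePair<System.String,System.Int32>>.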

Our next utility function will be used to populate those nice (albeit scary looking) AValues/BValues properties on the TestData structure. If we think about our outputs, this method will have to generate code that populates all the data in AValues and BValues. The approach I've taken is to write a utility function that returns one key/value pair for a single property. In order for the function to know what it is generating, it needs to know if you want option 1 or option 2 (i.e., which initializer data) to be assigned to the property.

/// <summary>
/// Create property value setter
/// </summary>
private static KeyValuePair<string, string> CreatePropertyValue(PropertyInfo info, int p)
{
   string initializer = GetInitializer(info.PropertyType, p);

   return new KeyValuePair<string, string>(info.Name, initializer);
}

I use yet another utility function to actually generate the initializer code. Why? So I can emit code for constructors with parameters, or any .Add methods the program might find. Let's take a look, first my method signature:

private static string GetInitializer(Type type, int p)
{

As you might expect, in order to generate code to initialize something, the function will need to know the type of the something. Also, I'm propagating the initial value choice (parameter p). Next, I start to determine if the type is something I can directly or easily just assign:

string initializer = "";
if (type == typeof(byte) || type == typeof(byte?))
    initializer = p.ToString();

This little snippet will generate initialization code for either a byte or nullable byte. Since p is an integer, my code can simply use the value passed to it (i.e., in code it is OK to write: byte myByte = 3;). Notice I don't append a ";" to the end of the initializer string; remember, this method can be called to set constructor parameters as well. For all of the simple datatypes I follow the same pattern:

else if (type == typeof(int) || type == typeof(int?))
  initializer = p.ToString();
else if (type == typeof(double) || type == typeof(double?))
  initializer = p.ToString("0.0f");
else if (type == typeof(decimal) || type == typeof(decimal?))
  initializer = String.Format("(decimal){0}", p);
else if (type == typeof(String))
  initializer = String.Format("\"{0}\"", p);
else if (type == typeof(bool) || type == typeof(bool?))
  initializer = Convert.ToBoolean(p).ToString().ToLower();

Next, my function may encounter something that is a DateTime. Since I can't just assign my p value to a DateTime, I have to do something a little more complex. I've chosen to create a date on January 1, 201x.

else if (type == typeof(DateTime) || type == typeof(DateTime?))
   initializer = String.Format("DateTime.Parse(\"201{0}-01-01\")", p + 1);
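
One thing to watch with the numeric branches: Int32.ToString with a format string uses the current culture, so on a machine whose decimal separator is a comma the emitted double literal would not compile. Below is a minimal sketch of a culture-independent variant for a few of the simple types. SimpleInitializer is a hypothetical helper of mine, not the GetInitializer from this post, and it emits 1.0 rather than 1.0f for doubles.

```csharp
using System;
using System.Globalization;

static class SimpleInitializerExample
{
    // Hypothetical helper: returns "" for types it does not handle.
    public static string SimpleInitializer(Type type, int p)
    {
        if (type == typeof(double) || type == typeof(double?))
            return p.ToString("0.0", CultureInfo.InvariantCulture);
        if (type == typeof(bool) || type == typeof(bool?))
            return Convert.ToBoolean(p).ToString().ToLowerInvariant();
        if (type == typeof(DateTime) || type == typeof(DateTime?))
            return String.Format(CultureInfo.InvariantCulture,
                "DateTime.Parse(\"201{0}-01-01\")", p + 1);
        return String.Empty;
    }
}
```

With p = 1 this gives 1.0 for a double, true for a bool, and DateTime.Parse("2012-01-01") for a DateTime.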

If my type is an array, I can emit code that generates an array. So, for example, if my type is byte[] I can emit the code new byte[] { 1 } (again it is perfectly legal to emit: byte[] myByte = new byte[] { 1 };). However, because my class might be a more complex object, I recursively call the GetInitializer method (example emitting: DateTime[] myDates = new DateTime[] { DateTime.Parse("2012-01-01") };)

else if (type.IsArray)
{
   // GetElementType() returns the array's element type (e.g. byte for byte[])
   initializer = String.Format("new {0} {{ {1} }}", CreateClassRef(type), GetInitializer(type.GetElementType(), p));
}


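If my type is an enumeration, well, that is easy: we can just pick one of the enumeration's values to assign. The choice isn't actually random; option 0 takes the field at index 1 and any other option takes the last field (index 0 of GetFields() is skipped, presumably because it holds the enum's internal value__ field):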

else if (type.IsEnum)
{
    initializer = String.Format("{0}.{1}", type.FullName, type.GetFields()[p == 0 ? 1 : type.GetFields().Count() - 1].Name);
}
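
The index arithmetic above depends on the order in which GetFields() happens to return an enum's fields. An alternative sketch, assuming all you need is two distinct member names, is to use Enum.GetNames, which returns only the named constants ordered by their values. EnumInitializer is a hypothetical helper of mine, not part of the original generator.

```csharp
using System;

static class EnumInitializerExample
{
    // Hypothetical helper: lowest-valued name for p == 0, highest-valued otherwise.
    public static string EnumInitializer(Type type, int p)
    {
        string[] names = Enum.GetNames(type);
        if (names.Length == 0)
            return String.Empty; // nothing to assign for an empty enum
        string name = p == 0 ? names[0] : names[names.Length - 1];
        return String.Format("{0}.{1}", type.FullName, name);
    }
}
```

For System.DayOfWeek this emits System.DayOfWeek.Sunday for option 0 and System.DayOfWeek.Saturday otherwise.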

Finally, we come to complex classes. Let's say I call the GetInitializer method and the type parameter is MyClass, well I'd have to emit code to construct a MyClass object (example: MyClass aInstance = new MyClass();). But, in order to make sure I generate good code, I first have to ensure I'm not constructing an abstract class:

else if (!type.IsAbstract)
{

Then, I have to look for a parameterized constructor, where I can create an initializer for all of the parameters in the particular constructor overload. I wanted to be adventurous and always call a parameterized constructor. If you're more cautious, you can remove the o.GetParameters().Length > 0 part and call parameterless constructors as well.

var ctor = Array.Find<ConstructorInfo>(type.GetConstructors(), o => o.GetParameters().Length > 0 && !Array.Exists<ParameterInfo>(o.GetParameters(), pa => String.IsNullOrEmpty(GetInitializer(pa.ParameterType, p))));

Then, it's easy: if we didn't find any suitable constructor, we can't create the type, so we just return nothing.

if (ctor == null)
   return string.Empty;

Otherwise, we start constructing our constructor call. First, we have to emit the appropriate constructor call, ie: new typeX( where typeX is our appropriate C# code reference to the type (for which I can call CreateClassRef):

else
{
   StringBuilder initBuilder = new StringBuilder(String.Format("new {0}(", CreateClassRef(type)));

Then, the code iterates over the parameters, and appends the parameter data to the function.

   foreach (ParameterInfo pi in ctor.GetParameters())
     initBuilder.AppendFormat("{0},", GetInitializer(pi.ParameterType, p));
   // StringBuilder has no EndsWith method, so check the last character directly
   if (initBuilder[initBuilder.Length - 1] == ',') initBuilder.Remove(initBuilder.Length - 1, 1);
   initBuilder.Append(")");
   initializer = initBuilder.ToString();
}
}

Finally, if my type has an Add method, I can append a collection initializer. So, if my type reference is List<Int32>, I can emit the code: List<Int32> d = new List<Int32>() { 0 };. That is what this snippet does:

if (type.GetMethod("Add") != null && type.IsGenericType)
{
   initializer += String.Format(" {{ {0} }}", GetInitializer(type.GetGenericArguments()[0], p));
}
return initializer;
}

So, where does that leave the program? Well, we have a structure that describes our unit tests, a function that creates fancy C# code declarations for classes (CreateClassRef), a function that populates the AValues/BValues' KeyValuePair, and a function that creates the necessary code to initialize an object.

In my next post, I'll show how it all comes together to generate unit tests.

Monday, July 25, 2011

Everest Framework 1.0 Formatter Changes


Well, it's been a while since I've posted anything on this blog, and for good reason. I've been a busy little bee working on several projects, and one of these is the all-important rush to get Everest 1.0 ready for production. Seeing as I have to wait for 5,200 unit tests to finish, I thought I'd take some time to blog about Everest 1.0.

Everest 1.0 introduces many new features, the biggest being jEverest (a Java version of Everest), which I might blog about in the future (I might just make you wait, depends on the mood) but not today. Today's post is about the exciting world of Formatters in Everest 1.0 ... yay! I've never been happy with the way that formatting messages has been handled in Everest and when it came time to implement them in Java, I decided an overhaul was needed (don't worry, I made sure it was backwards compatible). First, let's create a simple message to illustrate the formatting process:

static IGraphable CreateASimpleMessage()
{
   return new MCCI_IN000002CA(
      Guid.NewGuid(),
      DateTime.Now,
      ResponseMode.Immediate,
      MCCI_IN000002CA.GetInteractionId(),
      MCCI_IN000002CA.GetProfileId(),
      ProcessingID.Production,
      AcknowledgementCondition.Always);
}

I've chosen a general acknowledgement for the sample as you can tell :) Anyways, in Everest, we'd traditionally do the following to format the message and get the formatter details:

static void FormatAMessage()
{
    var fmtr = new MARC.Everest.Formatters.XML.ITS1.Formatter()
                  { ValidateConformance = false };
    fmtr.GraphAides.Add(typeof(MARC.Everest.Formatters.XML.Datatypes.R1.Formatter));
    fmtr.GraphObject(Console.OpenStandardOutput(), CreateASimpleMessage());

    // Output details
    foreach (var dtl in fmtr.Details)
      if (dtl.Type == MARC.Everest.Connectors.ResultDetailType.Error)
         Console.WriteLine("Error: {0}", dtl.Message);
}


The process should look familiar if you've used Everest before. We set up the formatter, turn off validation (I know that the message I create is invalid), and graph to the console. Then I can iterate over the Details array in the Formatter object and get the problems encountered while formatting.

While this works, it does present some problems:

  1. Details[] is attached to the instance of Formatter, which means that the formatter cannot be shared across threads or concurrent functions as they will attempt to write to the same array
  2. Anytime we want to reuse the formatter we have to call .Clone() to copy our settings over (or create one of those *gag* factories and call NewInstance())

So, I fixed the design issue in Everest 1.0. I liked the manner in which we send/receive messages using the ISendResult and IReceiveResult interfaces in Connectors, so I decided to model the formatters in the same way. We now have IFormatterGraphResult and IFormatterParseResult, which can be used to share a formatter and call Graph() or Parse() on different threads. Here is an example of how this is used:

static void FormatAMessageInACleanWay()
{
    var fmtr = new MARC.Everest.Formatters.XML.ITS1.Formatter()
         { ValidateConformance = false };
    fmtr.GraphAides.Add(typeof(MARC.Everest.Formatters.XML.Datatypes.R1.Formatter));

    // Now we get an IFormatterGraphResult from Graph()
    IFormatterGraphResult result = fmtr.Graph(Console.OpenStandardOutput(), CreateASimpleMessage());

    // Output details
    foreach(var dtl in result.Details)
       if (dtl.Type == MARC.Everest.Connectors.ResultDetailType.Error)
           Console.WriteLine("Error: {0}", dtl.Message);
}

Much better. I recommend using the new method of Formatting in Everest whenever you can as it is cleaner and we'll probably remove the GraphObject/ParseObject methods in future versions of Everest (not immediately though).

Also, you don't have to worry about all your Everest code not working because of this change. GraphObject and ParseObject are wrappers for this new pattern and are completely backwards compatible (at least that's what about 3,000 unit tests are telling me).
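
To illustrate why returning a result object makes a formatter shareable, here is a minimal, self-contained sketch of the pattern. SketchFormatter and GraphResult are hypothetical stand-ins of mine and are not Everest types; the only point is that the details list lives in the value returned from each call rather than on the formatter instance, so concurrent callers never write to the same list.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a per-call result (not an Everest type).
sealed class GraphResult
{
    public List<string> Details { get; private set; }
    public GraphResult(List<string> details) { Details = details; }
}

// Hypothetical stand-in for a formatter whose settings are only read during Graph.
sealed class SketchFormatter
{
    public bool ValidateConformance { get; set; }

    public GraphResult Graph(string message)
    {
        var details = new List<string>(); // local to this call, never shared
        if (String.IsNullOrEmpty(message))
            details.Add("Error: empty message");
        return new GraphResult(details);
    }
}

static class SharedFormatterDemo
{
    // One formatter instance, many concurrent Graph calls.
    public static int CountErrors(string[] messages)
    {
        var fmtr = new SketchFormatter { ValidateConformance = false };
        int errors = 0;
        Parallel.ForEach(messages, m =>
        {
            GraphResult result = fmtr.Graph(m);
            Interlocked.Add(ref errors, result.Details.Count);
        });
        return errors;
    }
}
```

With the older design, each of those parallel calls would have been appending to the one Details collection on the formatter, which is exactly problem 1 in the list above.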