SharePoint: Isolating Test Data

If you extract the change sets described in the previous installment of this series into a separate installer class, you can use it to create data structures without installing WSPs. This enables automated integration tests for the installer itself, and for code based on the resulting data structures. Automated tests should be reproducible: if you run them twice without changing the code in between, each run should yield the same result. If you run tests against the SharePoint object model, you will notice that SharePoint persists your changes between two test runs. This means that two consecutive runs can differ if the second run evaluates data left behind by the first.

The approach familiar from SQL databases doesn’t help in this case. Most SQL servers support transactions, the basic tool for integration tests against databases. You start the transaction during the test setup and perform a rollback in the tear down method. This leaves the database effectively untouched. SharePoint does not support transactions, so this way is out of reach.
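For contrast, the rollback pattern that works against a plain SQL database can be sketched as follows. This is only an illustration of what SharePoint lacks; the class name RollbackFixture and its method names are hypothetical, and the sketch assumes the code under test opens its ADO.NET connections inside the ambient transaction.

```csharp
using System;
using System.Transactions;

// Hypothetical helper for database integration tests (not part of the original article).
public class RollbackFixture : IDisposable
{
    private TransactionScope scope;

    // Call from the test setup: connections opened from now on
    // enlist in the ambient transaction by default.
    public void BeginTest()
    {
        scope = new TransactionScope();
    }

    // Call from the tear down: disposing the scope without calling
    // Complete() rolls back everything the test wrote.
    public void EndTest()
    {
        if (scope != null)
        {
            scope.Dispose();
            scope = null;
        }
    }

    public void Dispose()
    {
        EndTest();
    }
}
```

Nothing comparable can be wrapped around calls to the SharePoint object model, which is why the rest of this article looks for isolation elsewhere.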

A common solution is to rely on mock objects. If you run your tests using mocks, your data won’t reach SharePoint, and won’t be persisted. This is feasible when the system under test is the business logic. But in many cases, the integration with SharePoint is more critical than the business logic itself. The object model exhibits some strange behavior which you probably won’t mirror in your mock objects. Take the SPWeb class as an example. When you create an instance, then add a new user-defined field type and look at the field types exposed by your SPWeb instance, you will see the old list, which does not include your new type. Somewhere deep inside SPWeb this list is cached, and you cannot influence it. Similar behavior can be observed for the Properties list. This can result in hard-to-find bugs. The second important source of errors hidden by mocking is the invisible dependency chain. Switch on forms-based authentication, and SPWeb.EnsureUser will actually require a web.config with the appropriate settings for System.Web.Security.Roles.Providers. Although this is reasonable given the nature of forms-based authentication, it is a source of confusion, since the same call runs fine in a web context and fails in console applications or automated tests. Given these drawbacks, mocking the SharePoint object model should be handled with care.

Another source of inspiration can be unit tests. Main memory doesn’t support transactions either, yet unit tests run isolated from each other due to the lack of persistence. The point is that new main memory is allocated for each test run. Even if the physical bytes are reused, they are logically unused. You can mirror this by creating a new environment for each test run. Similar to memory-based unit tests, this environment is used once and then discarded. SharePoint provides different levels of isolation: you can create a new farm, a new web application or a new site collection. Creating a new farm provides perfect isolation, but takes so many resources that it is not feasible in practice. New site collections provide isolation of lists and content types, but share installed solutions, user-defined field types and the like. Web applications fall somewhere in between. We prefer using one site collection per test, since these are relatively cheap to create and sufficient in many cases. Creating a new web application is orders of magnitude slower.
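Creating a throwaway site collection through the object model might look like the following minimal sketch. The class name TestSiteFactory, the web application URL, the owner account and the "sites" managed path are assumptions made for illustration; adjust them to your farm.

```csharp
using System;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Administration;

// Hypothetical helper (not from the original article).
public static class TestSiteFactory
{
    // Assumption: a web application reserved for integration tests.
    private static readonly Uri TestWebApplicationUrl = new Uri("http://localhost:8080");

    public static SPSite CreateSite()
    {
        SPWebApplication webApplication = SPWebApplication.Lookup(TestWebApplicationUrl);

        // A GUID in the URL keeps concurrent test runs from colliding.
        string url = "sites/test-" + Guid.NewGuid().ToString("N");

        // Assumptions: English (1033), team site template, a dedicated owner account.
        return webApplication.Sites.Add(
            url,
            "Integration test",
            "Disposable site collection for one test run",
            1033,
            "STS#0",
            @"DOMAIN\testowner",
            "Test Owner",
            "testowner@example.com");
    }
}
```

The caller owns the returned SPSite and is responsible for deleting and disposing it after the test.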

You gain another order of magnitude in execution speed when you pre-allocate the test site collections. A Windows service in the background can ensure that there are always a few dozen site collections ready to be used as a testing environment. Each test run then takes one of them (if available), marks it as used and deletes it when it’s done:

public SPSite GetSite()
{
  // Sites yielded by UnusedSites are disposed as soon as the enumeration ends,
  // so only the ID is taken from it and a fresh SPSite is opened for the caller.
  var id = UnusedSites.Select(s => (Guid?)s.ID).FirstOrDefault();
  var site = id.HasValue ? new SPSite(id.Value) : CreateSite();
  site.RootWeb.AllProperties["IsUsed"] = true;
  site.RootWeb.Update();
  return site;
}

private static SPSite CreateSite()
{
  var site = WebApplicationHelper.CreateSite();
  site.RootWeb.AllProperties["IsReady"] = true;
  site.RootWeb.Update();
  return site;
}

It is not trivial to determine whether there is a usable site collection. SharePoint likes throwing exceptions when you access a site collection during its creation or deletion:

public static IEnumerable<SPSite> UnusedSites
{
  get { return TryFilter(s => !IsUsed(s) && IsReady(s)); }
}
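From the point of view of a test, using the cache could look like the sketch below. It assumes NUnit and a hypothetical class name SiteCache for the class that holds GetSite(); the test body is only a placeholder.

```csharp
using Microsoft.SharePoint;
using NUnit.Framework;

[TestFixture]
public class OrdersListTests
{
    private SPSite site;

    [SetUp]
    public void SetUp()
    {
        // SiteCache is a hypothetical name for the class containing GetSite().
        site = new SiteCache().GetSite();
    }

    [TearDown]
    public void TearDown()
    {
        if (site == null)
            return;
        try
        {
            // The site collection is used once and then discarded.
            site.Delete();
        }
        finally
        {
            site.Dispose();
            site = null;
        }
    }

    [Test]
    public void CreatesOrdersList()
    {
        SPWeb web = site.RootWeb;
        var listId = web.Lists.Add("Orders", "Test list", SPListTemplateType.GenericList);
        Assert.AreEqual("Orders", web.Lists[listId].Title);
    }
}
```

Because every test starts from a fresh site collection, the list created here cannot leak into the next run.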

private static IEnumerable<SPSite> Sites
{
  get { return WebApplicationHelper.WebApplication.Sites; }
}

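private static IEnumerable<SPSite> TryFilter(Func<SPSite, bool> filter)
{
  foreach (var site in Sites)
  {
    try
    {
      // C# does not allow yield return inside a try block that has a catch
      // clause, hence the nested try around the filter only.
      try
      {
        if (!filter(site))
          continue;
      }
      catch
      {
        // The site is probably being created or deleted right now; skip it.
        continue;
      }
      yield return site;
    }
    finally
    {
      // Runs when the caller moves on to the next site or stops enumerating,
      // so callers must not keep a reference to a yielded site.
      site.Dispose();
    }
  }
}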

private static bool? TryParseBool(object value)
{
  if (value == null)
    return null;
  bool result;
  if (bool.TryParse(value.ToString(), out result))
    return result;
  return null;
}

private static bool IsReady(SPSite site)
{
  return TryParseBool(site.RootWeb.AllProperties["IsReady"]) ?? false;
}

private static bool IsUsed(SPSite site)
{
  return TryParseBool(site.RootWeb.AllProperties["IsUsed"]) ?? false;
}

If a test run fails in a way that the tear down method is not reached, for example when you stop a run in the debugger, the site collection won’t get deleted. You can add a garbage collector to the Windows service to remove these zombie site collections:

public static IEnumerable<SPSite> ZombieSites(TimeSpan timeOutReady, TimeSpan timeOutUsed)
{
  return TryFilter(site => IsZombie(site, timeOutReady, timeOutUsed));
}

private static bool IsZombie(SPSite site, TimeSpan timeOutReady, TimeSpan timeOutUsed)
{
  var age = DateTime.UtcNow - site.LastContentModifiedDate;
  var isZombie = (!IsReady(site) && age > timeOutReady) ||
                 (IsUsed(site) && age > timeOutUsed);
  return isZombie;
}
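The cleanup pass inside the service might then be sketched as follows. The class name ZombieCollector is hypothetical, and the sketch assumes that the sequence it receives comes from ZombieSites above, whose sites are disposed while the enumeration advances. It therefore collects the IDs first and reopens each site before deleting it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.SharePoint;

// Hypothetical helper (not from the original article).
public static class ZombieCollector
{
    public static void Collect(IEnumerable<SPSite> zombies)
    {
        // Copy the IDs while each yielded site is still alive.
        List<Guid> ids = zombies.Select(s => s.ID).ToList();

        foreach (Guid id in ids)
        {
            try
            {
                using (var site = new SPSite(id))
                {
                    site.Delete();
                }
            }
            catch (Exception)
            {
                // The site may be mid-creation or mid-deletion;
                // leave it for the next cleanup pass.
            }
        }
    }
}
```

A timer in the service could call something like ZombieCollector.Collect(ZombieSites(TimeSpan.FromMinutes(10), TimeSpan.FromHours(1))) every few minutes; the two timeouts are placeholder values, not recommendations.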

Using this cache reduces the overhead for each test run to about 300 ms. This is still a lot when compared to unit tests. On the other hand, it is fast enough to encourage developers to write and run a few tests for the code they are working on, perhaps even using test-driven development.

We developed and implemented the site collection cache at adesso. We might publish the implementation, be it open or closed source, but this is still undecided. If you are interested in this, please leave a comment. I am not the one who decides, but we are actively seeking opinions on this, so you will actually influence the outcome.