Friday, March 20, 2020

Parsing SVG documents into useful layout specifications

In a previous post, I discussed using SVG documents to specify layouts in a Unity app. More recently, I started delving into that subject and laid out a list of things to cover:

  1. How to make the SVG easy to interpret visually.
  2. How to convert the SVG into a set of rules.
  3. How to select the right rules given a query string.
  4. How to generate an SVG that explains what went wrong when a test fails.

One of the many things I deferred was how to actually parse the SVG into a set of layout constraints.

Both #2 and #3 are pretty simple and strongly related, so I'll handle them together in this post.

Context


I'm almost not sure this is worth writing; it's so straightforward. I mean, there's a lot of code but none of it is very challenging.

For the sake of completeness, I'll write it anyway.

I'm leaving out things like "how to find the SVG file" and "how to load a resource from an assembly". If you don't know that stuff, I'm sure you can search and find a couple hundred tutorials or just look it up in the documentation.

First, we'll need some code. There's a lot of it and it's all relevant. So, rather than embed it inline, I've decided to put it all into a couple of gists that you can browse at your leisure.

One logical block of code is the SVG parsing and editing code. I'm not going to spend a lot of energy explaining that code. It's pretty straightforward. It just adds a layer of SVG understanding over the basic XML format.

The other relevant body of code is the code that transforms the SVG model into the shape specifications. You can find that, along with the relevant specifications, here.

Separated but Related


The result is an array of adjacent-but-unmixed perspectives, each of which is simple.

For instance, the test's perspective only includes a few types. From the test-binding code's perspective, this is an almost trivial problem:

  1. Reconstitute a specification database from some stream.
  2. Ask it for the specification associated with a key.
  3. Ask that specification to validate an actual shape.
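The three steps above can be sketched roughly as follows. This is only an illustration of how small the test-side surface is: every name here (Bounds, IShapeSpecification, ISpecificationDatabase, LayoutCheck, and the two fakes) is a hypothetical stand-in, not the API from the gists, and the fakes exist only so the flow can run without any SVG parsing.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Every name below is a hypothetical stand-in, not the API from the gists.
public readonly struct Bounds
{
    public Bounds(double left, double top, double right, double bottom)
    {
        Left = left; Top = top; Right = right; Bottom = bottom;
    }

    public double Left { get; }
    public double Top { get; }
    public double Right { get; }
    public double Bottom { get; }
}

public interface IShapeSpecification
{
    // True when the actual shape satisfies the specification.
    bool IsSatisfiedBy(Bounds actual);
}

public interface ISpecificationDatabase
{
    IShapeSpecification Find(string key);
}

public static class LayoutCheck
{
    public static bool Conforms(
        Func<Stream, ISpecificationDatabase> load,
        Stream source, string key, Bounds actual)
    {
        ISpecificationDatabase database = load(source);          // 1. reconstitute
        IShapeSpecification specification = database.Find(key);  // 2. look up by key
        return specification.IsSatisfiedBy(actual);              // 3. validate
    }
}

// Trivial fakes so the flow can run without any SVG parsing.
public sealed class StaysWithin : IShapeSpecification
{
    private readonly Bounds limit;
    public StaysWithin(Bounds limit) { this.limit = limit; }

    public bool IsSatisfiedBy(Bounds actual) =>
        limit.Left <= actual.Left && limit.Top <= actual.Top &&
        limit.Right >= actual.Right && limit.Bottom >= actual.Bottom;
}

public sealed class SingleEntryDatabase : ISpecificationDatabase
{
    private readonly string key;
    private readonly IShapeSpecification specification;

    public SingleEntryDatabase(string key, IShapeSpecification specification)
    {
        this.key = key;
        this.specification = specification;
    }

    public IShapeSpecification Find(string requestedKey) =>
        string.Equals(key, requestedKey, StringComparison.OrdinalIgnoreCase)
            ? specification
            : throw new KeyNotFoundException(requestedKey);
}
```

A test would pass its own loader, for example `LayoutCheck.Conforms(_ => database, Stream.Null, "menu", actualBounds)`, and never see anything about SVG.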


How test-binding code sees shape-specifications

Parsing, interpreting, and editing SVG documents is a little more complicated but it's still fairly straightforward.

SVG parsing/editing classes

Finding all the rectangles associated with a key touches only a few classes.

Building RectangleAssertion instances is an algorithm that inspects the contents of an SVG document

The ShapeSpecification abstraction has several variants, which are depicted below. Their listings can be found in the aforementioned gist.

The ShapeSpecification hierarchy

Compiling those rectangle assertions (which should really be parallelogram assertions) into ShapeSpecification objects also involves a limited subset of these classes.

Compiling specifications is independent of the source shape type

This is the code I think is most worthy of deeper consideration because it shows a useful trick I apply on a regular basis.

Compiling the ShapeSpecification Graphs


The first step is finding all the SVG shapes with conforming IDs, via the following method:

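IEnumerable<RectangleSpecification> FindSelectedRectanglesInDocument(string Key)
{
  // Treat rects, polygons, and polylines uniformly as rectangle rules.
  var SourceRectangles = SvgDocument.Rects.Select(ToRectangleRule)
    .Concat(SvgDocument.Polygons.Select(ToRectangleRule))
    .Concat(SvgDocument.Polylines.Select(ToRectangleRule));

  // Keep shapes whose ID is the key, optionally followed by dot-separated parts.
  // The key is escaped so that characters like '.' in it are matched literally.
  return SourceRectangles.Where(R => Regex.IsMatch(
    R.Origin.Id,
    $@"^{Regex.Escape(Key)}(?:\.[^.]*)*$",
    RegexOptions.IgnoreCase));
}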
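To make the matching rule concrete, here is a small sketch of the same pattern applied to plain ID strings. KeyMatching and its method are hypothetical names, and the call to Regex.Escape is a precaution added in this sketch so that a key containing a dot is matched literally.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: the same pattern shape, applied to bare ID strings.
public static class KeyMatching
{
    // An ID matches when it is the key itself, optionally followed by
    // any number of dot-separated parts. Matching ignores case.
    public static bool IdMatchesKey(string id, string key) =>
        Regex.IsMatch(
            id,
            $@"^{Regex.Escape(key)}(?:\.[^.]*)*$",
            RegexOptions.IgnoreCase);

    public static List<string> Select(IEnumerable<string> ids, string key) =>
        ids.Where(id => IdMatchesKey(id, key)).ToList();
}
```

With the key "menu", the IDs "menu", "menu.inner", and "Menu.Title.Outer" are selected, while "menubar.inner" and "main.menu.inner" are not, because the key has to be the leading part of the ID and has to end at a dot or at the end of the string.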

The list of all relevant shapes can then be converted into a single ShapeSpecification using this method:

ShapeSpecification ConvertRectanglesToShapeSpecification(
  IEnumerable<RectangleSpecification> Rectangles)
{
  // Start with an empty specification and fold each rule into it.
  ShapeSpecification Result = new NoSpecification();
  foreach (var RectangleRule in Rectangles)
  {
    var Next = ConvertRectangleToSpecification(
      RectangleRule.Origin.Id, RectangleRule.Operand);

    // Shapes whose ID carries no recognized indicator yield null; skip them.
    if (Next == null)
      continue;

    // Decorate the rule with error generation and the originating ID as context.
    Next = new AddContextSpecification(
      new GenerateErrorSpecification(
        Next,
        Artifacts,
        RectangleRule.Origin),
      RectangleRule.Origin.Id);
    Result = new CombineShapeSpecification(Result, Next);
  }

  return Result;
}

That method hinges on the ability to convert an individual shape into its corresponding shape specification. This is accomplished with the following:

static ShapeSpecification ConvertRectangleToSpecification(
  string Id, Rectangle ExpectedViewportRectangle)
{
  Id = Id.Trim();

  if (IsOfType("inner"))
    return new ContainsShapeSpecification(ExpectedViewportRectangle);

  if (IsOfType("outer"))
    return new ContainedByShapeSpecification(ExpectedViewportRectangle);

  return null;

  bool IsOfType(string Indicator)
  {
    return Regex.IsMatch(
      Id,
      $@"^(?:[^.]*\.)*{Indicator}$",
      RegexOptions.IgnoreCase);
  }
}
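The indicator test only looks at the last dot-separated part of the ID. A minimal sketch of that check in isolation, using hypothetical names (ConstraintKind, IdSuffix) that are not part of the gists:

```csharp
using System.Text.RegularExpressions;

// Hypothetical names; this only isolates the "last part of the ID" check.
public enum ConstraintKind { None, Inner, Outer }

public static class IdSuffix
{
    public static ConstraintKind Classify(string id)
    {
        id = id.Trim();

        if (EndsWithPart(id, "inner")) return ConstraintKind.Inner;
        if (EndsWithPart(id, "outer")) return ConstraintKind.Outer;
        return ConstraintKind.None;
    }

    // True when the final dot-separated part equals the indicator,
    // ignoring case. Any number of parts may come before it.
    private static bool EndsWithPart(string id, string indicator) =>
        Regex.IsMatch(
            id,
            $@"^(?:[^.]*\.)*{Regex.Escape(indicator)}$",
            RegexOptions.IgnoreCase);
}
```

So "menu.inner" classifies as Inner and " Menu.Title.OUTER " as Outer, while "menu.winner" and "menu.inner.label" classify as None, because the indicator must be the whole final part.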

This is a pretty typical flow for me. If you have a complex, configuration-driven rule, you find a little language to describe that rule, write a parser and interpreter for that language, define all the individual parts of the rule as classes under an abstraction, and have your interpreter build up a chain of composites combining the behavior of each of the constituent parts.
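Stripped of the SVG details, that flow might look something like the sketch below. It is a deliberately reduced illustration, not the code from the gist: Box, Spec, AcceptAll, MustContain, MustBeWithin, Both, and SpecCompiler are all hypothetical names, and validation returns a plain bool instead of the richer error reporting the real specifications produce.

```csharp
using System.Collections.Generic;

// Hypothetical, simplified names; the real hierarchy is in the gist.
public readonly struct Box
{
    public Box(double left, double top, double right, double bottom)
    {
        Left = left; Top = top; Right = right; Bottom = bottom;
    }

    public double Left { get; }
    public double Top { get; }
    public double Right { get; }
    public double Bottom { get; }

    public bool Contains(Box other) =>
        Left <= other.Left && Top <= other.Top &&
        Right >= other.Right && Bottom >= other.Bottom;
}

// The abstraction every part of a rule lives under.
public abstract class Spec
{
    public abstract bool IsSatisfiedBy(Box actual);
}

// Neutral starting point: an empty rule set accepts any shape.
public sealed class AcceptAll : Spec
{
    public override bool IsSatisfiedBy(Box actual) => true;
}

// "inner": the actual shape must contain the expected rectangle.
public sealed class MustContain : Spec
{
    private readonly Box inner;
    public MustContain(Box inner) { this.inner = inner; }
    public override bool IsSatisfiedBy(Box actual) => actual.Contains(inner);
}

// "outer": the actual shape must fit inside the expected rectangle.
public sealed class MustBeWithin : Spec
{
    private readonly Box outer;
    public MustBeWithin(Box outer) { this.outer = outer; }
    public override bool IsSatisfiedBy(Box actual) => outer.Contains(actual);
}

// Composite: satisfied only when both children are satisfied.
public sealed class Both : Spec
{
    private readonly Spec first;
    private readonly Spec second;
    public Both(Spec first, Spec second) { this.first = first; this.second = second; }
    public override bool IsSatisfiedBy(Box actual) =>
        first.IsSatisfiedBy(actual) && second.IsSatisfiedBy(actual);
}

// The "interpreter": folds individual parts into one composite.
public static class SpecCompiler
{
    public static Spec Compile(IEnumerable<(string Kind, Box Area)> rules)
    {
        Spec result = new AcceptAll();
        foreach (var (kind, area) in rules)
        {
            Spec next;
            if (kind == "inner") next = new MustContain(area);
            else if (kind == "outer") next = new MustBeWithin(area);
            else continue; // unrecognized parts are ignored

            result = new Both(result, next);
        }

        return result;
    }
}
```

The caller only ever holds a single Spec. Adding a new kind of rule means adding one class and one branch in the compiler, without touching the code that validates shapes.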

Conclusion


Like I wrote at the beginning of this entry, I'm not sure if this was worth writing. I don't have a good sense of what is trivial. A coworker and (I hope) friend once said to me, "I've noticed that what you call easy covers 80% of what's possible."

I think the lesson I ended up building is this: Test code is code. It's not a second-class citizen deserving of minimal investment, any more than production code would be.

This is what some people would consider "a lot of layers" or "too much design" for mere test code, but there's actually no such thing as too much design for any code. At least, there's no such thing as too much relevant design.

I know I've written and said this many times. However, I'm not sure I've ever written about an actual example demonstrating this principle.

So there it is. Treat your test code just like it's real code because it is. Here is a demonstration that it can be done and that it works.