Smoke and Mirrors

If you've ever stared at a 3MB JavaScript file generated by dart2js, you know the frustration that dart:mirrors can cause. After the hundredth time I found myself typing a @MirrorsUsed annotation to bring the code down to size, I wondered if there was a better way. As it turns out, there is: it's called Smoke.

In this article, I'll address the pros and cons of reflective code via dart:mirrors, and explain how Smoke offers a better alternative in most cases.

Why we use mirrors

We often don't know ahead of time what the structure of our code will be. Reflection is extremely useful in these situations, and allows us to reason about and manipulate the code despite our lack of foreknowledge. Even if we do know the structure of our code beforehand, it can sometimes be tedious to manually write a static implementation.

Let's consider a simple example:

class Person {
  String firstName;
  String lastName;

  @override
  String toString() => '$firstName $lastName';
}

The default toString provided by Object is not terribly useful (for our Person it would return Instance of 'Person'), so we've overridden it to return the first and last name of the Person.

But what happens if we add more fields? Maybe a phone number, or an address? Our toString implementation is brittle. Every time we modify the fields for Person we'll have to remember to update toString as well.

By using mirrors, we can dynamically figure out what fields Person has without having to specify them:

import 'dart:mirrors';

abstract class BoilerplateToString {

  @override
  String toString() {
    InstanceMirror im = reflect(this);
    ClassMirror cm = im.type;

    var str = '';
    cm.instanceMembers.forEach((symbol, mirror) {
      String symbolStr = MirrorSystem.getName(symbol);
      if (mirror.isGetter && !mirror.isPrivate
          && symbolStr != 'hashCode' && symbolStr != 'runtimeType') {
        str += '${im.getField(symbol).reflectee} ';
      }
    });

    return str;
  }
}

We simply obtain a ClassMirror for Person and then reflectively walk through the public getters and add them to the output of toString.

Now any class that wants to have this toString behaviour can use BoilerplateToString as a mixin:

class Person extends Object with BoilerplateToString {
  String firstName;
  String lastName;
}

We now have a robust and flexible toString method. It doesn't matter how often we modify Person's fields; toString will always return the correct result automatically. What's more, this behaviour isn't unique to Person. Since we've wrapped the method in an abstract class, it can be used by any other class that wants to. BoilerplateToString can be used by an Alien class, or a BankAccount, or any other class, despite not knowing anything about those classes prior to runtime.

This is powerful behaviour, and it's why mirrors are so useful.

Why mirrors lead to generated code bloat

Let's add a very simple program that tests our Person class:

void main() {
  var person = new Person();
  person.firstName = 'John';
  person.lastName = 'Smith';
  print(person);
}

Make sure to specify that dart2js should minify the code to keep the size down:

transformers:
  - $dart2js:
      minify: true

Running pub build generates a 639KB JavaScript file, which is very large considering how little the program does. The static version of the program generates an 11KB file, which is far more reasonable. How did we go from 11KB to 639KB just by switching to a mirrors-based implementation?

When dart2js walks through your Dart code to generate JavaScript, it performs a process called tree-shaking (you can learn more about tree-shaking in Seth Ladd's article). Tree-shaking figures out which parts of your code are actually needed to run your program, and gets rid of the rest. Consider what happens when you import a large utility library but only use one method from it: without tree-shaking, the whole library would be included in the output just for that one method. Tree-shaking prunes all the unused code, leaving behind the single method that you need.

This is a great feature that helps keep the size of code down. So long as you don't use mirrors, that is. Since mirrors are resolved dynamically at runtime, dart2js can't risk pruning out code that might be accessed by a mirror, and therefore leaves in a lot of extra code, even though it is not actually used.
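To make the pruning concrete, here is a minimal sketch. The add and multiply functions are hypothetical and exist only for illustration. Because main never refers to multiply, a tree-shaking compiler is free to leave it out of the output; once mirrors are in play, the compiler can no longer be sure that multiply won't be looked up by name at runtime, so it has to keep it.

```dart
// Hypothetical utility functions, for illustration only.

// Called from main(), so it must be kept in the compiled output.
int add(int a, int b) => a + b;

// Never referenced statically. A tree-shaking compiler may drop it,
// unless reflection could reach it by name at runtime.
int multiply(int a, int b) => a * b;

void main() {
  print(add(2, 3));
}
```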

Luckily, we can turn tree-shaking back on by using a @MirrorsUsed annotation. @MirrorsUsed lets us tell dart2js which classes will be accessed via reflection, allowing it to tree-shake everything else:

@MirrorsUsed(
  targets: const ['Person'],
  override: "*")
import 'dart:mirrors';

Now running pub build will only generate an 84KB file. This is still quite a bit larger than the 11KB of the static version, but worth it to gain access to the power of reflection.

Be kind to your users

Our boilerplate code works, and the code size is manageable, so we decide to share it with the world by publishing it to Pub or GitHub. After all, since we're using mirrors for dynamic introspection, anyone can use our code.

But there's one part that isn't dynamic, and that's the @MirrorsUsed annotation. Since we can't know ahead of time which classes a user wants to reflect, each consumer of our library will have to write their own custom annotation. If a user wants to use BoilerplateToString on Alien, Spaceship, and LaserBeam, they'll have to include each one of those in @MirrorsUsed:

@MirrorsUsed(
  targets: const ['Alien', 'Spaceship', 'LaserBeam'],
  override: "*")
import 'dart:mirrors';

We've now invaded our users' code and forced them to carefully manage their mirrors or face a massive increase in generated code size. If a user depends on several mirrors-based libraries, it can quickly become tedious to stay on top of everything that must be included in @MirrorsUsed. This is not ideal.

Enter Smoke

Here's where Smoke comes in. Smoke will let us use reflective capabilities without having to use @MirrorsUsed. Smoke provides a reflective API similar to dart:mirrors. By default, dart:mirrors will be used behind the scenes to provide the reflection, but at build time this will be replaced with a static implementation (this dual approach ensures that development iterations are fast since no rebuild is required between runs).

Here's the Smoke version of our mixin, now called DefaultToString:

import 'package:smoke/smoke.dart';

abstract class DefaultToString {

  @override
  String toString() {
    var str = '';

    List<Declaration> declarations = query(this.runtimeType,
        new QueryOptions(includeProperties: false));

    for (var declaration in declarations) {
      str += '${read(this, declaration.name)} ';
    }

    return str;
  }
}

We're just doing the same thing as before, walking through the object's fields and outputting their values, albeit with a slightly different API. The Smoke reflection code is a bit shorter and more readable than the dart:mirrors version, but on its own it doesn't yet provide much benefit, since by default it still uses dart:mirrors behind the scenes. To really take advantage of Smoke, we'll need to build a static implementation, which means we'll need to write a transformer.
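To get a feel for what a static implementation means, here is a hand-written sketch of the idea: a lookup table that maps field names to plain getter closures, so no mirrors are needed at runtime. The names personGetters and describe are hypothetical. This is not Smoke's API or its actual generated output, only an illustration of the kind of code a generator can emit on our behalf.

```dart
class Person {
  String firstName;
  String lastName;
  Person(this.firstName, this.lastName);
}

// Hypothetical lookup table: one closure per field, written by hand here,
// but the sort of thing a code generator could produce automatically.
final Map<String, Function> personGetters = {
  'firstName': (Person p) => p.firstName,
  'lastName': (Person p) => p.lastName,
};

// Reads every registered field through the table instead of through mirrors.
String describe(Person p) {
  var parts = <String>[];
  personGetters.forEach((name, getter) {
    parts.add('${getter(p)}');
  });
  return parts.join(' ');
}

void main() {
  print(describe(new Person('John', 'Smith')));
}
```

Since every access goes through an ordinary closure, the compiler can see exactly which members are used and tree-shake the rest.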

Writing the transformer

If you're not familiar with transformers, read Pub Assets and Transformers to get an idea of how they work. When pub build is run, all transformers are executed and convert input files into output files. How input files are converted into output depends on the transformer. For instance, dart2js is a transformer that runs on input Dart files and converts them to JavaScript output files. We're going to write a Smoke transformer that runs on input Dart files, examines them for Smoke reflection usage, and writes static versions as output.

Here's the transformer:

import 'dart:async';
import 'package:barback/barback.dart';
import 'package:smoke/smoke.dart' show QueryOptions;
import 'package:smoke/codegen/recorder.dart';
import 'package:smoke/codegen/generator.dart';
import 'package:code_transformers/resolver.dart';
import 'package:code_transformers/src/dart_sdk.dart';

class SmokeTransformer extends Transformer with ResolverTransformer {
  final BarbackSettings _settings;
  Transform _transform;
  AssetId _primaryInputId;
  String _fileSuffix = '_bootstrap';

  SmokeTransformer.asPlugin(this._settings) {
    resolvers = new Resolvers(dartSdkDirectory);
  }

  @override
  Future applyResolver(Transform transform, Resolver resolver) {
    _transform = transform;
    _primaryInputId = _transform.primaryInput.id;
    _buildSmokeBootstrap(resolver);

    return _buildHtmlBootstrap();
  }

  /// Builds a Smoke bootstrapper that initializes static Smoke access
  /// and then calls the actual entry point.
  _buildSmokeBootstrap(Resolver resolver) {
      // Initialize the Smoke generator and recorder
      var generator = new SmokeCodeGenerator();
      Recorder recorder = new Recorder(generator,
          (lib) => resolver.getImportUri(lib, from: _primaryInputId).toString());

      // Record each class in the library for our generator
      var lib = resolver.getLibrary(_primaryInputId);
      var classes = lib.units.expand((u) => u.types);
      for(var clazz in classes) {
        recorder.runQuery(clazz, new QueryOptions(includeProperties: false));
      }

      // Generate the Smoke bootstrapper
      StringBuffer sb = new StringBuffer();
      sb.write('library smoke_bootstrap;\n\n');
      generator.writeImports(sb);
      sb.write('\n');
      generator.writeTopLevelDeclarations(sb);
      sb.write('\nvoid main() {\n');
      generator.writeStaticConfiguration(sb);
      // Call the entry point's main method
      sb.write(';\n  smoke_0.main();\n}');

      // Add the Smoke bootstrapper to the output files
      var bootstrapId = _primaryInputId.changeExtension('${_fileSuffix}.dart');
      _transform.addOutput(new Asset.fromString(bootstrapId, sb.toString()));
  }

  /// Builds an HTML file that is identical to the entry point HTML
  /// but uses our Smoke bootstrap as the Dart entry point
  Future _buildHtmlBootstrap() {
    AssetId primaryHtml = _primaryInputId.changeExtension('.html');
    return _transform.getInput(primaryHtml).then((asset) {
      var packageName = _transform.primaryInput.id.package.toLowerCase();

      return asset.readAsString().then((content) {
        AssetId bootstrapHtmlId = _primaryInputId.changeExtension('${_fileSuffix}.html');
        RegExp pattern = new RegExp(packageName);
        String replace = packageName + _fileSuffix;
        _transform.addOutput(new Asset.fromString(bootstrapHtmlId, content.replaceAll(pattern, replace)));
      });
    });
  }

  String get allowedExtensions => '.dart';

}

applyResolver is called automatically when the transformer is run. The interesting work is mainly done in _buildSmokeBootstrap. First we record our Smoke reflection usage using a Recorder. Your recorder usage should match the usage in your actual application. Once we have the reflection recorded, a SmokeCodeGenerator will generate the necessary static code to enable mirrors-free reflection. We will need to generate a bit of our own code around Smoke to ensure it will be run before our own application. Finally, _buildHtmlBootstrap will generate an HTML file that links to the bootstrapped Smoke file.

The transformer needs to be added to pubspec.yaml in order to have it run when the project is built:

transformers:
  - smoke_example
  - $dart2js:
      minify: true

Note that the order is important - the Smoke transformer needs to run before dart2js so that dart2js will see the Smoke output and compile that to JavaScript as well. If the order were reversed, the Smoke transformer would run after dart2js and its output would not be compiled to JavaScript.

Now when the project is built via pub build, the size is 37KB. This is less than half the size of the 84KB mirrors version and much closer to the 11KB static version. Even better, consumers of our library no longer need to muck around with @MirrorsUsed annotations!

Performance

After seeing the reduction in generated code size, I hoped to see something similar for performance. However, it seems that the Smoke performance story is not so straightforward. When benchmarking the program in Dartium (using the benchmark harness, and the static Smoke implementation) there was indeed an improvement over mirrors. But when benchmarking the dart2js generated code, there was no discernible performance difference between Smoke and mirrors. In both Dartium and Firefox the purely static version carried the day, though the difference was less pronounced in Dartium.

          Dartium    Firefox (dart2js)
Static    3 us       30 us
Smoke     30 us      1900 us
Mirrors   195 us     1900 us
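If you'd like to take comparable measurements yourself, a rough timing loop can be sketched with nothing but Stopwatch from dart:core. This is not the benchmark harness used for the numbers above; microsPerCall is a hypothetical helper, and the Person class here is a simplified static version. Absolute figures will vary by machine and runtime.

```dart
class Person {
  String firstName;
  String lastName;
  Person(this.firstName, this.lastName);

  @override
  String toString() => '$firstName $lastName';
}

// Hypothetical helper: average time per call of [body], in microseconds.
double microsPerCall(void body(), int iterations) {
  // Warm up first so that one-time costs are not counted.
  for (var i = 0; i < iterations; i++) {
    body();
  }

  var watch = new Stopwatch()..start();
  for (var i = 0; i < iterations; i++) {
    body();
  }
  watch.stop();

  return watch.elapsedMicroseconds / iterations;
}

void main() {
  var person = new Person('John', 'Smith');
  var micros = microsPerCall(() {
    person.toString();
  }, 100000);
  print('toString: $micros us per call');
}
```

Swapping the Person class for the mirrors or Smoke version of the mixin and rerunning gives a like-for-like comparison, at least within a single runtime.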

Conclusion

This article explored three different approaches to code: static, mirrors, and Smoke codegen. Here is how they stack up:

          Code size   Requires @MirrorsUsed   Reflective capabilities   Performance - Dartium   Performance - dart2js
Static    Small       No                      No                        Very fast               Fast
Smoke     Medium      No                      Yes                       Fast                    Slow
Mirrors   Large       Yes                     Yes                       Average                 Slow

When it is possible to use, static code is the clear winner due to the small generated code size and fast performance. However, once reflection is required, static is out and the choice is between Smoke and mirrors. Though Smoke is more complicated to set up than mirrors (due to the use of transformers), it generates smaller JavaScript output and removes the need to specify @MirrorsUsed.

If you are a library author using dart:mirrors for reflection, you should definitely consider replacing it with Smoke.


You can view all the code used in this post on GitHub.