Skip to content

New tall style: Automated trailing commas and other formatting changes #1253

Closed
@munificent

Description

@munificent

TL;DR: We're proposing a set of style changes to dart format that would affect about 10% of all Dart code. We want to know what you think.

The main change is that you no longer need to manually add or remove trailing commas. The formatter adds them to any argument or parameter list that splits across multiple lines and removes them from ones that don't. When an argument list or parameter list splits, it is formatted like:

longFunction(
  longArgument,
  anotherLongArgument,
);

This change means less work writing and refactoring code, and no more reminding people to add or remove trailing commas in code reviews. We have a prototype implementation that you can try out. There is a detailed feedback form below, but you can give us your overall impression by reacting to the issue:

  • 👍 Yes, I like the proposal. Ship it!
  • 👎 No, I don't like the proposal. Don't do it.

We have always been very cautious about changing the formatter's style rules to minimize churn in already-formatted code. Most of these style rules were chosen before Flutter existed and before we knew what idiomatic Flutter code looked like.

We've learned a lot about what idiomatic Dart looks like since then. In particular, large deeply nested function calls of "data-like" code are common. To accommodate them, the formatter lets you partially opt into a different formatting style by explicitly authoring trailing commas in argument and parameter lists. I think we can do better, but doing so would be the largest change we've ever made to the formatter.

I'm starting this change process to determine whether or not it's a change the community is in favor of. Details are below, but in short, I propose:

  • The formatter treats trailing commas in argument lists and parameter lists as "whitespace". The formatter adds or removes them as it sees fit just as it does with other whitespace.

  • When an argument or parameter list is split, it always uses a "tall" style where the elements are indented two spaces, and the closing ) is moved to the next line:

    longFunction(
      longArgument,
      anotherLongArgument,
    );
  • The formatting rules for other language constructs are adjusted to mesh well with argument and parameter lists formatted in that style. For example, with that style, I think it looks better to prefer splitting in an argument list instead of after = in an assignment or variable declaration:

    // Before:
    var something =
        function(argument);
    
    // After:
    var something = function(
      argument,
    );

The overall goal is to produce output that is as beautiful and readable as possible for the kind of code most Dart users are writing today, and to give you that without any additional effort on your part.

Background

The formatter currently treats a trailing comma in an argument or parameter list as an explicit hand-authored signal to select one of two formatting styles:

// Page width:               |

// No trailing comma = "short" style:
noTrailingComma(
    longArgument,
    anotherLongArgument);

// Trailing comma = "tall" style:
withTrailingComma(
  longArgument,
  anotherLongArgument,
);

A trailing comma also forces the surrounding construct to split even when it would otherwise fit within a single line:

force(
  arg,
);

The pros of this approach are:

  • Before Dart allowed trailing commas in argument and parameter lists, they were formatted in the short style. Having to opt in to the tall style by writing an explicit trailing comma, avoids churn in already-formatted code.

  • Users who prefer either the short or tall style can each have it their way.

  • If a user wants to force an argument or parameter list to split that would otherwise fit (and they want a tall style), they can control this explicitly.

The cons are:

  • The short style is inconsistent with how list and map literals are formatted, so nested code containing a mixture of function calls and collections—very common in Flutter code—looks inconsistent.

  • If a user wants the tall style but doesn't want to force every argument or parameter list to split, they have no way of getting that. There's no way to say "split or unsplit this as needed, but if you do split, use a tall style".

    Instead, users who want a tall style must remember to manually remove any trailing comma on an argument or parameter list if they want to allow it fit on one line. This is particularly painful with large scale automated refactorings where it's infeasible to massage the potentially thousands of modified lists to remove their trailing commas.

    Worse, there's no way to tell if a trailing comma was authored to mean "I want to force this to split" versus "I want this to have tall style", so anyone maintaining the code afterwards doesn't know whether a now-unnecessary trailing comma should remain or not.

  • Users must be diligent in maintaining trailing commas in order to get the same style across an entire codebase.

    The goal of automated formatting is to get users out of the business of making style choices. But the current implementation requires them to hand-author formatting choices on every argument and parameter list.

    For users who prefer a tall style, they must be careful to add a trailing comma and then reformat whenever an existing argument or parameter list that used to fit on one line no longer does and gets split. By default, as soon as the list overflows, it will get the short style.

    Again, this is most painful with large scale refactorings. A rename that makes an identifier longer could lead to thousands of argument lists splitting. If those are in code that otherwise prefers a tall style, the result is a mixture of short and tall styles.

  • The rest of the style rules aren't designed to harmonize with tall style argument and parameter lists since they don't know whether nearby code will use a short or tall style. There's no way to opt in to a holistic set of style rules designed to make all Dart code look great in that style.

    This leads to a long tail of bug reports where some other construct clearly doesn't look good when composed with tall-style argument and parameter lists, but there's not much we can do because the construct does look fine when composed with short style code.

I believe the cons outweigh the pros, which leads to this proposal.

Proposal

There are basically three interelated pieces to the proposal:

Trailing commas as whitespace

Trailing commas in argument and parameter lists are treated as "whitespace" characters and under the formatter's purview to add and remove. (Arguably, they really are whitespace characters, since they have zero effect on program semantics.)

The rule for adding and removing them is simple:

  • If a comma-separated construct is split across multiple lines, then add a trailing comma after the last element. This applies to argument lists, parameter lists, list literals, map literals, set literals, records, record types, enum values, switch expression cases, and assert arguments.

    (There are a couple of comma-separated constructs that don't allow trailing commas that are excluded from this like type parameter and type argument lists.)

  • Conversely, if the formatter decides to collapse a comma-separated construct onto a single line, then do so and remove the trailing comma.

The last part means that users can no longer hand-author a trailing comma to mean, "I know it fits on one line but force it to split anyway because I want it to." I think it's worth losing that capability in order to preserve reversibility. If the formatter treated trailing commas as user-authored intent, but also inserted them itself, then once a piece of code has been formatted once, it can no longer tell which commas came from a human and which came from itself.

Always use tall style argument and parameter lists

If a trailing comma no longer lets a user choose between a short or tall style, then the formatter has to choose. I think it should always choose a tall style.

Both inside Google and externally, the clear trend is users adding trailing commas to opt into the tall style. An analysis of the 2,000 most recently published Pub packages shows:

-- Arg list (2672877 total) --
1522870 ( 56.975%): Single-line  ===========================
 455266 ( 17.033%): Empty        =========
 347996 ( 13.020%): Tall         =======
 208585 (  7.804%): Block-like   ====
 137682 (  5.151%): Short        ===
    478 (  0.018%): Other        =

The two lines that are relevant here are:

-- Arg list (2672877 total) --
 347996 ( 13.020%): Tall         =======
 137682 (  5.151%): Short        ===

Of the argument lists where either a short or tall style preference can be distinguished, users prefer the tall style ~71% of the time, even though they have to remember to hand-author a trailing comma on every argument list to get it.

Block-like argument lists

I've been talking about "short" and "tall" argument lists, but there are actually three ways that argument lists get formatted:

// Page width:               |

// Short style:
longFunctionName(
    veryLongArgument,
    anotherLongArgument);

// Tall style:
longFunctionName(
  veryLongArgument,
  anotherLongArgument,
);

// Block style:
longFunctionName(() {
  closure();
});

The third style is used when some of the arguments are function expressions or collection literals and formatting the argument list almost as if it were a built-in statement in the language looks better. The most common example is in tests:

main() {
  group('Some group', () {
    test('A test', () {
      expect(1 + 1, 2);
    });
  });
}

This proposal still supports block-style argument lists. It does somewhat tweak the heuristics the formatter uses to decide when an argument list should be block-like or not. The current heuristics are very complex, subtle, and still don't always look right.

Format the rest of the language holistically to match

The set of formatting rules for the different language features were not designed to make each feature look nice in isolation. Instead, the goal was a coherent set of style rules that hang together and make an entire source file clear and readable.

Those rules currently assume that argument and parameter lists have short formatting. With this proposal, we also adjust many of those other rules and heuristics now that we know that when an argument or parameter list splits, it will split in a tall style way.

There are many of these mostly small-scale tweaks. Some examples:

  • Prefer to split in the initializer instead of after "=":

    // Page width:               |
    
    // Before:
    var something =
        function(argument);
    
    // After:
    var something = function(
      argument,
    );
  • Less indentation for collection literal => member bodies:

    // Page width:               |
    
    // Before:
    class Foo {
      List<String> get things => [
            'a long string literal',
            'another long string literal',
          ];
    }
    
    // After:
    class Foo {
      List<String> get things => [
        'a long string literal',
        'another long string literal',
      ];
    }
  • Don't split after => if a parameter list splits:

    // Page width:               |
    
    // Before:
    function(String parameter,
            int another) =>
        otherFunction(
            parameter, another);
    
    // After:
    function(
      String parameter,
      int another,
    ) => otherFunction(
      parameter,
      another,
    );

I used the Flutter repository as a reference, which uses tall style and has been very carefully hand-formatted for maximum readability, and tweaked the rules to follow that wherever I could. Many of the changes are subtle in ways that are hard to describe here. The best way to see them in action is to try out the prototype implementation, described below.

Risks

While I believe the proposed style looks better for most Dart code and makes users' lives simpler by not having to worry about maintaining trailing commas, this is a large change with some downsides:

Churn

Formatting a previously formatted codebase with this new style will cause about 10% of the lines of code to change. That's a large diff and can be disruptive to codebases where there are many in-progress changes.

Migrating to the new style may be painful for users, though of course it is totally automated.

Users may dislike the style

Obviously, if a large enough fraction of users don't want this change, we won't do it, which is what the change process aims to discover. But even if 90% of the users prefer the new style, that still leaves 10% who now feel the tool is worse than it was before.

Realistically, no change of this scale will please everyone. One of the main challenges in maintaining an opinionated formatter that only supports one consistent style is that some users are always unhappy with it. At least by rarely changing the style, we avoid drawing user attention to the style when they don't like it. This large, disruptive change will make all users at least briefly very aware of automated formatting.

Two configurable styles

An obvious solution to user dislike is to make the style configurable: We could let you specify whether you want the old formatting rules or the new ones. The formatter could support two separate holistic styles, without going all the way towards full configurability (which is an anti-goal for the tool).

There are engineering challenges with this. The internal representation the formatter uses to determine where to split lines has grown piecemeal over the formatter's history. The result is hard to extend and limited in the kind of style rules it can represent. When new language features are added it's often difficult to express the style rules we want in terms that the internal representation supports. Many long-standing bugs can't be fixed because the rule a user wants can't be modeled in the current representation.

This became clear when working on a prototype of the proposal. Getting it working was difficult and there are edge cases where I can't get it to model the rules I want.

If this proposal is accepted and we make large-scale changes to the formatting rules, we intend to take the opportunity to also implement a better internal representation. That's a large piece of work for a tool that generally doesn't have a lot of engineering resources allocated to it.

We don't have the bandwidth during this change to write a new IR, a new set of style rules and migrate the old style rules to the new IR. There may be old rules the new IR can't represent well.

We could keep the old IR around for the old style and use the new IR only for the new style. But going forward, we don't have the resources to maintain two sets of style rules and two separate internal representations, one for each. Every time a new language feature ships, the formatter needs support for it. Bugs would have to get fixed twice (or, more likely, only fixed for one style).

We might be able to temporarily support both styles in order to offer a migration period. But we don't have the resources to support both styles in perpetuity. We would rather spend those resources supporting one style really well instead of two styles poorly.

Evaluation

This is a large style change. Because the formatter is deliberately opinionated and non-configurable, it will affect all users of dart format. That means it's important to make sure that it's a good change in the eyes of as many users as possible. We can't please everyone, but we should certainly please a clear majority.

To that end, we need your feedback. That feedback is most useful when it's grounded in concrete experience.

Prototype implementation

To help with that, we have a prototype implementation of the new style rules. The prototype has known performance regressions and doesn't get the style exactly right in some cases. But, for the most part, it should show you the proposed behavior.

Diff corpus

We ran this prototype on a randomly selected corpus of code, which yields the resulting diff. This shows you how the new formatting compares to the behavior of the current formatter. Keep in mind that a diff focuses your attention on places where the style is different. It doesn't highlight the majority of lines of code that are formatted exactly the same under this proposal as they are today.

Running it yourself

To get a better sense of what it feels like to use this proposed behavior, you can install the prototype branch of the formatter from Git:

$ dart pub global activate \
    --source=git https://github.com/dart-lang/dart_style \
    --git-ref=flutter-style-experiment

You can run this using the same command-line options as dart format. For example, to reformat the current directly and its subdirectories, you would run:

$ dart pub global run dart_style:format .

To format just a single file, example.dart, you would run:

$ dart pub global run dart_style:format example.dart

Give us your feedback

Once you've tried it out, let us know what you think by taking this survey. You can also reply directly on this issue, but the survey will help us aggregate the responses more easily. We'll take the survey responses and any comments here into account and try our best to do what's right for the community.

After a week or, to give people time to reply, I'll update with what we've learned.

This is an uncomfortably large change to propose (and a hard one to implement!), so I appreciate your patience and understanding while we work through the process.

Thank you!

– Bob

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions