2 months ago · 45b3f47349
--- a/proposals/p6699.md
+++ b/proposals/p6699.md
@@ -0,0 +1,126 @@
 
															+# Diagnostic sorting
														
 
															+
														
 
															+<!--
														
 
															+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
														
 
															+Exceptions. See /LICENSE for license information.
														
 
															+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
														
 
															+-->
														
 
															+
														
 
															+[Pull request](https://github.com/carbon-language/carbon-lang/pull/6699)
														
 
															+
														
 
															+<!-- toc -->
														
 
															+
														
 
															+## Table of contents
														
 
															+
														
 
															+-   [Abstract](#abstract)
														
 
															+-   [Problem](#problem)
														
 
															+-   [Background](#background)
														
 
															+-   [Proposal](#proposal)
														
 
															+-   [Details](#details)
														
 
															+-   [Rationale](#rationale)
														
 
															+-   [Alternatives considered](#alternatives-considered)
														
 
															+    -   [Don't sort diagnostics](#dont-sort-diagnostics)
														
 
															+    -   [Sort by line and column](#sort-by-line-and-column)
														
 
															+    -   [Sort by last processed token](#sort-by-last-processed-token)
														
 
															+
														
 
															+<!-- tocstop -->
														
 
															+
														
 
															+## Abstract
														
 
															+
														
 
															+Extend `SortingConsumer`, which sorts by last processed token (per-phase), to
														
 
															+additionally allow diagnostics to request sorting by start position (line and
														
 
															+column) when the last processed token is the same.
														
 
															+
														
 
															+## Problem
														
 
															+
														
 
															+Diagnostics in many toolchains are emitted in the order they are discovered
														
 
															+during code traversal. While this naturally reflects the causal relationship
														
 
															+between errors (for example, an error in a macro expansion causing subsequent
														
 
															+parse errors), it can lead to a confusing experience for developers if the
														
 
															+diagnostics jump back and forth through the file. Conversely, sorting purely by
														
 
															+source location (line and column) can break causality, presenting a consequence
														
 
															+before its cause. We need a sorting strategy that feels natural to humans but
														
 
															+respects the underlying toolchain logic.
														
 
															+
														
 
															+## Background
														
 
															+
														
 
															+Carbon's processing of code in stages (lex, parse, check) causes diagnostics to
														
 
															+be produced in that order. In contrast, Clang interleaves parse and check, and
														
 
															+as a consequence the diagnostics produced are similarly interleaved.
														
 
															+
														
 
															+A more detailed overview of Carbon's diagnostic infrastructure can be found in
														
 
															+[diagnostics.md](/toolchain/docs/diagnostics.md).
														
 
															+
														
 
															+## Proposal
														
 
															+
														
 
															+In addition to sorting by the last processed token (which `SortingConsumer`
														
 
															+already does), add a way to sort based on the start position (line and column)
														
 
															+by request. We call this "on-scope" sorting because the current cases we've
														
 
															+discussed are scope-related.
														
 
															+
														
 
															+See [SortingConsumer](/toolchain/docs/diagnostics.md#sortingconsumer) for more
														
 
															+documentation.
														
 
															+
														
 
															+## Details
														
 
															+
														
 
															+This was already implemented in
														
 
															+[PR #6687](https://github.com/carbon-language/carbon-lang/pull/6687).
														
 
															+
														
 
															+## Rationale
														
 
															+
														
 
															+This proposal advances Carbon's goal of providing
														
 
															+[Language tools and ecosystem](/docs/project/goals.md#language-tools-and-ecosystem)
														
 
															+and creating
														
 
															+[Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write)
														
 
															+by making the developer experience with error messages more predictable and
														
 
															+logical. It respects the inherent causal order of the toolchain while tailoring
														
 
															+the output to human expectations.
														
 
															+
														
 
															+## Alternatives considered
														
 
															+
														
 
															+### Don't sort diagnostics
														
 
															+
														
 
															+We could just print diagnostics in the order they are produced. This would print
														
 
															+all lex errors, then all parse errors, then all check errors. This would be
														
 
															+simple, but might be confusing when a parse error at the end of a file comes
														
 
															+before check errors, and fixing the check errors would fix the parse error.
														
 
															+
														
 
															+### Sort by line and column
														
 
															+
														
 
															+We could sort diagnostics purely by their line and column. This runs into issues
														
 
															+with cases such as:
														
 
															+
														
 
															+```carbon
														
 
															+fn F(x: i32, y: i32);
														
 
															+
														
 
															+fn G() {
														
 
															+  F(1 2);
														
 
															+}
														
 
															+```
														
 
															+
														
 
															+Here, the diagnostic for an invalid parse of `1 2` would appear after the
														
 
															+diagnostic that `F` expects two arguments, not one. This is confusing because
														
 
															+the missing comma is the root cause of the incorrect argument count.
														
 
															+
														
 
															+### Sort by last processed token
														
 
															+
														
 
															+We could sort diagnostics purely by the last token that was processed when the
														
 
															+diagnostic was emitted. This runs into issues with cases such as:
														
 
															+
														
 
															+```carbon
														
 
															+fn F(x: i32, y: i32) {}
														
 
															+```
														
 
															+
														
 
															+Here, both `x` and `y` would be diagnosed as unused at the `}`. The order would
														
 
															+be non-deterministic, hindering golden tests.
														
 
															+
														
 
															+This could be partially addressed by sorting the diagnostics locally (for
														
 
															+example, sorting each `unused` diagnostic together), but this is an incomplete
														
 
															+solution because we may introduce further scope-related checks, particularly
														
 
															+flow checking (for example, checking if there are provable out-of-bounds
														
 
															+accesses). These would all have the same last processed token. It would also
														
 
															+likely lead to sorting regardless of whether sorting was requested by the tool
														
 
															+user, a performance overhead we want to avoid.
														
 
															+
														
 
															+We believe sorting by the last processed token is a partial solution, which
														
 
															+we're building on.
														
--- a/toolchain/docs/diagnostics.md
+++ b/toolchain/docs/diagnostics.md
@@ -11,14 +11,17 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
															 ## Table of contents
														
 
															 -   [Overview](#overview)
														
 
															--   [DiagnosticEmitter](#diagnosticemitter)
														
 
															--   [DiagnosticConsumers](#diagnosticconsumers)
														
 
															+-   [Emitters](#emitters)
														
 
															+-   [Consumers](#consumers)
														
 
															+    -   [SortingConsumer](#sortingconsumer)
														
 
															 -   [Producing diagnostics](#producing-diagnostics)
														
 
															 -   [Diagnostic registry](#diagnostic-registry)
														
 
															 -   [CARBON_DIAGNOSTIC placement](#carbon_diagnostic-placement)
														
 
															 -   [Diagnostic context](#diagnostic-context)
														
 
															 -   [Diagnostic parameter types](#diagnostic-parameter-types)
														
 
															 -   [Diagnostic message style guide](#diagnostic-message-style-guide)
														
 
															+-   [Alternatives considered](#alternatives-considered)
														
 
															+-   [References](#references)
														
 
															 <!-- tocstop -->
														
@@ -26,20 +29,18 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
															 The diagnostic code is used by the toolchain to produce output.
														
 
															-## DiagnosticEmitter
														
 
															+## Emitters
														
 
															-[Emitters](/toolchain/diagnostics/emitter.h) handle the main formatting of a
														
 
															-message. It's parameterized on a location type, for which a
														
 
															-DiagnosticLocationTranslator must be provided that can translate the location
														
 
															-type into a standardized DiagnosticLocation of file, line, and column.
														
 
															+[`Emitter`s](/toolchain/diagnostics/emitter.h) handle the main formatting of a
														
 
															+message. Each is parameterized on a location type, which `ConvertLoc` translates
														
 
															+into a standardized DiagnosticLocation of file, line, and column.
														
 
															-When emitting, the resulting formatted message is passed to a
														
 
															-DiagnosticConsumer.
														
 
															+When emitting, the resulting formatted message is passed to a `Consumer`.
														
 
															-## DiagnosticConsumers
														
 
															+## Consumers
														
 
															-DiagnosticConsumers handle output of diagnostic messages after they've been
														
 
															-formatted by an Emitter. Important consumers are:
														
 
															+`Consumer`s handle output of diagnostic messages after they've been formatted by
														
 
															+an Emitter. Important consumers are:
														
 
															 -   [ConsoleConsumer](/toolchain/diagnostics/consumer.cpp): prints diagnostics
														
 
															     to console.
														
@@ -48,22 +49,45 @@ formatted by an Emitter. Important consumers are:
 
															     number of errors produced, particularly so that it can be determined whether
														
 
															     any errors were encountered.
														
 
															--   [SortingConsumer](/toolchain/diagnostics/sorting_consumer.h): sorts
														
 
															-    diagnostics by line so that diagnostics are seen in terminal based on their
														
 
															-    order in the file rather than the order they were produced.
														
 
															+-   [SortingConsumer](/toolchain/diagnostics/sorting_consumer.h): buffers and
														
 
															+    sorts diagnostics to provide a more human-understandable order while
														
 
															+    maintaining causal consistency.
														
 
															--   [NullConsumer](/toolchain/diagnostics/null_diagnostics.h): suppresses
														
 
															-    diagnostics, particularly for tests.
														
 
															+### SortingConsumer
														
 
															-Note that `SortingConsumer` is used by default by `carbon compile`. In cases
														
 
															-where one error leads to another error at an earlier location, for example if an
														
 
															-error in a function call argument leads to an error in the function call, this
														
 
															-can result in confusing diagnostic output where a consequence of the error is
														
 
															-reported before the cause. Usually this should be handled by tracking that an
														
 
															-error occurred and suppressing the follow-on diagnostic. During toolchain
														
 
															-development, it can be useful to disable the sorting so that the diagnostic
														
 
															-order matches the order in which the file was processed. This can be done using
														
 
															-`carbon compile –stream-errors`.
														
 
															+`SortingConsumer` is used by default by `carbon compile`. To see the actual
														
 
															+emitted order, use `carbon compile --stream-errors`.
														
 
															+
														
 
															+The current `SortingConsumer` implementation sorts diagnostics based on the
														
 
															+`last_byte_offset`, which represents the latest token handled by the phase
														
 
															+emitting the diagnostic. This maintains the causal order of the toolchain's
														
 
															+traversal.
														
 
															+
														
 
															+We expect cases where multiple diagnostics are emitted at the same offset,
														
 
															+particularly when they're emitted at the end of a scope. These can have attached
														
 
															+locations earlier in the scope, such as variable declarations. In these cases,
														
 
															+we put non-on-scope diagnostics first, and then sort on-scope diagnostics by
														
 
															+their start position (line and column).
														
 
															+
														
 
															+Diagnostic sorting is stable, so that diagnostics from earlier phases are
														
 
															+printed first if all else is equal.
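
The ordering described above can be sketched as a comparator passed to a stable sort. This is only an illustration of the documented rules, not the actual `SortingConsumer` code: the `BufferedDiagnostic` struct, its `on_scope`, `line`, and `column` fields, and the `DiagnosticLess` and `SortDiagnostics` helpers are hypothetical names chosen for the example. Only `last_byte_offset` comes from the text.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a buffered diagnostic. Every field name except
// last_byte_offset is an assumption made for this sketch.
struct BufferedDiagnostic {
  std::string message;
  int32_t last_byte_offset;  // Latest token handled by the emitting phase.
  bool on_scope;             // True if the diagnostic requested on-scope sorting.
  int32_t line;              // Start position of the attached location.
  int32_t column;
};

// Orders by last_byte_offset. On ties, non-on-scope diagnostics come first,
// and on-scope diagnostics are ordered by their start (line, column).
inline bool DiagnosticLess(const BufferedDiagnostic& lhs,
                           const BufferedDiagnostic& rhs) {
  if (lhs.last_byte_offset != rhs.last_byte_offset) {
    return lhs.last_byte_offset < rhs.last_byte_offset;
  }
  if (lhs.on_scope != rhs.on_scope) {
    return !lhs.on_scope;
  }
  if (!lhs.on_scope) {
    // Both are non-on-scope: treat as equivalent so that the stable sort
    // keeps their emission order (earlier phases first).
    return false;
  }
  return std::tie(lhs.line, lhs.column) < std::tie(rhs.line, rhs.column);
}

// Sorts buffered diagnostics in place, preserving emission order among
// diagnostics that compare as equivalent.
inline void SortDiagnostics(std::vector<BufferedDiagnostic>& diagnostics) {
  std::stable_sort(diagnostics.begin(), diagnostics.end(), DiagnosticLess);
}
```

Under these assumptions, two unused-parameter diagnostics emitted at the same closing `}` would be ordered by the position of the parameters they point at, regardless of which was emitted first.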
														
 
															+
														
 
															+The sorting approach balances several competing needs:
														
 
															+
														
 
															+-   **Causal order**: Developers generally want to fix errors in the order they
														
 
															+    are printed. If fixing error A could also fix error B, A should be printed
														
 
															+    first.
														
 
															+
														
 
															+-   **Human-understandable order**: A human expects diagnostics to follow the
														
 
															+    flow of the file. If all parse errors in a file are printed before any
														
 
															+    semantic check errors, the developer may find it confusing to jump back and
														
 
															+    forth through the file.
														
 
															+
														
 
															+-   **Performance where possible**: On fully correct code with no diagnostics,
														
 
															+    which is our performance priority, this has negligible overhead. When there
														
 
															+    are diagnostics, we try to only sort within the `SortingConsumer`. When
														
 
															+    sorting is not desired (such as tools and IDEs that provide their own
														
 
															+    ordering), it's easy to disable.
														
 
															 ## Producing diagnostics
														
@@ -94,6 +118,14 @@ of an argument to expect for message formatting. The `invalid_char` argument to
 
															 `Emit` provides the matching value. It's then passed along with the diagnostic
														
 
															 message format to `llvm::formatv` to produce the final diagnostic message.
														
 
															+An on-scope diagnostic uses `CARBON_DIAGNOSTIC_ON_SCOPE`, which is identical
														
 
															+other than the macro name. For example:
														
 
															+
														
 
															+```cpp
														
 
															+CARBON_DIAGNOSTIC_ON_SCOPE(InvalidInScope, Error, "error inside scope");
														
 
															+emitter.Emit(location, InvalidInScope);
														
 
															+```
														
 
															+
														
 
															 ## Diagnostic registry
														
 
															 There is a [registry](/toolchain/diagnostics/kind.def) which all diagnostics
														
@@ -285,3 +317,14 @@ Carbon's diagnostic style aims to balance these concerns. Our style is:
 
															     only allowed for types?
														
 
															 -   TODO: Lots more things to decide, give examples.
														
 
															+
														
 
															+## Alternatives considered
														
 
															+
														
 
															+-   [Don't sort diagnostics](/proposals/p6699.md#dont-sort-diagnostics)
														
 
															+-   [Sort by line and column](/proposals/p6699.md#sort-by-line-and-column)
														
 
															+-   [Sort by last processed token](/proposals/p6699.md#sort-by-last-processed-token)
														
 
															+
														
 
															+## References
														
 
															+
														
 
															+-   Proposal
														
 
															+    [#6699: Sort diagnostics](https://github.com/carbon-language/carbon-lang/pull/6699)