
Generics details 7: final impls (#983)

Add support for marking impls as `final` to say they can't be specialized. This allows generic functions that see that the impl applies to determine the values of its associated types. For example, this allows us to say that the implementation of the `Deref` interface for pointers can't be specialized. Otherwise, `*p` could have an unknown type in a generic function.

Co-authored-by: Chandler Carruth <chandlerc@gmail.com>
josh11b 4 years ago
parent
commit
3f12316d6b
2 changed files with 299 additions and 8 deletions
  1. 131 8
      docs/design/generics/details.md
  2. 168 0
      proposals/p0983.md

+ 131 - 8
docs/design/generics/details.md

@@ -89,7 +89,9 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
         -   [Prioritization rule](#prioritization-rule)
         -   [Acyclic rule](#acyclic-rule)
         -   [Termination rule](#termination-rule)
-        -   [Comparison to Rust](#comparison-to-rust)
+    -   [`final` impls](#final-impls)
+        -   [Libraries that can contain `final` impls](#libraries-that-can-contain-final-impls)
+    -   [Comparison to Rust](#comparison-to-rust)
 -   [Future work](#future-work)
     -   [Dynamic types](#dynamic-types)
         -   [Runtime type parameters](#runtime-type-parameters)
@@ -3750,8 +3752,8 @@ or
 
 ### Blanket impls
 
-A _blanket impl_ is an `impl` that could apply to more than one type, so the
-`impl` will use a type variable for the `Self` type. Here are some examples
+A _blanket impl_ is an `impl` that could apply to more than one root type, so
+the `impl` will use a type variable for the `Self` type. Here are some examples
 where blanket impls arise:
 
 -   Any type implementing `Ordered` should get an implementation of
@@ -4111,7 +4113,127 @@ allow our desired use cases, but allow the compiler to detect non-terminating
 cases? Perhaps there is some sort of complexity measure Carbon can require
 doesn't increase when recursing?
 
-#### Comparison to Rust
+### `final` impls
+
+There are cases where knowing that a parameterized impl won't be specialized is
+particularly valuable. This could let the compiler know the return type of a
+generic function call, such as using an operator:
+
+```
+// Interface defining the behavior of the prefix-* operator
+interface Deref {
+  let Result:! Type;
+  fn DoDeref[me: Self]() -> Result;
+}
+
+// Types implementing `Deref`
+class Ptr(T:! Type) {
+  ...
+  external impl as Deref {
+    let Result:! Type = T;
+    fn DoDeref[me: Self]() -> Result { ... }
+  }
+}
+class Optional(T:! Type) {
+  ...
+  external impl as Deref {
+    let Result:! Type = T;
+    fn DoDeref[me: Self]() -> Result { ... }
+  }
+}
+
+fn F[T:! Type](x: T) {
+  // uses Ptr(T) and Optional(T) in implementation
+}
+```
+
+The concern is that `Optional(T) as Deref` or `Ptr(T) as Deref` could be
+specialized for a more specific `T`, so the compiler can't assume anything
+about the return type of `Deref.DoDeref` calls. This means `F` would in
+practice have to add a constraint, which is both verbose and exposes what should
+be implementation details:
+
+```
+fn F[T:! Type where Optional(T).(Deref.Result) == .Self
+                and Ptr(T).(Deref.Result) == .Self](x: T) {
+  // uses Ptr(T) and Optional(T) in implementation
+}
+```
+
+To mark an impl as not able to be specialized, prefix it with the keyword
+`final`:
+
+```
+class Ptr(T:! Type) {
+  ...
+  // Note: added `final`
+  final external impl as Deref {
+    let Result:! Type = T;
+    fn DoDeref[me: Self]() -> Result { ... }
+  }
+}
+class Optional(T:! Type) {
+  ...
+  // Note: added `final`
+  final external impl as Deref {
+    let Result:! Type = T;
+    fn DoDeref[me: Self]() -> Result { ... }
+  }
+}
+
+// ❌ Illegal: external impl Ptr(i32) as Deref { ... }
+// ❌ Illegal: external impl Optional(i32) as Deref { ... }
+```
+
+This prevents any higher-priority impl that overlaps a final impl from being
+defined. Further, if the Carbon compiler sees a matching final impl, it can
+assume it won't be specialized so it can use the assignments of the associated
+types in that impl definition.
+
+```
+fn F[T:! Type](x: T) {
+  var p: Ptr(T) = ...;
+  // *p has type `T`
+  var o: Optional(T) = ...;
+  // *o has type `T`
+}
+```
+
+#### Libraries that can contain `final` impls
+
+To prevent the possibility of two unrelated libraries defining conflicting
+impls, Carbon restricts which libraries may declare an impl as `final` to only:
+
+-   the library declaring the impl's interface and
+-   the library declaring the root of the `Self` type.
+
+This means:
+
+-   A blanket impl with type structure `impl ? as MyInterface(...)` may only be
+    defined in the same library as `MyInterface`.
+-   An impl with type structure `impl MyType(...) as MyInterface(...)` may be
+    defined in the library with `MyType` or `MyInterface`.
+
+These restrictions ensure that the Carbon compiler can locally check that no
+higher-priority impl is defined superseding a `final` impl.
+
+-   An impl with type structure `impl MyType(...) as MyInterface(...)` defined
+    in the library with `MyType` must import the library defining `MyInterface`,
+    and so will be able to see any `final` blanket impls.
+-   A blanket impl with type structure
+    `impl ? as MyInterface(...ParameterType(...)...)` may be defined in the
+    library with `ParameterType`, but that library must import the library
+    defining `MyInterface`, and so will be able to see any `final` blanket impls
+    that might overlap. A `final` impl with type structure
+    `impl MyType(...) as MyInterface(...)` would be given priority over any
+    overlapping blanket impl defined in the `ParameterType` library.
+-   An impl with type structure
+    `impl MyType(...ParameterType(...)...) as MyInterface(...)` may be defined
+    in the library with `ParameterType`, but that library must import the
+    libraries defining `MyType` and `MyInterface`, and so will be able to see
+    any `final` impls that might overlap.
+
+### Comparison to Rust
 
 Rust has been designing a specialization feature, but it has not been completed.
 Luckily, Rust team members have done a lot of blogging during their design
@@ -4123,13 +4245,13 @@ differences between the Carbon and Rust plans:
 
 -   A Rust impl defaults to not being able to be specialized, with a `default`
     keyword used to opt-in to allowing specialization, reflecting the existing
-    code base developed without specialization. Carbon impls may always be
-    specialized.
+    code base developed without specialization. Carbon impls default to allowing
+    specialization, with restrictions on which may be declared `final`.
 -   Since Rust impls are not specializable by default, generic functions can
     assume that if a matching blanket impl is found, the associated types from
     that impl will be used. In Carbon, if a generic function requires an
-    associated type to have a particular value, the function needs to state that
-    using an explicit constraint.
+    associated type to have a particular value, the function commonly will need
+    to state that using an explicit constraint.
 -   Carbon will not have the "fundamental" attribute used by Rust on types or
     traits, as described in
     [Rust RFC 1023: "Rebalancing Coherence"](https://rust-lang.github.io/rfcs/1023-rebalancing-coherence.html).
@@ -4287,3 +4409,4 @@ parameter, as opposed to an associated type, as in `N:! u32 where ___ >= 2`.
 -   [#818: Constraints for generics (generics details 3)](https://github.com/carbon-language/carbon-lang/pull/818)
 -   [#931: Generic impls access (details 4)](https://github.com/carbon-language/carbon-lang/pull/931)
 -   [#920: Generic parameterized impls (details 5)](https://github.com/carbon-language/carbon-lang/pull/920)
+-   [#983: Generics details 7: final impls](https://github.com/carbon-language/carbon-lang/pull/983)
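
The Rust side of the comparison above can be observed in stable Rust, where impls cannot be specialized and so behave much like the `final` impls this change adds to Carbon. The following is a minimal sketch in Rust rather than Carbon. `DoDeref`, `Ptr`, and `unwrap_ptr` are hypothetical names chosen to mirror the `Deref` example in the design text; they are not part of any library.

```rust
// Hypothetical trait mirroring the `Deref` interface from the design text.
trait DoDeref {
    type Result;
    fn do_deref(self) -> Self::Result;
}

// Hypothetical pointer-like wrapper; `Box` stands in for a real pointer.
struct Ptr<T>(Box<T>);

// Stable Rust has no specialization, so no other impl can override this one
// for a more specific `T`.
impl<T> DoDeref for Ptr<T> {
    type Result = T;
    fn do_deref(self) -> T {
        *self.0
    }
}

// The generic function can rely on `Result == T` without writing a
// `where Ptr<T>: DoDeref<Result = T>` constraint in its signature.
fn unwrap_ptr<T>(p: Ptr<T>) -> T {
    p.do_deref()
}

fn main() {
    let n: i32 = unwrap_ptr(Ptr(Box::new(7)));
    println!("{n}");
}
```

This is the situation a `final` impl restores in Carbon: the body of a generic function sees one impl that cannot be overridden, so the associated type is known without an extra constraint.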

+ 168 - 0
proposals/p0983.md

@@ -0,0 +1,168 @@
+# Generics details 7: final impls
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/983)
+
+<!-- toc -->
+
+## Table of contents
+
+-   [Problem](#problem)
+-   [Background](#background)
+-   [Proposal](#proposal)
+-   [Details](#details)
+-   [Rationale based on Carbon's goals](#rationale-based-on-carbons-goals)
+-   [Alternatives considered](#alternatives-considered)
+    -   [No `final`, all parameterized impls may be specialized](#no-final-all-parameterized-impls-may-be-specialized)
+    -   [`final` associated constants instead of `final` impls](#final-associated-constants-instead-of-final-impls)
+
+<!-- tocstop -->
+
+## Problem
+
+Allowing an impl to be specialized can lead to higher performance if there are
+parameter values for which a more optimized version can be written. However, not
+all impls will be specialized and there are some benefits when that is known:
+
+-   The values of associated types can be assumed to come from the impl. In many
+    cases this means leaking fewer implementation details into the signature of
+    a function using generics.
+-   The bodies of functions from the impl could be inlined into the caller even
+    when using a more dynamic implementation strategy rather than
+    monomorphization.
+
+However, not all impls can opt out of specialization, since this can create
+incompatibilities between unrelated libraries. For example, consider two
+libraries that both import parameterized type `TA` and interface `I`:
+
+-   Library `LB` that defines type `TB` can define an impl with type structure
+    `impl TA(TB, ?) as I`.
+-   Library `LC` that defines type `TC` can define an impl with type structure
+    `impl TA(?, TC) as I`.
+
+Both of these are allowed under
+[Carbon's current orphan rules](/docs/design/generics/details.md#orphan-rule). A
+library `LD` that imports both `LB` and `LC` could then query for the
+implementation of `I` by `TA(TB, TC)` and would use the definition from library
+`LB`, which would be a conflict if library `LC` marked its impl definition as
+not specializable.
+
+## Background
+
+Rust currently does not support specialization, so for backwards compatibility
+impls are final by default in Rust's specialization proposal.
+
+## Proposal
+
+We propose that impls can be declared `final`, but only in libraries that must
+be imported by any file that would otherwise be able to define a higher-priority
+impl.
+
+## Details
+
+Details are in
+[the added `final` impl section to the generics details design document](/docs/design/generics/details.md#final-impls).
+
+## Rationale based on Carbon's goals
+
+This proposal supports the following of Carbon's goals:
+
+-   [Performance-critical software](/docs/project/goals.md#performance-critical-software):
+    the ability to inline functions defined in `final` impls will in some cases
+    improve performance.
+-   [Software and language evolution](/docs/project/goals.md#software-and-language-evolution):
+    reducing how many implementation details are exposed in a generic function's
+    signature allows that function to evolve.
+-   [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write):
+    reducing the list of requirements in a generic function signature is an
+    improvement to both readability and writability. Furthermore, `final` impls
+    are a tool for making code more predictable. For example, making the
+    dereferencing impl for pointers final means it always does the same thing
+    and produces a value of the expected type.
+
+## Alternatives considered
+
+### No `final`, all parameterized impls may be specialized
+
+In addition to the [problems listed above](#problem), we ran into problems in
+proposal [#911](https://github.com/carbon-language/carbon-lang/pull/911) trying
+to use the `CommonType` interface to define the type of a conditional
+expression, `if <condition> then <true-result> else <false-result>`. The idea is
+that `CommonType` implementations would specify how to combine the types of the
+`<true-result>` and `<false-result>` expressions using an associated type. In
+generic code, however, there was nothing to guarantee that there wouldn't be a
+specialization that would change the result. As a result, nothing could be
+concluded about the common type if either expression was generic. If at least
+the common type of two equal types was guaranteed, then you could use an
+explicit cast to make sure the types were as expected. Some method of limiting
+specialization was needed.
+
+We considered other approaches, such as using the fact that the compiler could
+see all implementations of private interfaces, but that didn't address other use
+cases. For example, we don't want users to be able to customize dereferencing
+pointers for their types so that dereferencing pointers behaves predictably in
+generic and regular code.
+
+### `final` associated constants instead of `final` impls
+
+We considered allowing developers to mark individual items in an impl as `final`
+instead. This gave developers more control, but we didn't have examples where
+that extra control was needed. It also introduced a number of complexities and
+concerns.
+
+The value for a `final let` could be an expression dependent on other associated
+constants which could be `final` or not. Checking that a refining impl adheres
+to that constraint is possible but subtle, and mistakes could be tricky to
+diagnose clearly.
+
+If an impl matches a subset of an impl with a `final let`, how should the
+narrower impl comply with the restriction from the broader?
+
+```
+interface A {
+  let T:! Type;
+}
+
+impl [U:! Type] Vector(U) as A {
+  final let T:! Type = i32;
+}
+
+impl Vector(f32) as A {
+  // T has to be `i32` because of the `final let`
+  // from the previous impl. What needs to be
+  // written here?
+}
+```
+
+We considered two different approaches, neither of which was satisfying:
+
+-   **Restate approach:** It could restate the `let` with a consistent value.
+    This does not give any indication that the `let` value is constrained, and
+    what impl is introducing that constraint, leading to spooky action at a
+    distance. It was unclear to us whether the restated `let` should use `final`
+    as well, or maybe some other keyword.
+-   **Inheritance approach:** We could have a concept of inheriting from an
+    impl, and require that any impl refining an impl with `final` members must
+    inherit those values rather than declaring them. Inheritance between impls
+    might be a useful feature in its own right, but requires there be some way
+    to name the impl being inherited from.
+
+Consider two overlapping impls that both use `final let`. The compiler would
+need to validate that they are consistent on their overlap, a source of
+complexity for the user. An impl that overlaps both would have to be consistent
+with both, but would not be able to inherit from both, a problem with using the
+inheritance approach.
+
+Ultimately we decided that this approach had a lot of complexity, concerns, and
+edge cases and we could postpone trying to solve these problems until such time
+as we determined there was a need for the greater expressivity of being able to
+mark individual items as `final`. This discussion occurred in:
+
+-   [Document examining an extended example using specialization](https://docs.google.com/document/d/1w-kRC338Jc1ibTu7Vf0pOlGKdrpumfz63bzUIxEj9jY/edit)
+-   [2021-11-29 open discussion](https://docs.google.com/document/d/1cRrhRrmaUf2hVi2lFcHsYo2j0jI6t9RGZoYjWhRxp14/edit?resourcekey=0-xWHBEZ8zIqnJiB4yfBSLfA#heading=h.6komy889g3hc)
+-   [Carbon's #typesystem channel on Discord](https://discord.com/channels/655572317891461132/708431657849585705/910681126236987495)
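
The conflict described in the Problem section can be modeled with a small sketch. It is written in Rust, not Carbon, and is only a toy model under stated assumptions. Type structures are flattened to a list of slots. The prioritization rule is approximated as "a concrete name beats `?` at the first position where two structures differ". All names (`Slot`, `ImplDecl`, `select`, `final_conflict`) are hypothetical. The sketch does not model the orphan rule or the library restrictions on `final`.

```rust
/// One position in a flattened type structure: a concrete name or `?`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Slot {
    Name(&'static str),
    Any,
}

/// A hypothetical record of one impl declaration.
#[derive(Debug)]
struct ImplDecl {
    library: &'static str,
    structure: Vec<Slot>,
    is_final: bool,
}

/// True if the impl's type structure matches the concrete query.
fn matches(decl: &ImplDecl, query: &[&str]) -> bool {
    decl.structure.len() == query.len()
        && decl.structure.iter().zip(query).all(|(slot, q)| match slot {
            Slot::Any => true,
            Slot::Name(n) => n == q,
        })
}

/// True if `a` outranks `b`: at the first slot where one is concrete and
/// the other is `?`, `a` is the concrete one.
fn outranks(a: &ImplDecl, b: &ImplDecl) -> bool {
    for (x, y) in a.structure.iter().zip(&b.structure) {
        match (x, y) {
            (Slot::Name(_), Slot::Any) => return true,
            (Slot::Any, Slot::Name(_)) => return false,
            _ => {}
        }
    }
    false
}

/// Picks the highest-priority impl matching the query, if any.
fn select<'a>(impls: &'a [ImplDecl], query: &[&str]) -> Option<&'a ImplDecl> {
    impls
        .iter()
        .filter(|d| matches(d, query))
        .fold(None, |best, d| match best {
            Some(b) if !outranks(d, b) => Some(b),
            _ => Some(d),
        })
}

/// Returns the library of a `final` impl that matches the query but is
/// outranked by the selected impl: the situation `final` must rule out.
fn final_conflict(impls: &[ImplDecl], query: &[&str]) -> Option<&'static str> {
    let winner = select(impls, query)?;
    impls
        .iter()
        .find(|d| d.is_final && matches(d, query) && outranks(winner, d))
        .map(|d| d.library)
}

fn main() {
    use Slot::{Any, Name};
    // Flattened structures for `impl TA(TB, ?) as I` and `impl TA(?, TC) as I`.
    let impls = vec![
        ImplDecl { library: "LB", structure: vec![Name("TA"), Name("TB"), Any], is_final: false },
        ImplDecl { library: "LC", structure: vec![Name("TA"), Any, Name("TC")], is_final: true },
    ];
    let query = ["TA", "TB", "TC"];
    let winner = select(&impls, &query).map(|d| d.library);
    println!("selected: {:?}, final conflict: {:?}", winner, final_conflict(&impls, &query));
}
```

In this model, marking library `LC`'s impl as final does not stop the query `TA(TB, TC)` from selecting library `LB`'s impl. This is the incompatibility that restricting which libraries may declare `final` impls is meant to prevent.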