# Carbon <-> C++ Interop: Primitive Types

[Pull request](https://github.com/carbon-language/carbon-lang/pull/5448)

## Table of contents

- [Abstract](#abstract)
- [Problem](#problem)
- [Background](#background)
    - [Data models](#data-models)
    - [Carbon Primitive Types](#carbon-primitive-types)
    - [C++ Fundamental Types](#c-fundamental-types)
        - [void](#void)
        - [std::nullptr_t](#stdnullptr_t)
        - [std::byte](#stdbyte)
        - [Character types](#character-types)
        - [Signed integer types](#signed-integer-types)
        - [Unsigned integer types](#unsigned-integer-types)
        - [Floating-point types](#floating-point-types)
- [Proposal](#proposal)
- [Details](#details)
    - [C++ -> Carbon mapping details](#c---carbon-mapping-details)
    - [Carbon -> C++ mapping details](#carbon---c-mapping-details)
- [Rationale](#rationale)
- [Alternatives considered](#alternatives-considered)
- [Open questions](#open-questions)

## Abstract

Define the type mapping of the primitive types between Carbon and C++.

## Problem

Interoperability of Carbon with C++ is one of the Carbon language goals (see [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code)). Providing [unsurprising mappings between C++ and Carbon types](/docs/design/interoperability/philosophy_and_goals.md#unsurprising-mappings-between-c-and-carbon-types) is one of its sub-goals. This proposal addresses the type mapping between the two languages to support achieving this goal.

## Background

### Data models

The following data models are widely accepted:

- 32-bit systems:
    - `LP32` (Win16 API): `int` 16-bit; `long` 32-bit; `pointer` 32-bit.
    - `ILP32` (Win32 API; Unix and Unix-like systems): `int` 32-bit; `long` 32-bit; `pointer` 32-bit.
- 64-bit systems:
    - `LLP64` (Win32 API: 64-bit ARM or x86-64): `int` 32-bit; `long` 32-bit; `pointer` 64-bit.
    - `LP64` (Unix and Unix-like systems (Linux, macOS)): `int` 32-bit; `long` 64-bit; `pointer` 64-bit.
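The data models listed above differ only in the widths of `int`, `long` and pointers, so the model of a given toolchain can be derived from `sizeof`. The following is a minimal sketch of that classification; the helper `DataModelName` and the `k...` constants are hypothetical names used for illustration and are not part of the proposal.

```cpp
#include <climits>
#include <string_view>

// Widths, in bits, of the three types that distinguish the data models.
constexpr int kIntBits = sizeof(int) * CHAR_BIT;
constexpr int kLongBits = sizeof(long) * CHAR_BIT;
constexpr int kPointerBits = sizeof(void*) * CHAR_BIT;

// Hypothetical helper: maps (int, long, pointer) widths to the data model
// names used in the list above. Any other combination is reported as
// "unknown".
constexpr std::string_view DataModelName(int int_bits, int long_bits,
                                         int pointer_bits) {
  if (int_bits == 16 && long_bits == 32 && pointer_bits == 32) return "LP32";
  if (int_bits == 32 && long_bits == 32 && pointer_bits == 32) return "ILP32";
  if (int_bits == 32 && long_bits == 32 && pointer_bits == 64) return "LLP64";
  if (int_bits == 32 && long_bits == 64 && pointer_bits == 64) return "LP64";
  return "unknown";
}

// The data model of the platform this translation unit is compiled for.
constexpr std::string_view kThisPlatform =
    DataModelName(kIntBits, kLongBits, kPointerBits);
```

On a typical 64-bit Linux or macOS toolchain `kThisPlatform` would be expected to name `LP64`, and on 64-bit Windows `LLP64`; the sketch requires C++17 or later.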
[Carbon supported platforms](/docs/project/principles/success_criteria.md#modern-os-platforms-hardware-architectures-and-environments)

Carbon will prioritize support for modern operating systems on 64-bit, little-endian platforms (for example [LLP64](/proposals/p5448.md#data-models), [LP64](/proposals/p5448.md#data-models)). Historic platforms like [LP32](/proposals/p5448.md#data-models) won't be supported. For clarity, the text below omits [LP32](/proposals/p5448.md#data-models)-specific information and focuses only on the platforms Carbon supports.

### Carbon Primitive Types

Carbon has the following [primitive types](https://github.com/carbon-language/carbon-lang/blob/trunk/docs/design/README.md#primitive-types):

- `bool`: boolean type taking `true` or `false`
- integer types:
    - signed integer types: `iN` (`N` - bit width, a positive multiple of 8)
        - `i8`, `i16`, `i32`, `i64`, `i128`, `i256`
    - unsigned integer types: `uN` (`N` - bit width, a positive multiple of 8)
        - `u8`, `u16`, `u32`, `u64`, `u128`, `u256`
- floating-point types: `fN` (`N` - bit width, a positive multiple of 8), IEEE-754 format
    - `f16`, `f32`, and `f64` - always available
    - `f80`, `f128`, or `f256` may be available, depending on the platform

### C++ Fundamental Types

C++ calls the primitive types [fundamental types](https://en.cppreference.com/w/cpp/language/types).
The following fundamental types exist in C++:

- `void`
- `std::nullptr_t`
- `std::byte`
- integral types (also integer types):
    - `bool`
    - character types:
        - narrow character types: `signed char`, `unsigned char`, `char`, `char8_t` (C++20)
        - wide character types: `char16_t`, `char32_t`, `wchar_t`
    - signed integer types:
        - standard signed integer types: `signed char`, `short`, `int`, `long`, `long long`
        - extended signed integer types (implementation-defined)
    - unsigned integer types:
        - standard unsigned integer types: `unsigned char`, `unsigned short`, `unsigned int`, `unsigned long`, `unsigned long long`
        - extended unsigned integer types
- floating-point types:
    - standard floating-point types: `float`, `double`, `long double`
    - extended floating-point types:
        - fixed width floating-point types (since C++23): `float16_t`, `float32_t`, `float64_t`, `float128_t`, `bfloat16_t`
        - other implementation-defined extended floating-point types

#### void

Objects of type `void` are not allowed, nor are arrays of `void` or references to `void`. Pointers to `void` and functions returning `void` are allowed.

#### std::nullptr_t

The type of `nullptr` (the null pointer literal). It's a distinct type that is not itself a pointer type.
#### std::byte

| Type | Width in bits | Notes |
| --- | --- | --- |
| `std::byte` | 8-bit | can be used to access raw memory, same as `unsigned char`, but it's not a character type and is not an arithmetic type |

#### Character types

| Type | Width in bits | Notes |
| --- | --- | --- |
| `char` | 8-bit | multibyte characters; same representation, alignment and signedness as either `signed char` or `unsigned char` (platform-dependent), but it's a distinct type |
| `signed char` | 8-bit | signed character representation |
| `unsigned char` | 8-bit | unsigned character representation; raw memory access |
| `char8_t` | 8-bit | UTF-8 character representation; same size, alignment and signedness as `unsigned char`, but a distinct type |
| `char16_t` | 16-bit | UTF-16 character representation; same size, alignment and signedness as `std::uint_least16_t`, but a distinct type |
| `char32_t` | 32-bit | UTF-32 character representation; same size, alignment and signedness as `std::uint_least32_t`, but a distinct type |
| `wchar_t` | 32-bit on Linux, 16-bit on Windows | wide character representation, holds UTF-32 on Linux and other non-Windows platforms, UTF-16 on Windows. |

#### Signed integer types

**Standard signed integer types**

| Type | Width in bits |
| --- | --- |
| `signed char` | 8-bit |
| `short` | 16-bit |
| `int` | 32-bit |
| `long` | [LLP64](/proposals/p5448.md#data-models): 32-bit; [LP64](/proposals/p5448.md#data-models): 64-bit |
| `long long` | 64-bit |

**Exact-width integer types**

Typically aliases of the standard integer types.

| Type | Width in bits | Defined as |
| --- | --- | --- |
| `std::int8_t` | 8-bit | `typedef signed char int8_t` |
| `std::int16_t` | 16-bit | `typedef signed short int16_t` |
| `std::int32_t` | 32-bit | `typedef signed int int32_t` |
| `std::int64_t` | 64-bit | [LLP64](/proposals/p5448.md#data-models): `typedef signed long long int64_t` |
| | | [LP64](/proposals/p5448.md#data-models): `typedef signed long int64_t` |

**Fastest minimum-width integer types**

Integer types that are usually fastest to operate with among all integer types that have the minimum specified width.

| Type | Width in bits | Defined as |
| --- | --- | --- |
| `std::int_fast8_t` | >=8-bit | `typedef signed char int_fast8_t` |
| `std::int_fast16_t` | >=16-bit | implementation dependent |
| `std::int_fast32_t` | >=32-bit | implementation dependent |
| `std::int_fast64_t` | >=64-bit | [LLP64](/proposals/p5448.md#data-models): `typedef signed long long int_fast64_t` |
| | | [LP64](/proposals/p5448.md#data-models): `typedef signed long int_fast64_t` |

**Minimum-width integer types**

Smallest signed integer type with width of at least N-bits.

| Type | Width in bits | Defined as |
| --- | --- | --- |
| `std::int_least8_t` | >=8-bit | `typedef signed char int_least8_t` |
| `std::int_least16_t` | >=16-bit | `typedef short int_least16_t` |
| `std::int_least32_t` | >=32-bit | `typedef int int_least32_t` |
| `std::int_least64_t` | >=64-bit | [LLP64](/proposals/p5448.md#data-models): `typedef signed long long int_least64_t` |
| | | [LP64](/proposals/p5448.md#data-models): `typedef signed long int_least64_t` |

**Greatest-width integer types**

Maximum-width signed integer type.

| Type | Width in bits | Defined as |
| --- | --- | --- |
| `std::intmax_t` | >=32-bit | [LLP64](/proposals/p5448.md#data-models): `typedef signed long long intmax_t` |
| | | [LP64](/proposals/p5448.md#data-models): `typedef signed long intmax_t` |

**Integer types capable of holding object pointers**

Signed integer type, capable of holding any pointer.

| Type | Width in bits | Defined as |
| --- | --- | --- |
| `std::intptr_t` | >=16-bit | most platforms: `typedef long intptr_t` |
| | | some [ILP32](/proposals/p5448.md#data-models): `typedef int intptr_t` |

**Other signed integer types**

| Type | Width in bits | Defined as |
| --- | --- | --- |
| `ptrdiff_t` | >=16-bit | most platforms: `typedef std::intptr_t ptrdiff_t` |
| | | Holds the result of subtracting two pointers. |

#### Unsigned integer types

The unsigned integer types have the same sizes as their [signed counterparts](/proposals/p5448.md#signed-integer-types).

| Type | Width in bits | Defined as |
| --- | --- | --- |
| `size_t` | >=16-bit | most platforms: `typedef uintptr_t size_t` |
| | | Holds the result of the `sizeof` operator. |

#### Floating-point types

**Standard floating-point types**

| Type | Format | Width in bits | Note |
| --- | --- | --- | --- |
| `float` | usually [IEEE-754 binary32](https://en.wikipedia.org/wiki/Single-precision_floating-point_format) | 32-bit | The format or the size can vary depending on the compiler and the platform. |
| `double` | usually [IEEE-754 binary64](https://en.wikipedia.org/wiki/Double-precision_floating-point_format) | 64-bit | The format or the size can vary depending on the compiler and the platform. |
| `long double` | [IEEE-754 binary128](https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format) | 128-bit | used by some SPARC, MIPS, ARM64 implementations. |
| | [IEEE-754 binary64-extended format](https://en.wikipedia.org/wiki/Extended_precision) | 80-bit or 64-bit | 80-bit (most x86 and x86-64 implementations); 64-bit used by MSVC. |
| | [`double-double`](https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format#Double-double_arithmetic) | 128-bit | used on PowerPC. |

**Fixed-width floating-point types (C++23)**

They aren’t aliases to the standard floating-point types (`float`, `double`, `long double`), but to an extended floating-point type.

| Type | Width in bits | Defined as |
| --- | --- | --- |
| `std::float16_t` | 16-bit | `using float16_t = _Float16` |
| `std::float32_t` | 32-bit | `using float32_t = _Float32` |
| `std::float64_t` | 64-bit | `using float64_t = _Float64` |
| `std::float128_t` | 128-bit | `using float128_t = _Float128` |
| `std::bfloat16_t` | 16-bit | |

## Proposal

- The C++ fixed-width integer types `intN_t` will be the same type as Carbon integer types `iN`. Likewise for `uintN_t` <-> `uN`.
- A C++ `builtin type` will be available in Carbon as `Cpp.builtin_type`, for the standard C++ signed/unsigned integer and floating-point types.
- A C++ integer `builtin type` that is not the same as `intN_t` or `uintN_t` for any N, will be nameable in Carbon only as `Cpp.builtin_type`.
- Different C++ types will be considered different in Carbon, so C++ overload resolution can be handled without issues.
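The last point matters because C++ treats `long` and `long long` as distinct types even when they have the same width, and overload sets can depend on that distinction. The following C++ sketch illustrates this on the C++ side only; the function `Rank` and the `kInt64Is...` constants are hypothetical names chosen for the example.

```cpp
#include <cstdint>
#include <type_traits>

// `long` and `long long` are distinct types on every platform, even on
// LP64 where both are 64 bits wide.
static_assert(!std::is_same_v<long, long long>);

// A hypothetical overload set that distinguishes the standard signed
// integer types. Each overload returns a different tag value.
constexpr int Rank(int) { return 1; }
constexpr int Rank(long) { return 2; }
constexpr int Rank(long long) { return 3; }

// `std::int64_t` is an alias, so it selects whichever overload matches the
// underlying type on the current platform.
constexpr bool kInt64IsLong = std::is_same_v<std::int64_t, long>;
constexpr bool kInt64IsLongLong = std::is_same_v<std::int64_t, long long>;
constexpr int kInt64Rank = Rank(std::int64_t{0});
```

If Carbon collapsed `Cpp.long` and `Cpp.long_long` into a single type, a call from Carbon could not select between `Rank(long)` and `Rank(long long)`, which is the situation the proposal is trying to avoid.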

## Details

The table of Carbon <-> C++ mappings is as follows:

| Carbon | C++ |
| --- | --- |
| `()` as a return type | `void` |
| `bool` | `bool` |
| `i8` | `int8_t` |
| `i16` | `int16_t` |
| `i32` | `int32_t` |
| `i64` | `int64_t` |
| `i128` | `int128_t` |
| `u8` | `uint8_t` |
| `u16` | `uint16_t` |
| `u32` | `uint32_t` |
| `u64` | `uint64_t` |
| `u128` | `uint128_t` |
| `Cpp.signed_char` | `signed char` |
| `Cpp.short` | `short` |
| `Cpp.int` | `int` |
| `Cpp.long` | `long` |
| `Cpp.long_long` | `long long` |
| `Cpp.unsigned_char` | `unsigned char` |
| `Cpp.unsigned_short` | `unsigned short` |
| `Cpp.unsigned_int` | `unsigned int` |
| `Cpp.unsigned_long` | `unsigned long` |
| `Cpp.unsigned_long_long` | `unsigned long long` |
| `Cpp.float` | `float` |
| `Cpp.double` | `double` |
| `Cpp.long_double` | `long double` |
| `f16` | `std::float16_t (_Float16)` |
| `f128` | `std::float128_t (_Float128)` |
| TBD | `float32_t`, `float64_t`, `bfloat16_t` |
| TBD | `char`, `charN_t`, `wchar_t` |
| TBD | `std::byte` |
| TBD | `std::nullptr_t` |

In addition to the exact mappings above, the following are expected to be the same type, because the different C++ spellings name the same type:

| Carbon type | C++ type |
| --- | --- |
| `i8` | `signed char` |
| `u8` | `unsigned char` |
| `i16` | `short` |
| `u16` | `unsigned short` |
| `i32` | `int` |
| `u32` | `unsigned int` |
| `i64` | `long` or `long long` |
| `u64` | `unsigned long` or `unsigned long long` |

### C++ -> Carbon mapping details

- C++ `intN_t` type will be considered the same type as Carbon's `iN` type. Likewise for `uintN_t` <-> `uN`.
- C++ `builtin type` will be available in Carbon inside the `Cpp` namespace under the name `Cpp.builtin_type`, for the standard signed/unsigned integer and floating-point types.
    - The names will follow the pattern:
        - `Cpp.[unsigned_](long_long|long|int|short|double|float)`, that is, signedness, then size keyword(s), then a type keyword only if there are no size keywords. For example `Cpp.unsigned_int` not `Cpp.unsigned`, `Cpp.long` not `Cpp.long_int`.
    - They will be available when an `import Cpp` declaration is present.
    - Name collision: This naming may cause name collisions if such a name already exists in the unnamed C++ namespace. We do not consider this a common case and will not support it, for the benefit of keeping the C++-specific names in the package `Cpp`.
- `Cpp.builtin_type` will be the same type as `iN`/`uN`, if the corresponding C++ `builtin type` is the same as `intN_t`/`uintN_t` on that platform. Otherwise it will be available in Carbon as a new, distinct type that is compatible with some of the `iN`/`uN` types. For example:
    - If `int32_t` is the same type as `int`, then `Cpp.int` will be the same type as `i32`.
    - If `int64_t` is the same type as `long`, then `Cpp.long` will be the same type as `i64`. `Cpp.long_long` will be a different type, compatible with `i64`.
- `Cpp.float` and `Cpp.double` will be the same type as `f32` and `f64` respectively.
- The type aliases `[u]int_fastN_t`, `[u]int_leastN_t`, `[u]intmax_t`, `[u]intptr_t`, `ptrdiff_t` and `size_t` will be available in Carbon in the `Cpp` namespace if the C++ header declaring them is imported (for example ``, `` etc), with names like `Cpp.[u]int_fastN_t`, `Cpp.[u]int_leastN_t`, `Cpp.size_t` etc. No special support will be provided.

### Carbon -> C++ mapping details

- Same as above, Carbon `iN`/`uN` types will map to the C++ `intN_t`/`uintN_t` types.
- `f32`/`f64` will map to `float`/`double` respectively.
- `f16`/`f128` will map to `std::float16_t (_Float16)`/`std::float128_t (_Float128)` respectively.
- Some Carbon types may not have direct mappings in C++: `i256`, `u256`, `f80`, `f256`.
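
The platform-dependent rule for the standard signed integer types in the C++ -> Carbon direction can be sketched in C++ with `std::is_same_v`. This is only an illustration of the rule as stated above, not how the Carbon toolchain implements it; `CarbonSpelling` is a hypothetical helper, and the `Cpp.*` fallback name is passed in by the caller.

```cpp
#include <cstdint>
#include <string_view>
#include <type_traits>

// Hypothetical helper: returns the Carbon spelling that the rule above
// would give the C++ signed integer type `T`. If `T` is the same type as
// some `intN_t` on this platform, the result is the matching `iN`;
// otherwise the type stays nameable only as the given `Cpp.*` name.
template <typename T>
constexpr std::string_view CarbonSpelling(std::string_view cpp_name) {
  if constexpr (std::is_same_v<T, std::int8_t>) {
    return "i8";
  } else if constexpr (std::is_same_v<T, std::int16_t>) {
    return "i16";
  } else if constexpr (std::is_same_v<T, std::int32_t>) {
    return "i32";
  } else if constexpr (std::is_same_v<T, std::int64_t>) {
    return "i64";
  } else {
    return cpp_name;
  }
}

// Exactly one of `long` and `long long` is `int64_t` on the common data
// models, so exactly one of these two is expected to spell `i64`.
constexpr std::string_view kLongSpelling =
    CarbonSpelling<long>("Cpp.long");
constexpr std::string_view kLongLongSpelling =
    CarbonSpelling<long long>("Cpp.long_long");
```

On an LP64 toolchain where `int64_t` is `long`, this sketch would be expected to yield `i64` for `long` and `Cpp.long_long` for `long long`, matching the second example in the list above.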

## Rationale

One of Carbon's goals is seamless interoperability with C++ (see [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code)), which calls for clear cross-language calls and high performance. The proposal maps the Carbon types to their direct equivalents in C++ with zero overhead, which supports the goal of unsurprising, high-performance mappings between C++ and Carbon types.

## Alternatives considered

Naming of new types:

- Allow all keyword permutations.
    - Reason not to do this: unnecessary and complicated.
- Only include the keywords, and provide some syntax for combining them (for example, `Cpp.unsigned` & `Cpp.long` or `Cpp.unsigned(Cpp.long)`).
    - Reason to do this: avoids taking any identifiers from Cpp that are not C++ keywords.
    - Reason not to do this: overly complicated.
- Use `Core.Cpp.T` instead of `Cpp.T`.
    - Reason to do this: avoid name collisions with C++ code.
    - Reason not to do this: name collisions should not be a problem in practice, and we would prefer to keep C++-specific names in package Cpp.

`long`

- `Cpp.long` and `Cpp.long_long` both map to Carbon types that are distinct from `iN` for any `N`, but are compatible with either `i32` or `i64` as appropriate.
    - Reason not to do this: unnecessary conversions, and `long` and `long long` would be handled differently from the other C++ types.
- Provide platform-dependent conversion functions for `long`.
    - Reason to do this: the conversions will be clearly outlined.
    - Reason not to do this: performance overhead for certain platforms.
- Always map `long` to a fixed-size Carbon type chosen per platform (for example to either `i32` or `i64`).
    - Reason to do this: all the code will be using fixed-size types.
    - Reason not to do this: the same C++ function may map differently on different platforms, and the Carbon code would have to compensate for that in order to compile.

`float32_t`, `float64_t`

- Map `f32` <-> `float32_t` and `f64` <-> `float64_t`
    - Reason to do this: follow the same analogy as for the integer types (`iN` <-> `intN_t`)
    - Reason not to do this:
        - `float32_t` and `float64_t` are new in C++23, so this won't be directly achievable; the corresponding `_FloatN` types would need to be used for older C++ versions.
        - They are not aliases for the standard floating-point types (`float`, `double`, `long double`), but for extended floating-point types, so type conversions will be needed for the standard types.

## Open questions

The mapping of the following types remains open and will be discussed at a later point:

- `char`, `char8_t`, `char16_t`, `char32_t`, `wchar_t`
    - Carbon doesn't have character types yet, so the mapping of these types will be discussed once they are available.
    - These are all distinct types in C++, which should be taken into account to prevent any issues with overloading.
- `std::byte`
- `std::nullptr_t`
- `void*`
- `Cpp.long_double` - details of this new type are still to be discussed.
- `float32_t`, `float64_t`, `bfloat16_t`.