From 8cea06aa854f617828e9e6b3abab40612d19d2fa Mon Sep 17 00:00:00 2001
From: Ralf Jung
Date: Wed, 29 Aug 2018 21:59:41 +0200
Subject: [PATCH 1/3] const safety and promotion

---
 README.md       |   7 ++-
 const_safety.md | 135 +++++++++++++++++++++++++++++++++++++++++++++++-
 promotion.md    |  50 ++++++++++++++----
 3 files changed, 181 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index 1ce69af..fd0cbb8 100644
--- a/README.md
+++ b/README.md
@@ -18,6 +18,11 @@ The Rust compiler runs the [MIR](https://rust-lang-nursery.github.io/rustc-guide
 in the [`MIR` interpreter (miri)](https://rust-lang-nursery.github.io/rustc-guide/const-eval.html),
 which sort of is a virtual machine using `MIR` as "bytecode".
 
+## Table of Contents
+
+* [Const Safety](const_safety.md)
+* [Promotion](promotion.md)
+
 ## Related RFCs
 
 ### Const Promotion
@@ -62,4 +67,4 @@ even if it does not break the compilation of the current crate's dependencies.
 Some of these features interact. E.g.
 
 * `match` + `loop` yields `while`
-* `panic!` + `if` + `locals` yields `assert!`
\ No newline at end of file
+* `panic!` + `if` + `locals` yields `assert!`
diff --git a/const_safety.md b/const_safety.md
index 835b0b7..ca63a1a 100644
--- a/const_safety.md
+++ b/const_safety.md
@@ -1 +1,134 @@
-# Const safety
\ No newline at end of file
+# Const safety
+
+The miri engine, which is used to execute code at compile time, can fail in
+four possible ways:
+
+* The program performs an unsupported operation (e.g., calling an unimplemented
+  intrinsic, or doing an operation that would observe the integer address of a
+  pointer).
+* The program causes undefined behavior (e.g., dereferencing an out-of-bounds
+  pointer).
+* The program panics (e.g., a failed bounds check).
+* The program loops forever, and this is detected by the loop detector.
+
+Just like panics and non-termination are acceptable in safe run-time Rust code,
+we also consider these acceptable in safe compile-time Rust code.
+However, we
+would like to rule out the first two kinds of failures in safe code. Following
+the terminology in [this blog post], we call a program that does not fail in the
+first two ways *const safe*.
+
+[this blog post]: https://www.ralfj.de/blog/2018/07/19/const.html
+
+The goal of the const safety check, then, is to ensure that a program is const
+safe. What makes this tricky is that there are some operations that are safe as
+far as run-time Rust is concerned, but unsupported in the miri engine and hence
+not const safe (they fall in the first category of failures above). We call these operations *unconst*. The purpose
+of the following section is to explain this in more detail, before proceeding
+with the main definitions.
+
+## Miri background
+
+A very simple example of an unconst operation is
+```rust
+static S: i32 = 0;
+const BAD: bool = (&S as *const i32 as usize) % 16 == 0;
+```
+The modulo operation here is not supported by the miri engine because evaluating
+it requires knowing the actual integer address of `S`.
+
+The way miri handles this is by treating pointer and integer values separately.
+The most primitive kind of value in miri is a `Scalar`, and a scalar is *either*
+a pointer (`Scalar::Ptr`) or a bunch of bits representing an integer
+(`Scalar::Bits`). Every value of a variable of primitive type is stored as a
+`Scalar`. In the code above, casting the pointer `&S` to `*const i32` and then
+to `usize` does not actually change the value -- we end up with a local variable
+of type `usize` whose value is a `Scalar::Ptr`. This is not a problem in
+itself, but then executing `%` on this *pointer value* is unsupported.
+
+However, it does not seem appropriate to blame the `%` operation above for this
+failure. `%` on "normal" `usize` values (`Scalar::Bits`) is perfectly fine, just using it on
+values computed from pointers is an issue.
+Essentially, `&i32 as *const i32 as
+usize` is a "safe" `usize` at run-time (meaning that applying safe operations to
+this `usize` cannot lead to misbehavior, following terminology [suggested here])
+-- but the same value is *not* "safe" at compile-time, because we can cause a
+const safety violation by applying a safe operation (namely, `%`).
+
+[suggested here]: https://www.ralfj.de/blog/2018/08/22/two-kinds-of-invariants.html
+
+## Const safety check on values
+
+The result of any const computation (`const`, `static`, promoteds) is subject to
+a "sanity check" which enforces const safety. (A sanity check is already
+happening, but it is not exactly checking const safety currently.) Const safety
+is defined as follows:
+
+* Integer and floating point types are const-safe if they are a `Scalar::Bits`.
+  This makes sure that we can run `%` and other operations without violating
+  const safety. In particular, the value must *not* be uninitialized.
+* References are const-safe if they are `Scalar::Ptr` into allocated memory, and
+  the data stored there is const-safe. (Technically, we would also like to
+  require `&mut` to be unique and `&` to not be mutable unless there is an
+  `UnsafeCell`, but it seems infeasible to check that.) For fat pointers, the
+  length of a slice must be a valid `usize` and the vtable of a `dyn Trait` must
+  be a valid vtable.
+* `bool` is const-safe if it is `Scalar::Bits` with a value of `0` or `1`.
+* `char` is const-safe if it is a valid unicode codepoint.
+* `()` is always const-safe.
+* `!` is never const-safe.
+* Tuples, structs, arrays and slices are const-safe if all their fields are
+  const-safe.
+* Enums are const-safe if they have a valid discriminant and the fields of the
+  active variant are const-safe.
+* Unions are always const-safe; the data does not matter.
+* `dyn Trait` is const-safe if the value is const-safe at the type indicated by
+  the vtable.
+* Function pointers are const-safe if they point to an actual function.
+  A
+  `const fn` pointer (when/if we have those) must point to a `const fn`.
+
+For example:
+```rust
+static S: i32 = 0;
+const BAD: usize = &S as *const i32 as usize;
+```
+Here, `S` is const-safe because `0` is a `Scalar::Bits`. However, `BAD` is *not* const-safe because it is a `Scalar::Ptr`.
+
+## Const safety check on code
+
+The purpose of the const safety check on code is to prohibit construction of
+non-const-safe values in safe code. We can allow *almost* all safe operations,
+except for unconst operations -- which are all related to raw pointers:
+Comparing raw pointers for (in)equality, converting them to integers, hashing
+them (including hashing references) and so on must be prohibited. Basically, we
+should not permit any raw pointer operations to begin with, and carefully
+evaluate any that we permit to make sure they are fully supported by miri and do
+not permit constructing non-const-safe values.
+
+There should also be a mechanism akin to `unsafe` blocks to opt-in to using
+unconst operations. At this point, it becomes the responsibility of the
+programmer to preserve const safety. In particular, a *safe* `const fn` must
+always execute const-safely when called with const-safe arguments, and produce a
+const-safe result. For example, the following function is const-safe (after
+some extensions of the miri engine that are already implemented in miri) even
+though it uses raw pointer operations:
+```rust
+const fn test_eq<T>(x: &T, y: &T) -> bool {
+    x as *const _ == y as *const _
+}
+```
+On the other hand, the following function is *not* const-safe and hence it is considered a bug to mark it as such:
+```rust
+const fn convert<T>(x: &T) -> usize {
+    x as *const _ as usize
+}
+```
+
+## Open questions
+
+* Do we allow unconst operations in `unsafe` blocks, or do we have some other
+  mechanism for opting in to them (like `unconst` blocks)?
+
+* How do we communicate that the rules for safe `const fn` using unsafe code are
+  different than the ones for "runtime" functions? The good news here is that
+  violating the rules, at worst, leads to a compile-time error in a dependency.
+  No UB can arise. However, thanks to [promotion](promotion.md), compile-time
+  errors can arise even if no `const` or `static` is involved.
diff --git a/promotion.md b/promotion.md
index 7d0b312..c50731f 100644
--- a/promotion.md
+++ b/promotion.md
@@ -1,20 +1,52 @@
 # Const promotion
 
+Note that promotion happens on the MIR, not on surface-level syntax. This is
+relevant when discussing e.g. handling of panics caused by overflowing
+arithmetic.
+
 ## Rules
 
 ### 1. No side effects
 
-Promotion is not allowed to throw away side effects.
-This includes panicking. So in order to promote `&(0_usize - 1)`,
-the subtraction is thrown away and only the panic is kept.
+Promotion is not allowed to throw away side effects. This includes
+panicking. let us look at what happens when we promote `&(0_usize - 1)`:
+In the MIR, this looks roughly like
+```
+_tmp1 = CheckedSub (const 0usize) (const 1usize)
+assert(!_tmp1.1) -> [success: bb2; unwind: ..]
+
+bb2:
+_tmp2 = _tmp1.0
+_res = &_tmp2
+```
+Both `_tmp1` and `_tmp2` are promoted to statics. `_tmp1` evaluates to `(~0,
+true)`, so the assertion will always fail at run-time. Computing `_tmp2` fails
+with a panic, which is thrown away -- so we have no result. In principle, we
+could generate any code for this because we know the code is unreachable (the
+assertion is going to fail). Just to be safe, we generate a call to
+`llvm.trap`.
 
 ### 2. Const safety
 
-Only const safe code gets promoted. This means that promotion doesn't happen
-if the code does some action which, when run at compile time, either errors or
-produces a value that differs from runtime.
+Only const safe code gets promoted. The exact details for `const safety` are
+discussed in [here](const_safety.md).
+
+An example of this would be `&(&1 as *const i32 as usize % 16 == 0)`. The actual
+location is not known at compile-time, so we cannot promote this. Generally, we
+can guarantee const-safety by not promoting when an unsafe or unconst operation
+is performed. However, things get more tricky when `const` and `const fn` are
+involved.
 
-An example of this would be `&1 as *const i32 as usize == 42`. While it is highly
-unlikely that the address of temporary value is `42`, at runtime this could be true.
+For `const`, based on the const safety check described [here](const_safety.md),
+we can rely on there not being const-unsafe values in the `const`, so we should
+be able to promote freely.
 
-The exact details for `const safety` are discussed in [here](const_safety.md).
\ No newline at end of file
+For `const fn`, there is no way to check anything in advance. We can either
+just not promote, or we can move responsibility to the `const fn` and promote
+*if* all function arguments pass the const safety check. So, `foo(42usize)`
+would get promoted, but `foo(&1 as *const i32 as usize)` would not. When this
+call panics, compilation proceeds and we just hard-code a panic to happen as
+well at run-time. However, when const evaluation fails with another error, we
+have no choice but to abort compilation of a program that would have compiled
+fine if we would not have decided to promote. It is the responsibility of `foo`
+to not fail this way when working with const-safe arguments.

From df4e2fec193baf2060bc3b8673de429aa719adb5 Mon Sep 17 00:00:00 2001
From: Ralf Jung
Date: Thu, 30 Aug 2018 10:14:55 +0200
Subject: [PATCH 2/3] extend promotion discussion

---
 promotion.md | 80 ++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 62 insertions(+), 18 deletions(-)

diff --git a/promotion.md b/promotion.md
index c50731f..069439d 100644
--- a/promotion.md
+++ b/promotion.md
@@ -6,11 +6,14 @@ arithmetic.
 
 ## Rules
 
-### 1. No side effects
+### 1.
+Panics
 
 Promotion is not allowed to throw away side effects. This includes
-panicking. let us look at what happens when we promote `&(0_usize - 1)`:
-In the MIR, this looks roughly like
+panicking. Let us look at what happens when we promote `&(0_usize - 1)` in a
+debug build: We have to avoid erroring at compile-time (because that would be
+promotion breaking compilation), but we must be sure to error correctly at
+run-time. In the MIR, this looks roughly like
+
 ```
 _tmp1 = CheckedSub (const 0usize) (const 1usize)
 assert(!_tmp1.1) -> [success: bb2; unwind: ..]
@@ -19,6 +22,7 @@ bb2:
 _tmp2 = _tmp1.0
 _res = &_tmp2
 ```
+
 Both `_tmp1` and `_tmp2` are promoted to statics. `_tmp1` evaluates to `(~0,
 true)`, so the assertion will always fail at run-time. Computing `_tmp2` fails
 with a panic, which is thrown away -- so we have no result. In principle, we
@@ -26,27 +30,67 @@ could generate any code for this because we know the code is unreachable (the
 assertion is going to fail). Just to be safe, we generate a call to
 `llvm.trap`.
 
+As long as CTFE only panics when run-time code would also have panicked, this
+works out correctly: The MIR already contains provisions for what to do on
+panics (unwind edges etc.), so when CTFE panics we can generate code that
+hard-codes a panic to happen at run-time. In other words, *promotion relies on
+CTFE correctly implementing both normal program behavior and panics*. An
+earlier version of miri used to panic on arithmetic overflow even in release
+mode. This breaks promotion, because now promoting code that would work (and
+could not panic!) at run-time leads to a compile-time CTFE error.
+
 ### 2. Const safety
 
-Only const safe code gets promoted. The exact details for `const safety` are
-discussed in [here](const_safety.md).
+We have explained what happens when evaluating a promoted panics, but what about
+other kinds of failure -- what about hitting an unsupported operation or
+undefined behavior?
+To make sure this does not happen, only const safe code
+gets promoted. The exact details for `const safety` are discussed in
+[here](const_safety.md).
 
 An example of this would be `&(&1 as *const i32 as usize % 16 == 0)`. The actual
 location is not known at compile-time, so we cannot promote this. Generally, we
 can guarantee const-safety by not promoting when an unsafe or unconst operation
-is performed. However, things get more tricky when `const` and `const fn` are
-involved.
+is performed -- if our const safety checker is correct, that has to cover
+everything, so the only possible remaining failures are panics.
+
+However, things get more tricky when `const` and `const fn` are involved.
 
 For `const`, based on the const safety check described [here](const_safety.md),
 we can rely on there not being const-unsafe values in the `const`, so we should
-be able to promote freely.
-
-For `const fn`, there is no way to check anything in advance. We can either
-just not promote, or we can move responsibility to the `const fn` and promote
-*if* all function arguments pass the const safety check. So, `foo(42usize)`
-would get promoted, but `foo(&1 as *const i32 as usize)` would not. When this
-call panics, compilation proceeds and we just hard-code a panic to happen as
-well at run-time. However, when const evaluation fails with another error, we
-have no choice but to abort compilation of a program that would have compiled
-fine if we would not have decided to promote. It is the responsibility of `foo`
-to not fail this way when working with const-safe arguments.
+be able to promote freely. For example:
+
+```rust
+union Foo { x: &'static i32, y: usize }
+const A: usize = unsafe { Foo { x: &1 }.y };
+const B: usize = unsafe { Foo { x: &2 }.y };
+let x: &bool = &(A < B);
+```
+
+Promoting `x` would lead to a compile failure because we cannot compare pointer
+addresses.
+However, we do not even get there -- computing `A` or `B` fails with
+a const safety check error because these are values of type `usize` that contain
+a `Scalar::Ptr`.
+
+For `const fn`, however, there is no way to check anything in advance. We can
+either just not promote, or we can move responsibility to the `const fn` and
+promote *if* all function arguments pass the const safety check. So,
+`foo(42usize)` would get promoted, but `foo(&1 as *const i32 as usize)` would
+not. When this call panics, compilation proceeds and we just hard-code a panic
+to happen as well at run-time. However, when const evaluation fails with
+another error (unsupported operation or undefined behavior), we have no choice
+but to abort compilation of a program that would have compiled fine had we
+not decided to promote. It is the responsibility of `foo` to not fail this
+way when working with const-safe arguments.
+
+### 3. Drop
+
+TODO: Fill this with information.
+
+### 4. Interior Mutability
+
+TODO: Fill this with information.
+
+## Open questions
+
+* There is a fourth kind of CTFE failure -- an endless loop being detected.
+  What do we do when that happens while evaluating a promoted?

From 4e9cb4cc66b3c21b0bafadbad2077dfba44bc9c2 Mon Sep 17 00:00:00 2001
From: Ralf Jung
Date: Thu, 30 Aug 2018 15:55:23 +0200
Subject: [PATCH 3/3] loop detection is best-effort

---
 const_safety.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/const_safety.md b/const_safety.md
index ca63a1a..ac703c5 100644
--- a/const_safety.md
+++ b/const_safety.md
@@ -9,7 +9,8 @@ four possible ways:
 * The program causes undefined behavior (e.g., dereferencing an out-of-bounds
   pointer).
 * The program panics (e.g., a failed bounds check).
-* The program loops forever, and this is detected by the loop detector.
+* The program loops forever, and this is detected by the loop detector. Note
+  that this detection happens on a best-effort basis only.
 
 Just like panics and non-termination are acceptable in safe run-time Rust code,
 we also consider these acceptable in safe compile-time Rust code. However, we