| 1 | +- Start Date: (fill me in with today's date, YYYY-MM-DD) |
| 2 | +- RFC PR: (leave this empty) |
| 3 | +- Rust Issue: (leave this empty) |
| 4 | + |
| 5 | + |
| 6 | +# Summary |
| 7 | + |
| 8 | +Change the address-of operator (`&`) to a borrow operator. This is an |
| 9 | +alternative to #241 and #226 (cross-borrowing coercions). The borrow operator |
| 10 | +would perform as many dereferences as possible and then take the address of the |
| 11 | +result. The result of `&expr` would always have type `&T` where `T` does not |
| 12 | +implement `Deref`. |
| 13 | + |
| 14 | + |
| 15 | +# Motivation |
| 16 | + |
| 17 | +In Rust the concept of ownership is more important than the precise level of |
| 18 | +indirection. Whilst it is important to distinguish between values and references |
| 19 | +for performance reasons, Rust's ownership model means it is less important to |
| 20 | +know how many levels of indirection are involved in a reference. |
| 21 | + |
| 22 | +It is annoying to have to write out `&*`, `&**`, etc. to convert from one |
| 23 | +pointer kind to another. It is not really informative and just makes reading and |
| 24 | +writing Rust more painful. |
| 25 | + |
| 26 | +It would be nice to strongly enforce the principle that the first type a |
| 27 | +programmer should think of for a function signature is `&T` and to discourage |
| 28 | +use of types like `&Box<T>` or `Box<T>`, since these are less general. However, |
| 29 | +that generality is somewhat lost if the user of such functions has to consider |
| 30 | +how to convert to `&T`. |
| 31 | + |
| 32 | + |
| 33 | +# Detailed design |
| 34 | + |
| 35 | +Writing `&expr` has the effect of dereferencing `expr` as many times as possible |
| 36 | +(by calling `deref` from the `Deref` trait or by doing a compiler-built-in |
| 37 | +dereference) and taking the address of the result. |
| 38 | + |
| 39 | +Where `T` is some type that does not implement `Deref`, `&x` will have type `&T` |
| 40 | +if `x` has type `T`, `&T`, `Box<T>`, `Rc<T>`, `&Rc<T>`, `Box<&Rc<Box<&T>>>`, and |
| 41 | +so forth. |
| 42 | + |
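To make the table of types above concrete, here is a minimal sketch in today's Rust that spells out, with explicit `*`, the dereferences the proposed `&x` would perform for several of the listed types. `Point`, `read_x`, and the `from_*` functions are hypothetical names introduced only for illustration; `Point` stands in for a `T` that does not implement `Deref`.

```rust
use std::rc::Rc;

// Hypothetical stand-in for a type `T` that does not implement `Deref`.
struct Point {
    x: i32,
}

// A function that takes the most general borrowed form, `&T`.
fn read_x(p: &Point) -> i32 {
    p.x
}

// `x: T`: a plain address-of already yields `&T`.
fn from_value() -> i32 {
    let v = Point { x: 1 };
    read_x(&v)
}

// `x: Box<T>`: one dereference, then address-of.
fn from_box() -> i32 {
    let b: Box<Point> = Box::new(Point { x: 2 });
    read_x(&*b) // under this proposal: `read_x(&b)`
}

// `x: &Rc<T>`: two dereferences, then address-of.
fn from_ref_rc() -> i32 {
    let rc = Rc::new(Point { x: 3 });
    let r: &Rc<Point> = &rc;
    read_x(&**r) // under this proposal: `read_x(&r)`
}

// `x: Box<&Rc<Box<&T>>>`: five dereferences, then address-of.
fn from_nested() -> i32 {
    let p = Point { x: 4 };
    let inner: Box<&Point> = Box::new(&p);
    let rc: Rc<Box<&Point>> = Rc::new(inner);
    let nested: Box<&Rc<Box<&Point>>> = Box::new(&rc);
    // Box -> & -> Rc -> Box -> & -> Point
    read_x(&*****nested) // under this proposal: `read_x(&nested)`
}
```

In every case the argument has type `&Point`; only the number of `*` the programmer has to count differs, which is the bookkeeping this proposal is meant to remove.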
| 43 | +`&mut expr` would behave the same way but take a mutable reference as the final |
| 44 | +step. The expression would have type `&mut T`. The usual rules for dereferencing |
| 45 | +and taking a mutable reference would apply, so the programmer cannot subvert |
| 46 | +Rust's mutability invariants. |
| 47 | + |
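The mutable case can be sketched the same way. The example below is written in today's Rust with explicit dereferences; the comments show what the call would look like under this proposal. `bump`, `bump_boxed`, and `bump_through_ref` are hypothetical names used only for illustration.

```rust
// Takes the most general mutable borrowed form, `&mut T`.
fn bump(n: &mut i32) {
    *n += 1;
}

// `b: Box<i32>`: one dereference, then a mutable borrow.
fn bump_boxed() -> i32 {
    let mut b: Box<i32> = Box::new(0);
    bump(&mut *b); // under this proposal: `bump(&mut b)`
    *b
}

// `r: &mut Box<i32>`: two dereferences, then a mutable borrow.
fn bump_through_ref() -> i32 {
    let mut b: Box<i32> = Box::new(10);
    let r: &mut Box<i32> = &mut b;
    bump(&mut **r); // under this proposal: `bump(&mut r)`
    *b
}
```

The usual rules still apply: if `r` had type `&Box<i32>` instead, `&mut **r` would be rejected because the path goes through a shared reference, and the proposed `&mut r` would be rejected for the same reason.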
| 48 | +No coercions may be applied to `expr` in `&expr`, but they may be applied to |
| 49 | +`&expr` if it would otherwise be possible. |
| 50 | + |
| 51 | +Raw pointers would not be dereferenced by `&`. We expect raw pointer |
| 52 | +dereferences to be explicit and to be in an unsafe block. So if `x` has type |
| 53 | +`&Box<*Gc<T>>`, then `&x` would have type `&*Gc<T>`. Alternatively, we could |
| 54 | +make attempting to dereference a raw pointer using `&` a type error, so `&x` |
| 55 | +would give a type error and a note advising to use explicit dereferencing. |
| 56 | + |
| 57 | +Writing `&(expr)` (and similarly for `&mut(expr)`) will have the effect of |
| 58 | +taking the address of `expr` (the current semantics of `&expr`). If `expr` has |
| 59 | +type `U`, for any `U`, then `&(expr)` will have type `&U`. This syntax is not |
| 60 | +the greatest, and I'm very open to other suggestions. In particular writing |
| 61 | +`&(some_big_expression)` will give the address-of not borrow behaviour, which |
| 62 | +might be confusing. In practice, I hope this works, since when doing explicit |
| 63 | +referencing/dereferencing, people often use brackets (e.g., `(&*x).foo()` would |
| 64 | +become `(&(*x)).foo()`). I hope these cases are very rare. It is only necessary |
| 65 | +when you need an expression to have type `&Rc<T>` or similar, and when that |
| 66 | +expression is not the receiver of a method call. |
| 67 | + |
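As a rough illustration of when the address-of form is still needed, the sketch below uses today's Rust, where `&rc` has type `&Rc<i32>` (what `&(rc)` would mean under this proposal) and `&*rc` has type `&i32` (what `&rc` would mean). `read`, `share`, and `demo` are hypothetical names.

```rust
use std::rc::Rc;

// Only needs the value, so it takes `&T`.
fn read(v: &i32) -> i32 {
    *v
}

// Needs the smart pointer itself in order to create a second owner.
fn share(handle: &Rc<i32>) -> Rc<i32> {
    Rc::clone(handle)
}

fn demo() -> (i32, usize) {
    let rc = Rc::new(7);
    // Borrow of the pointee. Under this proposal: `read(&rc)`.
    let value = read(&*rc);
    // Address of the `Rc` itself. Under this proposal: `share(&(rc))`.
    let second = share(&rc);
    // Both `rc` and `second` are alive here, so there are two owners.
    (value, Rc::strong_count(&second))
}
```

Only the call to `share` needs an expression of type `&Rc<T>` outside of a method-call receiver, which is the case described above as rare.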
| 68 | + |
| 69 | +# Drawbacks |
| 70 | + |
| 71 | +Arguably, we should be very explicit about indirection in a systems language, |
| 72 | +and this proposal blurs that distinction somewhat. |
| 73 | + |
| 74 | +When a function _does_ want to borrow an owning reference (e.g., takes a |
| 75 | +`&Box<T>` or `&mut Vec<T>`), it would be more painful to call that function. I |
| 76 | +believe this situation is rare, however. |
| 77 | + |
| 78 | + |
| 79 | +# Alternatives |
| 80 | + |
| 81 | +Take this proposal, but use a different operator. This new operator would have |
| 82 | +the semantics proposed here for `&`, and `&` would continue to be an address-of |
| 83 | +operator. |
| 84 | + |
| 85 | +There are two RFCs for different flavours of cross-borrowing: #226 and #241. |
| 86 | + |
| 87 | +#226 proposes sugaring `&*expr` as `expr` by doing a dereference and then an |
| 88 | +address-of. This converts any pointer-like type to a borrowed reference. |
| 89 | + |
| 90 | +#241 proposes sugaring `&*n expr` to `expr` where `*n` means any number of |
| 91 | +dereferences. This converts any borrowed pointer-like type to a borrowed |
| 92 | +reference, erasing multiple layers of indirection. |
| 93 | + |
| 94 | +At a high level, #226 privileges the level of indirection, and #241 privileges |
| 95 | +ownership. This RFC is closer to #241 in spirit, in that it erases multiple |
| 96 | +layers of indirection and privileges ownership over indirection. |
| 97 | + |
| 98 | +All three proposals mean less fiddling with `&` and `*` to get the type you want |
| 99 | +and none of them erase the difference between a value and a reference (as auto- |
| 100 | +borrowing would). |
| 101 | + |
| 102 | +In many cases this proposal and #241 give similar results. The difference is |
| 103 | +that this proposal is linked to an operator and is type independent, whereas |
| 104 | +#241 is implicit and depends on the required type. An example which type checks |
| 105 | +under #241, but not this proposal is: |
| 106 | + |
| 107 | +``` |
| 108 | +fn foo(x: &Rc<T>) { |
| 109 | + let y: &T = x; |
| 110 | +} |
| 111 | +``` |
| 112 | + |
| 113 | +Under this proposal you would use `let y = &x;`. |
| 114 | + |
| 115 | +I believe the advantages of this approach vs an implicit coercion are: |
| 116 | + |
| 117 | +* better integration with type inference (note no explicit type in the above |
| 118 | + example); |
| 119 | +* more easily predictable and explainable behaviour (because we always do |
| 120 | + as many dereferences as possible, cf. a coercion which does _some_ number of |
| 121 | + dereferences, dependent on the expected type); |
| 122 | +* does not complicate the coercion system, which is already fairly complex and |
| 123 | + obscure (RFC on this coming up soon, btw). |
| 124 | + |
| 125 | +The principal advantage of the coercion approach is flexibility, in particular |
| 126 | +in the case where we want to borrow a reference to a smart pointer, e.g. |
| 127 | +(aturon), |
| 128 | + |
| 129 | +``` |
| 130 | +fn wants_vec_ref(v: &mut Vec<u8>) { ... } |
| 131 | + |
| 132 | +fn has_vec(mut v: Vec<u8>) { |
| 133 | + wants_vec_ref(&mut v); // coercing Vec to &mut Vec |
| 134 | +} |
| 135 | +``` |
| 136 | + |
| 137 | +Under this proposal `&mut v` would have type `&mut [u8]` so we would fail type |
| 138 | +checking (I actually think this is desirable because it is more predictable, |
| 139 | +although it is also a bit surprising). Instead you would write `&mut(v)`. (This |
| 140 | +example assumes `Deref` for `Vec`, but the point stands without it, in general). |
| 141 | + |
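The two readings of `&mut v` can be sketched in today's Rust, where the address-of form is written `&mut v` and the single dereference through `Deref` is written `&mut *v`. Like the example above, this relies on `Vec<u8>` dereferencing to `[u8]`, which the current standard library provides. `wants_slice` and the bodies of both callees are hypothetical additions for illustration.

```rust
// Hypothetical callee that needs the owning container itself (it grows it).
fn wants_vec_ref(v: &mut Vec<u8>) {
    v.push(1);
}

// Hypothetical callee that only needs the elements.
fn wants_slice(s: &mut [u8]) {
    if let Some(first) = s.first_mut() {
        *first = 9;
    }
}

fn has_vec(mut v: Vec<u8>) -> Vec<u8> {
    // Address-of: type `&mut Vec<u8>`. Under this proposal: `&mut(v)`.
    wants_vec_ref(&mut v);
    // Dereference, then borrow: type `&mut [u8]`. Under this proposal: `&mut v`.
    wants_slice(&mut *v);
    v
}
```

Calling `has_vec(vec![0])` first pushes `1` through the `&mut Vec<u8>` and then overwrites the first element through the `&mut [u8]`, so the two borrows are visibly different types with different capabilities.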
| 142 | + |
| 143 | +# Unresolved questions |
| 144 | + |
| 145 | +Can we do better than `&(expr)` syntax for address-of? |
| 146 | + |
| 147 | + |
| 148 | +## Slicing |
| 149 | + |
| 150 | +There is a separate question about how to handle the `Vec<T>` -> `&[T]` and |
| 151 | +`String` -> `&str` conversions. We currently support this conversion by calling |
| 152 | +the `as_slice` method or using the empty slicing syntax (`expr[]`). If we want, |
| 153 | +we could implement `Deref<[T]>` for `Vec<T>` and `Deref<str>` for `String`, |
| 154 | +which would allow us to convert using `&*expr`. With this RFC, we could convert |
| 155 | +using `&expr` (with RFC #226 the conversion would be implicit). |
| 156 | + |
| 157 | +The question is really about `Vec`, `String`, and `Deref`, and is mostly |
| 158 | +orthogonal to this RFC. As long as we accept this or one of the cross-borrowing |
| 159 | +RFCs, then `Deref` could give us 'nice' conversions from `Vec` and `String`. |
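A small sketch of the conversions discussed in this section, assuming a toolchain in which `Vec<T>` dereferences to `[T]` and `String` dereferences to `str` (the current standard library does this). It uses `as_slice` and the explicit `&*expr` form; the comments mark what this RFC would let you write instead. `sum`, `shout`, `convert_vec`, and `convert_string` are hypothetical names.

```rust
fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn convert_vec() -> i32 {
    let v = vec![1, 2, 3];
    // Conversion by method call.
    let a: &[i32] = v.as_slice();
    // Conversion through `Deref`; the type `&[i32]` is inferred.
    // Under this RFC: `let b = &v;`.
    let b = &*v;
    sum(a) + sum(b)
}

fn convert_string() -> String {
    let s = String::from("abc");
    // Conversion through `Deref` to `&str`. Under this RFC: `shout(&s)`.
    shout(&*s)
}
```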