&str and &[u8] have the same layout #1848
@rustbot label: +I-lang-nominated +T-lang
@rustbot label: +I-lang-easy-decision
Unknown labels: I-lang-easy-decision
FTR, the standard library has every right to make assumptions about the implementation of the language beyond what the language guarantees, because it is intrinsically tied to rustc. Not necessarily a point against making a decision here, but I don't think it's a strong point in favour of stabilizing the equivalence either.
Agreed -- I am actually going to reword this to make it clear that I mean this isn't a change for rustc, only a codification of existing decisions.
@@ -110,6 +110,7 @@ r[layout.str]
## `str` Layout

String slices are a UTF-8 representation of characters that have the same layout as slices of type `[u8]`.
A reference `&str` has the same layout as a reference `&[u8]`.
This statement should probably be generalized to be about all primitive pointer types (`&`, `&mut`, `*const`, `*mut`), not only `&`.
Unnecessary IMO because of https://doc.rust-lang.org/reference/type-layout.html#r-layout.pointer.intro
Technically unnecessary, yes, but I bet it’ll be confusing.
What do you think about adding a link afterwards, such as:
A reference `&str` has the same layout as a reference `&[u8]`.
> [!NOTE]
See [pointer layouts](https://doc.rust-lang.org/reference/type-layout.html#r-layout.pointer.intro) for more information on the layout rules of references in general.
That would point people in the right direction without increasing the maintenance burden of the spec or confusing readers by over-specification (I would, for example, be confused why the `str` section is repeating something in the pointer section -- it would make me think that perhaps there's something special about `&str`'s relationship to `&mut str` and `&mut [u8]`, when that isn't the case at all).
I think that the right thing in general is to have a single formal term which the Reference uses everywhere for “all primitive pointer types” and can link to an appropriate definition, but there isn’t such a term now, so that doesn’t help this PR. I take your point about avoiding repetition, though, so maybe this is a problem to solve entirely separately. Feel free to mark this conversation resolved.
(I say “primitive pointer types” to distinguish from “smart pointer” types which are not guaranteed to have the same layout.)
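As a rough illustration of the generalization discussed in this thread, the sketch below checks that each primitive pointer type to `str` has the same size and alignment as its `[u8]` counterpart. The helper `same_size_and_align` is a hypothetical name introduced here. Matching size and alignment is necessary but not sufficient for identical layout, so this is only a sanity check against a given compiler, not evidence of a language guarantee.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: true when `A` and `B` agree on size and alignment.
/// This does not prove the two types have the same layout.
fn same_size_and_align<A, B>() -> bool {
    size_of::<A>() == size_of::<B>() && align_of::<A>() == align_of::<B>()
}

fn main() {
    // The four primitive pointer types named in the review comment.
    assert!(same_size_and_align::<&str, &[u8]>());
    assert!(same_size_and_align::<&mut str, &mut [u8]>());
    assert!(same_size_and_align::<*const str, *const [u8]>());
    assert!(same_size_and_align::<*mut str, *mut [u8]>());
    println!("all pointer pairs agree on size and alignment");
}
```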
Currently, `str` and `[u8]` are promised to have the same layout, but `&str` and `&[u8]` are not promised to have the same layout. `std` currently assumes that they are promised to have the same layout (https://doc.rust-lang.org/src/core/str/converts.rs.html#172), so this change would have no impact beyond codifying what is already in practice. This PR defines `&str` and `&[u8]` to have the same layout, though what that layout is continues to be unspecified.

There are some further steps here that I didn't take:

...`str`. I have added `str` in several places in the reference where it otherwise referred to slices, but likely the definition of a slice should also simply include `str`. This is a bigger conversation and frankly unimportant if...`str` into a libcore struct (redux) rust#107939 ever gets stabilized. In that case, all of this doesn't matter and `str` would be removed from the reference. This seems to me to be obviously the better choice.

In any case, this PR represents a fairly incrementalist approach.
Thanks for the insight of those on the Zulip thread here
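
For readers who want to see the kind of assumption the PR description refers to, here is a minimal sketch of a byte-slice-to-string conversion that is only sound if `&[u8]` and `&str` share a layout. The function name `bytes_as_str` is hypothetical and this is not the standard library's code; it only mirrors the general shape of an unchecked conversion built on `mem::transmute`.

```rust
use std::mem;

/// Hypothetical helper, not std's implementation.
///
/// # Safety
/// `bytes` must be valid UTF-8.
unsafe fn bytes_as_str(bytes: &[u8]) -> &str {
    // Assumption: `&[u8]` and `&str` have the same layout, which is the
    // guarantee this PR proposes to write down.
    unsafe { mem::transmute::<&[u8], &str>(bytes) }
}

fn main() {
    let bytes: &[u8] = b"hello";
    // SAFETY: an ASCII byte string literal is valid UTF-8.
    let s = unsafe { bytes_as_str(bytes) };

    assert_eq!(s, "hello");
    // The view changes type but keeps the same length and data pointer.
    assert_eq!(s.len(), bytes.len());
    assert_eq!(s.as_ptr(), bytes.as_ptr());
}
```

In ordinary code, `std::str::from_utf8` (checked) or `std::str::from_utf8_unchecked` is the better choice; the sketch only makes the layout dependency visible.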