Skip to content

RFC: improve CString construction methods #912

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
73 changes: 73 additions & 0 deletions text/0000-c-string-construction.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
- Feature Name: c_string_from_iter
- Start Date: 2015-02-26
- RFC PR:
- Rust Issue:

# Summary

This is currently a fallback proposal in case
[generic conversion traits](https://github.com/rust-lang/rfcs/pull/529)
are not adopted. The changes proposed here amend the methods available to
construct a `CString` for more flexibility and better possibilities for
optimization.

# Motivation

The implementation of RFCs [#592][rfc 592] and [#840][rfc 840] has resolved
most of the issues with the design of `std::ffi::CString`, but some new APIs
have been created at the same time, needing stabilization. This proposal
aims at addressing the following issues:

1. The `IntoBytes` trait does not seem wholly justified: it falls short of
supporting `IntoIterator`, and has become yet another special-interest trait
to care about when designing APIs producing string-like data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this trait was introduced but largely only as a temporary measure. With generic conversion traits this would be much less ad-hoc. The semi-current-plan is to leave IntoBytes as #[unstable] until we have conversion traits at which point we can switch to using those.


2. The exposure of `Vec` as an intermediate type for all conversions
precludes small-string optimizations, such as an in-place variant
implemented in [c_string](https://github.com/mzabaluev/rust-c-str).

[rfc 592]: https://github.com/rust-lang/rfcs/pull/592
[rfc 840]: https://github.com/rust-lang/rfcs/pull/840

# Detailed design

Replace `IntoBytes` with trait `IntoCString`, with the return type
of the conversion method changed to `Result<CString, NulError>`.
All implementations of `IntoBytes` are converted to `IntoCString`,
and the generic bound on parameter of `CString::new` is changed to
`IntoCString`.

A constructor accepting `IntoIterator` should also be added,
following the `from_iter` pattern in collection types:

```rust
impl CString {
pub fn from_iter<I>(iterable: I) -> Result<CString, NulError>
where I: IntoIterator<Item=u8>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like from_iter may want a different name as these look a bit odd to me:

CString::from_iter(b"foo");
CString::from_iter("foo");
CString::from_iter(my_string);

(none of them actually have to do with iterators)

Also, with an IntoIterator<Item=u8> bound, I think that this may break CString::new("foo") (e.g. construction from a str) because implementing IntoIterator<Item=u8> for str probably won't get too far. For example it would enable code like:

for byte in str {
    println!("{}", byte);
}

It's quite debatable whether byte should indeed be a byte or perhaps a char (hence the requirement that it must be chosen).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vec has a from_iter method with exactly the same signature, so it should be familiar to every user of Vec (or vice versa, it should be made to look less odd there as well).

this may break CString::new("foo")

That just needs a bit more explicit conversion: CString::from_iter("foo".as_bytes()).
Hence my suggestion to turn IntoBytes to IntoCString as a utility trait to bypass the composition acrobatics and exploit additional optimization.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vec has a from_iter method with exactly the same signature, so it should be familiar to every user of Vec (or vice versa, it should be made to look less odd there as well).

True, but it's manifested through an implementation of the FromIterator trait and isn't the only constructor (e.g. vec!, Vec::new, etc). (maybe this would be a good idea for CString regardless?

That just needs a bit more explicit conversion: CString::from_iter("foo".as_bytes()).

True, but I would personally consider that a concrete drawback (loss in ergonomics).

Hence my suggestion to turn IntoBytes to IntoCString as a utility trait to bypass the composition acrobatics and exploit additional optimization.

This seems plausible to me!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no objections really to more ergonomic constructors. IntoIterator currently seems to be the trait with the widest coverage for generic conversions, but CString cannot implement FromIterator due to the failure mode with NulError.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right I forgot about the Result, that is unfortunate :(

Perhaps though the best route for now is to presently rename IntoBytes to IntoCString? The impl blocks could all be in the c_str module and we wouldn't need to add specialized constructors just yet to CString itself.

{ ... }
}
```

# Proof of concept

As usual for my RFCs concerning `CString`, most of the proposed changes are
implemented on its workalike `CStrBuf` in crate
[c_string](https://github.com/mzabaluev/rust-c-str).

# Drawbacks

None identified.

# Alternatives
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you consider usage of the generic conversion traits to be an acceptable alternative?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes; I wasn't aware of that proposal, and it's OK to use temporary unstable traits awaiting for a proper solution to land. Perhaps the only caveat is that the conversions should be directly to CString, as Vec is a current implementation detail and may be suboptimal. I should update the proposal soon.


Implement [generic conversion traits](https://github.com/rust-lang/rfcs/pull/529).

Living with `IntoBytes` is tolerable as it is.

# Unresolved questions

Stylistic issue: `Result` as return value type of `new` feels a bit too
'loaded'. It's used in some other places, but the general expectation on `new`
is to be the most straightforward way to obtain a value of the type, while more
involved failure modes tend to be more typical on `from_*` constructors
and the like.