-
Notifications
You must be signed in to change notification settings - Fork 221
Add support for custom bakes to databake #6576
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback. |
Things I'd like feedback on:
|
/// To bake to a different type than this, use `custom_bake` | ||
/// and implement `CustomBake`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having a trait for this is overkill if you can provide a method to the macro
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasons I made it a trait:
- Gives a place to enforce the strange safety requirement in the unsafe version
- And since we have it for unsafe, we can also use it for safe
How do you suggest handling the safety requirement without a trait?
/// | ||
/// #[derive(Bake)] | ||
/// #[databake(path = bar::module)] | ||
/// #[databake(path = custom_bake)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// #[databake(path = custom_bake)] | |
/// #[databake(custom_bake = Self::bake_to_bytes)] |
So, the promised "complicated thoughts": By and large I disprefer adding new types and traits. On the other hand, adding attributes/toggles/etc to a derive are something I think is a good way to achieve goals. As such, the original proposed design seemed pretty good to me. So my first reaction to this PR was "we should go back to the original proposal that we agreed on". Or use So basically something where we can specify to/from functions or use a preexisting trait. So But that works for safe conversions. I agree that this is not as good for unsafe conversions. For an unsafe conversion the bare minimum is that the macro should mention But if you have to implement a custom trait, I once again go back to comparing it with the motivation of reducing boilerplate: isn't the whole idea to remove custom impls? I thought about it more, and concluded that replacing a Putting all of this together, I end up with:
Looking at the existing trait I'm not really a fan of the nonlocality of the guarantees, referencing the existence of an inherent method. How about a single trait: /// Safety: implementation is valid if from_baked is always safe when fed values from to_bake
unsafe trait CustomBakeConversions {
type Baked<'a>: Bake;
/// Allowed to panic
fn to_baked(&'a self) -> Self::Bake<'a>;
/// Safety: called on values produced by to_baked
unsafe fn from_baked(baked: Self::Bake) -> Self;
} invoked with And then for the "safe" bake we have MVP Thoughts? |
can't use a trait for const construction, which is the actual unsafe part (to_baked is safe) |
This has been my position and I appreciate your eloquence. ❤️
My position is that since we need a trait for the unsafe, then it's harmless to support it in safe mode. The unsafe trait can just be an extension on the safe trait, as proposed in this PR. At least, it can be the default behavior when
I would prefer a single trait, but the constructor can't be on a trait, at least not until we have const traits. |
argh. okay, fine, the indirect const function is acceptable, but I still don't like it.
I think this is a bad trait. It has a strange nonlocal guarantee1 and it's mimicking existing Rust conversion traits. We need it for proper unsafe hygeine, which makes me marginally okay with having it: unfortunately the bad trait is the best we can do. We do not need it for the other things. I do not want to introduce a second bad trait that is only there for consistency. In the long run, we can probably have the first trait be If the options on the table are two traits or not doing this at all I prefer not doing this at all. I accept the motivation of this change, I do not accept that it overrides all other concerns, and I think having a largely extraneous trait is where I draw the line. I would prefer to solve this without new traits at all, but safety hygeine forces us to have at least one, which I begrudgingly accept. I don't want to stretch that to two traits. Footnotes
|
I don't think I agree with From/Into being a long-term goal we want to work toward. <reasoning>
Traits are cheap, clean, and easy to understand. I've moved much more toward "favor use-case-specific traits" than "try and shoehorn some existing trait into a use case that doesn't exactly match". However, I still think that this is cleaner using some trait, rather than making the proc macro more complicated. As you know, I would rather us work toward getting rid of proc macros. There seemed to be consensus at RustWeek that proc macros are bad, because the pull in Syn, require running code at build time, are hard for tooling like rust-analyzer, etc. This is a theme that came up again and again. There was desire to eventually move toward |
If we wait for const traits to land, which I'm told by the lang team should be some time in the not-too-distant future, would you approve this with a trait-based solution? You listed two reasons you don't prefer traits:
Const traits will fix (1). See my previous post for why I think we should not aim for (2). |
This is an extremely long term goal and I do not think we will get close to this any time soon (in the next five years or so). I've seen a lot of this desire, but the actual design work for this is extremely nascent1, and macro work has historically taken ages to occur. We still do not have "Macros 2.0", a nine-year old feature proposal that is still actively desired and occasionally worked on. The flexibility of the macro system overall makes it very tricky to evolve: I do not begrudge the Rust team their time in working on this, but I also expect very little when it comes to large macro system improvements. Given that, I disagree with "what we can do for now is avoid overcomplicating our proc macro". Something that is >5 years out in the future, potentially even 10 years out in the future, is not something I find it useful (or even possible?) to design towards. When that time comes near, we can perform a proper holistic redesign of databake. Until then I don't find it useful to prevent ourselves from certain design patterns because they will need to change at that level; we cannot truly predict what will and won't be complicated in that future. Furthermore I think designing proc macros with good UX now would be helpful in informing what use cases the lang team should consider when designing a declarative macro future. I very explicitly did not try and design I feel the same way about our proc macros and a potential future with more powerful decl macros. For zerovec, I am interested in ways to supplement zerovec with currently-possible decl macros to improve the dependency situation. But databake is not a normal runtime dep so I'm not super interested in databake decl macros unless we can replace it completely with decl macros (with decent UX), and I don't think decl macros are currently at that point. Eventually when we have fancy decl macros that can do this type of thing well, I'd love to try and use them, and revisit decisions like these. 5+ years is a wonderful time to perform a new holistic design.
I'd still be hesitant. My preference in databake and zerovec is if we are already using a proc macro, then we should use proc macro attribute configs as much as possible before adding new items to the public API. We have more flexibility with those attributes, and can play around with it and arrive on better holistic designs much more easily. This is a reason I have not yet stated in this thread, but I have stated before when it comes to additions to zerovec. I'll also note: the problem with a full bidirectional trait without the indirectness is that the crate now needs an unconditional runtime dependency on I recognize that I've both expressed a dislike for the indirectness and just now expressed why not having the indirectness is also bad; but this is why I disprefer traits here. Given the runtime dependency problem I think my preferred path forward is to have an "indirect" Footnotes
|
For documentation, I still think I prefer a trait for the safe case (as well as the unsafe case) because
But, it's not a good use of time to debate that point further. I assume that we will move forward with a macro-selected constructor instead of a trait for the safe case. "the crate now needs an unconditional runtime dependency on databake" => true, I hadn't considered this. But, you agree that a trait is needed for the unsafe case. Is this a fatal flaw / should I close the PR, or should I move forward despite it? Should we consider an alternative such as having the proc macro paste the unsafe trait definition locally inside the file? |
I think that is a major-but-not-fatal flaw for the unsafe trait; and makes me prefer just having unsafe users use a custom As dissected before, there are two motivations here:
I think the boilerplate one is well accepted, and is solved wonderfully with the macro solution. In the unsafe trait case I am more hesitant because the trait guarantee is strangely nonlocal and users need to depend on Also, fwiw, while So I think you could either close this PR, land the attribute just for the safe case, or land a safe attribute and an unsafe trait. I prefer not doing the unsafe trait but I am ultimately okay with it. I definitely want the safe attribute if possible.
minor note: not to further this discussion, but personally I don't want utils crate design to be driven by things like this; this is ICU4X API policy and we could easily solve this problem by deciding on a policy for ICU4X. Could even be documented in |
What did you think of this suggestion:
The trait would be private locally in each crate with bakeable structs and its only purpose would be to require the caller to write an explicit |
Oh, sorry, forgot to address that. That's an interesting proposition. The main problem I see is that there won't be source code for unsafe reviewers to look at for the invariant, whereas if it's just driven by It's a bit unconventional, but it's sufficiently decoupled that it's probably fine? I still have a preference for implementing the safe version of this and seeing how much we actually need the unsafe version. |
(I'm not aware of any call sites in ICU4X that would be served by the safe version) |
I'm looking around and while I see some safe custom The unsafe ones are by and large concentrated in zerovec/zerotrie/etc. In general I am okay with the utils crates needing to do extra heavy lifting (in fact, I have a slight preference for zerovec to keep its Bake impls because they're clearer from an unsafe perspective, and zerovec does a ton of unsafe already); it's when it starts affecting components that it becomes an issue, because we want components to be easy to write. And within components I mostly see some units/currency and the three collections. I don't see that as much, ultimately. I think an additional thing that colors my opinion here is that while we want components to be easy to write, unsafe code is already hard to get right and I'm not overly concerned with friction there. We do not have the best reputation for avoiding silly unsafe code mistakes with code review (e.g. #6805); we sometimes catch things but not always. I'm generally pleased with the quality of our existing unsafe code; especially the utils code, we've put a lot of work into it, gotten it reviewed, and soundness/security issues tend to have a half-life for going undetected. An ICU4X contributor who needs unsafe bake needing to ask someone with unsafe/proc macro experience to write it for them is not necessarily a bad thing. |
The impls I've mostly been looking at are things like PackedPattern, PatternItem, etc. Things with safety invariants and that might have a VarULE inside. |
I think it is valuable and easy to review for the safety invariant to be "the input to this function MUST be the output of that function". It is easy to verify correctness. Having to reason about this in the middle of a TokenStream is worse for everyone. |
Yes, but this invariant won't be listed anywhere in the code, if we generate a trait. Tracking it down is tricky. If we don't generate a trait, then we have the runtime dep problem, and the invariant is still nonlocal in a strange way.
None of these types have unsafe bake impls right now, as far as I can tell. |
I generally agree, but with the tokenstream at least the invariants are all in the same file and relate to each other in straightforward ways. The unsafe Bake impls we have so far are quite straightforward. |
What if databake produces an type BakeParts = ...;
impl Thing {
#[doc(hidden)]
pub fn to_bake_parts_v1(&self) -> BakeParts { ... }
/// # Safety
/// The parameter MUST have been returned by to_bake_parts_v1
#[doc(hidden)]
pub const unsafe fn from_bake_parts_v1(parts: BakeParts) -> Self { ... }
}
// Safety: from_bake_parts_v1 accepts the return value of to_bake_parts_v1
unsafe {
databake::impl_from_parts_unsafe!(Thing, to_bake_parts_v1 => from_bake_parts_v1);
} |
I don't understand what is and isn't generated there. I'm in favor of a |
My thought was that |
Yes, that's standard macro naming practice. I wish there were a better way to do unsafe macros, but it's ... fine. |
Do not rely on clippy lints triggering within macros; they are often explicitly disabled within macros depending on the lint |
#2452