-
Notifications
You must be signed in to change notification settings - Fork 183
PE: Add parser for .NET CLR #432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Turns out that CLR imports and tables parser requires very complex implemenetations and I think it should not be implemented in goblin, it's up to consuemrs handle it in upper implementations, it at least should be enough to expose fundamental datas for that. So I am marking this as ready-for-review. |
|
I don't have the time right now to review this, but I assume this is something we want in? if yes, needs rebasing, and i'll try to get to it next week |
|
@m4b Yes, please. Thank you so much for your time :) |
m4b
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really great, and looks like a lot of work, thank you (as usual!); we should remove the repr(c) and then this is ready to go. in general, unless that struct is literally being passed to or from C functions, or something else that is speaking C-ish (e.g., python), you should avoid repr(c) since it forces a particular layout on the struct that rust does/does not have to have, which can sometimes even have performance implications (though is is most likely not relevant at all here)
|
@m4b Done :) |
|
ok this has some minor conflicts, once rebased we can merge! @kkent030315 you've done so much incredible work on the PE end I'd like to add you as a crate author, and also ask if you'd like to be the default PE backend maintainer? We/I don't really have a formal structure in this repo/crate, so what that generally means I guess is that I'd default to your expertise on the PE format in structure and how it's used "in the wild", and otherwise I'd just do basic code review (as I do today), primarily style, sometimes perf, and these days mostly whether something is breaking/some new public interface we may want/not want to support. After a while I might not even do that, and we could talk about merge rights, etc. If you have any other questions, feel free to email me directly as well, and we can correspond in that manner if you like. Anyway, hope you're having a good weekend, and again, thanks for all your hard work! |
|
@m4b Hey, thanks for the proposal about being a maintainer of this great crate. It's better to discuss that matter further on an email and I am definitely interested in. Thank you for spending your time on reviewing my code (uh sometimes I do some idiot things). Much appreciated! (Also done a rebase) |
m4b
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just the two byte preads, and unless good reason, I suggest to parse out the string. After that we can merge, thanks for your patience!
src/pe/clr.rs
Outdated
| pub size: u32, | ||
| /// Indicates the name of the stream as a null-terminated ANSI string. | ||
| /// This field can hold up to 32-bytes long, including the null-terminator. | ||
| pub name: &'a [u8], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in other places in goblin we use a &'a str here; scroll can pass a terminator byte to extract out a string, or in your case you can use DelimiterUntil(0, 32). There might be good reasons you don't want to parse out the string (but since it's null-terminated i don't think there isn't a reason to, e.g., if it was 32-bytes long and that sequence was used as an identifier, which might not be valid utf-8 for example), but if it really is ANSI null terminated, then since every ansi string is a utf-8 string (i'm pretty sure :P), I think it's better to make this a proper &str.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think DelimiterUntil(0, 32) is suitable for this case due to this check1. It will fail if the slice length is less than 32 bytes long even if it is valid. I don't know if this is intentional in scroll, I'd rater expect DelimiterUntil(0, 32) to bahave like min(first_nul_pos, min(slice_len, 32)) instead of being the slice length (32) required.
---- pe::clr::tests::parse_clr_sections stdout ----
[6c, 00, 00, 00, f0, 15, 0d, 00, 23, 7e, 00, 00, 5c, 16, 0d, 00, 34, 23, 04, 00, 23, 53, 74, 72, 69, 6e, 67, 73, 00, 00, 00, 00, 90, 39, 11, 00, 38, b9, 04, 00, 23, 55, 53, 00, c8, f2, 15, 00, 10, 00, 00, 00, 23, 47, 55, 49, 44, 00, 00, 00, d8, f2, 15, 00, fc, 12, 02, 00, 23, 42, 6c, 6f, 62, 00, 00, 00]
#~: 11 (12)
[5c, 16, 0d, 00, 34, 23, 04, 00, 23, 53, 74, 72, 69, 6e, 67, 73, 00, 00, 00, 00, 90, 39, 11, 00, 38, b9, 04, 00, 23, 55, 53, 00, c8, f2, 15, 00, 10, 00, 00, 00, 23, 47, 55, 49, 44, 00, 00, 00, d8, f2, 15, 00, fc, 12, 02, 00, 23, 42, 6c, 6f, 62, 00, 00, 00]
#Strings: 17 (20)
[90, 39, 11, 00, 38, b9, 04, 00, 23, 55, 53, 00, c8, f2, 15, 00, 10, 00, 00, 00, 23, 47, 55, 49, 44, 00, 00, 00, d8, f2, 15, 00, fc, 12, 02, 00, 23, 42, 6c, 6f, 62, 00, 00, 00]
#US: 12 (12)
[c8, f2, 15, 00, 10, 00, 00, 00, 23, 47, 55, 49, 44, 00, 00, 00, d8, f2, 15, 00, fc, 12, 02, 00, 23, 42, 6c, 6f, 62, 00, 00, 00]
It is guaranteed that the string is terminated by null, so it should suffice just do bytes.gread::<&str>(offset).
Footnotes
m4b
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for patience, lgtm; i'll try to fix the unnecessary collect in a hotpatch
| pub fn mvid(&self) -> error::Result<Option<&'a [u8]>> { | ||
| Ok(self | ||
| .sections() | ||
| .collect::<Result<Vec<_>, _>>()? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
collect shouldn't be necessary here
|
NB: backwards compatible |
Added CLR parser implementations as proposed in the issue.
and PK token (Public Key Token)parser.Added CLR tables parser.IMetaDataTableswhere originally written in hard C++Added CLR imports parser.IMetaDataImportwhere originally written in hard C++