RFC: Tweak the compiler's default linkage for dylibs #404
Closed
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,147 @@ | ||
- Start Date: (fill me in with today's date, YYYY-MM-DD) | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
|
||
When the compiler generates a dynamic library, alter the default behavior to | ||
favor linking all dependencies statically rather than maximizing the number of | ||
dynamic libraries. This behavior can be disabled with the existing | ||
`-C prefer-dynamic` flag. | ||
|
||
# Motivation | ||
|
||
Long ago rustc used to only be able to generate dynamic libraries and as a | ||
consequence all Rust libraries were distributed/used in a dynamic form. Over | ||
time the compiler learned to create static libraries (dubbed rlibs). As a | ||
result, the compiler had to learn to choose between linking a library either | ||
statically or dynamically, depending on which formats are available to it. | ||
|
||
Today's heuristics and algorithm are [documented in the compiler][linkage], and | ||
the general idea is that as soon as "statically link all dependencies" fails | ||
then the compiler maximizes the number of dynamic dependencies. There is | ||
currently no way to specify, in the source code itself, the form in which | ||
intermediate libraries should be linked. The linkage can be "controlled" by | ||
passing exactly one `--extern` flag per dependency, naming the artifact in the | ||
desired format. | ||
|
||
[linkage]: https://github.com/rust-lang/rust/blob/master/src/librustc/middle/dependency_format.rs | ||
|
||
While functional, these heuristics do not allow expressing an important use case | ||
of building a dynamic library as a final product (as opposed to an intermediate | ||
Rust library) while having all dependencies statically linked to the final | ||
dynamic library. This use case has been seen in the wild a number of times, and | ||
the current workaround is to generate a `staticlib` and then invoke the linker | ||
directly to convert that to a `dylib` (which relies on rustc generating PIC | ||
objects by default). | ||
|
||
The purpose of this RFC is to remedy this use case while largely retaining the | ||
current abilities of the compiler today. | ||
|
||
# Detailed design | ||
|
||
In English, the compiler will change its heuristics for when a dynamic library | ||
is being generated. When doing so, it will attempt to link all dependencies | ||
statically, and failing that, will continue to maximize the number of dynamic | ||
libraries which are linked in. | ||
|
||
The compiler will also repurpose the `-C prefer-dynamic` flag to indicate that | ||
this behavior is not desired, and the compiler should maximize dynamic | ||
dependencies regardless. | ||
|
||
In terms of code, the following patch will be applied to the compiler: | ||
|
||
```patch | ||
diff --git a/src/librustc/middle/dependency_format.rs b/src/librustc/middle/dependency_format.rs | ||
index 8e2d4d0..dc248eb 100644 | ||
--- a/src/librustc/middle/dependency_format.rs | ||
+++ b/src/librustc/middle/dependency_format.rs | ||
@@ -123,6 +123,16 @@ fn calculate_type(sess: &session::Session, | ||
return Vec::new(); | ||
} | ||
|
||
+ // Generating a dylib without `-C prefer-dynamic` means that we're going | ||
+ // to try to eagerly statically link all dependencies. This is normally | ||
+ // done for end-product dylibs, not intermediate products. | ||
+ config::CrateTypeDylib if !sess.opts.cg.prefer_dynamic => { | ||
+ match attempt_static(sess) { | ||
+ Some(v) => return v, | ||
+ None => {} | ||
+ } | ||
+ } | ||
+ | ||
// Everything else falls through below | ||
config::CrateTypeExecutable | config::CrateTypeDylib => {}, | ||
} | ||
``` | ||
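As a rough illustration of the heuristic the patch introduces, the following is a minimal, self-contained sketch. The `Available` and `Linkage` types and the `maximize_dynamic` and `dylib_linkage` functions are hypothetical stand-ins invented for this example, and `attempt_static` here only borrows the name from the patch; none of this is rustc's actual implementation. It also assumes every dependency exists in at least one format.

```rust
// Illustrative model only: these types and functions are hypothetical and
// are not rustc's real data structures.

/// Formats in which a dependency is available on disk.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Available {
    pub rlib: bool,
    pub dylib: bool,
}

/// The form a dependency ends up being linked in.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Linkage {
    Static,
    Dynamic,
}

/// Try to link every dependency statically; give up if any lacks an rlib.
pub fn attempt_static(deps: &[Available]) -> Option<Vec<Linkage>> {
    if deps.iter().all(|d| d.rlib) {
        Some(vec![Linkage::Static; deps.len()])
    } else {
        None
    }
}

/// Simplified fallback: link dynamically wherever a dylib exists.
pub fn maximize_dynamic(deps: &[Available]) -> Vec<Linkage> {
    deps.iter()
        .map(|d| if d.dylib { Linkage::Dynamic } else { Linkage::Static })
        .collect()
}

/// Proposed behavior when the output is a dylib: prefer all-static unless
/// the caller asked for dynamic linking, then fall back to maximizing
/// dynamic dependencies.
pub fn dylib_linkage(deps: &[Available], prefer_dynamic: bool) -> Vec<Linkage> {
    if !prefer_dynamic {
        if let Some(all_static) = attempt_static(deps) {
            return all_static;
        }
    }
    maximize_dynamic(deps)
}
```

In this model, calling `dylib_linkage` with `prefer_dynamic` set to `false` yields all-static linkage whenever every dependency has an rlib, and otherwise behaves exactly like the `prefer_dynamic` case.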
|
||
# Drawbacks | ||
|
||
None currently, but the next section of alternatives lists a few other methods | ||
of possibly achieving the same goal. | ||
|
||
# Alternatives | ||
|
||
## Disallow intermediate dynamic libraries | ||
|
||
One possible solution to this problem is to completely disallow dynamic | ||
libraries as a possible intermediate format for Rust libraries. This would solve | ||
the above problem in the sense that the compiler never has to make a choice. | ||
It would also cut the distribution size roughly in half because only rlibs | ||
would be shipped, not dylibs. | ||
|
||
Another point in favor of this approach is that the story for dynamic libraries | ||
in Rust (for Rust) is also somewhat lacking with today's compiler. The ABI of a | ||
library changes quite frequently for unrelated changes, and it is thus | ||
infeasible to expect to ship a dynamic Rust library to later be updated | ||
in-place without recompiling downstream consumers. By disallowing dynamic | ||
libraries as intermediate formats in Rust, it is made quite obvious that a Rust | ||
library cannot depend on another dynamic Rust library. This would be codifying | ||
the convention today of "statically link all Rust code" in the compiler itself. | ||
|
||
The major downside of this approach is that it would then be impossible to write | ||
a plugin for Rust in Rust. For example compiler plugins would cease to work | ||
because the standard library would be statically linked to both the `rustc` | ||
executable as well as the plugin being loaded. | ||
|
||
In the common case duplication of a library in the same process does not tend to | ||
have adverse side effects, but some of the more flavorful features tend to | ||
interact adversely with duplication such as: | ||
|
||
* Globals with significant addresses (`static`s). These globals would all be | ||
duplicated and have different addresses depending on what library you're | ||
talking to. | ||
* TLS/TLD. Any "thread local" or "task local" notion will be duplicated | ||
across each library in the process. | ||
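The effect on globals can be mimicked in a single crate. The sketch below is only an analogy: the two modules `copy_in_host` and `copy_in_plugin` are hypothetical stand-ins for two statically linked copies of the same library, and in a real process the duplication would come from linking, not from separate modules.

```rust
// Hypothetical stand-in for the copy of a library linked into the host
// executable.
pub mod copy_in_host {
    use std::sync::atomic::{AtomicUsize, Ordering};

    pub static COUNTER: AtomicUsize = AtomicUsize::new(0);

    /// Increment this copy's counter and return the new value.
    pub fn bump() -> usize {
        COUNTER.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// Read this copy's counter.
    pub fn get() -> usize {
        COUNTER.load(Ordering::SeqCst)
    }
}

// Hypothetical stand-in for the copy of the same library linked into a
// dynamically loaded plugin.
pub mod copy_in_plugin {
    use std::sync::atomic::{AtomicUsize, Ordering};

    pub static COUNTER: AtomicUsize = AtomicUsize::new(0);

    /// Increment this copy's counter and return the new value.
    pub fn bump() -> usize {
        COUNTER.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// Read this copy's counter.
    pub fn get() -> usize {
        COUNTER.load(Ordering::SeqCst)
    }
}

/// The two "copies" of the global live at different addresses.
pub fn same_address() -> bool {
    std::ptr::eq(&copy_in_host::COUNTER, &copy_in_plugin::COUNTER)
}
```

Bumping the counter through `copy_in_host::bump` leaves `copy_in_plugin::get` unchanged, which is the kind of surprise the first bullet describes: code that expects one process-wide global silently ends up with two.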
|
||
Today's design of the runtime in the standard library causes dynamically loaded | ||
plugins with a statically linked standard library to fail very quickly as soon | ||
as any runtime-related operation is performed. Note, however, that the runtime | ||
of the standard library will likely be phased out soon, but this RFC considers | ||
the cons listed above to be reasons to not take this course of action. | ||
|
||
## Allow fine-grained control of linkage | ||
|
||
Another possible alternative is to allow fine-grained control in the compiler to | ||
explicitly specify how each library should be linked (as opposed to a blanket | ||
prefer-dynamic-or-not choice). | ||
|
||
Recent forays with native libraries in Cargo have led to the conclusion that | ||
hardcoding linkage into source code is often a hazard and a source of pain down | ||
the line. The ultimate decision of how a library is linked is often not up to | ||
the author, but rather the developer or builder of a library itself. | ||
|
||
This leads to the conclusion that linkage of this form should be | ||
controlled through the command line instead, which is essentially already | ||
possible today (via `--extern`). Cargo essentially does this, but the standard | ||
libraries are shipped in dylib/rlib formats, causing the pain points listed in | ||
the motivation. | ||
|
||
As a result, this RFC does not recommend pursuing this alternative too far, but | ||
rather considers the alteration above to the compiler's heuristics to be | ||
satisfactory for now. | ||
|
||
# Unresolved questions | ||
|
||
None yet! |
Is there any reason to not try to maximize the number of static libraries unless `-C prefer-dynamic` is passed?
It's mostly a matter of taste for which heuristic we think is best to do by default. Invariably each one will be right in some cases and wrong in others. I would intuitively expect that if any dynamic libraries must be linked, then you may as well choose as many others as possible.
Additionally, dynamic libraries impose restrictions to what kinds of other libraries can be linked. The compiler ensures that each library appears only once, so if an rlib A is linked to dylibs B/C (statically into both), you can't link B and C into any other future products because it would include the library A twice. I think that if each dylib maximizes the number of dylibs it links to (which is hopefully the case if they're all intermediate products) then it may be best to continue maximizing, but if each dylib separately maximizes the number of static libraries then it's likely that they're mostly incompatible.
I'm not really sure which way is the best option; I think there may be downsides to both. Supposedly we need a distinction between an intermediate dylib and an end-product dylib, where an intermediary maximizes dylib deps while an end-product minimizes dylib deps, but that's also somewhat complicated...
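To make the "each library appears only once" constraint concrete, here is a small sketch of the A/B/C situation described above. The `Dylib` struct and `duplicated_rlibs` function are hypothetical and exist only for this example; they are not how the compiler actually tracks this.

```rust
use std::collections::BTreeMap;

/// Hypothetical description of a dylib: its name and the rlibs that were
/// statically linked into it.
pub struct Dylib {
    pub name: &'static str,
    pub static_rlibs: Vec<&'static str>,
}

/// Return every rlib that is bundled into more than one of the given dylibs,
/// together with the dylibs that contain it. A non-empty result means the
/// dylibs cannot be linked into the same final product.
pub fn duplicated_rlibs(dylibs: &[Dylib]) -> Vec<(&'static str, Vec<&'static str>)> {
    let mut owners: BTreeMap<&'static str, Vec<&'static str>> = BTreeMap::new();
    for dylib in dylibs {
        for rlib in &dylib.static_rlibs {
            owners.entry(*rlib).or_default().push(dylib.name);
        }
    }
    owners
        .into_iter()
        .filter(|(_, names)| names.len() > 1)
        .collect()
}
```

With rlib A statically linked into both B and C, `duplicated_rlibs` reports A as owned by both, which is the conflict that would prevent B and C from being combined later.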