-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[WASM] wasm-ld: split up __wasm_apply_data_relocs #129007
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-lld-wasm @llvm/pr-subscribers-lld Author: None (dmjio) ChangesAddresses the issue of function body size explosion during linking with This patch works by taking a chunking strategy and splitting up As mentioned previously by @sbc100 @k1nder10
Full diff: https://github.com/llvm/llvm-project/pull/129007.diff 4 Files Affected:
diff --git a/lld/wasm/InputChunks.cpp b/lld/wasm/InputChunks.cpp
index ccdc92f5c8d71..40e1644de56e0 100644
--- a/lld/wasm/InputChunks.cpp
+++ b/lld/wasm/InputChunks.cpp
@@ -361,12 +361,11 @@ uint64_t InputChunk::getVA(uint64_t offset) const {
// Generate code to apply relocations to the data section at runtime.
// This is only called when generating shared libraries (PIC) where address are
// not known at static link time.
-bool InputChunk::generateRelocationCode(raw_ostream &os) const {
+void InputChunk::generateRelocationCode(std::vector<std::string> &funcs) const {
LLVM_DEBUG(dbgs() << "generating runtime relocations: " << name
<< " count=" << relocations.size() << "\n");
bool is64 = ctx.arg.is64.value_or(false);
- bool generated = false;
unsigned opcode_ptr_const = is64 ? WASM_OPCODE_I64_CONST
: WASM_OPCODE_I32_CONST;
unsigned opcode_ptr_add = is64 ? WASM_OPCODE_I64_ADD
@@ -385,6 +384,14 @@ bool InputChunk::generateRelocationCode(raw_ostream &os) const {
if (!requiresRuntimeReloc)
continue;
+ if (funcs.empty() || funcs.back().size() >= 7654300) {
+ funcs.emplace_back(std::string());
+ raw_string_ostream os(funcs.back());
+ writeUleb128(os, 0, "num locals");
+ }
+
+ raw_string_ostream os(funcs.back());
+
LLVM_DEBUG(dbgs() << "gen reloc: type=" << relocTypeToString(rel.Type)
<< " addend=" << rel.Addend << " index=" << rel.Index
<< " output offset=" << offset << "\n");
@@ -439,9 +446,7 @@ bool InputChunk::generateRelocationCode(raw_ostream &os) const {
writeU8(os, opcode_reloc_store, "I32_STORE");
writeUleb128(os, 2, "align");
writeUleb128(os, 0, "offset");
- generated = true;
}
- return generated;
}
// Split WASM_SEG_FLAG_STRINGS section. Such a section is a sequence of
diff --git a/lld/wasm/InputChunks.h b/lld/wasm/InputChunks.h
index f545449e1246f..d231382a5f5ef 100644
--- a/lld/wasm/InputChunks.h
+++ b/lld/wasm/InputChunks.h
@@ -78,7 +78,7 @@ class InputChunk {
size_t getNumRelocations() const { return relocations.size(); }
void writeRelocations(llvm::raw_ostream &os) const;
- bool generateRelocationCode(raw_ostream &os) const;
+ void generateRelocationCode(std::vector<std::string> &funcs) const;
bool isTLS() const { return flags & llvm::wasm::WASM_SEG_FLAG_TLS; }
bool isRetained() const { return flags & llvm::wasm::WASM_SEG_FLAG_RETAIN; }
diff --git a/lld/wasm/SyntheticSections.cpp b/lld/wasm/SyntheticSections.cpp
index 7fb44b9f0c009..3cab58ed16f93 100644
--- a/lld/wasm/SyntheticSections.cpp
+++ b/lld/wasm/SyntheticSections.cpp
@@ -299,6 +299,8 @@ void FunctionSection::writeBody() {
void FunctionSection::addFunction(InputFunction *func) {
if (!func->live)
return;
+ if (func->hasFunctionIndex())
+ return;
uint32_t functionIndex =
out.importSec->getNumImportedFunctions() + inputFunctions.size();
inputFunctions.emplace_back(func);
diff --git a/lld/wasm/Writer.cpp b/lld/wasm/Writer.cpp
index 7770bdcfc1f16..dd3e94b4fabee 100644
--- a/lld/wasm/Writer.cpp
+++ b/lld/wasm/Writer.cpp
@@ -1459,20 +1459,21 @@ void Writer::createStartFunction() {
void Writer::createApplyDataRelocationsFunction() {
LLVM_DEBUG(dbgs() << "createApplyDataRelocationsFunction\n");
// First write the body's contents to a string.
- std::string bodyContent;
+ std::vector<std::string> funcs;
{
- raw_string_ostream os(bodyContent);
- writeUleb128(os, 0, "num locals");
- bool generated = false;
for (const OutputSegment *seg : segments)
if (!ctx.arg.sharedMemory || !seg->isTLS())
for (const InputChunk *inSeg : seg->inputSegments)
- generated |= inSeg->generateRelocationCode(os);
+ inSeg->generateRelocationCode(funcs);
+ }
- if (!generated) {
- LLVM_DEBUG(dbgs() << "skipping empty __wasm_apply_data_relocs\n");
- return;
- }
+ if (funcs.empty()) {
+ LLVM_DEBUG(dbgs() << "skipping empty __wasm_apply_data_relocs\n");
+ return;
+ }
+
+ for (auto &func : funcs) {
+ raw_string_ostream os(func);
writeU8(os, WASM_OPCODE_END, "END");
}
@@ -1485,24 +1486,67 @@ void Writer::createApplyDataRelocationsFunction() {
make<SyntheticFunction>(nullSignature, "__wasm_apply_data_relocs"));
def->markLive();
- createFunction(def, bodyContent);
+ if (funcs.size() == 1) {
+ createFunction(def, funcs.back());
+ return;
+ }
+
+ std::string body;
+ {
+ raw_string_ostream os(body);
+ writeUleb128(os, 0, "num locals");
+
+ for (std::size_t i = 0; i < funcs.size(); ++i) {
+ auto &name =
+ *make<std::string>("__wasm_apply_data_relocs_" + std::to_string(i));
+ auto *func = make<SyntheticFunction>(nullSignature, name);
+ auto *def = symtab->addSyntheticFunction(
+ name, WASM_SYMBOL_VISIBILITY_HIDDEN, func);
+ def->markLive();
+ // Normally this shouldn't be called manually for a synthetic
+ // function, since the function indices in
+ // ctx.syntheticFunctions will be calculated later (check
+ // functionSec->addFunction call hierarchy for details).
+ // However, at this point we already need the correct index. The
+ // solution is to place the new synthetic function eagerly, and
+ // also making addFunction idempotent by skipping when there's
+ // already a function index.
+ out.functionSec->addFunction(func);
+ createFunction(def, funcs[i]);
+
+ writeU8(os, WASM_OPCODE_CALL, "CALL");
+ writeUleb128(os, def->getFunctionIndex(), "function index");
+ }
+
+ writeU8(os, WASM_OPCODE_END, "END");
+ }
+ createFunction(def, body);
}
void Writer::createApplyTLSRelocationsFunction() {
LLVM_DEBUG(dbgs() << "createApplyTLSRelocationsFunction\n");
- std::string bodyContent;
+ std::vector<std::string> funcs;
{
- raw_string_ostream os(bodyContent);
- writeUleb128(os, 0, "num locals");
for (const OutputSegment *seg : segments)
if (seg->isTLS())
for (const InputChunk *inSeg : seg->inputSegments)
- inSeg->generateRelocationCode(os);
+ inSeg->generateRelocationCode(funcs);
+ }
+ if (funcs.empty()) {
+ funcs.emplace_back(std::string());
+ raw_string_ostream os(funcs.back());
+ writeUleb128(os, 0, "num locals");
+ }
+
+ for (auto &func : funcs) {
+ raw_string_ostream os(func);
writeU8(os, WASM_OPCODE_END, "END");
}
- createFunction(WasmSym::applyTLSRelocs, bodyContent);
+ assert(funcs.size() == 1);
+
+ createFunction(WasmSym::applyTLSRelocs, funcs.back());
}
// Similar to createApplyDataRelocationsFunction but generates relocation code
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a reasonable approach. Thanks for working on this.
Any ideas who to write a test for this?
@@ -299,6 +299,8 @@ void FunctionSection::writeBody() { | |||
void FunctionSection::addFunction(InputFunction *func) { | |||
if (!func->live) | |||
return; | |||
if (func->hasFunctionIndex()) | |||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why this was needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe assert(!hasFunctionIndex());
was being triggered in setFunctionIndex
, and if (func->hasFunctionIndex()) return;
was introduced to keep program termination from occurring.
void InputFunction::setFunctionIndex(uint32_t index) {
LLVM_DEBUG(dbgs() << "InputFunction::setFunctionIndex: " << name << " -> "
<< index << "\n");
assert(!hasFunctionIndex());
functionIndex = index;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it's described below
addFunction
or markLive
is setting the index already it seems. So this explains the if (func->hasFunctionIndex()) return;
(since the assert
invariant no longer holds)
// However, at this point we already need the correct index. The | ||
// solution is to place the new synthetic function eagerly, and | ||
// also making addFunction idempotent by skipping when there's | ||
// already a function index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that alternative to this would be to somehow generate relocations here.
I wonder how this works for other synthetic functions that can call each such as __wasm_init_memory
?
No problem, I imagine we'll have to create source that generates Test bed: module Main where
main :: IO ()
main = putStrLn "hello world" Make shared object code $ wasm32-wasi-ghc -dynamic -shared Main.hs -o Main.wasm Export to C ( $ wasm2c Main.wasm > test.c && cat test.c | grep --color wasm_apply_data_relocs_0
static void w2c__0x5F_wasm_apply_data_relocs_0(w2c_*);
w2c__0x5F_wasm_apply_data_relocs_0(instance);
void w2c__0x5F_wasm_apply_data_relocs_0(w2c_* instance) { Could use @sbc100 We'll probably need to create a large C project (maybe generate it dynamically to repro this condition) and repeat the above. |
942a9b0
to
4717cfd
Compare
a4272a1
to
b109200
Compare
b109200
to
0d3cf94
Compare
Thought of an idea on how to test this. What if we made the max body size parameter configurable (right now its max |
Addresses the issue of function body size explosion during linking with
wasm-ld
. The specific function in question is__wasm_apply_data_relocs
which is used to apply relocations to the data section at runtime. Since GHC relies on unknown calls between modules the generated__wasm_apply_data_relocs
quickly becomes unwieldy. This patch is currently in use with the WASM backend for GHCThis patch works by taking a chunking strategy and splitting up
__wasm_apply_data_relocs
into smaller functions that do not exceed the body size limit threshold.As mentioned previously by @sbc100 @k1nder10