Skip to content

[WASM] wasm-ld: split up __wasm_apply_data_relocs #129007

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

dmjio
Copy link

@dmjio dmjio commented Feb 27, 2025

Addresses the issue of function body size explosion during linking with wasm-ld. The specific function in question is __wasm_apply_data_relocs which is used to apply relocations to the data section at runtime. Since GHC relies on unknown calls between modules the generated __wasm_apply_data_relocs quickly becomes unwieldy. This patch is currently in use with the WASM backend for GHC

This patch works by taking a chunking strategy and splitting up __wasm_apply_data_relocs into smaller functions that do not exceed the body size limit threshold.

As mentioned previously by @sbc100 @k1nder10

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Feb 27, 2025

@llvm/pr-subscribers-lld-wasm

@llvm/pr-subscribers-lld

Author: None (dmjio)

Changes

Addresses the issue of function body size explosion during linking with wasm-ld. The specific function in question is __wasm_apply_data_relocs which is used to apply relocations to the data section at runtime. Since GHC relies on unknown calls between modules the generated __wasm_apply_data_relocs quickly becomes unwieldy. This patch is currently in use with the WASM backend for GHC

This patch works by taking a chunking strategy and splitting up __wasm_apply_data_relocs into smaller functions that do not exceed the body size limit threshold.

As mentioned previously by @sbc100 @k1nder10


Full diff: https://github.com/llvm/llvm-project/pull/129007.diff

4 Files Affected:

  • (modified) lld/wasm/InputChunks.cpp (+9-4)
  • (modified) lld/wasm/InputChunks.h (+1-1)
  • (modified) lld/wasm/SyntheticSections.cpp (+2)
  • (modified) lld/wasm/Writer.cpp (+59-15)
diff --git a/lld/wasm/InputChunks.cpp b/lld/wasm/InputChunks.cpp
index ccdc92f5c8d71..40e1644de56e0 100644
--- a/lld/wasm/InputChunks.cpp
+++ b/lld/wasm/InputChunks.cpp
@@ -361,12 +361,11 @@ uint64_t InputChunk::getVA(uint64_t offset) const {
 // Generate code to apply relocations to the data section at runtime.
 // This is only called when generating shared libraries (PIC) where address are
 // not known at static link time.
-bool InputChunk::generateRelocationCode(raw_ostream &os) const {
+void InputChunk::generateRelocationCode(std::vector<std::string> &funcs) const {
   LLVM_DEBUG(dbgs() << "generating runtime relocations: " << name
                     << " count=" << relocations.size() << "\n");
 
   bool is64 = ctx.arg.is64.value_or(false);
-  bool generated = false;
   unsigned opcode_ptr_const = is64 ? WASM_OPCODE_I64_CONST
                                    : WASM_OPCODE_I32_CONST;
   unsigned opcode_ptr_add = is64 ? WASM_OPCODE_I64_ADD
@@ -385,6 +384,14 @@ bool InputChunk::generateRelocationCode(raw_ostream &os) const {
     if (!requiresRuntimeReloc)
       continue;
 
+    if (funcs.empty() || funcs.back().size() >= 7654300) {
+      funcs.emplace_back(std::string());
+      raw_string_ostream os(funcs.back());
+      writeUleb128(os, 0, "num locals");
+    }
+
+    raw_string_ostream os(funcs.back());
+
     LLVM_DEBUG(dbgs() << "gen reloc: type=" << relocTypeToString(rel.Type)
                       << " addend=" << rel.Addend << " index=" << rel.Index
                       << " output offset=" << offset << "\n");
@@ -439,9 +446,7 @@ bool InputChunk::generateRelocationCode(raw_ostream &os) const {
     writeU8(os, opcode_reloc_store, "I32_STORE");
     writeUleb128(os, 2, "align");
     writeUleb128(os, 0, "offset");
-    generated = true;
   }
-  return generated;
 }
 
 // Split WASM_SEG_FLAG_STRINGS section. Such a section is a sequence of
diff --git a/lld/wasm/InputChunks.h b/lld/wasm/InputChunks.h
index f545449e1246f..d231382a5f5ef 100644
--- a/lld/wasm/InputChunks.h
+++ b/lld/wasm/InputChunks.h
@@ -78,7 +78,7 @@ class InputChunk {
 
   size_t getNumRelocations() const { return relocations.size(); }
   void writeRelocations(llvm::raw_ostream &os) const;
-  bool generateRelocationCode(raw_ostream &os) const;
+  void generateRelocationCode(std::vector<std::string> &funcs) const;
 
   bool isTLS() const { return flags & llvm::wasm::WASM_SEG_FLAG_TLS; }
   bool isRetained() const { return flags & llvm::wasm::WASM_SEG_FLAG_RETAIN; }
diff --git a/lld/wasm/SyntheticSections.cpp b/lld/wasm/SyntheticSections.cpp
index 7fb44b9f0c009..3cab58ed16f93 100644
--- a/lld/wasm/SyntheticSections.cpp
+++ b/lld/wasm/SyntheticSections.cpp
@@ -299,6 +299,8 @@ void FunctionSection::writeBody() {
 void FunctionSection::addFunction(InputFunction *func) {
   if (!func->live)
     return;
+  if (func->hasFunctionIndex())
+    return;
   uint32_t functionIndex =
       out.importSec->getNumImportedFunctions() + inputFunctions.size();
   inputFunctions.emplace_back(func);
diff --git a/lld/wasm/Writer.cpp b/lld/wasm/Writer.cpp
index 7770bdcfc1f16..dd3e94b4fabee 100644
--- a/lld/wasm/Writer.cpp
+++ b/lld/wasm/Writer.cpp
@@ -1459,20 +1459,21 @@ void Writer::createStartFunction() {
 void Writer::createApplyDataRelocationsFunction() {
   LLVM_DEBUG(dbgs() << "createApplyDataRelocationsFunction\n");
   // First write the body's contents to a string.
-  std::string bodyContent;
+  std::vector<std::string> funcs;
   {
-    raw_string_ostream os(bodyContent);
-    writeUleb128(os, 0, "num locals");
-    bool generated = false;
     for (const OutputSegment *seg : segments)
       if (!ctx.arg.sharedMemory || !seg->isTLS())
         for (const InputChunk *inSeg : seg->inputSegments)
-          generated |= inSeg->generateRelocationCode(os);
+          inSeg->generateRelocationCode(funcs);
+  }
 
-    if (!generated) {
-      LLVM_DEBUG(dbgs() << "skipping empty __wasm_apply_data_relocs\n");
-      return;
-    }
+  if (funcs.empty()) {
+    LLVM_DEBUG(dbgs() << "skipping empty __wasm_apply_data_relocs\n");
+    return;
+  }
+
+  for (auto &func : funcs) {
+    raw_string_ostream os(func);
     writeU8(os, WASM_OPCODE_END, "END");
   }
 
@@ -1485,24 +1486,67 @@ void Writer::createApplyDataRelocationsFunction() {
       make<SyntheticFunction>(nullSignature, "__wasm_apply_data_relocs"));
   def->markLive();
 
-  createFunction(def, bodyContent);
+  if (funcs.size() == 1) {
+    createFunction(def, funcs.back());
+    return;
+  }
+
+  std::string body;
+  {
+    raw_string_ostream os(body);
+    writeUleb128(os, 0, "num locals");
+
+    for (std::size_t i = 0; i < funcs.size(); ++i) {
+      auto &name =
+          *make<std::string>("__wasm_apply_data_relocs_" + std::to_string(i));
+      auto *func = make<SyntheticFunction>(nullSignature, name);
+      auto *def = symtab->addSyntheticFunction(
+          name, WASM_SYMBOL_VISIBILITY_HIDDEN, func);
+      def->markLive();
+      // Normally this shouldn't be called manually for a synthetic
+      // function, since the function indices in
+      // ctx.syntheticFunctions will be calculated later (check
+      // functionSec->addFunction call hierarchy for details).
+      // However, at this point we already need the correct index. The
+      // solution is to place the new synthetic function eagerly, and
+      // also making addFunction idempotent by skipping when there's
+      // already a function index.
+      out.functionSec->addFunction(func);
+      createFunction(def, funcs[i]);
+
+      writeU8(os, WASM_OPCODE_CALL, "CALL");
+      writeUleb128(os, def->getFunctionIndex(), "function index");
+    }
+
+    writeU8(os, WASM_OPCODE_END, "END");
+  }
+  createFunction(def, body);
 }
 
 void Writer::createApplyTLSRelocationsFunction() {
   LLVM_DEBUG(dbgs() << "createApplyTLSRelocationsFunction\n");
-  std::string bodyContent;
+  std::vector<std::string> funcs;
   {
-    raw_string_ostream os(bodyContent);
-    writeUleb128(os, 0, "num locals");
     for (const OutputSegment *seg : segments)
       if (seg->isTLS())
         for (const InputChunk *inSeg : seg->inputSegments)
-          inSeg->generateRelocationCode(os);
+          inSeg->generateRelocationCode(funcs);
+  }
 
+  if (funcs.empty()) {
+    funcs.emplace_back(std::string());
+    raw_string_ostream os(funcs.back());
+    writeUleb128(os, 0, "num locals");
+  }
+
+  for (auto &func : funcs) {
+    raw_string_ostream os(func);
     writeU8(os, WASM_OPCODE_END, "END");
   }
 
-  createFunction(WasmSym::applyTLSRelocs, bodyContent);
+  assert(funcs.size() == 1);
+
+  createFunction(WasmSym::applyTLSRelocs, funcs.back());
 }
 
 // Similar to createApplyDataRelocationsFunction but generates relocation code

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a reasonable approach. Thanks for working on this.

Any ideas who to write a test for this?

@@ -299,6 +299,8 @@ void FunctionSection::writeBody() {
void FunctionSection::addFunction(InputFunction *func) {
if (!func->live)
return;
if (func->hasFunctionIndex())
return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why this was needed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe assert(!hasFunctionIndex()); was being triggered in setFunctionIndex, and if (func->hasFunctionIndex()) return; was introduced to keep program termination from occurring.

void InputFunction::setFunctionIndex(uint32_t index) {                                                                                                 
  LLVM_DEBUG(dbgs() << "InputFunction::setFunctionIndex: " << name << " -> "                                                                           
                    << index << "\n");                                                                                                                 
  assert(!hasFunctionIndex());                                                                                                                         
  functionIndex = index;                                                                                                                               
}  

Copy link
Author

@dmjio dmjio Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it's described below

https://github.com/llvm/llvm-project/pull/129007/files#diff-e826be2acc8b58c5d040525dc8a509e90810d3edcd93190d4810e476919ef9aaR1509-R1513

addFunction or markLive is setting the index already it seems. So this explains the if (func->hasFunctionIndex()) return; (since the assert invariant no longer holds)

// However, at this point we already need the correct index. The
// solution is to place the new synthetic function eagerly, and
// also making addFunction idempotent by skipping when there's
// already a function index.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that alternative to this would be to somehow generate relocations here.

I wonder how this works for other synthetic functions that can call each such as __wasm_init_memory?

@dmjio
Copy link
Author

dmjio commented Feb 27, 2025

Looks like a reasonable approach. Thanks for working on this.

Any ideas who to write a test for this?

No problem,

I imagine we'll have to create source that generates wasm_apply_data_relocs_0 when compiled to shared object code. In GHC it's triggered with the "hello world" example (using this patch)

Test bed:

module Main where

main :: IO ()
main = putStrLn "hello world"

Make shared object code

$ wasm32-wasi-ghc -dynamic -shared Main.hs -o Main.wasm  

Export to C (wasm2c), grep for wasm_apply_data_relocs_0

$ wasm2c Main.wasm > test.c && cat test.c | grep --color wasm_apply_data_relocs_0

static void w2c__0x5F_wasm_apply_data_relocs_0(w2c_*);
  w2c__0x5F_wasm_apply_data_relocs_0(instance);
void w2c__0x5F_wasm_apply_data_relocs_0(w2c_* instance) {

Could use llvm-objdump and test for symbols that way.

@sbc100 We'll probably need to create a large C project (maybe generate it dynamically to repro this condition) and repeat the above.

@dmjio dmjio changed the title [GHC] wasm-ld: split up __wasm_apply_data_relocs [WASM] wasm-ld: split up __wasm_apply_data_relocs Feb 28, 2025
@dmjio dmjio force-pushed the wasm-data-relocs branch 2 times, most recently from a4272a1 to b109200 Compare February 28, 2025 07:32
@dmjio
Copy link
Author

dmjio commented Mar 3, 2025

Looks like a reasonable approach. Thanks for working on this.

Any ideas who to write a test for this?

Thought of an idea on how to test this. What if we made the max body size parameter configurable (right now its max 7654321, and can remain the default). Then, in the test suite we configure it to be much smaller, so our test input could be of reasonable size. This should demonstrate the chunking functionality. @sbc100 does this sound good to you?

@dmjio dmjio marked this pull request as draft March 3, 2025 22:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants