debug_names incorrect parent due to collision between CU and TU #93886

@dwblaikie

from #91808 (comment)

namespace A {
namespace B {
struct C { };
struct D { };
}  // namespace B
}  // namespace A
void f1(A::B::C, A::B::D) { }
$ clang++-tot names.cpp -g2 -O0 -gpubnames -c -fdebug-types-section && llvm-dwarfdump-tot -debug-names names.o | grep "Name \|Entry \|Tag: \|DW_IDX\|String: "
      String: 0x000000a0 "_Z2f1N1A1B1CENS0_1DE"
      Entry @ 0xe5 {
        Tag: DW_TAG_subprogram
        DW_IDX_die_offset: 0x00000023
        DW_IDX_parent: <parent not indexed>
      String: 0x000000b5 "A"
      Entry @ 0xeb {
        Tag: DW_TAG_namespace
        DW_IDX_type_unit: 0x00
        DW_IDX_die_offset: 0x00000023
        DW_IDX_parent: <parent not indexed>
      Entry @ 0xf1 {
        Tag: DW_TAG_namespace
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000023
        DW_IDX_parent: <parent not indexed>
      Entry @ 0xf7 {
        Tag: DW_TAG_namespace
        DW_IDX_die_offset: 0x00000044
        DW_IDX_parent: <parent not indexed>
      String: 0x000000b7 "B"
      Entry @ 0x103 {
        Tag: DW_TAG_namespace
        DW_IDX_type_unit: 0x00
        DW_IDX_die_offset: 0x00000025
        DW_IDX_parent: Entry @ 0xe5
      Entry @ 0x10d {
        Tag: DW_TAG_namespace
        DW_IDX_type_unit: 0x01
        DW_IDX_die_offset: 0x00000025
        DW_IDX_parent: Entry @ 0xf1
      Entry @ 0x117 {
        Tag: DW_TAG_namespace
        DW_IDX_die_offset: 0x00000046
        DW_IDX_parent: Entry @ 0xf7

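So the second and third "B" entries (@ 0x10d and @ 0x117) match up with the appropriate "A" entries (@ 0xf1 and @ 0xf7), but the first "B" entry (@ 0x103) does not: its parent should be the "A" entry @ 0xeb in type unit 0x00, and instead it refers to Entry @ 0xe5, the DW_TAG_subprogram for f1 in the CU, which has the same DIE offset (0x23).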

This is because the UniqueID on a unit is only unique within that kind of unit: it's unique across CUs and, separately, unique across TUs. So the "DieOffsetAndUnitID" is not a globally unique identifier for an entry; it would need the kind of unit in there too to make it fully unique.
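The collision can be reproduced outside LLVM with a minimal sketch. Everything below is a hypothetical stand-in (the `OffsetAndUnitID` alias, the `UnitKind` enum and both helper functions are illustrative, not LLVM's definitions), and it assumes the CU and the first TU both carry unit ID 0, as the dump above suggests:

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <tuple>
#include <utility>

// Hypothetical stand-in for the (DIE offset, unit ID) key; not LLVM's type.
using OffsetAndUnitID = std::pair<uint64_t, uint32_t>;

// Hypothetical extension of the key with the kind of unit.
enum class UnitKind { Compile, Type };
using OffsetUnitIDAndKind = std::tuple<uint64_t, uint32_t, UnitKind>;

// Keyed on (offset, unit ID) only: the CU DIE and the TU DIE share a key.
inline std::size_t countDistinctWithoutKind() {
  std::set<OffsetAndUnitID> Indexed;
  Indexed.emplace(0x23, 0); // f1's subprogram DIE in CU 0 (assumed ID)
  Indexed.emplace(0x23, 0); // namespace A's DIE in TU 0: same key, dropped
  return Indexed.size();
}

// Keyed on (offset, unit ID, kind): the two DIEs stay distinct.
inline std::size_t countDistinctWithKind() {
  std::set<OffsetUnitIDAndKind> Indexed;
  Indexed.emplace(0x23, 0, UnitKind::Compile);
  Indexed.emplace(0x23, 0, UnitKind::Type);
  return Indexed.size();
}
```

With these stand-ins, the first helper ends up with a single element and the second with two, which is the difference between the buggy key and a key that includes the kind of unit.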

Adding this patch is enough to expose the issue more directly:

diff --git a/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp b/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
index 5b679fd3b9f9..cc8b8d1881ed 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
@@ -615,8 +615,10 @@ Dwarf5AccelTableWriter::Dwarf5AccelTableWriter(
 
   for (auto &Bucket : Contents.getBuckets())
     for (auto *Hash : Bucket)
-      for (auto *Value : Hash->getValues<DWARF5AccelTableData *>())
-        IndexedOffsets.insert(Value->getDieOffsetAndUnitID());
+      for (auto *Value : Hash->getValues<DWARF5AccelTableData *>()) {
+        auto Inserted = IndexedOffsets.insert(Value->getDieOffsetAndUnitID()).second;
+        assert(Inserted);
+      }
 
   populateAbbrevsMap();
 }

The two different units (one CU, one TU) with the same unit ID end up colliding in the IndexedOffsets set, so only one of them ends up in there. As a result, the loop here:

for (OffsetAndUnitID Offset : IndexedOffsets)
  DIEOffsetToAccelEntryLabel.insert({Offset, Asm->createTempSymbol("")});

Only creates a single label for the two entries. And /this/ code:

if (EmittedAccelEntrySymbols.insert(EntrySymbol).second)
  Asm->OutStreamer->emitLabel(EntrySymbol);

Silently skips emitting the label again, even though the label now matches more than one entry unintentionally. It has to tolerate repeats because a label can match more than one entry /intentionally/ (if there are multiple entries for exactly the same entity, known by different names, like the mangled name and the unmangled name).
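The effect of those two pieces together can be modelled with a small sketch. This is a simplified model, not the AccelTable.cpp logic: the `Entry` struct, the `Key` alias and `labelPlacement` are hypothetical names, and a single `std::map::emplace` (which never overwrites) stands in for both the set de-duplication and the emit-the-label-once check:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one index entry.
struct Entry {
  std::string Name;
  bool InTypeUnit;
  uint32_t UnitID;
  uint64_t DieOffset;
};

// Hypothetical key mirroring (DIE offset, unit ID) with no unit kind.
using Key = std::pair<uint64_t, uint32_t>;

// For each key, record the name of the entry where its label lands.
// emplace keeps the first entry seen for a key, modelling "emit once".
inline std::map<Key, std::string>
labelPlacement(const std::vector<Entry> &Entries) {
  std::map<Key, std::string> LabelAt;
  for (const Entry &E : Entries)
    LabelAt.emplace(Key{E.DieOffset, E.UnitID}, E.Name);
  return LabelAt;
}
```

Feeding it the f1 entry (CU, unit ID 0 assumed, offset 0x23) followed by the "A" entry (TU 0, offset 0x23) leaves the shared label on f1, so a child looking up its parent via key (0x23, 0) lands on the subprogram rather than on "A".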

So, yeah.

Hmm, I guess at least the type unit number probably isn't gapless - there are cases where we create a type unit and then potentially throw it away (see the TypeUnitsUnderConstruction stuff) - but we don't reuse the type unit number.

So, maybe it's possible to use a single numbering for CUs and TUs, gaps and all?

The only thing I was worried about was that some code might be using the numbering to index into an array of units at some point, but if that's not the case - great! Probably more intuitive that they be totally unique numbers.
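A single numbering with gaps could look roughly like the sketch below. It is only an illustration of the idea: `UnitNumbering`, `UnitKind` and the method names are hypothetical and do not correspond to existing LLVM classes. Units are looked up by ID in a map rather than by index into an array, which is the property the worry above depends on:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

enum class UnitKind { Compile, Type };

// Hypothetical allocator that numbers CUs and TUs from one shared sequence.
class UnitNumbering {
  uint32_t NextID = 0;
  std::map<uint32_t, UnitKind> Live; // keyed by ID, so gaps are harmless

public:
  // Hand out the next ID regardless of the kind of unit.
  uint32_t create(UnitKind Kind) {
    uint32_t ID = NextID++;
    Live.emplace(ID, Kind);
    return ID;
  }

  // Throw a unit away; its ID is never reused, which leaves a gap.
  void discard(uint32_t ID) { Live.erase(ID); }

  bool isLive(uint32_t ID) const { return Live.count(ID) != 0; }
  std::size_t liveCount() const { return Live.size(); }
};
```

In this model a discarded type unit simply leaves a hole in the sequence, and no CU can ever share an ID with a TU, so a (DIE offset, unit ID) pair would be unique without carrying the kind of unit.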
