Description
from #91808 (comment)
namespace A {
namespace B {
struct C { };
struct D { };
} // namespace B
} // namespace A
void f1(A::B::C, A::B::D) { }
$ clang++-tot names.cpp -g2 -O0 -gpubnames -c -fdebug-types-section && llvm-dwarfdump-tot -debug-names names.o | grep "Name \|Entry \|Tag: \|DW_IDX\|String: "
String: 0x000000a0 "_Z2f1N1A1B1CENS0_1DE"
Entry @ 0xe5 {
Tag: DW_TAG_subprogram
DW_IDX_die_offset: 0x00000023
DW_IDX_parent: <parent not indexed>
String: 0x000000b5 "A"
Entry @ 0xeb {
Tag: DW_TAG_namespace
DW_IDX_type_unit: 0x00
DW_IDX_die_offset: 0x00000023
DW_IDX_parent: <parent not indexed>
Entry @ 0xf1 {
Tag: DW_TAG_namespace
DW_IDX_type_unit: 0x01
DW_IDX_die_offset: 0x00000023
DW_IDX_parent: <parent not indexed>
Entry @ 0xf7 {
Tag: DW_TAG_namespace
DW_IDX_die_offset: 0x00000044
DW_IDX_parent: <parent not indexed>
String: 0x000000b7 "B"
Entry @ 0x103 {
Tag: DW_TAG_namespace
DW_IDX_type_unit: 0x00
DW_IDX_die_offset: 0x00000025
DW_IDX_parent: Entry @ 0xe5
Entry @ 0x10d {
Tag: DW_TAG_namespace
DW_IDX_type_unit: 0x01
DW_IDX_die_offset: 0x00000025
DW_IDX_parent: Entry @ 0xf1
Entry @ 0x117 {
Tag: DW_TAG_namespace
DW_IDX_die_offset: 0x00000046
DW_IDX_parent: Entry @ 0xf7
So the second and third "B" entries match up with the appropriate "A" entries, but the first jumps over to refer to a different DIE in the CU that has the same offset.
This is because the UniqueID on a unit is only unique within that type of unit (it's unique across CUs and, separately, unique across TUs) - so the "DieOffsetAndUnitID" is not a globally unique identifier for an entry - it'd need the type of unit in there too to completely uniquify it.
Adding this patch is enough to expose the issue more directly:
diff --git a/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp b/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
index 5b679fd3b9f9..cc8b8d1881ed 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AccelTable.cpp
@@ -615,8 +615,10 @@ Dwarf5AccelTableWriter::Dwarf5AccelTableWriter(
for (auto &Bucket : Contents.getBuckets())
for (auto *Hash : Bucket)
- for (auto *Value : Hash->getValues<DWARF5AccelTableData *>())
- IndexedOffsets.insert(Value->getDieOffsetAndUnitID());
+ for (auto *Value : Hash->getValues<DWARF5AccelTableData *>()) {
+ auto Inserted = IndexedOffsets.insert(Value->getDieOffsetAndUnitID()).second;
+ assert(Inserted);
+ }
populateAbbrevsMap();
}
The two different units (one CU, one TU) with the same unit ID end up colliding in the IndexedOffsets
set - and only one ends up in there, and so then the loop here:
for (OffsetAndUnitID Offset : IndexedOffsets)
DIEOffsetToAccelEntryLabel.insert({Offset, Asm->createTempSymbol("")});
Only inserts the copy once, oh... and /this/ code:
if (EmittedAccelEntrySymbols.insert(EntrySymbol).second)
Asm->OutStreamer->emitLabel(EntrySymbol);
Silently skips emitting the label even though it matches more than one entry unintentionally, because it can match more than one entry /intentionally/ (if there's multiple entries for exactly the same entity, but known by different names (like the mangled name and unmangled name)).
So, yeah.
Hmm, I guess at least the type unit number probably isn't gapless - there are cases where we create a type unit and then potentially throw it away (see the TypeUnitsUnderConstruction stuff) - but we don't reuse the type unit number.
So, maybe it's possible to use a single numbering for CUs and TUs, gaps and all?
The only thing I was worried about was that some code might be using the numbering to index into an array of units at some point, but if that's not the case - great! Probably more intuitive that they be totally unique numbers.