RFC: Optional Language Provider Factory for Non-Sleigh Processor Languages #8781
Replies: 2 comments 4 replies
Would it be possible to just add a new variant of |
Ghidra's C++ decompiler code might be a more natural place to handle processor-specific instruction decoding and semantics. It's a little harder to extend than the Java code you reference, but the greater access to inferred block structure and dependency/heritage linkages might make up for that. Your path forward might be:
The RISC-V vector instructions handle some of the same inference-engine tasks that the Hexagon seems to be built for. The SLEIGH file does the simple, local decoding, generating about 1300 user pcode ops with only very simple semantic definitions. Vector instruction semantics can be heavily dependent on context, especially 'upstream' vector configuration instructions. That means correct semantic PCode can't be generated until after a lot of control flow and dependency/heritage/Phi Node/Multiequal analysis is complete. The decompiler's iterative design works pretty well at handling the circular dependencies here.
RFC: Optional Language Provider Factory for Non-Sleigh Processor Languages
Type: Design Proposal / RFC
Scope: Documentation only – no implementation included
Motivation / Problem Statement
Ghidra processor support is currently centered around SLEIGH (.slaspec) definitions.
While SLEIGH is powerful, it can be cumbersome or impractical for certain non-standard or highly irregular architectures, such as Qualcomm Hexagon, where instruction decoding and semantics are more naturally expressed in custom code.
At present, developers who wish to integrate such architectures are effectively constrained to:
This RFC proposes a minimal and backward-compatible extension point that allows processor developers to supply their own Language implementation, while remaining fully integrated with Ghidra’s language loading, analysis, and decompiler architecture.
Goals
Language implementation without forking or replacing core Ghidra logic
Language, LanguageProvider, .ldefs)
Non-Goals
.slaspec behavior
High-Level Design
Key Idea
Allow a processor developer to supply their own Language implementation by:
SleighLanguage
LanguageProvider
.ldefs file
This enables SleighLanguage.parse() to return a developer-defined InstructionPrototype, fully integrated into the existing Ghidra pipeline.
Proposed .ldefs Extension
Introduce an optional language provider factory declaration in the .ldefs file.
Example (conceptual)
Semantics
Language Provider Factory Contract
The factory class must:
Responsibilities
LanguageProvider
Language
Language must extend SleighLanguage
Custom Language Behavior
The custom Language implementation:
SleighLanguage
SleighLanguage.parse() returns a custom InstructionPrototype
This allows instruction decoding and semantic modeling to be implemented directly in Java where SLEIGH is impractical.
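As a rough illustration of the idea above (a subclass whose parse() returns its own prototype), here is a minimal sketch. Everything in it is an assumption made for illustration: SleighLanguageStandIn and PrototypeStandIn are stand-ins that only mimic the shape of Ghidra's SleighLanguage and InstructionPrototype, the parse signature is simplified and is not Ghidra's real one, and the 4-bit opcode encoding is invented and has nothing to do with Hexagon.

```java
// Stand-in for Ghidra's InstructionPrototype (hypothetical, greatly reduced).
interface PrototypeStandIn {
    String mnemonic();

    int length();
}

// Stand-in for SleighLanguage; only a simplified parse hook is modeled here.
class SleighLanguageStandIn {
    PrototypeStandIn parse(byte[] bytes, int offset) {
        throw new UnsupportedOperationException("SLEIGH decoding is not modeled in this sketch");
    }
}

// Hypothetical custom language: overrides parse() and decodes in plain Java.
class CustomLanguageSketch extends SleighLanguageStandIn {
    @Override
    PrototypeStandIn parse(byte[] bytes, int offset) {
        if (bytes.length - offset < 4) {
            throw new IllegalArgumentException("need 4 bytes to decode an instruction");
        }
        // Assemble a little-endian 32-bit instruction word.
        int word = (bytes[offset] & 0xFF)
                | ((bytes[offset + 1] & 0xFF) << 8)
                | ((bytes[offset + 2] & 0xFF) << 16)
                | ((bytes[offset + 3] & 0xFF) << 24);
        // Invented encoding: the top four bits select the operation.
        final int opcode = word >>> 28;
        final String name = (opcode == 0) ? "nop" : "op" + opcode;
        return new PrototypeStandIn() {
            @Override
            public String mnemonic() {
                return name;
            }

            @Override
            public int length() {
                return 4;
            }
        };
    }
}
```

In a real module the subclass would extend Ghidra's SleighLanguage and return an actual InstructionPrototype; the stand-ins are only there so the sketch compiles without Ghidra on the classpath.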
Integration Point in Ghidra
The proposed integration point is intentionally small and localized.
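To make the size of the change concrete, the following is a minimal sketch of how a loader might honor the optional language_provider_factory attribute and otherwise fall back to the existing provider. It is not Ghidra code: ProviderStandIn, ProviderFactoryStandIn, ExampleProviderFactory, and ProviderResolver are hypothetical names, and the attribute map stands in for whatever the .ldefs parser actually exposes. Only the attribute name and the getProvider() method come from this RFC.

```java
import java.util.Map;

// Stand-in for Ghidra's LanguageProvider, so the sketch compiles without Ghidra.
interface ProviderStandIn {
    String describe();
}

// Hypothetical factory contract: a no-argument constructor plus getProvider().
interface ProviderFactoryStandIn {
    ProviderStandIn getProvider();
}

// Hypothetical factory that a processor module might ship.
class ExampleProviderFactory implements ProviderFactoryStandIn {
    public ExampleProviderFactory() {
    }

    @Override
    public ProviderStandIn getProvider() {
        return () -> "custom provider";
    }
}

class ProviderResolver {
    static final String FACTORY_ATTR = "language_provider_factory";

    // Returns the factory-supplied provider if the attribute is present,
    // otherwise the fallback, so existing definitions behave as before.
    static ProviderStandIn resolve(Map<String, String> attrs, ProviderStandIn fallback) {
        String className = attrs.get(FACTORY_ATTR);
        if (className == null || className.trim().isEmpty()) {
            return fallback;
        }
        try {
            Object factory = Class.forName(className).getDeclaredConstructor().newInstance();
            if (!(factory instanceof ProviderFactoryStandIn)) {
                throw new IllegalStateException(className + " does not implement the factory contract");
            }
            return ((ProviderFactoryStandIn) factory).getProvider();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load provider factory " + className, e);
        }
    }
}
```

The fallback branch is what keeps the change backward compatible: a language definition without the attribute never reaches the reflective path. A failed load is reported as an exception here. Whether Ghidra should instead log the failure and skip the language is a design decision this sketch does not settle.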
Affected Method
Proposed Change
language_provider_factory attribute
getProvider()
LanguageProvider
Backward Compatibility
.ldefs files are unaffected
Security and Stability Considerations
Use Case: Qualcomm Hexagon
Qualcomm Hexagon is an example of an architecture where:
This proposal allows such architectures to be supported without weakening or bypassing Ghidra’s existing abstractions.
Proposed Development Plan
Open Questions for Maintainers
.ldefs acceptable?
SleighLanguageProvider.read(...) the preferred integration point?
Summary
This RFC proposes a minimal, optional, and backward-compatible mechanism for supporting non-standard processor languages by allowing developers to supply their own Language implementations, while remaining fully integrated with Ghidra’s architecture.
Feedback from maintainers is requested before submitting an implementation.