Commit b72b3cd

Small fixes & preparation for publishing Python bindings onto PyPi (#8)
1 parent 6125d4b commit b72b3cd

6 files changed

+228
-18
lines changed

.github/workflows/udmp-parser.yml

Lines changed: 22 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -13,9 +13,9 @@ jobs:
1313
- {os: windows-latest, generator: ninja, arch: x64, config: RelWithDebInfo}
1414
- {os: windows-latest, generator: msvc, arch: win32, config: RelWithDebInfo}
1515
- {os: windows-latest, generator: msvc, arch: arm64, config: RelWithDebInfo}
16-
- {os: ubuntu-latest, generator: gcc, arch: , config: RelWithDebInfo}
17-
- {os: ubuntu-latest, generator: clang, arch: , config: RelWithDebInfo}
18-
- {os: macos-latest, generator: clang, arch: , config: Release}
16+
- {os: ubuntu-latest, generator: gcc, arch: x64, config: RelWithDebInfo}
17+
- {os: ubuntu-latest, generator: clang, arch: x64, config: RelWithDebInfo}
18+
- {os: macos-latest, generator: clang, arch: x64, config: Release}
1919
runs-on: ${{ matrix.variant.os }}
2020
name: parser / ${{ matrix.variant.os }} / ${{ matrix.variant.generator }} / ${{ matrix.variant.arch }}
2121
env:
@@ -81,9 +81,9 @@ jobs:
8181
- {os: windows-latest, generator: msvc, arch: x64, config: RelWithDebInfo, py-arch: x64}
8282
- {os: windows-latest, generator: msvc, arch: win32, config: RelWithDebInfo, py-arch: x86}
8383
# - {os: windows-latest, generator: msvc, arch: arm64, config: RelWithDebInfo, py-arch: x64} # Unsupported (see https://raw.githubusercontent.com/actions/python-versions/main/versions-manifest.json)
84-
- {os: ubuntu-latest, generator: gcc, arch: , config: RelWithDebInfo, py-arch: x64}
85-
- {os: ubuntu-latest, generator: clang, arch: , config: RelWithDebInfo, py-arch: x64}
86-
- {os: macos-latest, generator: clang, arch: , config: Release, py-arch: x64}
84+
- {os: ubuntu-latest, generator: gcc, arch: x64, config: RelWithDebInfo, py-arch: x64}
85+
- {os: ubuntu-latest, generator: clang, arch: x64, config: RelWithDebInfo, py-arch: x64}
86+
- {os: macos-latest, generator: clang, arch: x64, config: Release, py-arch: x64}
8787
runs-on: ${{ matrix.variant.os }}
8888
name: bindings / ${{ matrix.variant.os }} / ${{ matrix.variant.generator }} / ${{ matrix.variant.arch }}
8989
env:
@@ -153,8 +153,23 @@ jobs:
153153
pytest -vvv ./tests
154154
cd ../..
155155
156+
- name: Build wheel
157+
run: |
158+
cd src/python
159+
python -m pip wheel -w ../../artifact .
160+
cd ../..
161+
156162
- name: Upload artifacts
157163
uses: actions/upload-artifact@v3
158164
with:
159-
name: python-${{ matrix.variant.os }}.${{ matrix.variant.generator }}-bin-${{ matrix.variant.arch }}.${{ matrix.variant.config }}
165+
name: python-${{ matrix.variant.os }}.${{ matrix.variant.generator }}-${{ matrix.variant.arch }}.${{ matrix.variant.config }}
160166
path: artifact/
167+
168+
- name: Upload to PyPi
169+
if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags')
170+
uses: pypa/gh-action-pypi-publish@release/v1
171+
with:
172+
password: ${{ secrets.PYPI_API_TOKEN }}
173+
print-hash: true
174+
packages-dir: artifact/
175+
verbose: true

README.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -39,7 +39,6 @@ Examples:
3939
parser.exe -t main user.dmp
4040
Show a memory page at a specific address:
4141
parser.exe -dump 0x7ff00 user.dmp
42-
4342
```
4443

4544
## Building
@@ -157,6 +156,7 @@ Feature-wise, here are some examples of usage:
157156
Get a hashmap of threads (as `{TID: ThreadObject}`), access their information:
158157

159158
```python
159+
>>> threads = dmp.Threads()
160160
>>> len(threads)
161161
14
162162
>>> threads

src/python/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -42,7 +42,7 @@ else()
4242
if(MSVC)
4343
install(FILES $<TARGET_PDB_FILE:udmp_parser> DESTINATION bindings/python OPTIONAL)
4444
endif(MSVC)
45-
endif(BUILD_PYTHON_PACKAGE)
45+
endif()
4646

4747
if(WIN32)
4848
target_compile_definitions(udmp_parser PRIVATE NOMINMAX)

src/python/README.md

Lines changed: 201 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -2,9 +2,207 @@
22

33
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) [![Licence MIT](https://img.shields.io/packagist/l/doctrine/orm.svg?maxAge=2592000?style=plastic)](https://github.com/0vercl0k/udmp-parser/blob/master/LICENSE)
44

5+
`udmp-parser` is a cross-platform C++ parser library for Windows [user minidumps](https://docs.microsoft.com/en-us/windows/win32/debug/minidump-files) written by [0vercl0k](https://github.com/0vercl0k). The Python bindings were added by [hugsy](https://github.com/hugsy). Refer to the [project page on Github](https://github.com/0vercl0k/udmp-parser) for documentation, issues and pull requests.
56

6-
This package holds the Python bindings for the [`udmp-parser`](https://github.com/0vercl0k/udmp-parser) project.
7+
![parser](https://github.com/0vercl0k/udmp-parser/raw/main/pics/parser.gif)
78

8-
`udmp-parser` is a cross-platform C++ parser library for Windows [user minidumps](https://docs.microsoft.com/en-us/windows/win32/debug/minidump-files) written by [0vercl0k](https://github.com/0vercl0k). The Python bindings were added by [hugsy](https://github.com/hugsy).
9+
The library supports Intel 32-bit / 64-bit dumps and provides read access to things like:
910

10-
Refer to the [project page on Github](https://github.com/0vercl0k/udmp-parser) for documentation, issues and pull requests.
11+
- The thread list and their context records,
12+
- The virtual memory,
13+
- The loaded modules.
14+
15+
## Installing from PyPI
16+
17+
The easiest way is simply to:
18+
19+
```
20+
pip install udmp_parser
21+
```
22+
23+
## Usage
24+
25+
The Python API was built around the C++ code so the names were preserved. Everything lives within the module `udmp_parser`.
26+
Note: For convenience, a simple [pure Python script](src/python/tests/utils.py) was added to generate minidumps ready to use:
27+
28+
```python
29+
$ python -i src/python/tests/utils.py
30+
>>> pid, dmppath = generate_minidump_from_process_name("winver.exe")
31+
Minidump generated successfully: PID=3232 -> minidump-winver.exe-1687024880.dmp
32+
>>> pid
33+
3232
34+
>>> dmppath
35+
WindowsPath('minidump-winver.exe-1687024880.dmp')
36+
```
37+
38+
Parsing a minidump object is as simple as:
39+
40+
```python
41+
>>> import pathlib, udmp_parser
42+
>>> udmp_parser.version.major, udmp_parser.version.minor, udmp_parser.version.release
43+
(0, 4, '')
44+
>>> dmp = udmp_parser.UserDumpParser()
45+
>>> dmp.Parse(pathlib.Path("C:/temp/rundll32.dmp"))
46+
True
47+
```
48+
49+
Feature-wise, here are some examples of usage:
50+
51+
### Threads
52+
53+
Get a hashmap of threads (as `{TID: ThreadObject}`), access their information:
54+
55+
```python
56+
>>> threads = dmp.Threads()
57+
>>> len(threads)
58+
14
59+
>>> threads
60+
{5292: Thread(Id=0x14ac, SuspendCount=0x1, Teb=0x2e8000),
61+
5300: Thread(Id=0x14b4, SuspendCount=0x1, Teb=0x2e5000),
62+
5316: Thread(Id=0x14c4, SuspendCount=0x1, Teb=0x2df000),
63+
3136: Thread(Id=0xc40, SuspendCount=0x1, Teb=0x2ee000),
64+
4204: Thread(Id=0x106c, SuspendCount=0x1, Teb=0x309000),
65+
5328: Thread(Id=0x14d0, SuspendCount=0x1, Teb=0x2e2000),
66+
1952: Thread(Id=0x7a0, SuspendCount=0x1, Teb=0x2f7000),
67+
3888: Thread(Id=0xf30, SuspendCount=0x1, Teb=0x2eb000),
68+
1760: Thread(Id=0x6e0, SuspendCount=0x1, Teb=0x2f4000),
69+
792: Thread(Id=0x318, SuspendCount=0x1, Teb=0x300000),
70+
1972: Thread(Id=0x7b4, SuspendCount=0x1, Teb=0x2fa000),
71+
1228: Thread(Id=0x4cc, SuspendCount=0x1, Teb=0x2fd000),
72+
516: Thread(Id=0x204, SuspendCount=0x1, Teb=0x303000),
73+
2416: Thread(Id=0x970, SuspendCount=0x1, Teb=0x306000)}
74+
```
75+
76+
Access an individual thread, including its register context:
77+
78+
```python
79+
>>> thread = threads[5292]
80+
>>> print(f"RIP={thread.Context.Rip:#x} RBP={thread.Context.Rbp:#x} RSP={thread.Context.Rsp:#x}")
81+
RIP=0x7ffc264b0ad4 RBP=0x404fecc RSP=0x7de628
82+
```
83+
84+
85+
### Modules
86+
87+
Get a hashmap of modules (as `{address: ModuleObject}`), access their information:
88+
89+
```python
90+
>>> modules = dmp.Modules()
91+
>>> modules
92+
{1572864: Module_t(BaseOfImage=0x180000, SizeOfImage=0x3000, ModuleName=C:\Windows\SysWOW64\sfc.dll),
93+
10813440: Module_t(BaseOfImage=0xa50000, SizeOfImage=0x14000, ModuleName=C:\Windows\SysWOW64\rundll32.exe),
94+
1929052160: Module_t(BaseOfImage=0x72fb0000, SizeOfImage=0x11000, ModuleName=C:\Windows\SysWOW64\wkscli.dll),
95+
1929183232: Module_t(BaseOfImage=0x72fd0000, SizeOfImage=0x52000, ModuleName=C:\Windows\SysWOW64\mswsock.dll),
96+
1929576448: Module_t(BaseOfImage=0x73030000, SizeOfImage=0xf000, ModuleName=C:\Windows\SysWOW64\browcli.dll),
97+
1929641984: Module_t(BaseOfImage=0x73040000, SizeOfImage=0xa000, ModuleName=C:\Windows\SysWOW64\davhlpr.dll),
98+
1929707520: Module_t(BaseOfImage=0x73050000, SizeOfImage=0x19000, ModuleName=C:\Windows\SysWOW64\davclnt.dll),
99+
1929838592: Module_t(BaseOfImage=0x73070000, SizeOfImage=0x18000, ModuleName=C:\Windows\SysWOW64\ntlanman.dll),
100+
[...]
101+
140720922427392: Module_t(BaseOfImage=0x7ffc24980000, SizeOfImage=0x83000, ModuleName=C:\Windows\System32\wow64win.dll),
102+
140720923017216: Module_t(BaseOfImage=0x7ffc24a10000, SizeOfImage=0x59000, ModuleName=C:\Windows\System32\wow64.dll),
103+
140720950280192: Module_t(BaseOfImage=0x7ffc26410000, SizeOfImage=0x1f8000, ModuleName=C:\Windows\System32\ntdll.dll)}
104+
```
105+
106+
Access module info directly:
107+
108+
```python
109+
>>> ntdll_modules = [mod for addr, mod in dmp.Modules().items() if mod.ModuleName.lower().endswith("ntdll.dll")]
110+
>>> len(ntdll_modules)
111+
2
112+
>>> for ntdll in ntdll_modules:
113+
...     print(f"{ntdll.ModuleName=} {ntdll.BaseOfImage=:#x} {ntdll.SizeOfImage=:#x}")
114+
115+
ntdll.ModuleName='C:\\Windows\\SysWOW64\\ntdll.dll' ntdll.BaseOfImage=0x77430000 ntdll.SizeOfImage=0x1a4000
116+
ntdll.ModuleName='C:\\Windows\\System32\\ntdll.dll' ntdll.BaseOfImage=0x7ffc26410000 ntdll.SizeOfImage=0x1f8000
117+
```
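
Because `Modules()` is keyed by base address, mapping an arbitrary address (a register value, for instance) back to its module comes down to a range check over `BaseOfImage` and `SizeOfImage`. The following is a minimal sketch under stated assumptions: `FakeModule` is a hypothetical stand-in for the binding objects so the snippet runs without a dump, and `find_module` is an illustrative helper, not part of `udmp_parser`. The addresses are copied from the outputs shown above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeModule:
    """Hypothetical stand-in exposing the three attributes used in this README."""
    BaseOfImage: int
    SizeOfImage: int
    ModuleName: str


def find_module(modules: dict, address: int) -> Optional[FakeModule]:
    """Return the module whose [base, base + size) range contains `address`."""
    for module in modules.values():
        if module.BaseOfImage <= address < module.BaseOfImage + module.SizeOfImage:
            return module
    return None


# Two entries copied from the ntdll output above, keyed by base address.
modules = {
    0x77430000: FakeModule(0x77430000, 0x1A4000, r"C:\Windows\SysWOW64\ntdll.dll"),
    0x7FFC26410000: FakeModule(0x7FFC26410000, 0x1F8000, r"C:\Windows\System32\ntdll.dll"),
}

# RIP of thread 5292 from the Threads example.
owner = find_module(modules, 0x7FFC264B0AD4)
print(owner.ModuleName if owner else "no module")
```

With a real dump, passing `dmp.Modules()` instead of the hand-built dictionary should behave the same way, assuming the module objects expose these attributes as shown in the outputs above.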
118+
119+
A convenience function, `udmp_parser.UserDumpParser.ReadMemory()`, reads memory directly from the dump. Its signature is `def ReadMemory(Address: int, Size: int) -> list[int]`. For instance, to dump the `wow64` module:
120+
121+
```python
122+
>>> wow64 = [mod for addr, mod in dmp.Modules().items() if mod.ModuleName.lower() == r"c:\windows\system32\wow64.dll"][0]
123+
>>> print(str(wow64))
124+
Module_t(BaseOfImage=0x7ffc24a10000, SizeOfImage=0x59000, ModuleName=C:\Windows\System32\wow64.dll)
125+
>>> wow64_module = bytearray(dmp.ReadMemory(wow64.BaseOfImage, wow64.SizeOfImage))
126+
>>> assert wow64_module[:2] == b'MZ'
127+
>>> import hexdump
128+
>>> hexdump.hexdump(wow64_module[:128])
129+
00000000: 4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF 00 00 MZ..............
130+
00000010: B8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 ........@.......
131+
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
132+
00000030: 00 00 00 00 00 00 00 00 00 00 00 00 E8 00 00 00 ................
133+
00000040: 0E 1F BA 0E 00 B4 09 CD 21 B8 01 4C CD 21 54 68 ........!..L.!Th
134+
00000050: 69 73 20 70 72 6F 67 72 61 6D 20 63 61 6E 6E 6F is program canno
135+
00000060: 74 20 62 65 20 72 75 6E 20 69 6E 20 44 4F 53 20 t be run in DOS
136+
00000070: 6D 6F 64 65 2E 0D 0D 0A 24 00 00 00 00 00 00 00 mode....$.......
137+
```
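
Since `ReadMemory` returns a `list[int]`, the data usually needs converting to `bytes` before any structured parsing. As a rough sketch of what that can look like, the snippet below reads the `e_lfanew` field (the offset of the PE header, stored at offset `0x3C` of the DOS header) from the first 64 bytes of the hexdump above. `pe_header_offset` is a hypothetical helper written for this illustration, not something the bindings provide.

```python
import struct


def pe_header_offset(raw: list) -> int:
    """Return e_lfanew (offset of the PE header) from a DOS header."""
    data = bytes(raw)  # list[int] -> bytes
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a DOS/PE image")
    # e_lfanew is a little-endian 32-bit value at offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return e_lfanew


# First 64 bytes of the hexdump above, shaped like a ReadMemory result.
header = list(bytes.fromhex(
    "4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF 00 00 "
    "B8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 E8 00 00 00 "
))

print(hex(pe_header_offset(header)))  # prints 0xe8
```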
138+
139+
140+
### Memory
141+
142+
The memory blocks can also be enumerated in a hashmap `{address: MemoryBlock}`.
143+
144+
```python
145+
>>> memory = dmp.Memory()
146+
>>> len(memory)
147+
0x260
148+
>>> memory
149+
[...]
150+
0x7ffc26410000: [MemBlock_t(BaseAddress=0x7ffc26410000, AllocationBase=0x7ffc26410000, AllocationProtect=0x80, RegionSize=0x1000)],
151+
0x7ffc26411000: [MemBlock_t(BaseAddress=0x7ffc26411000, AllocationBase=0x7ffc26410000, AllocationProtect=0x80, RegionSize=0x11c000)],
152+
0x7ffc2652d000: [MemBlock_t(BaseAddress=0x7ffc2652d000, AllocationBase=0x7ffc26410000, AllocationProtect=0x80, RegionSize=0x49000)],
153+
0x7ffc26576000: [MemBlock_t(BaseAddress=0x7ffc26576000, AllocationBase=0x7ffc26410000, AllocationProtect=0x80, RegionSize=0x1000)],
154+
0x7ffc26577000: [MemBlock_t(BaseAddress=0x7ffc26577000, AllocationBase=0x7ffc26410000, AllocationProtect=0x80, RegionSize=0x2000)],
155+
0x7ffc26579000: [MemBlock_t(BaseAddress=0x7ffc26579000, AllocationBase=0x7ffc26410000, AllocationProtect=0x80, RegionSize=0x9000)],
156+
0x7ffc26582000: [MemBlock_t(BaseAddress=0x7ffc26582000, AllocationBase=0x7ffc26410000, AllocationProtect=0x80, RegionSize=0x86000)],
157+
0x7ffc26608000: [MemBlock_t(BaseAddress=0x7ffc26608000, AllocationBase=0x0, AllocationProtect=0x0, RegionSize=0x3d99e8000)]}
158+
```
159+
160+
To facilitate the parsing in a human-friendly manner, some helper functions are provided:
161+
* `udmp_parser.utils.TypeToString`: convert the region type to its meaning (from MSDN)
162+
* `udmp_parser.utils.StateToString`: convert the region state to its meaning (from MSDN)
163+
* `udmp_parser.utils.ProtectionToString`: convert the region protection to its meaning (from MSDN)
164+
165+
This makes it possible to search and filter in a more comprehensible way:
166+
167+
168+
```python
169+
# Collect only executable memory regions
170+
>>> exec_regions = [region for _, region in dmp.Memory().items() if "PAGE_EXECUTE_READ" in udmp_parser.utils.ProtectionToString(region.Protect)]
171+
172+
# Pick any, disassemble code using capstone (assumes `cs = capstone.Cs(capstone.CS_ARCH_X86, capstone.CS_MODE_64)`)
173+
>>> exec_region = exec_regions[-1]
174+
>>> mem = dmp.ReadMemory(exec_region.BaseAddress, 0x100)
175+
>>> for insn in cs.disasm(bytearray(mem), exec_region.BaseAddress):
176+
...     print(f"{insn=}")
177+
178+
insn=<CsInsn 0x7ffc26582000 [cc]: int3 >
179+
insn=<CsInsn 0x7ffc26582001 [cc]: int3 >
180+
insn=<CsInsn 0x7ffc26582002 [cc]: int3 >
181+
insn=<CsInsn 0x7ffc26582003 [cc]: int3 >
182+
insn=<CsInsn 0x7ffc26582004 [cc]: int3 >
183+
insn=<CsInsn 0x7ffc26582005 [cc]: int3 >
184+
insn=<CsInsn 0x7ffc26582006 [cc]: int3 >
185+
insn=<CsInsn 0x7ffc26582007 [cc]: int3 >
186+
insn=<CsInsn 0x7ffc26582008 [cc]: int3 >
187+
insn=<CsInsn 0x7ffc26582009 [cc]: int3 >
188+
insn=<CsInsn 0x7ffc2658200a [cc]: int3 >
189+
insn=<CsInsn 0x7ffc2658200b [cc]: int3 >
190+
insn=<CsInsn 0x7ffc2658200c [cc]: int3 >
191+
insn=<CsInsn 0x7ffc2658200d [cc]: int3 >
192+
insn=<CsInsn 0x7ffc2658200e [cc]: int3 >
193+
insn=<CsInsn 0x7ffc2658200f [cc]: int3 >
194+
insn=<CsInsn 0x7ffc26582010 [48895c2410]: mov qword ptr [rsp + 0x10], rbx>
195+
insn=<CsInsn 0x7ffc26582015 [4889742418]: mov qword ptr [rsp + 0x18], rsi>
196+
insn=<CsInsn 0x7ffc2658201a [57]: push rdi>
197+
insn=<CsInsn 0x7ffc2658201b [4156]: push r14>
198+
insn=<CsInsn 0x7ffc2658201d [4157]: push r15>
199+
[...]
200+
```
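
To make the protection values less opaque, here is a sketch of how a Windows page-protection value decodes into flag names. It is an illustration only and not the implementation behind `udmp_parser.utils.ProtectionToString`; the `protection_to_names` helper and both tables are assumptions made for this example, using the constant values documented for the Windows memory-protection flags.

```python
# Base protections: a region carries one of these values.
PAGE_PROTECTIONS = {
    0x01: "PAGE_NOACCESS",
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x08: "PAGE_WRITECOPY",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
    0x80: "PAGE_EXECUTE_WRITECOPY",
}

# Modifier bits that may be OR-ed onto a base protection.
PAGE_MODIFIERS = {
    0x100: "PAGE_GUARD",
    0x200: "PAGE_NOCACHE",
    0x400: "PAGE_WRITECOMBINE",
}


def protection_to_names(protect: int) -> list:
    """Decode a protection value into the list of flag names it contains."""
    names = [name for bit, name in PAGE_PROTECTIONS.items() if protect & bit]
    names += [name for bit, name in PAGE_MODIFIERS.items() if protect & bit]
    return names


print(protection_to_names(0x80))   # AllocationProtect seen in the Memory output
print(protection_to_names(0x104))  # read/write page with a guard bit
```

Comparing whole names, for example `"PAGE_EXECUTE_READ" in protection_to_names(value)`, matches that exact flag only, whereas a substring test against a single string would also match `PAGE_EXECUTE_READWRITE`.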
201+
202+
# Authors
203+
204+
* Axel '[@0vercl0k](https://twitter.com/0vercl0k)' Souchet
205+
206+
# Contributors
207+
208+
[ ![contributors-img](https://contrib.rocks/image?repo=0vercl0k/udmp-parser) ](https://github.com/0vercl0k/udmp-parser/graphs/contributors)

src/python/pyproject.toml

Lines changed: 2 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -7,13 +7,12 @@ name = "udmp-parser"
77
version = "0.4"
88
description = "A Cross-Platform C++ parser library for Windows user minidumps."
99
readme = "README.md"
10-
requires-python = ">=3.9"
10+
requires-python = ">=3.2"
1111
authors = [{ name = "0vercl0k", email = "[email protected]" }]
1212
classifiers = [
1313
"Development Status :: 4 - Beta",
1414
"License :: OSI Approved :: MIT License",
1515
"Programming Language :: Python :: 3",
16-
"Programming Language :: Python :: 3.9",
1716
"Topic :: Software Development :: Assemblers",
1817
"Natural Language :: English",
1918
]
@@ -26,13 +25,11 @@ Homepage = "https://github.com/0vercl0k/udmp-parser"
2625
profile = "black"
2726

2827
[tool.scikit-build]
28+
wheel.py-api = "cp32"
2929
minimum-version = "0.4"
3030
build-dir = "build/{wheel_tag}"
31-
wheel.py-api = "cp39"
3231
cmake.minimum-version = "3.20"
3332
cmake.args = [
34-
"-DBUILD_PARSER:BOOL=OFF",
35-
"-DBUILD_PYTHON_BINDING:BOOL=OFF",
3633
"-DBUILD_PYTHON_PACKAGE:BOOL=ON",
3734
]
3835

src/python/src/udmp_parser.cc

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -4,7 +4,7 @@
44
// Released under MIT License, by 0vercl0k - 2023
55
//
66
// With contribution from:
7-
// * hugsy -(github.com / hugsy)
7+
// * hugsy - (github.com / hugsy)
88
//
99

1010
#include "udmp-parser.h"
