
Commit 9986c2e

keep formatting in interpreter_definiton
1 parent d31258c commit 9986c2e

1 file changed: +19 -37 lines changed


Tools/cases_generator/interpreter_definition.md

Lines changed: 19 additions & 37 deletions
@@ -6,16 +6,15 @@ The CPython interpreter is defined in C, meaning that the semantics of the
 bytecode instructions, the dispatching mechanism, error handling, and
 tracing and instrumentation are all intermixed.

-This document proposes defining a custom C-like DSL for defining the
+This document proposes defining a custom C-like DSL for defining the
 instruction semantics and tools for generating the code deriving from
 the instruction definitions.

 These tools would be used to:
-
-- Generate the main interpreter (done)
-- Generate the tier 2 interpreter
-- Generate documentation for instructions
-- Generate metadata about instructions, such as stack use (done).
+* Generate the main interpreter (done)
+* Generate the tier 2 interpreter
+* Generate documentation for instructions
+* Generate metadata about instructions, such as stack use (done).

 Having a single definition file ensures that there is a single source
 of truth for bytecode semantics.
@@ -46,7 +45,7 @@ passes from the semantic definition, reducing errors.

 As we improve the performance of CPython, we need to optimize larger regions
 of code, use more complex optimizations and, ultimately, translate to machine
-code.
+code.

 All of these steps introduce the possibility of more bugs, and require more code
 to be written. One way to mitigate this is through the use of code generators.
@@ -62,9 +61,10 @@ blocks as the instructions for the tier 1 (PEP 659) interpreter.
 Rewriting all the instructions is tedious and error-prone, and changing the
 instructions is a maintenance headache as both versions need to be kept in sync.

-By using a code generator and using a common source for the instructions, or
+By using a code generator and using a common source for the instructions, or
 parts of instructions, we can reduce the potential for errors considerably.

+
 ## Specification

 This specification is a work in progress.
@@ -74,7 +74,7 @@ We update it as the need arises.

 Each op definition has a kind, a name, a stack and instruction stream effect,
 and a piece of C code describing its semantics::
-
+
 ```
 file:
 (definition | family | pseudo)+
@@ -85,7 +85,7 @@ and a piece of C code describing its semantics::
 "op" "(" NAME "," stack_effect ")" "{" C-code "}"
 |
 "macro" "(" NAME ")" "=" uop ("+" uop)* ";"
-
+
 stack_effect:
 "(" [inputs] "--" [outputs] ")"
 
@@ -128,9 +128,9 @@ and a piece of C code describing its semantics::

 The following definitions may occur:

-- `inst`: A normal instruction, as previously defined by `TARGET(NAME)` in `ceval.c`.
-- `op`: A part instruction from which macros can be constructed.
-- `macro`: A bytecode instruction constructed from ops and cache effects.
+* `inst`: A normal instruction, as previously defined by `TARGET(NAME)` in `ceval.c`.
+* `op`: A part instruction from which macros can be constructed.
+* `macro`: A bytecode instruction constructed from ops and cache effects.

 `NAME` can be any ASCII identifier that is a C identifier and not a C or Python keyword.
 `foo_1` is legal. `$` is not legal, nor is `struct` or `class`.
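
The naming rule above can be checked mechanically. The following is a minimal Python sketch and not the generator's actual code: `is_valid_name` and `C_KEYWORDS` are hypothetical names, and the keyword set lists only the C89 keywords, so keywords added by later C standards (for example `restrict` or `_Bool`) would need to be added.

```python
import keyword
import re

# Assumption: only the C89 keywords are listed here; later C standards add more.
C_KEYWORDS = frozenset({
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if", "int",
    "long", "register", "return", "short", "signed", "sizeof", "static",
    "struct", "switch", "typedef", "union", "unsigned", "void", "volatile",
    "while",
})

# An ASCII identifier: a letter or underscore, then letters, digits, underscores.
_ASCII_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_name(name: str) -> bool:
    """Return True if `name` is an ASCII identifier and not a C or Python keyword."""
    if _ASCII_IDENTIFIER.fullmatch(name) is None:
        return False
    return name not in C_KEYWORDS and not keyword.iskeyword(name)
```

With this sketch, `foo_1` is accepted, while `$`, `struct`, and `class` are rejected, matching the examples in the text.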
@@ -165,9 +165,9 @@ part of the DSL.

 Those functions include:

-- `DEOPT_IF(cond, instruction)`. Deoptimize if `cond` is met.
-- `ERROR_IF(cond, label)`. Jump to error handler at `label` if `cond` is true.
-- `DECREF_INPUTS()`. Generate `Py_DECREF()` calls for the input stack effects.
+* `DEOPT_IF(cond, instruction)`. Deoptimize if `cond` is met.
+* `ERROR_IF(cond, label)`. Jump to error handler at `label` if `cond` is true.
+* `DECREF_INPUTS()`. Generate `Py_DECREF()` calls for the input stack effects.

 Note that the use of `DECREF_INPUTS()` is optional -- manual calls
 to `Py_DECREF()` or other approaches are also acceptable
@@ -203,7 +203,6 @@ two idioms are valid:
 `ERROR_IF(true, error)`.

 An example of the latter would be:
-
 ```cc
 res = PyObject_Add(left, right);
 if (res == NULL) {
@@ -232,16 +231,13 @@ The same is true for all members of a pseudo instruction
 Some examples:

 ### Output stack effect
-
 ```C
 inst ( LOAD_FAST, (-- value) ) {
 value = frame->f_localsplus[oparg];
 Py_INCREF(value);
 }
 ```
-
 This would generate:
-
 ```C
 TARGET(LOAD_FAST) {
 PyObject *value;
@@ -253,15 +249,12 @@ This would generate:
 ```

 ### Input stack effect
-
 ```C
 inst ( STORE_FAST, (value --) ) {
 SETLOCAL(oparg, value);
 }
 ```
-
 This would generate:
-
 ```C
 TARGET(STORE_FAST) {
 PyObject *value = PEEK(1);
@@ -272,17 +265,14 @@ This would generate:
 ```
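
The two examples above differ only in which side of `--` names a value. As a rough illustration of how a tool could read that notation, here is a Python sketch under simplifying assumptions: `parse_stack_effect` and `net_stack_change` are hypothetical helpers, and they handle only plain names, not cache effects such as `type_version/2`, arrays such as `items[oparg]`, or conditional entries.

```python
def parse_stack_effect(text: str) -> tuple[list[str], list[str]]:
    """Split "(inputs -- outputs)" into a list of input names and output names."""
    inner = text.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError(f"stack effect must be parenthesized: {text!r}")
    inputs, separator, outputs = inner[1:-1].partition("--")
    if not separator:
        raise ValueError(f"stack effect is missing '--': {text!r}")

    def names(part: str) -> list[str]:
        # Comma-separated names; an empty side yields an empty list.
        return [name.strip() for name in part.split(",") if name.strip()]

    return names(inputs), names(outputs)


def net_stack_change(text: str) -> int:
    """Number of values pushed minus number of values popped."""
    inputs, outputs = parse_stack_effect(text)
    return len(outputs) - len(inputs)
```

For `LOAD_FAST`, `net_stack_change("(-- value)")` is `1` (one value pushed); for `STORE_FAST`, `net_stack_change("(value --)")` is `-1` (one value popped).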
 
 ### Input stack effect and cache effect
-
 ```C
 op ( CHECK_OBJECT_TYPE, (owner, type_version/2 -- owner) ) {
 PyTypeObject *tp = Py_TYPE(owner);
 assert(type_version != 0);
 DEOPT_IF(tp->tp_version_tag != type_version);
 }
 ```
-
 This might become (if it was an instruction):
-
 ```C
 TARGET(CHECK_OBJECT_TYPE) {
 PyObject *owner = PEEK(1);
@@ -298,14 +288,12 @@ This might become (if it was an instruction):
 ### More examples

 For explanations see "Generating the interpreter" below.)
-
 ```C
 op ( CHECK_HAS_INSTANCE_VALUES, (owner -- owner) ) {
 PyDictOrValues dorv = *_PyObject_DictOrValuesPointer(owner);
 DEOPT_IF(!_PyDictOrValues_IsValues(dorv));
 }
 ```
-
 ```C
 op ( LOAD_INSTANCE_VALUE, (owner, index/1 -- null if (oparg & 1), res) ) {
 res = _PyDictOrValues_GetValues(dorv)->values[index];
@@ -315,13 +303,11 @@ For explanations see "Generating the interpreter" below.)
 Py_DECREF(owner);
 }
 ```
-
 ```C
 macro ( LOAD_ATTR_INSTANCE_VALUE ) =
 counter/1 + CHECK_OBJECT_TYPE + CHECK_HAS_INSTANCE_VALUES +
 LOAD_INSTANCE_VALUE + unused/4 ;
 ```
-
 ```C
 op ( LOAD_SLOT, (owner, index/1 -- null if (oparg & 1), res) ) {
 char *addr = (char *)owner + index;
@@ -332,18 +318,15 @@ For explanations see "Generating the interpreter" below.)
 Py_DECREF(owner);
 }
 ```
-
 ```C
 macro ( LOAD_ATTR_SLOT ) = counter/1 + CHECK_OBJECT_TYPE + LOAD_SLOT + unused/4;
 ```
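
The macro definitions above also determine how far the instruction pointer advances: one unit for the instruction itself plus the cache entries of every component. The Python sketch below works through that arithmetic; `OP_CACHE_SIZES`, `cache_size`, and `instruction_size` are hypothetical names, and the per-op sizes are read off the `name/size` cache effects in the op definitions shown in this diff (`type_version/2`, `index/1`).

```python
# Cache entries declared by each op, taken from the op definitions above.
# Assumption: an op with no `name/size` input declares no cache entries.
OP_CACHE_SIZES = {
    "CHECK_OBJECT_TYPE": 2,          # type_version/2
    "CHECK_HAS_INSTANCE_VALUES": 0,  # no cache effect
    "LOAD_INSTANCE_VALUE": 1,        # index/1
    "LOAD_SLOT": 1,                  # index/1
}


def cache_size(component: str) -> int:
    """Cache entries used by one macro component (an op name or `name/size`)."""
    name, slash, size = component.partition("/")
    if slash:
        return int(size)
    return OP_CACHE_SIZES[name]


def instruction_size(macro_body: str) -> int:
    """One unit for the instruction itself plus all cache entries."""
    body = macro_body.strip().rstrip(";")
    components = [part.strip() for part in body.split("+")]
    return 1 + sum(cache_size(part) for part in components)
```

For `LOAD_ATTR_SLOT`, `instruction_size("counter/1 + CHECK_OBJECT_TYPE + LOAD_SLOT + unused/4;")` is `1 + 1 + 2 + 1 + 4 = 9`, the same sum that appears in the `next_instr += (1 + 1 + 2 + 1 + 4);` line of the generated code in the last hunk of this diff.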
-
 ```C
 inst ( BUILD_TUPLE, (items[oparg] -- tuple) ) {
 tuple = _PyTuple_FromArraySteal(items, oparg);
 ERROR_IF(tuple == NULL, error);
 }
 ```
-
 ```C
 inst ( PRINT_EXPR ) {
 PyObject *value = POP();
@@ -367,21 +350,20 @@ For explanations see "Generating the interpreter" below.)
 A _family_ maps a specializable instruction to its specializations.

 Example: These opcodes all share the same instruction format):
-
 ```C
-family(LOAD_ATTR) = { LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT };
+family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT };
 ```
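
To make the "specializable instruction to its specializations" mapping concrete, here is a small Python sketch that reads a family declaration in the new form shown on the added line. It is an illustration only: `parse_family` and `specializations_by_head` are hypothetical helpers, and treating the first member as the specializable instruction is an assumption inferred from the example, not something this diff states.

```python
import re

# Matches: family(NAME) = { MEMBER, MEMBER, ... };
_FAMILY = re.compile(r"family\s*\(\s*(\w+)\s*\)\s*=\s*\{([^}]*)\}\s*;")


def parse_family(text: str) -> tuple[str, list[str]]:
    """Return the family name and its members, in declaration order."""
    match = _FAMILY.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a family declaration: {text!r}")
    members = [m.strip() for m in match.group(2).split(",") if m.strip()]
    return match.group(1), members


def specializations_by_head(text: str) -> dict[str, list[str]]:
    """Map the specializable instruction to its specializations.

    Assumption: the first member is the specializable instruction.
    """
    _, members = parse_family(text)
    if not members:
        raise ValueError("a family needs at least one member")
    head, *specializations = members
    return {head: specializations}
```

Applied to the declaration above, this yields `{"LOAD_ATTR": ["LOAD_ATTR_INSTANCE_VALUE", "LOAD_SLOT"]}`.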
 
 ### Defining a pseudo instruction

 A _pseudo instruction_ is used by the bytecode compiler to represent a set of possible concrete instructions.

 Example: `JUMP` may expand to `JUMP_FORWARD` or `JUMP_BACKWARD`:
-
 ```C
 pseudo(JUMP) = { JUMP_FORWARD, JUMP_BACKWARD };
 ```
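
One way to picture the expansion of `JUMP` is a lookup that picks a concrete instruction once the jump target is known. The Python sketch below is hypothetical: `PSEUDO_TARGETS` and `resolve_jump` are illustrative names, and choosing by comparing offsets is an assumption about how a compiler might decide, not a description of the actual bytecode compiler.

```python
# Concrete instructions a pseudo instruction may expand to, in the order
# (forward, backward), following the `pseudo(JUMP)` declaration above.
PSEUDO_TARGETS = {"JUMP": ("JUMP_FORWARD", "JUMP_BACKWARD")}


def resolve_jump(pseudo: str, source_offset: int, target_offset: int) -> str:
    """Pick the concrete instruction for a pseudo jump.

    Assumption: a target after the source is a forward jump; anything else
    (including a jump to the same offset) is treated as backward.
    """
    forward, backward = PSEUDO_TARGETS[pseudo]
    return forward if target_offset > source_offset else backward
```

For example, `resolve_jump("JUMP", 10, 20)` selects `JUMP_FORWARD`, and `resolve_jump("JUMP", 20, 10)` selects `JUMP_BACKWARD`.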
 
+
 ## Generating the interpreter

 The generated C code for a single instruction includes a preamble and dispatch at the end
@@ -430,7 +412,7 @@ rather than popping and pushing, such that `LOAD_ATTR_SLOT` would look something
 stack_pointer += 1;
 }
 s1 = res;
-}
+}
 next_instr += (1 + 1 + 2 + 1 + 4);
 stack_pointer[-1] = s1;
 DISPATCH();
