Commit 21ca371

doc/ref/spec.md: alternative definitions for closedness and friends
This CL changes the meaning of `[x]: y` and `...` fields and closedness with the following objectives:

- Incorporate experience with the usability of definitions and closedness.
- Simplify the rules of closedness to simplify both its usage and implementation.
- Make it (far) easier to express JSON Schema in CUE.
- Allow closedness to be expressed in terms of CUE itself.

The expressibility of JSON Schema in CUE seems important, as this seems to be becoming one of the major application domains of CUE. Aside from that, it is a general win for the language, as it results in more expressibility without loss of generality: the old CUE semantics can easily be simulated by enclosing bulk optional fields in curly braces (`cue fmt` and `cue fix` currently do this rewrite).

The changes to bulk optional fields (now pattern constraints) bring them in line with the definition of "patternProperties" in JSON Schema. The same holds for the `...` notation, which now maps to additional properties. To allow for a complete one-to-one mapping, the spec now also allows for `...T`. This brings it in line with the semantics for lists.

Coincidentally, the new JSON Schema semantics allows closed structs to be defined in terms of CUE itself (using `..._|_`).

In addition, the new simplified definition aims to simplify implementation and incorporate user feedback to make definitions and closed structs more intuitive.

As a consequence of the simplified definitions, let expressions can no longer be used to circumvent closedness. The relaxed rules around closedness and the earlier reintroduction of hidden fields make this an acceptable compromise.

Transition plan: The new definition changes the meaning of existing syntax. The plan is to first rewrite the old semantics into equivalent CUE that will give the same results with the new interpretation. This is already implemented with `cue fix` and `cue fmt`. The next step is to prohibit any usage of bulk optional fields that will mean something different with the new syntax (CL pending). Finally, the new behavior is implemented by the new evaluator.

Change-Id: I2617597ea4a973d8987239ead4e62d51002c622b
Reviewed-on: https://cue-review.googlesource.com/c/cue/+/6260
Reviewed-by: Marcel van Lohuizen <[email protected]>
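
As a rough illustration of the JSON Schema mapping described in the commit message, the sketch below shows a pattern constraint, a typed default constraint, a struct closed with `..._|_`, and the curly-brace rewrite that simulates the old semantics. It follows the spec wording introduced by this commit, not the behavior of any particular released evaluator, which may reject some of these forms (`...T` inside a struct in particular). The names `#Service`, `svc`, `#Sealed`, and `x` are hypothetical, and writing the pattern as `=~"^x-"` is an assumption about how a regular-expression pattern would be spelled.

```cue
// Hypothetical schema: one named field, one pattern constraint, and a
// typed default constraint for everything else.
#Service: {
	name: string        // regular field (JSON Schema "properties")
	[=~"^x-"]: string   // pattern constraint ("patternProperties"); =~ is an assumption
	...int              // default constraint ("additionalProperties")
}

// Hypothetical instance: each added field is checked against the first
// constraint that applies to it.
svc: #Service & {
	name:     "frontend"
	"x-team": "infra"   // label matches ^x-, so it must be a string
	replicas: 3         // matches no pattern, so the default constraint requires an int
}

// Closedness expressed in CUE itself: any field other than `name` unifies
// with bottom and is therefore an error.
#Sealed: {
	name: string
	..._|_
}

// Simulating the old semantics: embedding the pattern constraint in its own
// struct makes it apply to a and b as well.
x: {
	a: 2
	b: 3
	{[string]: int}
}
```

Under the wording in this commit, a pattern or default constraint applies only to fields that are not declared in the same struct, which is why the last example embeds the constraint rather than listing it next to `a` and `b`.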
1 parent a20e1d4 commit 21ca371

1 file changed, 75 additions and 71 deletions

doc/ref/spec.md

Lines changed: 75 additions & 71 deletions
@@ -1018,7 +1018,9 @@ It could be a role of vet checkers to identify such cases (and suggest users
 to explicitly use `_|_` to discard a field, for instance).
 -->
 
-Syntactically, a struct literal may contain multiple fields with
+Syntactically, a field is marked as optional by following its label with a `?`.
+The question mark is not part of the field name.
+A struct literal may contain multiple fields with
 the same label, the result of which is a single field with the same properties
 as defined as the unification of two fields resulting from unifying two structs.
 
@@ -1038,29 +1040,59 @@ Expression Result (without optional fields)
 {a: 1} & {a: 2} _|_
 ```
 
-Optional labels are defined in sets with an expression to select all
-labels to which to apply a given constraint.
-Syntactically, the label of an optional field set is an expression in square
-brackets indicating the matching labels.
-The value `string` matches all fields, while a concrete string matches a
-single field.
-As the latter case is common, a concrete label followed by
-a question mark `?` may be used as a shorthand.
-So
-```
-foo?: bar
-```
-is a shorthand for
-```
-["foo"]: bar
-```
-The question mark is not part of the field name.
-The token `...` may be used as the last declaration in a struct
-and is a shorthand for
+A struct may define constraints that apply to fields that are added when unified
+with another struct using pattern or default constraints.
+
+A _pattern constraint_, denoted `[pattern]: value`, defines a pattern, which
+is a value of type string, and a value to unify with fields whose label
+match that pattern.
+When unifying structs `a` and `b`,
+a pattern constraint `[p]: v` declared in `a`
+defines that the value `v` should unify with any field in the resulting struct `c`
+whose label unifies with pattern `p` and for which there exists no
+field in `a` with the same label.
+
+Additionally, a _default constraint_, denoted `...value`, defines a value
+to unify with any field for which there is no other declaration in a struct.
+When unifying structs `a` and `b`,
+a default constraint `...v` declared in `a`
+defines that the value `v` should unify with any field in the resulting struct `c`
+whose label does not unify with any of the patterns of the pattern
+constraints defined for `a` _and_ for which there exists no field in `a`
+with that label.
+The token `...` is a shorthand for `..._`.
+
+
 ```
-[_]: _
+a: {
+    foo: string // foo is a string
+    ["^i"]: int // all other fields starting with i are integers
+    ["^b"]: bool // all other fields starting with b are booleans
+    ...string // all other fields must be a string
+}
+
+b: a & {
+    i3: 3
+    bar: true
+    other: "a string"
+}
 ```
 
+<!-- NOTE: pattern and default constraints can be made to apply to all
+fields by embedding them as a struct:
+    x: {
+        a: 2
+        b: 3
+        {[string]: int}
+    }
+or by writing
+    x: [string]: int
+    x: {
+        a: 2
+        b: 3
+    }
+-->
+
 Concrete field labels may be an identifier or string, the latter of which may be
 interpolated.
 Fields with identifier labels can be referred to within the scope they are
@@ -1117,8 +1149,9 @@ future extensions and relaxations:
 -->
 
 ```
-StructLit = "{" { Declaration "," } [ "..." ] "}" .
-Declaration = Field | Embedding | LetClause | attribute .
+StructLit = "{" { Declaration "," } "}" .
+Declaration = Field | Ellipsis | Embedding | LetClause | attribute .
+Ellipsis = "..." [ Expression ] .
 Embedding = Comprehension | AliasExpr .
 Field = Label ":" { Label ":" } Expression { attribute } .
 Label = [ identifier "=" ] LabelExpr .
@@ -1202,12 +1235,15 @@ A1: A & {
 ```
 
 A _closed struct_ `c` is a struct whose instances may not have regular fields
-not defined in `c`.
-Closing a struct is equivalent to adding an optional field with value `_|_`
-for all undefined fields.
+with a name that does not match the name of a regular or optional field
+or the pattern of a pattern constraint defined in `c`.
+A struct that is the result of unifying any struct with a [`...`](#Structs)
+declaration is defined for all fields.
+Recursively closing a struct is equivalent to adding `..._|_` to its its root
+and any of its substructures that are not defined for all fields.
 
-Syntactically, closed structs can be explicitly created with the `close` builtin
-or implicitly by [definitions](#Definitions).
+Syntactically, structs are recursively closed explicitly with
+the `close` builtin or implicitly by [definitions](#Definitions).
 
 
 ```
@@ -1247,6 +1283,8 @@ D: close({
 <!-- (jba) Somewhere it should be said that optional fields are only
 interesting inside closed structs. -->
 
+<!-- TODO: move embedding section to above the previous one -->
+
 #### Embedding
 
 A struct may contain an _embedded value_, an operand used
@@ -1261,8 +1299,7 @@ In this case, a CUE program will evaluate to the embedded value
 and the CUE program may not have top-level regular or optional
 fields (definitions and aliases are allowed).
 
-Syntactically, embeddings may be any expression, except that `<`
-is eagerly interpreted as a bind label.
+Syntactically, embeddings may be any expression.
 
 ```
 S1: {
@@ -1300,46 +1337,14 @@ A field is a _definition_ if its identifier starts with `#` or `_#`.
 A field is _hidden_ if its starts with a `_`.
 Definitions and hidden fields are not emitted when converting a CUE program
 to data and are never required to be concrete.
-For definitions
-literal structs that are part of a definition's value are implicitly closed,
-but may unify unrestricted with other structs within the field's declaration.
-This excludes literals structs in embeddings and aliases.
-
-<!--
-This may be a more intuitive definition:
-Literal structs that are part of a definition's value are implicitly closed.
-Implicitly closed literal structs that are unified within
-a single field declaration are considered to be a single literal struct.
-However, this would make unification non-commutative, unless one imposes an
-ordering where literal structs are unified before unifying them with others.
-Imposing such an ordering is complex and error prone.
--->
-An ellipsis `...` in such literal structs keeps them open,
-as it defines `_` for all labels.
 
-<!--
-Excluding embeddings from recursive closing allows comprehensions to be
-interpreted as embeddings without some exception. For instance,
-    if x > 2 {
-        foo: string
-    }
-should not cause any failure. It is also consistent with embeddings being
-opened when included in a closed struct.
+Referencing a definition will implicitely [close](#ClosedStructs) it.
+A struct that embeds a referenced definition will itself be closed
+after first allowing any other fields or embedded structs to unify.
+The result of `{ #A }` is `#A` for any `#A`.
 
-Finally, excluding embeddings from recursive closing allows for
-a mechanism to not recursively close, without needing an additional language
-construct, such as a triple colon or something else:
-#foo: {
-    {
-        // not recursively closed
-    }
-    ... // include this to not close outer struct
-}
-
-Including aliases from this exclusion, which are more a separate definition
-than embedding seems sensible, and allows for an easy mechanism to avoid
-closing, aside from embedding.
--->
+If referencing a definition would always result in an error, implementations
+may report this inconsistency at the point of its declaration.
 
 ```
 #MyStruct: {
@@ -1570,7 +1575,7 @@ The length of an open list is the its number of elements as a lower bound
 and an unlimited number of elements as its upper bound.
 
 ```
-ListLit = "[" [ ElementList [ "," [ "..." [ Expression ] ] ] [ "," ] "]" .
+ListLit = "[" [ ElementList [ "," [ Ellipsis ] ] [ "," ] "]" .
 ElementList = Embedding { "," Embedding } .
 ```
 
@@ -2861,9 +2866,8 @@ If the result of the unification of all embedded values is not a struct,
 it will be output instead of its enclosing file when exporting CUE
 to a data format
 
-<!-- TODO: allow ... anywhere in SourceFile and struct. -->
 ```
-SourceFile = [ PackageClause "," ] { ImportDecl "," } { Declaration "," } [ "..." ] .
+SourceFile = [ PackageClause "," ] { ImportDecl "," } { Declaration "," } .
 ```
 
 ```
