-
Notifications
You must be signed in to change notification settings - Fork 22
Description
Proposal to add the concept of 'isolated' data to Nim
This is the evolution of owned ref. A new name, Isolated was chosen
in order to avoid confusions and also because it can be done almost entirely
as a library without bloating Nim's core type system further.
Motivation
We want to be able to pass subgraphs to threads, safely and easily. All
data races should be prevented at compile-time. We are not willing to pay
the price of atomic reference counting in order to do so. We also seek to
avoid Rust's and C++'s many different pointer types. Nim uses ref everywhere
and ref should remain the default pointer type to use. It is safe, efficient
and Nim's optimizers understand it well.
Yet, ref has no concept of unique ownership which is required for effective
message passing without copies. Hence we wrap it inside an Isolated[T]:
type
Isolated*[T] = object ## Isolated data can only be moved, not copied.
value: T
proc `=`*[T](dest: var Isolated[T]; src: Isolated[T]) {.error.}
proc `=sink`*[T](dest: var Isolated[T]; src: Isolated[T]) {.inline.} =
# delegate to value's sink operation
`=sink`(dest.value, src.value)
proc `=destroy`*[T](dest: var Isolated[T]) {.inline.} =
# delegate to value's destroy operation
`=destroy`(dest.value)
Isolated[T] is what a channel should use, comparable to
Rust's "sendable" trait:
proc send*[T](c: var Channel[T]; msg: sink Isolated[T])
proc recv*[T](c: var Channel[T]): T
## Note: Returns T, not Isolated[T] for convenience.
proc recvIso*[T](c: var Channel[T]): Isolated[T]
## remembers the data is Isolated[T].
How to construct isolated subgraphs
Construction must ensure that the invariant holds, namely that the wrapped T
is free of external aliases into it. To ensure it, we propose a Pony-inspired
recover construct, but named isolate for clarity:
func isolate(x: sink T): Isolated[T] {.magic: "Isolate".}
As you can see, this is a new builtin because the check it performs on x is non-trivial:
If T does not contain a ref or closure type, it is isolated. Else the syntactic structure of x is analyzed:
- Atoms like
nil,4,"abc"are isolated. - An array constructor
[x...]is isolated if every elementxis isolated. - An object constructor
Obj(x...)is isolated if every elementxis isolated. - An
iforcaseexpression is isolated if all possible values the expression
may return are isolated. - A type conversion
C(x)is isolated ifxis isolated. Analogous forcast
expressions. - A field access
x.fieldis isolated ifxis isolated. - An array/seq access
x[i]is not isolated ifxis isolated as otherwise
code likea.send x[i]; b.send x[i](send to two different channels) might
compile. - A function call
f(x...)is isolated iffis.noSideEffectand for every argumentx:xis isolated orf's return type cannot aliasx's type. This is checked via a form of alias analysis as explained in the next paragraph.
Note: Previously the spec said that f must be .gcsafe, this is not sufficient, we cannot guarantee isolation for a .threadvar location.
Alias analysis
We start with an important, simple case that must be valid: Sending the result
of parseJson to a channel. Since the signature
is func parseJson(input: string): JsonNode it is easy to see that JsonNode
can never simply be a view into input which is a string.
A different case is the identity function id, send id(myJsonGraph) must be
invalid because we do not know how many aliases into myJsonGraph exist
elsewhere.
In general type A can alias type T if:
AandTare the same types.Ais a distinct type derived fromT.Ais a field insideTifTis a final object type.Tis an inheritable object type. (An inherited type could always contain
afield: A).Tis a closure type. Reason:T's environment can contain a field of
typeA.Ais the element type ofTifTis an array, sequence or pointer type.
These rules ensure the freedom of potential data races but they can be quite
limiting. It remains to be seen if they suffice in practice. Pony's recover
is actually not a function, but a block of code. It is unclear at this point
if that really adds expressivity or not over this proposed builtin function.
Sugar
We expect the pattern send(isolate(f())) to be very common so we add a
template overload to the channel:
template send*[T](c: var Channel[T]; msg: sink T) =
c.send(isolate(msg))
Example: send JsonNodes to a worker thread
var ch: Channel[JsonNode]
open(ch)
proc worker {.thread.} =
while true:
let t = ch.recv()
if t == nil: break
download(t["url"])
var thr: Thread[void]
createThread(thr, worker)
proc prepareTasks(fileWithUrls: string): seq[Isolated[JsonNode]] =
result = @[]
for line in lines(fileWithUrls):
result.add isolate(parseJson(line))
proc spawnCrawlers =
var tasks = prepareTasks("todo_urls.txt")
for t in mitems tasks:
ch.send move tNext steps
There is now an implementation available in std/isolation. The channel type should use Isolated[T] for ARC/ORC and we need to see how it works.
Future directions
Isolated graphs can also be checked for isolation at runtime via something like func assertIsolated[T](graph: sink T): Isolated[T]. This would use ORC's mechanism for graph traversal and the involved reference counts to determine that no external pointers into graph exist. Since this is a runtime check and not a compile-time guarantee, we will first do without such a mechanism.