Looking at an open_zarr computation from @rabernat, I'm coming across intermediate values like the following:
>>> Future('zarr-adt-0f90b3f56f247f966e5ef01277f31374').result()
ImplicitToExplicitIndexingAdapter(array=LazilyIndexedArray(array=<xarray.backends.zarr.ZarrArrayWrapper object at 0x7fa921fec278>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))
This object has many dependents, and so will presumably have to float around the network to all of the workers:
>>> len(dep.dependents)
1781
In principle this is fine, especially if this object is cheap to serialize, move, and deserialize. It does introduce a bit of friction though. I'm curious how hard it would be to build task graphs that generated these objects on the fly, or else removed them altogether. It is slightly more convenient from a task scheduling perspective for data access tasks to not have any dependencies.
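To make the two graph layouts concrete, here is a rough sketch using plain dicts in the dask graph style (key mapped to a tuple of callable and arguments). It is only an illustration: `open_array`, `read_chunk`, `read_chunk_from_path`, the `"data.zarr"` path, and the tiny `get` evaluator are all hypothetical stand-ins, not xarray or dask APIs, and the real wrapper objects may not be this easy to rebuild inside a task.

```python
# Hypothetical stand-ins for the lazily indexed wrapper and the chunk read.
def open_array(path):
    """Pretend to build the cheap, lazy array wrapper (holds no data)."""
    return {"path": path}


def read_chunk(array, index):
    """Pretend to read one chunk through an existing wrapper."""
    return (array["path"], index)


def read_chunk_from_path(path, index):
    """Build the wrapper on the fly inside the task, then read the chunk."""
    return read_chunk(open_array(path), index)


n_chunks = 4

# Layout A: one shared wrapper task that every chunk read depends on.
shared = {"wrapper": (open_array, "data.zarr")}
shared.update({("chunk", i): (read_chunk, "wrapper", i) for i in range(n_chunks)})

# Layout B: each data access task recreates the wrapper itself,
# so it has no dependencies at all.
inlined = {
    ("chunk", i): (read_chunk_from_path, "data.zarr", i) for i in range(n_chunks)
}


def count_dependents(graph, key):
    """Count tasks that list `key` directly among their arguments."""
    return sum(1 for task in graph.values() if key in task[1:])


def get(graph, key):
    """Toy recursive evaluator (not dask's scheduler): resolve args that are keys."""
    func, *args = graph[key]
    resolved = [get(graph, arg) if arg in graph else arg for arg in args]
    return func(*resolved)


print(count_dependents(shared, "wrapper"))   # every chunk task depends on the wrapper
print(count_dependents(inlined, "wrapper"))  # no task depends on a shared wrapper
print(get(shared, ("chunk", 2)) == get(inlined, ("chunk", 2)))
```

Both layouts compute the same chunk values. In the second one there is no single key with many dependents, so nothing has to be shipped to every worker, at the cost of rebuilding the wrapper in each task.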