This epic covers developing an Arrow schema that is optimal for representing multi-dimensional rasters as tensors.
Design goals
- Records in this schema can be accessed through a zero-copy ndarray
- This may require a numpy extension
- Schema supports efficient dimension slicing by virtue of being paged and columnar
- Arrow pages can map onto bands or other "dimension-slices"
- It should be cheap to slice/select in at least one dimension of the array
- Records in this schema map well onto common image encoding formats
- pixel-interleave, band-interleave
- Some subset of the GeoTrellis Tile interface or the ND4J interface can be backed by these records
- We're willing to give up efficient random access
- We kind of get this one for "free"; it's not a big restriction
It's not clear that we can achieve all of this, so this is a wish list that can be negotiated down.
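
To make the slicing and interleave goals concrete, here is a minimal numpy-only sketch. It is not a proposed schema: the flat array merely stands in for the contiguous values buffer of one record, the `band_view` helper and the 3 x 4 x 5 shape are hypothetical, and "band-interleave" is assumed to mean band-sequential storage. It shows that both layouts can be exposed as zero-copy ndarray views, but only the band-sequential layout makes a band slice contiguous, and therefore cheap to map onto a page.

```python
import numpy as np


def band_view(flat, bands, rows, cols, band, layout):
    """Return a 2-D view of one band taken from a flat pixel buffer.

    `flat` must be a contiguous 1-D array so that reshape yields a view
    rather than a copy. `layout` is "band" (band-sequential, assumed here
    to be what band-interleave means) or "pixel" (pixel-interleaved).
    """
    if layout == "band":
        # All of band 0, then all of band 1, ...: shape (band, row, col).
        return flat.reshape(bands, rows, cols)[band]
    if layout == "pixel":
        # Each pixel stores all of its bands together: shape (row, col, band).
        return flat.reshape(rows, cols, bands)[:, :, band]
    raise ValueError(f"unknown layout: {layout}")


# Hypothetical raster: 3 bands of 4 x 5 int16 pixels.
BANDS, ROWS, COLS = 3, 4, 5
flat = np.arange(BANDS * ROWS * COLS, dtype=np.int16)

by_band = band_view(flat, BANDS, ROWS, COLS, band=1, layout="band")
by_pixel = band_view(flat, BANDS, ROWS, COLS, band=1, layout="pixel")

# Both slices are views over the same memory, so nothing was copied.
print(np.shares_memory(flat, by_band), np.shares_memory(flat, by_pixel))

# Only the band-sequential slice is one contiguous run of memory; the
# pixel-interleaved slice has to stride over the other bands.
print(by_band.flags["C_CONTIGUOUS"], by_pixel.flags["C_CONTIGUOUS"])
print(by_band.strides, by_pixel.strides)
```

The same trade-off applies to any other dimension: whichever axis is outermost in the buffer is the one that slices contiguously, which is why the list above only asks for cheap slicing in at least one dimension.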