Replies: 5 comments 1 reply
-
See Marc's slides presented on April 23.
-
See Philippe's response (presented on May 14) to Marc's slides.
-
Related issues:
-
I read the linked slides and want to come at this from a different angle. Phlex sets a low bar for creating new data product types, so once DUNE people start writing Phlex nodes we can expect a proliferation of data types unless we get out in front of that movement. To get ahead of it, I want to put forward the idea of a generic data model and implementation that may not even need to depend on Phlex. It is based on patterns WCT uses in its "tensor data model", which in turn is based on HDF5's model. The basic idea is to separate type and "format" from schema. The model would describe the transient representation for:
These would be represented by concrete types, and general-purpose file I/O code can be written against those types. If we are smart, we can implement these types in ways that give us a lot of support code for free, e.g., actually use JSON and actually use multiarray. What these types do not cover is the schema used to interpret their instances. To join the ideas in this thread so far, we may think of these basic types as the substrate and then express "conceptual data types" and IDL layers that overlay structure on them. That is, I am not trying to dodge the bigger problem this thread opens with, but rather to make a decision that narrows the scope of the problem and gives us a more concrete way to reason about it.
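To make the substrate/schema split a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical and only for illustration: the `Tensor` class, the `dump`/`load` helpers, the `as_waveforms` function, and the `"waveforms"` schema name are my own inventions, not WCT's or Phlex's API. It uses only the standard library (`json` for metadata and serialization, `array` as a stand-in for a real multiarray type).

```python
import json
from array import array
from dataclasses import dataclass, field


@dataclass
class Tensor:
    """Hypothetical substrate type: flat typed buffer + shape + JSON-like metadata."""
    shape: tuple
    typecode: str  # array-module type code, e.g. "d" for C double
    data: array
    metadata: dict = field(default_factory=dict)


def dump(t: Tensor) -> str:
    """Generic I/O: serialize any Tensor without knowing its schema."""
    return json.dumps({
        "shape": list(t.shape),
        "typecode": t.typecode,
        "data": t.data.tolist(),
        "metadata": t.metadata,
    })


def load(text: str) -> Tensor:
    """Generic I/O: rebuild a Tensor from the output of dump()."""
    obj = json.loads(text)
    return Tensor(tuple(obj["shape"]), obj["typecode"],
                  array(obj["typecode"], obj["data"]), obj["metadata"])


def as_waveforms(t: Tensor) -> list:
    """Schema layer: interpret a 2D Tensor as one waveform per channel."""
    if t.metadata.get("schema") != "waveforms":
        raise ValueError("tensor does not declare the 'waveforms' schema")
    nchan, nticks = t.shape
    return [list(t.data[i * nticks:(i + 1) * nticks]) for i in range(nchan)]


# The I/O round trip never looks at the schema; only as_waveforms() does.
t = Tensor((2, 3), "d", array("d", [1, 2, 3, 4, 5, 6]),
           {"schema": "waveforms", "units": "ADC"})
roundtrip = load(dump(t))
waveforms = as_waveforms(roundtrip)
```

The point of the sketch is that `dump` and `load` are written once against the concrete type, while each "conceptual data type" is a thin interpretation function layered on top. A real implementation would presumably use a binary container and a proper array library rather than JSON text for the payload.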
-
Has Apache Arrow been considered as a way to flesh out the C++/Python boundary? The work Phlex does to find
-
Among the developers, there are various ideas for how data products should be supported. These ideas are motivated by multiple and, at times, perhaps competing desires: ease of use, keeping algorithms (and possibly their data products) framework-independent, and making the most efficient use of the computing hardware. We will need to decide how to balance these desires, along with others not yet enumerated. This discussion is meant to catalog our thoughts on data-product support.