Data trees

This spec clarifies the plans for the next evolution of Graphite's visual programming language as we move towards better support for pure functional programming paradigms, without the limitations imposed by today's incomplete architecture.

This spec is also related to:
- #2988
- #1832
- #1173

This is described from a user-facing perspective, from which the technical requirements arise for solving the problem space we uniquely face. A technical implementation plan that meets this spec will still need to be designed. It will supersede #2522, but take inspiration from it for the dynamic handling of attributes.

## Key ideas

In the past, Graphite passed only singleton data through the data flow, although the old `GraphicGroup` type allowed nesting. In #1834, we transitioned to the current system where graphical types use `Table<T>` so that everything works with lists of data, for all graphical types `T` such as `Vector`, `Raster`, `Color`, and `Graphic`. All other types like `String`, `f64`, `DVec2`, and `bool` still get passed around as singletons. And really, `Table<T>` is itself just a singleton that internally stores multiple child elements. This spec answers how to progress further, towards a world where every type (even `f64`, etc.) can be treated as a singleton, can become part of a list of multiple elements, and can have attributes even as a singleton.

Instead of passing a bare `f64` from a node as we do currently, we pass an object/record like `{ element: f64 }` where `element` is its primary key and thus determines its type in our type system, which would be `f64` in this case. This is not a list, just a singleton object. However, if this is passed through a node which adds an attribute, it could become `{ element: f64, enabled: bool }` by adding a string-indexed value to the object. The Data panel visualizes this as a table with the headers "element" and "enabled" and a single row with the values, but no column for the row index since this is not part of a table. If this singleton object eventually reaches a layer that combines it with other `f64`-type objects, such as a different `{ element: f64, name: String }`, they would be combined into a list (visualized in the Data panel as a table with columns for the row index, `element`, `enabled`, and `name`) that looks like:

```
[
	0:{ element: f64, enabled: bool, name: String }
	1:{ element: f64, enabled: bool, name: String }
]
```

Between the two merged `f64` objects, as long as their primary key (`element`) is of the same type, their other attributes do not have to match since they can be merged while filling in the missing attributes with each type's default value.
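
As a rough illustration of that merge rule, here is a minimal Rust sketch. The names `Value`, `Object`, `default_like`, and `combine` are hypothetical and not part of Graphite's codebase, and only two attribute types are modeled. The spec does not say what happens when two objects use the same attribute name with different types; this sketch simply keeps whichever value is already present.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an attribute value (only two variants for brevity).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Bool(bool),
    Text(String),
}

impl Value {
    /// The default value of the same variant, used to fill in a missing attribute.
    fn default_like(&self) -> Value {
        match self {
            Value::Bool(_) => Value::Bool(false),
            Value::Text(_) => Value::Text(String::new()),
        }
    }
}

/// A singleton object: the primary `element` plus string-indexed attributes.
#[derive(Clone, Debug, PartialEq)]
struct Object {
    element: f64,
    attributes: BTreeMap<String, Value>,
}

/// Combine singleton objects into a list whose rows all share the union of attribute columns.
fn combine(objects: Vec<Object>) -> Vec<Object> {
    // First pass: record every attribute name along with a default of its type.
    let mut columns: BTreeMap<String, Value> = BTreeMap::new();
    for object in &objects {
        for (name, value) in &object.attributes {
            columns.entry(name.clone()).or_insert_with(|| value.default_like());
        }
    }

    // Second pass: give each row any column it is missing, filled with that default.
    objects
        .into_iter()
        .map(|mut object| {
            for (name, default) in &columns {
                object.attributes.entry(name.clone()).or_insert_with(|| default.clone());
            }
            object
        })
        .collect()
}
```

Combining `{ element: 1.0, enabled: true }` with `{ element: 2.0, name: "b" }` this way yields two rows that both have `enabled` and `name` columns, with `name` defaulting to an empty string in the first row and `enabled` defaulting to `false` in the second.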

Just as `u64`, `DAffine2`, and `Table<Raster>` are currently types that can be passed through the graph directly as a singleton or as cells in a `Table<T>`, we can also think of the table shown above as a type, `f64[]`, that may flow through the graph as the cell of a table. In other words, it can be the value of an object's primary key or of an attribute. This is how, in the Data panel, we already represent child tables: as a button that drills down into the breadcrumb trail to show that child table's data. So in essence, a list is viewed by clicking the button to enter it, while other singleton types are previewed directly. An `f64` is shown as a label displaying the numerical value and an `f64[]` is shown as a button to enter its table and view all the rows of that list (table) of `f64` primary key elements, plus any other columns for attributes.

A key idea here is how we are always working with these object types, and lists (tables) are the data types held within, either as part of the primary element or additional attributes. Multiple lists (of different lengths) can occupy the different cells of an object. And fundamentally, we can think of these object types as a view into what could later on become a row in some future parent's table. A rectangle might be built up as an object, but later join other siblings by being promoted to a table row by a node that places it into a list. That list, then, would be contained within a cell of its own parent object. Importantly, it got wrapped within nested data; it became an element of a data tree.
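
One way to picture this data tree is as a recursive type, sketched below in Rust. This is an illustration only, not the planned implementation: `Cell`, `Object`, `leaf`, `parent_of`, and `nesting` are hypothetical names, and the `nesting` helper assumes a list's rows are homogeneous so it only inspects the first row.

```rust
/// Hypothetical value that can occupy any cell of an object.
#[derive(Clone, Debug, PartialEq)]
enum Cell {
    Number(f64),
    Text(String),
    /// A list (table) whose rows are themselves objects.
    List(Vec<Object>),
}

/// An object: a primary `element` cell plus named attribute cells.
#[derive(Clone, Debug, PartialEq)]
struct Object {
    element: Cell,
    attributes: Vec<(String, Cell)>,
}

impl Object {
    /// A singleton object holding a plain number and no attributes.
    fn leaf(value: f64) -> Object {
        Object { element: Cell::Number(value), attributes: Vec::new() }
    }

    /// Promote sibling objects to rows of a list held in a new parent's primary element.
    fn parent_of(rows: Vec<Object>) -> Object {
        Object { element: Cell::List(rows), attributes: Vec::new() }
    }

    /// Levels of list nesting along the primary element: 0 for `f64`, 1 for `f64[]`, and so on.
    /// Assumes homogeneous rows, so only the first row is inspected; an empty list counts as one level.
    fn nesting(&self) -> usize {
        match &self.element {
            Cell::List(rows) => 1 + rows.first().map_or(0, Object::nesting),
            _ => 0,
        }
    }
}
```

A leaf built with `Object::leaf` has a nesting of 0. Wrapping two leaves with `Object::parent_of` gives a nesting of 1, and that parent can additionally carry an attribute cell holding a list of a different length without changing its type.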

A node such as Repeat will produce multiple copies of some data, perhaps 3 copies. Given `{ element: Color, transform: DAffine2 }`, the Repeat node would produce `{ element: Color[] }` and that list of colors (with type `Color[]`) would look like:
```
[
	0:{ element: Color, transform: DAffine2 }
	1:{ element: Color, transform: DAffine2 }
	2:{ element: Color, transform: DAffine2 }
]
```
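
The shape of that transformation can be sketched in Rust as follows. The `Row`, `ListObject`, and `repeat` names are hypothetical, a `[f64; 6]` array stands in for `DAffine2`, and the copies are cloned verbatim here whereas a real Repeat node would presumably vary each copy's transform.

```rust
/// A row that carries its primary element and a `transform` attribute.
/// `[f64; 6]` is a stand-in for `DAffine2` in this sketch.
#[derive(Clone, Debug, PartialEq)]
struct Row<T> {
    element: T,
    transform: [f64; 6],
}

/// The object produced by Repeat: its primary element is the list of rows.
/// It starts out with no attributes of its own.
#[derive(Clone, Debug, PartialEq)]
struct ListObject<T> {
    element: Vec<Row<T>>,
}

/// Produce `count` copies of the input; each copy keeps the attributes the input had.
fn repeat<T: Clone>(input: &Row<T>, count: usize) -> ListObject<T> {
    ListObject { element: vec![input.clone(); count] }
}
```

Note that the attributes travel with the rows: the outer object is a fresh `{ element: Color[] }` that later nodes can attach their own attributes to.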

Often, users will be working with single objects like vector shapes and images that are not specifically repeated. Each of these leaf layers will usually be of type `Vector` (`{ element: Vector, ... }`) or `Raster` (`{ element: Raster, ... }`) since they are not nested. These would be drawn with solid wires in the graph, with the usual data type colors (blue for `Vector`, orange for `Raster`, etc.).

When a single object becomes part of a list, such as in `{ element: Vector[], ... }` which has the type `Vector[]`, its data flow would be drawn with blue dashed wires, keeping the same data type color. Users can easily see whether data flow is a singular or plural type based on whether it's dashed. Likewise, running the `Vector[]` through another Repeat node would yield type `Vector[][]` which would still be drawn with the blue dashed wire. As such, we successfully separate the concept of nestedness from data type, which was not possible with previous designs. With this, we can represent highly nested data while preserving its purity, avoiding the dreaded entropy of becoming the heterogeneous `Graphic` type unless mixing with other types is intended. This also fully generalizes to both graphical and non-graphical types and allows arbitrary data to be represented, such as a simple list of strings, PCM audio samples, JSON, or a full imported CSV file or Excel spreadsheet.
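
The separation of nestedness from data type can be made concrete with a small Rust sketch. All names here (`BaseType`, `WireStyle`, `DataType`, `repeated`) are hypothetical, and only two base types are modeled: the wire color depends solely on the base type, while the solid or dashed style depends solely on the nesting depth.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum BaseType {
    Vector,
    Raster,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum WireStyle {
    Solid,
    Dashed,
}

/// A type is a base type plus how many levels of list wrap it.
#[derive(Clone, Copy, Debug, PartialEq)]
struct DataType {
    base: BaseType,
    nesting: usize,
}

impl DataType {
    /// The color depends only on the base type.
    fn color(&self) -> &'static str {
        match self.base {
            BaseType::Vector => "blue",
            BaseType::Raster => "orange",
        }
    }

    /// The style depends only on whether the data is a list at all.
    fn style(&self) -> WireStyle {
        if self.nesting == 0 { WireStyle::Solid } else { WireStyle::Dashed }
    }

    /// The type after passing through a list-producing node such as Repeat.
    fn repeated(self) -> DataType {
        DataType { nesting: self.nesting + 1, ..self }
    }

    /// Display name such as `Vector`, `Vector[]`, or `Vector[][]`.
    fn name(&self) -> String {
        format!("{:?}{}", self.base, "[]".repeat(self.nesting))
    }
}
```

Calling `repeated()` twice on a `Vector` type gives a name of `Vector[][]` with a dashed style and the same blue color as the original.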

The Merge node is currently only compatible with `Table<T>` graphical data types and it introduces a mandatory conversion to `Table<Graphic>` which introduces both nesting and a loss of data type purity. In this new architecture, the Merge node will essentially become an equivalent to `Vec::extend()` or `Iter::chain()`, making it compatible with all types. We need to decide between two possible ways it may handle nesting:
1. It might always introduce a level of nesting, turning `Vector[]` and `Vector[]` inputs into `Vector[][]`.
2. Or it might join them flatly by just concatenating the two input lists in the primary key (`element` column), keeping the type `Vector[]`, and discarding the other attributes belonging to one of the two inputs. If non-lists are given, the two singleton objects would have to be put into a list together, acting as a "base case" where `Vector` and `Vector` inputs become `Vector[]`. This would be handled by the automatic type coercion system. A drawback is that the bottom Merge layer in a stack would receive the left input of `Vector` (solid wire) and output `Vector` (solid wire), but upon entering the bottom of the next layer up, it would then be coerced into `Vector[]` and use a dashed wire the rest of the way up the stack. Overall, though, this approach is the better default because it avoids potentially annoying nesting when it's usually not wanted.
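
The two options can be compared side by side in a short Rust sketch. The `List` type and the three function names are hypothetical, and attributes are reduced to a list of names. Since the spec leaves open which input's attributes survive a flat merge and in what order the rows are joined, this sketch arbitrarily keeps the left input's attributes and places its rows first.

```rust
/// A list object: its rows, plus the names of attributes set on the list itself (e.g. "stroke").
#[derive(Clone, Debug, PartialEq)]
struct List<T> {
    rows: Vec<T>,
    attributes: Vec<String>,
}

/// Option 1: always nest. Both inputs become rows of a new outer list and keep their own attributes.
fn merge_nested<T>(left: List<T>, bottom: List<T>) -> List<List<T>> {
    List { rows: vec![left, bottom], attributes: Vec::new() }
}

/// Option 2: join flatly with `Vec::extend()`. The type is unchanged, but only one
/// input's attributes survive (the left input's, as an assumption of this sketch).
fn merge_flat<T>(mut left: List<T>, bottom: List<T>) -> List<T> {
    left.rows.extend(bottom.rows);
    left
}

/// Base case for option 2: two singletons are placed into a new list together.
fn merge_singletons<T>(left: T, bottom: T) -> List<T> {
    List { rows: vec![left, bottom], attributes: Vec::new() }
}
```

With a two-row input carrying `stroke` and a one-row input carrying `alpha_blending`, `merge_flat` returns three rows and only `stroke`, while `merge_nested` returns two rows that each retain their original attributes.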

This architecture solves a critical limitation of the current implementation, which is that nodes like Blending (which set the opacity, etc.) are forced to map over every element in the `Table<T>`, giving each an opacity, even though this is incorrect. The correct result would be to affect the opacity of the parent, which this architecture allows by setting it once on the object (which would become a parent layer upon its future insertion as a row into a parent table) instead of repeatedly on each child. If done before the Repeat node, it affects each `Vector` child within the `element` list. If done after the Repeat node, it affects just the one `Vector[]` object outside the `element` list. So based purely on the order of the nodes, and whether they sit on the solid wire or dashed wire portion, it becomes possible to apply data modification operations at different levels of nesting in a relatively intuitive, sequential order of operations. To demonstrate this, here is an example data flow:

<img width="1259" height="158" alt="Image" src="https://github.com/user-attachments/assets/c7a13e5e-6281-436b-a6f5-28683d5b8a5e" />

<details>
<summary><h3>Node data flow descriptions: click to expand</h3></summary>

```
Rectangle:
{ element: Vector } is type Vector which has a solid blue wire

Transform:
{ element: Vector, transform: DAffine2 } is type Vector which has a solid blue wire

Fill:
{ element: Vector, transform: DAffine2, fill: Fill } is type Vector which has a solid blue wire

Repeat:
{ element: Vector[] } is type Vector[] which has a dashed blue wire
			↓
			[
				0:{ element: Vector, transform: DAffine2, fill: Fill }
				1:{ element: Vector, transform: DAffine2, fill: Fill }
				2:{ element: Vector, transform: DAffine2, fill: Fill }
				3:{ element: Vector, transform: DAffine2, fill: Fill }
				4:{ element: Vector, transform: DAffine2, fill: Fill }
			]

Stroke:
{ element: Vector[], stroke: Stroke } is type Vector[] which has a dashed blue wire
			↓
			[
				0:{ element: Vector, transform: DAffine2, fill: Fill }
				1:{ element: Vector, transform: DAffine2, fill: Fill }
				2:{ element: Vector, transform: DAffine2, fill: Fill }
				3:{ element: Vector, transform: DAffine2, fill: Fill }
				4:{ element: Vector, transform: DAffine2, fill: Fill }
			]

Blending:
{ element: Vector[], stroke: Stroke, alpha_blending: Blending } is type Vector[] which has a dashed blue wire
			↓
			[
				0:{ element: Vector, transform: DAffine2, fill: Fill }
				1:{ element: Vector, transform: DAffine2, fill: Fill }
				2:{ element: Vector, transform: DAffine2, fill: Fill }
				3:{ element: Vector, transform: DAffine2, fill: Fill }
				4:{ element: Vector, transform: DAffine2, fill: Fill }
			]

ALTERNATIVE 1:
Merge (with left and bottom inputs given the same value from the Blending node):
{ element: Vector[], stroke: Stroke, alpha_blending: Blending } is type Vector[] which has a dashed blue wire
			↓
			[
				0:{ element: Vector, transform: DAffine2, fill: Fill }
				1:{ element: Vector, transform: DAffine2, fill: Fill }
				2:{ element: Vector, transform: DAffine2, fill: Fill }
				3:{ element: Vector, transform: DAffine2, fill: Fill }
				4:{ element: Vector, transform: DAffine2, fill: Fill }
				5:{ element: Vector, transform: DAffine2, fill: Fill }
				6:{ element: Vector, transform: DAffine2, fill: Fill }
				7:{ element: Vector, transform: DAffine2, fill: Fill }
				8:{ element: Vector, transform: DAffine2, fill: Fill }
				9:{ element: Vector, transform: DAffine2, fill: Fill }
			]

ALTERNATIVE 2:
Merge (with left and bottom inputs given the same value from the Blending node):
{ element: Vector[][] } is type Vector[][] which has a dashed blue wire
			↓
			[
				0:{ element: Vector[], stroke: Stroke, alpha_blending: Blending }
							↓
							[
								0:{ element: Vector, transform: DAffine2, fill: Fill }
								1:{ element: Vector, transform: DAffine2, fill: Fill }
								2:{ element: Vector, transform: DAffine2, fill: Fill }
								3:{ element: Vector, transform: DAffine2, fill: Fill }
								4:{ element: Vector, transform: DAffine2, fill: Fill }
							]
				1:{ element: Vector[], stroke: Stroke, alpha_blending: Blending }
							↓
							[
								0:{ element: Vector, transform: DAffine2, fill: Fill }
								1:{ element: Vector, transform: DAffine2, fill: Fill }
								2:{ element: Vector, transform: DAffine2, fill: Fill }
								3:{ element: Vector, transform: DAffine2, fill: Fill }
								4:{ element: Vector, transform: DAffine2, fill: Fill }
							]
			]
```
</details>

In the collapsed content directly above, "ALTERNATIVE 2" (the nesting variant) preserves the `stroke` and `alpha_blending` attributes of both merged parents, while "ALTERNATIVE 1" (the auto-flattening variant) must choose only one parent's attributes to keep. However, it is still likely preferable to use the auto-flattening variant because this avoids an explosion of nesting. Users can always add a "Wrap" node to counteract this. We can also employ some UI tricks to visualize which variant's behavior is chosen and let the user explicitly wrap or flatten any node input as visual syntax sugar for convenience, perhaps at every connector.

Here is a more complete example of some data flow involving the Merge node and multiple data types:

<img width="1410" height="797" alt="Image" src="https://github.com/user-attachments/assets/ff5167fc-1981-42fd-ba93-2f3e742aad34" />

(I am not certain about the part where we mix the types and when precisely the colors get converted between raster, vector, and graphic.)

Data trees #3779
