
Tensor

Tensors are the primary data structure in HEAL. They represent multi-dimensional arrays with fixed shape and data type. Every function in the HEAL runtime operates over tensors, which are passed as pointers into device memory.

We adopt a widely accepted approach to tensor structure and semantics, consistent with NumPy and other modern ML frameworks like PyTorch.

This page introduces how tensors are structured in HEAL: their data types, shapes, strides, and broadcasting behavior.

Data Types

HEAL tensors support integer types (int32_t and int64_t).

Every tensor must have a clearly defined scalar type. The type is determined at tensor creation and cannot be changed.

Example:

std::shared_ptr<DeviceTensor<int32_t>> my_tensor;

This declares a tensor of 32-bit integers, managed on the device.
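
Both supported element types follow the same pattern. For example (a sketch reusing the DeviceTensor type shown above):

std::shared_ptr<DeviceTensor<int32_t>> ints32;  // 32-bit integer tensor
std::shared_ptr<DeviceTensor<int64_t>> ints64;  // 64-bit integer tensor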

Shape

A tensor’s shape defines the size of the tensor along each axis; the number of axes is its dimensionality (rank).

Example:

A tensor with shape [2, 3] contains 2 rows and 3 columns:

[
 [1, 2, 3],
 [4, 5, 6]
]
  • Shape never changes implicitly; it can only be altered by explicit operations (e.g. reshape).

  • Shape compatibility is enforced in all HEAL operations.
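
As a concrete illustration, a shape can be modeled as a list of axis sizes whose product is the element count. A minimal sketch (num_elements is a hypothetical helper, not a HEAL API):

#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Total number of elements implied by a shape.
int64_t num_elements(const std::vector<int64_t>& shape) {
    return std::accumulate(shape.begin(), shape.end(),
                           int64_t{1}, std::multiplies<int64_t>());
}

// num_elements({2, 3}) == 6, matching the 2 x 3 example above.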

Strides

Strides describe how tensor elements are laid out in memory. Each stride tells you how many memory steps to take to move between elements along a particular axis.

Strides are especially important when tensors are non-contiguous, such as after slicing, reshaping, or transposing.

Example:

A tensor with shape [2, 3] laid out in row-major order will have strides [3, 1]:

  • Move 3 elements in memory to go from row 0 to row 1.

  • Move 1 element to step across columns.

Strides allow for memory-efficient views and slicing without copying data.
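
To make the arithmetic concrete, here is a minimal sketch (row_major_strides and flat_offset are hypothetical helpers, not part of the HEAL API) that derives row-major strides from a shape and maps a multi-dimensional index to a flat memory offset:

#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major strides for a shape, e.g. {2, 3} -> {3, 1}.
std::vector<int64_t> row_major_strides(const std::vector<int64_t>& shape) {
    std::vector<int64_t> strides(shape.size());
    int64_t step = 1;
    for (std::size_t i = shape.size(); i-- > 0;) {
        strides[i] = step;   // innermost axis gets stride 1
        step *= shape[i];
    }
    return strides;
}

// Flat offset of the element at `indices`, given the tensor's strides.
int64_t flat_offset(const std::vector<int64_t>& indices,
                    const std::vector<int64_t>& strides) {
    int64_t offset = 0;
    for (std::size_t i = 0; i < indices.size(); ++i)
        offset += indices[i] * strides[i];
    return offset;
}

For the [2, 3] example above, element (1, 2) sits at offset 1*3 + 2*1 = 5.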

Broadcasting Semantics

Broadcasting allows element-wise operations between tensors of different shapes, as long as the shapes are broadcastable. This enables operations without explicitly reshaping tensors.

Broadcasting Rules

Two tensors are broadcastable if:

  1. Both have at least one dimension.

  2. Starting from the last (trailing) dimension and comparing backward, each pair of dimension sizes must satisfy one of the following:

    • the dimensions are equal, or

    • one of the dimensions is 1, or

    • one of the tensors has fewer dimensions, in which case it is left-padded with 1s to match the other tensor's rank.

Example:

A.shape =       [  3, 1, 5]
B.shape = [2, 1, 3, 1, 1]

Becomes (after left-padding A with two 1s):

A.shape = [1, 1, 3, 1, 5]
B.shape = [2, 1, 3, 1, 1]

Resulting broadcasted shape:

[2, 1, 3, 1, 5]

If these rules hold for every dimension, the shapes are broadcastable. The resulting shape is the maximum size in each dimension.
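
The rules above translate directly into code. Here is a minimal sketch of the shape computation (broadcast_shape is an illustrative helper, not a HEAL API) that left-pads the shorter shape with 1s and takes the per-dimension maximum:

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Broadcasted shape of a and b, or std::nullopt if not broadcastable.
std::optional<std::vector<int64_t>>
broadcast_shape(std::vector<int64_t> a, std::vector<int64_t> b) {
    // Left-pad the shorter shape with 1s to match ranks.
    while (a.size() < b.size()) a.insert(a.begin(), 1);
    while (b.size() < a.size()) b.insert(b.begin(), 1);

    std::vector<int64_t> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i] && a[i] != 1 && b[i] != 1)
            return std::nullopt;          // incompatible dimension
        out[i] = std::max(a[i], b[i]);    // the larger (non-1) size wins
    }
    return out;
}

// broadcast_shape({3, 1, 5}, {2, 1, 3, 1, 1}) -> {2, 1, 3, 1, 5}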

✓ Valid Broadcasting Example

Shape A: [5, 1, 4, 1]
Shape B:     [3, 1, 1]

Resulting shape: [5, 3, 4, 1]
Why is this a valid example?

To align shapes, we left-pad B’s shape:

Shape A: [5, 1, 4, 1]
Shape B: [1, 3, 1, 1]

Comparison (from last to first):

Dimension    A    B    Valid?
4th          1    1    ✓ equal
3rd          4    1    ✓ one is 1
2nd          1    3    ✓ one is 1
1st          5    1    ✓ one is 1

✓ All dimensions are compatible → broadcasting is allowed

✘ Invalid Broadcasting Example

Shape A: [5, 2, 4, 1]
Shape B:     [3, 1, 1]

// Error: dim mismatch (2 ≠ 3) in position 2
Why is this an invalid example?

Left-padding B’s shape:

Shape A: [5, 2, 4, 1]
Shape B: [1, 3, 1, 1]

Comparison:

Dimension    A    B    Valid?
4th          1    1    ✓ equal
3rd          4    1    ✓ one is 1
2nd          2    3    ✘ mismatch
1st          5    1    ✓ one is 1

✘ Dimension 2: 2 ≠ 3 and neither is 1 → broadcasting not allowed

Additional Note

In-place operations (where input and output refer to the same tensor) are only allowed if broadcasting does not change the input's shape.

If a tensor needs to be broadcast to a larger shape, it cannot safely be used as the output - doing so would write outside its allocated memory.

Example:

A.shape = [4, 1]
B.shape = [4, 3]
C.shape = [4, 3]

modsum_ttc(A, B, 6, A);  // ❌ Error: output A would need to grow to [4, 3]
modsum_ttc(A, B, 6, B);  // ✔ Safe: output B already has the result shape [4, 3]
modsum_ttc(A, B, 6, C);  // ✔ Safe: output C already has the result shape [4, 3]
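
A runtime can enforce this with a simple guard: a tensor is a valid output only if its shape already equals the broadcasted result shape. A minimal sketch (safe_as_output is a hypothetical helper, reusing broadcast_shape from the earlier sketch):

// True only if out_shape matches the broadcasted shape of the inputs.
bool safe_as_output(const std::vector<int64_t>& out_shape,
                    const std::vector<int64_t>& a_shape,
                    const std::vector<int64_t>& b_shape) {
    auto result = broadcast_shape(a_shape, b_shape);
    return result && *result == out_shape;
}

// safe_as_output({4, 1}, {4, 1}, {4, 3}) == false  (A's case above)
// safe_as_output({4, 3}, {4, 1}, {4, 3}) == true   (B's and C's cases)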

Want to understand how tensors are transferred between host and accelerator?



See HEAL Memory and Execution Model for the full memory model and lifecycle.