Memory and Execution Model
HEAL operates in a heterogeneous execution environment, where computations are distributed between two types of processors:
Host (CPU): Acts as the control unit and manages data flow.
Device (Accelerator / FHE Hardware): Executes computationally expensive homomorphic operations on tensors. It processes data according to instructions received from the host.
Below we explain the three-phase execution model used to manage data across host and device memory.
The host prepares input data in tensor form and transfers it to the device memory using:
host_tensor is a tensor in CPU memory.
The function returns a DeviceTensor<T>, a pointer-based wrapper referencing allocated memory on the device.
No computation is done yet; the tensor is simply staged for processing.
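The staging step described above can be sketched as a host-only simulation. This is a minimal illustration, not HEAL's actual API: the function name `copy_to_device`, the layout of `DeviceTensor<T>`, and the use of a heap buffer to stand in for device memory are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for DeviceTensor<T>: a pointer-based wrapper around
// a buffer that plays the role of device memory in this simulation.
template <typename T>
struct DeviceTensor {
    std::unique_ptr<T[]> data;  // simulated device allocation
    std::size_t size = 0;       // number of elements
};

// Hypothetical transfer function (name assumed for illustration).
// Allocates "device" memory and copies the host tensor into it.
// No computation is performed; the data is only staged.
template <typename T>
DeviceTensor<T> copy_to_device(const std::vector<T>& host_tensor) {
    DeviceTensor<T> device_tensor;
    device_tensor.size = host_tensor.size();
    device_tensor.data = std::make_unique<T[]>(device_tensor.size);
    std::copy(host_tensor.begin(), host_tensor.end(), device_tensor.data.get());
    return device_tensor;
}
```

In this sketch the returned wrapper owns the simulated device buffer, so the host tensor can be modified or freed afterwards without affecting the staged copy.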
The host invokes computational functions on the device, which operate entirely within device memory.
Example functions include:
Each function receives pointers to its input and output tensors.
The hardware implementation processes these inputs and writes results directly to the output tensor memory location provided by HEAL.
✓ Intermediate results remain on the device.
✘ They are not automatically visible to the host after each function call.
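The pointer-passing pattern and the on-device lifetime of intermediate results can be illustrated with a small host-only simulation. Everything here is an assumption made for the example: the kernel names `device_add` and `device_mul`, the helper `add_then_mul`, the `DeviceTensor<T>` layout, and the use of plain integer arithmetic in place of real homomorphic operations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for DeviceTensor<T>; a std::vector simulates
// device memory in this sketch.
template <typename T>
struct DeviceTensor {
    std::vector<T> storage;
    T* ptr() { return storage.data(); }
    const T* ptr() const { return storage.data(); }
    std::size_t size() const { return storage.size(); }
};

// Hypothetical device functions: each receives pointers to its inputs and
// writes its result directly to the output location it is given.
template <typename T>
void device_add(const T* a, const T* b, T* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

template <typename T>
void device_mul(const T* a, const T* b, T* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] * b[i];
}

// Computes (a + b) * c element-wise. The intermediate sum lives only in
// "device" memory (tmp) and is never handed back to the caller.
template <typename T>
DeviceTensor<T> add_then_mul(const DeviceTensor<T>& a,
                             const DeviceTensor<T>& b,
                             const DeviceTensor<T>& c) {
    assert(a.size() == b.size() && b.size() == c.size());
    const std::size_t n = a.size();
    DeviceTensor<T> tmp{std::vector<T>(n)};  // intermediate result
    DeviceTensor<T> out{std::vector<T>(n)};  // final result
    device_add(a.ptr(), b.ptr(), tmp.ptr(), n);
    device_mul(tmp.ptr(), c.ptr(), out.ptr(), n);
    return out;
}
```

Chaining the two calls through `tmp` avoids a round trip to the host between operations, which is the point of keeping intermediate results on the device.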