Lattica HEAL Documentation

Memory and Execution Model


Last updated 9 days ago

Overview

HEAL operates in a heterogeneous execution environment, where computations are distributed between two types of processors:

| Component | Role |
| --- | --- |
| Host (CPU) | Acts as the control unit and manages data flow. |
| Device (Accelerator / FHE Hardware) | Executes computationally expensive homomorphic operations on tensors. It processes data according to instructions received from the host. |


Execution Flow

HEAL manages data across host and device memory with a three-phase execution model: upload, work, and download.

1. Upload – Transfer input to device

The host prepares input data in tensor form and transfers it to the device memory using:

host_to_device<T>(host_tensor);

  • host_tensor is a tensor in CPU memory.

  • The function returns a DeviceTensor<T>, a pointer-based wrapper referencing allocated memory on the device.

No computation is done yet; the tensor is simply staged for processing.
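The upload step can be illustrated with a small host-only simulation. This is a minimal sketch, not HEAL's implementation: the `mock_heal` namespace, the `HostTensor` alias, the layout of `DeviceTensor`, and the `upload_example` helper are all assumptions made for illustration, and "device memory" is simply a second heap buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Mock stand-ins only: nothing in this namespace is HEAL's real code.
namespace mock_heal {

// Hypothetical host tensor: a flat buffer in CPU memory.
template <typename T>
using HostTensor = std::vector<T>;

// Hypothetical device tensor: a pointer-based handle to a separate
// buffer that plays the role of device memory in this simulation.
template <typename T>
struct DeviceTensor {
    std::shared_ptr<std::vector<T>> buffer;
};

// Upload: allocate "device" memory and copy the host data into it.
// No computation is performed.
template <typename T>
DeviceTensor<T> host_to_device(const HostTensor<T>& host_tensor) {
    return DeviceTensor<T>{std::make_shared<std::vector<T>>(host_tensor)};
}

}  // namespace mock_heal

// Stages {3, 5, 7}, then overwrites the host copy to show that the
// staged data lives in its own buffer. Returns the staged first element.
inline std::int64_t upload_example() {
    mock_heal::HostTensor<std::int64_t> host_tensor{3, 5, 7};
    auto device_tensor = mock_heal::host_to_device<std::int64_t>(host_tensor);
    host_tensor[0] = 99;  // does not affect the staged copy
    assert(device_tensor.buffer->size() == 3u);
    return (*device_tensor.buffer)[0];
}
```

In this mock, the returned handle owns a copy of the data, so later changes to the host tensor do not reach the staged buffer.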

2. Work – Run computations on the device

The host invokes computational functions on the device, which operate entirely within device memory.

Example functions include:

modmul_ttc<T>(a, b, p_scalar, result);
ntt<T>(input, p, twiddles, result);
expand<T>(a, axis, repeats, result);
  • Each function receives pointers to its input and output tensors.

  • The hardware implementation processes these inputs and writes results directly to the output tensor memory location provided by HEAL.

If additional memory is required to store the results, the host is responsible for allocating this memory and providing the correct pointer before calling device functions.

✓ Intermediate results remain on the device.
✘ They are not automatically visible to the host after each function call.
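The caller-allocated output convention can be sketched with a host-only mock. Everything here is an assumption for illustration: the `mock_heal` namespace, the `make_device_tensor` helper, and the reading of `modmul_ttc` as an elementwise `(a * b) mod p` are not taken from the HEAL specification. The sketch chains two calls so the intermediate tensor is consumed without ever being copied back to the host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Mock stand-ins only: nothing in this namespace is HEAL's real code.
namespace mock_heal {

template <typename T>
struct DeviceTensor {
    std::shared_ptr<std::vector<T>> buffer;
};

// Hypothetical helper: build a "device" tensor holding the given values.
template <typename T>
DeviceTensor<T> make_device_tensor(std::vector<T> values) {
    return DeviceTensor<T>{std::make_shared<std::vector<T>>(std::move(values))};
}

// Mock elementwise modular multiply: result[i] = (a[i] * b[i]) mod p_scalar.
// It writes into caller-provided memory and allocates nothing itself.
// This sketch does not guard against overflow of the product.
template <typename T>
void modmul_ttc(const DeviceTensor<T>& a, const DeviceTensor<T>& b,
                T p_scalar, DeviceTensor<T>& result) {
    const std::size_t n = a.buffer->size();
    assert(b.buffer->size() == n && result.buffer->size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        (*result.buffer)[i] = ((*a.buffer)[i] * (*b.buffer)[i]) % p_scalar;
    }
}

}  // namespace mock_heal

// Chains two operations; `tmp` is an intermediate that stays "on device".
inline mock_heal::DeviceTensor<std::int64_t> work_example() {
    using T = std::int64_t;
    const T p = 11;
    auto a = mock_heal::make_device_tensor<T>({3, 5, 7});
    auto b = mock_heal::make_device_tensor<T>({4, 6, 8});
    // The caller allocates both output tensors before any device call.
    auto tmp = mock_heal::make_device_tensor<T>({0, 0, 0});
    auto out = mock_heal::make_device_tensor<T>({0, 0, 0});
    mock_heal::modmul_ttc<T>(a, b, p, tmp);    // tmp = a * b mod 11
    mock_heal::modmul_ttc<T>(tmp, a, p, out);  // out = tmp * a mod 11
    return out;
}
```

The function returns a device handle rather than host data, which mirrors the rule that results stay on the device until they are explicitly downloaded.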

3. Download – Transfer results to host

Once computations are complete, the host explicitly transfers results back using:

device_to_host<T>(device_tensor);

This step copies the computed tensor back into host memory for further use, such as decryption, visualization, or exporting.
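The three phases can be put together in one host-only simulation. This is a sketch under stated assumptions, not the HEAL API: the `mock_heal` namespace, the use of `std::vector` as the host tensor, the by-value return of `device_to_host`, and the elementwise meaning given to `modmul_ttc` are all hypothetical choices made so the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Mock stand-ins only: nothing in this namespace is HEAL's real code.
namespace mock_heal {

template <typename T>
struct DeviceTensor {
    std::shared_ptr<std::vector<T>> buffer;  // simulated device memory
};

// Upload: copy host data into freshly allocated "device" memory.
template <typename T>
DeviceTensor<T> host_to_device(const std::vector<T>& host_tensor) {
    return DeviceTensor<T>{std::make_shared<std::vector<T>>(host_tensor)};
}

// Work: mock elementwise (a * b) mod p_scalar into a preallocated result.
template <typename T>
void modmul_ttc(const DeviceTensor<T>& a, const DeviceTensor<T>& b,
                T p_scalar, DeviceTensor<T>& result) {
    const std::size_t n = a.buffer->size();
    assert(b.buffer->size() == n && result.buffer->size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        (*result.buffer)[i] = ((*a.buffer)[i] * (*b.buffer)[i]) % p_scalar;
    }
}

// Download: copy the "device" buffer back into a fresh host vector.
template <typename T>
std::vector<T> device_to_host(const DeviceTensor<T>& device_tensor) {
    return *device_tensor.buffer;
}

}  // namespace mock_heal

inline std::vector<std::int64_t> round_trip_example() {
    using T = std::int64_t;
    const std::vector<T> host_a{3, 5, 7};
    const std::vector<T> host_b{4, 6, 8};

    // 1. Upload the inputs and a zero-filled buffer for the result.
    auto a = mock_heal::host_to_device<T>(host_a);
    auto b = mock_heal::host_to_device<T>(host_b);
    auto result = mock_heal::host_to_device<T>(std::vector<T>(host_a.size(), T{0}));

    // 2. Work entirely within "device" memory.
    mock_heal::modmul_ttc<T>(a, b, T{11}, result);

    // 3. Download the result explicitly.
    return mock_heal::device_to_host<T>(result);
}
```

With inputs {3, 5, 7} and {4, 6, 8} and modulus 11, the downloaded vector holds {1, 8, 1}. The host sees these values only after the explicit download call.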


Figure: HEAL memory management diagram