# Memory and Execution Model

## Overview

HEAL operates in a **heterogeneous execution environment**, where computations are distributed between two types of processors:

| Component                               | Role                                                                                                                                      |
| --------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
| **Host** (CPU)                          | Acts as the control unit and manages data flow.                                                                                           |
| **Device** (Accelerator / FHE Hardware) | Executes computationally expensive homomorphic operations on tensors. It processes data according to instructions received from the host. |

***

## Execution Flow

HEAL manages data across host and device memory with a three-phase execution model: upload, work, and download.

{% stepper %}
{% step %}

### **Upload** – *Transfer input to device*

The host prepares input data in tensor form and transfers it to device memory using:

```cpp
host_to_device<T>(host_tensor);
```

* `host_tensor` is a tensor in CPU memory.
* The function returns a `DeviceTensor<T>`, a pointer-based wrapper referencing allocated memory on the device.

No computation is done yet; the tensor is simply staged for processing.
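
The upload step can be sketched with a small host-only mock. Everything below is an assumption made for illustration: the `HostTensor` and `DeviceTensor` structs are hypothetical stand-ins rather than the HEAL types, and a separately allocated heap buffer plays the role of device memory. The sketch only shows the shape of the operation: allocate on the "device", copy the elements, and hand back a wrapper that refers to that allocation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a tensor held in ordinary CPU memory.
template <typename T>
struct HostTensor {
    std::vector<T> data;
};

// Hypothetical stand-in for a device tensor: a pointer-based wrapper
// around a buffer that simulates memory allocated on the device.
template <typename T>
struct DeviceTensor {
    std::unique_ptr<T[]> buffer;
    std::size_t size = 0;
};

// Mock upload (not the HEAL implementation): allocates a separate buffer
// and copies the host elements into it. No computation is performed.
template <typename T>
DeviceTensor<T> host_to_device(const HostTensor<T>& host) {
    DeviceTensor<T> dev;
    dev.size = host.data.size();
    dev.buffer = std::make_unique<T[]>(dev.size);
    assert(dev.buffer != nullptr);
    std::copy(host.data.begin(), host.data.end(), dev.buffer.get());
    return dev;
}
```

In this mock, the returned wrapper owns a different allocation from the host vector, so later changes to the host tensor would not affect the staged copy.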
{% endstep %}

{% step %}

### **Work** – *Run computations on the device*

The host invokes computational functions on the device, which operate entirely within device memory.

Example functions include:

```cpp
modmul_ttc<T>(a, b, p_scalar, result);
ntt<T>(input, p, twiddles, result);
expand<T>(a, axis, repeats, result);
```

* Each function receives pointers to its input and output tensors.
* The hardware implementation processes the inputs and writes the results directly to the output tensor's memory location, which HEAL provides.

{% hint style="info" %}
If additional memory is required to store the results, the host is responsible for allocating this memory and providing the correct pointer before calling device functions.
{% endhint %}

✓  Intermediate results remain on the device.\
✘  They are **not automatically visible** to the host after each function.
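
The output-parameter calling convention described above can be illustrated with a host-only mock. The sketch assumes that `modmul_ttc` computes an element-wise `(a * b) mod p` over two tensors and a scalar modulus; that reading of the name, the `DeviceTensor` struct, and the `make_device_tensor` helper are all assumptions for illustration, not the HEAL API. The point is that the caller allocates the result tensor first and the function only writes into it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in: a vector simulates a block of device memory.
template <typename T>
struct DeviceTensor {
    std::vector<T> storage;
};

// Hypothetical helper: the host pre-allocates space for n result elements.
template <typename T>
DeviceTensor<T> make_device_tensor(std::size_t n) {
    return DeviceTensor<T>{std::vector<T>(n)};
}

// Mock kernel (assumed semantics): result[i] = (a[i] * b[i]) mod p.
// It receives pointers to inputs and output and never allocates memory.
template <typename T>
void modmul_ttc(const DeviceTensor<T>* a, const DeviceTensor<T>* b,
                T p, DeviceTensor<T>* result) {
    static_assert(std::is_unsigned<T>::value && sizeof(T) <= 4,
                  "this mock widens to 64 bits, so T must be unsigned and <= 32 bits");
    assert(a != nullptr && b != nullptr && result != nullptr);
    assert(p > 0);
    assert(a->storage.size() == b->storage.size());
    // The caller must have sized the output before the call.
    assert(result->storage.size() == a->storage.size());
    for (std::size_t i = 0; i < a->storage.size(); ++i) {
        // Widen before multiplying so the product cannot overflow.
        const std::uint64_t prod =
            static_cast<std::uint64_t>(a->storage[i]) * b->storage[i];
        result->storage[i] = static_cast<T>(prod % p);
    }
}
```

Passing an output tensor that was not allocated with the right size trips the assertion in this mock, which mirrors the requirement that the host provides a valid result pointer before calling a device function.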
{% endstep %}

{% step %}

### **Download** – *Transfer results to host*

Once computations are complete, the host explicitly transfers results back using:

```cpp
device_to_host<T>(device_tensor);
```

This step copies the computed tensor back into host memory for further use, such as decryption, visualization, or exporting.
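
A complete upload, work, download round trip might look like the following host-only sketch. The tensor structs, the `add_one` placeholder operation, and the `round_trip` driver are hypothetical and exist only to show the order of calls; plain vectors simulate both memories, and none of this is the HEAL implementation. Note that the intermediate tensor `tmp` is never copied back: only the final result is downloaded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins: separate structs keep "host" and "device" data apart.
template <typename T>
struct HostTensor { std::vector<T> data; };

template <typename T>
struct DeviceTensor { std::vector<T> storage; };

// Mock upload: copy host elements into simulated device memory.
template <typename T>
DeviceTensor<T> host_to_device(const HostTensor<T>& h) {
    return DeviceTensor<T>{h.data};
}

// Mock download: copy simulated device memory back into a host tensor.
template <typename T>
HostTensor<T> device_to_host(const DeviceTensor<T>& d) {
    return HostTensor<T>{d.storage};
}

// Placeholder device operation standing in for a real kernel:
// writes in[i] + 1 into a pre-allocated output tensor.
template <typename T>
void add_one(const DeviceTensor<T>* in, DeviceTensor<T>* out) {
    assert(in != nullptr && out != nullptr);
    assert(out->storage.size() == in->storage.size());
    for (std::size_t i = 0; i < in->storage.size(); ++i) {
        out->storage[i] = in->storage[i] + 1;
    }
}

// Hypothetical driver showing the three phases in order.
inline std::vector<int> round_trip(const std::vector<int>& input) {
    // Upload: stage the input on the device.
    HostTensor<int> host_in{input};
    DeviceTensor<int> dev_in = host_to_device<int>(host_in);

    // Work: the host allocates outputs, then chains two device operations.
    DeviceTensor<int> tmp{std::vector<int>(input.size())};
    DeviceTensor<int> dev_out{std::vector<int>(input.size())};
    add_one<int>(&dev_in, &tmp);   // intermediate result stays on the device
    add_one<int>(&tmp, &dev_out);

    // Download: explicitly copy only the final result back to the host.
    HostTensor<int> host_out = device_to_host<int>(dev_out);
    return host_out.data;
}
```

Calling `round_trip` with a vector holding 1, 2, 3 applies the placeholder operation twice on the simulated device and returns a host vector holding 3, 4, 5.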
{% endstep %}
{% endstepper %}

<figure><img src="/files/d8aLPBa3ovZ04hXd0bQH" alt=""><figcaption><p>HEAL memory management diagram </p></figcaption></figure>

***

