The research team led by Prof. Joo-Young Kim at KAIST ITC has developed Strix and Morphling, two accelerator designs for Torus Fully Homomorphic Encryption (TFHE) schemes. Both designs address the computational bottleneck in bootstrapping, using techniques such as ciphertext batching and transform-domain reuse.

Fully Homomorphic Encryption (FHE) is a technology that enables computations to be carried out on data within an encrypted domain. This is crucial for achieving end-to-end privacy, meaning user data remains encrypted at all times, except when in the user’s possession. Consequently, not even the cloud service provider can access the user’s data, yet it can still process this data and generate an encrypted result that only the user can decipher. Should the data be intercepted during processing, the user’s privacy is still safeguarded due to the encryption.

Despite its significant privacy benefits, FHE incurs high computational costs. For instance, while a baseline inference task on unencrypted data might take only a hundred milliseconds on a CPU, the same task under FHE can take as long as three hours, a slowdown of over one hundred thousand times. Many FHE schemes exist. In their research, the authors concentrated on accelerating the TFHE scheme, chosen for its flexibility in enabling encrypted computations and its relatively low computational cost compared to other schemes.

Strix is the team's first TFHE accelerator. The work identified the critical challenge of accelerating TFHE on GPUs and addressed the memory bottleneck through intelligent computation scheduling. Strix also reduces latency by processing data in a fully pipelined manner. In addition, the work presents a parallelism analysis and employs batching techniques to improve hardware utilization.
Furthermore, the team developed Morphling, which accelerates TFHE computation by introducing transform-domain reuse to minimize the domain-transform operations required in TFHE bootstrapping. By incorporating transform-domain data reuse into a 2D systolic VPE array architecture, the design reduces the number of domain-transform operations, which allows more of the hardware to be allocated to computational cores and raises system throughput.
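The general idea behind transform-domain reuse can be illustrated with a small Python sketch. It is not the Morphling datapath. It only shows, under stated assumptions, why keeping data in the transform domain saves work when several polynomial products modulo X^N + 1 are summed. The toy degree N, the twisted-FFT transform, the function names, and the way transforms are counted are all assumptions made for this illustration.

```python
import numpy as np

N = 8  # toy polynomial degree (assumption; real TFHE parameters are much larger)
_TWIST = np.exp(1j * np.pi * np.arange(N) / N)  # powers of a 2N-th root of unity


def to_transform_domain(poly):
    """Forward transform for multiplication modulo X^N + 1 (twisted FFT)."""
    return np.fft.fft(np.asarray(poly, dtype=float) * _TWIST)


def from_transform_domain(spec):
    """Inverse transform back to integer coefficients."""
    return np.rint((np.fft.ifft(spec) / _TWIST).real).astype(np.int64)


def negacyclic_mul_schoolbook(a, b):
    """Reference O(N^2) product modulo X^N + 1, used only to check results."""
    out = np.zeros(N, dtype=np.int64)
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                out[k] += a[i] * b[j]
            else:
                out[k - N] -= a[i] * b[j]  # X^N wraps around to -1
    return out


def mac_without_reuse(a_list, b_list):
    """Sum of products where every product makes a full transform round trip."""
    acc = np.zeros(N, dtype=np.int64)
    transforms = 0
    for a, b in zip(a_list, b_list):
        prod = from_transform_domain(to_transform_domain(a) * to_transform_domain(b))
        transforms += 3  # two forward transforms and one inverse per product
        acc += prod
    return acc, transforms


def mac_with_reuse(a_list, b_specs):
    """Same sum, with b stored pre-transformed and the accumulator kept
    in the transform domain until a single final inverse transform."""
    acc_spec = np.zeros(N, dtype=complex)
    transforms = 0
    for a, b_spec in zip(a_list, b_specs):
        acc_spec += to_transform_domain(a) * b_spec
        transforms += 1  # only the fresh operand is transformed
    transforms += 1  # one inverse transform for the whole sum
    return from_transform_domain(acc_spec), transforms


rng = np.random.default_rng(0)
a_list = [rng.integers(-8, 8, N) for _ in range(4)]
b_list = [rng.integers(-8, 8, N) for _ in range(4)]
b_specs = [to_transform_domain(b) for b in b_list]  # transformed once, reused later

plain, n_plain = mac_without_reuse(a_list, b_list)
reused, n_reused = mac_with_reuse(a_list, b_specs)
expected = sum(negacyclic_mul_schoolbook(a, b) for a, b in zip(a_list, b_list))
```

In this toy count, summing four products takes 12 transforms without reuse and 5 with reuse, and both paths give the same coefficients. The saving comes from transforming the fixed operands once ahead of time and deferring the inverse transform until the accumulation is finished.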
Strix was presented at the 56th IEEE/ACM International Symposium on Microarchitecture (MICRO), held in Toronto from October 28 to November 1, 2023. Morphling was presented at the 30th IEEE International Symposium on High-Performance Computer Architecture (HPCA), held in Edinburgh from March 2 to 6, 2024.

Fig 1. Computation Breakdown in TFHE
Fig 2. Overall Architecture Design
Contact Information:
Joo-Young Kim, School of Electrical Engineering, KAIST
E-mail: jooyoung1203@kaist.ac.kr
Homepage: https://castlab.kaist.ac.kr