Researchers analyze current findings on the security and privacy of Confidential Computing-assisted machine learning, as well as the limitations of existing Trusted Execution Environment (TEE) systems

Source: https://arxiv.org/pdf/2208.10134.pdf

The evolution of machine learning (ML) keeps widening its range of applications. However, this expansion also enlarges the attack surface against ML security and privacy. ML models often rely on private and sometimes sensitive data, for example personal information such as names, photos, addresses, and preferences. Additionally, the model architecture itself can be stolen. In response to these risks, several methods for anonymizing data and for securing the various stages of the machine learning process have been developed and continue to be developed. However, these solutions are still only rarely applied.

In practice, the different steps of the ML workflow (training and inference) and the data the model depends on can be held by different stakeholders, such as customers and companies. Moreover, they can originate from or be stored in different places (the model provider's server, the data owner's premises, the cloud, etc.), and each of these entities is a potential point of attack. Confidential computing is a promising approach for achieving trustworthy, privacy-preserving ML. Given the importance of and challenges in securing machine learning models, a UK-based research team has published a systematization-of-knowledge (SoK) paper in which the authors frame the problem and propose future directions for realizing ML with Confidential Computing at the hardware, system, and framework levels.

The authors explain that confidential computing provides assurances of confidentiality and integrity by using Trusted Execution Environments (TEEs) to execute code on data. A TEE is a relatively recent mechanism for isolating and attesting the execution of code inside protected memory regions, known as enclaves or the secure world, away from the host's privileged system stack such as the operating system or the hypervisor. It rests on three key primitives: a root of trust for measurement, remote trust establishment and attestation, and trusted code execution and compartmentalization. In Confidential Computing-assisted ML, data and model owners must secretly provision their data and models into the TEE on an untrusted host. More specifically, owners prepare the model and/or data, perform remote attestation to verify the integrity of the remote TEE, and then establish secure communication channels with it. The main guarantee offered by Confidential Computing is the hardware-backed separation of enclaves/TEEs from the untrusted environment.
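To make this attest-then-provision flow concrete, here is a minimal, self-contained Python simulation. Everything in it, including the Enclave class, the Quote structure, and the HMAC-based "hardware" signature, is a toy stand-in for real TEE primitives (such as Intel SGX quotes or Arm TrustZone attestation tokens), so it illustrates only the shape of the protocol, not any actual vendor API.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass

# Stand-in for the hardware vendor's root-of-trust key. In a real TEE this
# key never leaves the silicon; here it is just a process-local secret.
HARDWARE_KEY = os.urandom(32)


@dataclass
class Quote:
    measurement: str   # hash of the code loaded into the enclave
    signature: bytes   # "hardware" signature over that measurement


class Enclave:
    """Toy enclave: isolated state plus the ability to produce a quote."""

    def __init__(self, code: bytes):
        self._code = code
        self._secrets: list[bytes] = []

    def get_quote(self) -> Quote:
        measurement = hashlib.sha256(self._code).hexdigest()
        sig = hmac.new(HARDWARE_KEY, measurement.encode(), hashlib.sha256).digest()
        return Quote(measurement, sig)

    def provision(self, secret: bytes) -> None:
        # Only reached after the owner's attestation check succeeds.
        self._secrets.append(secret)


def attest(enclave: Enclave, expected_measurement: str) -> bool:
    """The data/model owner's checks before releasing anything sensitive."""
    quote = enclave.get_quote()
    genuine = hmac.compare_digest(
        quote.signature,
        hmac.new(HARDWARE_KEY, quote.measurement.encode(), hashlib.sha256).digest(),
    )
    untampered = quote.measurement == expected_measurement
    return genuine and untampered


trusted_code = b"def infer(x): ..."
enclave = Enclave(trusted_code)
expected = hashlib.sha256(trusted_code).hexdigest()

if attest(enclave, expected):
    # In practice this transfer happens over a secure channel whose key is
    # bound to the attested enclave, not in plaintext as simulated here.
    enclave.provision(b"model weights / training data")
else:
    raise RuntimeError("attestation failed; do not send secrets")
```

The key design point the sketch captures is ordering: secrets move only after the owner has checked both that the quote was signed by genuine hardware and that the enclave's measurement matches the code the owner expects.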

The SoK article presents several recommendations. The authors note that the notion of confidentiality is still poorly delineated from security and integrity; a well-founded privacy guarantee requires a theoretically grounded protection objective, for example one based on differential privacy. They stress that the upstream part of the ML pipeline, such as data preparation, must also be protected, because leaving it unprotected has unavoidable downstream consequences. This can be accomplished by incorporating TEE-based verification into data signatures, and protecting the entire ML pipeline can further benefit from chaining multiple TEEs/enclaves (sketched below). Before designing ML frameworks that are compatible with TEEs and partitionable across heterogeneous TEEs, the privacy and integrity weaknesses of the various ML components (layers, feature maps, numerical computations) must be carefully studied. In addition, TEE system management is needed so that the most sensitive ML components are protected with high priority.
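The signed-pipeline recommendation can be illustrated with a short sketch. The Python below simulates a pipeline in which each stage (collection, preparation, training) signs its output and the next stage verifies that signature before consuming the data. The stage names are illustrative, and the HMAC keys held in ordinary process memory stand in for keys sealed inside attested enclaves; a real deployment would bind each key to a TEE measurement.

```python
import hashlib
import hmac
import os

# One signing key per pipeline stage. In a real deployment each key would
# live inside a TEE and be bound to that enclave's attested measurement.
STAGE_KEYS = {stage: os.urandom(32)
              for stage in ("collection", "preparation", "training")}


def sign(stage: str, payload: bytes) -> bytes:
    """A stage's TEE signs its output before handing it downstream."""
    return hmac.new(STAGE_KEYS[stage], payload, hashlib.sha256).digest()


def verify(stage: str, payload: bytes, tag: bytes) -> bool:
    """The next stage verifies provenance before consuming the data."""
    return hmac.compare_digest(tag, sign(stage, payload))


# Collection stage emits raw data plus a signature.
raw_data = b"sensor readings, user records, ..."
collection_tag = sign("collection", raw_data)

# Preparation stage refuses input whose provenance does not check out.
assert verify("collection", raw_data, collection_tag), "untrusted input"
prepared_data = raw_data.upper()  # placeholder for real cleaning/featurizing
preparation_tag = sign("preparation", prepared_data)

# Training stage performs the same check before consuming the prepared set.
assert verify("preparation", prepared_data, preparation_tag), "untrusted input"
```

The point of the chain is that a tampered or unverified artifact cannot silently flow into training: each hand-off fails closed unless the previous enclave vouched for the data.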

This article has outlined an exciting and challenging new era of protecting ML from privacy leaks and integrity breaches using confidential computing techniques. Although executing training and inference inside TEEs has been the subject of many studies, these approaches still struggle with the scarcity of trusted resources within TEEs. As a result, existing safeguards ensure the confidentiality and integrity of only the training/inference step of the full ML pipeline, while ML as a whole demands far more trusted resources. Confidential computing establishes a more trustworthy runtime environment for ML operations by building on a hardware-based root of trust. Still, the idea that simply hiding the training/inference process inside such enclaves is the best course of action needs to be reconsidered. Future researchers and developers need a better understanding of the privacy challenges across the whole ML pipeline so that future security measures can focus on its essential components.

This Article is written as a research summary article by Marktechpost Staff based on the research paper 'SoK: Machine Learning with Confidential Computing'. All Credit For This Research Goes To Researchers on This Project. Check out the paper.



Mahmoud is a PhD researcher in machine learning. He also holds a bachelor's degree in physical sciences and a master's degree in telecommunications systems and networks. His current research focuses on computer vision, stock market prediction, and deep learning. He has produced several scientific articles on person re-identification and on the robustness and stability of deep networks.

