TL;DR Quantum machine learning algorithms are vulnerable to adversarial attacks and optimally certifying the robustness of such systems enables us to reliably assess the risk involved when deploying such models.

To learn more, check: [Maurice Weber, Nana Liu, Bo Li, Ce Zhang, Zhikuan Zhao. Optimal Provable Robustness of Quantum Classification via Quantum Hypothesis Testing. npj Quantum Information, 2021.]
Some of our other related work in the classical setting: [Robustness against backdoor attacks][End-to-end Robustness]
This work is conducted in collaboration with Prof. Nana Liu from Shanghai Jiao Tong University and Prof. Bo Li from University of Illinois at Urbana-Champaign.

The interdisciplinary research between quantum computation and machine learning has been proliferating in the past years. Among the topics with most exciting practical prospects, quantum-assisted supervised learning and classification algorithms aspire to leverage the exponentially large Hilbert space uniquely accessible to quantum algorithms to either drastically speed up computational bottlenecks in classical protocols, or to construct quantum-enhanced feature transformations which are practically prohibitive to compute classically. While the promising potential of quantum enhancement in terms of model expressiveness continues to fill the community with anticipation for imminent real-world impact, the security aspect of these quantum models ensues to be an inevitable inquiry. In this article, we attend to a specific aspect of model security in quantum machine learning, namely its robustness against adversarial attacks.

In contemporary machine learning, models' robustness against adversarial attacks is an ever more vital topic. Here we are specifically concerned with the scenario where the input for a classification algorithm is exposed to potential adversarial perturbations within a maximum radius away from the benign input by some distance measure. The aim of a robustness certification protocol is to derive a provable guarantee that the predictive outcome of the model is unaffected by the input perturbation. In the classical setting, the importance of robustness  is evident for applications relevant to public safety. To take an example in autonomous driving, an adversarial attack in the form of a simple sticker or graffiti on road signs, making little difference to human perception, can trick the computer vision system directing the vehicle into failure in recognising a stop sign, resulting in potentially severe consequences. Such real world attacks were studied extensively by Eykholt et al. in [1]. As was shown by Liu & Wittek [2], this issue of vulnerability against input perturbation unfortunately persists in the quantum setting, threatening to undermine the practical utility of quantum enhancement of machine learning. Moreover, even before the attacks of malicious parties should become an immediate concern, quantum classification algorithms, being still embryonic in its applicational development, is naturally destined to contend with a noisy input situation due to implementational imperfection, prevalent in the noisy intermediate-scale quantum (NISQ) era. To this end, an adversarial attack simply represents the worst-case noise on the input quantum state. In order to simultaneously address the aforementioned issues of 1) security against malicious attack and 2) tolerance against worst-case input noise for quantum classifiers, we attempt to answer the question:

To what extend can we certify the robustness of quantum classifiers when the input is vulnerable to adversarial perturbation?

To make the setting more explicit, consider the "quantum computer vision" example shown in Figure 1, where images of mushrooms are encoded as quantum state vectors for a classification circuit composed of unitary gates followed by a projective measurement.

Our plan to proceed is as follows:

• First, we describe a fundemental connection between robustness of quantum classifiers and Quantum hypothesis testing (QHT), leverage it to establish a (provably) tight condition for adversarial robustness.
• Based on the robustness condition arised from considering QHT, we lay out an optimal protocol for robustness cerification.
• We demonstrate the application of the certification protocol and the robustness bound with a single-qubit pure state example.
• We will also discuss the technique of active input randomization which we show is capable of boosting the robustness of quantum classifiers.

### Robustness Bound Inspired by Quantum Hypothesis Testing

Quantum hypothesis testing (QHT) is among the most fundamental problems in quantum information science. In a nutshell, suppose you are given a quantum state for which it is known that it is described by one of two states, $$\sigma$$ or $$\rho$$ but it is a priori unknown by which of the two. The goal is then to perform a measurement on the given quantum state that can discriminate between the two possibilities with the lowest possible probability of making an error. Being such a longstanding and widely studied problem, QHT accounts for a rich theoretical toolkit for assessing the reliability of  discriminating between quantum states. In the following, we aim to make use of this in order to derive optimal robustness conditions.

When trying to obtain robustness guarantees for a given quantum classifier, we are interested in finding regions around an input $$\sigma$$ for which it is guaranteed that all states within that region are classified the same as $$\sigma$$.  Thus, when we are interested in characterising robust regions for a given classifier, then the question that we are really asking is, what is the region around a given input where the classifier at hand can not discriminate two quantum states by means of assigning them to different classes. Here, an intuitive connection with quantum hypothesis testing emerges: while classification robustness is concerned with the question of finding conditions for which a classifier can not discriminate quantum states, quantum hypothesis testing is concerned with finding means to optimally discriminate between quantum states. Viewing a classifier as a hypothesis test, we can then say that, if the best possible hypothesis test can not discriminate $$\rho$$ from $$\sigma$$, then the (worse) discriminator, our classifier, can also not discriminate between those states and will not assign them to distinct classes. This can be expressed more rigorously as

Robust regions consist of all states $$\rho$$, for which the optimal hypothesis test admits an error probability of at least 1/2 for discriminating $$\rho$$ from $$\sigma$$.

In addition, stemming from the optimality of QHT, this condition is tight in the sense that whenever a state $$\rho$$ violates the robustness condition, then there is a classifier for which $$\rho$$ and $$\sigma$$ are assigned different classes. In other words, tightness means that this robustness condition defines the largest possible robust region and any state which happens to be outside, is not guaranteed to be robust.

Establishing this connection provides a first step towards assessing the resilience of quantum programs against worst-case types of noise and leads to an implicit robustness condition in terms of a semidefinite optimization problem. However, we can obtain an operationally meaningful robustness condition in terms of distance measures by connecting optimal QHT error probabilities to Uhlmann's fidelity between pure quantum states, leading to the robustness condition $F(\rho,\,\sigma) > \frac{1}{2} + \sqrt{p_A(1-p_A)}$ where $$p_A$$ is the probability of measuring the most likely class and $$F$$ denotes the fidelity.

The connection between hypothesis testing and adversarial robustness has also been investigated in classical machine learning previously with the seminal work by Cohen, Rosenfeld & Kolter [3] and our follow-up work on provable robustness against backdoor attacks co-led by Xiaojun Xu [4]. This technique on which these classical approaches are based, is referred to as randomised smoothing and is led by the intuition that randomisation favours robustness and privacy. These papers then also sparked the initial inspiration to pursue a similar endeavour in the quantum realm and led to the interesting insight that, in contrast to classical classifiers, quantum classifiers are inherently probabilistic and hence no additional randomisation is required to obtain this robustness guarantee.

### Certification Protocol

The robustness bound derived from quantum hypothesis testing can be practically used to assess the resilience of a given quantum classifier against arbitrary input perturbations and in particular, the resilience against adversarial threats. In summary, the protocol to certify the robustness around a given pure state $$\sigma$$ consists of the following three steps:

1. Run the quantum circuit $$N$$ times and collect the measurement outcomes.
2. Estimate a lower bound $$p_A$$ on the probability of the most likely class from the empirical distribution.
3. If $$p_A > 1/2$$, then compute the robustness bound $$R = \frac{1}{2} + \sqrt{p_A(1-p_A)}$$ otherwise abstain from certification.

In order to assess the robustness, one would typically perform these steps in order to calculate a robustness radius for each point in a given test set, which the user deems to be secure. The larger these radii are, the more robust a classifier is against adversaries. This equips the user with a first idea as to how strong an adversary needs to be in order to attack the classifier once it is deployed in the field. Given this first assessment, the classifier can then be adjusted by increasing or decreasing the model robustness, and the robustness evaluation procedure can then be run again in the next cycle.

### Example: Robustness of a Single-qubit Quantum Classifier

We now turn our attention to a simple toy example: the classification of single-qubit quantum states living on the Bloch sphere as either $$+$$ (northern hemisphere) or $$-$$ (southern hemisphere). A classifier which induces a linear decision boundary can be visualised as a disk crossing the center of the Bloch sphere and classifying everything above as $$+$$ and everything below as $$-$$. The classifier corresponding to the disc in the figure below will predict the $$| 0 \rangle$$-state to be of class $$+$$ with prediction confidence $$p_A \approx 0.9$$ which leads to a robust region described by the grey spherical cap:

These robust region contains all "adversarial" states $$\rho$$ which result in an optimal type-II error probability of at least 1/2, or, put differently, this robust region contains all states which can be discriminated from the benign state $$\sigma$$ with a type-II error probability of at least 1/2. This illustrates the intriguing connection between quantum hypothesis testing and classification robustness: as soon as it becomes sufficiently "easy" to discriminate the adversarial state $$\rho$$ from $$\sigma$$ by means of an optimal measurement, the quantum state is too far away from the benign input state, and the classifier is no longer guaranteed to be robust.

### Input Randomisation

Both in classical and quantum machine learning, numerous security related protocols have been developed based on the observation that noise can be harnessed to increase privacy and robustness. In the quantum realm, Du et al. [5] have shown that naturally occurring noise sources, particularly prevalent in the NISQ era, can be exploited to enhance robustness guarantees by leveraging tools from quantum differential privacy in the presence of depolarisation noise. In line with these findings, incorporating depolarisation noise into our QHT formalism also leads to increased robustness against adversarial perturbations. In the following plot, our robust radius $$R_p^{\mathrm{QHT}}$$ is compared with Du et al.'s [5] bound derived from differential privacy $$R_p^{\mathrm{DP}}$$ and a bound derived from Hölder duality $$R_p^{\text{Hölder}}$$ for different levels of the depolarisation noise parameter $$p$$.

In summary, we have demonstrated the use of quantum hypothesis testing as a powerful proxy to obtain a bound for adversarial robustness for quantum classification algorithms. This bound directly gives rise to an optimal robustness certification protocol which can be useful for both validating a quantum classifier's tolerance against worst-case input imperfection and protecting quantum models' integrity against malicious attacks. Although the robustness bound from QHT is provably tight, an active input randomisation on a given classifier can boost the robustness radii. Contrasting the QHT approach with other means of obtaining a robustness bound for quantum classifiers numerically validates its tightness. For further details, please check out the technical manuscript of our work on arXiv.