### Notation and terminology

We use capitals to represent random variables and lowercase for their realizations. Sets are denoted by calligraphic font; the set \(\mathcal {S}\) is defined as \(\mathcal {S}=\{0,\ldots,N-1\}\). The mutual information (see e.g. [24]) between *X* and *Y* is *I*(*X*;*Y*). The probability density function (pdf) of a random variable \(X\in \mathbb {R}\) is written as *f*(*x*) and its cumulative distribution function (cdf) as *F*(*x*). We denote the number of minutiae found in a fingerprint by *Z*. The coordinates of the *j*’th minutia are **x**_{j}=(*x*_{j},*y*_{j}) and its orientation is *θ*_{j}. We write \(\boldsymbol {x}=(\boldsymbol {x}_{j})_{j=1}^{Z}\) and \(\boldsymbol {\theta }=(\theta _{j})_{j=1}^{Z}\). We use the abbreviations FRR = False Reject Rate, FAR = False Accept Rate, EER = Equal Error Rate, and ROC = Receiver Operating Characteristic. Bitwise xor of binary strings is denoted by ⊕.

### Helper Data Systems

A HDS is a cryptographic primitive that allows one to reproducibly extract a secret from a noisy measurement. A HDS consists of two algorithms: Gen (generation) and Rep (reproduction/reconstruction), see Fig. 1. The Gen algorithm takes a measurement *X* as input and generates the secret *S* and helper data *W*. The Rep algorithm takes as input *W* and a noisy measurement *Y*; it outputs an estimator \(\hat {S}\). If *Y* is sufficiently close to *X*, then \(\hat {S}=S\). The helper data should not reveal much about *S*. Ideally it holds that *I*(*W*;*S*)=0; this is known as *Zero Leakage* helper data.
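As a toy illustration of the Gen/Rep contract (our own example, not the construction used in this paper), the code-offset idea described later in this paper can be instantiated with a 3-bit repetition code. If the measurement *X* is uniform on {0,1}^{3}, then *W*=*X*⊕Enc(*S*) is uniform for every value of *S*, so *I*(*W*;*S*)=0:

```python
import secrets

def enc(bit):
    """Encode one bit with a 3x repetition code."""
    return [bit] * 3

def dec(block):
    """Majority vote; corrects a single flipped bit."""
    return int(sum(block) >= 2)

def gen(x):
    """Gen: takes a 3-bit measurement x, outputs secret s and helper data w."""
    s = secrets.randbelow(2)                    # uniformly random secret bit
    w = [xb ^ cb for xb, cb in zip(x, enc(s))]  # offset between x and a codeword
    return s, w

def rep(y, w):
    """Rep: reconstructs s whenever y differs from x in at most one bit."""
    return dec([yb ^ wb for yb, wb in zip(y, w)])
```

Here *y*⊕*w* = *y*⊕*x*⊕Enc(*s*) is a codeword with at most one bit flipped, so the majority vote recovers *s*.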

### Two-stage HDS template protection scheme

Figure 2 shows the two-stage HDS architecture as described e.g. in [4]. The enrollment measurement *x* is transformed to the spectral representation \((x_{i})_{i=1}^{M}\) on *M* grid points. The first-stage enrollment procedure Gen1 is applied to each *x*_{i} individually, yielding short (mostly one-bit) secrets *s*_{i} and zero-leakage helper data *w*_{i}. The *s*_{1}…*s*_{M} are concatenated into a string *k*. Residual noise in *k* is dealt with by the second-stage HDS (Code Offset Method), whose Gen2 produces a secret *c* and helper data *r*. A hash *h*(*c*||*z*) is computed, where *z* is a salt. The hash and the salt are stored. In the verification phase, the noisy *y* is processed as shown in the bottom half of Fig. 2. The reconstructed secret \(\hat {c}\) is hashed with the salt *z*; the resulting hash is compared to the stored hash.
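The final hash-and-salt step of enrollment and verification can be sketched as follows (SHA-256 and the function names are our illustrative choices; the paper does not fix a particular hash function here):

```python
import hashlib
import secrets

def finish_enrollment(c):
    """Store a salted hash of the secret c (bytes); (z, h) go to storage."""
    z = secrets.token_bytes(16)           # salt z
    h = hashlib.sha256(c + z).digest()    # h(c || z)
    return z, h

def verify(c_hat, z, h):
    """Accept iff the reconstructed secret hashes to the stored value."""
    return hashlib.sha256(c_hat + z).digest() == h
```

Only (*z*, *h*) are stored; the secret *c* itself never leaves the enrollment device.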

### Minutia-pair spectral representation

Minutiae are features in a fingerprint, e.g. ridge endings and bifurcations. We briefly describe the minutia-pair spectral representation introduced in [20]. For minutia indices *a*,*b*∈{1,…,*Z*} the distance and angle are given by *R*_{ab}=|**x**_{a}−**x**_{b}| and \(\tan \phi _{ab}= \frac {y_{a}-y_{b}}{x_{a}-x_{b}}\). The spectral function \(\mathcal {M}_{\boldsymbol {x\theta }}\) is defined as

$$ \mathcal{M}_{\boldsymbol{x\theta}}(q,R) = \sum_{{a,b\in\{1,\ldots,Z\}}\atop{a< b}} e^{iq\phi_{ab}} e^{-\frac{\left(R-R_{ab}\right)^{2}}{2\sigma^{2}}} e^{i (\theta_{b} - \theta_{a})}, $$

(1)

where *σ* is a width parameter. The spectral function is evaluated on a discrete (*q*,*R*) grid. A pair (*q*,*R*) is referred to as a grid point. The variable *q* is integer and can be interpreted as the Fourier conjugate of an angular variable, i.e. a harmonic. The function \(\mathcal {M}_{\boldsymbol {x\theta }}\) is invariant under translations of *x*. When a rotation of the whole fingerprint image is applied over an angle *δ*, the spectral function transforms in a simple way,

$$ \mathcal{M}_{\boldsymbol{x\theta}}(q,R) \to e^{iq\delta} \mathcal{M}_{\boldsymbol{x\theta}}(q,R). $$

(2)
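A direct, unoptimised implementation of Eq. (1) might look as follows (a sketch; the grid and the width *σ* are illustrative choices). The rotation property (2) follows because a rotation over *δ* shifts every *φ*_{ab} by *δ* while leaving *R*_{ab} and *θ*_{b}−*θ*_{a} unchanged:

```python
import numpy as np

def spectral_function(xy, theta, q_vals, R_vals, sigma=1.0):
    """Evaluate the minutia-pair spectral function of Eq. (1) on a (q, R) grid.

    xy:    (Z, 2) array of minutia coordinates
    theta: (Z,) array of minutia orientations
    """
    Z = len(xy)
    M = np.zeros((len(q_vals), len(R_vals)), dtype=complex)
    for a in range(Z):
        for b in range(a + 1, Z):                 # sum over pairs a < b
            dx = xy[a][0] - xy[b][0]
            dy = xy[a][1] - xy[b][1]
            R_ab = np.hypot(dx, dy)               # |x_a - x_b|
            phi_ab = np.arctan2(dy, dx)           # tan(phi) = (y_a-y_b)/(x_a-x_b)
            for qi, q in enumerate(q_vals):
                for ri, R in enumerate(R_vals):
                    M[qi, ri] += (np.exp(1j * q * phi_ab)
                                  * np.exp(-(R - R_ab) ** 2 / (2 * sigma ** 2))
                                  * np.exp(1j * (theta[b] - theta[a])))
    return M
```

Since only coordinate differences enter, translating all minutiae leaves the output unchanged, and rotating the image by *δ* multiplies the row for harmonic *q* by *e*^{iqδ}.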

### Zero Leakage Helper Data Systems

We briefly review the ZLHDS developed in [4, 5] for quantisation of an enrollment measurement \(X\in \mathbb {R}\). The density function of *X* is *f*, and the cumulative distribution function is *F*. The verification measurement is *Y*. The *X* and *Y* are considered to be noisy versions of an underlying ‘true’ value. They have zero mean and variance \(\sigma _{X}^{2}\), \(\sigma _{Y}^{2}\), respectively. The correlation between *X* and *Y* can be characterised by writing *Y*=*λ**X*+*V*, where *λ*∈[0,1] is the attenuation parameter and *V* is zero-mean noise independent of *X*, with variance \(\sigma _{V}^{2}\). It holds that \(\sigma ^{2}_{Y} =\lambda ^{2} \sigma ^{2}_{X}+\sigma ^{2}_{V}\). We consider the *identical conditions* case: the amount of noise is the same during enrollment and reconstruction. In this situation we have \(\sigma ^{2}_{X} = \sigma ^{2}_{Y}\) and \(\lambda ^{2} = 1-\frac {\sigma ^{2}_{V}}{\sigma ^{2}_{X}}\).

The real axis \(\mathbb {R}\) is divided into *N* intervals \({\mathcal {A}}_{\alpha }=(\Omega _{\alpha },\Omega _{\alpha +1})\), with \(\alpha \in \mathcal {S}\), \({\mathcal {S}}=\{0,\ldots,N-1\}\). Let \(p_{\alpha }=\Pr [X\in {\mathcal {A}}_{\alpha }]\). The quantisation boundaries are given by \(\Omega _{\alpha }=F^{\text {inv}}\left (\sum _{j=0}^{\alpha -1}p_{j}\right)\). The Gen algorithm produces the secret *s* as \(s=\text {max} \{ \alpha \in \mathcal {S}: x\geq \Omega _{\alpha } \}\) and the helper data *w*∈[0,1) as \(w=\left [F(x)-\sum _{j=0}^{s-1}p_{j}\right ]/p_{s}\). The inverse relation, for computing *x* as a function of *s* and *w*, is given by \(\xi _{s,w}=F^{\text {inv}}\left (\sum _{j=0}^{s-1}p_{j}+wp_{s}\right)\).
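For a standard-Gaussian *X* (an assumption made here for concreteness; `statistics.NormalDist` supplies *F* and *F*^{inv}), the Gen algorithm and the inverse map *ξ*_{s,w} can be sketched in a few lines of Python; the function names are ours:

```python
from statistics import NormalDist

def gen1(x, p, dist=NormalDist()):
    """First-stage Gen: quantise x into secret s and helper data w in [0,1).

    p: list of interval probabilities p_alpha summing to 1.
    """
    F = dist.cdf(x)
    cum = 0.0                                 # running sum of p_0 .. p_{s-1}
    for s, ps in enumerate(p):
        if F < cum + ps or s == len(p) - 1:   # x falls in interval A_s
            return s, (F - cum) / ps
        cum += ps

def xi(s, w, p, dist=NormalDist()):
    """Inverse map xi_{s,w}: the x corresponding to the pair (s, w)."""
    return dist.inv_cdf(sum(p[:s]) + w * p[s])
```

By construction *ξ*_{s,w} inverts Gen exactly: feeding (*s*,*w*) back through *F*^{inv} returns the enrolled *x*.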

The Rep algorithm computes the estimator \(\hat {S}\) as the value in \(\mathcal {S}\) for which it holds that \(y\in (\tau _{\hat {s},w}, \tau _{\hat {s}+1,w})\), where the parameters *τ* are decision boundaries. In the case of Gaussian noise these boundaries are given by

$$ \tau_{\alpha,w}=\lambda\frac{\xi_{\alpha-1,w}+\xi_{\alpha,w}}2 +\frac{\sigma_{V}^{2} \ln\frac{p_{\alpha-1}}{p_{\alpha}}} {\lambda(\xi_{\alpha,w}-\xi_{\alpha-1,w})}. $$

(3)

Here it is understood that *ξ*_{−1,w}=−*∞* and *ξ*_{N,w}=*∞*, resulting in *τ*_{0,w}=−*∞*, *τ*_{N,w}=*∞*.
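The reproduction step, Eq. (3), can be sketched as follows (again assuming a standard-Gaussian *X*; *ξ* is recomputed locally so the snippet stands alone, and the function name `rep1` is ours):

```python
import math
from statistics import NormalDist

def rep1(y, w, p, lam, sigma_V, dist=NormalDist()):
    """First-stage reproduction: the s_hat with y in (tau_{s_hat,w}, tau_{s_hat+1,w})."""
    N = len(p)

    def xi(s):                    # with xi_{-1,w} = -inf and xi_{N,w} = +inf
        if s < 0:
            return -math.inf
        if s >= N:
            return math.inf
        return dist.inv_cdf(sum(p[:s]) + w * p[s])

    def tau(a):                   # Eq. (3); tau_{0,w} = -inf, tau_{N,w} = +inf
        if a <= 0:
            return -math.inf
        if a >= N:
            return math.inf
        lo, hi = xi(a - 1), xi(a)
        return (lam * (lo + hi) / 2
                + sigma_V ** 2 * math.log(p[a - 1] / p[a]) / (lam * (hi - lo)))

    for s in range(N):
        if tau(s) < y <= tau(s + 1):
            return s
```

For a noiseless verification measurement *y*=*λx* the decision boundaries straddle *λξ*_{s,w}, so the enrolled secret is returned.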

The above scheme ensures that *I*(*W*;*S*)=0 and that the reconstruction errors are minimised.

### The Code Offset Method (COM)

We briefly describe how the COM is used as a Secure Sketch. Let *C* be a linear binary error correcting code with message space {0,1}^{m} and codewords in {0,1}^{n}. It has an encoding function Enc: {0,1}^{m}→{0,1}^{n}, a syndrome function Syn: {0,1}^{n}→{0,1}^{n−m}, and a syndrome decoder SynDec: {0,1}^{n−m}→{0,1}^{n}. In Fig. 2, Gen2 computes the helper data *r* as *r*=Syn *k*. The *c* in Fig. 2 is equal to *k*. Rep2 computes the reconstruction \(\hat {c}=\hat k\oplus \texttt {SynDec}(r\oplus \texttt {Syn}\,\hat k)\).
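As a small illustration (our choice of code, not the one used in this paper), the secure-sketch computations can be carried out with the [7,4] Hamming code, whose syndrome decoder corrects any single bit error between *k* and \(\hat k\):

```python
import numpy as np

# Parity-check matrix of the [7,4] Hamming code: column j is the binary
# representation of j, so the syndrome of a single-bit error at position j
# reads off j directly.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

def syn(v):
    """Syndrome function Syn: {0,1}^7 -> {0,1}^3."""
    return tuple(H @ v % 2)

def syn_dec(s):
    """Syndrome decoder SynDec: returns the error pattern of weight <= 1."""
    e = np.zeros(7, dtype=int)
    idx = s[0] + 2 * s[1] + 4 * s[2]    # position of the flipped bit (0 = none)
    if idx:
        e[idx - 1] = 1
    return e

def gen2(k):
    """Gen2: helper data r = Syn k."""
    return syn(k)

def rep2(k_hat, r):
    """Rep2: c_hat = k_hat XOR SynDec(r XOR Syn k_hat)."""
    s = tuple((np.array(r) + H @ k_hat) % 2)
    return (k_hat + syn_dec(s)) % 2
```

Since *r*⊕Syn \(\hat k\) = Syn(*k*⊕\(\hat k\)) by linearity, SynDec recovers the error pattern and the xor with \(\hat k\) restores *k*.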

### Polar codes

Polar codes, proposed by Arıkan [25], are a class of linear block codes that approach the Shannon limit even at small code length. They are based on the repeated application of the *polarisation* operation \(\left (\begin {array}{cc} 1 & 0\\ 1 & 1 \end {array}\right)\) on two bits of channel input. Applying this operation creates two virtual channels, one of which is better than the original channel and one worse. For *n* channel inputs, repeating this procedure eventually yields *m* near-perfect virtual channels, with *m*/*n* close to capacity, and *n*−*m* near-useless channels. The *m*-bit message is sent over the good channels, while the bad ones are ‘frozen’, i.e. used to send a fixed string known a priori by the recipient.
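The polarisation operation extends to block length *n*=2^{k} by taking Kronecker powers of the 2×2 kernel. A minimal sketch of the resulting transform (omitting the bit-reversal permutation and the code construction, i.e. the selection of good channels):

```python
import numpy as np

def polar_transform(u):
    """Apply the polar transform F^{(x)log2(n)} with kernel F = [[1,0],[1,1]].

    u: bit vector whose length n must be a power of 2.
    """
    n = len(u)
    F = np.array([[1, 0], [1, 1]])
    G = np.array([[1]])
    while G.shape[0] < n:          # build F^{(x)k} by repeated Kronecker product
        G = np.kron(G, F)
    return (u @ G) % 2
```

For *n*=2 this reproduces the basic operation (*u*_{1},*u*_{2})↦(*u*_{1}⊕*u*_{2},*u*_{2}); over GF(2) the kernel is its own inverse, so the transform is an involution.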

Polar codes have a number of advantages, such as a flexible code rate and excellently performing soft-decision decoders. The most popular decoder is the Successive Cancellation Decoder (SCD), which sequentially estimates the message bits \((c_{i})_{i=1}^{m}\) from the frozen bits and the previously estimated bits \(\hat {c}_{1},\ldots,\hat {c}_{i-1}\). Polar codes have recently been adopted in the 5G wireless standard, especially for control channels, which have short block length (≤1024). Because of these advantages we have chosen polar codes for implementing the error correction step in our HDS scheme (see Section 6).