Notation and terminology
We use capitals to represent random variables, and lowercase for their realizations. Sets are denoted by calligraphic font. The set \(\mathcal{S}\) is defined as \(\mathcal{S}=\{0,\ldots,N-1\}\). The mutual information (see e.g. [24]) between X and Y is I(X;Y). The probability density function (pdf) of the random variable \(X\in\mathbb{R}\) is written as f(x) and its cumulative distribution function (cdf) as F(x). We denote the number of minutiae found in a fingerprint by Z. The coordinates of the j'th minutia are \(\boldsymbol{x}_{j}=(x_{j},y_{j})\) and its orientation is \(\theta_{j}\). We write \(\boldsymbol{x}=(\boldsymbol{x}_{j})_{j=1}^{Z}\) and \(\boldsymbol{\theta}=(\theta_{j})_{j=1}^{Z}\). We will use the abbreviations FRR (False Reject Rate), FAR (False Accept Rate), EER (Equal Error Rate), and ROC (Receiver Operating Characteristic). Bitwise XOR of binary strings is denoted as ⊕.
Helper Data Systems
A HDS is a cryptographic primitive that allows one to reproducibly extract a secret from a noisy measurement. A HDS consists of two algorithms: Gen (generation) and Rep (reproduction/reconstruction), see Fig. 1. The Gen algorithm takes a measurement X as input and generates the secret S and helper data W. The Rep algorithm has as input W and a noisy measurement Y; it outputs an estimator \(\hat{S}\). If Y is sufficiently close to X then \(\hat{S}=S\). The helper data should not reveal much about S. Ideally it holds that I(W;S)=0. This is known as Zero Leakage helper data.
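To make the Gen/Rep contract concrete, here is a toy sketch that quantises a scalar with a fixed step. The function names and the quantiser are our own illustration, not taken from the literature, and its helper data is not zero-leakage; it only shows the data flow.

```python
def gen(x: float, step: float = 1.0):
    """Gen: measurement x -> (secret s, helper data w).
    Toy one-bit secret: parity of the quantisation index."""
    k = round(x / step)
    s = k % 2
    w = x - k * step          # offset to the quantisation centre
    return s, w

def rep(y: float, w: float, step: float = 1.0):
    """Rep: shift the noisy y by the helper data and re-quantise.
    Recovers s whenever |y - x| < step/2."""
    k = round((y - w) / step)
    return k % 2
```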
Two-stage HDS template protection scheme
Figure 2 shows the two-stage HDS architecture as described e.g. in [4]. The enrollment measurement x is transformed to the spectral representation \((x_{i})_{i=1}^{M}\) on M grid points. The first-stage enrollment procedure Gen1 is applied to each xi individually, yielding short (mostly one-bit) secrets si and zero-leakage helper data wi. The secrets \(s_{1},\ldots,s_{M}\) are concatenated into a string k. Residual noise in k is dealt with by the second-stage HDS (Code Offset Method), whose Gen2 produces a secret c and helper data r. A hash h(c||z) is computed, where z is a salt. The hash and the salt are stored. In the verification phase, the noisy y is processed as shown in the bottom half of Fig. 2. The reconstructed secret \(\hat{c}\) is hashed with the salt z; the resulting hash is compared to the stored hash.
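The data flow of Fig. 2 can be summarised in a few lines of Python. Here gen1, rep1, syn and rep2 are placeholders for the first-stage ZLHDS and the second-stage code operations described below; all names are ours, and this is a sketch rather than the paper's implementation.

```python
import hashlib, os

def enroll(x_list, gen1, syn):
    """Enrollment: first-stage Gen per grid point, then Code Offset and hash."""
    pairs = [gen1(xi) for xi in x_list]
    k = bytes(s for s, _ in pairs)        # concatenated one-bit secrets (one byte per bit, for clarity)
    w = [wi for _, wi in pairs]           # first-stage helper data
    r = syn(k)                            # second-stage helper data
    z = os.urandom(16)                    # salt
    return w, r, z, hashlib.sha256(k + z).digest()   # here c equals k

def verify(y_list, w, r, z, stored_hash, rep1, rep2):
    """Verification: reconstruct k-hat, correct it via the Code Offset, compare hashes."""
    k_hat = bytes(rep1(yi, wi) for yi, wi in zip(y_list, w))
    c_hat = rep2(k_hat, r)
    return hashlib.sha256(c_hat + z).digest() == stored_hash
```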
Minutia-pair spectral representation
Minutiae are features in a fingerprint, e.g. ridge endings and bifurcations. We briefly describe the minutia-pair spectral representation introduced in [20]. For minutia indices a,b∈{1,…,Z} the distance and angle are given by \(R_{ab}=|\boldsymbol{x}_{a}-\boldsymbol{x}_{b}|\) and \(\tan \phi_{ab}= \frac{y_{a}-y_{b}}{x_{a}-x_{b}}\). The spectral function \(\mathcal{M}_{\boldsymbol{x\theta}}\) is defined as
$$ \mathcal{M}_{\boldsymbol{x\theta}}(q,R) = \sum_{{a,b\in\{1,\ldots,Z\}}\atop{a< b}} e^{iq\phi_{ab}} e^{-\frac{\left(R-R_{ab}\right)^{2}}{2\sigma^{2}}} e^{i (\theta_{b} - \theta_{a})}, $$
(1)
where σ is a width parameter. The spectral function is evaluated on a discrete (q,R) grid. A pair (q,R) is referred to as a grid point. The variable q is an integer and can be interpreted as the Fourier conjugate of an angular variable, i.e. a harmonic. The function \(\mathcal{M}_{\boldsymbol{x\theta}}\) is invariant under translations of x. When a rotation of the whole fingerprint image is applied over an angle δ, the spectral function transforms in a simple way,
$$ \mathcal{M}_{\boldsymbol{x\theta}}(q,R) \to e^{iq\delta} \mathcal{M}_{\boldsymbol{x\theta}}(q,R). $$
(2)
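A direct (unoptimised) NumPy transcription of Eq. (1), evaluated at a single grid point, may clarify the construction; the function and argument names are ours. Note that θb−θa is unchanged by a global rotation, so the transformation law of Eq. (2) comes entirely from the φab terms.

```python
import numpy as np

def spectral_function(coords, thetas, q, R, sigma):
    """Evaluate M_xtheta(q, R) of Eq. (1).
    coords: (Z, 2) array of minutia positions; thetas: (Z,) orientations."""
    total = 0.0 + 0.0j
    Z = len(coords)
    for a in range(Z):
        for b in range(a + 1, Z):
            d = coords[a] - coords[b]                 # (x_a - x_b, y_a - y_b)
            R_ab = np.hypot(d[0], d[1])               # pair distance
            phi_ab = np.arctan2(d[1], d[0])           # pair angle
            total += (np.exp(1j * q * phi_ab)
                      * np.exp(-(R - R_ab) ** 2 / (2 * sigma ** 2))
                      * np.exp(1j * (thetas[b] - thetas[a])))
    return total
```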
Zero Leakage Helper Data Systems
We briefly review the ZLHDS developed in [4, 5] for quantisation of an enrollment measurement \(X\in\mathbb{R}\). The density function of X is f, and the cumulative distribution function is F. The verification measurement is Y. X and Y are considered to be noisy versions of an underlying 'true' value. They have zero mean and variance \(\sigma_{X}^{2}\), \(\sigma_{Y}^{2}\), respectively. The correlation between X and Y can be characterised by writing Y=λX+V, where λ∈[0,1] is the attenuation parameter and V is zero-mean noise independent of X, with variance \(\sigma_{V}^{2}\). It holds that \(\sigma^{2}_{Y}=\lambda^{2}\sigma^{2}_{X}+\sigma^{2}_{V}\). We consider the identical-conditions case: the amount of noise is the same during enrollment and reconstruction. In this situation we have \(\sigma^{2}_{X}=\sigma^{2}_{Y}\) and \(\lambda^{2}=1-\frac{\sigma^{2}_{V}}{\sigma^{2}_{X}}\).
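For example (the numbers are ours, purely illustrative): a noise variance of \(\sigma_V^2 = 0.19\,\sigma_X^2\) gives
$$ \lambda=\sqrt{1-\tfrac{\sigma_{V}^{2}}{\sigma_{X}^{2}}}=\sqrt{1-0.19}=0.9. $$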
The real axis \(\mathbb {R}\) is divided into N intervals \({\mathcal {A}}_{\alpha }=(\Omega _{\alpha },\Omega _{\alpha +1})\), with \(\alpha \in \mathcal {S}\), \({\mathcal {S}}=\{0,\ldots,N-1\}\). Let \(p_{\alpha }=\Pr [X\in {\mathcal {A}}_{\alpha }]\). The quantisation boundaries are given by \(\Omega _{\alpha }=F^{\text {inv}}\left (\sum _{j=0}^{\alpha -1}p_{j}\right)\). The Gen algorithm produces the secret s as \(s=\text {max} \{ \alpha \in \mathcal {S}: x\geq \Omega _{\alpha } \}\) and the helper data w∈[0,1) as \(w=\left [F(x)-\sum _{j=0}^{s-1}p_{j}\right ]/p_{s}\). The inverse relation, for computing x as a function of s and w, is given by \(\xi _{s,w}=F^{\text {inv}}\left (\sum _{j=0}^{s-1}p_{j}+wp_{s}\right)\).
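A sketch of Gen for the special case of equiprobable intervals (\(p_{\alpha}=1/N\)) and standard Gaussian X; the distribution and parameter choices are our example, not prescribed by [4, 5].

```python
from scipy.stats import norm

def zl_gen(x, N=2, F=norm.cdf):
    """ZLHDS Gen with equiprobable intervals: p_alpha = 1/N for all alpha."""
    u = F(x)                       # F(X) is uniform on [0,1)
    s = min(int(u * N), N - 1)     # s = max{alpha : x >= Omega_alpha}
    w = u * N - s                  # w = [F(x) - s/N] / (1/N), in [0,1)
    return s, w

def xi(s, w, N=2, Finv=norm.ppf):
    """Inverse relation: xi_{s,w} = F^inv(s/N + w/N)."""
    return Finv((s + w) / N)
```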
The Rep algorithm computes the estimator \(\hat{S}\) as the value in \(\mathcal{S}\) for which it holds that \(y\in(\tau_{\hat{s},w}, \tau_{\hat{s}+1,w})\), where the parameters τ are decision boundaries. In the case of Gaussian noise these boundaries are given by
$$ \tau_{\alpha,w}=\lambda\frac{\xi_{\alpha-1,w}+\xi_{\alpha,w}}2 +\frac{\sigma_{V}^{2} \ln\frac{p_{\alpha-1}}{p_{\alpha}}} {\lambda(\xi_{\alpha,w}-\xi_{\alpha-1,w})}. $$
(3)
Here it is understood that \(\xi_{-1,w}=-\infty\) and \(\xi_{N,w}=\infty\), resulting in \(\tau_{0,w}=-\infty\) and \(\tau_{N,w}=\infty\).
The above scheme ensures that I(W;S)=0 and that the reconstruction errors are minimised.
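A corresponding sketch of the reproduction step, implementing the decision boundaries of Eq. (3) for general interval probabilities p; again, the names and setting are our illustration.

```python
import numpy as np
from scipy.stats import norm

def zl_rep(y, w, lam, sigma_V, p, Finv=norm.ppf):
    """Reproduce s-hat: find the interval (tau_{s,w}, tau_{s+1,w}) containing y.
    p: list of interval probabilities p_alpha; lam: attenuation parameter."""
    N = len(p)
    cum = np.concatenate(([0.0], np.cumsum(p)))   # partial sums of p_j
    xi = lambda a: Finv(cum[a] + w * p[a])        # xi_{a,w}
    s_hat = 0                                     # tau_{0,w} = -infinity
    for a in range(1, N):
        tau = (lam * (xi(a - 1) + xi(a)) / 2
               + sigma_V ** 2 * np.log(p[a - 1] / p[a])
               / (lam * (xi(a) - xi(a - 1))))     # Eq. (3)
        if y >= tau:
            s_hat = a
    return s_hat
```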
The Code Offset Method (COM)
We briefly describe how the COM is used as a Secure Sketch. Let C be a linear binary error correcting code with message space {0,1}m and codewords in {0,1}n. It has an encoding Enc: {0,1}m→{0,1}n, a syndrome function Syn: {0,1}n→{0,1}n−m and a syndrome decoder SynDec: {0,1}n−m→{0,1}n. In Fig. 2 the Gen2 computes the helper data r as r=Syn k. The c in Fig. 2 is equal to k. The Rep2 computes the reconstruction \(\hat {c}=\hat k\oplus \texttt {SynDec}(r\oplus \texttt {Syn}\,\hat k)\).
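The following sketch instantiates the COM with the Hamming(7,4) code, whose syndrome decoder is a simple lookup. The code choice is ours, made for brevity; Section 6 uses Polar codes instead.

```python
import numpy as np

# Parity-check matrix of Hamming(7,4): column j is the binary expansion of j.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

def syn(k):
    """Syn: {0,1}^7 -> {0,1}^3."""
    return H.dot(k) % 2

def syn_dec(s):
    """SynDec: the syndrome is the position (1..7) of a single-bit error."""
    e = np.zeros(7, dtype=int)
    pos = s[0] + 2 * s[1] + 4 * s[2]
    if pos:
        e[pos - 1] = 1
    return e

def gen2(k):                     # helper data r = Syn k
    return syn(k)

def rep2(k_hat, r):              # c-hat = k-hat XOR SynDec(r XOR Syn k-hat)
    return (k_hat + syn_dec((r + syn(k_hat)) % 2)) % 2
```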
Polar codes
Polar codes, proposed by Arıkan [25], are a class of linear block codes that get close to the Shannon limit even at small code length. They are based on the repeated application of the polarisation operation \(\left(\begin{array}{cc} 1 & 0\\ 1 & 1 \end{array}\right)\) on two bits of channel input. Applying this operation creates two virtual channels, one of which is better than the original channel and one worse. For n channel inputs, repeating this procedure eventually yields m near-perfect virtual channels, with m/n close to capacity, and n−m near-useless channels. The m-bit message is sent over the good channels, while the bad ones are 'frozen', i.e. used to send a fixed string known a priori by the recipient.
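The recursive structure of the encoder can be sketched in a few lines; frozen-set selection and the decoder are omitted, and the bit-reversal permutation is left out for brevity (our simplification).

```python
import numpy as np

def polar_transform(u):
    """Apply the kernel [[1,0],[1,1]] recursively to a length-2^n bit vector,
    computing x = u F^(tensor n) over GF(2) (without bit reversal)."""
    n = len(u)
    if n == 1:
        return u
    half = n // 2
    left = polar_transform(u[:half] ^ u[half:])
    right = polar_transform(u[half:])
    return np.concatenate([left, right])

u = np.array([0, 1, 0, 1, 1, 0, 0, 1])
x = polar_transform(u)
assert np.array_equal(polar_transform(x), u)   # F^(tensor n) is an involution mod 2
```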
Polar codes have a number of advantages, such as a flexible code rate and excellently performing soft-decision decoders. The most popular decoder is the Successive Cancellation Decoder (SCD), which sequentially estimates the message bits \((c_{i})_{i=1}^{m}\) from the frozen bits and the previously estimated bits \(\hat{c}_{1},\ldots,\hat{c}_{i-1}\). Polar codes have recently been adopted for the 5G wireless standard, especially for control channels, which have short block length (≤1024). Because of these advantages we have chosen Polar codes for implementing the error correction step in our HDS scheme (see Section 6).