# Anticollusion solutions for asymmetric fingerprinting protocols based on client side embedding

- Tiziano Bianchi
^{1}Email author, - Alessandro Piva
^{2}and - Dasara Shullani
^{2}

**2015**:6

https://doi.org/10.1186/s13635-015-0023-y

© Bianchi et al. 2015

**Received: **15 January 2015

**Accepted: **7 August 2015

**Published: **28 August 2015

## Abstract

In this paper, we propose two different solutions for making a recently proposed asymmetric fingerprinting protocol based on client-side embedding robust to collusion attacks. The first solution is based on projecting a client-owned random fingerprint, securely obtained through existing cryptographic protocols, using for each client a different random matrix generated by the server. The second solution consists in assigning to each client a Tardos code, which can be done using existing asymmetric protocols, and modulating such codes using a specially designed random matrix. Suitable accusation strategies are proposed for both solutions, and their performance under the averaging attack followed by the addition of Gaussian noise is analytically derived. Experimental results show that the analytical model accurately predicts the performance of a realistic system. Moreover, the results also show that the solution based on independent random projections outperforms the solution based on Tardos codes, for different choices of parameters and under different attack models.

### Keywords

Fingerprinting Anticollusion codes Buyer-seller watermarking protocol Client-side embedding Secure watermark embedding## 1 Introduction

The wide availability of platforms for the distribution of multimedia contents poses several problems regarding copyright protection. A possible solution to prevent copyright violations is to embed a unique code, or fingerprint, in the distributed copies, so that illegally redistributed copies could be traced back to the entity responsible for the violation. In the literature, several watermarking techniques have been proposed with the aim of embedding a unique fingerprint in a multimedia content [1, 2]. However, for the practical deployment of such systems, a number of problems have to be solved.

First, the actual fingerprint can not be embedded by the distributor alone, since a guilty customer could claim to have been framed by a malicious distributor, undermining the credibility of the whole system. Hence, fingerprinting must be either handled by a trusted third party, or performed interactively via secure cryptographic protocols [3–6]. Second, if individual fingerprinted copies for each customer are generated at the server side, the amount of computational and bandwidth resources needed by the server may soon become prohibitively high in large-scale systems. A solution to this issue is offered by client-side embedding methods, in which the server distributes the same encrypted copy of the content to all the clients, along with different client-specific decryption keys allowing each user to decrypt a slightly different version of the content, bearing a different watermark [7–10]. An alternative solution consists in creating a few streams, each with different but constant watermarks, and force the client to switch between the streams [11].

Besides these first two issues, a practical system has also to cope with malicious customers trying to attack the system. In this sense, a very effective attack is the so called *collusion* attack. In this attack, a coalition of customers combine their differently watermarked copies in order to obtain a new copy in which the watermark is much harder to be detected. As to multimedia fingerprinting, several studies have analyzed the robustness of watermarking techniques to collusion attacks [12, 13], the effectiveness of different detectors [14], and the robustness to different collusion strategies [15]. An alternative solution is to construct specific anticollusion codes that can be embedded in a content [16]. Asymptotically optimal probabilistic anticollusion codes were first proposed by Tardos in [17], and later optimized under several aspects [18–31].

The current literature on fingerprinting usually addresses only a single issue at a time. There are few solutions that try to simultaneously solve the above issues. In [32], we proposed an asymmetric fingerprint protocol based on a client-side distribution paradigm. The above solution effectively solves both customer’s rights and scalability problems: nevertheless, collusion resistance was not addressed in this proposal. Recently, a few works have investigated the use of anticollusion codes in asymmetric fingerprinting protocol. In [33], the authors propose an asymmetric Tardos code construction based on oblivious transfer. In [34], the author proposes an asymmetric binary fingerprinting code based on Boneh-Shaw codes. However, both solutions are based on a server-side distribution framework. Some authors also investigated the use of Tardos codes in a client-side distribution framework [35], but without addressing asymmetric protocols.

In this paper, we propose two strategies for providing collusion resistance in the protocol of [32]. The first strategy is based on generating independent coding matrices for the fingerprint of different users, and is conceptually similar to using near orthogonal independent Gaussian fingerprints. The second strategy consists in generating the fingerprint of each user according to a Tardos code, exploiting the fact that such codes can be securely distributed using the protocol proposed in [33]. Since Tardos codes are much longer than the random fingerprints used in [32], an efficient encoding obtained through a random partially circulant matrix is proposed, which is conceptually similar to dimensionality reduction techniques based on the Johnson-Lindenstrauss lemma [36] and applied in compressed sensing [37]. For both solutions, we develop the corresponding accusation strategies and we analytically derive their anticollusion performance under the averaging attack. Experimental results are finally presented to support the proposed analysis.

## 2 Background

In this section, we briefly review the client-side embedding technique proposed in [38] and the asymmetric fingerprinting protocol proposed in [32], which builds on the above technique.

### 2.1 LUT-based client-side embedding

**E**of size

*T*and a set of watermarking LUTs

**W**

_{ k }of the same size,

*k*=0,…,

*N*

_{ U }−1, each associated to one of the

*N*

_{ U }clients. The entries of

**E**are usually i.i.d. random variables following a Gaussian distribution with variance \({\sigma _{E}^{2}}\), while the entries of each

**W**

_{ k }are i.i.d. random variables following a Gaussian distribution with variance \({\sigma _{W}^{2}}\). Different LUTs are assumed to be independent. For the

*k*th client, the distribution server generates a personalized decryption LUT

**D**

_{ k }by combining componentwise the master encryption LUT

**E**and a watermark LUT

**W**

_{ k }as

for *t*=0,1,…,*T*−1. The personalized decryption LUTs are then transmitted once to each client over a secure channel. Note that the generation of the LUTs is carried out just once at the setup phase.

*M*×

*R*values

*t*

_{ ih }in the range [0,

*T*−1], where 0≤

*i*≤

*M*−1, 0≤

*h*≤

*R*−1, is pseudo-randomly generated according to a content dependent key

*sek*. Each of the

*M*content features

*x*

_{ i }is encrypted by adding

*R*entries of the encryption LUT identified by the indexes (

*t*

_{ i0},…,

*t*

_{ i(R−1)}), obtaining the encrypted feature

*c*

_{ i }as follows

*t*

_{ ih }according to the content dependent key

*sek*and by adding

*R*entries of the decryption LUT

**D**

_{ k }to each encrypted feature

*c*

_{ i }as

where the *i*th watermark component is given as the sum of *R* entries of the LUT **W**
_{
k
}. The result of this operation is the watermarked content **y**
_{
k
}=**x**+**w**
_{
k
} identifying the *k*th user.

### 2.2 Asymmetric fingerprinting

A typical asymmetric fingerprinting protocol [3] is composed of a registration phase, in which the client proves his/her identity and commits to a secret that only he/she knows, and a watermarking protocol, jointly performed by the client and the distribution server, after which only the client receives a copy of the watermarked content containing his/her secret. If the copy is illegally distributed, the server can identify the guilty client by extracting his/her secret from the watermark and prove to a Judge that it is indeed the client’s secret by using a proper dispute resolution protocol based on the client’s commitment.

Let us assume that each client has a public/private key pair (*p*
*u*
*k*,*p*
*r*
*k*). The schemes proposed in [5, 6] assume that the *k*th client produces as input to the watermarking protocol a random string of *L* bits denoted as **b**
_{
k
}, representing his/her secret. This *L*-bit fingerprint is encrypted with the client’s public key using an additively homomorphic cryptosystem and sent to the server, together with a proper commitment linking the identity of the client to the encrypted fingerprint. Thanks to the homomorphic properties of the encryption, the server is able to compute a watermarked copy of the content containing the client’s fingerprint directly in the encrypted domain, so that the client’s secret is never disclosed. However, if after an illegal redistribution a watermarked copy is found in the clear, the client’s secret can be obtained through the watermark decoder and used in an accusation protocol.

*k*th client is first randomized by the server using a secret

*L*-bit sequence

**r**

_{ k }. Then, the resulting string is encoded using a binary antipodal modulation, yielding the to be transmitted message

**m**

_{ k }. Formally, each symbol of the message is computed as

*m*

_{ k,l }=

*σ*

_{ W }(2(

*b*

_{ k,l }⊕

*r*

_{ k,l })−1), 0≤

*l*≤

*L*−1. Then, the message

**m**

_{ k }is projected according to a

*T*×

*L*random matrix \(\mathbb {G}\), yielding the following watermarking LUT

**D**

_{ k }is finally obtained by combining the encryption LUT and the above watermarking LUT as

**D**

_{ k }can be directly computed in the encrypted domain using the encryption of

**b**

_{ k }, obtained through an additively homomorphic cryptosystem, and properly rescaled and quantized versions of

**E**and \(\mathbb {G}\) [32]. A high-level block diagram of the above protocol is shown in Fig. 1.

**D**

_{ k }can be expressed by adding to the encrypted signal the product of the decryption LUT

**D**and a proper binary matrix \(\mathbb {T}\) defined according to the sequence of indexes

*t*

_{ ih }, i.e.,

*M*×

*T*binary matrix defined as

that is, the actual watermark is equal to the message **m**
_{
k
} projected by the *M*×*L* random matrix \(\tilde {\mathbb {G}} = \mathbb {T}\mathbb {G}\).

Since the scheme is asymmetric, the decoder does not know the messages **m**
_{
k
} and a correlation detector, as that proposed in [8], is not applicable here. Hence, the authors in [32] propose to obtain an estimated fingerprint \(\hat {\mathbf {b}}_{k}\) and to verify whether it matches with a recorded client, using the accusation protocols provided by the underlying asymmetric fingerprinting protocol.

**x**is available at the decoder, which is a common hypothesis in fingerprint applications, a simple decoder can be obtained using the

*Matched Filter (MF)*principle as

where sgn{} denotes the sign function.

## 3 Anticollusion solutions

As pointed out in [32], the basic protocol described in the previous section is potentially vulnerable to collusion attacks, where several clients combine their received watermarked copies in order to obtain a pirated copy in which their fingerprints are much harder to be detected. In the following, we introduce two possible solutions to cope with this problem. The first solution is based on the use of random fingerprints modulated by a different projection matrix for each client. The second solution consists in generating each client’s fingerprint according to a Tardos anticollusion code.

### 3.1 Collusion model

*N*

_{ U }differently watermarked copies of the same content

**x**to

*N*

_{ U }clients. Among those clients, a coalition \(\mathcal {C}\) of \(C = |\mathcal {C}|\) clients cooperates in order to obtain a pirated copy

**y**

_{ C }according to some, possibly randomized, collusion strategy. In the following, we assume that the

*C*colluders adopt an average collusion strategy, i.e., the pirated copy is obtained as

where **y**
_{
k
} denotes the watermarked copy of the *k*th client and **n** is additive Gaussian noise with zero mean, independent of the watermarked copies. In multimedia fingerprinting, for a given bound on the distortion of **y**
_{
C
} with respect to the watermarked copies, such a strategy is the most efficient from the attacker’s point of view [14]. In the client-side setting, a coalition of attackers has the option of directly combining the respective LUTs, instead of the content. As to the averaging attack, in the absence of noise, the two strategies are equivalent. In the presence of noise, attacking the LUT may achieve a lower distortion of the content for the same variance of the additive noise.

**y**

_{ C }, we assume that a detector produces an accusation set \(\mathcal {A}\) and accuses the

*k*th client if \(k \in \mathcal {A}\). The performance of the detector is measured by the probability of accusing at least one colluder, referred to as

*probability of detection*, and by the probability of accusing an innocent user, referred to as

*probability of false alarm*. Formally, the probability of detection is given by

If we assume that the probability of accusing a specific innocent client is *ε*, from the union bound we have that *P*
_{
fa
}≤(*N*
_{
U
}−*C*)*ε*<*N*
_{
U
}
*ε*. A fingerprinting system is said to be secure against a coalition of *C* colluders for a probability of false alarm *η* if the corresponding probabilities of detection and false alarm satisfy *P*
_{
d
}≈1 and *P*
_{
fa
}≤*η*. In practice, *P*
_{
d
}≥0.9 is usually sufficient to deter collusion.

### 3.2 Random LUTs

Since the early days of watermarking, it was argued that additive Gaussian distributed watermarks may provide a certain level of resistance to colluders [12]. In [13], it was shown that a system with *N*
_{
U
} users employing Gaussian watermarks of length *M*, i.e., where the fingerprints are represented by vectors of *M* i.i.d. zero-mean Gaussian variables, is secure against a coalition of *C* colluders, where \(C = O\left (\sqrt {M/\log N_{U}}\right)\), which is similar to the asymptotic behavior of Tardos codes [17]. If we assume that the random projection matrix \(\mathbb {G}\) is composed of i.i.d. Gaussian variables, then the watermarking technique described by Eqs. (5) and (6) is equivalent to adding to the content *L* almost independent Gaussian watermarks of length *M*, each modulated by a fingerprint bit. Hence, by assigning to different clients different and independent projection matrices, we expect that the system will benefit from the inherent anticollusion properties of Gaussian watermarks.

*L*-bit fingerprint, where each bit is independently drawn with probability 0.5. The fingerprint of the

*k*th client is then randomized using a server generated bit sequence

**r**

_{ k }and encoded using binary antipodal modulation in a

*L*-symbol vector

**m**

_{ k }and the server distributes to each client a personalized decryption LUT obtained as

*k*th client. When the server receives a suspect copy

**y**

_{ C }, an estimate of the fingerprint is computed for each client according to

*k*th client. Namely, for each client the detector computes the Hamming distance between the estimated and the recorded fingerprint, producing the accusation score

*θ*

_{ R }must be set in order to satisfy a suitable bound on the probability of accusing innocent clients. In order to guarantee

*P*

_{ fa }≤

*η*, a conservative choice is to impose

*P*

*r*{

*Z*

_{ k }<

*θ*

_{ R }}≤

*η*/

*N*

_{ U }for each \(k \notin \mathcal {C}\). When \(k \notin \mathcal {C}\), we have that \(\tilde {\mathbb {G}}_{k}\) and

**y**

_{ C }are independent, and the bits \(\hat {b}_{k,l}\) can be modeled as Bernoulli variables with probability 0.5 independent of the fingerprint bits

*b*

_{ k,l }. As a consequence, the accusation score of an innocent user is distributed according to a binomial distribution

*B*(

*L*,0.5) and the probability \(Pr\{Z_{k} < \theta _{R} | k \notin \mathcal {C}\}\) can be upper bounded using Hoeffding’s inequality as

*L*, a less conservative threshold can be obtained by approximating

*Z*

_{ k }as a Gaussian distributed variable with mean

*L*/2 and variance

*L*/4, which results in the threshold

where *Φ*
^{−1} denotes the inverse cumulative distribution function of a standard normal variable.

### 3.3 Tardos codes

An alternative solution for providing resistance to collusion in the protocol described in the previous section is to construct the clients’ fingerprints **b**
_{
k
} according to some anticollusion code. Tardos in [17] proposed a probabilistic code construction which is asymptotically optimal under the restricted digit model, i.e., under the assumption that the attackers can only use one of the symbols that they have received at each position to produce the colluded copy. The construction of the code is as follows. Given a fingerprint length *L* and a maximum coalition size *C*, the server generates the random numbers *p*
_{
l
}∈[*δ*,1−*δ*], *l*=0,…,*L*−1 according to the probability density function \(f(p) = \left (\left (\pi - 4\arcsin \sqrt {\delta }\right)\sqrt {p(1-p)}\right)^{-1}\), where the parameter *δ* satisfies *δ*
*C*≪1 (Tardos suggests *δ*=1/(300*C*)). Then, for each client, the fingerprint bits *b*
_{
k,l
}, *k*=0,…,*N*
_{
U
}−1, are randomly generated as Bernoulli random variables with *P*
*r*{*b*
_{
k,l
}=1}=*p*
_{
l
}.

In the proposed scheme, we assume that Tardos codes can be securely distributed according to an asymmetric protocol and embedded at the Server’s side using a homomorphic cryptosystem, for example using the solutions proposed in [33]. Following the approach in [32], we assume that a similar technique can be used for securely embedding a Tardos code in a LUT. However, a strategy allowing to embed a sufficiently long Tardos code in a watermarking LUT, and next to detect a possibly pirated codeword usable as input to the accusation protocol, has to be designed.

In [19, 30], the authors show that the optimal length of a Tardos code that is secure against a coalition of *C* colluders, when using the symmetric accusation score, is \(L \approx \frac {1}{2}\pi ^{2} C^{2}\log (N_{U}/\eta)\), which means that practical code lengths are of the order of *L*≈10^{6}. As a consequence, the computation of the watermarking LUT in (5) may require a very large projection matrix \(\mathbb {G}\). Moreover, in some cases, we may have *L*>*M*, which implies that a zero forcing detection approach as that proposed in [32] is not possible, since \(\tilde {\mathbb {G}}^{T}\tilde {\mathbb {G}}\) becomes a singular matrix.

*L*, we propose thus to generate \(\mathbb {G}\) according to a partial random circulant matrix. A random circulant matrix is based on a random vector that is circularly shifted to generate every row. Moreover, circulant matrices can be diagonalized using a discrete Fourier transform (DFT) as

*L*>

*M*, the watermarking LUT can be computed as

*M*×

*L*matrix \(\mathbb {P}\) selects the first

*M*entries of a vector of length

*L*, whereas for

*L*≤

*M*the watermarking LUT can be computed as

where the *M*×*L* matrix \(\mathbb {P}^{T}\) pads a *L*-length vector with *M*−*L* zeros. We note that the above expressions can be also implemented with an encrypted version of **m**
_{
k
} by using an encrypted domain FFT algorithm like the one proposed in [39].

**y**

_{ C }as

*L*>

*M*, or as

for *L*<*M*. In the above expression, we note that computing the matrix product \(\mathbb {T}^{T}(\mathbf {y}_{C} - \mathbf {x})\) is actually equivalent to constructing a suspect LUT from the sequence of watermark estimates as explained in [8, Sec. IV.B] and does not require an actual matrix multiplication. Once we have an estimate of the suspect fingerprint, different strategies exist in order to detect possible colluders.

*g*

_{0}(

*p*

_{ l })=−1/

*g*

_{1}(

*p*

_{ l }), and declares as guilty the clients belonging to the following set

*P*

_{ fa }≤

*η*the threshold must be chosen as

*L*[19], which results in the threshold

*p*

_{ e }. Using the results in [29] for the combined digit model, the optimal score in the case of the interleaving attack can be expressed as

*h*is defined as

*γ*

_{ e }=

*p*

_{ e }/(1−2

*p*

_{ e }). The detector declares as guilty the clients belonging to the set

*θ*

_{ T2}should be computed taking into account the distribution of the accusation sum for innocent clients. It is easy to verify \(E[Z_{k}|k\notin \mathcal {C}] = 0\), whereas

Differently from the symmetric accusation score in (25), the threshold for the optimized accusation score depends on the number of colluders, their attack strategy, and the noise level at the receiver. Since these quantities are usually unknown to the detector, a practical detector should estimate \(\sigma ^{2}_{T2,I}\) from the received fingerprint in order to compute the right threshold.

### 3.4 Practical issues

In the secure implementation of the asymmetric fingerprinting protocol of [32], the server does not have the plaintext values of the fingerprints **b**
_{
k
} provided by each *k*th client. Hence, after decoding either the suspect fingerprint of each *k*th client with (14) or decoding the global suspect fingerprint with (23) or (24), a secure protocol must be invoked to compare such fingerprint estimates with the actual fingerprints of each user.

As to the random LUT solution, the accusation set defined in (16) can be securely computed by using secure Hamming distance protocols, like those described in [41]. As to the Tardos solution, after the fingerprint bits are demodulated using (9), the scores in (25) can be computed by using the accusation procedure described in [33], which requires a trusted Judge to be revealed both the random bias values *p*
_{
l
}, which are a secret input of the server, and the fingerprints bits *b*
_{
k,l
}, which are a secret input of the *k*th client. In the first case, the computation of the accusation sets requires the execution of onerous cryptographic protocols, while in the second case, a trusted third party is required.

However, when the number of active clients is very large, e.g., millions of users, performing the secure computation of the accusation sets may become prohibitively expensive. A possible solution to this issue is to embed a server specific watermark along with the client secret watermark and using this first watermark to compute a set of suspect clients, as suggested in [33]. Since this watermark is known to the server, it can not be used in a dispute resolution protocol as a proof that a given client is guilty. However, once the server has a list of suspect clients, the secure accusation protocols can be run only on this reduced subset. Typically, the list of suspect clients will only contain few tens of entries, so that the previously described secure protocols become feasible.

**D**

_{ k }a server specific watermarking LUT

**S**

**W**

_{ k }, i.e., the actual decryption LUT can be redefined as

By using the above decryption LUT, each client will embed along with his/her secret fingerprint a server-specific spread spectrum watermark [8], which is known to have good anticollusion properties. Moreover, the above operation can be performed in the encrypted domain by relying on a homomorphic cryptosystem, hence the secure distribution of decryption LUTs is not violated. A possible drawback is that the two watermarks will interfere each other. However, this will likely affect only the performance of the server specific watermarks, since the watermarks of the suspect clients can be subtracted from the content under investigation before computing the final accusation sets.

## 4 Performance analysis

In this section, we will provide analytical expressions for the probability of detection of the proposed anticollusion schemes as a function of the number of colluders and the probability of false alarm. The analysis is based on the simplifying assumption that the final projection matrix \(\tilde {\mathbb {G}}\) is made of i.i.d. Gaussian entries and relies heavily on the central limit theorem, so it is valid only for large *M* and *L*.

### 4.1 Random LUTs

*C*colluders as

*l*th column of \(\tilde {\mathbb {G}}_{c}\) as ϕ

_{ c,l }. We have

*l*th detected symbol for the

*k*th client is given by \(\hat {m}_{k,l} = \boldsymbol {\phi }_{k,l}^{T}\mathbf {w}\). Now, let us assume that the vectors ϕ

_{ c,l }are mutually independent and Gaussian distributed, and that they are normalized so that the variance of each component is equal to 1/(

*M*

*L*). By invoking the central limit theorem, we can model \(\hat {m}_{k,l}\), both when \(k \in \mathcal {C}\) and \(k \notin \mathcal {C}\), as a Gaussian distributed variable. Moreover, it is easy to derive the following statistics

**n**. Hence, we can express the probability of reading the wrong bit for a colluder as

*B*(

*L*,

*p*

_{ e }) distribution. Using the Gaussian approximation of the binomial distribution, the probability of detecting a specific colluder for a threshold

*θ*

_{ R }can be expressed as

### 4.2 Tardos codes

*θ*

_{ T }can be expressed as

whereas, assuming independence for different colluders, the probability of detecting at least a colluder can be estimated as in (44).

*Z*

_{ k }, computed over

*p*, \(\hat {\mathbf {b}}\),

**b**

_{ k }, can be expressed as

*p*

_{ l }are i.i.d., this can be simplified as \(E[\!Z_{k}] = L\cdot E_{p}\left [E_{\hat {b},b_{k}}\left [(2\hat {b} - 1)g_{b_{k}}(p)\right ]\right ]\), where

*Z*

_{ k }can be similarly expressed as

since we have \(E\left [g_{b_{k,l}}^{2}(p_{l})\right ] = p_{l} {g_{1}^{2}}(p_{l}) + (1-p_{l}){g_{0}^{2}}(p_{l}) = 1\).

*Z*

_{ k }can be computed as \(\mu _{T} = L\cdot E_{p}\left [E_{\hat {b},b_{k}}\left [h_{b_{k},\hat {b}}(p)\right ]\right ]\), where

Similarly, the variance can be computed as \(\sigma _{T} = L\cdot E_{p}\left [E_{\hat {b},b_{k}}\left [h^{2}_{b_{k},\hat {b}}(p)\right ]\right ] - {\mu _{T}^{2}}\).

*μ*

_{ T }and

*σ*

_{ T }for the different accusation scores, the collusion channel \(P_{\hat {b}|b_{k}}(\cdot |\cdot)\) needs to be characterized. In the proposed scheme, Tardos codes are used over a noisy channel, and the random modulation of the code by \(\tilde {\mathbb {G}}\) is not perfectly orthogonal, resulting in an additional interference over each code bit. The above model can be analyzed considering a collusion attack in the combined digit model [25]. According to the average collusion model in (10), for each fingerprint position the colluders output a symbol proportional to the number of zeros and ones that they see in their copies. If we express the

*l*th column of \(\tilde {\mathbb {G}}\) as ϕ

_{ l }, then we have

*l*th symbol computed by the MF detector is given by \(\hat {m}_{l} = {\boldsymbol {\phi }_{l}^{T}}\mathbf {w}\). If we assume that in the

*l*th position the colluders see

*t*

_{ l }ones, then for each

*l*we have

*t*

_{ l }ones can be expressed as

*b*

_{ k }can be expressed as

since the probability of \(\hat {b}_{l} = 0\) conditional to observing *t*
_{
l
} zeros is equal to the probability of \(\hat {b}_{l} = 1\) conditional to observing *t*
_{
l
} ones, due to the symmetry of the channel.

*δ*=0 in the definition of the pdf of

*p*. In the RDM, we have

*p*

_{ e }=0. For the variance, we obtain

In the case of the accusation score in (29), *μ*
_{
T
} and \({\sigma _{T}^{2}}\) cannot be expressed in closed form and the expectation over *p* has to be numerically evaluated.

## 5 Experimental results

For the experimental validation of the proposed solutions, we have simulated a system performing client-side embedding on digital images. A dataset of 20 grayscale uncompressed 8 bit images, each having resolution 1024×1024 pixels and representing different subjects, has been used. For each image, a vector **x** of 2^{16} components has been obtained by applying a 8×8 discrete cosine transform (DCT) to the image and taking 4 DCT coefficients for each 8×8 block, corresponding to the coefficients between the 7th and 10th positions according to the zig-zag ordering used by JPEG standard.

Encryption has been obtained by adding the elements of the encryption LUT **E** to the selected DCT coefficients and reconstructing the images through an inverse block DCT. As in [32], pixel values have been mapped to 9 bits using rounding and modulo 512 operations, so as to avoid wrap around errors after watermarking. In all experiments, the encryption LUT power has been set to \({\sigma _{E}^{2}} = 10^{6}\), the LUT size has been set to *T*=2^{16}, and *R*=4 LUT entries are added together to encrypt each element. The effect of a fixed point representation of both **E** and **D** has been assumed negligible.

For each image, 100 independent tests were performed by randomly generating different encryption LUTs, different fingerprints, and different projection matrices \(\mathbb {G}\). In each test, we simulated joint decryption and watermarking performed by *C* different clients, where *C* ranges from 1 to 100, followed by the averaging attack described in (10). In the attack, the variance of the noise has been set so as to satisfy a prescribed watermark to noise ratio (WNR), defined as \(\text {WNR} = 10\log _{10}\frac {R{\sigma _{W}^{2}}}{{\sigma _{N}^{2}}}\). The performance has been evaluated by measuring the detection rate (DR), defined as the number of tests in which we accuse at least a guilty client over the total number of tests, for each different value of *C*. Ideally, DR should tend to the theoretical probability of detection of the system.

### 5.1 Random LUTs

For the random LUT solution, we set the fingerprint length to *L*=128 and we generated the entries of the projection matrices as i.i.d. zero mean variables, according to either a Gaussian or a Bernoulli distribution with values in 1,−1. Projection matrices have been rescaled so that the power of the watermarking LUT satisfies \({\sigma _{W}^{2}} = 1\). The threshold used by the detector has been computed according to the Gaussian approximation in (19), assuming a maximum number of users *N*
_{
U
}=10^{6} and different target values for the probability of false alarm.

*P*

_{ fa }=10

^{−6}and

*P*

_{ fa }=10

^{−9}, assuming that the attack satisfies WNR=0 dB. The theoretical probability of detection for each collusion set size

*C*has been computed according to (44) and (43) and compared to the measured detection rate obtained using either Gaussian or Bernoulli projection matrices. As can be seen, Gaussian and Bernoulli matrices permit to obtain very similar detection performances. Moreover, the anticollusion performance of the system can be accurately predicted by using the provided theoretical expressions. For the chosen parameters, i.e.,

*M*=2

^{16}and

*L*=128, and WNR=0 dB, the proposed system is able to withstand a coalition of about 32 colluders at

*P*

_{ fa }=10

^{−6}and 25 colluders at

*P*

_{ fa }=10

^{−9}. These results are similar to the performance of random orthogonal fingerprinting for similar parameters, as reported in [14].

### 5.2 Tardos codes

In the case of Tardos solutions, we similarly generated the entries of the first column of the projection matrices as i.i.d. zero mean variables, according to either a Gaussian or a Bernoulli distribution with values in 1,−1. Projection matrices have been rescaled so that the power of the watermarking LUT satisfies \({\sigma _{W}^{2}} = 1\).

For the Tardos solution using the accusation score in (25), simply referred to as Tardos, we set the fingerprint length to *L*=2^{20} and the threshold used by the detector has been computed according to the Gaussian approximation in (28)

For the Tardos solution using the accusation score in (29), referred to as Tardos with interleaving defence (ILD), we set the fingerprint length to *L*=2^{15} and the threshold has been computed according to the Gaussian approximation in (36). The variance *σ*
_{
T2,I
} has been numerically estimated by evaluating (35) through Monte Carlo integration, using \(P_{\hat {b}_{l}}(1) = \sum _{t=0}^{C}\binom {C}{t}p^{t}(1-p)^{C-t}P_{\hat {b}|t}(1|t)\) and \(P_{\hat {b}_{l}}(0) = 1 - P_{\hat {b}_{l}}(1)\).

In both cases, we assumed a maximum number of users *N*
_{
U
}=10^{6} and different target values for the probability of false alarm.

*P*

_{ fa }=10

^{−6}and

*P*

_{ fa }=10

^{−9}, assuming that the attack satisfies WNR=0 dB. The theoretical probability of detection for each collusion set size

*C*has been computed according to (44) and (45) and compared to the measured detection rate obtained using either Gaussian or Bernoulli projection matrices. Also in this case, Gaussian and Bernoulli matrices permit to obtain very similar detection performances and the anticollusion performance of the system can be accurately predicted by using the provided theoretical expressions. At WNR=0 dB, the Tardos solution with

*M*=2

^{16}and

*L*=2

^{20}is able to withstand a coalition of about 15 0 at

*P*

_{ fa }=10

^{−6}and 13 colluders at

*P*

_{ fa }=10

^{−9}, whereas the Tardos ILD solution with

*M*=2

^{16}and

*L*=2

^{15}is able to withstand a coalition of about 16 colluders at

*P*

_{ fa }=10

^{−6}and 14 colluders at

*P*

_{ fa }=10

^{−9}. It is worth noting that the results of the first solution are sensibly worse than the expected performance of Tardos codes on a noiseless channel, which for

*L*=2

^{20}should be about 87 colluders at

*P*

_{ fa }=10

^{−6}and

*N*

_{ U }=10

^{6}[19]. In this case, the performance of the code is severely limited by the fact that only

*M*=2

^{16}positions are available to embed the code, which introduces a lot of interference among code bits. The results of ILD solution are in line with the expected performance of Tardos codes for

*L*=2

^{15}, which is about 15 colluders at

*P*

_{ fa }=10

^{−6}and

*N*

_{ U }=10

^{6}.

### 5.3 Comparisons

The results of the previous sections show that, for the proposed choice of parameters, the random LUT solution offers better collusion resistance with respect to the embedding of Tardos codes. In this section, we compare the performance of the different solutions for different choices of parameters and under different WNRs of the attack. The comparison is made using the theoretical probability of detection obtained in the previous performance analysis. In the case of the Tardos ILD solution, the quantities *μ*
_{
T
} and \({\sigma _{T}^{2}}\) have been evaluated through Monte Carlo integration over 10^{5} independent realizations of *p*. For each choice of parameters, we use as performance metric the largest size of the coalition for which the detector can guarantee *P*
_{
d
}>0.9.

*M*. The other parameters are set as in the previous sections, namely WNR=0 dB,

*L*=128 for random LUTs,

*L*=2

^{20}for Tardos codes. For Tardos codes ILD, we tested both

*L*=2

^{15}and

*L*=2

^{20}. As to random LUTs, the size of the coalition against which a

*M*-length watermark is secure grows approximately as \(C = O(\sqrt {M})\), which is consistent with previous results on random Gaussian fingerprints [13]. For Tardos codes, the behavior depends on the relationship between

*M*and

*L*. For

*M*<

*L*, we have the same behavior as random LUTs, even though the size of the coalition against which the system is secure is approximately halved. For

*M*>

*L*, the performance depends on the length of the inner Tardos code and is independent of

*M*when

*M*≫

*L*. It is worth noting that the ILD solution performs sensibly better than the baseline Tardos solution for the same length of the inner code.

*L*, setting the other parameters as in the previous section, namely

*M*=2

^{16}and WNR=0 dB. For random LUT and Tardos solutions, choosing a larger

*L*usually increases the anticollusion performance, even though there is a performance threshold for large

*L*. For the Tardos ILD solution, there is an optimal

*L*value, indicating that this solution is much more affected by possible interfence among code bits. For random LUTs, setting

*L*≈10

^{3}is sufficient to guarantee the best performance, whereas for Tardos codes, in order to achieve the best attainable performance,

*L*≫

*M*should be used. For the ILD, the best solution seems to set

*L*≈

*M*.

*M*=2

^{16},

*L*=128 for random LUTs,

*L*=2

^{20}and

*L*=2

^{15}for Tardos codes,

*L*=2

^{15}for Tardos codes ILD. At low WNRs, both systems show a similar behavior: the maximum coalition size grows approximately as

*C*=

*O*(

*σ*

_{ W }/

*σ*

_{ N }) and we observe a fixed ratio (approximately equal to two) between the maximum coalition size of random LUTs and Tardos codes. For higher WNRs, the maximum coalition size depends mainly on the other parameters, resulting in a maximum coalition size limit which is independent of the WNR. Interestingly, at high WNRs, random LUTs can withstand a much larger coalition than Tardos codes.

### 5.4 Robustness to JPEG compression

*L*=128 for random LUTs,

*L*=2

^{20}for Tardos codes,

*L*=2

^{15}for Tardos codes ILD. In Fig. 8, we show the detection rate of random LUT and Tardos solutions, respectively, for

*P*

_{ fa }=10

^{−6}and a JPEG compression with quality 80. Both techniques are affected by JPEG compression and achieve worse results than in the presence of additive Gaussian noise. It can be observed that JPEG compression at a quality 80 is roughly equivalent to additive noise with WNR=−5 dB. This suggests that an embedding space larger than

*M*=2

^{16}should be used to increase the robustness in a realistic scenario including compression after the collusion attack.

## 6 Conclusions

Two different anticollusion solutions, specifically tailored for a recently proposed asymmetric fingerprinting protocol based on client-side embedding, have been proposed in this paper. In the first solution, the server randomly encodes a client-owned fingerprint using a different projection matrix for each client. In the second solution, a Tardos code is securely assigned to each client, for example using the protocol in [33], and modulated using a specially designed projection matrix. The performance of both solutions has been analytically derived assuming a correlation-based accusation strategy and an averaging attack by the colluders. For the Tardos solution, two different accusation strategies have been considered, namely, the symmetric accusation score proposed by Škorić et al. and an accusation score optimized for the interleaving attack under the combined digit model. The analytical probability of detecting at least a colluder has been compared to the experimental detection rates, under different number of colluders and different false alarm probability, confirming that the analytical model is indeed very accurate. According to our results, the solution based on independent random projections outperforms the solutions based on binary Tardos codes, under different choices of parameters. Namely, the first solution has a constant performance gain for low watermark-to-noise ratios and appears much more convenient for high watermark-to-noise ratios. The decoder optimized for the interleaving attack is less robust to noisy channels, which is consistent with the fact that its optimality has been demonstrated only for the restricted digit model. Moreover, the solution based on independent random projections requires to manage a much shorter client fingerprint with respect to binary Tardos codes. The same trend has been confirmed by experimental results including a JPEG compression after the averaging attack.

It is worth noting that the proposed schemes have been designed for a scenario that introduces some constraints on the available solutions. Namely, we assume that each client is identified by a binary code, since we can rely on efficient protocols for securely distributing, in an asymmetric way, an encrypted version of such binary codes [5, 33]. By relaxing these constraints, it would be possible to exploit more powerful non-binary anticollusion codes in the Tardos solutions [19, 21, 26, 27]. Extending the current protocols in order to work with non-binary fingerprints is expected to improve the results and is left for future work.

## Declarations

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## Authors’ Affiliations

## References

- M Barni, F Bartolini, Watermarking Systems Engineering: Enabling Digital Assets Security and Other Applications (CRC Press, Boca Raton, FL, USA, 2004).Google Scholar
- T Bianchi, A Piva, Secure watermarking for multimedia content protection: A review of its benefits and open issues. Signal Process. Mag. IEEE. 30(2), 87–96 (2013). doi:10.1109/MSP.2012.2228342.View ArticleGoogle Scholar
- B Pfitzmann, M Schunter, in
*Adv. in Cryptology - EUROCRYPT’96. LNCS 1070*. Asymmetric fingerprinting (Springer BerlinHeidelberg, 1996), pp. 84–95.Google Scholar - N Memon, P Wong, A buyer-seller watermarking protocol. IEEE Trans. Image Process. 10(4), 643–649 (2001).View ArticleMATHGoogle Scholar
- M Deng, T Bianchi, A Piva, B Preneel, in
*Proceedings of the 11th ACM Workshop on Multimedia and Security*. An efficient buyer-seller watermarking protocol based on composite signal representation (ACM, New York, NY, USAPrinceton, New Jersey, USA, 2009), pp. 9–18.View ArticleGoogle Scholar - A Rial, M Deng, T Bianchi, A Piva, B Preneel, A provably secure anonymous buyer-seller watermarking protocol. IEEE Trans. Inform. Forensics Secur. 5(4), 920–931 (2010).View ArticleGoogle Scholar
- RJ Anderson, C Manifavas, in
*Proceedings of the 4th International Workshop on Fast Software Encryption — FSE’97*. Chameleon—a new kind of stream cipher (Springer,London, UK, 1997), pp. 107–113.View ArticleGoogle Scholar - M Celik, A Lemma, S Katzenbeisser, M van der Veen, Look-up table based secure client-side embedding for spread-spectrum watermarks. IEEE Trans. Inform. Forensics Secur. 3(3), 475–487 (2008).View ArticleGoogle Scholar
- A Piva, T Bianchi, A De Rosa, Secure client-side ST-DM watermark embedding. IEEE Trans. Inform. Forensics Secur. 5(1), 13–26 (2010).View ArticleGoogle Scholar
- C-Y Lin, P Prangjarote, L-W Kang, W-L Huang, T-H Chen, Joint fingerprinting and decryption with noise-resistant for vector quantization images. Signal Process. 92(9), 2159–2171 (2012).View ArticleGoogle Scholar
- D Jarnikov, JM Doumen, in
*Lecture Notes in Computer Science*, 6933, ed. by W Jonker, M Petkovic. Secure Data Management (SpringerBerlin, Heidelberg, 2011), pp. 101–113.Google Scholar - IJ Cox, J Kilian, FT Leighton, T Shamoon, Secure spread spectrum watermarking for multimedia. IEEE Trans. Image Process. 6(12), 1673–1687 (1997). doi:10.1109/83.650120.View ArticleGoogle Scholar
- J Kilian, FT Leighton, LR Matheson, TG Shamoon, RE Tarjan, F Zane, in
*Proc. of 1998 IEEE International Symposium on Information Theory*. Resistance of digital watermarks to collusive attacks, (1998), p. 271. doi:10.1109/ISIT.1998.708876. - ZJ Wang, M Wu, HV Zhao, W Trappe, KJR Liu, Anti-collusion forensics of multimedia fingerprinting using orthogonal modulation. IEEE Trans. Signal Process. 14(6), 804–821 (2005).Google Scholar
- P Moulin, N Kiyavash, in
*2007 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*, 2. Performance of random fingerprinting codes under arbitrary nonlinear attacks, (2007), pp. 157–160. doi:10.1109/ICASSP.2007.366196. - D Boneh, J Shaw, Collusion-secure fingerprinting for digital data. IEEE Trans. Inf. Theory. 44(5), 1897–1905 (1998).MathSciNetView ArticleMATHGoogle Scholar
- G Tardos, in
*Proceedings of the Thirty-fifth Annual ACM Symposium on Theory of Computing. STOC ’03*. Optimal probabilistic fingerprint codes (ACM,New York, NY, USA, 2003), pp. 116–125.View ArticleGoogle Scholar - B Škorić, TU Vladimirova, M Celik, JC Talstra, Tardos fingerprinting is better than we thought. IEEE Trans. Inf. Theory. 54(8), 3663–3676 (2008). doi:10.1109/TIT.2008.926307.View ArticleMathSciNetMATHGoogle Scholar
- B Škorić, S Katzenbeisser, MU Celik, Symmetric Tardos fingerprinting codes for arbitrary alphabet sizes. Des. Codes Crypt. 46(2), 137–166 (2008).View ArticleMathSciNetMATHGoogle Scholar
- T Furon, A Guyader, F Cérou, in
*Information Hiding. Lecture Notes in Computer Science*, 5284, ed. by K Solanki, K Sullivan, and U Madhow. On the design and optimization of Tardos probabilistic fingerprinting codes (Springer,Berlin Heidelberg, 2008), pp. 341–356.Google Scholar - F Xie, T Furon, C Fontaine, in
*Proceedings of the 10th ACM Workshop on Multimedia and Security. MM&Sec ’08*. On-off keying modulation and Tardos fingerprinting (ACM,New York, NY, USA, 2008), pp. 101–106. doi:10.1145/1411328.1411347.View ArticleGoogle Scholar - K Nuida, S Fujitsu, M Hagiwara, T Kitagawa, H Watanabe, K Ogawa, H Imai, An improvement of discrete Tardos fingerprinting codes. Des. Codes Crypt. 52(3), 339–362 (2009). doi:10.1007/s10623-009-9285-z.MathSciNetView ArticleMATHGoogle Scholar
- E Amiri, G Tardos, in
*Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms. SODA ’09*. High rate fingerprinting codes and the fingerprinting capacity (Society for Industrial and Applied Mathematics,Philadelphia, PA, USA, 2009), pp. 336–345.View ArticleGoogle Scholar - A Charpentier, F Xie, C Fontaine, T Furon, 7254. Expectation maximization decoding of Tardos probabilistic fingerprinting code, (2009), pp. 72540–7254015. doi:10.1117/12.806034.
- B Škorić, S Katzenbeisser, HG Schaathun, MU Celik, Tardos fingerprinting codes in the combined digit model. IEEE Trans. Inf. Forensics Secur. 6(3), 906–919 (2011). doi:10.1109/TIFS.2011.2116783.View ArticleGoogle Scholar
- D Boesten, B S̆koric, in
*Information Hiding. Lecture Notes in Computer Science*, 6958, ed. by T Filler, T Pevný, S Craver, and A Ker. Asymptotic fingerprinting capacity for non-binary alphabets (Springer,Berlin, Heidelberg, 2011), pp. 1–13.Google Scholar - D Boesten, B S̆koric, in
*Information Hiding. Lecture Notes in Computer Science*, 7692, ed. by M Kirchner, D Ghosal. Asymptotic fingerprinting capacity in the combined digit model (Springer,Berlin, Heidelberg, 2013), pp. 255–268.Google Scholar - P Meerwald, T Furon, Toward practical joint decoding of binary Tardos fingerprinting codes. Inf. Forensics Secur. IEEE Trans. 7(4), 1168–1180 (2012). doi:10.1109/TIFS.2012.2195655.View ArticleGoogle Scholar
- J-J Oosterwijk, B Škoric, J Doumen, in
*Proceedings of the First ACM Workshop on Information Hiding and Multimedia Security. IH&MMSec ’13*. Optimal suspicion functions for Tardos traitor tracing schemes (ACM,New York, NY, USA, 2013), pp. 19–28. doi:10.1145/2482513.2482527.View ArticleGoogle Scholar - T Laarhoven, B de Weger, Optimal symmetric Tardos traitor tracing schemes. Des. Codes Crypt. 71(1), 83–103 (2014). doi:10.1007/s10623-012-9718-y.MathSciNetView ArticleMATHGoogle Scholar
- T Furon, M Desoubeaux, in
*2014 IEEE International Workshop on Information Forensics and Security (WIFS)*. Tardos codes for real, (2014), pp. 24–29. doi:10.1109/WIFS.2014.7084298. - T Bianchi, A Piva, TTP-free asymmetric fingerprinting based on client side embedding. IEEE Trans. Inf. Forensics Secur. 9(10), 1557–1568 (2014).View ArticleGoogle Scholar
- A Charpentier, C Fontaine, T Furon, I Cox, in
*Proceedings of the 13th International Conference on Information Hiding. IH’11*. An asymmetric fingerprinting scheme based on Tardos codes (Springer,Berlin, Heidelberg, 2011), pp. 43–58.Google Scholar - S Pehlivanoglu, in
*Proceedings of the First ACM Workshop on Information Hiding and Multimedia Security. IH& MMSec ’13*. An asymmetric fingerprinting code for collusion-resistant buyer-seller watermarking (ACM,New York, NY, USA, 2013), pp. 35–44.View ArticleGoogle Scholar - S Katzenbeisser, B Škorić, M Celik, A-R Sadeghi, in
*Information Hiding. Lecture Notes in Computer Science*, 4567, ed. by T Furon, F Cayre, G Doërr, and P Bas. Combining Tardos fingerprinting codes and fingercasting (Springer,Berlin Heidelberg, 2007), pp. 294–310.Google Scholar - WB Johnson, J Lindenstrauss, Extensions of Lipschitz mappings into a Hilbert space. Contemp. Math. 26, 189–206 (1984).MathSciNetView ArticleMATHGoogle Scholar
- DL Donoho, Compressed sensing. Inf Theory IEEE Trans. 52(4), 1289–1306 (2006).MathSciNetView ArticleMATHGoogle Scholar
- MU Celik, AN Lemma, S Katzenbeisser, M van der Veen, in
*IEEE International Conference on Acoustics, Speech and Signal Processing, 2007. ICASSP 2007.*, 2. Secure embedding of spread spectrum watermarks using look-up-tables, (2007), pp. 153–156. doi:10.1109/ICASSP.2007.366195. - T Bianchi, A Piva, M Barni, On the implementation of the discrete Fourier transform in the encrypted domain. IEEE Trans. Inf. Forensics Secur. 4(1), 86–97 (2009).View ArticleGoogle Scholar
- Y-W Huang, P Moulin, On the saddle-point solution and the large-coalition asymptotics of fingerprinting games. IEEE Trans. Inf. Forensics Secur. 7(1), 160–175 (2012). doi:10.1109/TIFS.2011.2168212.View ArticleGoogle Scholar
- A Jarrous, B Pinkas, in
*Applied Cryptography and Network Security. Lecture Notes in Computer Science*, 5536, ed. by M Abdalla, D Pointcheval, P-A Fouque, and D Vergnaud. Secure hamming distance based computation and its applications (Springer,Berlin Heidelberg, 2009), pp. 107–124.View ArticleGoogle Scholar