Much research has been carried out to improve the robustness of watermarked images, and the use of error correcting codes to protect the signature is the most prominent approach [2–5]. The watermarking problem is analogous to the transmission of a signal over a noisy channel: the image is considered to be the channel, the attacks are considered to be noise signals, and the signature is considered to be the signal to be transmitted in the form of the watermark.
This section investigates the performance of different error correcting schemes employed in a digital color image watermarking application based on the discrete wavelet transform. We worked on improving the robustness of the signature with the help of four families of error correcting codes. These four families respond differently when tested against different attacks on watermarked images, because each attack modifies the watermarked image in a different way and error correcting codes exhibit different properties against different error types (burst errors or random errors). To address this problem we employed repetition codes (presented first below), Hamming codes [47], Bose Chaudhuri Hocquenghem (BCH) codes [48] and Reed-Solomon codes [49].
Different types of error correcting schemes for the watermarking problem have been proposed in the literature. For example, as early as 1998 Wang et al. used Hamming codes [50], Pereira et al. studied BCH codes for watermarking applications [51], and some schemes are hybrids, for example of BCH and repetition codes [4]. Other articles suggest using convolutional codes for watermarking [52, 53]. Finally, some works compare different types of coding schemes, e.g., Reed-Solomon, BCH and repetition codes [3].
What makes our study original is that we describe and compare the effect of different classes of codes against different types of real image attacks, and we include different codes and the list decoding scheme in a complete color watermarking process. With this study, we propose to characterize the errors introduced by different attacks and thus to illustrate the connection between a particular attack and a particular error correcting scheme in the context of our color wavelet algorithm. Since there is a relationship between the contents of an image and the nature of the errors an attack introduces, the results section analyzes the different results with empirical observation and provides intuitive explanations.
We adopted a rigorous testing process in which we tested the robustness of different watermarked images with multiple signatures. We employed standard attacks including color attacks, filtering attacks, noise attacks and image compression attacks. The scheme had already been tested against some of these attacks with the use of repetition codes [5] and proved to be robust against them to a certain extent. We wished to explore the effectiveness of other error correcting codes against these attacks.
The different error correcting codes are tested using the wavelet based color image watermarking scheme presented in Section 2. The robustness obtained with the different families and modes of error correcting codes under various attacks is then presented in Section 4.
Characteristics of the watermarking channel
Due to the requirement of watermark invisibility, watermarks are weakly inserted into the image. This makes the watermark signal prone to errors or attacks, and the watermark channel is very noisy due to the different types of intentional or unintentional attacks. We consider the problem of watermark robustness against different errors or attacks analogous to the transmission of a signal over a noisy channel, where error correcting codes are used to protect the signal from the effects of the channel. The characteristics of the watermarking channel depend upon the type of attacks experienced by the watermarked image. As in signal transmission, error correcting codes are used to protect the signature, embedded in the form of a watermark, so that the effects of the channel are reduced or minimized. The underlying characteristics of an image, e.g., its texture and color information, also determine the effect an attack has on the watermarked image. The watermarking algorithm and the type and mode of error correcting codes also play an important role in defining the combined performance of robustness and invisibility.
The characteristics of the watermarking channel are primarily determined by the different attacks. We consider JPEG compression, additive white Gaussian noise, low pass filtering, and hue, saturation and brightness modifications as the underlying characteristics of the watermarking channel. Each of the error correcting codes presented in the following section exhibits different properties against these attacks.
The watermarking channel is characterized by very high error rates. To correct these errors we use different error correction schemes. Four families of error correcting codes are used in our study to enhance the robustness of the watermark: repetition codes, Hamming codes, BCH codes and Reed-Solomon codes. We explore the use and effectiveness of the concatenation of these different families of error correcting codes to enhance the robustness of the watermarking scheme.
We employ a concatenation model in which two error correcting codes are combined so that they complement one another. The outer code is a second form of repetition coding: the watermark is built up from several repetitions of the signature. The outer code helps to reduce the error rate so that the inner error correcting code (repetition, Hamming, or BCH) can further reduce the errors, allowing a decision on whether the received watermark is valid or not.
Error correcting codes are expressed in the following article in the form (n, k, d), where n is the length of the code, k is the dimension and d is the minimum Hamming distance between any pair of distinct codewords. The Hamming weight H_w(c) of a codeword c is the number of non-zero elements in the vector. The Hamming distance H_d between two codewords is the number of positions in which they differ. The minimum Hamming distance d between any two distinct codewords defines the error correcting capability of the code: an (n, k, d) error correcting code is capable of correcting t errors, where t = ⌊(d − 1) / 2⌋.
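To make these definitions concrete, here is a minimal Python sketch (ours, for illustration only, not part of the original scheme) of the Hamming weight, the Hamming distance and the error correcting capability t:

```python
def hamming_weight(c):
    """H_w(c): number of non-zero elements of the vector c."""
    return sum(1 for x in c if x != 0)

def hamming_distance(c1, c2):
    """H_d(c1, c2): number of positions in which c1 and c2 differ."""
    return sum(1 for a, b in zip(c1, c2) if a != b)

def correctable_errors(d):
    """An (n, k, d) code corrects t = floor((d - 1) / 2) errors."""
    return (d - 1) // 2

assert hamming_distance([1, 0, 1, 1], [1, 1, 1, 0]) == 2
assert correctable_errors(3) == 1   # e.g., the (7,4,3) Hamming code
```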
Concatenated error correcting codes
As noted above, the robustness of the watermarking scheme can be improved by concatenating these codes, using signature repetition codes as outer codes and bit repetition, Hamming or BCH codes as inner codes. The outer coding is adaptive with respect to the size of the image and the user parameters; it is always repetition coding, as shown in Figure 5.
At the receiver side the exact opposite procedure is applied to recover the signature from the watermark, i.e., we first decode the watermark using repetition decoding and then decode the resulting information using repetition, Hamming or BCH decoding to obtain the signature. A schematic sketch of this pipeline is given below.
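The following sketch is a schematic illustration under our own naming, not the paper's implementation; `inner_encode` and `inner_decode` stand for any of the inner codes (bit repetition, Hamming or BCH):

```python
def encode_watermark(signature, inner_encode, reps):
    """Inner-encode the signature, then build the watermark by
    repeating the encoded block (the outer repetition code)."""
    return inner_encode(signature) * reps

def decode_watermark(watermark, inner_decode, reps):
    """First undo the outer code by averaging the repeated copies
    (majority vote per bit), then apply the inner decoder."""
    block = len(watermark) // reps
    copies = [watermark[i * block:(i + 1) * block] for i in range(reps)]
    averaged = [int(2 * sum(bits) >= reps) for bits in zip(*copies)]
    return inner_decode(averaged)

# e.g., with a trivial identity inner code:
wm = encode_watermark([1, 0, 1], lambda s: s, reps=5)
assert decode_watermark(wm, lambda s: s, reps=5) == [1, 0, 1]
```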
Such a concatenation mode has been selected because error correcting codes cannot display their potential unless the error rate induced by the channel is reduced below a critical value; this brings the possibility of first improving the channel error rate to an acceptable level via repetition coding before any further decoding. The watermark channel may have to operate at very high bit error rates, where codes such as BCH stop bringing any advantage while repetition codes continue to offer their modest protection. Concatenating repetition and BCH codes is thus a way to improve the decoding performance when the error rates are high [3]. The BCH codes can correct up to t = ⌊(d − 1) / 2⌋ errors; any error pattern exceeding t errors may cause the decoder to decode erroneously. The repetition codes display better characteristics than BCH codes under high error rates. This can be seen in Section 4.2 (noise attack), where the repetition codes perform much better than the BCH (63,16,23) codes when the SNR < 2.
As mentioned in the introduction, Reed-Solomon codes are used in a standalone mode to correct burst errors. The decoding of Reed-Solomon codes is carried out using list decoding algorithms [1, 6], which offer enhanced performance over bounded distance algorithms when the code rates are low.
Repetition codes
Repetition codes, expressed in the form (n, k, d), are used to construct the watermark from the signature. They are always used as (n, 1, n) codes, where each bit is repeated n times. Repetition codes are used as inner codes in the construction of the watermark and are also used as outer codes in all cases. Decoding is always done by taking the mean of the received codeword and thresholding it to decide between a 0 and a 1. A minimal sketch follows.
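A minimal sketch of (n, 1, n) repetition encoding and mean-based decoding (illustrative only):

```python
def rep_encode(bits, n):
    """(n, 1, n) repetition code: each bit is repeated n times."""
    return [b for bit in bits for b in [bit] * n]

def rep_decode(received, n):
    """Take the mean of each n-bit group and threshold at 1/2
    (ties on even n are resolved towards 1 here)."""
    return [int(2 * sum(received[i:i + n]) >= n)
            for i in range(0, len(received), n)]

noisy = rep_encode([1, 0, 1], 5)
noisy[0] ^= 1; noisy[6] ^= 1          # two isolated bit errors
assert rep_decode(noisy, 5) == [1, 0, 1]
```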
Hamming codes
Hamming codes are linear block codes. For an integer m > 1, the binary Hamming codes have parameters (n, k, d) = (2^m − 1, 2^m − 1 − m, 3).
For m = 3, we obtain the (7,4,3) Hamming code, which encodes 4 bits of data into 7-bit blocks (Hamming codewords). The extra 3 bits are parity bits. Each of the 3 parity bits covers 3 of the 4 data bits, and no two parity bits cover the same 3 data bits. All of the parity bits use even parity. The (7,4,3) Hamming code can correct 1 error in each codeword.
When we multiply the received codeword by the parity check matrix we obtain a three-bit syndrome ranging from 000 to 111. These three bits give the error location: 000 indicates that there was no error in transmission, while 001 to 111 indicate the error location in the seven-bit received codeword. Since the minimum Hamming distance between codewords is d = 3, we can correct t = ⌊(d − 1) / 2⌋ = 1 error. Once the error location is known, we simply flip the bit at that location and the error is corrected. After discarding the parity bits at positions one, two and four, we recover the data word. Hamming codes are perfect 1-error-correcting codes: any received word with at most one error is decoded correctly, and the code has the smallest possible size of any code achieving this. A sketch of this syndrome decoding is given below. Since the Hamming codes we used can correct only 1 error per codeword, there was a need to test other codes that can correct more errors. We selected the BCH codes, which are explained in the following section.
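The (7,4,3) decoding just described can be sketched as follows; this version uses the classical layout with parity bits at positions 1, 2 and 4, where the syndrome read as a binary number directly gives the error position (our illustration, not the exact implementation of the scheme):

```python
def hamming74_encode(d):
    """Place the 4 data bits at positions 3, 5, 6, 7 and the even
    parity bits at positions 1, 2, 4 of a 7-bit codeword."""
    d3, d5, d6, d7 = d
    return [d3 ^ d5 ^ d7, d3 ^ d6 ^ d7, d3,
            d5 ^ d6 ^ d7, d5, d6, d7]

def hamming74_decode(r):
    """The XOR of the (1-based) positions of the set bits is the
    syndrome; non-zero means 'flip the bit at that position'."""
    s = 0
    for pos in range(1, 8):
        if r[pos - 1]:
            s ^= pos
    r = r[:]
    if s:
        r[s - 1] ^= 1
    return [r[2], r[4], r[5], r[6]]    # drop parity positions 1, 2, 4

c = hamming74_encode([1, 0, 1, 1])
c[4] ^= 1                              # any single-bit error is corrected
assert hamming74_decode(c) == [1, 0, 1, 1]
```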
Bose Chaudhuri Hocquenghem (BCH) codes
BCH codes are cyclic block codes: for any positive integers m ≥ 3 and t with t ≤ 2^(m−1) − 1, there is a binary BCH code of length n = 2^m − 1 which is capable of correcting t errors and has dimension k ≥ n − m·t. A small illustration of these parameters follows.
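As a quick check of these parameters (our sketch; note that k ≥ n − m·t is a design bound that may be loose for large t):

```python
def bch_design_params(m, t):
    """Length n = 2**m - 1 and the design lower bound on the
    dimension, k >= n - m * t, for a binary BCH code."""
    n = 2**m - 1
    return n, n - m * t

# BCH(15,7,5): m = 4, t = 2 -> the bound k >= 15 - 8 = 7 is tight
assert bch_design_params(4, 2) == (15, 7)
```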
Let C be a linear block code over a finite field F of block length n. C is called a cyclic code if, for every codeword c = (c_1, …, c_n) of C, the word (c_n, c_1, …, c_{n−1}) in F^n obtained by a cyclic right shift of the components is also a codeword of C.
We selected the BCH (15,7,5) and BCH (63,16,23) error correcting codes for our experimentation. The BCH (63,16,23) code is in line with our algorithm's testing parameters, since the size of our initial matrix is 8 × 8 bits.
Reed-Solomon codes
Reed-Solomon codes [49, 54] are q-ary [n, k, d] error correcting codes of length n, dimension k and minimum Hamming distance d = n − k + 1. These codes can be decoded in a unique way up to t = ⌊(n − k) / 2⌋ errors, and it is possible to decode them beyond the classical bounded radius ⌊(d − 1) / 2⌋. Usually these codes are considered over the Galois field GF(p^m) (for p a prime) and have parameters [p^m − 1, p^m − 1 − 2t, 2t + 1]. In particular the case p = 2 is often considered for applications, since then any symbol of the code can be described with m bits. It is also possible, either by considering fewer coordinates in their definition or by shortening them, to construct Reed-Solomon codes over GF(p^m) with parameters [p^m − 1 − s, p^m − 1 − 2t − s, 2t + 1], which can be decoded in the same way as non-shortened Reed-Solomon codes.
Reed-Solomon codes are particularly useful against burst noise. This is illustrated in the following example.
Consider an (n, k, d) = (40, 11, 30) Reed-Solomon code over GF(2^6), where each symbol is made up of m = 6 bits as shown in Figure 6. Since d = 30, this code can correct any t = 14 symbol errors in a block of 40. Consider the presence of a burst of noise lasting 60 bits which disturbs 10 symbols, as highlighted in Figure 6. The Reed-Solomon (40,11,30) code can correct any 14 symbol errors using the bounded distance decoding algorithm, regardless of the type of error induced by the attack. The code corrects by blocks of 6 bits and replaces a whole symbol by the correct one regardless of the number of corrupted bits in the symbol, i.e., it treats an error of 1 bit in a symbol the same way as an error of 6 bits, replacing it with the correct 6-bit symbol. This gives Reed-Solomon codes a tremendous advantage over binary codes against burst noise. In this example, if the same 60 bits of noise disturbance occurred in a random fashion rather than as a contiguous burst, they could affect many more than 14 symbols, which is beyond the capability of the code. The worked arithmetic below makes this explicit.
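The burst arithmetic of this example can be checked directly (a small sketch using the figure's parameters):

```python
m = 6                                   # bits per RS symbol over GF(2**6)
t = 14                                  # symbol correction capability
burst_bits = 60

# A contiguous 60-bit burst touches ceil(60 / 6) = 10 symbols when it
# starts on a symbol boundary, and at most 11 symbols otherwise:
worst_case = -(-burst_bits // m) + 1
assert worst_case <= t                  # the burst is always correctable

# The same 60 bit errors scattered at random can corrupt up to 60
# distinct symbols, far beyond the t = 14 correctable symbols.
```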
In the watermarking channel, the errors, characterized by the different attacks, occur in random or burst manner. Depending on the placement of the watermark in an image and the use of error correcting codes, the robustness of the signature can be increased against the attacks.
For Reed-Solomon codes, the conventionally used bounded distance decoding algorithms correct up to t = ⌊(n − k) / 2⌋ symbol errors, as shown in the above example. Using list decoding, Sudan [1] and later Guruswami and Sudan [6] showed that the error correcting capability of Reed-Solomon codes can be improved to approximately n − √(2kn) and n − √(kn) errors, respectively.
List decoding of Reed-Solomon codes
It is well known that for a linear code [n, k, d]_q over the field GF(q), of length n, dimension k and distance d, it is possible to decode in a unique way up to t = ⌊(d − 1) / 2⌋ errors. Now what happens if the number of errors is greater than t? Clearly there will always be cases where unique decoding is impossible. For instance, if d is odd and a codeword c has weight d, any element x of weight (d + 1) / 2 whose support (the set of non-zero coordinates) is included in the support of c lies at distance at most (d + 1) / 2 from two codewords, c and (0, 0, …, 0), which gives two possibilities for decoding within that radius. Meanwhile, if one considers a random element of weight (d + 1) / 2, the probability that such a situation occurs is very small. A closer look at the probabilities shows that even for larger t (bounded by a certain bound, called the Johnson bound) the probability that a random element is incorrectly decoded remains very small. The small sketch below illustrates this ambiguity.
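A tiny numerical illustration of the ambiguity (our sketch, with n = 10 and d = 5 as hypothetical values):

```python
n, d = 10, 5
c = [1] * d + [0] * (n - d)                  # a codeword of weight d
zero = [0] * n
x = [1] * ((d + 1) // 2) + [0] * (n - (d + 1) // 2)  # support inside c's

dist = lambda u, v: sum(a != b for a, b in zip(u, v))
assert dist(x, zero) == (d + 1) // 2         # 3 errors from the zero word
assert dist(x, c) == (d - 1) // 2            # but only 2 from c
# decoding x within radius (d + 1) / 2 = 3 can return either codeword
```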
The idea of list decoding is that for t > (d − 1) / 2 a list decoding algorithm outputs a list of codewords rather than a unique codeword. List decoding was introduced by Elias [55], but the first usable algorithm for a family of codes, the Reed-Solomon codes, was proposed by Sudan in [1]; the method was later improved by Guruswami and Sudan [6].
The list decoding method is very powerful but it is slower than classical algorithms, which decode fewer errors. In the usual coding theory context decoding speed is a very important factor, since one wants to optimize communication speed, but there exist contexts in which the speed of such a decoder matters less because the algorithm is only invoked occasionally in the overall process. This is for instance the case in cryptography and in traitor tracing schemes [56], where list decoding algorithms are used when one wants to search for a corrupted mark (which does not occur all the time).
The principle of the algorithm is a generalization of the classical Welch-Berlekamp algorithm. It works in two steps: first construct a particular bivariate polynomial Q(x, y) over GF(q), then factorize it to find special factors. These factors lead to a list of decoded codewords.
The first algorithm by Sudan permits (for k / n < 1 / 3) decoding up to approximately n − √(2kn) errors, rather than the ⌊(n − k) / 2⌋ (roughly n / 2 at low rate) of classical algorithms. This method is based on Lagrange interpolation.
The list decoding algorithm of Sudan [1, 57] is detailed in the following steps. For a received codeword r = (r_1, r_2, …, r_n), encoding elements (x_1, x_2, …, x_n), a list size l and a decoding radius

t_S = n − 1 − ⌊√(2(k − 1)n)⌋,   (5)

1. Solve the following system of linear equations for the coefficients Q_{u,j}:

∑_{j=0}^{l} ∑_{u=0}^{l_j} Q_{u,j} x_i^u r_i^j = 0, for i = 1, 2, …, n,   (7)

where l_j = n − 1 − t_S − j(k − 1).

2. Put Q_j(x) = ∑_{u=0}^{l_j} Q_{u,j} x^u and Q(x, y) = ∑_{j=0}^{l} Q_j(x) y^j.

3. Find all factors of Q(x, y) of the form (y − f(x)) with degree(f(x)) < k.

4. Keep the list of those factors f(x) whose evaluations agree with r in at least n − t_S positions.

5. Calculate f(x) over the encoding elements to obtain the corrected codeword (c_1, c_2, …, c_n).
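To make step 1 concrete, here is a small sketch that builds the interpolation system over a prime field GF(p) and extracts a non-zero solution; all parameters (p, n, k, t_s, l) are illustrative choices of ours, and the factorization of Q(x, y) (steps 3 to 5) is omitted:

```python
import random

def nullspace_vector(A, p):
    """A non-zero v with A v = 0 over GF(p); one exists because the
    system below has more unknowns than equations."""
    rows, cols = len(A), len(A[0])
    A = [row[:] for row in A]
    pivots = []
    for c in range(cols):
        r = len(pivots)
        piv = next((i for i in range(r, rows) if A[i][c] % p), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        inv = pow(A[r][c], p - 2, p)                 # Fermat inverse
        A[r] = [a * inv % p for a in A[r]]
        for i in range(rows):
            if i != r and A[i][c] % p:
                f = A[i][c]
                A[i] = [(a - f * b) % p for a, b in zip(A[i], A[r])]
        pivots.append(c)
        if len(pivots) == rows:
            break
    free = next(c for c in range(cols) if c not in pivots)
    v = [0] * cols
    v[free] = 1
    for i, pc in enumerate(pivots):
        v[pc] = -A[i][free] % p
    return v

random.seed(1)
p, n, k = 13, 12, 3                     # toy RS code over GF(13)
l, t_s = 2, 5                           # list size / radius, t_s > (n-k)//2
xs = list(range(1, n + 1))              # encoding elements

msg = [random.randrange(p) for _ in range(k)]
cw = [sum(c * pow(x, i, p) for i, c in enumerate(msg)) % p for x in xs]
rcv = cw[:]
for i in random.sample(range(n), t_s):  # inject t_s symbol errors
    rcv[i] = (rcv[i] + 1 + random.randrange(p - 1)) % p

# monomials x^u y^j with u <= l_j = n - 1 - t_s - j(k - 1)
mons = [(u, j) for j in range(l + 1) for u in range(n - t_s - j * (k - 1))]
A = [[pow(x, u, p) * pow(y, j, p) % p for u, j in mons]
     for x, y in zip(xs, rcv)]
Q = dict(zip(mons, nullspace_vector(A, p)))
assert all(sum(c * pow(x, u, p) * pow(y, j, p)
               for (u, j), c in Q.items()) % p == 0
           for x, y in zip(xs, rcv))    # Q vanishes at every point
```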
The second method, by Guruswami and Sudan [6, 57], permits decoding up to approximately n − √(kn) errors, but is trickier to use since it is based on Hermite bivariate interpolation and on the notion of Hasse derivative. For a bivariate polynomial Q(x, y) = ∑_{a,b} Q_{a,b} x^a y^b, the (h, u)-th Hasse derivative at the point (a′, b′) is defined as:

Q^{[h,u]}(a′, b′) = ∑_{a≥h} ∑_{b≥u} C(a, h) C(b, u) Q_{a,b} (a′)^{a−h} (b′)^{b−u}.
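A small sketch of this Hasse derivative over GF(p), with the polynomial represented as a dictionary mapping (i, j) to the coefficient of x^i y^j (our illustrative representation):

```python
from math import comb

def hasse_derivative(Q, h, u, a, b, p):
    """(h, u)-th Hasse derivative of Q(x, y) evaluated at (a, b) over
    GF(p): sum of C(i, h) C(j, u) Q_{i,j} a^(i-h) b^(j-u)."""
    return sum(comb(i, h) * comb(j, u) * c
               * pow(a, i - h, p) * pow(b, j - u, p)
               for (i, j), c in Q.items() if i >= h and j >= u) % p

Q = {(0, 0): 1, (2, 0): 3, (1, 1): 4}      # 1 + 3x^2 + 4xy
# the (0, 0) derivative is just evaluation: Q(2, 1) = 1 + 12 + 8 = 21
assert hasse_derivative(Q, 0, 0, 2, 1, 7) == 21 % 7
```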
In practice the hard step of the decoding is finding the polynomial Q(x, y). It can be done in cubic complexity in an elementary (but slow) way by the inversion of a matrix, or in quadratic complexity with a harder-to-implement method [58].
The Guruswami-Sudan list decoding algorithm, detailed in [6, 57], can be summarized in the following three steps.

1. For a received word (r_1, r_2, …, r_n) and encoding elements (x_1, x_2, …, x_n) belonging to a Galois field, solve for the Q_{a,b} the system of homogeneous linear equations

∑_{a≥h} ∑_{b≥u} C(a, h) C(b, u) Q_{a,b} x_i^{a−h} r_i^{b−u} = 0,   (8)

where h + u < s, i = 1, 2, …, n, and s is a natural number. Q_{a,b} = 0 if b > l or a > l_b, where l_b = s(n − t_GS) − 1 − b(k − 1), and l and s are the list size and multiplicity factor [6, 57] for the Reed-Solomon code, the decoding radius being

t_GS = n − 1 − ⌊√(n(k − 1))⌋.   (9)

2. Put Q(x, y) = ∑_{a,b} Q_{a,b} x^a y^b; consequently Q vanishes with multiplicity at least s at each point (x_i, r_i).

3. Find all factors of Q(x, y) of the form (y − f(x)) with degree(f(x)) < k, and then calculate f(x) over the encoding elements to obtain the corrected codeword (c_1, c_2, …, c_n).
The performance of the Guruswami-Sudan algorithm is better than that of Sudan's algorithm when the code rate R = k / n is high; when the code rate is very low they perform similarly. Both list decoding algorithms show a clear improvement over the bounded distance (BD) algorithms when the code rate is low. We exploit this property of the list decoding algorithms when encoding the signature into the watermark. The improvement in performance is shown in Figure 7.
We select Sudan's algorithm for decoding, since the code rate R < 1 / 3 for the watermarking scheme presented in Section 2 [5] with a signature size of 64 bits, and the actual performance gain of the Guruswami-Sudan algorithm over Sudan's is not significant. The parameters used to demonstrate the performance of the Reed-Solomon codes are RS (40,11,30), RS (127,9,119) and RS (448,8,441), with code rates of 0.275, 0.071 and 0.018 respectively. It is therefore not worthwhile to use the high-complexity Guruswami-Sudan algorithm, as the code rates are very low for the given cases and Sudan's algorithm has similar performance, especially for RS (127,9,119) and RS (448,8,441), as seen in Figure 7. A back-of-envelope comparison of the decoding radii is sketched below.
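As a rough check of these rates and radii (our sketch; the Sudan radius is derived here from the solvability condition of step 1 above, maximized over the list size l, so the exact values behind Figure 7 may differ slightly):

```python
from math import ceil

def bd_radius(n, k):
    """Classical bounded distance radius t = floor((n - k) / 2)."""
    return (n - k) // 2

def sudan_radius(n, k):
    """Largest t for which the interpolation system of step 1 has more
    unknowns than equations, over list sizes l with all l_j >= 0."""
    best = 0
    for l in range(1, 60):
        t = ceil(n - n / (l + 1) - (k - 1) * l / 2) - 1
        if n - 1 - t - l * (k - 1) >= 0:
            best = max(best, t)
    return best

for n, k, d in [(40, 11, 30), (127, 9, 119), (448, 8, 441)]:
    print(f"RS({n},{k},{d}): rate {k / n:.3f}, "
          f"BD corrects {bd_radius(n, k)}, Sudan about {sudan_radius(n, k)}")
```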
Given the properties of these different codes, we now study the integration of these tools in our color watermarking process.