- Research Article
- Open Access
Markov Modelling of Fingerprinting Systems for Collision Analysis
EURASIP Journal on Information Security volume 2008, Article number: 195238 (2007)
Abstract
Multimedia fingerprinting, also known as robust or perceptual hashing, aims at representing multimedia signals through compact and perceptually significant descriptors (hash values). In this paper, we examine the probability of collision of a certain general class of robust hashing systems that, in its binary alphabet version, encompasses a number of existing robust audio hashing algorithms. Our analysis relies on modelling the fingerprint (hash) symbols by means of Markov chains, which is generally realistic due to the hash synchronization properties usually required in multimedia identification. We provide theoretical expressions of performance, and show that the use of M-ary alphabets with M > 2 is advantageous with respect to binary alphabets. We show how these general expressions explain the performance of Philips fingerprinting, whose probability of collision had only been previously estimated through heuristics.
1. Introduction
Multimedia fingerprinting, also known as robust or perceptual hashing, aims at representing multimedia signals through compact and perceptually significant descriptors (hash values). Such descriptors are obtained through a hashing function that maps signals surjectively onto a sufficiently lower-dimensional space. This function is akin to a cryptographic hashing function in the sense that, in order to perform nearly unique identification from the hash values, perceptually different signals (according to some relevant distance) must lead with high probability to clearly different descriptors. Equivalently, the probability of collision between the descriptors corresponding to perceptually different signals must be kept low. Unlike in cryptographic hashing, signals that are perceptually close must lead to similar robust hashes. Despite this difference with respect to cryptographic hashing, the probability of collision remains the parameter that determines the "resolution" of a method for identification purposes.
A large number of robust hashing algorithms have been proposed recently. This flurry of activity calls for a more systematic examination of robust hashing strategies and their performance properties. In this paper, we take a step in that direction by examining the probability of collision of a certain general class of robust hashing systems, rather than analyzing a particular method. In its binary alphabet version, the class considered broadly encompasses several existing algorithms, in particular a number of robust audio hashing algorithms [1–4]. We will show that the M-ary alphabet version of the class provides an advantage over the binary version for fixed storage size. In order to keep our exposition simple, other issues such as robustness to distortions or to desynchronization are not considered in this analysis. The study of the tradeoffs brought about by the simultaneous consideration of these issues is left for further work. We must also note that we will be dealing with unintentional collisions due to the inherent properties of the signals to be hashed. A related problem not tackled in this paper is the analysis of intentional forgeries of signals (perhaps under distortion constraints) in order to maximize the probability of collision.
The class of fingerprinting systems that we will study in this paper can be considered as consisting of two independent blocks. Denoting the multimedia signal to be hashed by a continuous-valued, finite-dimensional vector, in the first block, the feature extraction block, a function is applied to the signal to extract a set of feature vectors, which we assume to be real-valued with fixed dimension.
The second block can be termed the hashing block, in which the continuous feature vector values are mapped to a finite alphabet of hash symbols, that is, quantized. In many methods, this hashing block is implemented through the application of a scalar hashing function to each scalar feature vector value, mapping it onto an alphabet of hash symbols of finite size.
In any hashing system, a distance measure must be established in order to determine the closeness between hash values. The commonly used distance for comparing sequences formed by discrete-alphabet symbols is the Hamming distance. This distance is defined as the number of times that symbols with the same index differ in the two sequences. Therefore, when comparing any two symbols from the alphabet, their Hamming distance can only take the value 0 or 1.
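As a minimal illustration of this distance (the function name and sequence values below are our own, not from the paper), the Hamming distance between two symbol sequences can be computed as:

```python
def hamming_distance(seq_a, seq_b):
    """Number of index positions at which two equal-length symbol sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))

# Two 4-ary hash sequences over the alphabet {0, 1, 2, 3}
print(hamming_distance([0, 3, 2, 1, 0], [0, 1, 2, 2, 0]))  # -> 2
```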
As already stated, our aim is to investigate the probability of collision (also termed in some works the false positive probability) of the general type of system described above, under certain assumptions that we will give next. Given a distance measure, the probability of collision is simply the probability that the fingerprints (hashes) of two independent signals are closer than some preestablished threshold according to that measure. Our analysis will rely on the fact that the feature vector values are generally highly correlated, due to the synchronization requirements of a fingerprinting system. This high degree of correlation frees the observer of a segment of the signal (or a distorted version of it) from the need to know its exact alignment with the complete original signal used to store the fingerprint during the acquisition process (in which the reference hash is obtained for subsequent comparisons). For example, in the Philips method [5] the features are extracted frame-by-frame on a set of heavily overlapped frames, which creates the conditions for our analysis. In the following, we will consider the case in which the dependencies within a feature vector can be modelled as a continuous-valued, discrete-time Markov chain, so that each feature value depends on the preceding values only through the immediately preceding one. Furthermore, we assume that the process is stationary, that is, its statistics are independent of the time index. We will also focus, without loss of generality, on one particular element of the feature vector, dropping the corresponding index from the notation.
We characterize next the Markov chain of the hash symbols. Let h_k denote the discrete hash symbol generated by application of the hashing function to the kth element of the feature vector. We will assume that the sequence of hash symbols forms a discrete-valued, discrete-time Markov chain, with transition probabilities defined by

p_ij = Pr(h_k = i | h_(k-1) = j)    (4)

for all pairs of symbols (i, j) in the alphabet.
Finally, note that, although methods which deal with real-valued fingerprints could be deemed in principle to belong to this class (using very large alphabet sizes), they rely on the use of mean square error distances instead of the Hamming distance. Thus, their study is not covered by the class of methods studied here.
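As a sketch of the symbol model just described (the function name, seed, and parameter values are illustrative assumptions, not taken from the paper), a hash-symbol sequence can be simulated by drawing from a Markov chain with a column-stochastic transition matrix:

```python
import random

def sample_hash_sequence(transition, length, seed=0):
    """Draw a symbol sequence from a Markov chain whose column-stochastic
    transition matrix satisfies transition[i][j] = Pr(next = i | current = j).
    The initial symbol is drawn uniformly from the alphabet."""
    rng = random.Random(seed)
    m = len(transition)
    state = rng.randrange(m)
    seq = [state]
    for _ in range(length - 1):
        weights = [transition[i][state] for i in range(m)]
        state = rng.choices(range(m), weights=weights)[0]
        seq.append(state)
    return seq

# Binary symmetric chain: the off-diagonal entry is the bit-transition probability.
q = 0.3
P = [[1 - q, q], [q, 1 - q]]
seq = sample_hash_sequence(P, 10)
```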
Notation
Lowercase boldface letters represent column vectors, while matrices are represented by uppercase Roman letters. diag(·) denotes a matrix with the elements of its vector argument in the diagonal and zeros elsewhere. The symbols I and 0 denote the identity and the all-zero matrices, respectively, whereas 1 denotes an all-ones vector, all of suitable size depending on the context. tr(·) denotes the trace of a matrix. The vec(·) operator stacks sequentially the columns of a matrix into a single column vector. The symbol ⊗ denotes the Kronecker (or direct) product of two matrices, and ⊙ denotes their Hadamard (component-wise) product. Finally, δ denotes the Kronecker delta function.
2. Probability of Collision
We firstly define b as the number of bits required to store a single M-ary hash symbol, that is, b = log2 M (assumed integer). To fix a point of operation, we consider hash sequences of L = Nb/b symbols (assumed integer) which have a fixed bit size Nb (the storage size). We investigate the probability of collision between two such independent sequences of symbols generated from the Markov chain with M × M transition matrix P, whose elements are defined in (4). Note that P is a column-stochastic matrix, so that its columns sum to one.
The probability of collision is simply the probability that two such hash sequences are closer than a given threshold under the distance measure established. Write D to represent the Hamming distance between the sequences. Let T be the Hamming distance below which we consider two sequences of the given storage size to be identical, assuming T integer for simplicity. Using this threshold, the probability of collision between two sequences of the given storage size is

Pc = Pr(D <= T).    (6)
In order to approximate this probability, observe that for any two sequences of L symbols their overall Hamming distance is

D = d_1 + d_2 + ... + d_L,    (7)

with d_k the Hamming distance between the kth elements of the two sequences. If the random variables d_k were independent, we could apply the central limit theorem (CLT) to D for large L in order to compute the probability (6). Although there are short-term dependencies created by the Markov chain, these vanish in the long term, so we may invoke a broader version of the CLT for locally correlated signals [6]. In summary, the result in [6] states that, provided the second and third moments of the d_k are bounded, the distribution of D tends to the normal distribution. Finally, notice that D is discrete, so applying the CLT entails approximating a distribution with support in the nonnegative integers by a distribution with support on the whole real line.
Assuming that the distribution of D may be approximated by a Gaussian for large L, we only need its mean and variance to characterize it. The probability of collision can then be approximated as

Pc ~ Phi((T - E[D]) / sqrt(Var[D])),    (8)

with Phi the standard normal cumulative distribution function. We tackle the computation of the statistics required for this approximation in Section 3, and particular cases in Section 5.
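The CLT approximation can be sketched numerically as follows (the helper name and the numbers in the sanity check are our own; the i.i.d.-bit values are used only because their mean and variance are known in closed form):

```python
import math

def collision_prob_clt(mean_d, var_d, threshold):
    """Gaussian (CLT) approximation to Pr(D <= threshold), where D is the
    Hamming distance between two independent hash sequences with the given
    mean and variance: Pc ~= Phi((threshold - mean) / std)."""
    z = (threshold - mean_d) / math.sqrt(var_d)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Sanity check with i.i.d. fair bits (not the Markov case): 256 bits give
# mean 128 and variance 64; a threshold at one quarter of the bits (64)
# lies eight standard deviations below the mean, so Pc is tiny.
p = collision_prob_clt(128.0, 64.0, 64.0)
```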
Alternatively, the exact computation of (6) involves enumerating all the cases that yield a Hamming distance lower than or equal to the threshold, that is,

We investigate this direct approach in Section 4. Finally, in Section 6 we propose a Chernoff bound on Pc, which is useful when the CLT assumption is not accurate or when the exact computation presents numerical difficulties.
3. Mean and Variance of Hamming Distance
In this section, we derive the mean and variance of the Hamming distance using the Markov chain of symbol transitions defined by (4). To proceed, we assume that this chain is irreducible and aperiodic.
We denote by a pair the simultaneous values, at a given time index, of two independent hash sequences. The Hamming distance between the two elements of a pair can only take the value 0 or 1. Also, for convenience, we associate with each pair the nonnegative integer obtained by concatenating the bit representations of its two components. For instance, with a 4-ary alphabet, a possible pair is (2, 3); the bit representations of its components are 10 and 11, the associated integer is 11 (binary 1011), and the Hamming distance of the pair is 1. We define next the vector of pair probabilities, whose components are the probabilities of all possible pairs sorted in natural order, that is, according to their associated integers. The pairs thus defined constitute a new Markov chain whose column-stochastic transition matrix is given by the Kronecker product of the symbol transition matrix with itself, since the two hash sequences are mutually independent. Denote the equilibrium distribution of this Markov chain as π. If the symbol transition matrix is symmetric, then the symbols are equally likely in equilibrium and π is uniform.
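A small numerical sketch of the pair chain follows (using numpy; the symmetric binary transition matrix and the helper name are illustrative choices of ours). It builds the pair transition matrix as a Kronecker product and recovers the equilibrium distribution as the eigenvector for the unit eigenvalue:

```python
import numpy as np

def pair_chain(P):
    """Return the transition matrix of the chain of symbol pairs drawn from
    two independent copies of a chain with column-stochastic transition
    matrix P, together with its equilibrium distribution."""
    R = np.kron(P, P)                               # columns of R also sum to one
    w, v = np.linalg.eig(R)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])  # eigenvector for eigenvalue 1
    return R, pi / pi.sum()

q = 0.3
P = np.array([[1 - q, q], [q, 1 - q]])              # symmetric binary chain
R, pi = pair_chain(P)
# With a symmetric P, the equilibrium distribution over the four pairs is uniform.
```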
Some more definitions will be required in order to formalize the derivation of the probabilities associated with a given Hamming distance sequence. Firstly, we define two indicator vectors, both of size equal to the number of possible pairs. The elements of the first vector are all zeros except for those at positions corresponding to pairs with Hamming distance 0, which are set to 1; the second vector is defined analogously for pairs with Hamming distance 1, so that the two vectors sum to the all-ones vector. Using these vectors, we can write the distribution of the elemental Hamming distance at a given index as

Observe next that the appropriate element of a power of the pair transition matrix gives the joint probability of the pairs at two different indices. Using this matrix, we can write the joint probability of a pair of elemental distances at two indices as

Using the probabilities (12) and (13), we can derive the mean and variance of the Hamming distance between two independent hash sequences of symbols, assuming that the process starts in the equilibrium distribution (11). In this case, the distribution of the elemental distance is the same at every index, and we can drop the time index from the notation. When the initial symbol is chosen with uniform probability from the alphabet, this condition holds if the transition matrix is symmetric. Even if all values of the initial symbol are not equiprobable in reality, the assumption is not too demanding whenever convergence to equilibrium is fast. We investigate a more general case for binary hashes in Section 5.
Noting that (7) is a sum of dependent variables, we have


Notice that, since the elemental Hamming distance only takes values in {0, 1}, its square equals itself, and so the first summand in (15) is just (14). We compute next the different summands required to obtain the mean and the variance. Denote the equilibrium mean and variance of the elemental distance as μ and σ², respectively. The equilibrium mean and second moment are given by

where we have used (12) and the equilibrium assumption. Hence (14) is given by

Next, consider the sum of the elemental distance covariances. If the elemental distances were independent, we would have

Taking into account the dependencies, we have instead,

Using next (12), (13), and the equilibrium assumption we can compute (19) as

In Appendix 1, we develop this expression to show that the variance of the Hamming distance between two hash sequences of the stated length is

with the remaining constant given by (A.9).
4. The Stochastic Process of Elemental Distances
In this section, we will investigate the stochastic process of elemental distances, that is, the process that generates the sequence of Hamming distances between the corresponding symbols of the two hashes. Through an analysis of this process, we arrive at a full expression for the probability of collision, which is exact in the case of binary hashing sequences with symmetric transition matrices. This is possible because, as we will show, the elemental distance process is itself a Markov chain in the binary symmetric case. Even in the general case, the elemental distance process is well approximated by a Markov chain, and then the expression obtained for the probability of collision can be interpreted as a good approximation to the true collision probability.
To understand the process of elemental distances, we consider the conditional probability of the current elemental distance given the previous one. Define the matrix of these conditional probabilities componentwise. From (12) and (13) we have that

Define as the matrix such that
. Using
, note that
, where
is the Hadamard product. Now using the identity
for any matrices
and
of appropriate size [7], we have that

Equation (23) represents a weighted sum of the diagonal elements of the pair transition matrix, with the weights depending on the conditioning pair and summing to one. Similarly, we have

Note that (24) is a weighted sum of the off-diagonal elements of the pair transition matrix, with weights depending on the conditioning pair and summing to one. The remaining two components of the conditional distribution follow by complementation.
It follows that, whenever the diagonal elements of the relevant matrix are all equal and the off-diagonal elements are all equal, the dependence on the conditioning pair factors out of (23) and (24), and the conditional probability is independent of the time step. In this case, the process of elemental distances is itself a stationary Markov chain. As we have discussed above, this equal-diagonal, equal-off-diagonal structure is what allows us to cancel the dependence on the conditioning pair in (23) and (24). For binary alphabets, observe that symmetry implies that the transition matrix is always of the form above, and then the conditions are always fulfilled in that case.
On the other hand, even when the elemental distances do not follow a Markov chain, since the distribution of the pairs converges to the equilibrium distribution, the elemental distance process is well approximated by the Markov chain with the transition matrix obtained by replacing the conditioning distribution in (23) and (24) with the equilibrium one. From now on, we will refer loosely to the elemental distance Markov chain, meaning, when appropriate, the Markov chain derived from this approximation.
4.1. Probability of Collision
Using (23) and (24), define the probability of a transition 0 → 1 and the probability of a transition 1 → 0 in the elemental distance Markov chain, together with the initial distribution of the elemental distance. Consider a sequence of elemental distances containing a given number of ones, that is, a given number of positions at which the elemental distance equals 1. Starting with a block of ones, such a sequence consists of blocks of ones interweaved with blocks of zeros. Either the number of blocks of zeros equals the number of blocks of ones, in which case the sequence ends with a block of zeros, or it is one fewer, in which case the sequence ends with a block of ones. Given the total number of ones in the sequence, it is possible to count the number of transitions of each type that occur in the sequence, and hence the probability that this sequence occurs. Indeed, letting a random variable model the whole sequence of elemental distances, then

To evaluate the probability of a given total distance, we enumerate all the different ways that a sequence with that number of ones and a given number of blocks can occur. This amounts to counting the number of ways that the ones can be subdivided into blocks, and the zeros into as many blocks, or one fewer. With the blocks constructed, interweaving them creates the sequence. Indeed, from the total number of possible positions at which the run of ones can be split, it is necessary to choose the block boundaries, so the number of ways of selecting the blocks of ones is given by a binomial coefficient, and similarly for the blocks of zeros in each of the two cases above. Thus,

Now,

Treating analogously the sequences that start with a block of zeros, and gathering terms, we arrive at the expression

where the remaining quantities follow from the counting argument above.
Expression (28) gives the exact probability of collision when the sequence of elemental distances is a Markov chain. In other cases, it will lead to an approximation. Consequently, the analysis is exact for binary alphabets with a symmetric transition matrix, in which case the transition probabilities of the elemental distance chain can be determined easily from the symbol transition matrix.
5. Binary Hashes with Symmetric Transition Matrix
In this section, we derive expressions for the particular case of binary hashes with a symmetric transition matrix. In this case, some simplifications of the general expressions derived above are possible. Define firstly the matrices

Note that the first matrix is idempotent, and then so is the second; a further consequence of the definitions is that their product is the all-zero matrix. Assuming symmetry, we can write the binary transition matrix in terms of a single transition probability as

With these definitions, it can be checked that (17) and (21) reduce to

While (31) holds under the assumption that the initial distribution is the equilibrium distribution, it is also possible to derive the exact mean and variance of the Hamming distance for an arbitrary initial distribution. This case is interesting since, although the symbol sequences are assumed to be generated from independent sources, at the application level the first bit of the hash sequence corresponding to the input signal is sometimes aligned with that of the hash sequences in the database. We can handle this scenario by assuming that the distance between the initial pair of bits is zero.
Before proceeding, note that, from (30), the transition matrix for the elemental distance process can be written as

5.1. Exact Mean and Variance
With , as before, the initial distribution of the elemental distances, it is convenient to define the vectors
and
and write
with

Note that and
. Following the same argument as previously, and defining
, we obtain analogous expressions to (16) and (20) for this case as follows:


The summands in (34) are sums of terms of the form , which are nonzero only when
. Furthermore, since the coefficient of
in
is
, it follows that the coefficient of
in
is
. Hence, summing the geometric series,

where

On the other hand, the summands in (35) are sums of terms of the form , which are nonzero only when
and
, in which case they take the value
. Now, observe that
and
. Hence, (35) reduces to a sum over four terms,
, and
, where

In Appendix 2, we use (38) to show that the variance of a symmetric binary hash is

Noting the limiting behaviour of the geometric terms, this expression coincides with (31) asymptotically when the initial distribution is the equilibrium one.
6. Chernoff Bounding
For small probabilities, the CLT can exhibit large deviations from the true probabilities. This is due to the fact that the CLT gives an approximation based only on the first two moments of the real distribution. Also, the exact computation (28) can run into numerical difficulties due to the combinatorial terms involved. It is then interesting to see what can be obtained by means of Chernoff bounding on (6). Apart from the interest of a strict upper bound, this strategy also provides the error exponent followed by the tail of the distribution of the Hamming distance.
The Chernoff bound on the probability of collision is given by

The expectation in (40) cannot be expanded as a product of elemental expectations due to the implicit dependencies. However, using the transition matrix of the elemental distance Markov chain, we can efficiently compute it as

It is not possible to optimize this expression analytically in closed form. Nonetheless, numerical optimization can be easily undertaken, as (41) is just a weighted sum of powers of the transition matrix.
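A sketch of this numerical optimization follows (the function name, grid search, and parameter values are our own choices; the expectation is propagated through an assumed two-state elemental distance chain):

```python
import math

def chernoff_bound(length, p01, p10, pi1, threshold, n_grid=2000, s_max=20.0):
    """Chernoff upper bound on Pr(D <= threshold): for every s >= 0,
    Pr(D <= T) <= exp(s*T) * E[exp(-s*D)].  The expectation is computed
    exactly by propagating weighted state masses through the two-state
    elemental distance chain; the bound is then minimised over a simple
    grid of s values, since no closed-form minimiser exists."""
    def mgf(s):
        w = math.exp(-s)
        u0, u1 = 1.0 - pi1, pi1 * w                  # weighted initial masses
        for _ in range(length - 1):
            u0, u1 = ((1.0 - p01) * u0 + p10 * u1,
                      (p01 * u0 + (1.0 - p10) * u1) * w)
        return u0 + u1
    return min(math.exp(s * threshold) * mgf(s)
               for s in (i * s_max / n_grid for i in range(1, n_grid + 1)))

# i.i.d. sanity case: 64 fair bits, threshold at one quarter of the length.
bound = chernoff_bound(64, 0.5, 0.5, 0.5, 16)
```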
7. Empirical Results
Matlab source code and data associated with the empirical results given below can be downloaded from http://www.ihl.ucd.ie.
7.1. Synthetic Markov Chains
To test the validity of the expressions presented and the accuracy of the CLT approximation, random binary and 4-ary hash sequences were drawn from the Markov chain model. For the binary case, the transition matrix in (30) is used. The 4-ary sequences were generated by concatenation of binary pairs, and the initial hash symbols were drawn from the equilibrium (uniform) distribution. The collision probability was measured empirically over a large number of trials in each case. In Figure 1, these empirical probabilities are plotted against the CLT approximation, using the mean and variance given by (17) and (21), respectively. Also shown is the theoretical expression, calculated using (28) and the elemental distance Markov chain. This demonstrates the accuracy of the elemental distance Markov chain approximation for 4-ary hashes.
The CLT approximation has good agreement in the binary case, but is significantly less accurate for 4-ary hashes. This is due to the fact that, in the second case, the pdf of the Hamming distance is significantly skewed, as zero distances are more likely to happen. Due to this, the CLT approximation underestimates the tail of the true distribution. The Chernoff bound, also shown in Figure 1, follows the same shape as the exact distribution and is tighter than the CLT approximation for high values of the threshold.
7.2. The Philips Method
We show in this subsection how the Markov modelling that we have described is applicable to the hashing method proposed by Haitsma et al. [1], commonly known as the Philips method. Moreover, we show how previous work on modelling this particular method allows us to obtain the parameters of the Markov chain analytically.
In previous work [8], we developed a model that allows the analysis of the performance of the Philips method under additive noise and desynchronisation. Using this model, the transition matrix of the Markov chain associated with the bitstream of the Philips hash can be determined analytically as follows. In [8], we analysed the bit error that results from desynchronization, that is, the lack of alignment between the original framing used in the acquisition stage and the framing that takes place in the identification stage.
In particular, we showed that for a given band (i.e., a particular feature value in this paper) the probability of error for a given desynchronization, measured in signal indices, is well approximated by

where ρ is the correlation coefficient corresponding to that band and that level of desynchronization. This model was shown therein to give very good agreement with empirical results, even with real audio (and hence nonstationary) input signals.
This same formula can be applied to determine the transition probabilities of the hash bits within a given signal. To this end, we only need to observe that two overlapped frames which generate consecutive hash bits are in fact desynchronized by the number of indices in which there is no overlap. Using this value in (42), it follows that the binary Markov chain model of Section 5 can be used to determine the probability of collision for this method. Figure 2 shows the accuracy of this model against empirical results for a range of hash sequence lengths, with the Philips method applied to the hashing of normally distributed i.i.d. input signals.
The empirical probability of collision of the Philips method is plotted against storage size and compared with the theoretical expression (28). The theoretical plot uses a binary transition matrix whose transition probability is calculated using (42), with the correlation coefficient determined empirically from hash sequence data. Hashes are generated from normally distributed i.i.d. input signals. Each frame corresponds to 0.37 seconds of a 44.1 kHz signal.
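While the exact expression (42) is derived in [8] and is not reproduced here, a closely related classical fact illustrates how a bit-transition probability follows from a correlation coefficient: for two zero-mean jointly Gaussian values with correlation coefficient ρ, the probability that they fall on opposite sides of zero, and hence yield different bits under sign quantization, is arccos(ρ)/π. A sketch (our own helper, offered as an assumption-laden stand-in for (42)):

```python
import math

def sign_flip_prob(rho):
    """Probability that two zero-mean jointly Gaussian feature values with
    correlation coefficient rho fall on opposite sides of zero, i.e. that
    sign quantization maps them to different hash bits: arccos(rho) / pi.
    (Classical Gaussian orthant result; the exact expression (42) used for
    the Philips model is derived in [8] and may differ.)"""
    return math.acos(rho) / math.pi

p = sign_flip_prob(0.0)   # uncorrelated features: the bits differ half the time
```

Note how the probability decreases towards zero as the correlation between overlapped frames approaches one, matching the intuition that heavy frame overlap yields strongly dependent consecutive hash bits.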
It is relevant to compare our Markov chain analysis with the collision probability for the Philips method previously examined in [5], in which it is referred to as the "probability of false alarm." Therein, it was assumed that the elemental distances were mutually independent, leading straightforwardly to a mean of half the storage size in bits and a variance of one quarter of that size. With the CLT approximation, from (8), this yields the following expression for the collision probability,

which is independent of the transition probability. To obtain agreement with empirical data, in [5] this expression is modified to account for dependencies using a heuristic correction factor , that is,

Considering our own CLT approximation (8), we observe that, letting the sequence length grow without bound in (36) and (39), the correction factor with respect to the independent case actually tends to

In the results presented in Figure 2, the correction factor that follows from the estimated transition probability agrees with the empirical behaviour. In summary, our analysis is able to tackle dependencies without resorting to any heuristics.
7.2.1. Real Audio Signals
We examine the validity of our analysis for real audio signals by carrying out a collision analysis on hashes generated using the Philips method on three real audio signals already used in [1, 8]: "O Fortuna" by Carl Orff, "Say What You Want" by Texas, and "Whole Lotta Rosie" by AC/DC (16 bits, 44.1 kHz). Using the parameters of the original algorithm described in [1], a 32-bit block is extracted from each frame. Each frame corresponds to 0.37 seconds of audio, and consecutive frames are heavily overlapped. Hence, from each audio file, a hash block whose size is proportional to the number of frames is extracted. Our collision analysis is applied by estimating a single empirical correlation coefficient from the entire hash block. We then use our model to predict the probability of collision between hash sequences drawn from the first 200 000 elements of the entire bit sequence. The results are shown in Figure 3.
Although our model assumes stationarity, which is clearly not the case for real audio signals, good agreement is found between the model predictions and the empirical data. The greatest discrepancy appears in the AC/DC audio and may be due to the greater dynamics of this song. To improve the results, we could follow the approach used in [8], where real audio signals are approximated by stationary stretches, and apply our model separately to each stretch. While this approach can provide the probability of collision within each stationary stretch, combining these into an overall probability of collision could prove problematic.
8. Conclusion
We have examined the probability of collision of a certain general class of robust hashing systems that can be described by means of Markov chains. We have given theoretical expressions for the performance of general chains of M-ary hashes, by deriving the mean and variance of the distance between independent hashes and applying a CLT approximation for the probability distribution. We have also been able to derive an expression for the distribution which is exact for binary symmetric hashes and gives a very good approximation otherwise. We have confirmed the accuracy of the Gaussian approximation for binary hashes once the hash sequence is sufficiently large. Moreover, we derived the binary transition matrix for the Philips method and showed that the Markov chain model has very good agreement with empirical results for this method. While we have shown that, for fixed storage size, M-ary chains with M > 2 have an advantage over binary chains from the point of view of collision, higher-order alphabets will inevitably lead to a degradation of performance under additive noise and desynchronisation error. The resulting performance tradeoffs will be examined in future work.
Appendices
A. Variance of an M-Ary Hash Sequence
In this appendix, we detail the computation of (20) in order to obtain the variance of the Hamming distance. Firstly, note that the following identity holds:

Define and
. Then

Since the transition matrix has a unit eigenvalue, the matrix whose inverse would be required is singular. Instead, notice that

which implies

with . Similarly,

and therefore,

Using (A.2), (A.4), (A.5), and (A.6), we get

Observe that, since ,

which implies that is a right identity of
. Hence, using the definition

(A.7) can be rewritten as

Note also that

Using (A.10) and (A.11), the sum of the covariances (20) is found to be

As ,

Using (17) and (A.12) in (15) we finally obtain (21).
B. Variance of Binary Symmetric Hash Sequence
In this appendix, we compute the sum of covariances (35), necessary to obtain the variance of a symmetric binary hash using (15). We will use (38) for this computation. We note firstly the following identities:

Using the definition in (37), we can write

Therefore,

Using (B.1) and (37), (B.3) becomes

Inserting (B.2) into the expression above, we get

Finally, inserting (36) and (B.5) into (15), we arrive at (39).
References
Haitsma J, Kalker T, Oostveen J: Robust audio hashing for content identification. Proceedings of the International Workshop on Content-Based Multimedia Indexing (CBMI '01), September 2001, Brescia, Italy 117-125.
Mihçak MK, Venkatesan R: A perceptual audio hashing algorithm: a tool for robust audio identification and information hiding. In Proceedings of the 4th International Workshop on Information Hiding (IHW '01), April 2001, Pittsburgh, Pa, USA, Lecture Notes In Computer Science. Volume 2137. Springer; 51-65.
Baluja S, Covell M: Content fingerprinting using wavelets. Proceedings of the 3rd European Conference on Visual Media Production (CVMP '06), November 2006, London, UK 209-212.
Kim S, Yoo CD: Boosted binary audio fingerprint based on spectral subband moments. Proceedings of the 32nd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '07), April 2007, Honolulu, Hawaii, USA 1: 241-244.
Haitsma J, Kalker T: A highly robust audio fingerprinting system. Proceedings of the 3rd International Conference on Music Information Retrieval (ISMIR '02), October 2002, Paris, France 107-115.
Blum M: On the central limit theorem for correlated random variables. Proceedings of the IEEE 1964,52(3):308-309.
Magnus JR, Neudecker H: Matrix Differential Calculus with Applications in Statistics and Econometrics. 2nd edition. John Wiley & Sons, New York, NY, USA; 1999.
Balado F, Hurley NJ, McCarthy EP, Silvestre GCM: Performance analysis of robust audio hashing. IEEE Transactions on Information Forensics and Security 2007,2(2):254-266.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Hurley, N.J., Balado, F. & Silvestre, G.C.M. Markov Modelling of Fingerprinting Systems for Collision Analysis. EURASIP J. on Info. Security 2008, 195238 (2007). https://doi.org/10.1155/2008/195238
Keywords
- Markov Chain
- Feature Vector
- Transition Matrix
- Collision Probability
- Markov Chain Model