
Table 6 Experimental results of different feature sets and clustering algorithms

From: Inside the scam jungle: a closer look at 419 scam email operations

 

|  | Threshold | ARI | Compactness | MDC | Clusters | Emails (% of total) | Features |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Choquet | 0.35 | - | 3.08 | 1,036 | 19,866 | 13,779 (39%) | 6 |
| Choquet, no phone | 0.35 | 0.53 | 2.64 | 912 | 16,745 | 17,753 (50%) | 5 |
| Choquet, no emails | 0.35 | 0.52 | 2.2 | 868 | 13,458 | 14,884 (42%) | 4 |
| Choquet, no subj. | 0.35 | 0.55 | 2.63 | 1,034 | 17,918 | 17,287 (49%) | 5 |
| WOWA | 0.5 | - | 3.3 | 1,012 | 21,775 | 16,920 (48%) | 6 |
| WOWA, no phone | 0.6 | 0.56 | 2.98 | 1,393 | 18,032 | 11,103 (31%) | 5 |