From: Optimized combined-clustering methods for finding replicated criminal websites
Scam websites | Dynamic cut height (test) | Dynamic cut height (train) | Optimized cut height (test) | Optimized cut height (train) |
---|---|---|---|---|
Fake escrow services | | | | |
Sentences | 0.107 | 0.289 | 0.982 | 0.924 |
DOM tags | 0.678 | 0.648 | 0.979 | 0.919 |
File names | 0.094 | 0.235 | 0.972 | 0.869 |
Images | 0.068 | 0.206 | 0.325 | 0.314 |
S and D | 0.942 | 0.584 | 0.982 | 0.925 |
S and F | 0.120 | 0.245 | 0.980 | 0.895 |
S and I | 0.072 | 0.257 | 0.962 | 0.564 |
D and F | 0.558 | 0.561 | 0.979 | 0.892 |
D and I | 0.652 | 0.614 | 0.599 | 0.385 |
F and I | 0.100 | 0.224 | 0.518 | 0.510 |
S and D and F | 0.913 | 0.561 | 0.980 | 0.895 |
S and D and I | 0.883 | 0.536 | 0.971 | 0.673 |
S and F and I | 0.100 | 0.214 | 0.975 | 0.892 |
D and F and I | 0.642 | 0.536 | 0.831 | 0.772 |
S and D and F and I | 0.941 | 0.536 | 0.971 | 0.683 |
High-yield investment programs | | | | |
Sentences | 0.713 | 0.650 | 0.738 | 0.867 |
DOM tags | 0.381 | 0.399 | 0.512 | 0.580 |
File names | 0.261 | 0.299 | 0.254 | 0.337 |
Images | 0.289 | 0.354 | 0.434 | 0.471 |
S and D | 0.393 | 0.369 | 0.600 | 0.671 |
S and F | 0.291 | 0.310 | 0.266 | 0.344 |
S and I | 0.290 | 0.362 | 0.437 | 0.471 |
D and F | 0.309 | 0.358 | 0.314 | 0.326 |
D and I | 0.302 | 0.340 | 0.456 | 0.510 |
F and I | 0.296 | 0.289 | 0.397 | 0.336 |
S and D and F | 0.333 | 0.362 | 0.319 | 0.326 |
S and D and I | 0.319 | 0.350 | 0.459 | 0.510 |
S and F and I | 0.303 | 0.289 | 0.398 | 0.336 |
D and F and I | 0.320 | 0.337 | 0.404 | 0.405 |
S and D and F and I | 0.320 | 0.337 | 0.404 | 0.405 |

Row labels abbreviate the input attributes: S = Sentences, D = DOM tags, F = File names, I = Images.