Table 1 Summary of the 11 hyperlink selection methods proposed. The table lists the classifiers involved, the local context constraints (where applied), and the network type where each classifier is applied for each method under consideration

From: Hybrid focused crawling on the Surface and the Dark Web

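Column groups: Classifiers (L, P, D, SCPL, DCPL); Local context constraints (t1 < score_l < t2, |sw| ≤ k); Classifiers used per network type (Freenet, Tor, I2P, Surf.).

| Method | L | P | D | SCPL | DCPL | t1 < score_l < t2 | \|sw\| ≤ k | Freenet | Tor | I2P | Surf. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | x |  |  |  |  |  |  | L | L | L | L |
| 2 | x |  | x |  |  |  |  | D | D | D | L |
| 3 | x |  | x |  |  | x |  | L, D | L, D | L | L |
| 4 | x |  | x |  |  | x |  | L, D | L, D | L, D | L, D |
| 5 | x | x |  |  |  | x |  | L, P | L, P | L | L |
| 6 | x |  |  | x |  | x |  | L, SCPL | L, SCPL | L | L |
| 7 | x |  | x |  |  |  | x | L, D | L, D | L, D | L, D |
| 8 | x | x |  |  |  |  | x | L, P | L, P | L, P | L, P |
| 9 |  |  |  | x |  |  |  | SCPL | SCPL | SCPL | SCPL |
| 10 |  |  |  |  | x |  |  | DCPL | DCPL | DCPL | DCPL |
| 11 | x |  |  |  | x |  |  | DCPL | DCPL | DCPL | L |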

  1. L stands for the link-based classifier, P for the parent Web page classifier, D for the destination Web page classifier, SCPL for the static linear combination classifier, DCPL for the dynamic linear combination classifier (both the static and the dynamic combination classifiers combine the link-based and the parent Web page classifiers), and Surf. for the Surface Web. Comma-separated entries in the classifiers used per network type denote a two-step hyperlink selection strategy, in which the second classifier is enabled based on the local context constraints.
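
The two-step strategy described in the note above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `TwoStepSelector` class, the stub classifiers, the threshold values for t1 and t2, and the `accept` cut-off are all hypothetical. The sketch also assumes that, once the second classifier is enabled, its score replaces the first one. Only the score-band constraint (t1 < score_l < t2) is modelled; the |sw| ≤ k constraint is left out because sw is not defined in this excerpt.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# A scorer maps a hyperlink to a relevance score in [0, 1] (hypothetical interface).
Scorer = Callable[[str], float]


@dataclass
class TwoStepSelector:
    first: Scorer                    # first classifier, e.g. L
    second: Optional[Scorer] = None  # second classifier, e.g. D; None means single-step
    t1: float = 0.3                  # assumed lower bound of the score band
    t2: float = 0.7                  # assumed upper bound of the score band
    accept: float = 0.5              # assumed selection threshold (not given in the table)

    def score(self, link: str) -> float:
        s = self.first(link)
        # Constraint t1 < score_l < t2: consult the second classifier only
        # when the first score falls strictly inside the band.
        if self.second is not None and self.t1 < s < self.t2:
            return self.second(link)
        return s

    def select(self, link: str) -> bool:
        return self.score(link) >= self.accept


# Stub classifiers returning fixed scores, for illustration only.
def link_classifier(link: str) -> float:
    return 0.4


def destination_classifier(link: str) -> float:
    return 0.9


# Method 3 from the table: "L, D" on Freenet and Tor, "L" on I2P and the Surface Web.
method_3: Dict[str, TwoStepSelector] = {
    "Freenet": TwoStepSelector(link_classifier, destination_classifier),
    "Tor": TwoStepSelector(link_classifier, destination_classifier),
    "I2P": TwoStepSelector(link_classifier),
    "Surf.": TwoStepSelector(link_classifier),
}

url = "http://example.onion/page"     # placeholder URL
print(method_3["Tor"].select(url))    # True: 0.4 lies in the band, so D decides (0.9)
print(method_3["I2P"].select(url))    # False: only L is applied (0.4 < 0.5)
```

With these stub scores the same hyperlink is selected on Tor, where the uncertain link-based score hands the decision to the destination classifier, and rejected on I2P, where only the link-based classifier is configured.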