The purpose of SRPITP is to protect the ownership and privacy of image owners: it sends an alarm to the image owners whenever illegal use of their images is found. The model framework is shown in Fig. 1. SRPITP mainly includes the following modules: image fingerprint extraction, detected image crawling, fingerprint storage, and image matching.
The detailed image protection process is as follows:
(1) If a user considers it necessary to protect his/her images, he/she submits a protection application to SRPITP. The submitted images are checked to see whether they are already in the protected image database. If not, the application is accepted.
(2) The accepted images are fingerprinted and classified by the fingerprint extraction servers.
(3) The fingerprints of these images and relevant information about the owner are inserted into the protected image database.
(4) Images on the monitored websites are crawled. The SRPITP system administrator has the privilege to determine which websites are monitored.
(5) The image crawling servers receive the images from the websites.
(6) The images obtained in step (5) are tagged and fingerprinted by the fingerprint extraction servers.
(7) The fingerprints and related information of the images processed in step (6) are saved in the detected image database. The related information includes the names of the uploaders and the upload times.
(8) Applying the image matching algorithm, the fingerprint matching servers determine whether any unauthorized image usage exists.
(9) If unauthorized image use is found, the fingerprint matching server sends a message to a management server.
(10) The management server immediately sends an alarm to the image owner, together with detailed information about the illegal use (such as when and where the usage was found and who uploaded the image). The image owner may then take appropriate measures to protect his/her rights.
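The registration path (steps 1 to 3) and the alarm path (steps 8 to 10) can be illustrated with a minimal in-memory sketch. Everything named here is hypothetical: `ProtectedImageDB`, `submit`, and `check` are illustrative names, the dictionary stands in for the protected image database, and a SHA-256 digest stands in for the image fingerprint purely to keep the example short (a cryptographic hash, unlike a perceptual fingerprint, does not survive re-encoding or resizing). Authorization checks are omitted.

```python
import hashlib
from dataclasses import dataclass, field


def toy_fingerprint(image_bytes: bytes) -> str:
    """Stand-in fingerprint (assumption): SHA-256 digest of the raw bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


@dataclass
class ProtectedImageDB:
    # Maps fingerprint -> owner id; a hypothetical stand-in for the database.
    records: dict = field(default_factory=dict)

    def submit(self, owner_id: str, image_bytes: bytes) -> bool:
        """Steps (1)-(3): reject images already protected, otherwise store them."""
        fp = toy_fingerprint(image_bytes)
        if fp in self.records:
            return False  # already in the protected image database
        self.records[fp] = owner_id
        return True

    def check(self, uploader: str, site: str, image_bytes: bytes):
        """Steps (8)-(10): return the alarm payload for the owner, or None."""
        owner = self.records.get(toy_fingerprint(image_bytes))
        if owner is None:
            return None
        return {"owner": owner, "uploader": uploader, "site": site}
```

In this sketch a second submission of the same image is rejected, and `check` returns the information that the management server would forward to the owner.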
With the growing number of images on websites, a traditional stand-alone computer cannot perform image fingerprint extraction and matching fast enough to guarantee the real-time performance of our proposed model. Therefore, Spark is applied to enhance the model’s throughput. One of the management servers works as the master, which monitors the status of the whole system and is also responsible for job scheduling. All the other computers act as worker nodes, and they work in parallel to ensure the real-time property of SRPITP.
The details of the main modules are described in the following sub-sections.
Fingerprint extraction module
The fingerprint extraction algorithm is one of the core algorithms in SRPITP. Compared with other image fingerprint extraction algorithms, the algorithm proposed by Mao et al. [16] has lower computational complexity and higher accuracy, so it is adopted in our model. For the readers’ convenience, the algorithm is described below. The division of an image is shown in Fig. 2.
To minimize storage space usage, a quaternion quantization method is applied to represent the image fingerprint data. After the quantization, the fingerprint of each image occupies 180 words, so only 1440 bits (8 bits per word) are required to store each fingerprint. Section 3.3 gives a detailed description of the image database.
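The storage figure can be checked with a small packing sketch. It rests on assumptions that the text above does not state: that the quantization maps each feature value to one of four levels (2 bits), that four levels are packed into each 8-bit word, and that a fingerprint therefore holds 720 values (inferred only from 1440 / 2). The thresholds and the function names `quantize4` and `pack_levels` are hypothetical.

```python
def quantize4(value: float, thresholds=(0.25, 0.5, 0.75)) -> int:
    """Map a value in [0, 1] to one of four levels (0..3).

    The thresholds are hypothetical; the source does not give them.
    """
    return sum(value >= t for t in thresholds)


def pack_levels(levels) -> bytes:
    """Pack 2-bit levels into bytes, four per byte, first level in the high bits."""
    if len(levels) % 4 != 0:
        raise ValueError("number of levels must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(levels), 4):
        byte = 0
        for level in levels[i:i + 4]:
            byte = (byte << 2) | (level & 0b11)
        out.append(byte)
    return bytes(out)
```

Under these assumptions, 720 quantized values pack into 180 bytes, which matches the 1440 bits quoted above.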
Image crawling module
In SRPITP, Scrapy is chosen to acquire images in real time from the websites determined by the system administrator. Scrapy is a fast, high-level screen scraping and web crawling framework that can crawl websites and extract structured data from their pages [28].
Scrapy is deployed on the image crawling servers, and these servers execute the crawling at regular intervals. The interval is set by the system administrator according to the rate at which new images appear on the Internet. When acquiring images from websites, the image crawling servers adopt the so-called incremental crawling policy, i.e., the servers only crawl the images newly added to the websites since the last crawl. This policy reduces the number of detected images, and hence the number of fingerprint extractions and image matching operations, which greatly improves the image protection efficiency.
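The incremental crawling policy can be illustrated without Scrapy by a small sketch that remembers which image URLs were already fetched. The class `IncrementalCrawler` and its method `new_images` are hypothetical names, and the in-memory set is an assumption; a deployed crawler would need persistent state that survives between crawl runs.

```python
class IncrementalCrawler:
    """Remembers which image URLs were already fetched (hypothetical helper)."""

    def __init__(self):
        self._seen = set()

    def new_images(self, listed_urls):
        """Return only URLs not seen in earlier crawls, preserving order."""
        fresh = []
        for url in listed_urls:
            if url not in self._seen:
                self._seen.add(url)
                fresh.append(url)
        return fresh
```

On a second crawl of the same site, only the URLs added since the previous call are returned, so only those images go on to fingerprint extraction and matching.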
After the images are downloaded, their fingerprints are extracted. The fingerprints and other related information of the images are then inserted into the detected image database for matching. The storage details are given in Section 3.3.
Database establishment and Tag classification
As shown in Fig. 1, there are two databases in SRPITP, i.e., the protected image database and the detected image database. The protected image database contains information about the images submitted by the image owners. To increase the efficiency of image matching, this database has several data tables holding different types of images, e.g., photographs of human figures and scenic or animal photos.
After a user submits an image for protection, the image is first classified and given a tag according to its type, and then it is fingerprinted. After that, the fingerprint and other necessary information of the image are stored in the data table corresponding to its type tag. The main fields of each protected image data table include image id, user id, storage address, fingerprint, protection duration, and so on.
Similarly, images obtained from the websites are analyzed, and their type tags are obtained. Their fingerprints together with other related information are stored in the detected image database. The structure of the detected image database is quite similar to that of the protected image database, except that there is only one data table in this database and it has a field to keep the tag values of images.
During matching, there is no need to traverse the whole protected image database; only the data table whose tag equals the detected image’s tag value needs to be searched. This shortens the image matching time greatly, and the experimental results are shown in Section 4.
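The effect of tag classification on the search space can be sketched as follows. This is an illustration under stated assumptions, not the matching algorithm of SRPITP: the class `TaggedProtectedDB`, the tag strings, the use of Hamming distance as the similarity measure, and the `max_distance` threshold are all hypothetical, and an in-memory dictionary stands in for the per-type data tables.

```python
from collections import defaultdict


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have equal length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


class TaggedProtectedDB:
    """One in-memory 'table' per type tag (stand-in for the data tables)."""

    def __init__(self):
        self.tables = defaultdict(list)  # tag -> [(image_id, fingerprint)]

    def insert(self, tag, image_id, fingerprint):
        self.tables[tag].append((image_id, fingerprint))

    def match(self, tag, fingerprint, max_distance=2):
        """Search only the table whose tag equals the detected image's tag."""
        return [
            image_id
            for image_id, fp in self.tables.get(tag, [])
            if hamming(fp, fingerprint) <= max_distance
        ]
```

A detected image tagged as one type is compared only against protected images of that type, so records stored under other tags are never visited.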
Image matching module
Efficient and accurate image matching is the basis of SRPITP. Since SRPITP has to handle a massive number of images in real time, the tag classification method is used to speed up image matching.
During the image matching process, image records in the detected database are handled in parallel to improve the system efficiency.
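The parallel handling of detected records can be sketched with Python's standard `concurrent.futures` module. This is a single-machine stand-in chosen so that the example is self-contained; it is not the Spark deployment of SRPITP. The function names, the record layout `(detected_id, fingerprint)`, and the use of exact fingerprint equality as the matching rule are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def match_record(record, protected):
    """Return (detected_id, protected_id) on a hit, else None.

    Exact equality is a placeholder for the real matching rule.
    """
    detected_id, fingerprint = record
    protected_id = protected.get(fingerprint)
    return (detected_id, protected_id) if protected_id is not None else None


def match_all(detected_records, protected, workers=4):
    """Handle detected records in parallel and keep only the hits."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the input order of the records.
        results = pool.map(lambda r: match_record(r, protected), detected_records)
        return [hit for hit in results if hit is not None]
```

Each detected record is matched independently of the others, which is what makes the workload straightforward to distribute across worker nodes.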