Table 1 PDF structural path consolidation rules

From: Hidost: a static machine-learning-based detector of malicious files

No.  Search regular expression    Substitute regular expression
1.  /Resources/(ExtGState|ColorSpace|Pattern|Shading|XObject|Font|Properties|Para)/[^/]+    /Resources/\1/Name
2.  ^Pages/(Kids/|Parent/)*(Kids$|Kids/|Parent/|Parent$)    Pages/
3.  /(Kids/|Parent/)*(Kids$|Kids/|Parent/|Parent$)    /
4.  (Prev/|Next/|First/|Last/)+    <empty string>
5.  ^Names/(Dests|AP|JavaScript|Pages|Templates|IDS|URLS|EmbeddedFiles|AlternatePresentations|Renditions)/(Kids/|Parent/)*Names    Names/\1/Names
6.  ^StructTreeRoot/IDTree/(Kids/)*Names    StructTreeRoot/IDTree/Names
7.  ^(StructTreeRoot/ParentTree|PageLabels)/(Kids/|Parent/)+(Nums|Limits)    \1/\3
8.  ^StructTreeRoot/ParentTree/Nums/(K/|P/)+    StructTreeRoot/ParentTree/Nums/
9.  ^(StructTreeRoot|Outlines/SE)/(RoleMap|ClassMap)/[^/]+    \1/\2/Name
10.  ^(StructTreeRoot|Outlines/SE)/(K/|P/)*    \1/
11.  ^(Extensions|Dests)/[^/]+    \1/Name
12.  Font/([^/]+)/CharProcs/[^/]+    Font/\1/CharProcs/Name
13.  ^(AcroForm/(Fields/|C0/)?DR/)(ExtGState|ColorSpace|Pattern|Shading|XObject|Font|Properties)/[^/]+    \1\3/Name
14.  /AP/(D|N)/[^/]+    /AP/\1/Name
15.  Threads/F/(V/|N/)*    Threads/F
16.  ^(StructTreeRoot|Outlines/SE)/Info/[^/]+    \1/Info/Name
17.  ColorSpace/([^/]+)/Colorants/[^/]+    ColorSpace/\1/Colorants/Name
18.  ColorSpace/Colorants/[^/]+    ColorSpace/Colorants/Name
19.  Collection/Schema/[^/]+    Collection/Schema/Name
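
The rules in Table 1 can be read as an ordered list of regular-expression substitutions. The following is a minimal Python sketch of that reading, not the Hidost implementation. It assumes, beyond what the table states, that each rule is applied once, in table order, to a structural path written as slash-separated names with no leading slash. The names `RULES` and `consolidate` are hypothetical and chosen only for illustration.

```python
import re

# (search, substitute) pairs transcribed from Table 1, in table order.
# "<empty string>" in rule 4 is written as "".
_RES = "ExtGState|ColorSpace|Pattern|Shading|XObject|Font|Properties"
RULES = [
    (r"/Resources/(" + _RES + r"|Para)/[^/]+", r"/Resources/\1/Name"),
    (r"^Pages/(Kids/|Parent/)*(Kids$|Kids/|Parent/|Parent$)", r"Pages/"),
    (r"/(Kids/|Parent/)*(Kids$|Kids/|Parent/|Parent$)", r"/"),
    (r"(Prev/|Next/|First/|Last/)+", r""),
    (r"^Names/(Dests|AP|JavaScript|Pages|Templates|IDS|URLS|EmbeddedFiles"
     r"|AlternatePresentations|Renditions)/(Kids/|Parent/)*Names",
     r"Names/\1/Names"),
    (r"^StructTreeRoot/IDTree/(Kids/)*Names", r"StructTreeRoot/IDTree/Names"),
    (r"^(StructTreeRoot/ParentTree|PageLabels)/(Kids/|Parent/)+(Nums|Limits)",
     r"\1/\3"),
    (r"^StructTreeRoot/ParentTree/Nums/(K/|P/)+",
     r"StructTreeRoot/ParentTree/Nums/"),
    (r"^(StructTreeRoot|Outlines/SE)/(RoleMap|ClassMap)/[^/]+", r"\1/\2/Name"),
    (r"^(StructTreeRoot|Outlines/SE)/(K/|P/)*", r"\1/"),
    (r"^(Extensions|Dests)/[^/]+", r"\1/Name"),
    (r"Font/([^/]+)/CharProcs/[^/]+", r"Font/\1/CharProcs/Name"),
    (r"^(AcroForm/(Fields/|C0/)?DR/)(" + _RES + r")/[^/]+", r"\1\3/Name"),
    (r"/AP/(D|N)/[^/]+", r"/AP/\1/Name"),
    (r"Threads/F/(V/|N/)*", r"Threads/F"),
    (r"^(StructTreeRoot|Outlines/SE)/Info/[^/]+", r"\1/Info/Name"),
    (r"ColorSpace/([^/]+)/Colorants/[^/]+", r"ColorSpace/\1/Colorants/Name"),
    (r"ColorSpace/Colorants/[^/]+", r"ColorSpace/Colorants/Name"),
    (r"Collection/Schema/[^/]+", r"Collection/Schema/Name"),
]

# Compile once so repeated calls do not re-parse the patterns.
_COMPILED = [(re.compile(search), subst) for search, subst in RULES]


def consolidate(path: str) -> str:
    """Apply every rule once, in table order (an assumption of this sketch)."""
    for pattern, subst in _COMPILED:
        path = pattern.sub(subst, path)
    return path


if __name__ == "__main__":
    # Rule 1 anonymizes the font name; rule 2 collapses the page tree.
    print(consolidate("Pages/Kids/Kids/Resources/Font/F1/BaseFont"))
    # -> Pages/Resources/Font/Name/BaseFont

    # Rule 4 removes the linked-list links between outline items.
    print(consolidate("Outlines/First/Next/Next/Title"))
    # -> Outlines/Title

    # Rule 14 anonymizes the appearance-state name.
    print(consolidate("AcroForm/Fields/AP/N/On/Length"))
    # -> AcroForm/Fields/AP/N/Name/Length
```

Under this reading, paths that differ only in tree depth or in user-chosen names map to the same consolidated path, which appears to be the purpose of the rules. Note that rule order matters in the sketch: for example, rule 3 can collapse a `Kids/` run before rule 5 ever sees it.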