urldedupe
urldedupe is a tool that takes a list of URLs and quickly returns a deduplicated list of unique URL and query string combinations. This is useful to ensure you don’t have a URL list with hundreds of entries that repeat the same parameters with differing query string (qs) values.
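The idea of deduplicating on the "URL and query string combination" can be sketched in Python as follows. This is only an illustration, not urldedupe's actual implementation: the helper names `dedupe_key` and `dedupe` are hypothetical, and the sketch assumes that two URLs count as duplicates when they share scheme, host, path, and the same set of query parameter names, regardless of the parameter values.

```python
from urllib.parse import urlsplit, parse_qsl


def dedupe_key(url):
    """Hypothetical key: scheme, host, path, and sorted query parameter names."""
    parts = urlsplit(url)
    # Keep only the parameter names; the values are ignored on purpose.
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    return (parts.scheme, parts.netloc, parts.path, tuple(names))


def dedupe(urls):
    """Return the first URL seen for each key, preserving input order."""
    seen = set()
    unique = []
    for url in urls:
        key = dedupe_key(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


urls = [
    "https://example.com/api?id=1&x=2",
    "https://example.com/api?id=5&x=9",  # same parameters, different values
    "https://example.com/api?id=1",      # different parameter set, kept
]
print(dedupe(urls))
```

Under these assumptions the second URL is dropped because it differs from the first only in its parameter values.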
Details
Category: Utilities
Publisher: trickest
Created Date: 6/23/2021
Container: quay.io/trickest/urldedupe:cc9b25a
Source URL: https://github.com/ameenmaali/urldedupe
Parameters
mode
string
Command:
-m
- The modes/filters to enable (one or more, comma-separated). Default is none; the available options are the other flags (e.g. --mode r,s,qs,ne)
urls-file
file
required
Command:
-u
- File containing URLs
regex-parse
boolean
Command:
-r
- Use regex parsing. This is significantly slower than normal parsing, but may be more thorough or accurate
no-extensions
boolean
Command:
-ne
- Do not include URLs if they have an extension (e.g. .png, .jpg, .woff, .js, .html)
query-strings-only
boolean
Command:
-qs
- Only include URLs if they have query strings
remove-similar-urls
boolean
Command:
-s
- Remove similar URLs (based on integers and image/font files), e.g. /api/user/1 and /api/user/2 are deduplicated
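The integer-based part of similar-URL removal can be sketched as below. This is an assumption about how such a filter might work, not the tool's own code: the names `similar_key` and `remove_similar` and the `{int}` placeholder are hypothetical, and the image/font file handling mentioned above is not covered.

```python
from urllib.parse import urlsplit


def similar_key(url):
    """Hypothetical key: host plus path with all-digit segments replaced by a placeholder."""
    parts = urlsplit(url)
    segments = ["{int}" if seg.isdigit() else seg for seg in parts.path.split("/")]
    return (parts.netloc, "/".join(segments))


def remove_similar(urls):
    """Keep the first URL for each normalized path, preserving input order."""
    seen = set()
    kept = []
    for url in urls:
        key = similar_key(url)
        if key not in seen:
            seen.add(key)
            kept.append(url)
    return kept


urls = [
    "https://example.com/api/user/1",
    "https://example.com/api/user/2",  # differs only by an integer segment
    "https://example.com/api/post/1",
]
print(remove_similar(urls))
```

With this normalization, /api/user/1 and /api/user/2 map to the same key, so only the first is kept.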