Details

Category: Utilities

Publisher: trickest

Created Date: 6/23/2021

Container: quay.io/trickest/cewl:4ad686f

Source URL: https://github.com/digininja/CeWL

Parameters

url
string
required
Command: (positional argument, no flag) - The site to spider.
meta
boolean
Command: --meta - Include meta data.
count
boolean
Command: --count - Show the count for each word found.
debug
boolean
Command: --debug - Extra debug information.
depth
string
Command: -d - Depth to spider to, default 2.
email
boolean
Command: --email - Include email addresses.
groups
boolean
Command: --groups - Return groups of words as well.
header
string
Command: --header - A header in the format name:value. Can be passed multiple times.
allowed
string
Command: --allowed - A regex pattern that a path must match to be followed.
exclude
file
Command: --exclude - A file containing a list of paths to exclude.
offsite
boolean
Command: --offsite - Let the spider visit other sites.
verbose
boolean
Command: --verbose - Verbose.
auth-pass
string
Command: --auth_pass - Authentication password.
auth-type
string
Command: --auth_type - Digest or basic.
auth-user
string
Command: --auth_user - Authentication username.
lowercase
boolean
Command: --lowercase - Lowercase all parsed words.
proxy-host
string
Command: --proxy_host - Proxy host.
proxy-port
string
Command: --proxy_port - Proxy port, default 8080.
user-agent
string
Command: --ua - User agent to send.
with-numbers
boolean
Command: --with-numbers - Accept words that contain numbers, not only words made up of letters.
proxy-password
string
Command: --proxy_password - Password for proxy, if required.
proxy-username
string
Command: --proxy_username - Username for proxy, if required.
convert-umlauts
boolean
Command: --convert-umlauts - Convert common ISO-8859-1 (Latin-1) umlauts (ä-ae, ö-oe, ü-ue, ß-ss).
max-word-length
string
Command: --max_word_length - Maximum word length, default unset.
min-word-length
string
Command: --min_word_length - Minimum word length, default 3.
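
Example

The parameter names above differ from the flags they map to (for example, user-agent becomes --ua and auth-pass becomes --auth_pass). The Python sketch below shows one way such a mapping could be turned into a command line. It is illustrative only and is not how the platform itself builds the command. The function name build_cewl_args and the executable name "cewl" are assumptions. The flags are copied from the list above, and groups is treated as a plain switch because it is listed as a boolean.

```python
import shlex

# Parameter name -> flag, copied from the parameter list above.
BOOLEAN_FLAGS = {
    "meta": "--meta",
    "count": "--count",
    "debug": "--debug",
    "email": "--email",
    "groups": "--groups",
    "offsite": "--offsite",
    "verbose": "--verbose",
    "lowercase": "--lowercase",
    "with-numbers": "--with-numbers",
    "convert-umlauts": "--convert-umlauts",
}

VALUE_FLAGS = {
    "depth": "-d",
    "header": "--header",
    "allowed": "--allowed",
    "exclude": "--exclude",
    "auth-pass": "--auth_pass",
    "auth-type": "--auth_type",
    "auth-user": "--auth_user",
    "proxy-host": "--proxy_host",
    "proxy-port": "--proxy_port",
    "user-agent": "--ua",
    "proxy-password": "--proxy_password",
    "proxy-username": "--proxy_username",
    "max-word-length": "--max_word_length",
    "min-word-length": "--min_word_length",
}


def build_cewl_args(params, executable="cewl"):
    """Build an argument list from a dict of parameter names and values.

    `executable` is an assumed name; the real entry point inside the
    container may differ.
    """
    if not params.get("url"):
        raise ValueError("url is required")

    args = [executable]
    for name, value in params.items():
        if name == "url":
            continue  # positional, appended last
        if name in BOOLEAN_FLAGS:
            if value:  # false booleans are simply omitted
                args.append(BOOLEAN_FLAGS[name])
        elif name in VALUE_FLAGS:
            # A list or tuple repeats the flag, e.g. several --header values.
            values = value if isinstance(value, (list, tuple)) else [value]
            for item in values:
                args.extend([VALUE_FLAGS[name], str(item)])
        else:
            raise KeyError(f"unknown parameter: {name}")

    args.append(params["url"])
    return args


if __name__ == "__main__":
    example = {
        "url": "https://example.com",
        "depth": "3",
        "email": True,
        "header": ["X-Test:1", "Accept-Language:en"],
    }
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(build_cewl_args(example)))
```

Returning a list rather than a single string means the arguments can be passed directly to a process launcher without shell quoting problems. The string form is produced only for display.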