AE Find Weblinks
Changelog
Added
- New option: -StripRegexBeforeEvaluation
- New option: -LinkEvaluationStripRegex <regex>
- Allows removing a regex-matched part of extracted links before:
  - search matching
  - exclude matching
  - blacklist checks
  - deduplication
  - output writing
Example use case: strip a trailing numeric ID such as /12345 before evaluating or outputting links.
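The use case above can be sketched in a few lines. This is an illustration in Python, not the tool's own code: the pattern `/\d+$` and the helper names `evaluated_url` and `dedupe_by_evaluated_url` are assumptions chosen for the example.

```python
import re

# Hypothetical pattern for the use case above: a trailing "/<digits>" segment.
STRIP_PATTERN = re.compile(r"/\d+$")


def evaluated_url(link, pattern=STRIP_PATTERN):
    """Return the link with the regex-matched part removed."""
    return pattern.sub("", link)


def dedupe_by_evaluated_url(links):
    """Return each distinct evaluated URL once, in first-seen order."""
    seen = set()
    result = []
    for link in links:
        key = evaluated_url(link)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


links = [
    "https://example.com/article/12345",
    "https://example.com/article/67890",
    "https://example.com/about",
]

# The two article links differ only in their numeric ID, so they collapse
# into a single entry once the ID is stripped.
unique = dedupe_by_evaluated_url(links)
```

Because the comparison key is the stripped value, links that differ only in the stripped part count as duplicates. That is the effect the changelog describes for deduplication and output writing.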
Updated
- Help text now documents the new stripping options.
- Interactive mode now asks whether to enable regex stripping and prompts for the regex.
- Generated command/config output now includes the new options.
- Resume/run signature now includes the new stripping settings.
- Matching, excluding, blacklist checks, deduplication, and output writing now use the stripped (evaluated) URL value when stripping is enabled.
- Crawl boundary logic can also use the evaluated URL value.
- Parallel mode was adjusted so workers still only fetch/extract; stripping is applied centrally when writing results.
- Final summary now prints whether evaluation regex stripping was enabled.
Safety / Validation
- Supplying -LinkEvaluationStripRegex without the switch auto-enables stripping with a warning.
- Using -StripRegexBeforeEvaluation without a regex throws an error.
- Invalid regex patterns throw a clear error.
- Regex timeout is handled gracefully: it warns once and continues with the original link.
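The first three rules above might look roughly like this as validation logic. It is a hedged sketch in Python, not the tool's implementation: the function name `resolve_strip_settings` and its parameters are hypothetical. The timeout rule is not modeled, because Python's standard `re` module has no match timeout.

```python
import re
import warnings
from typing import Optional


def resolve_strip_settings(strip_enabled: bool, strip_regex: Optional[str]):
    """Return (enabled, compiled_pattern) after applying the validation rules."""
    # A regex without the switch: enable stripping, but warn about it.
    if strip_regex and not strip_enabled:
        warnings.warn("Regex supplied without the switch; enabling stripping.")
        strip_enabled = True

    # The switch without a regex: nothing to strip with, so fail early.
    if strip_enabled and not strip_regex:
        raise ValueError("Stripping was enabled but no regex was supplied.")

    if not strip_enabled:
        return False, None

    # An invalid pattern: surface a clear error instead of a raw traceback.
    try:
        pattern = re.compile(strip_regex)
    except re.error as exc:
        raise ValueError(f"Invalid strip regex {strip_regex!r}: {exc}") from exc

    return True, pattern
```

Resolving both settings in one place before the run starts means a bad combination fails immediately, not partway through a crawl.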
Possible Gotcha
Old resume progress files may not match the new run signature, because the signature now includes the new stripping-related settings.
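To see why older progress files can stop matching, consider a signature computed as a hash over all run settings. The changelog does not say how the tool builds its signature, so the function `run_signature` and the setting names below are purely hypothetical.

```python
import hashlib
import json


def run_signature(settings):
    """Hash the settings in a stable key order so equal settings hash equally."""
    canonical = json.dumps(settings, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Settings as an older version might have recorded them (hypothetical keys).
old_settings = {"start_url": "https://example.com", "search": "article"}

# The same run after the update: the stripping settings are now part of the
# input, even when stripping is disabled.
new_settings = dict(old_settings, strip_enabled=False, strip_regex=None)

signatures_match = run_signature(old_settings) == run_signature(new_settings)
```

Under this assumption the two signatures differ even though the crawl is otherwise configured the same way, so a progress file from the older version would not be recognized.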