WebScraper 4.7.2 – Scan and output website data as CSV or JSON.

November 7, 2018

WebScraper uses the Integrity v6 engine to quickly scan a website and can currently output the data as CSV or JSON.

Easy to scan a site – just enter the starting URL and press “Go”
Easy to export – choose the columns you want
Plenty of extraction options, including HTML elements with certain classes or IDs, regular expressions, or the entire content in a number of formats (HTML, plain text, Markdown)
Configuration of various limits on the crawl and the output file size
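
To make the extraction and export ideas above concrete, here is a minimal Python sketch of the same kind of workflow: pull the text of elements with a given class, pull a regular-expression match, then write only the chosen columns as CSV or JSON. This is not WebScraper's code or file format. Every name in it (ClassTextExtractor, extract_row, to_csv, to_json, the column names) is hypothetical, and it assumes the regular expression contains one capture group.

```python
import csv
import io
import json
import re
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ClassTextExtractor(HTMLParser):
    """Collects text found inside elements that carry a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # > 0 while inside a matching element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.class_name in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def extract_row(url, html, class_name, pattern):
    """Build one output row; `pattern` is assumed to have one capture group."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    match = re.search(pattern, html)
    return {
        "url": url,
        "class_text": " ".join(parser.chunks),
        "regex_match": match.group(1) if match else "",
    }


def to_csv(rows, columns):
    """Write only the chosen columns, in the chosen order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_json(rows, columns):
    return json.dumps([{c: r.get(c, "") for c in columns} for r in rows], indent=2)


if __name__ == "__main__":
    page = '<div><h1 class="title">Blue Widget</h1><span>Price: $19.99</span></div>'
    row = extract_row("https://example.com/p/1", page, "title", r"\$(\d+\.\d{2})")
    print(to_csv([row], ["url", "class_text", "regex_match"]))
    print(to_json([row], ["url", "regex_match"]))
```

Choosing columns at export time, rather than at extraction time, means the same scan results can be written out in different shapes without rescanning.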

What’s New

Version 4.7.2:

Small but important enhancement to whitelisting rules. If a page meets the 'output filter' rules (meaning that it's an 'information page' or 'detail page'), it will be included in the crawl regardless of the scan blacklist / whitelist rules.
This makes it easier to set up WebScraper when you want to limit the scan to search results or a certain section of the site but still gather information from detail pages that don't meet those scan rules.
Some updates to the context help and other small fixes / enhancements.
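
The inclusion rule described for 4.7.2 can be sketched as a small decision function. This is only an illustration of the logic as stated above, not WebScraper's implementation. The names (CrawlRules, passes_scan_rules, meets_output_filter, include_in_crawl) are hypothetical, and the sketch assumes that all rules are regular expressions matched against the URL, which may not be how the app evaluates them.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CrawlRules:
    # Hypothetical rule sets; each entry is a regular expression tested against the URL.
    whitelist: list = field(default_factory=list)      # if non-empty, URL must match at least one
    blacklist: list = field(default_factory=list)      # URL must match none of these
    output_filter: list = field(default_factory=list)  # patterns that identify detail pages


def passes_scan_rules(url, rules):
    """True if the URL is allowed by the scan blacklist / whitelist alone."""
    if any(re.search(p, url) for p in rules.blacklist):
        return False
    if rules.whitelist and not any(re.search(p, url) for p in rules.whitelist):
        return False
    return True


def meets_output_filter(url, rules):
    """True if the URL looks like an information / detail page."""
    return any(re.search(p, url) for p in rules.output_filter)


def include_in_crawl(url, rules):
    # A page that meets the output filter is included even when the
    # scan rules alone would have excluded it.
    return meets_output_filter(url, rules) or passes_scan_rules(url, rules)


if __name__ == "__main__":
    rules = CrawlRules(whitelist=[r"/search"], output_filter=[r"/product/\d+"])
    for url in (
        "https://example.com/search?q=widgets",
        "https://example.com/product/42",
        "https://example.com/about",
    ):
        print(url, include_in_crawl(url, rules))
```

With these example rules, the scan is limited to search result pages, yet a product page linked from those results is still crawled because it meets the output filter.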

Compatibility

OS X 10.8 or later, 64-bit processor

Tags: Internet WebScraper
