Forum Discussion
Pedro_Haoa
Feb 21, 2018
Ret. Employee
To expand the list:
- cURL – command line tool and library for transferring (including getting) data with URLs.
- Data Toolbar – web scraping add-on for Internet Explorer, Mozilla Firefox, and Google Chrome web browsers.
- Diffbot – uses computer vision and machine learning to automatically extract data from web pages.
- Heritrix – web crawler designed for web archiving; it fetches pages in bulk.
- HtmlUnit – headless browser that can be used for retrieving web pages, web scraping, and more.
- iMacros – a browser extension to record, code, share, and replay browser automation (JavaScript).
- Kantu – uses screenshots and OCR for scraping.
- Selenium – a portable software-testing framework for web applications.
- Jaxer
- Nokogiri
- OutWit Hub – Web scraping application.
- Watir
- Wget – computer program that retrieves content from web servers.
- WSO2 Mashup Server
- Yahoo! Query Language (YQL)
- Data Scraping Studio – standalone Windows desktop software.
- Greasemonkey
- Node.js
- PhantomJS – scripted, headless browser used for automating web page interaction.
- jQuery
- Agenty – SaaS solution, paid versions available.
- Apify – Web scraping and automation platform, free and paid versions available.
- dexi.io – SaaS solution, free and paid versions available.
- diggernaut.com – Turn websites into datasets, free and paid subscriptions available.
- fScraper – Facebook-friendly scraper, SaaS solution, free and paid versions available.
- Import.io – SaaS solution.
- Listly.io – HTML to Excel in seconds, free SaaS service.
- Mozenda – SaaS solution; a web-based platform for web data extraction.
- uScraper – SaaS service, free and paid versions available.
- Scrapy
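For anyone curious what the simplest tools on this list do under the hood (fetch a page, then pull data out of the HTML), here is a rough sketch in Python using only the standard library. It is not how any of the tools above is implemented; the names `LinkCollector`, `extract_links`, and `fetch` are made up for illustration, and `example.com` is a placeholder.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in an HTML string, as absolute URLs."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def fetch(url, timeout=10):
    """Download a page and decode it (roughly what cURL or Wget do first)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Static sample so the sketch runs without network access.
    page = '<p><a href="/docs">Docs</a> <a href="https://example.org/x">X</a></p>'
    print(extract_links(page, "https://example.com/"))
```

To try it against a live page, something like `extract_links(fetch(url), url)` should work, though real sites often need JavaScript rendering, which is where the headless-browser tools in the list come in.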
Cheers!