Version: v2.0

Website scraping

This guide describes how to scrape data for import from web pages.

Fetching

Use the faraday gem (https://github.com/lostisland/faraday) to fetch pages. It is the basis of many third-party API libraries, provides a common interface over several HTTP client libraries (net_http by default), and may already be installed.

Useful middleware gems for scraping include:

Parsing

Use the Nokogiri gem (https://nokogiri.org/) to parse HTML. You can extract data with XPath or CSS selectors.