Data scraping with Google Sheets and Tabula

Data scraping (pulling tables of stats and other information from web pages, PDFs and other documents) is a valuable skill for journalists. In this training video and its hands-on exercises, you’ll learn how to scrape data tables out of web pages and PDFs. We’ll use a Google Sheets formula and a browser plug-in to scrape web pages, then we’ll cover how to scrape native (text-based, not scanned) PDFs with a free tool called Tabula. Before starting the video, install Tabula on your computer and open the exercises page link. You’ll be a data-scraping ninja in no time!

Link to exercises: Click here

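Scraping formula: =IMPORTHTML("URL", "table", 1)

Replace URL with the address of the page you want to scrape, and type the formula with straight quotation marks, since Google Sheets rejects curly ones. The last number picks which table on the page to import, counting from 1.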
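For readers curious about what the formula's three arguments do, here is a rough Python sketch of the same idea: find every table in a page's HTML and return the one at a given position, counting from 1 as Google Sheets does. It is only an illustration, not how Google Sheets works internally. The names TableExtractor, import_html_table and SAMPLE_PAGE are made up for this example, the sample page is invented data, and the sketch assumes tidy HTML with no tables nested inside other tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text.

    Hypothetical helper for illustration; assumes well-formed,
    non-nested tables with explicit closing tags.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # tables found so far, in page order
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while we are inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def import_html_table(html, index=1):
    """Return the index-th table (counting from 1) found in an HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    if not 1 <= index <= len(parser.tables):
        raise IndexError(
            f"page has {len(parser.tables)} table(s); index {index} is out of range"
        )
    return parser.tables[index - 1]


# Invented sample page with two small tables.
SAMPLE_PAGE = """
<table><tr><th>City</th><th>Population</th></tr>
<tr><td>Springfield</td><td>1000</td></tr></table>
<table><tr><td>a</td><td>b</td></tr></table>
"""

if __name__ == "__main__":
    for row in import_html_table(SAMPLE_PAGE, 1):
        print(row)
```

Calling import_html_table(SAMPLE_PAGE, 2) would return the second table instead, which mirrors changing the last number in the spreadsheet formula when a page holds more than one table.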

Download the free Tabula PDF-scraping software: Click here
More data scraping resources on Journalist’s Toolbox: Click here
