Introduction

Scrapy

Installing Scrapy

conda install scrapy

conda install lxml=3.8.0

Finding XPath routes

Next page

RSS feed

Parsing the feed
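
A minimal sketch of this step, assuming the feed is plain RSS 2.0 and has already been downloaded as text. The sample feed, its URLs, and the helper name `parse_feed` are hypothetical and used only for illustration; the standard-library parser is one possible choice, not necessarily the one used here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real one would be fetched from the site.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>First post</title>
      <link>https://example.com/2017/01/first-post/</link>
      <pubDate>Mon, 02 Jan 2017 10:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/2017/02/second-post/</link>
      <pubDate>Wed, 01 Feb 2017 10:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return one dict per <item> with its title, link and publication date."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iterfind("./channel/item"):
        entries.append({
            # findtext returns the default when the child element is missing
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return entries


entries = parse_feed(SAMPLE_RSS)
for entry in entries:
    print(entry["published"], entry["link"])
```

A feed usually lists only the most recent posts, so it may not cover the whole site.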

Looking at the sitemap.xml

First, robots.txt
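
One way to read a robots.txt file is the standard-library `urllib.robotparser` module. The sketch below assumes the file content is already available as text; the sample rules and the helper name `read_robots` are hypothetical, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real file lives at <site>/robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
"""


def read_robots(text):
    """Parse robots.txt text and return the configured parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = read_robots(SAMPLE_ROBOTS)

# site_maps() returns None when no Sitemap line is present.
sitemaps = parser.site_maps() or []
print(sitemaps)

# Check whether a generic crawler may fetch a given URL.
print(parser.can_fetch("*", "https://example.com/blog/"))
print(parser.can_fetch("*", "https://example.com/private/page"))
```

The `Sitemap` entries found here are the natural starting point for inspecting the sitemaps.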

Inspecting the sitemaps

Capturing the URLs
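
A minimal sketch of this step, assuming the sitemap follows the sitemaps.org 0.9 schema and has already been downloaded as text. The sample document and the helper name `capture_urls` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
  </url>
  <url>
    <loc>https://example.com/2017/01/first-post/</loc>
    <lastmod>2017-01-02</lastmod>
  </url>
</urlset>"""


def capture_urls(xml_text):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text  # skip empty <loc/> elements
    ]


urls = capture_urls(SAMPLE_SITEMAP)
print(urls)
```

Because the search matches `<loc>` at any depth, the same helper also works on a sitemap index file, where each `<loc>` points to another sitemap rather than to a page.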

Splitting the URLs
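
A possible sketch of this step, assuming the goal is to break each captured URL into its host and path segments so that pages can be grouped (for example by year or category). The helper name `split_url` and the sample URL are hypothetical.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into its host and the non-empty segments of its path."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    return {"host": parts.netloc, "segments": segments}


result = split_url("https://example.com/2017/01/first-post/")
print(result["host"], result["segments"])
```

Filtering out empty segments keeps leading and trailing slashes from producing blank entries.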

Wayback Machine

Installing Wayback Machine

where wayback_machine_downloader

Downloading the URLs

Parsing

Analyzing the data