1.9. Web Crawling
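Selenium is a browser automation tool used for automated testing. It drives the mainstream GUI browsers, including Chrome, Safari, and Firefox. Install it with pip:
pip install selenium  # installs into /Library/Python/2.7/site-packages/
Selenium 2 is the result of merging the Selenium and WebDriver projects.
chromedriver download page: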
https://sites.google.com/a/chromium.org/chromedriver/downloads
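As a minimal sketch of how Selenium drives a browser once it and chromedriver are installed, the function below opens a page in Chrome and returns its title. The function name `fetch_title` is hypothetical, and the sketch assumes Chrome is installed and a matching chromedriver can be found (for example, on PATH).

```python
def fetch_title(url):
    """Open `url` in Chrome and return the page title (illustrative helper)."""
    # Imported inside the function so the sketch can be loaded
    # even on a machine where selenium is not installed.
    from selenium import webdriver

    # Assumption: Chrome and a matching chromedriver are available.
    driver = webdriver.Chrome()
    try:
        driver.get(url)      # navigate and wait for the page load event
        return driver.title  # contents of the page's <title> element
    finally:
        driver.quit()        # always close the browser, even on error
```

Calling `fetch_title` with any reachable URL should launch a Chrome window, load the page, and close the browser again. The `try`/`finally` is there so a failed page load does not leave a stray browser process behind.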
1.9.1. Scrapy: an open-source crawling framework
/html/head/title
: selects the <title> element inside the <head> element of an HTML document
/html/head/title/text()
: selects the text inside the aforementioned <title> element
//td
: selects all the <td> elements
//div[@class="mine"]
: selects all div elements which contain an attribute class="mine"
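The XPath expressions above can be tried without Scrapy. The sketch below uses Python's standard-library `xml.etree.ElementTree`, which supports only a subset of XPath, so the expressions are adapted: paths are written relative to the root `<html>` element, and `text()` is replaced by reading the `.text` attribute. The sample document is made up for illustration and is deliberately well-formed, since ElementTree is an XML parser and will reject typical real-world HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample document used only for illustration.
SAMPLE = """<html>
  <head><title>Example page</title></head>
  <body>
    <div class="mine">first</div>
    <div class="other">second</div>
    <table><tr><td>a</td><td>b</td></tr></table>
  </body>
</html>"""

root = ET.fromstring(SAMPLE)  # root is the <html> element

# /html/head/title: relative to <html>, this is head/title
title = root.find("head/title")

# /html/head/title/text(): ElementTree has no text() step, so read .text
title_text = title.text

# //td: all <td> elements anywhere below the root
cells = root.findall(".//td")

# //div[@class="mine"]: all <div> elements whose class attribute is "mine"
mine = root.findall(".//div[@class='mine']")

print(title_text)
print([cell.text for cell in cells])
print([div.text for div in mine])
```

Scrapy's own selectors accept the full expressions exactly as listed, so this adaptation is only needed when staying within the standard library.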
Lightweight crawlers: