Differences

This shows you the differences between two versions of the page.

--- start [2017/10/18 13:02]
zoza [Scraping and mining Dezeen articles]
+++ start [2017/10/27 08:25]
zoza [Scraping and mining Dezeen articles]
@@ Line 25: / Line 25: @@
 Detailed instructions here: https://doc.scrapy.org/en/latest/intro/tutorial.html#creating-a-project
+Create a _spider_ in the folder dezeen/dezeen/spiders/. In it, define a class that declares the spider's name. This name is used to call the spider from the console:
+<code>$ scrapy crawl spider_name</code>
+It is also important to declare the fields that will be scraped from each page. This is done in the dezeen/items.py file, for example as follows (the class itself is already declared when you start the project):
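+<code python>
+from scrapy import Item, Field
+
+class DezeenItem(Item):
+    # One Field() per piece of data to collect from an article page
+    title = Field()
+    link = Field()
+    description = Field()
+</code>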
+These fields are later used as keys of the item dictionary (e.g. item['link']).
 ====== DOCTORAL RESEARCH ======

emperor's new architecture research