Differences

This shows you the differences between two versions of the page.

start [2017/10/18 13:02]
zoza [Scraping and mining Dezeen articles]
start [2018/04/26 13:46]
zoza [Python and SOM]
Line 1: Line 1:
 ====== POSTDOCTORAL RESEARCH ====== ====== POSTDOCTORAL RESEARCH ======
 +
 +===== Python and SOM =====
 +
 + - Python SOM module by Vahid Moosavi of CAAD: **sompy**
 +
 + - another SOM Python implementation, **somoclu**: https://somoclu.readthedocs.io/en/stable/index.html
 +
 + - SOM Java Toolbox created at TU Wien: http://www.ifs.tuwien.ac.at/dm/somtoolbox/
 +
 + - Twitter sentiment analysis with Python: https://towardsdatascience.com/another-twitter-sentiment-analysis-with-python-part-5-50b4e87d9bdd
  
 ===== Scraping and mining twitter streams ===== ===== Scraping and mining twitter streams =====
Line 25: Line 35:
  
 Detailed instructions here: https://doc.scrapy.org/en/latest/intro/tutorial.html#creating-a-project Detailed instructions here: https://doc.scrapy.org/en/latest/intro/tutorial.html#creating-a-project
 +
 +Create a _spider_ file in the folder dezeen/dezeen/spiders/ and define in it a class that declares its name. This name is used to call the spider from the console:
 +
 +<code>$ scrapy crawl spider_name</code>
 +
 +It is also important to declare the page fields that will be scraped. This is done in the dezeen/items.py file, for example as follows (the class is already declared when you start the project):
 +
 +<code python>from scrapy import Item, Field
 +
 +class DezeenItem(Item):
 +    # One Field per value to extract from each scraped page
 +    title = Field()
 +    link = Field()
 +    description = Field()
 +</code>
 +
 +These fields are later accessed as keys of the item dictionary (e.g. item['link']).
  
 ====== DOCTORAL RESEARCH ====== ====== DOCTORAL RESEARCH ======
start.txt · Last modified: 2018/04/26 13:46 by zoza