====== POSTDOCTORAL RESEARCH ======

===== Python and SOM =====

  - **sompy**, a Python SOM module by Vahid Moosavi of CAAD
  - **somoclu**, another SOM implementation in Python: https://somoclu.readthedocs.io/en/stable/index.html
  - the SOM Java Toolbox created at TU Wien: http://www.ifs.tuwien.ac.at/dm/somtoolbox/
  - Twitter sentiment analysis with Python: https://towardsdatascience.com/another-twitter-sentiment-analysis-with-python-part-5-50b4e87d9bdd
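Before reaching for one of the libraries above, the core SOM training loop is small enough to sketch in plain numpy. Everything below (grid size, learning-rate schedule, random data) is an illustrative assumption, not taken from sompy or somoclu:

<code python>
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((200, 3))            # 200 samples, 3 features (made up)
rows, cols = 8, 8
weights = rng.random((rows, cols, 3))  # codebook: one vector per grid node

# grid coordinates, used for neighbourhood distances on the map
yy, xx = np.mgrid[0:rows, 0:cols]

epochs = 20
for epoch in range(epochs):
    lr = 0.5 * (1 - epoch / epochs)              # decaying learning rate
    sigma = 3.0 * (1 - epoch / epochs) + 0.5     # shrinking neighbourhood
    for x in data:
        # best matching unit: node whose codebook vector is closest to x
        d = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(d), d.shape)
        # Gaussian neighbourhood around the BMU, on grid coordinates
        g = np.exp(-((yy - bmu[0]) ** 2 + (xx - bmu[1]) ** 2) / (2 * sigma ** 2))
        # pull the BMU and its neighbours toward the sample
        weights += lr * g[:, :, None] * (x - weights)

print(weights.shape)  # (8, 8, 3)
</code>

The libraries add batch training, GPU/parallel backends (somoclu) and visualisation on top of essentially this loop.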

===== Scraping and mining Twitter streams =====

Following these tutorials: [[http://adilmoujahid.com/posts/2014/07/twitter-analytics/|Introduction to Text Mining using Twitter Streaming API and Python]] and [[http://mike.teczno.com/notes/streaming-data-from-twitter.html|A Beginner's Guide to Streamed Data from Twitter]]:

  - get Twitter API keys from https://apps.twitter.com
  - **scrape the tweet stream** using the [[twitter-streaming-py|Python streaming script]]
  - **mine the tweets** using the [[mine-tweets-py|Python mining script]]

A resourceful guide to Twitter text mining in Python: https://marcobonzanini.com/2015/03/23/mining-twitter-data-with-python-part-4-rugby-and-term-co-occurrences/
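The term co-occurrence part of that guide boils down to counting unordered term pairs per tweet; a stdlib-only sketch, with made-up placeholder tweets standing in for the scraped stream:

<code python>
from collections import Counter
from itertools import combinations

# placeholder tweets; in practice these come from the mining script above
tweets = [
    "rugby world cup final",
    "world cup rugby fans",
    "final whistle rugby",
]

com = Counter()
for tweet in tweets:
    terms = sorted(set(tweet.split()))   # unique terms, in a stable order
    for a, b in combinations(terms, 2):  # every unordered pair in the tweet
        com[(a, b)] += 1

print(com.most_common(3))
</code>

A real pipeline would tokenise properly (hashtags, mentions, emoticons) and drop stop words first, as the tutorial does.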

===== Scraping and mining Dezeen articles =====

  * with **scrapy**

The setup: Python 3 in a conda environment:
<code>$ conda create -n bots python=3.4             # create a virtual environment named "bots"
$ source activate bots                        # activate the environment; check which is active with: conda info --envs
$ conda install -n bots -c conda-forge scrapy # install scrapy into the named environment
</code>

Run scrapy directly from the shell:
<code>$ scrapy startproject dezeen   # start a project</code>

Detailed instructions: https://doc.scrapy.org/en/latest/intro/tutorial.html#creating-a-project
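Per the Scrapy tutorial linked above, startproject generates roughly this layout (exact files vary by Scrapy version):

<code>
dezeen/
    scrapy.cfg          # deploy configuration
    dezeen/             # the project's Python package
        __init__.py
        items.py        # item (field) definitions
        pipelines.py
        settings.py
        spiders/        # spiders go here
</code>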

Create a //spider// in the folder dezeen/dezeen/spiders/, containing a class that declares its name. This name is used to call the spider from the console:

<code>$ scrapy crawl spider_name</code>

It is also important to declare the fields that will be scraped from pages. This is done in the dezeen/items.py file, e.g. (the class is already declared when you start the project):

<code python>class DezeenItem(Item):
    title = Field()
    link = Field()
    description = Field()
</code>

These fields are later used as keys of the item dictionary (e.g. item['link']).
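Putting the two pieces together, a minimal spider sketch. The start URL and CSS selectors are assumptions about Dezeen's markup, not taken from this page; adjust them against the real HTML:

<code python>
import scrapy
from scrapy import Field, Item


class DezeenItem(Item):
    title = Field()
    link = Field()
    description = Field()


class DezeenSpider(scrapy.Spider):
    name = "dezeen"                           # invoked as: scrapy crawl dezeen
    start_urls = ["https://www.dezeen.com/"]  # assumed start page

    def parse(self, response):
        # the selectors below are guesses at the article markup
        for article in response.css("article"):
            item = DezeenItem()
            item["title"] = article.css("h3 a::text").get()
            item["link"] = article.css("h3 a::attr(href)").get()
            yield item
</code>

Run with ''scrapy crawl dezeen -o items.json'' to dump the yielded items to a file.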

====== DOCTORAL RESEARCH ======

//>>>>>>>>>>>>>>>>>>>!>!>>> // * // * // * // >>>>>>>>> // // ? // ? !! >>>>>
[[connect-or-not-ljubljana|{{http://emperors.kucjica.org/wp-content/uploads/2014/09/ida-web.jpg?300}}]]

===== Connect or Not, Bühne A, Zurich =====

[[connect-or-not-zurich|Details on hardware and software development]]
d3 js

[[workflow]]

[[http://kucjica.kucjica.org/emperors-vizi/|results]]
====== lisboa: connect or not at IST Alameda campus ======
====== other ======

[[ways-to-run-python|ways to run Python]]

[[server maintenance]]

[[network configuration]]
[[imagemagick]]

[[bash script dnld book]]