Is it Phoenix for the win?

Izzet Guild

TLDR I won the European Modern Series Qualifier at Kaboom in Zurich with this Izzet Phoenix deck

I took this article by Javier Dominguez as a good starting point: link to Hareruya, and adjusted the deck to the meta accordingly

One interesting conclusion I perceived was how little of a consensus there was between different players that clearly knew their decks very well, which speaks about the depth of the strategy.

Javier Dominguez

Shatterstorm also works against Hardened Scales if you buy a little time with Bolts and the like.

I simply swapped it in for Abrade: Abrade was good for its flexibility, especially against Humans, but we have few Humans players here anyway.

There were 16 players, with a cut to top 4.

In the first round I beat Eldrazi Tron and badly tilted my opponent: I won game one through Chalice with two Thing in the Ice, because the counters come off anyway, and I even sneaked one spell through Chalice :))) He got all puffed up and stopped talking to me.

Then I played against Jeskai Saheeli Guardian combo. Lost.

Then against Spirits, won.

Then UW Control: I should have been winning, and Aria of Flame carries, but it ended in a draw because time ran out.

Round five was the same list as in round two. I had noticed that these lists don't respect Blood Moon, and in game three my opponent was sitting on two shock lands.. gg

I won the semifinal against Jund, Aria really carries! Which is good, because my opponent kept eating everything of mine with his Scooze or putting out Leyline of the Void (but luckily his Tarmogoyf was tiny then)

And in the final, my round-one opponent again! After losing to me he beat everyone else and was above me in the standings, so it turns out I barely squeezed into fourth place thanks to him. Once again he went with his chalice on 1, chalice on 2, Karn, the Great Creator, fetch Lattice, no spells for you sir.. but luckily in game three I started swinging with two Phoenixes on turn two, added a Bolt on turn four, and took it down.

Illustration copyright: https://www.deviantart.com/abstractendeavours/art/Izzet-Guild-783909311

A feuilleton about #соскидокторанаук ("nipples of a doctor of science")

YC: And anyway, my nipples are sensitive
YC: And here you are with your yachts and villas

VR: nipples???

YC: Yep
YC: They rub against my T-shirt and hurt terribly

VR: you need a bra ;)))
YC: I bought a special cream. I apply it. Then they don't hurt.
YC: #maleproblems
YC: But the catch is that it's a preventive cream
YC: So what, do I have to apply it forever now?
YC: Maybe they wouldn't have hurt today anyway
YC: And when it runs out, buy it again?
KG: That's marketing for you
YC: And what if the nipples get so used to it that, sport or no sport, the nipples want cream! Can't do without cream
YC: Such are my thoughts
💭
VR: oh gods
YC: #соскидокторанаук
YC: Cool hashtag, I reckon
YC: Need to increase awareness of male problems with nipples
YC: Both men and women have nipples but why do we never hear about male sensitive nipples??? Right! Conspiracy of corporations!
YC: Tomorrow I'll go out on a one-man picket
YC: I'll show everyone my nipples
YC: Before and after the cream. Let them see for themselves HOW important this is for nipples
VR: 😂😂😂😂😂😂
YC: Stop hushing up men's problems
KG: 🤣🤣🤣
KG: Oh well.. meninists and feminists
YC: Well, why not. And since my nipples are so sensitive, maybe it's an erogenous zone altogether? Nobody has ever caressed my nipples, so how would I know? And how many more men may be unaware of this aspect of their sexuality? What do you mean, hairy? I can shave! There are all sorts of creams, the hair will fall off by itself! Oh! A business idea: a two-in-one cream that removes hair and moisturizes nipples at the same time!!! Bombshell 💣
KG: Ahahahah 😂😂😂
KG: Here come the business ideas
YC: This may explain why so many more men than women walk around topless in this world. Just yesterday I was walking home and in Schlieren a guy was walking topless. And now I understand him! Nipples rub against the T-shirt, life is misery, so he goes topless

More happy men means more productivity and more GDP. Moist nipples in every home!

Nine Inch Nails rocks

And by the way, Maynard James Keenan played a concert with A Perfect Circle last year and it was awesome; this year he visited Europe with the cult band Tool and it was a total blast. I listened to him so much on YouTube that I gradually got into his newest project, Puscifer: alt rock, trip hop, experimental, a bit of everything mixed in:

And of course the views of the Grand Canyon are mesmerizing, and you realize that we didn't really visit it properly and saw nothing. We just circled around it in a helicopter, and that hardly counts.

But the coolest discovery for me was present-day Nine Inch Nails. When I had a side job at the Institute of Materials Science, I remember a researcher named Anton who listened to all kinds of extreme music. I recall borrowing various extreme-music compilations from him, back then still on discs: VladExtreme, for example, with experimental music from Vladivostok. For the most part it was all too heavy for me; only later did I start listening to Rammstein, and then I switched to metal altogether. So when I stumbled upon this, I figured that by now I listen to far more extreme music. Imagine my surprise when I watched this concert of theirs:

There are punchy tracks with a galloping bass guitar (minute six), industrial rock (minute twenty-one), some kind of dance-jazz fusion (minute fifteen), and essentially industrial metal à la Rammstein (minutes forty and fifty-two). Not to mention the coolest lighting I have ever seen.

Rocks

On the subject of things that rock, by the way, Pelican have released a record after six years of silence. I don’t know if any mescaline was used to compose it, but it’s less patient than usual, filled with guitar solo duels, the deepest riffs and intense rhythm.

Re mescaline et cetera, I’ve visited a whole exhibition in Bern about ecstatic experiences: https://www.zpk.org/en/exhibitions/current/ecstasy-1794.html

Also, I haven’t been following Lacuna Coil for about a decade now, and this is my earworm, or Ohrwurm as Germans would put it, of the week:

I only compete vs myself of yesterday

Running faster than 5 min per km.. I haven’t done this since 2014 at Zurich City Run.. but recently on the running track of a nearby school, after a warmup round, I’ve reached 4’12” and 4’16”: my new highs since ACL surgery. Stoked!

Managed to get atop the Uetliberg right before the storm yesterday in under 52 minutes. Previous times were 1h03 and 1h06 🙈 a whopping 18% improvement? Tbh I thought my heart would explode 

475 meters of elevation from Urdorf

EDIT: today I did it in under 49 minutes..

Affordable & Safe Housing in Seattle, WA

Photo by Meriç Dağlı on Unsplash

Affordable & Safe Housing in Seattle

Introduction

This report is for the final course of the IBM Data Science Specialization hosted on the Coursera platform. The project lets learners be as creative as they want: pick a city, use the location data available via the FourSquare API to compare its neighborhoods, and come up with a problem that can be solved using that data.

In our problem statement, a group of athletes is planning to live in Seattle for several weeks. They need to find several flats, preferably close to one another to make collective workouts easier. Additional preferences include a park nearby and low crime in the district, because they plan to be outside very often (jogging in the evenings, etc.). The apartments should also be affordable, but our clients value low crime more highly than price.

The target audience for this report is:

  • potential buyers, who can roughly estimate which neighborhoods are more desirable (the models used for the analysis should be easy to adjust),
  • real estate builders and planners, who can decide which kinds of neighborhoods are more attractive on the market to maximize the selling price of newly built flats,
  • of course, this course’s instructors and learners, who will grade my project,
  • anyone who is curious how Python can be applied to crawl web pages; parse CSV or JSON files; create visualizations such as scatter plots, heat maps and density plots using matplotlib and seaborn, and map visualizations using Folium; and process data using lists, dictionaries and pandas DataFrames.

All the code with data analysis is available on my GitHub page.

Data Description

Seattle’s neighborhoods were chosen as the observation target for the following reasons:

  • a lot of statistical data is freely available for the USA,
  • diversity of neighborhoods: Seattle is a rather large city with very different districts,
  • availability of geolocation data to allow for visualizations on a map.

For the data acquisition part, we use this Wikipedia article to find Seattle’s district names and coordinates. For most of them, we couldn’t find any additional information there, such as population size, so we take the population figures from the “Find My Seattle” portal. Crime data is available from official City of Seattle sources. For flat prices we use the data set provided by Airbnb on the Kaggle portal. To locate parks near flats, we access the FourSquare API.

Data collection

The process of collecting and cleaning data:

  • we use the Python libraries `requests` and `lxml` to scrape Wikipedia pages, locate the attributes and tags of interest using XPath, follow the URLs of all neighborhoods and retrieve their geographical locations (see Figures 1 and 2 below),
  • we enter the population sizes of Seattle’s districts by hand into a CSV file,
  • we use the Python library `json` to process crime data, which we then analyze on a monthly rolling basis and normalize by the districts’ population sizes,
  • to measure proximity to parks, we utilize FourSquare API, namely the `Search for Venues` request with a corresponding categoryId of `4bf58dd8d48988d163941735` (see Figures 3 and 4),
  • Airbnb listings are available in CSV format.

Figure 1. Scraping Wikipedia for districts’ names

Figure 2. Retrieving districts’ coordinates

Figure 3. Obtaining coordinates of parks using FourSquare

Figure 4. Example for Ballard district: FourSquare locates most parks
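
The scraping step from the list above can be sketched roughly as follows. This is not the code from Figures 1 and 2: to keep the sketch runnable without third-party packages or network access, it uses the XPath subset of the standard library's `xml.etree.ElementTree` as a stand-in for `lxml`, and the markup in `SAMPLE_HTML`, the table class and the helper name `extract_districts` are assumptions made for illustration. Real Wikipedia pages are not well-formed XML and would need `lxml.html` instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the Wikipedia list of neighborhoods.
SAMPLE_HTML = """
<html><body>
  <table class="wikitable">
    <tr><th>Neighborhood</th></tr>
    <tr><td><a href="/wiki/Ballard,_Seattle">Ballard</a></td></tr>
    <tr><td><a href="/wiki/Capitol_Hill,_Seattle">Capitol Hill</a></td></tr>
  </table>
</body></html>
"""


def extract_districts(markup, base_url="https://en.wikipedia.org"):
    """Return (name, absolute URL) pairs for links found in the table cells."""
    root = ET.fromstring(markup.strip())
    # XPath: every <a> directly inside a <td> of the table with class "wikitable".
    links = root.findall(".//table[@class='wikitable']//td/a")
    return [(a.text.strip(), base_url + a.get("href")) for a in links]


districts = extract_districts(SAMPLE_HTML)
for name, url in districts:
    print(name, url)
```

Each returned URL would then be fetched in turn to read the coordinates from the neighborhood's own page.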

Data preparation

Districts are named differently across the data sets. Therefore, we map some districts in the crime data onto bigger districts from the population data and vice versa: the populations of some districts must be summed up to obtain a bigger district, so that there is a one-to-one correspondence between district names.

We do similar normalization for population data from FindMySeattle.
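
A minimal sketch of this name normalization, assuming a hand-written alias table. The alias entries and population numbers below are made up for illustration and are not taken from the actual data sets; `canonical` and `merge_population` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical aliases: fine-grained or differently spelled names -> common name.
DISTRICT_ALIASES = {
    "Ballard North": "Ballard",
    "Ballard South": "Ballard",
    "Chinatown/International District": "Chinatown - International District",
}


def canonical(name):
    """Map a raw district name onto the name shared by all data sets."""
    cleaned = name.strip()
    return DISTRICT_ALIASES.get(cleaned, cleaned)


def merge_population(population_by_name):
    """Sum the populations of all districts that share a canonical name."""
    merged = defaultdict(int)
    for name, population in population_by_name.items():
        merged[canonical(name)] += population
    return dict(merged)


# Illustrative numbers only.
population = merge_population(
    {"Ballard North": 1200, "Ballard South": 800, "Fremont": 500}
)
print(population)
```

The same `canonical` function can be applied to the district column of the crime data so that both sides end up with identical keys.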

From the crime data, we filter out crime categories which aren’t of interest for our clients and focus our efforts on violent crimes which have happened in the past decade only (we also truncate data for an incomplete month of January 2019).

Figure 5. Normalizing district names across datasets

We then form monthly breakdowns of crimes by district and normalize them by dividing by the population of that district (see Figure 6; the size of each circle is proportional to the number of crimes per capita). The most dangerous districts appear to be Georgetown, Pioneer Square and Chinatown, but the monthly figures fluctuate considerably.

Figure 6. Monthly crime rate by district (number of crimes per capita)

Due to this high volatility, we can’t simply use the last data point for our analysis: there might be seasonal patterns, etc. Therefore, we consider the crime rate on a rolling basis with a window of two years (see Figure 7 for the example of Chinatown – International District). This smooths out the crime rate and gives us a single figure per district. The breakdown of the rolling crime rate is presented in Figure 8.
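
The per-capita normalization and the rolling window can be sketched in plain Python as below. This is only an illustration under assumptions: the text does not say how the two-year window is aggregated, so a trailing mean is assumed here, and the month labels and population are made-up values. With pandas, a rolling mean over a monthly series would be the more natural tool.

```python
from collections import Counter


def monthly_rate(crime_months, population):
    """Map 'YYYY-MM' -> crimes per capita for a single district.

    Months without any recorded crime are simply absent here; a fuller
    version would fill them with zero before smoothing.
    """
    counts = Counter(crime_months)
    return {month: counts[month] / population for month in sorted(counts)}


def rolling_mean(values, window=24):
    """Trailing mean over the last `window` points (fewer at the start)."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Illustrative input: one entry per recorded crime, labeled by month.
rates = monthly_rate(
    ["2018-01", "2018-01", "2018-02", "2018-03", "2018-03", "2018-03"],
    population=1000,
)
smoothed = rolling_mean(list(rates.values()), window=2)
print(rates)
print(smoothed)
```

The last element of the smoothed series is the single figure per district that enters the later scoring.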

Figure 7. Rolling crime rate for Chinatown — International District.

Figure 8. Rolling crime rate by district

To obtain the coordinates of all parks we query FourSquare. However, it returns no more than 100 parks at once. To overcome this limit, we take the district coordinates we obtained from Wikipedia, send a request to FourSquare for each district’s coordinates and then combine the results (see Figures 3 and 9).
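
The "one request per district, then combine" idea might look like the sketch below. The actual HTTP call is deliberately left out: `fetch_parks` is a hypothetical callable standing in for the FourSquare request, and the stubbed responses are invented so that the sketch runs offline. De-duplicating by venue id is an assumption on our side, since neighboring districts presumably return overlapping parks.

```python
def collect_parks(district_coords, fetch_parks):
    """Query once per district centre and merge the results.

    `fetch_parks(lat, lng)` is a hypothetical wrapper around the real
    FourSquare request; it is assumed to return a list of dicts with the
    keys 'id', 'name', 'lat' and 'lng'.
    """
    parks = {}
    for lat, lng in district_coords:
        for venue in fetch_parks(lat, lng):
            # Keep the first occurrence of every venue id.
            parks.setdefault(venue["id"], venue)
    return list(parks.values())


# Invented responses for two district centres with one overlapping park.
FAKE_RESPONSES = {
    (47.66, -122.38): [
        {"id": "a", "name": "Park A", "lat": 47.661, "lng": -122.381},
        {"id": "b", "name": "Park B", "lat": 47.655, "lng": -122.370},
    ],
    (47.65, -122.35): [
        {"id": "b", "name": "Park B", "lat": 47.655, "lng": -122.370},
        {"id": "c", "name": "Park C", "lat": 47.649, "lng": -122.349},
    ],
}


def fake_fetch(lat, lng):
    return FAKE_RESPONSES[(lat, lng)]


parks = collect_parks(list(FAKE_RESPONSES), fake_fetch)
print([park["name"] for park in parks])
```

Passing the fetch function in as a parameter keeps the merging logic testable without API credentials.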

Figure 9. All parks of Seattle, WA

Then, for each Airbnb flat listing, we measure its geographic distance to every park in Seattle using the `distance` Python package and store the minimum distance (see Figure 10).
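
As an illustration of the nearest-park computation, here is a sketch that swaps the `distance` package for a hand-written haversine formula, so that it runs with the standard library only. The haversine formula treats the Earth as a sphere, so the numbers will differ slightly from a geodesic distance; the coordinates below are made up.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lng2 - lng1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_park_km(flat, parks):
    """Distance from one flat (lat, lng) to the closest of the given parks."""
    return min(haversine_km(flat[0], flat[1], p[0], p[1]) for p in parks)


# Illustrative coordinates: one flat and two parks on the same meridian.
flat = (47.60, -122.33)
parks = [(47.61, -122.33), (47.70, -122.33)]
closest = nearest_park_km(flat, parks)
print(round(closest, 3))
```

For a few thousand listings and a few hundred parks, this brute-force minimum is fast enough; a spatial index would only be worth it for much larger inputs.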

Figure 10. Measuring proximity of flats to nearest parks

Methodology

Our client wants to identify districts that contain many flats meeting their criteria:

  • affordable in price;
  • low criminality in that region;
  • proximity to parks.

First, let’s get some insights into our data using visualizations.

Data visualization

Figure 11 presents a histogram of the rolling crime rate (each tick represents an observation, and the bars indicate how many fell into the same bin) and the distribution of rental prices for flats by district. The crime rate shows no clear dependency on rental price (Figure 12). Also, the crime rate appears to follow a Poisson distribution, whilst rental prices look normally distributed.

The Pearson correlation between these metrics is quite low at 0.178, and the p-value of 0.38 suggests that this slight positive correlation is not statistically significant.
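
For readers who want to reproduce such a figure, the correlation coefficient itself is easy to compute by hand, as in the sketch below. The sample vectors are made up and unrelated to the Seattle data. The p-value additionally requires the Student t distribution with n − 2 degrees of freedom, which is why a library routine such as `scipy.stats.pearsonr` is the usual choice; here we only compute the t statistic that the p-value is derived from.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))


# Made-up example values.
crime = [1.0, 2.0, 3.0, 4.0]
rent = [10.0, 9.0, 12.0, 11.0]
r = pearson_r(crime, rent)
print(r, t_statistic(r, len(crime)))
```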

An interesting insight: comparing the most expensive district, South Lake Union, with the cheapest one, Rainier Beach, our clients would pay 2.2 times higher rent on average in return for a 5.2 times higher chance of being involved in an aggravated assault, rape, residential robbery or other serious crime. Paying more doesn’t always mean living in safety, and it’s exactly the balance of the two factors that our client is seeking.

Figure 11. Distributions of rolling crime rate and rental price

Figure 12. Rolling crime rate vs mean rent by district

Figure 13. Mean rent across districts of Seattle, WA

Choosing a method for data analysis

Context-based recommender

The first method suggested to our client was a context-based recommender, which would find districts similar to those the client prefers. However, our clients have never been to Seattle and were unable to name districts they liked.

K-Means clustering

Unsupervised learning techniques like k-means clustering were ruled out because they are too sensitive to the scaling of the data (remember that our attributes follow different distributions), the number of clusters is difficult to choose in advance, and the order of the data can affect the final result.

Scoring engine

We have suggested to our client that for each of our data sources, we would create attribute groups. For crime rate: safe, normal, dangerous; for rental price: low, affordable, expensive; for proximity to parks: close, further, far.

Thus, for each apartment listed on Airbnb we compute each of these scores, then sum them up (potentially with some weights) to obtain an overall score, and then filter out those not meeting a desired minimum cut-off score.

This approach is easy for clients to understand, it is extendable to include other metrics, and the weights can be adjusted to prioritize different attributes.

For this case study, it has been agreed to assign scores between 0 and 10 according to the quantile of the attribute’s distribution. The criminality score is weighted 1.5 times higher than the score for proximity to parks, and the price score has a weight of 1.25. An apartment needs a total score of at least 24 to be of interest; additionally, its price and parks scores must each be at least 5, and its criminality score at least 6 (our clients were concerned that we might otherwise end up choosing apartments which are cheap and close to a park full of drug dealers).
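
A compact sketch of such a scoring engine is given below. The weights, minimums and cut-off are the ones stated above; the exact way a quantile is turned into a 0 to 10 score is not spelled out in the text, so `quantile_score` encodes one possible assumption (ten times the share of observations that are at least as large, so that less crime, a lower price and a shorter distance all score higher).

```python
from bisect import bisect_left

WEIGHTS = {"crime": 1.5, "price": 1.25, "parks": 1.0}
MINIMUMS = {"crime": 6, "price": 5, "parks": 5}
CUTOFF = 24


def quantile_score(value, sorted_values):
    """Assumed scoring rule: 10 x share of observations that are >= value.

    `sorted_values` must be sorted ascending; lower raw values score higher.
    """
    at_least = len(sorted_values) - bisect_left(sorted_values, value)
    return 10 * at_least / len(sorted_values)


def total_score(scores):
    """Weighted sum of the three attribute scores."""
    return sum(WEIGHTS[key] * scores[key] for key in WEIGHTS)


def is_desired(scores):
    """Apply the per-attribute minimums first, then the overall cut-off."""
    if any(scores[key] < MINIMUMS[key] for key in MINIMUMS):
        return False
    return total_score(scores) >= CUTOFF


# Illustrative listing: safe district, mid-range price, park fairly close.
example = {"crime": 8, "price": 6, "parks": 5}
print(total_score(example), is_desired(example))
```

Changing the client's priorities then only means editing `WEIGHTS`, `MINIMUMS` or `CUTOFF`, which is the flexibility the approach is meant to offer.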

Then, to compare districts to each other, we order them by the ratio of desired apartments to the total number of listings.
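
The ranking step can be sketched as follows, assuming each listing has already been reduced to a pair of district name and desired flag. The district labels are placeholders and `rank_districts` is a hypothetical helper name.

```python
from collections import Counter


def rank_districts(listings):
    """Order districts by the share of desired listings, best first.

    `listings` is an iterable of (district, is_desired) pairs.
    """
    total = Counter()
    desired = Counter()
    for district, ok in listings:
        total[district] += 1
        desired[district] += bool(ok)
    ratios = {district: desired[district] / total[district] for district in total}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Placeholder data: three districts with different shares of desired flats.
ranking = rank_districts([
    ("A", True), ("A", False), ("A", False),
    ("B", False),
    ("C", True), ("C", True),
])
print(ranking)
```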

Results

Scores for the apartments have been computed and an interactive map with markers has been prepared (see Figures 14 and 15).

Unfortunately, quite a few districts have few or no flats fulfilling our criteria. All in all, only 344 out of 3,349 listings remained, so our scoring engine filtered out about 90% of the supply on the market. Interestingly, the desired flats are grouped together, so our clients won’t need to hop between different ends of the city in search of cheap yet safe housing.

Figure 14. Desired apartments are marked in green, blue dots mark the parks

Figure 15. Each green dot has a name of the listing, and its corresponding scores

When looking at the ratio of desired flats to the total number of listings in a district, the best neighborhoods to begin the search are Capitol Hill, Madison Park and Green Lake (see Table below). For instance, in Capitol Hill almost every third apartment is desired, and in Madison Park every fourth!

Table. Ratio of desired flats to the total number of listings, by district

Discussion

As can be seen from the scatter plots comparing our scores, there is no clear dependency between the scores. However, there appear to be some regions where the dependencies are more pronounced. This suggests that it might be beneficial to approach the search for desired apartments slightly differently: instead of taking the granularity of districts as is, one could form regions based on the density of desired apartments in them.

Here is what it would look like in our case if we plotted a hex grid over geographic coordinates, where the color intensity of each cell corresponds to the number of desired flats that fall into it:

Figure 16. Geographic hex grid of desired apartments

As a side note, after analyzing the breakdown of crimes in Seattle by hour and day of week, we’d advise our clients to stay alert on Fridays after working hours: there is a spike in the number of committed crimes.

Spikes at 12 am and 12 pm suggest the data isn’t clean enough and many crimes are being registered as happening “before noon” or “in the afternoon”, see Figure 17.
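
A breakdown by hour and day of week of this kind can be produced with a simple counter, as sketched below. The timestamp format and the sample values are assumptions made for illustration and do not come from the Seattle data set.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def crimes_by_weekday_hour(timestamps, fmt="%Y-%m-%dT%H:%M:%S"):
    """Count crimes per (weekday, hour) cell; `fmt` is an assumed format."""
    counts = Counter()
    for raw in timestamps:
        moment = datetime.strptime(raw, fmt)
        counts[(WEEKDAYS[moment.weekday()], moment.hour)] += 1
    return counts


# Made-up timestamps: two on a Friday evening, one on Saturday at noon.
counts = crimes_by_weekday_hour([
    "2019-01-04T18:30:00",
    "2019-01-04T18:45:00",
    "2019-01-05T12:00:00",
])
print(counts.most_common(1))
```

Cells at hour 0 and hour 12 that stand out from their neighbors are a hint that the recorded time is a default value and not the actual time of the crime.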

Figure 17. Crimes by hour and day of week

Conclusion

It is hard to find a balance between the different attributes of good housing. We have provided our clients with an interactive tool to apply their desired criteria and make the trade-offs of each particular offer easy to understand. The tool is extensible and flexible enough to include other attributes or adjust their priorities.

(C) Dr Yury Chebiryak, January 2019