I recently came across the Airbnb dataset (link). I found it very interesting and wanted to look at Airbnb's growth and study its trends. I limited the analysis to 14 cities in the US.
Question 1: Are Airbnb listings clustered around tourist spots or points of interest?
This dashboard shows a snapshot of the listings in two cities – I’ve used New York and San Francisco as examples. (The filter lets you compare any two cities at a time.)
If Airbnb listings were clustered around tourist spots, the map should reflect that. However, the map shows that Airbnb listings are not just clustered around the tourist spots, but are spread out across the city.
This raises questions about the type of travelers who use Airbnb. It would be interesting to study this with respect to the type of traveler or the purpose of their trip.
Question 2: Does staying at an Airbnb listing mean sharing a house with the host, or a room with the host or someone else?
The above dashboard shows that more entire homes and apartments are listed on the website than shared rooms or even private rooms. This suggests that it is a misconception that staying at an Airbnb means sharing a room with the host or someone else: the majority of listings are for independent properties. The pricing adds further support. While a private room costs about $73.30, an entire apartment is available for $168.40, making it a very good deal for groups or families needing at least two rooms. The dashboard also shows a steep rate of growth in independent listings and reviews. By showing that entire apartments are available all across the city, the map displays yet another reason to choose an entire apartment or house: its ubiquitous presence.
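The room-type comparison above boils down to two numbers per room type: its share of all listings and its average price. Here is a minimal sketch of that calculation in plain Python. The rows are made up for illustration and the field names `room_type` and `price` are assumptions, not the dataset's confirmed column names.

```python
from collections import defaultdict

# Made-up sample rows for illustration only; not taken from the real dataset.
# The field names "room_type" and "price" are assumptions.
listings = [
    {"room_type": "Entire home/apt", "price": 150.0},
    {"room_type": "Entire home/apt", "price": 190.0},
    {"room_type": "Entire home/apt", "price": 170.0},
    {"room_type": "Private room", "price": 70.0},
    {"room_type": "Private room", "price": 80.0},
    {"room_type": "Shared room", "price": 40.0},
]


def summarize_by_room_type(rows):
    """Return {room_type: (share_of_listings, average_price)}."""
    prices = defaultdict(list)
    for row in rows:
        prices[row["room_type"]].append(row["price"])
    total = len(rows)
    return {
        room_type: (len(values) / total, sum(values) / len(values))
        for room_type, values in prices.items()
    }


summary = summarize_by_room_type(listings)
for room_type, (share, avg_price) in summary.items():
    print(f"{room_type}: {share:.0%} of listings, average ${avg_price:.2f}")
```

Run against the real listings file, the same grouping should give figures comparable to the ones the dashboard reports.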
Question 3: Do the size of a city and its cost of living impact the number of listings and the price at which a listing is offered?
The top part of the above dashboard compares cities on the basis of households and listings. Dividing listings by households gives the density of listings. For places with more households, is there a higher density of households being listed? The data does not show a clear correlation. For instance, Santa Cruz has a low number of households but a high density of households listed, whereas Chicago has a lower density. This leads to the question of cost of living. The graph on the bottom left seems to answer this. Chicago has a low cost of living and a low average room rate, while Santa Cruz has a high cost of living and a high average room rate. However, this graph also shows anomalies such as Austin, which has a high average room rate but a low cost of living.
Further data on occupancy rates, the type of property (beyond room type), the neighborhood, etc. would help in analyzing this further, and in studying other trends with respect to population or cost of living differences across listings.
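The density comparison in this section can be sketched as listings per 1,000 households, followed by a Pearson correlation between household count and density. The cities and figures below are hypothetical placeholders, not values from the dataset, and the per-1,000 scaling is my own choice for readability.

```python
from math import sqrt

# Hypothetical cities with made-up figures, for illustration only.
cities = [
    {"city": "City A", "households": 50_000, "listings": 1_000},
    {"city": "City B", "households": 1_000_000, "listings": 5_000},
    {"city": "City C", "households": 300_000, "listings": 4_500},
    {"city": "City D", "households": 120_000, "listings": 600},
]


def listing_density(row):
    """Listings per 1,000 households."""
    return 1000 * row["listings"] / row["households"]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


households = [row["households"] for row in cities]
densities = [listing_density(row) for row in cities]
r = pearson(households, densities)

for row, density in zip(cities, densities):
    print(f'{row["city"]}: {density:.1f} listings per 1,000 households')
print(f"Correlation between households and density: {r:.2f}")
```

A coefficient near zero would match the "no clear correlation" reading of the dashboard, although with only 14 cities any such number should be treated with caution.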
Question 4: Are there regional differences across Airbnb listings?
There seem to be some differences in the parameters across regions. The Northeast not only has more listings, but also higher prices. This is true across all types of properties, as seen in the box plot on top. However, when it comes to the number of reviews, the South's average is higher than the others, with the West matching (and beating) it for private rooms and entire apartments. It’s also interesting to note that shared rooms receive the lowest number of reviews while private rooms receive the most, with entire homes and apartments in between the two, across all regions. An analysis of the sentiment of the reviews could reveal differences across the regions, and whether one region is culturally more inclined to leave reviews.
Another trend that is somewhat uniform across the regions is the minimum stay required. Of interest here is the minimum number of nights required for a shared room in the West. This number seems high and needs further investigation at the city level. A deep dive within the West region shows that it is driven by the requirement in San Francisco. This requirement needs to be looked into: is it a policy, or are some properties being operated as hostels?
Question 5: Are some listings in New York illegal?
News reports claimed that Airbnb had purged its New York listings data around 20 November 2015 because these listings were illegal. A deep dive into the data shows this, as seen in the top graph: the difference is about 1,500 listings. However, between then and January 2016, the listings were revived.
In New York, it is illegal to let out entire properties for less than 30 days. As can be seen from the bar graphs, about 54% of all listings are for entire homes or apartments, and these are available to be booked for about 220 days a year. The scatterplot on the right shows the listings of entire homes and the duration for which they are available. The ones coded in orange are available for fewer than 30 days, and hence it can be inferred that these are illegal listings.
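The orange coding in the scatterplot amounts to a simple filter: entire homes or apartments whose availability is under 30 days. A minimal sketch of that rule follows, using made-up rows and assumed field names (`id`, `room_type`, `availability_days`). It only reproduces the coding rule described above and does not establish whether any listing is actually illegal.

```python
# Made-up sample rows for illustration only; not taken from the real dataset.
# The field names "id", "room_type" and "availability_days" are assumptions.
listings = [
    {"id": 1, "room_type": "Entire home/apt", "availability_days": 12},
    {"id": 2, "room_type": "Entire home/apt", "availability_days": 220},
    {"id": 3, "room_type": "Private room", "availability_days": 10},
    {"id": 4, "room_type": "Entire home/apt", "availability_days": 29},
    {"id": 5, "room_type": "Shared room", "availability_days": 300},
]

# The 30-day cutoff comes from the rule described in the text.
THRESHOLD_DAYS = 30


def is_entire_home(row):
    """True when the listing is for an entire home or apartment."""
    return row["room_type"] == "Entire home/apt"


def flag_short_availability(rows, threshold=THRESHOLD_DAYS):
    """Return ids of entire-home listings available for fewer than `threshold` days."""
    return [
        row["id"]
        for row in rows
        if is_entire_home(row) and row["availability_days"] < threshold
    ]


entire_homes = [row for row in listings if is_entire_home(row)]
entire_share = len(entire_homes) / len(listings)
flagged_ids = flag_short_availability(listings)

print(f"Entire homes/apartments: {entire_share:.0%} of listings")
print(f"Flagged listing ids: {flagged_ids}")
```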