Sources & Methodology

Data Sources

Airbnb listing data was compiled from two different sources: Inside Airbnb and Tom Slee. Each of these websites scrapes listing data from Airbnb's website and aggregates it into a .csv file. Inside Airbnb had data from Jan 1, 2015 until Oct 2, 2016. Tom Slee's website had data from May 10, 2014 until Jan 20, 2016. Both websites scraped Airbnb data roughly monthly.

New York City neighborhood rent information is from Zillow. Specifically, we used the monthly median Zillow Rent Index (ZRI) for multi-family and single-family residences and condos. To learn how Zillow calculates the ZRI, please see their ZRI methodology page. Because these data spanned several years, we adjusted them for inflation using the monthly CPI for the NYC metro area, which is freely available from the Bureau of Labor Statistics.
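As a rough illustration of the inflation adjustment, the sketch below rescales each month's nominal rent by the ratio of a base month's CPI to that month's CPI. The function name, the dictionary layout, and the choice of base month are assumptions made for this example, and the numbers in the usage note are invented rather than taken from our data.

```python
def to_real_dollars(rent_by_month, cpi_by_month, base_month):
    """Express each month's nominal rent in base-month dollars.

    rent_by_month and cpi_by_month are hypothetical dicts keyed by
    month strings such as "2015-01".
    """
    base_cpi = cpi_by_month[base_month]
    return {
        month: rent * base_cpi / cpi_by_month[month]
        for month, rent in rent_by_month.items()
    }
```

For example, with a made-up CPI of 250.0 in one month and 275.0 in the base month, a nominal rent of 1000.0 becomes 1100.0 in base-month dollars.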

Information on housing units per neighborhood was calculated from datasets provided by the NYC Department of City Planning.

Tax Calculations

Hotels in New York City (and therefore Airbnbs as well) are supposed to pay six different kinds of taxes. Three of these are paid to the city and three are paid to the state. (source)

Taxes Paid to New York City                           | Taxes Paid to New York State
NYC Sales Tax: 4.5%                                   | NYS Sales Tax: 4%
NYC Hotel Occupancy Tax Rate: 5.875%                  | MCTD State Sales & Use Tax: 0.375%
NYC Hotel Room Occupancy Tax: $2 per room per night   | NYS Javits Expansion Fund: $1.50 per room per night

Hotels are exempted from paying both NYC Hotel Occupancy Taxes and the Javits Expansion charge if the room is rented for fewer than 14 nights. (source)

We estimated taxes separately for each listing, using an approach based on Inside Airbnb's occupancy rate methodology.

  1. We used the number of reviews per month and an estimated review rate of 50% to estimate the number of bookings for each listing in each month.
  2. According to Airbnb themselves (source), the average length of stay in New York City is 6.4 nights. For most listings, we used 6.4 nights per booking to convert the number of bookings to the number of occupied nights per month. For listings whose minimum stay length was greater than 6.4 nights, we used that minimum instead. We also capped the number of booked nights at 70% of the nights in the month to account for scheduling changes and shorter-than-average bookings.
  3. We then used the nightly price for each listing and the calculated number of booked nights to calculate monthly revenue.
  4. Using the maximum stay length for each listing, we determined whether it was exempt from the taxes covered by the 14-night rule.
  5. We then calculated the number of rooms in each Airbnb. We assumed that an "entire home" listing includes some type of living room, as well as the stated number of bedrooms.
  6. Finally, we applied the percentages and per-room rates listed above to get the total tax amounts.
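The six steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact code we ran: the field names and the "Entire home/apt" label are hypothetical, listings that are not entire homes are assumed to have one room, months default to 30 days, and the 14-night exemption is assumed to cover both NYC hotel occupancy charges as well as the Javits charge.

```python
from dataclasses import dataclass

REVIEW_RATE = 0.5          # estimated share of bookings that leave a review
AVG_STAY_NIGHTS = 6.4      # average NYC stay reported by Airbnb
MAX_OCCUPANCY = 0.70       # cap on booked nights as a share of the month

SALES_TAX_RATE = 0.045 + 0.04 + 0.00375  # NYC sales + NYS sales + MCTD
NYC_OCCUPANCY_RATE = 0.05875
NYC_ROOM_FEE = 2.00        # dollars per room per night
JAVITS_FEE = 1.50          # dollars per room per night
EXEMPT_BELOW_NIGHTS = 14


@dataclass
class Listing:
    # Hypothetical field names chosen for this example.
    reviews_per_month: float
    price: float
    minimum_nights: int
    maximum_nights: int
    room_type: str
    bedrooms: int


def estimate_monthly_tax(listing, days_in_month=30):
    # Steps 1-2: reviews -> bookings -> occupied nights, with the 70% cap.
    bookings = listing.reviews_per_month / REVIEW_RATE
    nights_per_booking = max(AVG_STAY_NIGHTS, listing.minimum_nights)
    nights = min(bookings * nights_per_booking, MAX_OCCUPANCY * days_in_month)

    # Step 3: monthly revenue.
    revenue = listing.price * nights

    # Step 4: assumed reading of the 14-night rule, based on maximum stay.
    exempt = listing.maximum_nights < EXEMPT_BELOW_NIGHTS

    # Step 5: bedrooms plus a living room for entire homes; otherwise
    # one room (an assumption of this sketch).
    if listing.room_type == "Entire home/apt":
        rooms = listing.bedrooms + 1
    else:
        rooms = 1

    # Step 6: apply percentage taxes and per-room nightly charges.
    tax = revenue * SALES_TAX_RATE
    if not exempt:
        tax += revenue * NYC_OCCUPANCY_RATE
        tax += (NYC_ROOM_FEE + JAVITS_FEE) * rooms * nights

    return {"nights": nights, "revenue": revenue, "tax": tax}
```

Calling estimate_monthly_tax once per listing per month and summing the "tax" values gives a total under these assumptions.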

Listing Legality

According to New York City hotel codes, it is illegal to temporarily rent out your home if: it is a multiple residential dwelling unit (i.e. apartment), it is occupied by paying guests for fewer than 30 days, and the owner is not on the premises (source).

The data available from Inside Airbnb indicated whether each listing was a house, apartment, or condo. We also had the minimum number of nights for which each listing could be booked. It was more difficult to determine whether the owner was still present. We used three tests to estimate this.

  1. Whether the host allows guests to rent the entire home.
  2. Whether the host has multiple listings in New York City. (He/she cannot be living in all of them.)
  3. Whether the host's location is outside New York City. Because our location data wasn't standardized, we treated a listing as passing this test if the host was listed as being in "New York", "NY", or simply "US". This kept our estimates relatively conservative.

Combining these three tests told us which listings violated NYC housing codes.
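One possible way to combine the tests is sketched below. The text does not spell out the exact combination rule, so this example assumes that any one of the three signals is enough to treat the owner as absent; the function names, argument names, and the "Apartment" and "Entire home/apt" labels are likewise assumptions for illustration. The host location is matched by splitting on commas, which is a simplification of our handling of non-standardized locations.

```python
# Locations we treated as local, per the third test above.
LOCAL_HOST_LOCATIONS = {"New York", "NY", "US"}


def host_looks_local(host_location):
    # Split "New York, NY, US"-style strings and look for a local token.
    parts = {part.strip() for part in host_location.split(",")}
    return bool(parts & LOCAL_HOST_LOCATIONS)


def owner_probably_absent(room_type, host_listing_count, host_location):
    signals = [
        room_type == "Entire home/apt",          # test 1: entire home rented
        host_listing_count > 1,                  # test 2: multiple listings
        not host_looks_local(host_location),     # test 3: host outside NYC
    ]
    # Assumption: any single signal counts. Use all(signals) for a
    # stricter reading.
    return any(signals)


def likely_violates(property_type, minimum_nights, room_type,
                    host_listing_count, host_location):
    return (
        property_type == "Apartment"
        and minimum_nights < 30
        and owner_probably_absent(room_type, host_listing_count, host_location)
    )
```

Swapping any for all in owner_probably_absent gives a more conservative estimate that flags fewer listings.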

Posting Density

In order to calculate the total number of residential units in each neighborhood and borough, we downloaded a dataset containing extensive land use and geographic data provided by the NYC Department of City Planning. We then filtered the dataset to only contain buildings with residential units. Each building contains location coordinates in the New York-Long Island State Plane coordinate system. We used a website called Earth Point to convert these coordinates to normal Latitude-Longitude coordinates. Based on shapefiles provided by Zillow, we then mapped each building to a neighborhood and borough based on its Lat-Long position. Finally, for each neighborhood and borough, we summed over the total residential units of all buildings assigned to that region.

In order to calculate the density of Airbnb listings in each borough and neighborhood, we first mapped each listing to a neighborhood and borough (via its geographical coordinates) and then calculated the total number of listings assigned to each region. The density of Airbnb posts for a given region is the ratio of the number of listings to the number of residential units in that region. For the purposes of our visualization, we multiplied this ratio by 10,000 to provide the user with a more intuitive value: the number of listings per 10,000 residential units in a given region.
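The density calculation itself is simple arithmetic. A minimal sketch, assuming each listing has already been assigned a region name and that residential unit totals are available per region (the function and argument names are hypothetical, and regions with zero units are skipped to avoid dividing by zero):

```python
from collections import Counter


def listings_per_10k_units(listing_regions, units_by_region):
    """Return listings per 10,000 residential units for each region.

    listing_regions: iterable with one region name per listing.
    units_by_region: dict mapping region name to residential unit count.
    """
    counts = Counter(listing_regions)
    return {
        region: 10_000 * counts[region] / units
        for region, units in units_by_region.items()
        if units > 0
    }
```

For instance, a region with 2 listings and 1,000 residential units comes out at 20 listings per 10,000 units.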