Data Cleaning and Revenue Analysis on Airbnb Sales Data using SQL and Power BI
View the Project on GitHub josephGZC/airbnb_revenue_analysis
link to the dashboard → Power BI Report
link to the code → SQL Data Cleaning and Exploratory Analysis
1. Project Background
2. Executive Summary
3. Dataset Overview
4. Data Cleaning and Preprocessing
5. Insights Deep-Dive
5.1. Revenue and Operations Overview
5.2. Quarterly and Yearly Performance
5.3. Geographic Revenue Distribution
5.4. Host Segmentation and Pricing Dynamics
6. Recommendations
Airbnb, the world’s leading online marketplace for short-term rentals and unique travel experiences, has transformed how people travel and engage with local communities. This project aims to explore key operational trends—such as revenue patterns, transaction volume, and geographic performance—over a four-year period (2019–2022) to identify opportunities for growth and optimization. By examining quarterly fluctuations, city and street-level distributions, and host type behaviors, this report seeks to uncover actionable insights that can help improve revenue outcomes across the platform.
This analysis examines Airbnb’s operational performance from 2019 to 2022, with the goal of identifying trends and insights that can drive revenue growth. Despite an encouraging rise in both transactions and listings in the most recent year, a concerning decline in total and average revenue was observed. The report investigates this discrepancy by analyzing seasonal patterns, geographic contributions at the city and street levels, and variations across host types. Key findings reveal that nightly rates have dropped significantly in Q2–Q4 of 2022, especially among professional hosts, likely contributing to the revenue decline. Additionally, while Big Bear Lake City generates the highest total revenue, average revenue is more evenly distributed across cities, with Cholla Avenue in Yuca Valley emerging as a top performer. Based on these insights, two strategic recommendations are proposed: implement targeted seasonal rate increases to recover lost revenue and replicate successful practices from high-performing locations to improve revenue across underperforming areas.
SQL was utilized for cleaning and processing, ensuring the data was complete, consistent, and analysis-ready.
HAVING COUNT(*) > 1
to detect repeated combinations—none were found in this case.COUNT(*) - COUNT(column_name)
. Additionally, empty string values were detected using LTRIM(RTRIM(column_name)) = ''
to flag fields that appeared filled but contained no meaningful data. No essential columns had missing values that required further cleaning or corrective action.unified_id
fieldmonth
column, originally in ‘YYYY-MM’ string format, was converted into a proper SQL DATE by appending ‘-01’INT
or FLOAT
using CAST(column_name AS INT)
or CAST(column_name AS FLOAT)
to ensure accurate computation and analysis.'1,000.00'
), the commas were removed using REPLACE(column_name, ',', '.')
before casting the values to FLOAT
for proper decimal interpretation.Over the past four years, the total revenue has amounted to $783.7M, with the current year generating $245M—3.3% lower than the previous year. In terms of average revenue, the past four-year average stands at $5.35K, while the current year's average has dropped to $5K, representing a 20.6% decline compared to the previous year.
Over the past four years, there have been a total of 146.5K bookings, with the current year accounting for 47K—a 22% increase from the previous year. Similarly, total listings over this period reached 7.4K, with the current year contributing 5K listings, marking an 18.2% increase from the previous year.
Despite increases in the number of bookings and listings, both total and average revenue are declining. This warrants further investigation.
For Revenue and Average Revenue, a noticeable dip occurs in the 2nd quarter, followed by a gradual increase from the 3rd quarter until the 1st quarter of the subsequent year.
Regarding Bookings and Listings, an upward curve is evident over the quarters, although a significant dip is observed around Q2 2020, which is likely attributable to the drastic reduction in visitor numbers caused by COVID-19.
|
|
|
|
Big Bear Lake City is the dominant revenue driver, accounting for 58.5% of total revenue, and the top 5 revenue streets are all located within this city.
In terms of average revenue, the distribution among cities is more balanced, ranging between 19% and 28%, with Cholla Avenue in Yuca Valley emerging as the leader in average revenue.
While Big Bear Lake City drives overall revenue, the analysis of average revenue reveals performance differences across streets and cities, suggesting that specific high-performing streets, such as those in Yuca Valley, may be key targets for further investigation.
Professional hosts lead with 62% of listings, followed by single owners who contribute 21.2%, while hosts with 2-5 units account for 16%.
From 2019 to 2021, there was an overall increase in quarterly nightly rates across all host types. However, a decrease in nightly rates is observed in 2022 across all quarters.
|
|
|
|
The percent decrease in nightly rates from 2021 to 2022 was calculated by quarter and host type. Notably, significant drops exceeding 10% were observed in Q2 through Q4—approximately 13% in Q2, 16–17% in Q3, and 12–14% in Q4.
Quarter | Professionals | 2–5 Units | Single Owners |
---|---|---|---|
Q1 | 2.5% | - | - |
Q2 | 13.5% | 13.6% | 13.6% |
Q3 | 17.0% | 16.6% | 18.0% |
Q4 | 14.5% | 14.2% | 12.1% |
Optimize Nightly Rates Based on Seasonal Trends. Address the revenue decline by implementing targeted rate increases of approximately 14–17% in Q2–Q4, especially for professional hosts. This pricing adjustment aligns with historical drops and can help recover lost revenue without discouraging bookings, given the concurrent rise in transactions and listings.
Leverage High-Performing Locations to Guide Growth. Use insights from Big Bear Lake City and Cholla Avenue in Yuca Valley—the top contributors to total and average revenue—as benchmarks. Identify common features and strategies in these areas and replicate them in underperforming cities and streets to boost overall platform revenue.