QUICK REVIEW

[Paper Review] Distributional Fitting and Tail Analysis of Lead-Time Compositions: Nights vs. Revenue on Airbnb

Harrison Katz, Jess Needleman|arXiv (Cornell University)|Jan 17, 2026
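Sharing Economy and Platforms | Cited by 0
One-Sentence Summary

The paper analyzes daily lead-time compositions on Airbnb as compositional vectors and finds that GBV concentrates more heavily than Nights in mid-range lead times, that Gamma and Weibull distributions fit well, and that tail inference is sensitive to truncation.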

ABSTRACT

We analyze daily lead-time distributions for two Airbnb demand metrics, Nights Booked (volume) and Gross Booking Value (revenue), treating each day's allocation across 0-365 days as a compositional vector. The data span 2,557 days from January 2019 through December 2025 in a large North American region. Three findings emerge. First, GBV concentrates more heavily in mid-range horizons: beyond 90 days, GBV tail mass typically exceeds Nights by 20-50%, with ratios reaching 75% at the 180-day threshold during peak seasons. Second, Gamma and Weibull distributions fit comparably well under interval-censored cross-entropy. Gamma wins on 61% of days for Nights and 52% for GBV, with Weibull close behind at 38% and 45%. Lognormal rarely wins (<3%). Nonparametric GAMs achieve 18-80x lower CRPS but sacrifice interpretability. Third, generalized Pareto fits suggest bounded tails for both metrics at thresholds below 150 days, though this may partly reflect right-truncation at 365 days; above 150 days, estimates destabilize. Bai-Perron tests with HAC standard errors identify five structural breaks in the Wasserstein distance series, with early breaks coinciding with COVID-19 disruptions. The results show that volume and revenue lead-time shapes diverge systematically, that simple two-parameter distributions capture daily pmfs adequately, and that tail inference requires care near truncation boundaries.

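Motivation and Objectives

  • Examine whether lead-time distributions differ between booking volume (Nights) and revenue (GBV) on Airbnb.
  • Identify the parametric family that best fits daily lead-time pmfs under interval censoring.
  • Assess the tail behavior of lead-time distributions and potential structural changes over time.
  • Evaluate nonparametric and parametric approaches to fitting lead-time distributions, and compare them on forecast-relevant scores.
  • Examine the implications of truncation for revenue forecasting and tail inference.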

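Proposed Method

  • Treat each day's lead-time allocation as a compositional vector on the 365-simplex (366 components, one per lead time from 0 to 365 days).
  • Compare the daily Nights and GBV pmfs using the Wasserstein-1 distance.
  • Fit Gamma, Weibull, and lognormal distributions to the daily pmfs by minimizing interval-censored cross-entropy.
  • Apply generalized Pareto tail analysis with threshold-stability diagnostics (peaks-over-threshold, POT).
  • Apply HAC-robust Bai–Perron structural break tests to the Wasserstein distance series.
  • Compare nonparametric GAM fits using CRPS and KLD as scoring rules.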
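
The Wasserstein-1 comparison listed above can be sketched in a few lines. This is a minimal illustration, assuming both pmfs live on the same unit-spaced grid of lead times, in which case the distance reduces to the summed absolute gap between the two CDFs. The function name `wasserstein1` and the toy vectors are hypothetical and are not taken from the paper.

```python
from itertools import accumulate


def wasserstein1(p, q):
    """W1 distance between two pmfs on the same unit-spaced integer grid."""
    if len(p) != len(q):
        raise ValueError("pmfs must share the same support")
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    # On a 1-D grid with unit spacing, W1 is the summed absolute CDF gap.
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


# Toy example (made-up data): all mass at lead time 0 vs. all mass at lead time 3.
nights = [1.0, 0.0, 0.0, 0.0]
gbv = [0.0, 0.0, 0.0, 1.0]
print(wasserstein1(nights, gbv))  # 3.0
```

Because the grid is in days, the result can be read as the average number of days that mass has to be moved to turn one pmf into the other.
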
Figure 1: Aggregated lead-time distributions for Nights (blue) and GBV (red), 2019–2025. Distributions are day-weighted averages of daily pmfs: $\bar{p}(\ell)=D^{-1}\sum_{d}x_{d,\ell}$ . Both peak near $\ell=0$ and decline rapidly. The curves cross around $\ell=30$ days: below, Nights slightly excee
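
The day-weighted average in the caption, $\bar{p}(\ell)=D^{-1}\sum_{d}x_{d,\ell}$, and a tail-mass comparison between the two metrics can be sketched as follows. The data are made up, the helper names are hypothetical, and treating "beyond a threshold" as strictly greater than the threshold is an assumption.

```python
def average_pmf(daily_pmfs):
    """Day-weighted average: p_bar[l] = (1/D) * sum over days d of x[d][l]."""
    n_days = len(daily_pmfs)
    n_bins = len(daily_pmfs[0])
    return [sum(day[l] for day in daily_pmfs) / n_days for l in range(n_bins)]


def tail_mass(pmf, threshold):
    """Share of the pmf at lead times strictly greater than `threshold` (assumed convention)."""
    return sum(pmf[threshold + 1:])


# Made-up data: two days, four lead-time bins (0..3).
nights_days = [[0.5, 0.3, 0.1, 0.1], [0.7, 0.1, 0.1, 0.1]]
gbv_days = [[0.4, 0.3, 0.15, 0.15], [0.6, 0.1, 0.15, 0.15]]

nights_bar = average_pmf(nights_days)
gbv_bar = average_pmf(gbv_days)

# Ratio of GBV tail mass to Nights tail mass beyond lead time 1.
ratio = tail_mass(gbv_bar, 1) / tail_mass(nights_bar, 1)
print(f"GBV tail mass exceeds Nights by {ratio - 1:.0%}")
```

In this toy case the GBV tail beyond lead time 1 holds 0.3 of the mass against 0.2 for Nights, so GBV exceeds Nights by about 50%.
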

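Experimental Results

Research Questions

  • RQ1: Do volume-based (Nights) and revenue-based (GBV) lead-time distributions differ systematically across days?
  • RQ2: Under interval censoring, which parametric family (Gamma, Weibull, lognormal) best describes daily lead-time pmfs?
  • RQ3: How does tail behavior differ between Nights and GBV, and are tail estimates stable across thresholds given the truncation?
  • RQ4: Does the difference between Nights and GBV lead-time shapes exhibit structural breaks over the sample?
  • RQ5: How do parametric (descriptive) and nonparametric (GAM) fits compare in descriptive versus forecast-oriented performance?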
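
To make RQ2 concrete, the following is a minimal sketch of interval-censored cross-entropy fitting for one family. It uses only the Weibull, because its CDF has a closed form in the standard library. The binning convention (bin ℓ covers [ℓ, ℓ+1)), the renormalization to the truncated 0-365 range, and the coarse grid search are all assumptions made for illustration. The paper's exact objective and optimizer are not given in this summary.

```python
import math


def weibull_cdf(x, shape, scale):
    """Weibull CDF, 1 - exp(-(x/scale)^shape), for x > 0."""
    return 1.0 - math.exp(-((x / scale) ** shape)) if x > 0 else 0.0


def binned_weibull(shape, scale, n_bins=366):
    """P(l <= T < l+1) for l = 0..n_bins-1, renormalized to the truncated range."""
    cdf = [weibull_cdf(l, shape, scale) for l in range(n_bins + 1)]
    total = cdf[-1]
    return [(cdf[l + 1] - cdf[l]) / total for l in range(n_bins)]


def cross_entropy(p, q, eps=1e-300):
    """-sum p log q; eps guards against log(0) in bins that underflow."""
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def fit_weibull(p, shapes, scales):
    """Coarse grid search; a real fit would use a numerical optimizer."""
    return min(
        (cross_entropy(p, binned_weibull(k, lam, len(p))), k, lam)
        for k in shapes
        for lam in scales
    )  # (cross-entropy, shape, scale)


# Synthetic target pmf generated from known parameters (not real booking data).
target = binned_weibull(0.9, 40.0)
shapes = [0.5 + 0.1 * i for i in range(11)]  # 0.5 .. 1.5
scales = [10.0 * i for i in range(1, 11)]    # 10 .. 100
ce, k_hat, lam_hat = fit_weibull(target, shapes, scales)
print(round(k_hat, 2), lam_hat)
```

Cross-entropy is minimized when the fitted bin probabilities equal the observed pmf, so the search recovers the generating grid point. Repeating the same procedure per family and per day and counting which family attains the lowest cross-entropy would give win rates of the kind reported in the abstract.
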

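Key Findings

  • GBV concentrates more heavily than Nights in mid-range horizons: beyond 90 days, GBV tail mass typically exceeds Nights by 20-50%, with ratios reaching 75% at the 180-day threshold during peak seasons.
  • Gamma and Weibull fit comparably well under interval-censored cross-entropy: Gamma wins on 61% of days for Nights and 52% for GBV, with Weibull at 38% and 45%.
  • Lognormal rarely wins (<3% of days); nonparametric GAMs achieve 18-80x lower CRPS, at the cost of interpretability.
  • Generalized Pareto fits suggest bounded tails for both metrics at thresholds below 150 days, though this may partly reflect right-truncation at 365 days; above 150 days, estimates destabilize.
  • Five structural breaks appear in the Wasserstein distance series; the early breaks coincide with COVID-19 disruptions, and later ones mark shifts into new regimes of divergence.
  • Two-parameter Gamma or Weibull distributions suffice for describing daily lead-time pmfs; GAMs achieve better in-sample CRPS but are less parsimonious.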
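
The caveat about right-truncation can be illustrated with a small sketch. It fits a generalized Pareto distribution to pmf-weighted excesses over a threshold using the method of moments, a deliberately simple stand-in chosen for brevity (the paper's estimator is not stated in this summary). It then compares a synthetic exponentially decaying pmf on a long support with the same pmf cut off at 365 days. All names and parameter values are hypothetical.

```python
import math


def gpd_moment_fit(pmf, threshold):
    """Method-of-moments GPD fit to pmf-weighted excesses over `threshold`."""
    excess = [(l - threshold, w) for l, w in enumerate(pmf) if l > threshold and w > 0]
    total = sum(w for _, w in excess)
    mean = sum(e * w for e, w in excess) / total
    var = sum(w * (e - mean) ** 2 for e, w in excess) / total
    r = mean * mean / var
    shape = 0.5 * (1.0 - r)         # xi < 0 suggests a bounded tail
    scale = 0.5 * mean * (1.0 + r)
    return shape, scale


def decaying_pmf(rate, n_bins):
    """Discretized exponential decay, truncated to n_bins and renormalized."""
    raw = [math.exp(-rate * l) for l in range(n_bins)]
    s = sum(raw)
    return [x / s for x in raw]


long_support = decaying_pmf(0.01, 5000)  # effectively untruncated
truncated = decaying_pmf(0.01, 366)      # lead times 0..365 only

xi_long, _ = gpd_moment_fit(long_support, 90)
xi_trunc, _ = gpd_moment_fit(truncated, 90)
print(round(xi_long, 3), round(xi_trunc, 3))
```

The untruncated series gives a shape estimate close to zero, as expected for an exponential-type tail, while the truncated copy gives a clearly negative shape. That matches the summary's warning that apparent bounded tails may partly be an artifact of the 365-day cutoff.
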
Figure 2: Daily Wasserstein-1 distance between Nights and GBV, with structural breakpoints (dashed vertical lines) identified via Bai–Perron with HAC standard errors. The series averages 8.67 (95% CI: 8.42–8.92). Seasonality peaks in summer. Early breaks (2020–2021) align with COVID disruptions; lat
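
As a rough illustration of why HAC standard errors matter for a serially correlated daily series, the sketch below computes a Newey-West (Bartlett kernel) standard error for a sample mean and compares it with the naive one. It is not the Bai–Perron procedure, and whether the confidence interval in the caption was computed this way is an assumption. The simulated AR(1) series, the level 8.67 used only to center it, and the lag choice are all hypothetical.

```python
import math
import random


def newey_west_se_of_mean(series, max_lag):
    """HAC standard error of the sample mean using Bartlett kernel weights."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]

    def autocov(k):
        return sum(dev[t] * dev[t - k] for t in range(k, n)) / n

    long_run_var = autocov(0)
    for k in range(1, max_lag + 1):
        weight = 1.0 - k / (max_lag + 1)  # Bartlett kernel
        long_run_var += 2.0 * weight * autocov(k)
    return math.sqrt(long_run_var / n)


# Simulated, positively autocorrelated series (not the paper's data).
random.seed(0)
series = []
x = 0.0
for _ in range(2000):
    x = 0.8 * x + random.gauss(0.0, 1.0)
    series.append(8.67 + x)

naive_se = newey_west_se_of_mean(series, 0)  # lag 0 reduces to the i.i.d. formula
hac_se = newey_west_se_of_mean(series, 20)
print(hac_se > naive_se)
```

With positive autocorrelation the HAC standard error is noticeably larger than the naive one, so intervals and break tests that ignore serial dependence would be overconfident.
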

This review was generated by AI and reviewed by human editors.