QUICK REVIEW

[Paper Review] Latent Bayesian melding for integrating individual and population models

Mingjun Zhong, Nigel Goddard | arXiv (Cornell University) | Oct 30, 2015

Blind Source Separation Techniques | 23 references | 24 citations

TL;DR

This paper proposes latent Bayesian melding (LBM), a method for integrating individual-level and population-level models when both contain latent variables. LBM combines the distributions that the two models place on population statistics through a logarithmic opinion pool. In electricity disaggregation, a single-channel blind source separation task, LBM significantly outperforms posterior regularization, reducing duration and cycle error by up to 70% and 68% respectively.

ABSTRACT

In many statistical problems, a more coarse-grained model may be suitable for population-level behaviour, whereas a more detailed model is appropriate for accurate modelling of individual behaviour. This raises the question of how to integrate both types of models. Methods such as posterior regularization follow the idea of generalized moment matching, in that they allow matching expectations between two models, but sometimes both models are most conveniently expressed as latent variable models. We propose latent Bayesian melding, which is motivated by averaging the distributions over population statistics of both the individual-level and the population-level models under a logarithmic opinion pool framework. In a case study on electricity disaggregation, which is a type of single-channel blind source separation problem, we show that latent Bayesian melding leads to significantly more accurate predictions than an approach based solely on generalized moment matching.

Motivation & Objective

To address the challenge of integrating individual-level models with latent variables and population-level statistical constraints in single-channel blind source separation problems.
To overcome limitations of moment-matching approaches like posterior regularization when both models involve latent variables.
To develop a principled method for merging prior information from individual and population models using a unified probabilistic framework.
To evaluate the method on real-world electricity disaggregation data, where identifiability issues hinder accurate appliance-level energy estimation.
To demonstrate that latent Bayesian melding improves prediction accuracy, especially for aggregate statistics like duration and cycle counts.

Proposed method

Proposes latent Bayesian melding (LBM) as an extension of Bayesian melding to handle models with latent variables, using logarithmic opinion pooling to merge priors from individual and population models.
Uses a melded prior distribution for the individual-level model parameters, derived from both the induced prior via the simulation function and the external population prior.
Derives the melded prior by a change of variables when the simulation function is invertible, or by a heuristic formula (eq. 2) when it is not.
Employs a logarithmic opinion pool, $\tilde{p}_\tau(\tau) \propto p^*_\tau(\tau)^{\alpha}\, p_\tau(\tau)^{1-\alpha}$, where α controls the weight given to each prior; α is fixed in this study.
Integrates the melded prior into a full Bayesian model, updating the posterior over latent states and parameters using standard inference.
Applies the method to energy disaggregation using an additive factorial HMM (AFHMM) with constraints on summary statistics (e.g., total energy, duration, cycle count).
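
The logarithmic opinion pool described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two Gaussian priors, the grid, and the value α = 0.5 are hypothetical choices made for illustration, and the sketch assumes the starred density is the prior induced through the simulation function while the unstarred one is the external population prior. For two Gaussians the pooled density is again Gaussian, which gives a closed form to check the numerical result against.

```python
import numpy as np

def gaussian_logpdf(x, mean, std):
    """Log-density of a univariate Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))

def log_opinion_pool(log_p_induced, log_p_population, alpha, dx):
    """Pool two log-densities tabulated on a shared uniform grid (spacing dx).

    Returns the normalised density proportional to
    p_induced**alpha * p_population**(1 - alpha).
    """
    log_pooled = alpha * log_p_induced + (1.0 - alpha) * log_p_population
    log_pooled = log_pooled - log_pooled.max()   # stabilise before exponentiating
    pooled = np.exp(log_pooled)
    return pooled / (pooled.sum() * dx)          # renormalise with a Riemann sum

# Hypothetical priors over a population statistic tau (illustrative values only).
tau = np.linspace(-10.0, 20.0, 6001)
dx = tau[1] - tau[0]
alpha = 0.5                                       # assumed pooling weight
induced_mean, induced_std = 2.0, 3.0              # assumed induced prior
population_mean, population_std = 6.0, 1.0        # assumed population prior

melded = log_opinion_pool(
    gaussian_logpdf(tau, induced_mean, induced_std),
    gaussian_logpdf(tau, population_mean, population_std),
    alpha,
    dx,
)
melded_mean = (tau * melded).sum() * dx

# Closed form for pooled Gaussians: precisions combine with weights alpha, 1 - alpha.
pooled_precision = alpha / induced_std**2 + (1.0 - alpha) / population_std**2
closed_form_mean = (
    alpha * induced_mean / induced_std**2
    + (1.0 - alpha) * population_mean / population_std**2
) / pooled_precision

print(f"grid mean = {melded_mean:.4f}, closed-form mean = {closed_form_mean:.4f}")
```

Because the population prior in this toy setup is much tighter than the induced prior, the melded mean sits close to the population mean even with equal weights; moving α toward 1 would shift it toward the induced prior.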

Experimental results

Research questions

RQ1: Can latent Bayesian melding effectively integrate individual-level models with latent variables and population-level constraints in a statistically principled way?
RQ2: How does LBM compare to posterior regularization in terms of predictive accuracy for electricity disaggregation?
RQ3: Does incorporating population-level summary statistics improve the identifiability of individual appliance signals in single-channel blind source separation?
RQ4: To what extent does LBM reduce errors in aggregate statistics such as duration and cycle counts compared to moment-matching baselines?
RQ5: Can LBM generalize across datasets, as shown in cross-dataset evaluation on HES and UK-DALE data?

Key findings

On synthetic data, AFHMM+LBM reduced duration aggregate error (DAE) by 8% and cycle aggregate error (CAE) by 50% compared to AFHMM+PR.
On real mains data from 6 houses, AFHMM+LBM reduced normalized disaggregation error (NDE) by 15%, DAE by 10%, and CAE by 40% compared to AFHMM+PR.
On the UK-DALE dataset, AFHMM+LBM reduced DAE by 70% and CAE by 68% compared to AFHMM+PR, while maintaining similar performance on NDE and signal aggregate error (SAE).
LBM consistently outperformed posterior regularization in predicting aggregate statistics, particularly duration and cycle counts, which are critical for energy monitoring applications.
The method demonstrated strong generalization, achieving consistent improvements across multiple real-world datasets (HES and UK-DALE) with different sampling rates and data characteristics.
The results confirm that incorporating population-level constraints via latent Bayesian melding effectively mitigates identifiability issues in single-channel blind source separation for energy disaggregation.
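
To make the reported percentages concrete, the sketch below shows one plausible way such error metrics and relative reductions could be computed. The review does not define the metrics, so the formulas here are assumptions: NDE as squared error normalised by the squared true signal, duration as the number of time steps an appliance is on, and cycles as the number of off-to-on transitions. All arrays are toy values invented for illustration, not data or results from the paper.

```python
import numpy as np

def nde(true_power, est_power):
    """Assumed NDE: sum of squared errors divided by sum of squared true power."""
    true_power = np.asarray(true_power, dtype=float)
    est_power = np.asarray(est_power, dtype=float)
    return ((true_power - est_power) ** 2).sum() / (true_power ** 2).sum()

def on_duration(states):
    """Number of time steps in which the appliance is on (state > 0)."""
    return int(np.count_nonzero(np.asarray(states) > 0))

def cycle_count(states):
    """Number of off-to-on transitions; a sequence that starts on counts as one."""
    on = (np.asarray(states) > 0).astype(int)
    starts = np.diff(np.concatenate(([0], on))) == 1
    return int(starts.sum())

def relative_error(true_value, est_value):
    """Absolute error of a summary statistic relative to its true value."""
    return abs(est_value - true_value) / true_value

def percent_reduction(baseline_error, new_error):
    """How much smaller new_error is than baseline_error, in percent."""
    return 100.0 * (baseline_error - new_error) / baseline_error

# Toy on/off sequences (hypothetical): ground truth and two competing estimates.
true_states     = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0]
baseline_states = [1, 1, 1, 1, 0, 1, 0, 1, 1, 0]
improved_states = [0, 1, 1, 1, 0, 1, 1, 1, 0, 0]

duration_err_baseline = relative_error(on_duration(true_states), on_duration(baseline_states))
duration_err_improved = relative_error(on_duration(true_states), on_duration(improved_states))
duration_reduction = percent_reduction(duration_err_baseline, duration_err_improved)

cycle_err_baseline = relative_error(cycle_count(true_states), cycle_count(baseline_states))
cycle_err_improved = relative_error(cycle_count(true_states), cycle_count(improved_states))

toy_nde = nde([0, 100, 100, 0], [0, 90, 110, 0])

print(duration_reduction, cycle_err_baseline, cycle_err_improved, toy_nde)
```

Under these assumed definitions, a statement such as "reduced DAE by 70%" reads as the improved method's error being 70% smaller than the baseline's, not as a 70-point drop in the error itself.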

This review was created by AI and reviewed by human editors.