[Paper Review] A Machine Learning Framework for Constructing Heterogeneous Contact Networks: Implications for Epidemic Modelling
The paper presents a machine learning framework to construct population-scale contact networks that preserve age-structured mixing and contact heterogeneity, enabling more realistic epidemic simulations.
Capturing the structured mixing within a population is key to the reliable projection of infectious disease dynamics and hence informed control. Both heterogeneity in the number of contacts and age-structured mixing have been repeatedly demonstrated as fundamental, yet are rarely combined. Networks provide a powerful and intuitive method to realise population structure and simulate infection dynamics. However, the explicit measurement of contact networks is not scalable to larger populations. Here, using data from social contact surveys, we develop a generalisable and robust algorithm utilising machine learning to generate a surrogate population-scale network that preserves both age-structured mixing and heterogeneity of contacts. We simulate the spread of infection across different populations, considering how the epidemic size varies over basic reproduction number ($R_0$) scenarios, mirroring the process of determining public health impact from early epidemic growth. Our approach shows that both age structure and degree heterogeneity substantially reduce the epidemic size. We also demonstrate that these simulations more accurately capture the heterogeneity in secondary cases observed for COVID-19 when transmission is scaled by contact duration, dampening the effect of highly connected "super-spreaders". By using survey data collected during 2020-2022, these network models also inform about the impacts of control and targeting of public health interventions: quantifying the non-linear reduction in transmission opportunities that occurred during lockdowns, and the ages and contact types most responsible for onward transmission. Our robust methodology therefore allows the full wealth of data commonly collected by surveys, but frequently overlooked, to be incorporated into more realistic transmission models of infectious diseases.
Research motivation and objectives
- Capture structured mixing and degree heterogeneity in population contact patterns.
- Develop a general framework to generate surrogate networks from ego-centric survey data.
- Demonstrate that age structure and contact heterogeneity significantly affect epidemic size and transmission dynamics.
- Compare network-based epidemic outcomes to SBM and homogeneous models across multiple data sets.
- Assess implications for public health interventions and survey design.
Proposed method
- Extract egocentric contact data with age and duration from surveys.
- Fit finite Gaussian Mixture Models to jointly model contact age groups and durations for each respondent age group.
- Generate a population of N nodes with age distribution matching census data and sample degree distributions from the GMM.
- Construct the network by a stratified configuration approach to connect stubs with compatible age and duration.
- Run SEIR-type epidemic simulations on the networks with a Gillespie algorithm and duration-weighted transmission.
- Evaluate network realism using Earth Mover’s Distance between ego-networks in data and in the model.
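The stub-matching step listed above can be illustrated with a minimal sketch. It assumes each half-edge ("stub") carries the owner's age group, the intended contact's age group, and a duration class. The function name `match_stubs`, the tuple layout, and the rule of discarding self-loops, duplicate links, and unmatched stubs are illustrative assumptions, not the paper's implementation.

```python
import random
from collections import defaultdict


def match_stubs(stubs, seed=0):
    """Pair stubs whose age groups and duration class are compatible.

    stubs: list of (node_id, own_age, target_age, duration) tuples.
    Returns a set of undirected edges as (low_id, high_id, duration).
    """
    rng = random.Random(seed)

    # Group stubs by (own age, target age, duration) and shuffle each group
    # so that partners are chosen at random within a stratum.
    buckets = defaultdict(list)
    for node, own, target, dur in stubs:
        buckets[(own, target, dur)].append(node)
    for nodes in buckets.values():
        rng.shuffle(nodes)

    edges = set()
    linked = set()

    def add_edge(u, v, dur):
        pair = (min(u, v), max(u, v))
        # Assumption: drop self-loops and repeated links rather than rewire.
        if u != v and pair not in linked:
            linked.add(pair)
            edges.add(pair + (dur,))

    for (own, target, dur), nodes in buckets.items():
        if own == target:
            # Within-group contacts: pair consecutive shuffled stubs.
            for u, v in zip(nodes[0::2], nodes[1::2]):
                add_edge(u, v, dur)
        elif own < target:
            # Between-group contacts: an (a -> b) stub needs a (b -> a) stub.
            # Each unordered pair of groups is handled once (own < target).
            partners = buckets.get((target, own, dur), [])
            for u, v in zip(nodes, partners):
                add_edge(u, v, dur)
    return edges


# Hypothetical toy input: two children, two adults.
stubs = [
    (0, "child", "child", "long"),
    (1, "child", "child", "long"),
    (0, "child", "adult", "short"),
    (2, "adult", "child", "short"),
    (3, "adult", "adult", "long"),  # no compatible partner, left unmatched
]
edges = match_stubs(stubs)
```

In this toy input the two child-to-child stubs form one long contact, the child-to-adult stub pairs with the adult-to-child stub, and node 3 stays isolated because no compatible stub exists. A fuller version would need a policy for leftover stubs, which this sketch simply discards.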
Experimental results
Research questions
- RQ1: Can a Gaussian Mixture Model-based reconstruction preserve both age-structured mixing and contact heterogeneity in synthetic networks?
- RQ2: How do age structure and degree heterogeneity influence final outbreak size for given R0 and transmission settings?
- RQ3: What is the impact of including contact duration on epidemic outcomes and dispersion (k) of secondary cases?
- RQ4: How do network-based simulations compare to SBM and homogeneous models across different data sets and lockdown scenarios?
- RQ5: What are the implications for targeting controls based on age groups and contact durations?
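The dispersion parameter k in RQ3 is conventionally the shape parameter of a negative binomial distribution of secondary cases, for which the variance equals mean + mean²/k, so smaller k means more superspreading. As a rough illustration, k can be estimated from simulated secondary-case counts by the method of moments. This is a simplification chosen for brevity (maximum likelihood is more usual in practice), and the function name and sample data below are hypothetical.

```python
from statistics import mean, variance


def dispersion_k(secondary_cases):
    """Method-of-moments estimate of the negative binomial dispersion k.

    Uses variance = m + m**2 / k, so k = m**2 / (variance - m).
    Returns infinity when the counts are not overdispersed
    (variance <= mean), which corresponds to the Poisson limit.
    """
    m = mean(secondary_cases)
    v = variance(secondary_cases)  # sample variance (n - 1 denominator)
    if v <= m:
        return float("inf")
    return m * m / (v - m)


# Hypothetical counts: most cases infect nobody, one infects many.
cases = [0, 0, 0, 0, 0, 0, 1, 1, 2, 16]
k = dispersion_k(cases)
```

Here the mean is 2 while the variance is much larger, so the estimate of k falls well below 1, the regime usually associated with strong heterogeneity in onward transmission.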
Key findings
- Heterogeneity in contact degree and age-structured mixing substantially reduce epidemic size for a given R0 compared to simpler models.
- Including contact duration rather than assuming equal transmission across contacts yields epidemic sizes that better match observed heterogeneity in secondary cases.
- SBM captures age structure but overestimates homogeneity, whereas GMM with duration scaling yields dispersion parameters (k) within ranges observed for COVID-19 in several data sets.
- Lockdown networks show lower R0 for the same transmission rate, due to reduced high-degree contacts, with effects varying between 2020 and 2021.
- For duration-based GMM networks, longer contacts dominate transmission to raise R0, but at higher R0 shorter contacts become more important for ongoing transmission.
- Age groups and contact durations contribute differently to early growth, with school-age children and ages 30-49 often contributing most to early transmission; effects shift with reopening and policy.
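The duration-scaled transmission behind several of the findings above can be sketched as a Gillespie-style SEIR simulation on a weighted network. This is a deliberately naive version that rebuilds the event list at every step. The function name, the parameter names (`beta`, `sigma`, `gamma`), and the choice to multiply the transmission rate by an edge weight are assumptions made for illustration; the paper's exact duration scaling is not reproduced here.

```python
import random


def gillespie_seir(edges, n_nodes, beta, sigma, gamma, seed_node=0, rng_seed=0):
    """Simulate SEIR on a weighted network; return (final size, end time).

    edges: iterable of (u, v, weight), where weight is a hypothetical
    duration-based multiplier on the per-contact transmission rate beta.
    sigma: rate E -> I. gamma: rate I -> R. Both must be positive.
    """
    rng = random.Random(rng_seed)
    neighbours = {i: [] for i in range(n_nodes)}
    for u, v, w in edges:
        neighbours[u].append((v, w))
        neighbours[v].append((u, w))

    state = ["S"] * n_nodes
    state[seed_node] = "I"
    t = 0.0

    while True:
        # Enumerate every possible event as (rate, node, new_state).
        events = []
        for i in range(n_nodes):
            if state[i] == "E":
                events.append((sigma, i, "I"))
            elif state[i] == "I":
                events.append((gamma, i, "R"))
                for j, w in neighbours[i]:
                    if state[j] == "S":
                        events.append((beta * w, j, "E"))

        total_rate = sum(rate for rate, _, _ in events)
        if total_rate <= 0:
            break  # no exposed or infectious nodes remain

        # Exponential waiting time, then choose an event in proportion to rate.
        t += rng.expovariate(total_rate)
        weights = [rate for rate, _, _ in events]
        _, node, new_state = rng.choices(events, weights=weights)[0]
        state[node] = new_state

    return state.count("R"), t


# Hypothetical toy network: a path 0-1-2 plus an isolated node 3.
toy_edges = [(0, 1, 1.0), (1, 2, 0.25)]
final_size, end_time = gillespie_seir(toy_edges, 4, beta=5.0, sigma=1.0, gamma=1.0)
```

Setting every weight to 1 recovers the equal-transmission case, so the same function can be used to compare duration-scaled and unscaled outbreaks on one network. For population-scale networks an implementation that updates rates incrementally would be needed, since this sketch costs a full pass over the nodes per event.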
This review was generated by AI and checked by a human editor.