QUICK REVIEW

[Paper Digest] Knowledge transfer of Deep Learning for galaxy morphology from one survey to another

H. Domínguez Sánchez, J. Zuntz|arXiv (Cornell University)|Jul 2, 2018

Remote Sensing in Agriculture | Cited by 1

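One-sentence summary

The paper shows that deep learning models pre-trained on Sloan Digital Sky Survey (SDSS) data can be adapted quickly to galaxy classification in the Dark Energy Survey (DES) with very little additional labeled data. Fine-tuning on just 300 to 500 DES galaxies raises accuracy above 95% and significantly improves completeness and purity, allowing morphological knowledge to be transferred efficiently between surveys with different instrument characteristics.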

ABSTRACT

Deep Learning (DL) algorithms for morphological classification of galaxies have proven very successful, mimicking (or even improving) visual classifications. However, these algorithms rely on large training samples of labeled galaxies (typically thousands of them). A key question for using DL classifications in future Big Data surveys is how much of the knowledge acquired from an existing survey can be exported to a new dataset, i.e. if the features learned by the machines are meaningful for different data. We test the performance of DL models, trained with Sloan Digital Sky Survey (SDSS) data, on Dark Energy survey (DES) using images for a sample of 5000 galaxies with a similar redshift distribution to SDSS. Applying the models directly to DES data provides a reasonable global accuracy ($\sim$ 90%), but small completeness and purity values. A fast domain adaptation step, consisting in a further training with a small DES sample of galaxies ($\sim$ 500-300), is enough for obtaining an accuracy > 95% and a significant improvement in the completeness and purity values. This demonstrates that, once trained with a particular dataset, machines can quickly adapt to new instrument characteristics (e.g., PSF, seeing, depth), reducing by almost one order of magnitude the necessary training sample for morphological classification. Redshift evolution effects or significant depth differences are not taken into account in this study.

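Motivation and objectives

Investigate whether deep learning models trained on one galaxy survey transfer effectively to another survey with different imaging characteristics.
Evaluate how well a pre-trained model performs on a new dataset (DES) without retraining from scratch, using only a small labeled subset of the target survey.
Assess how domain adaptation through fine-tuning affects accuracy, completeness, and purity in morphological classification.
Determine the minimum amount of labeled target-survey data needed to reach high performance after transfer.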

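Proposed method

Pre-train deep learning models on a large labeled galaxy sample from the Sloan Digital Sky Survey (SDSS).
Apply the pre-trained models directly to a sample of 5,000 Dark Energy Survey (DES) galaxies whose redshift distribution is similar to that of SDSS.
Run a fast domain adaptation step by fine-tuning on a small labeled subset of 300 to 500 DES galaxies.
Evaluate model performance on a DES test set with standard metrics: global accuracy, completeness, and purity.
Use the same network architecture and hyperparameters for both surveys, so that the effects of domain shift and adaptation can be isolated.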
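The pre-train, apply directly, then fine-tune workflow described above can be illustrated with a deliberately small toy. The sketch below is not the paper's model: it assumes a one-feature logistic regression in place of a deep network, synthetic Gaussian data in place of SDSS and DES images, and a constant feature offset of 1.5 as a stand-in for instrument differences. All function names, sample sizes, and step counts are hypothetical choices made for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_sample(n, offset, rng):
    """Synthetic one-feature 'survey': class means at -1 and +1, shifted by offset."""
    xs, ys = [], []
    for i in range(n):
        y = i % 2  # alternate labels so the classes are balanced
        mean = (1.0 if y == 1 else -1.0) + offset
        xs.append(rng.gauss(mean, 0.5))
        ys.append(y)
    return xs, ys


def train(xs, ys, w, b, steps, lr=0.5):
    """Full-batch gradient descent on the logistic loss, starting from (w, b)."""
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def accuracy(xs, ys, w, b):
    hits = sum(1 for x, y in zip(xs, ys) if (w * x + b >= 0.0) == (y == 1))
    return hits / len(xs)


rng = random.Random(0)
# Large labeled "source survey" (assumed stand-in for SDSS).
source_x, source_y = make_sample(1000, 0.0, rng)
# Small labeled "target survey" sample and a separate target test set
# (assumed stand-ins for DES); the 1.5 offset mimics a domain shift.
small_x, small_y = make_sample(40, 1.5, rng)
test_x, test_y = make_sample(1000, 1.5, rng)

# Step 1: pre-train on the source sample from zero weights.
w0, b0 = train(source_x, source_y, 0.0, 0.0, steps=200)
# Step 2: apply the pre-trained model directly to the target test set.
acc_direct = accuracy(test_x, test_y, w0, b0)
# Step 3: fine-tune, starting from the pre-trained weights, on the small target sample.
w1, b1 = train(small_x, small_y, w0, b0, steps=300)
acc_adapted = accuracy(test_x, test_y, w1, b1)

print(f"direct transfer accuracy: {acc_direct:.3f}")
print(f"after fine-tuning:        {acc_adapted:.3f}")
```

The point of the design is that fine-tuning reuses the pre-trained weights `(w0, b0)` as its starting point, so only a small target sample is needed to move the decision boundary. The size of the drop and of the recovery in this toy depends entirely on the assumed offset and says nothing about the figures reported in the paper.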

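Experimental results

Research questions

RQ1: Can deep learning models trained on SDSS data reach high accuracy when applied directly to DES data without retraining?
RQ2: How does the performance of a pre-trained model degrade when it is transferred to a new survey with different instrument characteristics?
RQ3: What is the minimum labeled sample size needed to reach high classification accuracy in the target survey (DES)?
RQ4: To what extent does fine-tuning improve completeness and purity in cross-survey morphological classification?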

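Key findings

Applying the SDSS-trained models directly to DES data yields a global accuracy of about 90%.
The same models perform poorly in completeness and purity, indicating degradation caused by domain shift.
Fine-tuning on just 300 to 500 DES galaxies raises accuracy above 95%.
Completeness and purity improve significantly after fine-tuning, indicating that the models generalize better in the target domain.
The results indicate that transferring knowledge from one survey to another can reduce the labeled training data required by almost one order of magnitude.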
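A global accuracy near 90% can coexist with low completeness and purity when the positive class is rare. The sketch below shows this, assuming the usual binary definitions: completeness = TP / (TP + FN), purity = TP / (TP + FP). The function name and the counts are invented for illustration and are not taken from the paper.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, completeness, purity) for binary labels, 1 = positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Completeness: fraction of true positives that the classifier recovers.
    completeness = tp / (tp + fn) if (tp + fn) else float("nan")
    # Purity: fraction of predicted positives that are truly positive.
    purity = tp / (tp + fp) if (tp + fp) else float("nan")
    return accuracy, completeness, purity


# Hypothetical imbalanced sample: 100 galaxies, only 10 in the positive class.
y_true = [1] * 10 + [0] * 90
# Hypothetical classifier: finds 5 of the 10 positives and
# wrongly flags 5 of the 90 negatives.
y_pred = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 85

acc, comp, pur = classification_metrics(y_true, y_pred)
print(f"accuracy={acc:.2f} completeness={comp:.2f} purity={pur:.2f}")
# prints "accuracy=0.90 completeness=0.50 purity=0.50"
```

Because the 85 correctly rejected negatives dominate the count, accuracy stays high even though half of the positives are missed and half of the flagged objects are wrong. This is why completeness and purity are reported alongside global accuracy.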


This digest was generated by AI and reviewed by human editors.