QUICK REVIEW

[Paper Digest] The VIMOS Public Extragalactic Redshift Survey (VIPERS). A Support Vector Machine classification of galaxies, stars and AGNs

K. Małek, A. Solarz|arXiv (Cornell University)|Mar 11, 2013

Spectroscopy and Chemometric Analyses | References: 48 | Cited by: 34

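One-sentence summary

This paper presents a Support Vector Machine (SVM) classifier that uses multi-band photometry (u*, g', r', i', z', Ks) to separate galaxies, stars and active galactic nuclei (AGNs) in the VIMOS Public Extragalactic Redshift Survey (VIPERS). With near-infrared (NIR) data included, the classifier reaches a self-check accuracy of 97% for galaxies, 97% for stars and 95% for AGNs, improving sample purity and enabling photometric classification of sources with low-quality spectra.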

ABSTRACT

The aim of this work is to develop a comprehensive method for classifying sources in large sky surveys and we apply the techniques to the VIMOS Public Extragalactic Redshift Survey (VIPERS). Using the optical (u*, g', r', i') and NIR data (z', Ks), we develop a classifier, based on broad-band photometry, for identifying stars, AGNs and galaxies improving the purity of the VIPERS sample. Support Vector Machine (SVM) supervised learning algorithms allow the automatic classification of objects into two or more classes based on a multidimensional parameter space. In this work, we tailored the SVM for classifying stars, AGNs and galaxies, and applied this classification to the VIPERS data. We train the SVM using spectroscopically confirmed sources from the VIPERS and VVDS surveys. We tested two SVM classifiers and concluded that including NIR data can significantly improve the efficiency of the classifier. The self-check of the best optical + NIR classifier has shown a 97% accuracy in the classification of galaxies, 97% for stars, and 95% for AGNs in the 5-dimensional colour space. In the test on VIPERS sources with 99% redshift confidence, the classifier gives an accuracy equal to 94% for galaxies, 93% for stars, and 82% for AGNs. The method was applied to sources with low quality spectra to verify their classification, and thus increasing the security of measurements for almost 4 900 objects. We conclude that the SVM algorithm trained on a carefully selected sample of galaxies, AGNs, and stars outperforms simple colour-colour selection methods, and can be regarded as a very efficient classification method particularly suitable for modern large surveys.
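The "5-dimensional colour space" in the abstract follows from the six bands: six magnitudes give five independent colours. The sketch below shows one way to build such a feature vector. It assumes adjacent-band colours (bluer minus redder magnitude); the abstract does not say which colour combinations the authors used, and the function name, band ordering and magnitudes here are illustrative only.

```python
# Band order assumed for illustration; the abstract does not specify
# which colour combinations were actually used.
BANDS = ["u*", "g'", "r'", "i'", "z'", "Ks"]


def adjacent_colours(mags):
    """Return the len(BANDS) - 1 adjacent-band colours (bluer minus redder)."""
    values = [mags[band] for band in BANDS]
    return [bluer - redder for bluer, redder in zip(values, values[1:])]


# Made-up magnitudes for a single source, not survey data.
example = {"u*": 23.5, "g'": 22.75, "r'": 22.0, "i'": 21.5, "z'": 21.25, "Ks": 20.25}
colours = adjacent_colours(example)
print(len(colours), colours)  # 5 [0.75, 0.75, 0.5, 0.25, 1.0]
```

Working with colours rather than raw magnitudes removes the dependence on overall brightness, which is one common reason to classify in colour space.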

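Research motivation and goals

Develop a robust, automated method for classifying astronomical sources in large photometric surveys.
Improve the purity of the VIPERS galaxy sample by accurately identifying and removing contaminants such as stars and AGNs.
Assess how near-infrared (NIR) photometry affects classification accuracy, compared with using optical data alone.
Enable photometric classification of sources whose spectroscopic redshifts are of poor quality or ambiguous.
Provide a scalable supervised machine-learning framework applicable to future extragalactic surveys.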

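Proposed method

Train a Support Vector Machine (SVM) classifier on spectroscopically confirmed sources from VIPERS and VVDS to classify objects as galaxies, stars or AGNs.
The best classifier combines optical and NIR photometry (u*, g', r', i', z', Ks) and works in a 5-dimensional colour space built from these six bands; an optical-only classifier serves as the comparison.
The training data come from spectroscopically confirmed sources, which makes the class labels highly reliable.
Validate the model with a self-check on the training sample, then test it on VIPERS sources with high redshift confidence.
Evaluate performance with per-class accuracy in both the optical-only and optical + NIR configurations.
Magnitude binning is introduced to make the classification more stable across flux levels.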
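The training and self-check steps above can be sketched with scikit-learn's `SVC`. This is a minimal illustration under stated assumptions, not the authors' pipeline: the RBF kernel, the values of `C` and `gamma`, the feature standardisation, and the synthetic class centres are all choices made here for the example, and the data are randomly generated rather than taken from VIPERS or VVDS.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
CLASSES = ["galaxy", "star", "AGN"]

# Hypothetical, well-separated class centres in a 5-dimensional colour space.
centres = {"galaxy": 1.0, "star": 0.0, "AGN": -1.0}


def fake_colours(centre, n):
    """Draw n synthetic 5-colour vectors around a class centre (not survey data)."""
    return rng.normal(loc=centre, scale=0.1, size=(n, 5))


X = np.vstack([fake_colours(centres[c], 200) for c in CLASSES])
y = np.repeat(CLASSES, 200)  # labels in the same order as the stacked rows

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Kernel and hyperparameters are assumptions made for this sketch.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)

self_check = clf.score(X_train, y_train)  # accuracy on the training sample
held_out = clf.score(X_test, y_test)      # accuracy on unseen sources
print(f"self-check: {self_check:.2f}, held-out: {held_out:.2f}")
```

The self-check score measures how well the classifier reproduces its own training labels, so it is normally an optimistic estimate; the held-out score plays the role of the independent test on high-confidence sources.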

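Experimental results

Research questions

RQ1: Can a Support Vector Machine effectively classify galaxies, stars and AGNs using broad-band photometry from the VIPERS survey?
RQ2: How does including near-infrared (NIR) photometry (z', Ks) affect classification accuracy compared with optical photometry alone?
RQ3: To what extent can the classifier improve the purity of the VIPERS galaxy sample by identifying and removing contaminants?
RQ4: How does the SVM classifier perform relative to existing methods, such as the Pan-STARRS1 classifier, given the different survey characteristics?
RQ5: Can the classifier reliably classify sources with poor-quality spectroscopic redshifts, and so enlarge the usable sample?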

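Key findings

When tested on the training sample with optical and NIR photometry (u*, g', r', i', z', Ks), the SVM classifier reaches an accuracy of 97% for galaxies, 97% for stars and 95% for AGNs.
Adding NIR data significantly improves classification efficiency; the optical + NIR classifier outperforms the optical-only one in both the self-check and the test.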
In the test on VIPERS sources with 99% redshift confidence, the classifier reaches an accuracy of 94% for galaxies, 93% for stars and 82% for AGNs.
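The optical-only classifier (u*, g', r', i') is reported at 94% accuracy for galaxies, 82% for AGNs and 93% for stars, a performance comparable to the Pan-STARRS1 classifier despite the lower-dimensional parameter space.
This performance is attributed to using the u* band in place of the zP1/yP1 bands and to magnitude binning, both of which improve stability and class separation in colour space.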
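The classifier was applied to almost 4,900 sources with low-quality spectra to verify their classification, which demonstrates its practical value for extending the data usable in cosmological analyses.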
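The per-class percentages quoted above can be read as follows, assuming "accuracy for a class" means the fraction of objects with a given spectroscopic class that the classifier assigns to that same class (the digest does not define the metric explicitly). The helper name and the toy labels below are illustrative, not results from the paper.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of each true class that was assigned the correct label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {cls: correct[cls] / totals[cls] for cls in totals}


# Toy labels for eight sources, not survey data.
truth = ["galaxy"] * 4 + ["star"] * 2 + ["AGN"] * 2
predicted = ["galaxy", "galaxy", "galaxy", "star", "star", "star", "AGN", "galaxy"]

print(per_class_accuracy(truth, predicted))
# {'galaxy': 0.75, 'star': 1.0, 'AGN': 0.5}
```

Reporting accuracy per class rather than overall matters when classes are unbalanced: a sample dominated by galaxies could show a high overall accuracy while still misclassifying many AGNs.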


This digest was generated by AI and reviewed by human editors.