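[Paper Summary] Trusted Artificial Intelligence: Towards Certification of Machine Learning Applications
This paper proposes a certification framework and an audit catalog for machine learning applications, focusing on low-risk supervised learning systems. The framework introduces four criticality levels to assess the impact of decisions, and it integrates security, data quality, ethics, and compliance into one holistic assessment process, with the aim of achieving trustworthy AI through formal certification standards.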
Artificial Intelligence is one of the fastest growing technologies of the 21st century and accompanies us in our daily lives when interacting with technical applications. However, reliance on such technical systems is crucial for their widespread applicability and acceptance. The societal tools to express reliance are usually formalized by lawful regulations, i.e., standards, norms, accreditations, and certificates. Therefore, the TÜV AUSTRIA Group, in cooperation with the Institute for Machine Learning at the Johannes Kepler University Linz, proposes a certification process and an audit catalog for Machine Learning applications. We are convinced that our approach can serve as the foundation for the certification of applications that use Machine Learning and Deep Learning, the techniques that drive the current revolution in Artificial Intelligence. While certain high-risk areas, such as fully autonomous robots in workspaces shared with humans, are still some time away from certification, we aim to cover low-risk applications with our certification procedure. Our holistic approach attempts to analyze Machine Learning applications from multiple perspectives to evaluate and verify the aspects of secure software development, functional requirements, data quality, data protection, and ethics. Inspired by existing work, we introduce four criticality levels to map the criticality of a Machine Learning application regarding the impact of its decisions on people, environment, and organizations. Currently, the audit catalog can be applied to low-risk applications within the scope of supervised learning as commonly encountered in industry. Guided by field experience, scientific developments, and market demands, the audit catalog will be extended and modified accordingly.
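Motivation and Goals
- Develop a formal certification process for machine learning applications to strengthen trust and societal acceptance.
- Address the lack of standardized evaluation criteria for machine learning systems in real-world industrial applications.
- Build a multi-dimensional audit catalog covering software security, data quality, privacy, and ethical considerations.
- Define criticality levels based on the potential impact of machine learning decisions on people, the environment, and organizations.
- Lay the groundwork for certifying future high-risk AI systems through iterative refinement driven by field experience and market demands.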
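Proposed Method
- Introduce a four-level criticality scheme that classifies machine learning applications by the severity of the impact of their decisions.
- Develop a comprehensive audit catalog covering secure software development, functional correctness, data quality, data protection, and ethical compliance.
- Apply the framework to the low-risk supervised learning applications commonly found in industry.
- Use field experience, scientific insights, and market demands to guide the iterative evolution of the audit catalog.
- Incorporate principles from existing standards and regulations so that the framework stays aligned with legal and societal trust mechanisms.
- Design the certification process as a holistic, multi-perspective assessment of the properties of a machine learning system.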
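The criticality scheme described above can be pictured as a small data structure. The following is only an illustrative sketch: the summary does not name the four levels, does not say how the three impact dimensions are combined, and does not say which levels count as "low risk". The level names, the worst-case aggregation in `overall`, and the `low_risk_ceiling` default are all assumptions made for this example, not the authors' method.

```python
from dataclasses import dataclass
from enum import IntEnum


class Criticality(IntEnum):
    """Four ordered levels. The names are placeholders, not the paper's labels."""
    LEVEL_1 = 1
    LEVEL_2 = 2
    LEVEL_3 = 3
    LEVEL_4 = 4


@dataclass(frozen=True)
class ImpactAssessment:
    """Impact of an application's decisions along the three dimensions
    named in the abstract: people, environment, and organizations."""
    people: Criticality
    environment: Criticality
    organizations: Criticality

    def overall(self) -> Criticality:
        # Assumption: the most severe dimension decides the overall level.
        return max(self.people, self.environment, self.organizations)


def in_current_scope(
    assessment: ImpactAssessment,
    is_supervised: bool,
    low_risk_ceiling: Criticality = Criticality.LEVEL_2,  # hypothetical cut-off
) -> bool:
    """Return True if the application is supervised learning and its overall
    criticality does not exceed the assumed low-risk ceiling."""
    return is_supervised and assessment.overall() <= low_risk_ceiling


if __name__ == "__main__":
    example = ImpactAssessment(
        people=Criticality.LEVEL_1,
        environment=Criticality.LEVEL_2,
        organizations=Criticality.LEVEL_1,
    )
    print(example.overall().name, in_current_scope(example, is_supervised=True))
```

Taking the maximum over the dimensions is a conservative choice for a sketch like this, since a single severe impact keeps an application out of the assumed low-risk scope; the actual catalog may weigh the dimensions differently.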
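Experimental Results
Research Questions
- RQ1: How can the trustworthiness of machine learning applications be evaluated systematically across multiple dimensions?
- RQ2: Which criteria should be used to define and classify the criticality of machine learning decisions, based on their societal and operational impact?
- RQ3: How can a formal certification process be established for machine learning systems in low-risk industrial applications?
- RQ4: Which elements must the audit catalog contain to ensure compliance with security, data quality, privacy, and ethical standards?
- RQ5: How can the certification framework be extended and adapted as machine learning applications evolve and move into higher-risk domains?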
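Key Findings
- The proposed audit catalog is applicable to low-risk supervised learning applications in industrial settings.
- The four criticality levels provide a structured way to assess the potential impact of machine learning decisions on people, the environment, and organizations.
- The framework supports a holistic assessment of machine learning systems across security, data quality, privacy, and ethics.
- The certification process is designed to evolve continuously through field experience, scientific progress, and market feedback.
- The approach lays the groundwork for the future formal certification of high-risk AI systems, such as autonomous robots in shared workspaces.
- The audit catalog is currently applied in a soft review format, with iterative improvement in mind.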
This summary was generated by AI and reviewed by a human editor.