QUICK REVIEW

[Paper Review] Position: Towards Bidirectional Human-AI Alignment

Hua Shen, Tiffany Knearem|arXiv (Cornell University)|2024. 06. 13.

Digital Transformation in Industry | Citations: 13

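One-Line Summary

Drawing on a systematic review of more than 400 papers, this paper defines and frames Bidirectional Human-AI Alignment. It proposes a reciprocal, long-term framework that covers both aligning AI with humans and aligning humans with AI, and outlines directions for future work.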

ABSTRACT

Recent advances in general-purpose AI underscore the urgent need to align AI systems with human goals and values. Yet, the lack of a clear, shared understanding of what constitutes "alignment" limits meaningful progress and cross-disciplinary collaboration. In this position paper, we argue that the research community should explicitly define and critically reflect on "alignment" to account for the bidirectional and dynamic relationship between humans and AI. Through a systematic review of over 400 papers spanning HCI, NLP, ML, and more, we examine how alignment is currently defined and operationalized. Building on this analysis, we introduce the Bidirectional Human-AI Alignment framework, which not only incorporates traditional efforts to align AI with human values but also introduces the critical, underexplored dimension of aligning humans with AI -- supporting cognitive, behavioral, and societal adaptation to rapidly advancing AI technologies. Our findings reveal significant gaps in current literature, especially in long-term interaction design, human value modeling, and mutual understanding. We conclude with three central challenges and actionable recommendations to guide future research toward more nuanced, reciprocal, and human-AI alignment approaches.

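Research Motivation and Goals

Clarify the definition and scope of human-AI alignment across disciplines.
Propose a bidirectional human-AI alignment framework that covers both aligning AI with humans and aligning humans with AI.
Synthesize research findings on human values, interaction techniques, and evaluation for alignment.
Present three challenges, ranging from the near term to the long term, along with potential future solutions.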

Proposed Method

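A systematic literature review of more than 400 papers published from 2019 through January 2024, following the PRISMA guidelines.
Iterative coding to derive the bidirectional framework and its taxonomy.
Qualitative and quantitative analysis of the papers to draw insights on values, interaction, and evaluation.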
Development of a shared vocabulary and typology to harmonize interdisciplinary research.
Validation against cross-domain literature and ethics conferences (FAccT, AIES).

Experimental Results

Research Questions

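RQ1. Which human values relevant to AI alignment are studied, and how do humans specify these values?
RQ2. How are human specifications integrated into AI development?
RQ3. How does existing work improve human understanding and evaluation of AI alignment?
RQ4. What practices are used to design interfaces and interactions that facilitate human-AI collaboration?
RQ5. How have AI systems been adapted to meet the needs of groups with diverse human values?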

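Key Findings

A clarified definition and scope of human-AI alignment, covering whom to align with, the goals of alignment, and which values to align.
A bidirectional human-AI alignment framework with a fine-grained typology, addressing both aligning AI with humans and aligning humans with AI.
Insights into human values, interaction techniques, and the differences between AI evaluation and human evaluation.
Three challenges for future research, spanning the near term to the long term, with proposed directions for addressing them.
A systematized vocabulary that facilitates cross-disciplinary communication among alignment researchers.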

This review was generated by AI and reviewed by a human editor.