[Paper Review] The Surveillance AI Pipeline
A rapidly growing number of voices argue that AI research, and computer vision in particular, is powering mass surveillance. Yet the direct path from computer vision research to surveillance has remained obscured and difficult to assess. Here, we reveal the Surveillance AI pipeline by analyzing three decades of computer vision research papers and downstream patents, more than 40,000 documents. We find the large majority of annotated computer vision papers and patents self-report their technology enables extracting data about humans. Moreover, the majority of these technologies specifically enable extracting data about human bodies and body parts. We present both quantitative and rich qualitative analysis illuminating these practices of human data extraction. Studying the roots of this pipeline, we find that institutions that prolifically produce computer vision research, namely elite universities and "big tech" corporations, are subsequently cited in thousands of surveillance patents. Further, we find consistent evidence against the narrative that only these few rogue entities are contributing to surveillance. Rather, we expose the fieldwide norm that when an institution, nation, or subfield authors computer vision papers with downstream patents, the majority of these papers are used in surveillance patents. In total, we find the number of papers with downstream surveillance patents increased more than five-fold between the 1990s and the 2010s, with computer vision research now having been used in more than 11,000 surveillance patents. Finally, in addition to the high levels of surveillance we find documented in computer vision papers and patents, we unearth pervasive patterns of documents using language that obfuscates the extent of surveillance. Our analysis reveals the pipeline by which computer vision research has powered the ongoing expansion of surveillance.
Research Motivation and Goals
- Assess how often computer vision research self-reports that its technology enables extracting data about humans.
- Quantify the prevalence and types of human data extraction in papers and downstream patents.
- Map the roots of Surveillance AI across institutions, nations, subfields, and years to reveal fieldwide norms.
- Identify linguistic practices that obscure the surveillance implications within CV research and patents.
Proposed Method
- Collect and analyze a corpus of more than 40,000 documents, linking 19,000+ CV papers to 23,000+ downstream patents.
- Perform qualitative content analysis on a subset (100 papers and 100 patents) to categorize human-data extraction targets.
- Compute quantitative statistics on the prevalence of human data extraction across the full corpus (papers vs. patents).
- Analyze trends across decades in the share of CV papers with downstream patents that are used in surveillance patents.
- Examine linguistic patterns and obfuscating language that conceal surveillance implications in texts.
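
The decade-level trend step above can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the function names `decade_of` and `surveillance_share_by_decade`, and the eight toy records are all assumptions invented for illustration, and the toy values are not taken from the paper. The sketch only shows how a per-decade share of papers with downstream surveillance patents could be computed once each paper carries a publication year and a surveillance flag.

```python
from collections import defaultdict

# Hypothetical toy records, NOT data from the paper:
# (paper_id, publication_year, has_downstream_surveillance_patent)
papers = [
    ("p1", 1994, True),
    ("p2", 1997, False),
    ("p3", 2005, True),
    ("p4", 2008, True),
    ("p5", 2012, True),
    ("p6", 2015, False),
    ("p7", 2018, True),
    ("p8", 2019, True),
]


def decade_of(year):
    """Map a year to a decade label, e.g. 1994 -> '1990s'."""
    return f"{year // 10 * 10}s"


def surveillance_share_by_decade(records):
    """Return {decade: share of papers flagged as used in surveillance patents}."""
    totals = defaultdict(int)   # papers with downstream patents, per decade
    flagged = defaultdict(int)  # of those, papers used in surveillance patents
    for _paper_id, year, is_surveillance in records:
        decade = decade_of(year)
        totals[decade] += 1
        if is_surveillance:
            flagged[decade] += 1
    return {decade: flagged[decade] / totals[decade] for decade in sorted(totals)}


shares = surveillance_share_by_decade(papers)
for decade, share in shares.items():
    print(f"{decade}: {share:.0%}")
```

On a real corpus, the surveillance flag would come from the annotation and patent-linking steps listed above; here it is simply hard-coded.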
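Experimental Results
Research Questions
- RQ1: What proportion of annotated computer vision papers and downstream patents extract data about humans?
- RQ2: What are the four targets of human data extraction identified in CV papers and patents, and how prevalent is each?
- RQ3: How much do CV papers with downstream patents contribute to surveillance patents across decades, institutions, nations, and subfields?
- RQ4: Is there evidence of obfuscating language that conceals the surveillance potential of CV research and patents?
- RQ5: How did the relationship between CV research and surveillance patents evolve from the 1990s to the 2010s?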
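Key Results
- 90% of annotated CV papers and patents enable extracting data about humans.
- 68% of papers and patents explicitly extract data about human bodies and body parts.
- Among CV papers with downstream patents, the share used in surveillance patents rose from 50% in the 1990s to 79% in the 2010s.
- The number of CV papers with downstream surveillance patents increased more than five-fold between the 1990s and the 2010s.
- Obfuscating language that treats humans as objects or conceals human data extraction is pervasive in figures and datasets.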
This review was generated by AI and reviewed by a human editor.