QUICK REVIEW

[Paper Review] Detection of Unauthorized IoT Devices Using Machine Learning Techniques

Yair Meidan, Michael Bohadana | arXiv (Cornell University) | 2017. 09. 14.

Network Security and Intrusion Detection | References: 6 | Citations: 198

One-Line Summary

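The paper applies Random Forest to TCP/IP traffic features to automatically detect unauthorized IoT devices and to classify white-listed device types correctly. Using majority voting over a moving window of 20 sessions, it detects about 96% of unauthorized types as unknown and classifies white-listed types correctly in 99% of cases.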

ABSTRACT

Security experts have demonstrated numerous risks imposed by Internet of Things (IoT) devices on organizations. Due to the widespread adoption of such devices, their diversity, standardization obstacles, and inherent mobility, organizations require an intelligent mechanism capable of automatically detecting suspicious IoT devices connected to their networks. In particular, devices not included in a white list of trustworthy IoT device types (allowed to be used within the organizational premises) should be detected. In this research, Random Forest, a supervised machine learning algorithm, was applied to features extracted from network traffic data with the aim of accurately identifying IoT device types from the white list. To train and evaluate multi-class classifiers, we collected and manually labeled network traffic data from 17 distinct IoT devices, representing nine types of IoT devices. Based on the classification of 20 consecutive sessions and the use of majority rule, IoT device types that are not on the white list were correctly detected as unknown in 96% of test cases (on average), and white listed device types were correctly classified by their actual types in 99% of cases. Some IoT device types were identified quicker than others (e.g., sockets and thermostats were successfully detected within five TCP sessions of connecting to the network). Perfect detection of unauthorized IoT device types was achieved upon analyzing 110 consecutive sessions; perfect classification of white listed types required 346 consecutive sessions, 110 of which resulted in 99.49% accuracy. Further experiments demonstrated the successful applicability of classifiers trained in one location and tested on another. In addition, a discussion is provided regarding the resilience of our machine learning-based IoT white listing method to adversarial attacks.

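Motivation and Objectives

Strengthen enterprise IoT security by automatically detecting unauthorized devices on the network.
Develop a scalable, traffic-based white-listing system that requires no specialized hardware.
Demonstrate that machine learning can classify device types from TCP/IP traffic data.
Discuss how well trained models transfer between labs and how resilient the method is to adversarial attacks.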

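Proposed Method

Treat device type identification as a multi-class classification problem over labeled TCP/IP session features.
Train a Random Forest classifier on sessions from white-listed device types to produce a posterior probability for each type.
Using a classification threshold tr, label each session either with a device type or as unknown, based on the posterior probabilities.
Apply majority voting over a moving window of 20 consecutive sessions to decide the device type of a stream and improve robustness.
Split the data chronologically into training, validation, and test sets to simulate a real deployment and mitigate overfitting.
Address class imbalance, and optimize the threshold by F1 on the validation set.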
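
The thresholding and majority-voting steps listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `>=` comparison against tr, the tie-breaking behavior, and the toy posterior values are all assumptions, and the posteriors would in practice come from a trained Random Forest rather than hand-written dictionaries.

```python
from collections import Counter, deque

UNKNOWN = "unknown"  # label for sessions that match no white-listed type


def label_session(posteriors, tr):
    """Return the most probable device type, or UNKNOWN if its
    posterior probability falls below the threshold tr.
    (Using >= rather than > is an assumption.)"""
    best_type = max(posteriors, key=posteriors.get)
    return best_type if posteriors[best_type] >= tr else UNKNOWN


def majority_vote_stream(session_labels, window=20):
    """Yield the majority label over a moving window of the most
    recent `window` session labels. Nothing is yielded until the
    window is full; ties go to the label seen first in the window
    (an assumption, the review does not describe tie-breaking)."""
    recent = deque(maxlen=window)
    for label in session_labels:
        recent.append(label)
        if len(recent) == window:
            yield Counter(recent).most_common(1)[0][0]


# Toy example with hypothetical posteriors and a small window.
sessions = [
    {"TV": 0.90, "socket": 0.10},
    {"TV": 0.55, "socket": 0.45},  # below tr, so labeled unknown
    {"TV": 0.80, "socket": 0.20},
]
labels = [label_session(p, tr=0.6) for p in sessions]
decisions = list(majority_vote_stream(labels, window=3))
print(labels)     # ['TV', 'unknown', 'TV']
print(decisions)  # ['TV']
```

The toy run shows why the voting step adds robustness: one low-confidence session is labeled unknown, but the window-level decision is still TV.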

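Experimental Results

Research Questions

RQ1: Can TCP/IP traffic features distinguish between authorized IoT device types in a real enterprise environment?
RQ2: Can unknown (unauthorized) IoT device types be reliably detected as "unknown" from a single session and from a sequence of sessions?
RQ3: Does majority voting across multiple sessions improve detection and classification accuracy for authorized and unauthorized devices?
RQ4: How well does a classifier trained in one lab transfer to data from another lab (transferability)?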

Key Results

Excluded device type	Sessions	Correctly detected as unknown	Correctly classified on white list (weighted average)
baby_monitor	1981	1	1
smoke_detector	104	1	1
socket	1962	1	1
TV	1962	0.84	0.98
refrigerator	1981	0.99	1
thermostat	1981	1	1
motion_sensor	1239	1	0.99
security_camera	1375	0.94	0.99
watch	1111	0.84	0.97

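With a moving window of 20 sessions, unauthorized IoT device types were detected as unknown in 96% of test cases on average; according to the abstract, perfect detection was reached upon analyzing 110 consecutive sessions.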
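On the test set, white-listed device types were classified by their actual type in 99% of cases using 20-session majority voting.
Across nine experiments (each excluding one device type from the white list), unknown-device detection averaged 96% accuracy and white-listed device classification averaged 99%.
In the transferability experiments, a classifier trained in Lab A achieved 85% average accuracy on unauthorized TVs in Lab B, and detection of unauthorized security cameras transferred perfectly (100%) between labs.
The majority-voting step substantially improved overall performance for most device types, bringing the test-set averages to 96% for unauthorized detection and 99% for white-list classification.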
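
The 96% and 99% averages can be checked against the table above. The sketch below takes the unweighted mean of each column over the nine experiments; that the reported figures are simple means over experiments is an assumption, and the variable names are illustrative.

```python
from statistics import mean

# (correctly detected as unknown, correctly classified on white list)
# per excluded device type, copied from the results table.
results = {
    "baby_monitor": (1.00, 1.00),
    "smoke_detector": (1.00, 1.00),
    "socket": (1.00, 1.00),
    "TV": (0.84, 0.98),
    "refrigerator": (0.99, 1.00),
    "thermostat": (1.00, 1.00),
    "motion_sensor": (1.00, 0.99),
    "security_camera": (0.94, 0.99),
    "watch": (0.84, 0.97),
}

avg_unknown = mean(u for u, _ in results.values())
avg_whitelist = mean(w for _, w in results.values())

print(f"unknown detection: {avg_unknown:.3f}")    # 0.957, i.e. about 96%
print(f"white-list classification: {avg_whitelist:.3f}")  # 0.992, i.e. about 99%
```

TV and watch, at 0.84 each, account for most of the gap from perfect detection.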

This review was generated by AI and reviewed by a human editor.