QUICK REVIEW

[Paper Review] Machine Learning (ML) library in Linux kernel

Viacheslav Dubeyko|arXiv (Cornell University)|Mar 2, 2026
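Advanced Data Storage Technologies | Cited by 0
One-Sentence Summary

This paper proposes a general-purpose ML infrastructure that enables ML use in the Linux kernel by connecting user-space ML models to kernel-space ML model proxies, and it demonstrates feasibility with a PoC implementation. It discusses the architecture, the interaction modes, and future work on ML-driven kernel optimization.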

ABSTRACT

The Linux kernel is a huge code base with an enormous number of subsystems and possible configuration options, which makes elaborating an efficient configuration unmanageably complex. Machine Learning (ML) is an approach to learning from data, finding patterns, and making predictions without developers implementing the algorithms explicitly, and it can introduce a self-evolving capability into the Linux kernel. However, introducing ML approaches into the Linux kernel is not easy, because there is no direct use of floating-point operations (FPU) in kernel space and ML models can potentially cause significant performance degradation in the kernel. The paper proposes an ML infrastructure architecture for the Linux kernel that addresses these problems and enables the use of ML models in kernel space. The proposed kernel ML library has been implemented as a Proof Of Concept (PoC) project, with the goal of demonstrating the feasibility of the proposal and designing the interface through which the kernel-space ML model proxy interacts with the user-space ML model thread.

Research Motivation and Objectives

  • Motivate the need for self-evolving Linux kernel configurations and subsystems using ML.
  • Define a generalized ML infrastructure that enables interaction between kernel space and user-space ML models.
  • Demonstrate feasibility via a Proof Of Concept implementation and outline an interface for ML model proxies in the kernel.

Proposed Method

  • Propose a kernel ML library architecture that provides an interface for ML model proxy creation, start/stop, data exchange, and application of recommendations in kernel space.
  • Advocate running ML models in user space to leverage FPUs and Python, while kernel space hosts a proxy that interacts with the kernel subsystem.
  • Describe data collection and preprocessing flows from kernel space to user space and back, using sysfs, FUSE, or character devices for data transfer.
  • Introduce interaction modes for the kernel ML proxy: emergency, learning, collaboration, and recommendation, including back-propagation for model correction.
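
The kernel-to-user-space data flow described above can be sketched from the user-space side. The following is a minimal Python illustration, not the paper's PoC: ordinary temporary files stand in for the sysfs attributes or character device, and the file names, the `SCALE` fixed-point factor, and the mean-based `predict` placeholder are all assumptions made for this example. It shows the floating-point work staying in user space while only a scaled integer is written back for a kernel-side proxy to consume.

```python
import os
import tempfile

# Assumed fixed-point scale: the kernel side would receive integers, not floats.
SCALE = 1000


def read_samples(path):
    """Parse one integer sample per line, as a text attribute might expose them."""
    with open(path, "r", encoding="ascii") as f:
        return [int(line) for line in f if line.strip()]


def predict(samples):
    """Placeholder floating-point 'model': the mean of the samples."""
    return sum(samples) / len(samples) if samples else 0.0


def write_recommendation(path, value):
    """Write the prediction as a scaled integer so the reader needs no FPU."""
    fixed = round(value * SCALE)
    with open(path, "w", encoding="ascii") as f:
        f.write(f"{fixed}\n")
    return fixed


def run_once(data_path, rec_path):
    """One iteration: read kernel-exported data, infer, write the result back."""
    return write_recommendation(rec_path, predict(read_samples(data_path)))


def demo():
    """Exercise the loop on temp files standing in for kernel-exported files."""
    with tempfile.TemporaryDirectory() as d:
        data_path = os.path.join(d, "samples")          # hypothetical name
        rec_path = os.path.join(d, "recommendation")    # hypothetical name
        with open(data_path, "w", encoding="ascii") as f:
            f.write("10\n20\n45\n")
        run_once(data_path, rec_path)
        with open(rec_path, encoding="ascii") as f:
            return f.read()
```

Calling `demo()` writes three samples, averages them in floating point, and returns the text that a kernel-side reader would parse as an integer. A real deployment would replace the temp files with whatever interface the kernel ML library actually exports.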

Experimental Results

Research Questions

  • RQ1: How can ML models be effectively integrated into the Linux kernel without compromising performance?
  • RQ2: What is a feasible architecture for a kernel-space ML model proxy that coordinates with a user-space ML model?
  • RQ3: What interaction modes and data pathways enable iterative training and inference in kernel subsystems?
  • RQ4: Can a PoC demonstrate the practicality of ML-driven subsystem configuration and synthesized kernel logic?
  • RQ5: What subsystems (e.g., GC, DAMON) could benefit from ML-based optimization in future work?

Key Findings

  • A generalized kernel ML library architecture can interconnect kernel subsystems with user-space ML models via a proxy in kernel space.
  • Running ML models in user space, where FPUs are available, is favored over in-kernel execution, with interfaces such as sysfs, FUSE, or character devices used for data exchange.
  • Multiple interaction modes (emergency, learning, collaboration, recommendation) are proposed to manage ML-driven changes and safety in kernel subsystems.
  • A PoC implementation demonstrates feasibility and provides a blueprint for interface design between kernel-space proxies and user-space ML threads.
  • Future work targets ML-based GC for multiple file systems and an ML-based DAMON extension to test real-world efficacy.
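
The four interaction modes named in the findings can be pictured as a small dispatch on the proxy side. This summary lists the modes but does not define them, so the semantics below are assumptions for illustration only: emergency and learning fall back to the subsystem's default, collaboration blends default and recommendation, and recommendation applies the model's value directly. `ProxyMode` and `choose_setting` are hypothetical names, and Python is used here only for readability; a kernel-side proxy would be written in C.

```python
from enum import Enum


class ProxyMode(Enum):
    """The four modes named in the summary; their meaning here is assumed."""
    EMERGENCY = "emergency"
    LEARNING = "learning"
    COLLABORATION = "collaboration"
    RECOMMENDATION = "recommendation"


def choose_setting(mode, default_value, recommended_value):
    """Return the integer setting a subsystem would apply under the assumed semantics."""
    if mode in (ProxyMode.EMERGENCY, ProxyMode.LEARNING):
        # Assumed: model unavailable or still training, so keep the default.
        return default_value
    if mode is ProxyMode.COLLABORATION:
        # Assumed: blend default and recommendation (integer midpoint, no floats).
        return (default_value + recommended_value) // 2
    # Assumed: in recommendation mode the model's value is applied as-is.
    return recommended_value
```

For example, with a default of 100 and a recommendation of 200, this sketch yields 100 in emergency or learning mode, 150 in collaboration mode, and 200 in recommendation mode. Integer-only arithmetic is used deliberately, since the paper notes that floating-point operations are not directly available in kernel space.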

This review was generated by AI and reviewed by human editors.