QUICK REVIEW

[Paper Review] Overparameterization: A Connection Between Software 1.0 and Software 2.0

Atkinson, Eric, Yang, Cambridge|arXiv (Cornell University)|May 4, 2018
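Bayesian Modeling and Causal Inference|References: 23|Cited by: 6
One-Sentence Summary

Shuffle is a domain-specific programming language that lets developers hand-write correct and efficient probabilistic inference algorithms by enforcing the rules of probability theory and statistical dependence. By combining type-safe distribution operations, formal verification based on proof rules, and automatic code generation with performance optimizations, Shuffle produces inference procedures that are formally correct and up to 3.1x faster than existing systems such as Venture.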

ABSTRACT

A new ecosystem of machine-learning driven applications, titled Software 2.0, has arisen that integrates neural networks into a variety of computational tasks. Such applications include image recognition, natural language processing, and other traditional machine learning tasks. However, these techniques have also grown to include other structured domains, such as program analysis and program optimization for which novel, domain-specific insights mate with model design. In this paper, we connect the world of Software 2.0 with that of traditional software - Software 1.0 - through overparameterization: a program may provide more computational capacity and precision than is necessary for the task at hand. In Software 2.0, overparameterization - when a machine learning model has more parameters than datapoints in the dataset - arises as a contemporary understanding of the ability for modern, gradient-based learning methods to learn models over complex datasets with high accuracy. Specifically, the more parameters a model has, the better it learns. In Software 1.0, the results of the approximate computing community show that traditional software is also overparameterized in that software often simply computes results that are more precise than is required by the user. Approximate computing exploits this overparameterization to improve performance by eliminating unnecessary, excess computation. For example, one of many techniques is to reduce the precision of arithmetic in the application. In this paper, we argue that the gap between available precision and that which is required for either Software 1.0 or Software 2.0 is a fundamental aspect of software design that illustrates the balance between software designed for general purposes and domain-adapted solutions. A general-purpose solution is easier to develop and maintain than a domain-adapted solution. However, that ease comes at the expense of performance.
We show that the approximate computing community and the machine learning community have developed overlapping techniques to improve performance by reducing overparameterization. We also show that because of these shared techniques, questions, concerns, and answers on how to construct software can translate from one software variant to the other.
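The abstract's example of reducing arithmetic precision can be illustrated with a minimal sketch. It is not taken from the paper: the readings, the tolerance, and the helper names `to_float32`, `mean_full`, and `mean_reduced` are all assumptions chosen for illustration. The sketch computes the same mean in 64-bit arithmetic and in emulated 32-bit arithmetic, then checks whether the reduced-precision result still meets a user-specified tolerance.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def mean_full(values):
    """Mean computed in ordinary 64-bit arithmetic."""
    return sum(values) / len(values)


def mean_reduced(values):
    """Mean where every intermediate result is rounded to 32 bits."""
    total = 0.0
    for v in values:
        total = to_float32(total + to_float32(v))
    return to_float32(total / len(values))


# Hypothetical inputs and tolerance, for illustration only.
readings = [20.0 + 0.01 * i for i in range(1000)]
tolerance = 0.01

exact = mean_full(readings)
approx = mean_reduced(readings)
error = abs(exact - approx)

print(f"64-bit mean: {exact:.10f}")
print(f"32-bit mean: {approx:.10f}")
print(f"error within tolerance: {error <= tolerance}")
```

If the user only needs the mean to within the stated tolerance, the 64-bit computation provides more precision than the task requires, which is the sense of overparameterization the abstract attributes to Software 1.0.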

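Research Motivation and Goals

  • Address the lack of formal correctness guarantees in hand-written probabilistic inference algorithms.
  • Provide a programming model that enforces the rules of probability theory and statistical dependence.
  • Generate optimized, high-performance inference code from high-level probabilistic abstractions.
  • Bridge the gap between hand-written inference implementations (Software 1.0) and automated synthesis (Software 2.0) through verified, composable abstractions.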

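Proposed Method

  • Shuffle introduces a type system that tracks conditional dependencies and distributions, ensuring conformance to the axioms of probability.
  • Correctness is enforced through proof rules that verify distribution operations and conditional independence.
  • The language supports first-class distribution primitives along with composable operations such as marginalization and conditional density computation.
  • Shuffle generates optimized low-level implementations through static analysis and code transformation.
  • It supports incremental updates and loop optimizations to improve performance.
  • The system compiles high-level inference procedures into efficient, single-threaded C++ code with minimal overhead.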
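The idea of distribution operations that reject invalid uses can be sketched in a few lines of Python. This is not Shuffle syntax and not the paper's type system: the `Dist` class, its `marginalize` and `chain` methods, and the rain/wet-grass numbers are hypothetical, and the checks here run at runtime, whereas the summary describes Shuffle's checks as static. The sketch only shows the general pattern of tracking which variables a distribution is over and which it is conditioned on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dist:
    """Discrete table for p(targets | given).

    Keys of `table` are value tuples ordered as targets + given.
    """
    targets: tuple
    given: tuple
    table: dict

    def marginalize(self, var):
        """Sum out a target variable, e.g. p(A, B | C) -> p(A | C)."""
        if var not in self.targets:
            raise TypeError(f"cannot marginalize {var!r}: not a target variable")
        idx = (self.targets + self.given).index(var)
        out = {}
        for key, p in self.table.items():
            reduced = key[:idx] + key[idx + 1:]
            out[reduced] = out.get(reduced, 0.0) + p
        remaining = tuple(t for t in self.targets if t != var)
        return Dist(remaining, self.given, out)

    def chain(self, prior):
        """Chain rule: p(A | B) * p(B) -> p(A, B)."""
        if prior.given or set(prior.targets) != set(self.given):
            raise TypeError("prior must be unconditional over the conditioning variables")
        names = self.targets + self.given
        out = {}
        for key, p in self.table.items():
            values = dict(zip(names, key))
            prior_key = tuple(values[n] for n in prior.targets)
            out[key] = p * prior.table[prior_key]
        return Dist(names, (), out)


# Hypothetical model: p(rain) and p(wet | rain).
p_rain = Dist(("rain",), (), {(True,): 0.2, (False,): 0.8})
p_wet_given_rain = Dist(
    ("wet",), ("rain",),
    {(True, True): 0.9, (False, True): 0.1,
     (True, False): 0.1, (False, False): 0.9},
)

joint = p_wet_given_rain.chain(p_rain)   # p(wet, rain)
p_wet = joint.marginalize("rain")        # p(wet)

# Summing out a conditioning variable is not a valid operation, so it is rejected.
try:
    p_wet_given_rain.marginalize("rain")
    rejected = False
except TypeError:
    rejected = True
```

Here `p(wet | rain)` cannot be marginalized over `rain` directly; it must first be combined with `p(rain)` through the chain rule. A static type system would aim to catch the same mistake before the program runs.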

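Experimental Results

Research Questions

  • RQ1: Can a programming language enforce correctness in hand-written probabilistic inference algorithms through formal types and proof rules?
  • RQ2: How can high-level probabilistic abstractions be compiled into efficient, low-level code without sacrificing correctness?
  • RQ3: How much do verified, composable abstractions improve performance and correctness relative to existing systems?
  • RQ4: Can the separation between specification and inference implementation be modeled formally so that both correctness and optimization are supported?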

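Key Findings

  • Inference procedures generated by Shuffle run up to 3.1x faster than Venture's implementations, and the high-level abstractions introduce no performance degradation.
  • Incremental optimization in Shuffle improves performance by at least 30x over the unoptimized version, demonstrating the importance of the code generation strategy.
  • The system successfully implements and verifies inference algorithms for standard models, including Gaussian mixture models (GMM), latent Dirichlet allocation (LDA), and dynamic Bayesian networks (DMM).
  • Shuffle's type system prevents misuse of conditional independence and distribution composition, ensuring that all generated code conforms to probability theory.
  • The language supports collapsed Gibbs sampling and likelihood weighting through verified, composable primitives, enabling efficient inference.
  • The performance evaluation shows that Shuffle's abstraction mechanisms incur no significant overhead, confirming that correctness and efficiency are not mutually exclusive.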
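To give a sense of why incremental updates matter for samplers such as Gibbs sampling on mixture models, the sketch below contrasts recomputing per-cluster statistics from scratch with updating them in place when a single point changes cluster. It is an illustration under assumptions, not Shuffle's generated code: the function names and the data are hypothetical, and it makes no attempt to reproduce the reported speedup.

```python
def cluster_stats_from_scratch(points, assignments, num_clusters):
    """Recompute per-cluster counts and sums by scanning all points: O(N)."""
    counts = [0] * num_clusters
    sums = [0.0] * num_clusters
    for x, k in zip(points, assignments):
        counts[k] += 1
        sums[k] += x
    return counts, sums


def reassign_incremental(points, assignments, counts, sums, i, new_k):
    """Move point i to cluster new_k, touching only two clusters: O(1)."""
    old_k = assignments[i]
    counts[old_k] -= 1
    sums[old_k] -= points[i]
    counts[new_k] += 1
    sums[new_k] += points[i]
    assignments[i] = new_k


# Hypothetical data: five points in two clusters.
points = [1.0, 2.0, 3.0, 10.0, 11.0]
assignments = [0, 0, 0, 1, 1]
counts, sums = cluster_stats_from_scratch(points, assignments, 2)

# A sampler step that moves point 2 from cluster 0 to cluster 1.
reassign_incremental(points, assignments, counts, sums, i=2, new_k=1)

# The incremental result matches a full recomputation.
check_counts, check_sums = cluster_stats_from_scratch(points, assignments, 2)
print(counts == check_counts, sums == check_sums)
```

A sampler that reassigns every point on each sweep performs N such updates, so replacing the O(N) rescan with an O(1) update removes a factor of N from the cost of maintaining the statistics.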

This review was generated by AI and reviewed by a human editor.