QUICK REVIEW

[Paper Review] Vehicle: Bridging the Embedding Gap in the Verification of Neuro-Symbolic Programs

Matthew L. Daggitt, Wen Kokke | arXiv (Cornell University) | Jan 12, 2024

3 citations

TL;DR

This paper introduces Vehicle, a dependently-typed intermediate language that bridges the 'embedding gap' in neuro-symbolic program verification: properties of the neural components are specified once and then compiled to machine learning frameworks, automated theorem provers (ATPs), and interactive theorem provers (ITPs). The authors demonstrate the approach by formally verifying the safety of a neural-network-controlled autonomous car in a stochastic, partially observable environment, with a modular proof split between an ATP and an ITP.

ABSTRACT

Neuro-symbolic programs, i.e. programs containing both machine learning components and traditional symbolic code, are becoming increasingly widespread. Finding a general methodology for verifying such programs is challenging due to both the number of different tools involved and the intricate interface between the "neural" and "symbolic" program components. In this paper we present a general decomposition of the neuro-symbolic verification problem into parts, and examine the problem of the embedding gap that occurs when one tries to combine proofs about the neural and symbolic components. To address this problem we then introduce Vehicle - standing as an abbreviation for a "verification condition language" - an intermediate programming language interface between machine learning frameworks, automated theorem provers, and dependently-typed formalisations of neuro-symbolic programs. Vehicle allows users to specify the properties of the neural components of neuro-symbolic programs once, and then safely compile the specification to each interface using a tailored typing and compilation procedure. We give a high-level overview of Vehicle’s overall design, its interfaces and compilation & type-checking procedures, and then demonstrate its utility by formally verifying the safety of a simple autonomous car controlled by a neural network, operating in a stochastic environment with imperfect information.

Motivation & Objective

To identify and formalize the 'embedding gap'—the challenge of integrating proofs from neural and symbolic components in neuro-symbolic systems.
To design a general decomposition of neuro-symbolic verification into distinct, composable stages: specification, training, neural verification, and symbolic integration.
To develop Vehicle as a principled, type-safe intermediate language that enables consistent, reusable specifications across diverse verification backends.
To demonstrate end-to-end verification of a neuro-symbolic system using both ATPs (for neural components) and ITPs (for symbolic components), achieving a modular proof of correctness.

Proposed method

Designing Vehicle as a high-level, dependently-typed language with native support for tensors, neural networks, first-class quantifiers, and higher-order functions.
Implementing a compilation pipeline that translates Vehicle specifications to three kinds of backend: machine learning frameworks (e.g. PyTorch) for training, ATPs for neural network verification, and ITPs for symbolic reasoning about the surrounding program.
Using Vehicle’s dependent type system internally to ensure type safety and generate precise error messages during compilation to each backend.
Leveraging the type system to enforce semantic consistency across different verification layers, minimizing the risk of misinterpretation between neural and symbolic components.
Integrating Vehicle with existing verification tools via tailored type-checking and compilation procedures, enabling interoperability without loss of expressivity.
Demonstrating the framework on a formal verification case study: a neural-network-controlled autonomous car operating in a stochastic environment with imperfect information, verified for safety.
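
The "specify once, compile to each backend" idea in the list above can be sketched in a few lines of Python. This is a toy illustration only: it is not Vehicle syntax and not Vehicle's API, and every name in it (BoundedOutput, to_loss, to_verifier_query, to_itp_lemma, the "controller" network) is hypothetical. It shows one property description being turned into a training loss, a verifier query, and a lemma statement for an interactive prover.

```python
from dataclasses import dataclass


# Hypothetical mini-specification: "for every x in [lo, hi], network(x) <= bound".
# Invented for illustration; this is not how Vehicle represents specifications.
@dataclass(frozen=True)
class BoundedOutput:
    network: str   # name of the neural network the property talks about
    lo: float      # lower end of the input interval
    hi: float      # upper end of the input interval
    bound: float   # upper bound the output must respect


def to_loss(spec, f, samples):
    """Training backend: mean violation of the property over sampled inputs."""
    inside = [x for x in samples if spec.lo <= x <= spec.hi]
    if not inside:
        return 0.0
    return sum(max(0.0, f(x) - spec.bound) for x in inside) / len(inside)


def to_verifier_query(spec):
    """Verifier backend: search for a counterexample to the property."""
    return (f"find x such that {spec.lo} <= x <= {spec.hi} "
            f"and {spec.network}(x) > {spec.bound}")


def to_itp_lemma(spec):
    """ITP backend: the same property as a lemma the symbolic proof may use."""
    return (f"lemma {spec.network}_bounded : forall x, "
            f"{spec.lo} <= x <= {spec.hi} -> {spec.network}(x) <= {spec.bound}")


spec = BoundedOutput(network="controller", lo=-1.0, hi=1.0, bound=2.0)
loss = to_loss(spec, lambda x: 3.0 * x, [-1.0, 0.0, 1.0])  # stand-in network
query = to_verifier_query(spec)
lemma = to_itp_lemma(spec)
```

Because all three outputs are derived from the same spec value, they cannot drift apart the way three hand-written copies of the property could. That is the design point the review attributes to Vehicle; the real tool achieves it with a typed compilation pipeline rather than string formatting.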

Experimental results

Research questions

RQ1: How can the verification of neuro-symbolic programs be decomposed into modular, composable stages to isolate the neural and symbolic reasoning components?
RQ2: What causes the 'embedding gap' in neuro-symbolic verification, and why do existing tools fail to bridge it?
RQ3: Can a single specification language serve as a trusted interface between neural network training, automated verification, and interactive proof systems?
RQ4: How can formal proofs about neural components be safely and soundly integrated with proofs about symbolic components in a cyber-physical system?
RQ5: Is it feasible to achieve a modular, end-to-end verification of a neuro-symbolic system using both ATPs and ITPs in a single, coherent proof process?
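
One way to picture the embedding gap raised in RQ2 is sketched below. It rests on an assumption that this review does not spell out: that the symbolic code reasons about physical quantities while the network operates on normalised inputs. The speed range, the embed function, and the linear stand-in network are all hypothetical. The checks sample a handful of points, which is testing rather than verification, and are there only to make the two property statements concrete.

```python
# Hypothetical setup, invented for illustration: the symbolic model talks about
# speed in metres per second, while the network expects inputs scaled to [0, 1].
MAX_SPEED = 40.0  # assumed physical range in m/s


def embed(speed_mps):
    """Problem space -> embedding space (the normalisation assumed at training)."""
    return speed_mps / MAX_SPEED


def network(z):
    """Stand-in for a trained network acting on a normalised input."""
    return 0.5 * z + 0.25


def controller(speed_mps):
    """What the symbolic program calls: the network composed with the embedding."""
    return network(embed(speed_mps))


def in_bounds(y):
    return 0.25 <= y <= 0.75


speeds = range(0, 41)  # sampled physical speeds, 0..40 m/s

# Statement a network verifier works with (embedding space): z in [0, 1].
embedding_space_ok = all(in_bounds(network(s / MAX_SPEED)) for s in speeds)

# Statement the symbolic proof needs (problem space): speed in [0, 40].
problem_space_ok = all(in_bounds(controller(s)) for s in speeds)

# If the two sides disagree about the embedding (here: it is forgotten),
# the verified fact about `network` no longer says anything useful.
unnormalised_ok = all(in_bounds(network(float(s))) for s in speeds)
```

The first two checks agree only because both go through the same embed function. Keeping that link explicit and machine-checked, instead of restating it by hand in each tool, is the kind of bridging the research questions are asking about.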

Key findings

Vehicle successfully bridges the embedding gap by enabling a single, consistent specification of neural components to be compiled to multiple backends, including ATPs and ITPs.
The authors present what is, to their knowledge, the first modular verification of a neuro-symbolic program that combines automated theorem proving for the neural component with interactive theorem proving for the symbolic component.
The case study demonstrates formal verification of a safety-critical autonomous car controller in a stochastic, partially observable environment, ensuring road-keeping and collision avoidance.
Vehicle’s type system enables precise diagnostics during compilation, improving usability and reducing errors when translating specifications to different verification targets.
The design is extensible to further ITPs such as Imandra, Rocq, and KeYmaera X, which would open the way to more complex cyber-physical system (CPS) verification workloads.
Possible future enhancements include proof certificate generation and numeric quantization support, the latter to match the semantics of networks as they are actually deployed.
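
The quantization point in the last finding can be illustrated with a small, generic example. It assumes the concern is the mismatch between idealised arithmetic used in proofs and the finite-precision arithmetic used at run time; the weights, inputs, and bound are made up, and exact rationals stand in for the real numbers.

```python
from fractions import Fraction


def weighted_sum(weights, inputs):
    """A one-neuron 'network' without activation: the dot product w . x."""
    return sum(w * x for w, x in zip(weights, inputs))


# Property: weighted_sum(w, x) <= 3/10 for w = (1, 1) and x = (1/10, 2/10).

# Exact rational arithmetic (stand-in for the reals used in a proof).
exact = weighted_sum([Fraction(1), Fraction(1)],
                     [Fraction(1, 10), Fraction(2, 10)])
holds_over_rationals = exact <= Fraction(3, 10)

# IEEE 754 double precision (what a deployed network would compute with).
approx = weighted_sum([1.0, 1.0], [0.1, 0.2])
holds_over_floats = approx <= 0.3
```

Here the property holds exactly but fails in floating point, because 0.1 + 0.2 evaluates to a double slightly above 0.3. A proof carried out over idealised numbers therefore does not automatically transfer to the deployed implementation.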


This review was created by AI and reviewed by human editors.