QUICK REVIEW

[Paper Review] GenCAD: Image-Conditioned Computer-Aided Design Generation with Transformer-Based Contrastive Representation and Diffusion Priors

Md Ferdous Alam, Faez Ahmed | arXiv (Cornell University) | 2024-09-08

Manufacturing Process and Optimization | Citations: 5

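One-Line Summary

GenCAD combines an autoregressive transformer, contrastive multimodal learning, and a diffusion prior to generate editable CAD command sequences conditioned on an input image, enabling image-based CAD generation and retrieval.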

ABSTRACT

The creation of manufacturable and editable 3D shapes through Computer-Aided Design (CAD) remains a highly manual and time-consuming task, hampered by the complex topology of boundary representations of 3D solids and unintuitive design tools. While most work in the 3D shape generation literature focuses on representations like meshes, voxels, or point clouds, practical engineering applications demand the modifiability and manufacturability of CAD models and the ability for multi-modal conditional CAD model generation. This paper introduces GenCAD, a generative model that employs autoregressive transformers with a contrastive learning framework and latent diffusion models to transform image inputs into parametric CAD command sequences, resulting in editable 3D shape representations. Extensive evaluations demonstrate that GenCAD significantly outperforms existing state-of-the-art methods in terms of the unconditional and conditional generations of CAD models. Additionally, the contrastive learning framework of GenCAD facilitates the retrieval of CAD models using image queries from large CAD databases, which is a critical challenge within the CAD community. Our results provide a significant step forward in highlighting the potential of generative models to expedite the entire design-to-production pipeline and seamlessly integrate different design modalities.

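Research Motivation and Goals

Advance the automation of CAD modeling to speed up the design-to-production pipeline.
Propose a scalable image-conditioned generative model that outputs CAD command sequences rather than a final B-rep.
Use multimodal representation learning to align CAD programs with images and to enable retrieval.
Show improved accuracy and modifiability over prior unconditional CAD generation methods.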

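Proposed Method

Develop an autoregressive transformer encoder-decoder (CSR) to learn latent representations of CAD command sequences.
Train a Contrastive CAD-Image Pre-training (CCIP) model, with a ResNet-based image encoder, to learn a joint latent space for CAD commands and input images.
Introduce a CAD Diffusion Prior (CDP) that generates CAD latents conditioned on image latent vectors, with a deterministic prior offered as an alternative.
Use the pretrained CAD decoder from CSR to generate CAD command sequences from the latents produced by CDP.
Represent each CAD command as a fixed-dimension vector (t_i, p_i), with command type t_i and parameters p_i, and quantize the parameters to 8 bits to form a language-like CAD program.
After sampling from the diffusion prior, decode the CAD latent into a CAD command sequence with the frozen CSR decoder.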
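
The (t_i, p_i) command representation with 8-bit parameter quantization can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's actual layout: the normalized parameter range [-1, 1], the padding value, the fixed parameter count of 4, and the command names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

NUM_LEVELS = 256   # 8-bit quantization gives 2**8 discrete levels
NUM_PARAMS = 4     # hypothetical fixed parameter count per command
PAD = -1           # hypothetical marker for unused parameter slots


def quantize(value: float, lo: float = -1.0, hi: float = 1.0) -> int:
    """Map a continuous parameter in [lo, hi] to an integer level in [0, 255]."""
    clipped = min(max(value, lo), hi)
    scaled = (clipped - lo) / (hi - lo)            # now in [0, 1]
    return min(int(scaled * NUM_LEVELS), NUM_LEVELS - 1)


def dequantize(level: int, lo: float = -1.0, hi: float = 1.0) -> float:
    """Return the center of the bin that a quantized level represents."""
    return lo + (level + 0.5) / NUM_LEVELS * (hi - lo)


@dataclass
class Command:
    t: str          # command type t_i
    p: List[int]    # quantized parameters p_i, padded to NUM_PARAMS entries


def make_command(t: str, params: List[float]) -> Command:
    """Quantize the parameters and pad them to a fixed length."""
    q = [quantize(v) for v in params]
    return Command(t, q + [PAD] * (NUM_PARAMS - len(q)))


# A toy "CAD program": a sequence of fixed-size command tokens.
program = [
    make_command("Line", [0.5, 0.0]),
    make_command("Circle", [0.0, 0.0, 0.25]),
    make_command("Extrude", [1.0]),
]
```

Because every command has the same shape and every parameter is one of 256 discrete values, the program reads like a sentence over a finite vocabulary, which is what makes a transformer decoder a natural fit. Under these assumptions the round-trip error of a parameter is at most half a bin width, here 1/256.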

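Experimental Results

Research Questions

RQ1: Can an autoregressive transformer effectively reconstruct CAD command sequences from learned latent representations?
RQ2: Does contrastive learning improve the alignment between CAD command latents and input CAD images?
RQ3: Does a diffusion prior conditioned on image latents generate high-quality CAD command sequences that lead to valid 3D solids?
RQ4: Does the image-conditioned GenCAD framework enable reliable retrieval of CAD programs using image queries?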
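
The alignment that RQ2 asks about is the kind of objective a contrastive model such as CCIP optimizes. The sketch below assumes a CLIP-style symmetric cross-entropy loss over cosine similarities with a temperature of 0.07; the review does not state the exact loss or temperature, so both are assumptions, and the function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cross_entropy_row(logits: List[float], target: int) -> float:
    """Numerically stable -log softmax(logits)[target]."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[target]


def contrastive_loss(cad_latents: List[Vector],
                     image_latents: List[Vector],
                     temperature: float = 0.07) -> float:
    """Symmetric contrastive loss; pair i of CAD and image latents is the positive."""
    cad = [normalize(v) for v in cad_latents]
    img = [normalize(v) for v in image_latents]
    n = len(cad)
    logits = [[dot(cad[i], img[j]) / temperature for j in range(n)]
              for i in range(n)]
    # CAD -> image direction: each row should pick out its own column.
    cad_to_img = sum(cross_entropy_row(logits[i], i) for i in range(n)) / n
    # Image -> CAD direction: each column should pick out its own row.
    img_to_cad = sum(
        cross_entropy_row([logits[i][j] for i in range(n)], j)
        for j in range(n)
    ) / n
    return 0.5 * (cad_to_img + img_to_cad)


# Matching pairs give a small loss; swapped pairs give a large one.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
misaligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is near zero when each CAD latent points in the same direction as its paired image latent and grows sharply when the pairs are swapped, which is the behavior an alignment objective of this type rewards.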

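Key Results

GenCAD substantially outperforms state-of-the-art unconditional CAD generation methods in the accuracy and modifiability of the generated CAD programs.
The framework achieves higher accuracy on longer CAD command sequences, which supports complex design tasks.
The CCIP component enables image-based CAD model retrieval and delivers a substantial gain (more than 15x in accuracy) over an image-to-image retrieval baseline.
The approach effectively generates CAD programs from images, and these programs can be converted to B-rep or other representations through a standard geometry kernel.
Using frozen, pretrained CAD encoders and decoders improves training scalability and efficiency on large datasets.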
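
Image-based retrieval in a shared latent space can be sketched as a nearest-neighbor search. The example assumes cosine similarity as the ranking score and uses tiny hand-written 3-dimensional latents; the model ids, vectors, and the `retrieve` helper are hypothetical stand-ins for CCIP encoder outputs over a real CAD database.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_latent: Vector, database: Dict[str, Vector], k: int = 1) -> List[str]:
    """Return the ids of the k CAD latents most similar to the image query latent."""
    ranked = sorted(
        database.items(),
        key=lambda item: cosine_similarity(query_latent, item[1]),
        reverse=True,
    )
    return [model_id for model_id, _ in ranked[:k]]


# Hypothetical CAD latents, as if produced by a CAD encoder.
database = {
    "bracket": [0.9, 0.1, 0.0],
    "gear": [0.0, 1.0, 0.2],
    "shaft": [0.1, 0.0, 1.0],
}

# Hypothetical latent of a query image, as if produced by an image encoder.
query = [0.8, 0.2, 0.1]
top = retrieve(query, database, k=2)
```

The query image is never compared against other images here, only against CAD latents, which is the difference from the image-to-image baseline mentioned in the results. For a large database the exhaustive sort would typically be replaced by an approximate nearest-neighbor index.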

This review was generated by AI and reviewed by a human editor.