[Paper Review] OD-GCN: Object Detection by Knowledge Graph with GCN
This paper proposes OD-GCN, a knowledge graph-enhanced object detection framework that uses graph convolutional networks (GCN) to improve detection accuracy by modeling relationships between object categories. The authors build a knowledge graph of co-occurring objects and apply a GCN as a post-processing module on top of pre-trained detectors. On the COCO dataset, OD-GCN raises mAP by 1–5 percentage points across several base models, and visualized analysis suggests the gains look reasonable to human observers.
Classical object detection methods extract only the image features of objects via a CNN and do not exploit the relationships among objects in the same image. In this article, we introduce graph convolutional networks (GCN) into the object detection field and propose a new framework called OD-GCN (object detection with graph convolutional network). It uses category relationships to improve detection precision. We build a knowledge graph that reflects the co-occurrence relationships among objects. The GCN acts as a post-processing step that adjusts the output of the base object detection model, so the framework is flexible: any pre-trained object detection model can serve as the base model. In experiments, we try several popular base detection models, and OD-GCN improves mAP by 1-5 pp on the COCO dataset in every case. In addition, visualized analysis shows that the benchmark improvement appears reasonable to human observers.
Motivation & Objective
- To address the limitation of classical object detectors that ignore inter-object relationships in images.
- To improve detection accuracy by incorporating semantic and co-occurrence relationships among objects.
- To design a flexible framework compatible with any pre-trained object detection model.
- To validate that graph-based reasoning enhances detection performance in a human-interpretable way.
Proposed method
- Construct a knowledge graph encoding co-occurrence and category relationships among object classes using prior knowledge.
- Use graph convolutional networks (GCN) to refine object detection scores by propagating relational information across nodes in the knowledge graph.
- Integrate GCN as a post-processing module applied after the base detector's output, preserving model-agnostic flexibility.
- Train the GCN component end-to-end or in a fine-tuned manner to adjust detection confidence scores based on contextual relationships.
- Utilize object detection features from pre-trained models (e.g., Faster R-CNN, RetinaNet) as input to the GCN-based refinement stage.
- Apply visual attention and feature propagation to enhance context-aware predictions without modifying the backbone network.
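The pipeline described in the list above (co-occurrence graph, then GCN propagation over category nodes, then adjusted scores) can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the function names, the two-layer design, the use of per-class confidence scores as node features, the symmetric normalization (borrowed from the standard GCN formulation), and the random untrained weights are all assumptions made for illustration. The toy label lists are made up and are not COCO data.

```python
import numpy as np


def cooccurrence_adjacency(image_labels, num_classes):
    """Count how often each pair of distinct classes appears in the same image."""
    counts = np.zeros((num_classes, num_classes))
    for labels in image_labels:
        present = sorted(set(labels))
        for i in present:
            for j in present:
                if i != j:
                    counts[i, j] += 1
    return counts


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 (standard GCN choice, assumed here)."""
    a_hat = adj + np.eye(adj.shape[0])  # self-loops keep every degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def refine_scores(scores, a_norm, w1, w2):
    """Hypothetical two-layer GCN over category nodes.

    scores: (num_classes,) confidences from a base detector for one image.
    Returns adjusted confidences of the same shape, squashed into (0, 1).
    """
    h = scores[:, None]                      # (C, 1) one feature per category node
    h = np.maximum(a_norm @ h @ w1, 0.0)     # layer 1: propagate, project, ReLU
    logits = (a_norm @ h @ w2)[:, 0]         # layer 2: propagate, project to a scalar
    return 1.0 / (1.0 + np.exp(-logits))     # sigmoid


# Toy example: 4 hypothetical categories, labels per training image.
image_labels = [[0, 1], [0, 1, 2], [2, 3], [0, 1]]
adj = cooccurrence_adjacency(image_labels, num_classes=4)
a_norm = normalize_adjacency(adj)

# Untrained random weights; in practice these would be learned.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(1, 8))
w2 = rng.normal(size=(8, 1))

base_scores = np.array([0.9, 0.4, 0.1, 0.05])  # made-up detector output
refined = refine_scores(base_scores, a_norm, w1, w2)
print(refined)
```

Because the refinement step only consumes the base detector's scores, the detector itself is never modified, which is the property the review describes as model-agnostic. With trained weights, a weakly scored category that frequently co-occurs with a confidently detected one could be pushed upward; with the random weights used here the output is not meaningful.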
Experimental results
Research questions
- RQ1: Can modeling inter-object relationships through a knowledge graph improve object detection performance?
- RQ2: How effective is GCN-based post-processing in refining detection scores across diverse base detectors?
- RQ3: Does the improvement from OD-GCN align with human perception of detection quality?
- RQ4: To what extent does the framework generalize across different object detection architectures?
Key findings
- OD-GCN improves mean average precision (mAP) by 1–5 percentage points across multiple base object detection models on the COCO dataset.
- The performance gain holds across the different base detection models tested, which supports the framework's flexibility.
- Visualized results show that the model corrects detection errors in a way that aligns with human intuition about plausible object co-occurrences.
- The knowledge graph effectively encodes semantic and contextual relationships, enabling GCN to refine predictions using relational context.
- The post-processing nature of GCN allows integration with any pre-trained detector without retraining the entire model.
This review was created by AI and reviewed by human editors.