Abstract

For detecting deformable linear objects (DLOs) such as cables, CNN-based methods are not accurate enough for industrial use, for example in bin-picking tasks. In this paper, we test several state-of-the-art (s.o.t.a.) baseline models, identify the reasons they fail when detecting cables, and, based on those findings, propose a two-stage method that combines a vision Transformer with Hybrid Task Cascade (HTC) (Fig. 1) as an example that achieves the best cable-detection performance among the s.o.t.a. methods compared.

Without bells and whistles, the proposed method obtains gains of 21.5% AP and 4.7% segmentation AP compared with s.o.t.a. methods.

Fig. 1 The structure of our proposed method

What, How, Results

What

  • Investigates why s.o.t.a. methods fail when detecting cables.
  • Gives an example architecture based on the reasons found.
  • The example method for cable detection achieves 79.9% box AP and 33.4% segmentation AP.
  • Points to a research direction for cable detection.

How

  • Observed local spikes on the feature maps in relation to the scales of the target classes.
  • Because the classes differ in the number of instances, used a loss function designed for imbalanced datasets to test the assumption that class imbalance affects accuracy.
  • Compared the Swin Transformer with ResNet to demonstrate the negative effect of objects that cross or lie side by side.
  • Implemented anchor-based and anchor-free Region Proposal Networks (RPNs) and observed whether the regression distance affects accuracy.
  • Implemented an RPN based on oriented proposals (i.e., rotatable boxes) and observed whether the one-to-one relation between box and segmentation affects accuracy.
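
The motivation for the oriented-proposal experiment in the last item can be illustrated with a little geometry. The sketch below is not the paper's implementation. It assumes a straight cable segment modelled as a thin rectangle, and the function name `axis_aligned_fill_ratio` is hypothetical. It computes the fraction of an axis-aligned bounding box that the cable actually covers. An oriented box aligned with the segment would cover it exactly (ratio 1.0) at any angle.

```python
import math


def axis_aligned_fill_ratio(length, width, angle_deg):
    """Fraction of the axis-aligned bounding box covered by a thin
    rectangle (an idealized straight cable segment) of size
    length x width, rotated by angle_deg.

    Illustrative assumption only: real cables bend, so this is an
    optimistic model of a single straight segment.
    """
    theta = math.radians(angle_deg)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    # Extents of the rotated rectangle along the image axes.
    box_w = length * c + width * s
    box_h = length * s + width * c
    return (length * width) / (box_w * box_h)


if __name__ == "__main__":
    # A 100 x 4 pixel cable segment at several orientations.
    for angle in (0, 15, 30, 45):
        ratio = axis_aligned_fill_ratio(100, 4, angle)
        print(f"angle={angle:>2} deg  fill ratio={ratio:.3f}")
```

Under this model, a 100 x 4 segment fills its axis-aligned box completely at 0 degrees but only about 7% of it at 45 degrees, so most of the box is background or neighbouring cables. This is one plausible reason a rotatable box could tie a proposal more tightly to a single cable's mask.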

Results

Fig. 2 Demonstration of the performance gains according to each improvement
Fig. 3 Comparisons between our method and the s.o.t.a.

Things I’ve achieved

  1. Developed anchor-related modules for a 4-dimensional RoI Transformer from scratch
  2. Built hypothesis models from the failure cases and developed improvements tailored to the scenario
  3. Designed and managed the dataset; rapidly implemented and analyzed failures with multiple s.o.t.a. methods as baselines
  4. Used the PyTorch-based Detectron2 and MMDetection frameworks