[CV] Parts-based Models & Deformable Part Model (DPM)

2023. 7. 29. 18:08

Parts-based Models

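Parts-based models are a family of object detection models that define an object as a collection of parts.
In parts-based models, each part is modeled on the basis of the following two factors.

1. Appearance : the visual properties of a part, such as shape, color, and texture. The appearance of a part is used to distinguish it from other parts and from the background.
2. Spatial configuration : parts are related to one another in space. The spatial arrangement of the parts is used to determine the overall shape and structure of the object.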

Deformable Part Model (DPM)

Reference : https://cs.brown.edu/people/pfelzens/papers/lsvm-pami.pdf

Discriminatively Trained Deformable Part Models (Release 5) (rossgirshick.info)

DPM is one of the most successful object detection models, and it achieved strong results on the PASCAL and INRIA person datasets.

Object Detection

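Object detection basically proceeds through the following stages: Specify Object Model, Generate Hypothesis, Score Hypothesis, and Resolve Score.

In the Specify Object Model stage, DPM uses a hybrid template & body model, which mixes the statistical template approach with a part-based model.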

Specify Object Model Stage & Generate Hypothesis

Model

DPM uses a root filter (similar to the Dalal-Triggs filter) together with filters for the parts of the object.

(a) : root filter

(b) : higher resolution part filter

The root filter models the coarse appearance of the whole object.

The part filters model smaller portions of the object in finer detail.

The spatial model describes the relative position of each part of the object: it computes a cost for where the center of each part lies relative to the root. (This approach is related to the sliding-window detector.)
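The spatial cost can be sketched as a quadratic penalty on how far a part sits from its ideal offset (anchor) relative to the root. This is a minimal sketch, not the reference implementation: the function name `deformation_cost`, the default coefficients `d`, and the assumption that the root and the part live on the same grid are all illustrative (in the paper the part filters are placed at twice the resolution of the root).

```python
import numpy as np

def deformation_cost(part_pos, root_pos, anchor, d=(0.0, 0.0, 1.0, 1.0)):
    """Quadratic cost of placing a part away from its anchor position.

    part_pos, root_pos: (x, y) positions on the same grid (an assumption).
    anchor: ideal (x, y) offset of the part relative to the root.
    d: four deformation coefficients for (dx, dy, dx^2, dy^2).
    """
    # Displacement of the part from where the model expects it to be.
    dx = part_pos[0] - (root_pos[0] + anchor[0])
    dy = part_pos[1] - (root_pos[1] + anchor[1])
    features = np.array([dx, dy, dx * dx, dy * dy], dtype=float)
    return float(np.dot(d, features))
```

With the default coefficients the cost is just the squared distance from the anchor, so a part exactly at its anchor costs nothing and the penalty grows quadratically as it moves away.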

Score Hypothesis

Object Score

For object detection, a feature pyramid spanning multiple scales is used.

This pyramid contains the image at a range of resolutions.

At each scale, the responses of the root filter and of every part filter are computed. (A response can be thought of as a score for how well the filter matches at that resolution.)
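To make the idea of "a response at every scale" concrete, here is a rough sketch that builds a pyramid and cross-correlates one filter with every level. It rests on simplifying assumptions: a single-channel 2D feature map instead of multi-channel HOG features, and a pyramid built by 2x2 averaging of the feature map rather than by resizing the image and recomputing features. The helper names are hypothetical.

```python
import numpy as np

def downsample(feat):
    """Halve the resolution by averaging 2x2 blocks (odd edges are cropped)."""
    h, w = feat.shape[0] // 2 * 2, feat.shape[1] // 2 * 2
    f = feat[:h, :w]
    return (f[0::2, 0::2] + f[1::2, 0::2] + f[0::2, 1::2] + f[1::2, 1::2]) / 4.0

def build_pyramid(feat, levels):
    """Return a list of feature maps, finest level first."""
    pyramid = [feat]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def filter_response(feat, filt):
    """Cross-correlate filt with feat at every valid (fully inside) position."""
    fh, fw = filt.shape
    out_h = feat.shape[0] - fh + 1
    out_w = feat.shape[1] - fw + 1
    resp = np.empty((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            # Dot product between the filter and the window it covers.
            resp[y, x] = np.sum(feat[y:y + fh, x:x + fw] * filt)
    return resp

# Usage: one response map per pyramid level.
features = np.arange(64, dtype=float).reshape(8, 8)
pyramid = build_pyramid(features, levels=3)
responses = [filter_response(level, np.ones((2, 2))) for level in pyramid]
```

Each entry of `responses` is a score map for one scale, so a high value at a coarse level corresponds to a larger object in the original image.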

For each root location, find the part placements that maximize the appearance score while minimizing the spatial cost.

(The appearance score measures how well a part matches the image; the spatial cost measures how far the part lies from its expected position.)
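The trade-off between appearance score and spatial cost can be sketched as a brute-force search over all positions of one part. This is only an illustration under stated assumptions: `part_response` is a precomputed 2D response map, `anchor` is the expected (x, y) position of the part in that map, and the default coefficients `d` are made up. The paper computes the same maximization efficiently with generalized distance transforms instead of an exhaustive search.

```python
import numpy as np

def best_part_placement(part_response, anchor, d=(0.0, 0.0, 1.0, 1.0)):
    """Return (best_score, (x, y)) maximizing response minus deformation cost."""
    h, w = part_response.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Displacement of every candidate position from the expected position.
    dx = xs - anchor[0]
    dy = ys - anchor[1]
    cost = d[0] * dx + d[1] * dy + d[2] * dx ** 2 + d[3] * dy ** 2
    total = part_response - cost
    y, x = np.unravel_index(np.argmax(total), total.shape)
    return float(total[y, x]), (int(x), int(y))
```

A strong response far from the anchor only wins if it outweighs the quadratic penalty for moving there, which is what lets the parts deform without drifting arbitrarily.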

The total score is the sum of the filter scores minus the spatial costs, and it is used to decide whether an object of the class is present and where it is located.
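Written out, the score of a hypothesis in the referenced paper takes the following form, where p_0 is the root placement, p_1, ..., p_n are the part placements, F_i are the filters, H is the feature pyramid, d_i are the deformation coefficients of part i, (dx_i, dy_i) is the displacement of part i from its anchor, and b is a bias term:

```latex
\mathrm{score}(p_0, \dots, p_n)
  = \sum_{i=0}^{n} F_i \cdot \phi(H, p_i)
  - \sum_{i=1}^{n} d_i \cdot \phi_d(dx_i, dy_i)
  + b,
\qquad
\phi_d(dx, dy) = (dx,\; dy,\; dx^2,\; dy^2)
```

The first sum is the appearance term (root plus parts), and the second sum is the spatial cost that is subtracted from it.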

Resolve Score

Non-max suppression is applied to all detections. Non-max suppression is a technique for removing duplicate detections when several overlapping detections fire on the same object, and it can improve the accuracy of the detection results.
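As a rough illustration, greedy non-max suppression can be sketched as follows. The box format (x1, y1, x2, y2), the use of intersection-over-union as the overlap measure, and the 0.5 threshold are assumptions made for this sketch, not details taken from the DPM release.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Usage: the second box overlaps the first heavily and is suppressed.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```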

* non-max suppression : https://iso34.tistory.com/107 (1.4 Resolve Score)

Training

DPM learns to detect objects using a bounding box (a statistical template inside the bounding box) together with a part model (articulated parts).
The parts are learned at multiple scales.

To learn the parameters of the bounding box and part models, a discriminative procedure is used that combines a margin-sensitive approach to data-mining hard negative examples with a latent SVM.
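The margin-sensitive selection of hard negatives can be sketched like this: under a hinge loss, a negative example is "hard" when the current model scores it above -1, that is, it is misclassified or falls inside the margin. The sketch assumes a plain linear scorer `w . x + b` over fixed feature vectors, which ignores the latent part placements; the function name and arguments are hypothetical.

```python
import numpy as np

def mine_hard_negatives(w, b, negatives, margin=1.0):
    """Return indices of negatives that violate the margin: w.x + b > -margin."""
    scores = negatives @ w + b
    return np.flatnonzero(scores > -margin)

# Usage: only the last two negatives are hard for this model.
w = np.array([1.0, 0.0])
negatives = np.array([[-3.0, 0.0], [-0.5, 0.0], [2.0, 0.0]])
hard = mine_hard_negatives(w, 0.0, negatives)
```

Easy negatives (scores below the margin) contribute nothing to the hinge loss, so training can cycle between fitting the model on a small cache and refilling the cache with newly mined hard negatives.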

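A latent SVM is a reformulation of MI-SVM in terms of latent variables (pose, orientation, scale, and so on). A latent SVM is in general only semi-convex, but once the latent information for the positive examples is fixed, the training problem becomes convex.
This leads to an iterative training algorithm that alternates between fixing the latent values for the positive examples and optimizing the latent SVM objective function.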
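For reference, the latent SVM in the referenced paper scores an example x by maximizing over its latent values z, and trains the parameters beta with a hinge-loss objective over labeled examples (x_i, y_i), y_i in {-1, +1}:

```latex
f_\beta(x) = \max_{z \in Z(x)} \beta \cdot \Phi(x, z)

L_D(\beta) = \frac{1}{2} \lVert \beta \rVert^2
  + C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i f_\beta(x_i)\bigr)
```

Because f_beta is a maximum of linear functions, it is convex in beta, so the hinge loss is convex for negative examples (y_i = -1) but not for positive ones. Fixing z for each positive example makes f_beta linear on the positives, and the whole objective becomes convex.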
