Object Detection & Tracking
Datasets
VOC (Visual Object Classes)
VOC2012: 20 classes, 11,530 annotated images containing 27,450 ROI-annotated objects

COCO Dataset
80 classes, 330K images, 1.5M object instances

ImageNet
1000 object classes, 1,281,167 training images, 50,000 validation images and 100,000 test images

Open Images Dataset
Open Images is a dataset of ~9M images annotated with image-level labels, object bounding boxes, object segmentation masks, visual relationships, and localized narratives
Roboflow
Label tools
labelme
pip install labelme
labelme pic123.jpg
Labelme2YOLO
pip install labelme2yolo
- Convert JSON files and split them into training and validation sets with --val_size
python labelme2yolo.py --json_dir /home/username/labelme_json_dir/ --val_size 0.2
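The effect of `--val_size 0.2` can be pictured as a shuffled 80/20 split of the LabelMe JSON files. The following is a minimal sketch of that idea, not Labelme2YOLO's actual implementation; the function name `split_train_val`, the fixed seed, and the file names in the usage are assumptions made for illustration.

```python
import random


def split_train_val(json_files, val_size=0.2, seed=0):
    """Shuffle the files and hold out a `val_size` fraction for validation."""
    files = sorted(json_files)           # sort first so the result is reproducible
    rng = random.Random(seed)            # local RNG, leaves global random state alone
    rng.shuffle(files)
    n_val = int(round(len(files) * val_size))
    return files[n_val:], files[:n_val]  # (train, val)


# Hypothetical file names standing in for a LabelMe JSON directory
names = [f"pic{i}.json" for i in range(10)]
train, val = split_train_val(names, val_size=0.2)
print(len(train), "training files,", len(val), "validation files")
```

A seeded local `random.Random` keeps the split repeatable between runs, which makes later experiments easier to compare.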
LabelImg
pip install labelImg
labelImg
labelImg [IMAGE_PATH] [PRE-DEFINED CLASS FILE]
VOC .xml convert to YOLO .txt
cd ~/tf/raccoon/annotations
python ~/tf/xml2yolo.py
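The `xml2yolo.py` script itself is not shown here. As a rough sketch of what such a conversion involves, the snippet below reads one Pascal VOC annotation and emits YOLO lines. The function names, the `class_names` list, and the sample annotation are assumptions for illustration; only the standard VOC tags (`size`, `object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`) are taken as given.

```python
import xml.etree.ElementTree as ET


def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalized (x_center, y_center, w, h)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    return x_center, y_center, (xmax - xmin) / img_w, (ymax - ymin) / img_h


def voc_xml_to_yolo_lines(xml_text, class_names):
    """Return one YOLO-format line per <object> in a VOC annotation string."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)
    lines = []
    for obj in root.iter("object"):
        # The class index is the position of the label in class_names
        class_id = class_names.index(obj.find("name").text)
        box = obj.find("bndbox")
        corners = [float(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
        yolo = voc_box_to_yolo(*corners, img_w, img_h)
        lines.append(f"{class_id} " + " ".join(f"{v:.6f}" for v in yolo))
    return lines


# Made-up annotation for a 400x200 image with one object
sample = """<annotation><size><width>400</width><height>200</height></size>
<object><name>raccoon</name><bndbox><xmin>100</xmin><ymin>50</ymin>
<xmax>300</xmax><ymax>150</ymax></bndbox></object></annotation>"""
print(voc_xml_to_yolo_lines(sample, ["raccoon"]))
```

In practice each returned list would be written to a `.txt` file with the same base name as the image.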
YOLO Annotation formats (.txt)
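class_id x_center y_center width height (one object per line, values separated by spaces, coordinates normalized to 0-1 by image width and height)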
0 0.5222826086956521 0.5518115942028986 0.025 0.010869565217391304
0 0.5271739130434783 0.5057971014492754 0.013043478260869565 0.004347826086956522
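To make the format concrete, the sketch below parses one annotation line and maps it back to pixel corner coordinates. The helper name `yolo_line_to_pixel_box`, the example line, and the 640x480 image size are assumptions; the sample labels above do not state their image size.

```python
def yolo_line_to_pixel_box(line, img_w, img_h):
    """Parse 'class x_center y_center w h' into (class_id, xmin, ymin, xmax, ymax) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    x_c, y_c, w, h = (float(v) for v in parts[1:5])
    xmin = (x_c - w / 2) * img_w
    ymin = (y_c - h / 2) * img_h
    xmax = (x_c + w / 2) * img_w
    ymax = (y_c + h / 2) * img_h
    return class_id, xmin, ymin, xmax, ymax


# Made-up label on an assumed 640x480 image
print(yolo_line_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))
```

Because the stored values are normalized, the same label file stays valid if the image is resized without cropping.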
Object Detection
R-CNN, Fast R-CNN, Faster R-CNN
R-CNN (2013)

Fast R-CNN (2015)

Faster R-CNN (2015)
[Object Detection S3: Introduction to Faster R-CNN](https://ivan-eng-murmur.medium.com/object-detection-s3-faster-rcnn-%E7%B0%A1%E4%BB%8B-5f37b13ccdd2)

Mask R-CNN (2017)
Code: https://github.com/matterport/Mask_RCNN/
SSD (2015)

RetinaNet

EfficientDet [(2019)](https://arxiv.org/abs/1911.09070)

Kaggle: efficientdet-gwd
YOLO: You Only Look Once
Code: https://github.com/pjreddie/darknet

YOLOv1 (2015)
YOLOv2 (2016)
YOLOv3 (2018) : Darknet-53 + FPN

YOLOv4
Code: AlexeyAB/darknet
YOLOv4 = YOLOv3 + CSPDarknet53 + SPP + PAN + BoF + BoS
YOLOv5

Scaled-YOLOv4

YOLOR

YOLOX (2021)

Code: Megvii-BaseDetection/YOLOX

CSL-YOLO

YOLOv6 (2022)

YOLOv7 (2022)

YOLOv8 (2023)

Kaggle: rkuo2000/YOLOv8, rkuo2000/YOLOv8-Pothole-detection
UAV-YOLOv8
[YOLOv8 Aerial Sheep Detection and Counting](https://github.com/monemati/YOLOv8-Sheep-Detection-Counting)
YOLOv8 Drone Surveillance
YOLOv9 (2024)
YOLOv10 (2024)
YOLOv11 (2024)
YOLOv12 (2025)

Kaggle: yolov12, yolov12-tank, yolov12-face
YOLOv13 (2025)

YOLO26 (2026)

Kaggle: https://www.kaggle.com/code/rkuo2000/yolo26
RF-DETR (2023)
RF-DETR: SOTA Real-Time Detection and Segmentation Model

Blog: How to Deploy RF-DETR to an NVIDIA Jetson
RF-DETRv2 (2024)
Datasets
Deep-sea Debris (2018)

TACO (2020)

UDD (2020)
UDD contains 2,227 images in 3 categories: sea cucumber, sea urchin, and scallop.

DUO (2021)

Applications
Satellite Image Deep Learning
Swimming Pool Detection

Identify Military Vehicles in Satellite Imagery
Dataset: Moving and Stationary Target Acquisition and Recognition (MSTAR) Dataset

Code: Target Recognition in Synthetic Aperture Radar Imagery Using Deep Learning script.ipynb
BCCD Dataset
3 classes: RBC (Red Blood Cell), WBC (White Blood Cell), Platelets

YOLOv5 FaceMask Detection

YOLOv5 Traffic Analysis

EfficientDet - Global Wheat Detection
Mask-RCNN

Mask RCNN transfer learning

Objectron

OpenCV-Python play GTA5
Pothole Detection using YOLOv4
Kaggle: YOLOv7 Pothole Detection

YOLOv7 Braking Detection

Steel Defect Detection
Dataset: Severstal: Steel Defect Detection

Steel Defect Detection using UNet
https://www.kaggle.com/code/jaysmit/u-net (Keras UNet)
https://www.kaggle.com/code/myominhtet/steel-defection (PyTorch UNet)
Steel-Defect Detection Using CNN

MSFT-YOLO (2022)

PCB Datasets
PCB Defect Detection
Paper: PCB Defect Detection Using Denoising Convolutional Autoencoders
PCB Defect Classification
Dataset: HRIPCB dataset (dropbox)
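A printed circuit board (PCB) defect dataset. It is a public synthetic PCB dataset containing 1,386 images with 6 defect types (missing hole, mouse bite, open circuit, short circuit, spur, spurious copper), intended for image detection, classification, and registration tasks.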
Paper: End-to-end deep learning framework for printed circuit board manufacturing defect classification

Multi-Object Tracking
Multiple Object Tracking (MOT)
Underwater Object Tracking (UOT)
boxmot

SiamBAN

FairMOT

3D-ZeF

ByteTrack

OC-SORT

Deep OC-SORT

Track Anything

MeMOTR

MOTIP

This site was last updated June 19, 2026.



