Pose Estimation

Pose Estimation includes Applications, Human Pose Estimation, Head Pose Estimation & VTuber, Hand Pose Estimation, and Object Pose Estimation.


Pose Estimation Applications

Fitness Mirror


Sports Referee


Hippotherapy (equine-assisted therapy)


Fall Detection

Production-line SOP monitoring

  • Distance: face recognition typically works only within about two meters, whereas human-skeleton techniques are far less distance-limited: a person 15 meters away can still be detected accurately, and body shape, limbs, and facial features can then be analyzed as needed.
  • Angle: with a frontal face, face recognition is extremely accurate; currently only a handful of strikingly similar identical twins can fool it. Once the face is off-angle, however, accuracy drops sharply. Many real deployments cannot ask every participant to step up to the camera and hold a frontal pose for several seconds, and in those settings skeleton-based analysis is the practical choice, with accuracy reported to be more than 30% higher than off-angle face recognition. A minimal skeleton-detection sketch follows this list.
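
As a rough illustration of the skeleton-based approach described above, the sketch below runs a single-image body-keypoint detector. MediaPipe Pose is used here purely as a convenient stand-in (it is not a library featured elsewhere on this page), and "person.jpg" is a placeholder path.

    # Minimal sketch: detect body keypoints instead of relying on a frontal face.
    # Assumes `pip install mediapipe opencv-python`; "person.jpg" is a placeholder.
    import cv2
    import mediapipe as mp

    img = cv2.imread("person.jpg")
    with mp.solutions.pose.Pose(static_image_mode=True) as pose:
        results = pose.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))

    if results.pose_landmarks:
        # 33 normalized landmarks (nose, shoulders, hips, ankles, ...)
        for i, lm in enumerate(results.pose_landmarks.landmark):
            print(i, round(lm.x, 3), round(lm.y, 3), round(lm.visibility, 3))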

Pose-controlled Lights


Human Pose Estimation

Benchmark: https://paperswithcode.com/task/pose-estimation

Model Name   Paper                                Code
PCT          https://arxiv.org/abs/2303.11638     https://github.com/gengzigang/pct
ViTPose      https://arxiv.org/abs/2204.12484v3   https://github.com/vitae-transformer/vitpose

PoseNet

Paper: arxiv.org/abs/1505.07427
Code: rwightman/posenet-pytorch
Kaggle: PoseNet Pytorch


OpenPose

Paper: arxiv.org/abs/1812.08008
Code: CMU-Perceptual-Computing-Lab/openpose
Kaggle: OpenPose Pytorch


DensePose

Paper: arxiv.org/abs/1802.00434
Code: facebookresearch/DensePose


Multi-Person Part Segmentation

Paper: arxiv.org/abs/1907.05193
Code: kevinlin311tw/CDCL-human-part-segmentation


YOLO-Pose

Paper: YOLO-Pose: Enhancing YOLO for Multi Person Pose Estimation Using Object Keypoint Similarity Loss


ViTPose

Paper: ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation
Paper: ViTPose+: Vision Transformer Foundation Model for Generic Body Pose Estimation
Paper: ViTPose++: Vision Transformer for Generic Body Pose Estimation
Code: https://github.com/ViTAE-Transformer/ViTPose


ED-Pose

Paper: Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation
Code: https://github.com/IDEA-Research/ED-Pose


OSX-UBody

Paper: One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer
Code: https://github.com/IDEA-Research/OSX


Human Pose as Compositional Tokens

Paper: https://arxiv.org/abs/2303.11638
Code: https://github.com/gengzigang/pct


DWPose

Paper: Effective Whole-body Pose Estimation with Two-stages Distillation
Code: https://github.com/IDEA-Research/DWPose


Group Pose

Paper: Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation
Code: https://github.com/Michel-liu/GroupPose


YOLOv8 Pose

Kaggle: https://www.kaggle.com/rkuo2000/yolov8-pose
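
A minimal inference sketch with the ultralytics package, assuming `pip install ultralytics`; the nano pose weights are downloaded automatically and "people.jpg" is a placeholder path.

    # Minimal YOLOv8-Pose sketch using the ultralytics package.
    from ultralytics import YOLO

    model = YOLO("yolov8n-pose.pt")   # nano pose weights, auto-downloaded
    results = model("people.jpg")     # one Results object per input image

    for r in results:
        # r.keypoints.xy: (num_persons, 17, 2) COCO keypoints in pixel coords
        print(r.keypoints.xy.shape)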


MMPose

Algorithms
Code: https://github.com/open-mmlab/mmpose

  • Supports two new datasets: UBody, 300W-LP
  • Supports four new algorithms: MotionBERT, DWPose, EDPose, Uniformer
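
For a quick start, recent MMPose releases ship a high-level MMPoseInferencer. A minimal sketch, assuming `pip install mmpose` along with its mmengine/mmcv dependencies; "demo.jpg" is a placeholder path.

    # Minimal MMPose sketch using the high-level inferencer (MMPose 1.x).
    from mmpose.apis import MMPoseInferencer

    inferencer = MMPoseInferencer("human")             # alias for a 2D human pose model
    result = next(inferencer("demo.jpg", show=False))  # inference yields one dict per image
    print(result["predictions"][0][0]["keypoints"])    # keypoints of the first person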

3D Human Pose Estimation

Benchmark: https://paperswithcode.com/task/3d-human-pose-estimation

MotionBERT

Paper: MotionBERT: A Unified Perspective on Learning Human Motion Representations
Code: https://github.com/Walter0807/MotionBERT


BCP+VHA R152 384x384

Paper: Representation learning of vertex heatmaps for 3D human mesh reconstruction from multi-view images


Face Datasets

300-W: 300 Faces In-the-Wild


300W-LPA: 300W-LPA Database


LFPW: Labeled Face Parts in the Wild (LFPW) Dataset


HELEN: Helen dataset


AFW: Annotated Faces in the Wild

AFW (Annotated Faces in the Wild) is a face detection dataset that contains 205 images with 468 faces. Each face image is labeled with at most 6 landmarks with visibility labels, as well as a bounding box.


IBUG: https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/


Head Pose Estimation

Code: yinguobing/head-pose-estimation
Kaggle: https://www.kaggle.com/rkuo2000/head-pose-estimation
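
The repo above follows the classic landmark-plus-PnP recipe: pair 2D facial landmarks with a generic 3D face model and solve for the head's rotation. Below is a minimal OpenCV sketch; the 3D model points, the landmarks_2d values, and the camera intrinsics are all illustrative assumptions.

    # Classic PnP head-pose sketch: 2D facial landmarks + a generic 3D face model.
    import numpy as np
    import cv2

    model_3d = np.array([           # rough canonical 3D face points (mm)
        (0.0, 0.0, 0.0),            # nose tip
        (0.0, -330.0, -65.0),       # chin
        (-225.0, 170.0, -135.0),    # left eye outer corner
        (225.0, 170.0, -135.0),     # right eye outer corner
        (-150.0, -150.0, -125.0),   # left mouth corner
        (150.0, -150.0, -125.0),    # right mouth corner
    ], dtype=np.float64)

    landmarks_2d = np.array([       # matching 2D pixels from any landmark detector
        (359, 391), (399, 561), (337, 297),
        (513, 301), (345, 465), (453, 469),
    ], dtype=np.float64)

    h, w = 720, 1280                # placeholder image size
    cam = np.array([[w, 0, w / 2], [0, w, h / 2], [0, 0, 1]], dtype=np.float64)

    ok, rvec, tvec = cv2.solvePnP(model_3d, landmarks_2d, cam, np.zeros(4))
    rot, _ = cv2.Rodrigues(rvec)    # 3x3 rotation; decompose for yaw/pitch/roll
    print(ok, rvec.ravel())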

HopeNet

Paper: Fine-Grained Head Pose Estimation Without Keypoints

Code: https://github.com/natanielruiz/deep-head-pose

Blog: HOPE-Net : A Machine Learning Model for Estimating Face Orientation
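
HopeNet predicts each Euler angle (yaw, pitch, roll) as a classification over 66 bins of 3 degrees and recovers a continuous angle as the softmax expectation over the bins. A sketch of that decoding step; the random logits stand in for one angle head's output.

    # Sketch of HopeNet-style decoding: continuous angle = softmax expectation
    # over 66 bins of 3 degrees spanning roughly -99 to +99 degrees.
    import torch

    def bins_to_degrees(logits: torch.Tensor) -> torch.Tensor:
        """logits: (batch, 66) raw scores for one Euler angle head."""
        idx = torch.arange(66, dtype=torch.float32, device=logits.device)
        probs = torch.softmax(logits, dim=1)
        return torch.sum(probs * idx, dim=1) * 3 - 99

    yaw_logits = torch.randn(1, 66)  # placeholder for the network's yaw output
    print(bins_to_degrees(yaw_logits))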


VTuber

The number of VTubers has passed 16,000, and growth is not slowing, with roughly 3,000 added per year. According to a report by the Japanese data-analytics firm User Local, its latest User Local VTuber ranking now records more than 16,000 VTubers.

No. 1: Gawr Gura (the shark girl), Gawr Gura Ch., hololive-EN

No. 2: Kizuna AI

VTuber-Unity = Head-Pose-Estimation + Face-Alignment + GazeTracking


VRoid Studio


VTuber_Unity


OpenVtuber


Hand Pose Estimation

Hand3D

Paper: arxiv.org/abs/1705.01389
Code: lmb-freiburg/hand3d


InterHand2.6M

Paper: https://arxiv.org/abs/2008.09309
Code: https://github.com/facebookresearch/InterHand2.6M
Dataset: Re:InterHand Dataset


InterWild

Paper: Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild
Code: https://github.com/facebookresearch/InterWild/tree/main#test

RenderIH

Paper: RenderIH: A Large-scale Synthetic Dataset for 3D Interacting Hand Pose Estimation
Code: https://github.com/adwardlee/RenderIH

TransHand: a transformer-based hand pose estimation network introduced in the RenderIH paper


Exercises of Pose Estimation

PoseNet

Kaggle: https://www.kaggle.com/rkuo2000/posenet-pytorch
Kaggle: https://www.kaggle.com/rkuo2000/posenet-human-pose


OpenPose

Code: https://github.com/rkuo2000/openpose-pytorch


MMPose

Kaggle: https://www.kaggle.com/rkuo2000/mmpose

2D Human Pose

2D Human Whole-Body

2D Hand Pose

2D Face Keypoints

3D Human Pose

2D Pose Tracking

2D Animal Pose

3D Hand Pose

WebCam Effect


Basketball Referee

Code: AI Basketball Referee
This AI mainly tracks two things: the ball's trajectory and the player's steps, using YOLOv8 Pose. An illustrative step-counting sketch follows.
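
A hedged sketch of the step-counting half on top of YOLOv8-Pose: read the two ankle keypoints per frame (COCO indices 15 and 16) and count changes of the leading foot. The video path and the simple comparison logic are illustrative assumptions, not the referee project's actual code.

    # Illustrative step counter on top of YOLOv8-Pose (not the referee's exact logic).
    from ultralytics import YOLO

    model = YOLO("yolov8n-pose.pt")
    steps, last_leading = 0, None

    for r in model("basketball.mp4", stream=True):  # placeholder video path
        if r.keypoints is None or len(r.keypoints.xy) == 0:
            continue
        kps = r.keypoints.xy[0]                     # first detected person, (17, 2)
        left_y, right_y = float(kps[15, 1]), float(kps[16, 1])
        leading = "L" if left_y < right_y else "R"  # higher ankle = swinging foot
        if last_leading and leading != last_leading:
            steps += 1                              # leading foot switched -> one step
        last_leading = leading

    print("steps:", steps)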


Head Pose Estimation

Kaggle: https://kaggle.com/rkuo2000/head-pose-estimation


VTuber-Unity

Head-Pose-Estimation + Face-Alignment + GazeTracking

Build-up Steps:

  1. Create a character: VRoid Studio
  2. Synchronize the face: VTuber_Unity
  3. Take video: OBS Studio
  4. Post-processing:
  5. Upload: YouTube
  6. [Optional] Install CUDA & CuDNN to enable GPU acceleration
  7. Run the demo:
    $git clone https://github.com/kwea123/VTuber_Unity
    $python demo.py --debug --cpu


OpenVtuber

Build-up Steps:

  • Clone the GitHub repo
    $git clone https://github.com/1996scarlet/OpenVtuber
    $cd OpenVtuber
    $pip3 install -r requirements.txt
  • Install node.js for Windows
  • Run the Socket-IO server
    $cd NodeServer
    $npm install express socket.io
    $node index.js
  • Open a browser at http://127.0.0.1:6789/kizuna
  • PythonClient with Webcam
    $cd ../PythonClient
    $python3 vtuber_link_start.py


