Agents

Agents

VLM/MLLMs

Vision Language Model / Multimodal Large Language Model

LLM

Introduction to LLM

VLA

Vision-Language-Action model

RL for Robot Dexterity

Reinforcement-Learning for Robot Dexterity

RL Gym for Robot

Reinforcement-Learning Gym for Robot

RL

Reinforcement Learning

Generative Song

Text-to-Music, Text-to-Song

Generative Video

Image-to-Video, Text-to-Video, Audio-to-Video

Generative Image

Text-to-Image, Text-to-3D, Image-to-3D

Generative Speech

Text-to-Speech, Voice Cloning, Speech Separation, ASR

VAE/GAN

Style Transfer, Variational AutoEncoder, Generative Adversarial Network

RNN

Recurrent Neural Networks

Face Recognition

Face Datasets, Face Detection, Face Alignment, Face Landmark, Face Recognition, Face Identification

Pose Estimation

Human-Pose, Head-Pose, Hand-Pose, Object-Pose Estimation

Image Segmentation

Image Matting, Semantic Segmentation, Human Part Segmentation, Instance Segmentation, Video Object Segmentation, Panoptic Segmentation

Object Detection

Object Detection

Image Classification

Image Classification

CNN

Convolutional Neural Networks

OpenCV-Python

Image Processing

PC Software

Colab, Notepad++, Git-for-Windows, Python3-for-Windows, LLM & ComfyUI

AI Hardware

AI-chips, Edge-AI MCUs

AI Brief

Introduction to AI: Evolution, Applications, News, Impact, Future