Image Classification
Introduction to Image Classification, including datasets, models, applications, and transfer learning
Datasets
PASCAL VOC (Visual Object Classes)
VOC2007 train/val/test: 9,963 annotated images containing 24,640 annotated objects
VOC2012 train/val/test: 11,530 annotated images containing 27,450 ROI-annotated objects
20 classes:
- Person: person
- Animal: bird, cat, cow, dog, horse, sheep
- Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
- Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
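The grouping above can be written down as a small lookup table, which is handy when converting class names to integer labels. This is a minimal sketch: the names `VOC_GROUPS`, `VOC_CLASSES`, `CLASS_TO_INDEX` and `CLASS_TO_GROUP` are illustrative, and the alphabetical index order is an assumption, not something the dataset mandates.

```python
# PASCAL VOC's 20 object classes, grouped as in the list above.
VOC_GROUPS = {
    "Person": ["person"],
    "Animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "Vehicle": ["aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train"],
    "Indoor": ["bottle", "chair", "dining table", "potted plant", "sofa", "tv/monitor"],
}

# Flatten into one class list; alphabetical ordering is an assumption.
VOC_CLASSES = sorted(name for names in VOC_GROUPS.values() for name in names)

# Lookup tables: class name -> integer label, class name -> group.
CLASS_TO_INDEX = {name: i for i, name in enumerate(VOC_CLASSES)}
CLASS_TO_GROUP = {
    name: group for group, names in VOC_GROUPS.items() for name in names
}

print(len(VOC_CLASSES))             # 20
print(CLASS_TO_INDEX["aeroplane"])  # 0 (first in alphabetical order)
print(CLASS_TO_GROUP["sofa"])       # Indoor
```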
COCO Dataset
- Object segmentation
- Recognition in context
- Superpixel stuff segmentation
- 330K images (>200K labeled)
- 1.5 million object instances
- 80 object categories
- 91 stuff categories
- 5 captions per image
- 250,000 people with keypoints
ImageNet
The ILSVRC subset of ImageNet spans 1000 object classes and contains 1,281,167 training images, 50,000 validation images and 100,000 test images. This subset is available on Kaggle.
Applications
CIFAR-10
Dataset: CIFAR-10
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
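The figures above can be cross-checked with a little arithmetic. The sketch below assumes 3 colour channels stored as one byte each, and assumes the train/test split is balanced across classes; the variable names are illustrative only.

```python
# Figures quoted in the dataset description.
NUM_CLASSES = 10
IMAGES_PER_CLASS = 6000
TRAIN_IMAGES = 50000
TEST_IMAGES = 10000
HEIGHT, WIDTH, CHANNELS = 32, 32, 3  # 3 channels assumed for colour images

# Total image count must match the train + test split.
total_images = NUM_CLASSES * IMAGES_PER_CLASS
assert total_images == TRAIN_IMAGES + TEST_IMAGES

# Per-class counts, assuming a class-balanced split.
train_per_class = TRAIN_IMAGES // NUM_CLASSES
test_per_class = TEST_IMAGES // NUM_CLASSES

# Raw storage: one byte per channel value (uint8 assumed).
values_per_image = HEIGHT * WIDTH * CHANNELS
raw_mib = total_images * values_per_image / 2**20

print(total_images, train_per_class, test_per_class)  # 60000 5000 1000
print(values_per_image)                               # 3072
print(round(raw_mib, 2))                              # 175.78
```

At roughly 176 MiB uncompressed, the whole dataset fits comfortably in memory, which is one reason it is popular for quick experiments.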
Kaggle: https://www.kaggle.com/rkuo2000/cifar10-cnn
Traffic Sign Classifier
Dataset: German Traffic Sign Recognition Benchmark (GTSRB)
43 traffic sign classes, 39,209 training images, 12,630 test images
Kaggle: https://www.kaggle.com/rkuo2000/gtsrb-cnn
Emotion Detection
Dataset: FER-2013 (Facial Expression Recognition)
7 facial expressions, 28,709 training images, 7,178 test images
labels = ["angry", "disgusted", "fearful", "happy", "neutral", "sad", "surprised"]
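A classifier with a softmax output returns one probability per label, so the label list is what turns a prediction vector back into a readable emotion. The sketch below assumes the model's output order matches the list above; `decode_prediction` and the example probabilities are hypothetical.

```python
LABELS = ["angry", "disgusted", "fearful", "happy", "neutral", "sad", "surprised"]


def decode_prediction(probabilities, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per label")
    # argmax over the probability vector
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]


# Made-up softmax output for a single face image.
example = [0.05, 0.02, 0.08, 0.60, 0.15, 0.06, 0.04]
print(decode_prediction(example))  # ('happy', 0.6)
```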
Kaggle: https://www.kaggle.com/rkuo2000/fer2013-cnn
Pneumonia Detection
Dataset: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia
Kaggle: https://www.kaggle.com/rkuo2000/pneumonia-cnn
COVID19 Detection
Dataset: https://www.kaggle.com/bachrr/covid-chest-xray
Kaggle: https://www.kaggle.com/rkuo2000/covid19-vgg16
FaceMask Classification
Dataset: Face Mask ~12K Images dataset
Kaggle: https://www.kaggle.com/rkuo2000/facemask-cnn
Garbage Classification
Dataset: https://www.kaggle.com/asdasdasasdas/garbage-classification (42MB)
6 categories: cardboard (403), glass (501), metal (410), paper (594), plastic (482), trash (137)
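The counts above are imbalanced: "trash" has far fewer images than "paper". One common way to compensate is to weight each class inversely to its frequency. The sketch below uses the heuristic total / (num_classes * count); it is not taken from the linked notebook. The alphabetical class-index order is an assumption about how the images are loaded. A dictionary like `class_weight` can be passed to the `class_weight` argument of Keras `Model.fit`.

```python
# Image counts per category, as listed above.
COUNTS = {
    "cardboard": 403,
    "glass": 501,
    "metal": 410,
    "paper": 594,
    "plastic": 482,
    "trash": 137,
}

total = sum(COUNTS.values())
num_classes = len(COUNTS)

# "Balanced" heuristic: rarer classes receive larger weights.
weights = {name: total / (num_classes * n) for name, n in COUNTS.items()}

# Keras expects integer class indices as keys; alphabetical order is assumed.
class_weight = {i: weights[name] for i, name in enumerate(sorted(COUNTS))}

print(total)  # 2527
for name in sorted(COUNTS):
    print(f"{name:10s} {weights[name]:.3f}")
```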
Kaggle: https://www.kaggle.com/rkuo2000/garbage-cnn
Food Classification
Dataset: Food-11
The dataset consists of 16,643 images belonging to 11 major food categories:
- Bread (1,724 images)
- Dairy product (721 images)
- Dessert (2,500 images)
- Egg (1,648 images)
- Fried food (1,461 images)
- Meat (2,206 images)
- Noodles/pasta (734 images)
- Rice (472 images)
- Seafood (1,505 images)
- Soup (2,500 images)
- Vegetable/fruit (1,172 images)
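The per-category counts above add up to the stated total, and they also give a useful reference point: the accuracy of a trivial model that always predicts the largest class. The sketch below computes both over the whole dataset. It ignores any train/validation/evaluation split, so the baseline is only indicative.

```python
# Image counts per Food-11 category, as listed above.
FOOD11_COUNTS = {
    "Bread": 1724,
    "Dairy product": 721,
    "Dessert": 2500,
    "Egg": 1648,
    "Fried food": 1461,
    "Meat": 2206,
    "Noodles/pasta": 734,
    "Rice": 472,
    "Seafood": 1505,
    "Soup": 2500,
    "Vegetable/fruit": 1172,
}

total = sum(FOOD11_COUNTS.values())
largest = max(FOOD11_COUNTS.values())
smallest = min(FOOD11_COUNTS.values())

# Accuracy of always predicting one of the largest classes.
majority_baseline = largest / total
# Ratio between the largest and smallest class sizes.
imbalance_ratio = largest / smallest

print(total)                              # 16643
print(f"baseline accuracy: {majority_baseline:.1%}")
print(f"imbalance ratio: {imbalance_ratio:.2f}")
```

A trained classifier should clearly beat this baseline of about 15% to be worth anything.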
Kaggle: https://www.kaggle.com/rkuo2000/food11-classification
Mango Classification
Dataset: Taiwan High-Value Crops: Irwin Mango Image Recognition, official competition round (台灣高經濟作物 - 愛文芒果影像辨識正式賽)
Kaggle:
- https://www.kaggle.com/rkuo2000/mango-classification
- https://www.kaggle.com/rkuo2000/mango-efficientnet
Transfer Learning
Birds Classification
Dataset: https://www.kaggle.com/rkuo2000/birds2
Search Google for photos and download about 20 to 30 photos for each class. Put them in a folder named birds, compress it into birds.zip, and upload it to Kaggle.com/datasets.
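The packaging step (folder to zip archive) can also be scripted rather than done by hand. The sketch below uses only the standard library. `pack_dataset` and `count_images` are hypothetical helpers, and the one-subfolder-per-class layout is an assumption about how the photos are organised.

```python
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(folder):
    """Count image files in each class subfolder (one subfolder per class assumed)."""
    folder = Path(folder)
    return {
        sub.name: sum(1 for f in sub.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES)
        for sub in sorted(folder.iterdir())
        if sub.is_dir()
    }


def pack_dataset(folder):
    """Zip `folder` next to itself, e.g. birds/ -> birds.zip, and return the path."""
    folder = Path(folder).resolve()
    if not folder.is_dir():
        raise NotADirectoryError(str(folder))
    # root_dir/base_dir keep the top-level folder name inside the archive.
    return shutil.make_archive(
        str(folder), "zip", root_dir=str(folder.parent), base_dir=folder.name
    )


# Usage (assuming a local folder named "birds" exists):
#   print(count_images("birds"))
#   print(pack_dataset("birds"))
```

Checking the per-class counts before uploading helps catch empty or misnamed folders early.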
Kaggle: https://www.kaggle.com/rkuo2000/birds-classification
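Animes Classification (cartoon characters)
Dataset: https://www.kaggle.com/datasets/rkuo2000/animes
Search Google for photos and download about 20 to 30 photos of each cartoon character. Put them in a folder named animes, compress it into animes.zip, and upload it to Kaggle.com/datasets.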
Kaggle: https://www.kaggle.com/rkuo2000/anime-classification
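Worms Classification (pests)
Dataset: worms4
Search Google for photos and download about 20 to 30 photos for each class. Put them in a folder named worms, compress it into worms.zip, and upload it to Kaggle.com/datasets.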
Kaggle: https://www.kaggle.com/rkuo2000/worms-classification
Railway Track Fault Detection
Dataset: Railway Track Fault Detection
Kaggle: https://www.kaggle.com/code/rkuo2000/railtrack-resnet50v2
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras import models, layers

# input_shape and num_classes are assumed to be defined earlier in the notebook
base_model = ResNet50V2(input_shape=input_shape, weights='imagenet', include_top=False)
base_model.trainable = False  # freeze the base model (for transfer learning)

# add fully connected layers on top of the frozen base
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128, activation='relu')(x)  # FC layer
preds = layers.Dense(num_classes, activation='softmax')(x)  # final layer with softmax activation
model = models.Model(inputs=base_model.input, outputs=preds)
model.summary()
Kaggle: https://www.kaggle.com/code/rkuo2000/railtrack-efficientnet
import efficientnet.tfkeras as efn
from tensorflow.keras import models, layers, optimizers, regularizers, callbacks

# input_shape and num_classes are assumed to be defined earlier in the notebook
base_model = efn.EfficientNetB7(input_shape=input_shape, weights='imagenet', include_top=False)
base_model.trainable = False  # freeze the base model (for transfer learning)

# add fully connected layers on top of the frozen base
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128)(x)  # FC layer (linear: no activation specified)
out = layers.Dense(num_classes, activation="softmax")(x)  # final layer with softmax activation
model = models.Model(inputs=base_model.input, outputs=out)
model.summary()
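With the base model frozen, only the two Dense layers on top are trained, so the trainable parameter count depends on the backbone's output width alone. The sketch below works that out by hand. The widths (2048 for ResNet50V2, 2560 for EfficientNetB7) are the channel counts of each network's final feature map. `num_classes = 2` is a hypothetical value, since the notebook's actual class count is not shown here.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def head_params(feature_dim, hidden_units, num_classes):
    """Trainable parameters of GlobalAveragePooling2D -> Dense -> Dense.

    Global average pooling has no parameters, so only the two Dense layers count.
    """
    return dense_params(feature_dim, hidden_units) + dense_params(hidden_units, num_classes)


# Channel width of each backbone's final feature map (after pooling).
BACKBONE_FEATURES = {"ResNet50V2": 2048, "EfficientNetB7": 2560}

num_classes = 2  # hypothetical, e.g. defective vs. non-defective track

for name, dim in BACKBONE_FEATURES.items():
    print(name, head_params(dim, hidden_units=128, num_classes=num_classes))
# ResNet50V2 262530
# EfficientNetB7 328066
```

These figures should match the "Trainable params" line printed by `model.summary()` when the base model is frozen and the same `num_classes` is used.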
Skin Lesion Classification
Dataset: Skin Cancer MNIST: HAM10000
7 types of lesions (image size: 600x450):
- Actinic Keratoses
- Basal Cell Carcinoma
- Benign Keratosis
- Dermatofibroma
- Malignant Melanoma
- Melanocytic Nevi
- Vascular Lesions
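The source images are 600x450, while ImageNet-pretrained backbones are typically fed square inputs such as 224x224. Resizing directly distorts the aspect ratio; cropping the central square first is one alternative. The sketch below computes that crop box. It is not necessarily the preprocessing used in the linked notebook, and `center_crop_box` is a hypothetical helper. The box is returned as (left, upper, right, lower), the same order Pillow's `Image.crop` expects.

```python
def center_crop_box(width, height):
    """Return (left, upper, right, lower) of the largest centred square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# HAM10000 images are 600x450, so the crop trims 75 pixels from each side.
box = center_crop_box(600, 450)
print(box)  # (75, 0, 525, 450)

# Scale factors if the image were resized to 224x224 without cropping.
scale_x = 224 / 600
scale_y = 224 / 450
print(f"{scale_x:.3f} {scale_y:.3f}")  # unequal factors mean the image is stretched
```

Cropping discards the left and right margins, so it is only appropriate when the lesion sits near the centre of the image.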
Kaggle: https://www.kaggle.com/code/rkuo2000/skin-lesion-classification
- assign base_model
#base_model=applications.MobileNetV2(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.InceptionV3(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.ResNet50V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.ResNet101V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.ResNet152V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.DenseNet121(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.DenseNet169(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.DenseNet201(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.NASNetMobile(input_shape=(224,224,3), weights='imagenet',include_top=False)
#base_model=applications.NASNetLarge(input_shape=(331,331,3), weights='imagenet',include_top=False)
- import EfficientNet model
import efficientnet.tfkeras as efn
base_model = efn.EfficientNetB7(input_shape=(224,224,3), weights='imagenet', include_top=False)
This site was last updated December 15, 2024.