Image Classification

Introduction to Image Classification, including datasets, models, applications, and transfer learning


Datasets

PASCAL VOC (Visual Object Classes)

VOC2007 train/val/test: 9,963 annotated images containing 24,640 annotated objects
VOC2012 train/val: 11,530 annotated images containing 27,450 ROI-annotated objects
20 classes:

  • Person: person
  • Animal: bird, cat, cow, dog, horse, sheep
  • Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
  • Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
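
The grouping above can be kept as a small lookup table when preparing labels. This is a minimal sketch rather than anything from the official VOC tooling: the names VOC_GROUPS, VOC_CLASSES, CLASS_TO_INDEX and CLASS_TO_GROUP are illustrative, and the alphabetical label order is an assumption.

```python
# Super-category -> class names, copied from the list above.
VOC_GROUPS = {
    "Person": ["person"],
    "Animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "Vehicle": ["aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train"],
    "Indoor": ["bottle", "chair", "dining table", "potted plant", "sofa", "tv/monitor"],
}

# Flatten and sort so every class gets a stable integer label
# (alphabetical order is an assumption, not a VOC requirement).
VOC_CLASSES = sorted(name for names in VOC_GROUPS.values() for name in names)
CLASS_TO_INDEX = {name: i for i, name in enumerate(VOC_CLASSES)}

# Reverse lookup: class name -> super-category.
CLASS_TO_GROUP = {name: group for group, names in VOC_GROUPS.items() for name in names}

print(len(VOC_CLASSES))         # number of classes
print(CLASS_TO_GROUP["horse"])  # super-category of one class
```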

COCO Dataset

  • Object segmentation
  • Recognition in context
  • Superpixel stuff segmentation
  • 330K images (>200K labeled)
  • 1.5 million object instances
  • 80 object categories
  • 91 stuff categories
  • 5 captions per image
  • 250,000 people with keypoints
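
COCO annotations are distributed as JSON with "images", "annotations" and "categories" lists, so figures such as instances per category can be recomputed from the file. The sketch below assumes that layout and runs on a tiny hand-made dictionary, not on real COCO data; the function name instances_per_category is illustrative.

```python
from collections import Counter


def instances_per_category(coco):
    """Count annotated object instances per category name in a COCO-style dict."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    counts = Counter(id_to_name[a["category_id"]] for a in coco["annotations"])
    return dict(counts)


# Hand-made toy example (not real data): one image with two dogs and one person.
# Each bbox is [x, y, width, height] in pixels.
toy = {
    "images": [{"id": 1, "file_name": "example.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 18, "bbox": [10, 20, 100, 80]},
        {"id": 11, "image_id": 1, "category_id": 18, "bbox": [200, 40, 90, 70]},
        {"id": 12, "image_id": 1, "category_id": 1, "bbox": [300, 10, 60, 200]},
    ],
}

print(instances_per_category(toy))
```

For a real annotation file, the dictionary would come from json.load on the downloaded file instead of the toy literal.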

ImageNet

The ILSVRC 2012 subset of ImageNet spans 1000 object classes and contains 1,281,167 training images, 50,000 validation images and 100,000 test images. This subset is available on Kaggle.


Applications

CIFAR-10

Dataset: CIFAR-10

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
Kaggle: https://www.kaggle.com/rkuo2000/cifar10-cnn
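
The figures above fix the array shapes a CIFAR-10 model has to accept. The following is a small arithmetic sketch; it assumes "colour" means 3 RGB channels and that pixels are stored as one byte each, and the variable names are illustrative.

```python
NUM_IMAGES = 60_000
NUM_CLASSES = 10
TRAIN_IMAGES = 50_000
TEST_IMAGES = 10_000
HEIGHT, WIDTH, CHANNELS = 32, 32, 3  # assumption: colour = 3 RGB channels

images_per_class = NUM_IMAGES // NUM_CLASSES          # matches "6000 images per class"
values_per_image = HEIGHT * WIDTH * CHANNELS          # numbers a model sees per image
train_shape = (TRAIN_IMAGES, HEIGHT, WIDTH, CHANNELS)  # channels-last layout
train_bytes = TRAIN_IMAGES * values_per_image          # assumption: 1 byte per value (uint8)

print(images_per_class, values_per_image)
print(train_shape, train_bytes)
```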


Traffic Sign Classifier

Dataset: German Traffic Sign Recognition Benchmark (GTSRB)

43 traffic sign classes, 39209 training images, 12630 test images
Kaggle: https://www.kaggle.com/rkuo2000/gtsrb-cnn


Emotion Detection

Dataset: FER-2013 (Facial Expression Recognition)

7 facial expressions, 28709 training images, 7178 test images
labels = ["angry", "disgusted", "fearful", "happy", "neutral", "sad", "surprised"]
Kaggle: https://www.kaggle.com/rkuo2000/fer2013-cnn
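
A classifier trained on these labels outputs one probability per class, and the predicted emotion is the label with the highest value. This is a minimal sketch; it assumes the model's output order matches the order of the labels list, and decode_prediction and the example probabilities are made up for illustration.

```python
LABELS = ["angry", "disgusted", "fearful", "happy", "neutral", "sad", "surprised"]


def decode_prediction(probabilities):
    """Return (label, probability) for the highest-scoring class."""
    if len(probabilities) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} values, got {len(probabilities)}")
    # argmax over the class indices
    best = max(range(len(LABELS)), key=lambda i: probabilities[i])
    return LABELS[best], probabilities[best]


# Made-up softmax output for a single face image (illustration only).
example = [0.05, 0.02, 0.03, 0.70, 0.10, 0.06, 0.04]
label, score = decode_prediction(example)
print(label, score)
```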


Pneumonia Detection

Dataset: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

Kaggle: https://www.kaggle.com/rkuo2000/pneumonia-cnn


COVID-19 Detection

Dataset: https://www.kaggle.com/bachrr/covid-chest-xray

Kaggle: https://www.kaggle.com/rkuo2000/covid19-vgg16


Face Mask Classification

Dataset: Face Mask ~12K Images dataset

Kaggle: https://www.kaggle.com/rkuo2000/facemask-cnn


Garbage Classification

Dataset: https://www.kaggle.com/asdasdasasdas/garbage-classification (42MB)

6 categories: cardboard (403), glass (501), metal (410), paper (594), plastic (482), trash (137)

Kaggle: https://www.kaggle.com/rkuo2000/garbage-cnn
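
The per-category counts are uneven (trash has 137 images, paper has 594), so one common option is to weight the loss per class. The sketch below uses the "balanced" heuristic, total / (number of classes × class count). This is a swapped-in technique for illustration, not necessarily what the linked notebook does, and the alphabetical mapping from class name to index is an assumption.

```python
# Image counts per category, copied from the line above.
COUNTS = {
    "cardboard": 403,
    "glass": 501,
    "metal": 410,
    "paper": 594,
    "plastic": 482,
    "trash": 137,
}


def balanced_class_weights(counts):
    """Return {class_index: weight}, where rarer classes get larger weights."""
    total = sum(counts.values())
    n_classes = len(counts)
    # Assumption: class indices follow the alphabetical order of the names.
    names = sorted(counts)
    return {i: total / (n_classes * counts[name]) for i, name in enumerate(names)}


weights = balanced_class_weights(COUNTS)
for i, name in enumerate(sorted(COUNTS)):
    print(i, name, round(weights[i], 3))
```

A dictionary of this shape is what the class_weight argument of Keras model.fit expects.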


Food Classification

Dataset: Food-11
The dataset consists of 16,643 images belonging to 11 major food categories:

  • Bread (1,724 images)
  • Dairy product (721 images)
  • Dessert (2,500 images)
  • Egg (1,648 images)
  • Fried food (1,461 images)
  • Meat (2,206 images)
  • Noodles/pasta (734 images)
  • Rice (472 images)
  • Seafood (1,505 images)
  • Soup (2,500 images)
  • Vegetable/fruit (1,172 images)

Kaggle: https://www.kaggle.com/rkuo2000/food11-classification
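
As a quick consistency check, the per-category counts listed above can be summed and compared with the stated total of 16,643 images. This short sketch also reports how uneven the categories are; the variable names are illustrative.

```python
# Image counts per category, copied from the list above.
FOOD11_COUNTS = {
    "Bread": 1724,
    "Dairy product": 721,
    "Dessert": 2500,
    "Egg": 1648,
    "Fried food": 1461,
    "Meat": 2206,
    "Noodles/pasta": 734,
    "Rice": 472,
    "Seafood": 1505,
    "Soup": 2500,
    "Vegetable/fruit": 1172,
}

total = sum(FOOD11_COUNTS.values())
shares = {name: count / total for name, count in FOOD11_COUNTS.items()}

# Ratio between the largest and the smallest category.
imbalance_ratio = max(FOOD11_COUNTS.values()) / min(FOOD11_COUNTS.values())

print(total)
print(round(imbalance_ratio, 2))
```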


Mango Classification

Dataset: Taiwan High-Economic-Value Crops: Irwin Mango Image Recognition, Official Round (台灣高經濟作物 - 愛文芒果影像辨識正式賽)
Kaggle:


Transfer Learning

Birds Classification

Dataset: https://www.kaggle.com/rkuo2000/birds2

Search Google for photos, download about 20 to 30 photos of each bird, put them in a folder named birds, compress it into birds.zip, then upload it to Kaggle.com/datasets.

Kaggle: https://www.kaggle.com/rkuo2000/birds-classification
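
The packaging step can also be scripted. The sketch below is one possible way to do it with the Python standard library. It assumes one sub-folder per class inside the dataset folder, and the helper name zip_dataset, the class name sparrow and the file a.jpg are hypothetical. The demo builds a throwaway folder so that the snippet runs on its own.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def zip_dataset(folder):
    """Compress `folder` into <folder>.zip next to it and return the archive path."""
    folder = Path(folder).resolve()
    if not folder.is_dir():
        raise FileNotFoundError(folder)
    archive = shutil.make_archive(
        base_name=str(folder),        # output path without the .zip suffix
        format="zip",
        root_dir=str(folder.parent),
        base_dir=folder.name,         # keep the top-level folder inside the archive
    )
    return Path(archive)


# Demo on a temporary folder with a single placeholder "photo".
with tempfile.TemporaryDirectory() as tmp:
    birds = Path(tmp) / "birds"
    (birds / "sparrow").mkdir(parents=True)         # hypothetical class folder
    (birds / "sparrow" / "a.jpg").write_bytes(b"placeholder")
    archive_path = zip_dataset(birds)
    archive_name = archive_path.name
    with zipfile.ZipFile(archive_path) as zf:
        names = zf.namelist()

print(archive_name)
print(names)
```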


Anime Classification (Cartoon Characters)

Dataset: https://www.kaggle.com/datasets/rkuo2000/animes

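Search Google for photos, download about 20 to 30 photos of each cartoon character, put them in a folder named animes, compress it into animes.zip, then upload it to Kaggle.com/datasets.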

Kaggle: https://www.kaggle.com/rkuo2000/anime-classification


Worms Classification (Pest Classification)

Dataset: worms4

Search Google for photos, download about 20 to 30 photos of each pest, put them in a folder named worms, compress it into worms.zip, then upload it to Kaggle.com/datasets.

Kaggle: https://www.kaggle.com/rkuo2000/worms-classification


Railway Track Fault Detection

Dataset: Railway Track Fault Detection
Kaggle: https://www.kaggle.com/code/rkuo2000/railtrack-resnet50v2

from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras import models, layers

# input_shape (e.g. (224, 224, 3)) and num_classes must be set for the dataset before this point
base_model = ResNet50V2(input_shape=input_shape, weights='imagenet', include_top=False)
base_model.trainable = False  # freeze the base model (for transfer learning)

# add fully connected layers on top of the frozen base
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)  # collapse the feature map to one vector per image
x = layers.Dense(128, activation='relu')(x)  # FC layer
preds = layers.Dense(num_classes, activation='softmax')(x)  # final layer with softmax activation

model = models.Model(inputs=base_model.input, outputs=preds)
model.summary()

Kaggle: https://www.kaggle.com/code/rkuo2000/railtrack-efficientnet

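# requires the third-party efficientnet package (pip install efficientnet)
import efficientnet.tfkeras as efn
from tensorflow.keras import models, layers, optimizers, regularizers, callbacks

# input_shape and num_classes must be set for the dataset before this point
base_model = efn.EfficientNetB7(input_shape=input_shape, weights='imagenet', include_top=False)
base_model.trainable = False  # freeze the base model (for transfer learning)

# add the classification head on top of the frozen base
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128)(x)  # no activation is given, so this layer is linear
out = layers.Dense(num_classes, activation="softmax")(x)

model = models.Model(inputs=base_model.input, outputs=out)
model.summary()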
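
With the base model frozen, only the two Dense layers of the head are trained. The sketch below estimates how many parameters that is for both backbones, without needing TensorFlow. It relies on the final feature map having 2048 channels for ResNet50V2 and 2560 for EfficientNetB7; num_classes = 2 is an assumption for illustration, and the helper names are hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def head_params(feature_dim, num_classes, hidden_units=128):
    """Trainable parameters of GlobalAveragePooling2D -> Dense(128) -> Dense(num_classes)."""
    # GlobalAveragePooling2D has no parameters; it outputs one value per channel.
    return dense_params(feature_dim, hidden_units) + dense_params(hidden_units, num_classes)


# Channels of the last feature map for each backbone with include_top=False.
FEATURE_DIMS = {"ResNet50V2": 2048, "EfficientNetB7": 2560}
NUM_CLASSES = 2  # assumption for illustration (e.g. defective vs. non-defective)

for name, dim in FEATURE_DIMS.items():
    print(name, head_params(dim, NUM_CLASSES))
```

These totals should match the "Trainable params" line printed by model.summary() when the base model is frozen and num_classes is 2.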

Skin Lesion Classification

Dataset: Skin Cancer MNIST: HAM10000

7 types of lesions (image size 600x450):

  • Actinic Keratoses
  • Basal Cell Carcinoma
  • Benign Keratosis
  • Dermatofibroma
  • Malignant Melanoma
  • Melanocytic Nevi
  • Vascular Lesions

Kaggle: https://www.kaggle.com/code/rkuo2000/skin-lesion-classification

  • Assign base_model (uncomment one of the lines below; requires: from tensorflow.keras import applications)
    #base_model=applications.MobileNetV2(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.InceptionV3(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.ResNet50V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.ResNet101V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.ResNet152V2(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.DenseNet121(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.DenseNet169(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.DenseNet201(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.NASNetMobile(input_shape=(224,224,3), weights='imagenet',include_top=False)
    #base_model=applications.NASNetLarge(input_shape=(331,331,3), weights='imagenet',include_top=False)
    
  • Import the EfficientNet model
    import efficientnet.tfkeras as efn
    base_model = efn.EfficientNetB7(input_shape=(224,224,3), weights='imagenet', include_top=False)
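
Instead of commenting lines in and out, the choice of backbone can be driven by a name. The sketch below only builds the constructor arguments listed above, so it runs without TensorFlow; BACKBONES, INPUT_SHAPE_OVERRIDES and backbone_kwargs are hypothetical helper names.

```python
# Backbones listed above; all use 224x224 inputs except NASNetLarge.
BACKBONES = [
    "MobileNetV2", "InceptionV3",
    "ResNet50V2", "ResNet101V2", "ResNet152V2",
    "DenseNet121", "DenseNet169", "DenseNet201",
    "NASNetMobile", "NASNetLarge",
]
DEFAULT_INPUT_SHAPE = (224, 224, 3)
INPUT_SHAPE_OVERRIDES = {"NASNetLarge": (331, 331, 3)}


def backbone_kwargs(name):
    """Return the keyword arguments used above for the given backbone name."""
    if name not in BACKBONES:
        raise ValueError(f"unknown backbone: {name!r}")
    return {
        "input_shape": INPUT_SHAPE_OVERRIDES.get(name, DEFAULT_INPUT_SHAPE),
        "weights": "imagenet",
        "include_top": False,
    }


print(backbone_kwargs("ResNet50V2"))
print(backbone_kwargs("NASNetLarge"))
```

With TensorFlow installed, a call such as getattr(applications, name)(**backbone_kwargs(name)) should build the corresponding model, since each listed name is a constructor in tensorflow.keras.applications.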
    



This site was last updated November 15, 2024.