1 Overview
In this project, you will design and implement your own deep learning model to perform 10-class image classification on the given dataset. You will have access to the training data to train and tune your model, and to a public testing dataset for evaluation before your final submission. The final evaluation of your model will use a private testing dataset. Grading will be based on the novelty of the proposed model and its testing performance on the private dataset.
2 Project Instructions
What You Are Expected to Do
You are expected to explore the state of the art in deep learning based image classification, and to propose and implement your own deep learning models. You may explore anything, including
Novel neural network architectures, novel operations, blocks and modules,
Data pre-processing, normalizations and augmentations,
Training strategies, optimizers, parameter initializations, regularizations, etc.
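As one illustration of the augmentation item above, the following is a minimal NumPy sketch of random cropping with zero padding plus random horizontal flipping, applied to a single [32, 32, 3] image. The function name `augment`, the padding of 4 pixels, and the flip probability of 0.5 are assumptions chosen for the example; they are not prescribed by the project or the starter code.

```python
import numpy as np

def augment(image, rng, pad=4):
    """Randomly crop and horizontally flip one [H, W, C] image (hypothetical helper)."""
    h, w, _ = image.shape
    # Zero-pad the borders, then cut a random window of the original size.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    crop = padded[top:top + h, left:left + w, :]
    # Flip left-right with probability 0.5 (assumed value).
    if rng.random() < 0.5:
        crop = crop[:, ::-1, :]
    return crop

rng = np.random.default_rng(0)
dummy = np.zeros((32, 32, 3), dtype=np.uint8)  # stand-in for a real training image
augmented = augment(dummy, rng)
```

Augmentation of this kind would normally be applied only to training images, with validation and testing images passed through unchanged apart from normalization.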
What You Are NOT Expected to Do
In this project, you are provided with the training dataset. However, for the sake of fairness, you are NOT allowed to use extra image data for transfer learning or pre-training.
Do not simply and directly apply the existing commonly used network architectures such as ResNet, GoogLeNet, VGG, etc.
Deep Learning Framework and Libraries
You are free to choose either TensorFlow or PyTorch. The recommended stable versions are
TensorFlow 1.14
TensorFlow 2.2
PyTorch 1.6
Please specify the version you use in your report. If you use any other versions, please make sure your code works on one of the three versions. Besides, you are free to use any other common open source libraries in your project.
Training and Testing Dataset
You are provided with the CIFAR-10 Dataset for this project. The Dataset is available at https://www.cs.toronto.edu/~kriz/cifar.html. Please carefully read the dataset description and implement your code to load and process the images. Note that you should only use the training dataset containing 50k images for training and validation. The public testing dataset should NOT be used for tuning your model.
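The loading and splitting steps described above might be sketched as follows, assuming the "python version" of CIFAR-10 has been downloaded and extracted, so that the training data sits in five pickled files named `data_batch_1` to `data_batch_5`. The function names and the validation size of 5000 are assumptions made for this example; the starter code may organize this differently.

```python
import os
import pickle
import numpy as np

def load_batch(path):
    """Load one CIFAR-10 'python version' batch file into arrays."""
    with open(path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    # Each row holds 3072 bytes: 1024 red, then 1024 green, then 1024 blue values.
    data = np.asarray(batch[b"data"], dtype=np.uint8)
    images = data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)  # -> [N, 32, 32, 3]
    labels = np.asarray(batch[b"labels"], dtype=np.int64)
    return images, labels

def load_training_data(data_dir):
    """Concatenate the five training batches (50k images in total)."""
    parts = [load_batch(os.path.join(data_dir, "data_batch_%d" % i)) for i in range(1, 6)]
    x = np.concatenate([p[0] for p in parts])
    y = np.concatenate([p[1] for p in parts])
    return x, y

def train_valid_split(x, y, valid_size=5000, seed=0):
    """Hold out part of the training set for validation (size is an assumed value)."""
    order = np.random.default_rng(seed).permutation(len(x))
    valid_idx, train_idx = order[:valid_size], order[valid_size:]
    return x[train_idx], y[train_idx], x[valid_idx], y[valid_idx]
```

Holding out a validation subset of the 50k training images gives a way to tune hyper-parameters without touching the public testing dataset.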
The images of the private testing dataset will be released one week prior to the due date. The images are stored in an .npy file in the shape [N, 32, 32, 3], where N is the number of testing images. The labels of the private testing dataset will not be available. You will need to run your model prediction on the private dataset and submit the prediction results. Please make sure you have saved your model variables in files after training for your convenience.
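Once the private images are released, a quick shape check before running prediction can catch a mismatched file early. The sketch below assumes only what is stated above, namely an .npy file holding an array of shape [N, 32, 32, 3]; the function name and any file name passed to it are hypothetical.

```python
import numpy as np

def load_private_images(path):
    """Load the private test images and verify the documented [N, 32, 32, 3] layout."""
    images = np.load(path)
    if images.ndim != 4 or images.shape[1:] != (32, 32, 3):
        raise ValueError("expected shape [N, 32, 32, 3], got %s" % (images.shape,))
    return images
```

The returned array can then be pre-processed in the same way as the training images before being passed to the model.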
Code Files Structure
For better code readability, please follow the starter code and organize your code into the following files during implementation.
main.py: Includes the code that loads the dataset and performs the training, testing and prediction.
DataLoader.py: Includes the code that defines functions related to data I/O.
ImageUtils.py: Includes the code that defines functions for any (pre-)processing of the images.
Configure.py: Includes dictionaries that set the model configurations, hyper-parameters, training settings, etc. The dictionaries are imported into main.py.
Model.py: Includes the code that defines your model in a class. The class is initialized with the configuration dictionaries and should have at least the methods “train(X, Y, configs, [X_valid, Y_valid,])”, “evaluate(X, Y)”, “predict_prob(X)”. The defined model class is imported into and referenced in main.py.
Network.py: Includes the code that defines the network architecture. The defined network will be imported and referenced in Model.py.
Detailed descriptions are provided in the starter code. You can add additional files that define your specific modules, blocks, utility functions, etc.
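To make the required interface concrete, here is a framework-agnostic sketch of a model class exposing the three methods listed above, together with two configuration dictionaries of the kind Configure.py might hold. Everything in it is an assumption for illustration: the class name `MyModel`, the dictionary keys, and above all the linear softmax classifier, which merely stands in for a real network and would not be an acceptable submission. The actual starter code may differ.

```python
import numpy as np

# Hypothetical Configure.py contents (key names are assumptions).
model_configs = {"num_classes": 10}
training_configs = {"epochs": 2, "learning_rate": 0.1}

class MyModel:
    """Interface sketch; a linear softmax classifier stands in for the real network."""

    def __init__(self, configs):
        self.configs = configs
        self.weights = None

    @staticmethod
    def _flatten(X):
        # Scale uint8 pixels to [0, 1] and flatten each image to a vector.
        return X.reshape(len(X), -1).astype(np.float64) / 255.0

    def predict_prob(self, X):
        logits = self._flatten(X) @ self.weights
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        exp = np.exp(logits)
        return exp / exp.sum(axis=1, keepdims=True)  # shape [N, num_classes]

    def train(self, X, Y, configs, X_valid=None, Y_valid=None):
        num_classes = self.configs["num_classes"]
        self.weights = np.zeros((X[0].size, num_classes))
        onehot = np.eye(num_classes)[Y]
        flat = self._flatten(X)
        for epoch in range(configs["epochs"]):
            # Gradient of the mean cross-entropy loss for softmax regression.
            grad = flat.T @ (self.predict_prob(X) - onehot) / len(X)
            self.weights -= configs["learning_rate"] * grad
            if X_valid is not None and Y_valid is not None:
                acc = self.evaluate(X_valid, Y_valid)
                print("epoch %d, validation accuracy %.3f" % (epoch + 1, acc))

    def evaluate(self, X, Y):
        # Fraction of images whose most probable class matches the label.
        return float((self.predict_prob(X).argmax(axis=1) == Y).mean())
```

In main.py, such a class would be constructed from the model configuration, trained with the training configuration, and then used through `evaluate` on the public testing set and `predict_prob` on the private images.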
Submission Guidance
Your final submission should include the following files.
Prediction on the private test images: Please store your prediction results as an array in an .npy file named “predictions.npy”. For each image, store a vector of the probabilities for the 10 classes instead of the predicted class. The shape of the saved array is [N, 10], where N is the number of testing images.
Code: Please put all your code files in a “code” folder. Also include a README file that describes how to run your code for training and prediction.
Saved models: Please keep the trained model that reproduces your results on the public testing set and your predictions on the private testing set. Put the related model files in a “saved_models” folder. If the files exceed the upload size limit, you can put them on Google Drive and include a share link in the “saved_models” folder.
Report: Describe your proposed method and implementation details, and summarize your results on the public testing dataset in your report. You can also report anything that you find interesting. The report should be in .pdf format and named “report.pdf”.
Please compress all the files above into a single .zip file named “Firstname_Lastname.zip”.
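For the prediction file in the list above, a small helper can check the array before writing it. This is a sketch under the stated requirements only (an [N, 10] array of class probabilities saved as “predictions.npy”); the function name, the float32 dtype, and the row-sum tolerance are assumptions.

```python
import numpy as np

def save_predictions(probs, path="predictions.npy"):
    """Validate an [N, 10] probability array and save it as an .npy file."""
    probs = np.asarray(probs, dtype=np.float32)
    if probs.ndim != 2 or probs.shape[1] != 10:
        raise ValueError("expected shape [N, 10], got %s" % (probs.shape,))
    # Probabilities over the 10 classes should sum to 1 for every image.
    if not np.allclose(probs.sum(axis=1), 1.0, atol=1e-3):
        raise ValueError("each row should sum to 1")
    np.save(path, probs)
    return probs.shape
```

Passing the output of the model's “predict_prob(X)” on the private images to this helper would produce the required file, provided N matches the number of released testing images.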