NIC-2015-Pytorch

This project is a PyTorch implementation of the 2015 Neural Image Caption (NIC) paper by Vinyals et al. [PDF]. The implementation is inspired by the Udacity image captioning project [Repo link].

Backend: PyTorch, torchvision
Dataset: MS COCO 2014 dataset [Link]

Model Architecture

File Description

data_loader.py : Dataloader class and functions for data augmentation.
model.py : Model class consisting of model definitions and functions.
vocabulary.py : Class consisting of vocabulary functions.
training.ipynb : Jupyter notebook for training, with hyperparameters such as learning rate, batch size, embedding size and hidden state size.
inference.ipynb : Jupyter notebook to sample the captions generated by the encoder-decoder model.
vocabulary and architecture experiments.ipynb : Jupyter notebook to understand the vocabulary generation process and to experiment with the CNN-RNN architecture in order to check that the model.py implementation is correct.
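
To make the vocabulary generation step more concrete, here is a minimal sketch of how a caption vocabulary could be built. It is not taken from vocabulary.py: the function names build_vocab and encode, the special tokens, and the min_count threshold are all assumptions chosen for illustration, and the repository's own class may differ.

```python
from collections import Counter

# Hypothetical special tokens; vocabulary.py may use different names.
START, END, UNK = "<start>", "<end>", "<unk>"


def build_vocab(captions, min_count=2):
    """Map every word seen at least `min_count` times to an integer index."""
    counts = Counter(word for c in captions for word in c.lower().split())
    word2idx = {START: 0, END: 1, UNK: 2}
    # Sort words so that indices are deterministic across runs.
    for word, n in sorted(counts.items()):
        if n >= min_count:
            word2idx[word] = len(word2idx)
    return word2idx


def encode(caption, word2idx):
    """Convert a caption to indices, wrapped in start/end tokens."""
    ids = [word2idx.get(w, word2idx[UNK]) for w in caption.lower().split()]
    return [word2idx[START]] + ids + [word2idx[END]]
```

Words below the threshold are mapped to the unknown token at encoding time, which keeps the embedding table small at the cost of losing rare words.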

Dataset setup instructions

Please follow these instructions to set up the MS COCO 2014 dataset for training. Note that the training images take 13 GB and the test images another 6 GB. Before downloading, make sure the server has good bandwidth and enough storage (at least 20 GB for the train and test images, or about 25 GB if the 6 GB of validation images are downloaded as well).

Clone this repo: https://github.com/cocodataset/cocoapi

git clone https://github.com/cocodataset/cocoapi.git

Set up the COCO API (also described in the readme here)

cd cocoapi/PythonAPI  
make  
cd ..

Download the following data from http://cocodataset.org/#download:

Under Annotations, download:
- 2014 Train/Val annotations [241MB] (extract captions_train2014.json and captions_val2014.json, and place at locations cocoapi/annotations/captions_train2014.json and cocoapi/annotations/captions_val2014.json, respectively)
- 2014 Testing Image info [1MB] (extract image_info_test2014.json and place at location cocoapi/annotations/image_info_test2014.json)
Under Images, download:
- 2014 Train images [83K/13GB] (extract the train2014 folder and place at location cocoapi/images/train2014/)
- 2014 Val images [41K/6GB] (extract the val2014 folder and place at location cocoapi/images/val2014/)
- 2014 Test images [41K/6GB] (extract the test2014 folder and place at location cocoapi/images/test2014/)
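
After extracting the archives, it can help to confirm that every file and folder sits where the instructions above expect it. The following is a small sketch using only the standard library; the function name missing_paths is hypothetical, and the relative paths are copied from the list above.

```python
from pathlib import Path

# Relative paths (under the cocoapi folder) taken from the instructions above.
EXPECTED = [
    "annotations/captions_train2014.json",
    "annotations/captions_val2014.json",
    "annotations/image_info_test2014.json",
    "images/train2014",
    "images/val2014",
    "images/test2014",
]


def missing_paths(cocoapi_root):
    """Return the expected paths that do not exist under `cocoapi_root`."""
    root = Path(cocoapi_root)
    return [p for p in EXPECTED if not (root / p).exists()]
```

Calling missing_paths("cocoapi") from the directory that contains the cloned repo returns an empty list when the layout is complete, and otherwise lists what still needs to be downloaded or moved.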

pshwetank/NIC-2015-Pytorch
