Fast AI - Week 2

This week in the fast.ai course we got more into the details of getting data for image classification models, playing around with the different training parameters, and running them on sample data.

This blog post is focused on some of the main code steps in the lesson 2 Jupyter notebook, and how I used it to build my own penguin image classifier!

How to scrape images from Google Images

Coming from a web development background, it’s kind of funny to see raw JavaScript pasted straight into the browser’s console, but it works. It opens up a CSV of the URLs of all of the selected photos.

urls = Array.from(document.querySelectorAll('.rg_di .rg_meta'))
    .map(el => JSON.parse(el.textContent).ou);
window.open('data:text/csv;charset=utf-8,' +
    escape(urls.join('\n')));
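Once the CSV is saved, the URLs still have to become image files in a folder. The course notebook uses a fastai helper for that. The sketch below is a standard-library stand-in, not that helper: `plan_downloads`, `download_all`, the `.jpg` fallback and the eight-digit file names (modelled on the `00000004.jpeg` file that shows up later in this post) are all my own assumptions.

```python
import os
import urllib.request
from urllib.parse import urlparse


def plan_downloads(csv_text, dest_dir, max_pics=200):
    """Pair each unique URL with a numbered destination file (hypothetical helper)."""
    seen = set()
    plan = []
    for line in csv_text.splitlines():
        url = line.strip()
        if not url or url in seen:
            continue  # skip blank lines and duplicate URLs
        seen.add(url)
        # Keep the URL's own extension when it has one, otherwise assume .jpg
        ext = os.path.splitext(urlparse(url).path)[1].lower() or ".jpg"
        plan.append((url, os.path.join(dest_dir, f"{len(plan):08d}{ext}")))
        if len(plan) == max_pics:
            break
    return plan


def download_all(plan, timeout=10):
    """Fetch every planned file, skipping URLs that fail; returns the saved paths."""
    saved = []
    for url, dest in plan:
        os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                data = resp.read()
            with open(dest, "wb") as out:
                out.write(data)
            saved.append(dest)
        except OSError:
            pass  # dead links are common in scraped lists
    return saved


# Made-up URLs, one per line, the way the browser snippet writes them
sample = (
    "https://example.com/a/emperor.jpeg\n"
    "\n"
    "https://example.com/b/king\n"
    "https://example.com/a/emperor.jpeg\n"
)
plan = plan_downloads(sample, "penguins/emperor")
```

Calling `download_all(plan)` would then actually fetch the files. Planning and downloading are kept separate so the naming logic can be checked without touching the network.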

View Data

After we’ve gotten all the photos downloaded into the folders (half the battle), it’s important to look at the data and make sure everything’s good. The command used in the notebook is:

data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2,
ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

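ImageDataBunch is a class in the fast.ai library that loads your photos from a folder structure into the form the neural net trains on. Here each subfolder of path is treated as one class, train="." says the images live directly under path rather than in a separate training folder, and valid_pct=0.2 randomly holds out 20% of them as the validation set. This is nice because models often expect a certain directory structure that contains a training set and a validation set, and this way we don’t have to build one by hand. Kaggle, for example, provides data in this format for their competitions.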
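To get a feel for what valid_pct=0.2 means, here is a rough sketch of a random hold-out split in plain Python. It illustrates the idea and is not fastai's code: the function name, the seed and the file names are made up.

```python
import random


def split_train_valid(filenames, valid_pct=0.2, seed=42):
    """Shuffle the files and hold out a fraction of them for validation."""
    files = sorted(filenames)            # fixed starting order, so the seed is reproducible
    random.Random(seed).shuffle(files)
    n_valid = int(len(files) * valid_pct)
    return files[n_valid:], files[:n_valid]


files = [f"{i:08d}.jpg" for i in range(100)]   # made-up file names
train, valid = split_train_valid(files)
print(len(train), len(valid))                  # 80 20
```

Fixing the seed matters more than it looks: without it, every run validates on a different 20%, and error rates from different runs aren't comparable.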

Then we display a sample of the images in the dataset:

data.show_batch()

And we now have some penguins!

penguins

Train model

Now that we know we have the data in the right format, we can actually create a model. The fast.ai library does this via the create_cnn function, which takes in a databunch object, a pretrained CNN architecture, and a metrics parameter.

learn = create_cnn(data, models.resnet34, metrics=error_rate)

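Resnet, short for “residual network,” is a type of deep convolutional neural network. resnet34 is the 34-layer version, and it comes with torchvision, PyTorch’s companion vision library, with weights pretrained on ImageNet.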

Now, let’s see how well our model does out of the box.

learn.fit_one_cycle(4)
epoch train_loss valid_loss error_rate
1 1.211088 0.636326 0.245614
2 0.773919 0.332470 0.140351
3 0.547051 0.278447 0.087719
4 0.434100 0.278498 0.070175

Running four epochs with the default learning rate gives us an error rate of 7%, which is pretty decent.
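The error rate is just the fraction of validation images the model gets wrong. A minimal sketch, with a hypothetical `error_rate` function of my own rather than the fastai metric itself. The 57-image validation set is my inference (4 mistakes out of 57 happens to reproduce the 0.070175 in the table), and the "emperor" label is invented for the example.

```python
def error_rate(predicted, actual):
    """Fraction of predictions that do not match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


# Hypothetical validation set: 57 images, 4 of them misclassified
actual = ["emperor"] * 57
predicted = ["stuffed"] * 4 + ["emperor"] * 53
print(f"{error_rate(predicted, actual):.6f}")   # 0.070175
```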

Understanding learning rates

Using the learning rate finder, we’re looking for the steepest downward slope that persists for a while. In the following plot, that looks like the slope between 1e-05 and 1e-03.

learn.lr_find()
learn.recorder.plot()

learning rate graph
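To make the idea concrete, here is a toy learning rate finder in plain Python. It is not fastai's implementation: the loss is a made-up one-parameter function, and `lr_range`, `toy_lr_find`, `steepest_lr` and the "stop once the loss is 4x the best so far" rule are names and choices of mine.

```python
def lr_range(start_lr, end_lr, num_it):
    """Learning rates spaced evenly on a log scale from start_lr to end_lr."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]


def toy_lr_find(grad_fn, loss_fn, w, start_lr=1e-4, end_lr=10.0, num_it=50):
    """Take one gradient step per learning rate; stop once the loss blows up."""
    lrs, losses = [], []
    best = float("inf")
    for lr in lr_range(start_lr, end_lr, num_it):
        w = w - lr * grad_fn(w)
        loss = loss_fn(w)
        if loss > 4 * best:      # the loss exploded, no point going further
            break
        best = min(best, loss)
        lrs.append(lr)
        losses.append(loss)
    return lrs, losses


def steepest_lr(lrs, losses):
    """Learning rate at which the loss fell fastest between consecutive steps."""
    drops = [losses[i] - losses[i + 1] for i in range(len(losses) - 1)]
    return lrs[drops.index(max(drops)) + 1]


# Toy problem: loss(w) = w**2, whose gradient is 2*w
lrs, losses = toy_lr_find(lambda w: 2 * w, lambda w: w * w, w=1.0)
print(steepest_lr(lrs, losses))
```

On a real model each step is a mini-batch and the curve is noisy, which is why we eyeball a stretch of steady descent on the plot instead of trusting a single steepest point.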

But what happens if your learning rate is too high?

learn.fit_one_cycle(1, max_lr=0.1)
epoch train_loss valid_loss error_rate
1 12.220007 11567788032.000000 0.701754

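A validation loss that blows up like this (over 11 billion here) is a sign that the learning rate is too high: each update overshoots, and the model ends up worse than where it started.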

What about the learning rate being very low?

learn.fit_one_cycle(4, max_lr=1e-5)
epoch train_loss valid_loss error_rate
1 1.650528 1.357501 0.719298
2 1.562972 1.340124 0.754386
3 1.539678 1.332040 0.754386
4 1.536463 1.332700 0.754386

If the training loss stays higher than the validation loss, and the error rate is very high, it’s a sign that the learning rate or number of epochs is too low. Here the losses barely move over four epochs.

Low learning rates also run the risk of overfitting: training takes so many more epochs that the model gets many more looks at each image. Overfitting is when your model starts learning your specific images rather than generalizing to any input data.
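The too-high and too-low runs above can be reproduced on a toy problem. This is only a sketch: it uses plain gradient descent on a made-up loss, loss(w) = w**2, not the penguin model, and the learning rates are picked to make the effect obvious.

```python
def train(lr, epochs, w=5.0):
    """Plain gradient descent on loss(w) = w**2; returns the loss after each epoch."""
    losses = []
    for _ in range(epochs):
        w -= lr * 2 * w          # the gradient of w**2 is 2*w
        losses.append(w * w)
    return losses


too_high = train(lr=1.5, epochs=4)     # every step overshoots the minimum and lands further away
too_low = train(lr=1e-5, epochs=4)     # every step barely moves
reasonable = train(lr=0.1, epochs=4)   # steady progress

print(too_high)        # [100.0, 400.0, 1600.0, 6400.0]
print(too_low[-1])     # still just under the starting loss of 25
print(reasonable[-1])  # about 4.19
```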

Cleaning up our data

FileDeleter is a widget that runs in Jupyter notebooks and shows us the images the neural net had the most trouble with (the ones with the highest loss), then gives us the option to delete them if they actually aren’t what we’re trying to classify.

invalid penguins

As you can see, there are some images in the dataset that it doesn’t make sense to use when training our model, like the map.

This combination of using the neural net plus human feedback works really well.
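The "neural net" half of that loop boils down to ranking images by how badly the model did on them. A minimal sketch of that ranking, assuming cross-entropy loss; the `top_losses` function here is my own plain-Python version, and the probabilities and the "emperor" class are made up.

```python
import math


def top_losses(probs, labels, k=3):
    """The k examples with the highest cross-entropy loss, worst first, as (index, loss)."""
    losses = [-math.log(p[y]) for p, y in zip(probs, labels)]
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return [(i, losses[i]) for i in order[:k]]


# Made-up predicted probabilities for two classes: 0 = emperor, 1 = stuffed
probs = [[0.95, 0.05], [0.10, 0.90], [0.55, 0.45], [0.02, 0.98]]
labels = [0, 0, 1, 1]            # labels taken from the folder names
worst = top_losses(probs, labels, k=2)
print([i for i, _ in worst])     # [1, 2]
```

Image 1 tops the list because the model was confident and wrong. That is exactly the kind of image worth a human look: either the model is confused or the label is.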

Productionizing

When deploying a model to production, Jeremy Howard recommends deploying with a CPU instead of a GPU. Why? A GPU is faster, usually by about an order of magnitude (roughly 0.01 seconds for the model to run on a GPU vs. 0.1 seconds on a CPU), but for a single prediction that difference hardly matters. GPUs are pretty necessary to train your model, but with CPUs it’s much easier to scale the requests.
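Some back-of-the-envelope arithmetic with the latencies quoted above. This assumes each worker handles one image at a time and that requests spread evenly across workers, both simplifications of mine, and `requests_per_second` is a hypothetical helper.

```python
def requests_per_second(latency_s, workers=1):
    """Throughput when each worker handles one request at a time."""
    return workers / latency_s


one_gpu = requests_per_second(0.01)                # about 100 requests per second
ten_cpus = requests_per_second(0.1, workers=10)    # also about 100 requests per second
print(one_gpu, ten_cpus)
```

Under those assumptions, ten ordinary CPU workers keep up with one GPU, and adding an eleventh is just one more machine.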

Let’s test the model!

Instead of training, we’re going to do inference. Inference is when you’re using a trained model to predict things, rather than training the model.

img = open_image(path/'stuffed'/'00000004.jpeg')

This is an image of a very cute stuffed penguin:

stuffed penguin

Now, when we get the prediction:

pred_class,pred_idx,outputs = learn.predict(img)
pred_class

It gives us the output Category stuffed, which is correct! Pretty awesome.
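As I understand it, the three return values are the predicted class, its index, and the per-class probabilities. The first two are just the position of the largest value in the third, which a few lines of plain Python can show. The class list and numbers here are made up, and `decode_prediction` is a hypothetical helper, not part of fastai.

```python
def decode_prediction(outputs, classes):
    """Return (class name, class index) for the highest-scoring class."""
    pred_idx = max(range(len(outputs)), key=lambda i: outputs[i])
    return classes[pred_idx], pred_idx


classes = ["emperor", "stuffed"]     # hypothetical class list, one per folder
outputs = [0.08, 0.92]               # made-up probabilities
pred_class, pred_idx = decode_prediction(outputs, classes)
print(pred_class, pred_idx)          # stuffed 1
```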