Imagine a chemist in search of a novel drug that will bind to a specific protein in the human body. The chemist is faced with two problems: 1) The space of possible drugs is enormous and 2) the cost of synthesizing a large variety of candidate drugs and evaluating their efficacy is prohibitively expensive and time-consuming.
Trained neural networks are notoriously opaque. They consist of millions — if not billions — of parameters, and when they fail, explaining why is not always easy. This has prompted researchers to invent novel techniques that reveal to us why neural networks behave the way they do. One such technique is DeepDream, which besides being a useful research tool is also really fun!
Of all the stunning advancements in deep learning made in the last 10 years, the progress in the field of computer vision is perhaps the most striking. At the heart of this progress is a model known as a convolutional neural network – or “CNN” for short – which resembles the structure of the brain’s visual cortex and has become a staple of almost all computer vision systems today.
In this final post of the series, we will see how to apply backpropagation to the neural net we built in the last chapter. This will allow our network to incrementally improve itself as it gets exposed to new data, which will ultimately enable it to classify the MNIST test set with close to 95% accuracy.
By now, we have covered a whole lot of material. We have talked about decision boundaries, loss functions, and gradient descent, and we have built our own linear models using the perceptron and logistic regression algorithms. In this penultimate post of the series, we will analyze the first half of a neural network — the forward pass — and we will use linear algebra to explore a visual interpretation of what happens to our data as it flows through a neural net.
In the previous post, we saw how the perceptron algorithm automatically discovered how to distinguish data points belonging to two different classes. This time, we will explore the ideas behind another algorithm — logistic regression — to see how and why it plays a central role in many neural network architectures. We will also demonstrate how it can be used to model slightly more complex and interesting data than the Iris dataset.
Building a neural network is fairly easy. It doesn’t require you to know any obscure programming language, and you can build a working model with surprisingly few lines of code. To understand how it works however — to really understand it — takes mathematical insight more than anything else. While this may sound lofty, anyone with a firm grasp of high school level math is equipped to take on the task.