computer science, programming and other ideas
In part 1, we built a fully functional library that creates and trains neural networks using computational graphs. We only tried it on very simple examples. Today, we are going to try it on a more serious problem: character recognition.
We are going to use MNIST, a well-known database in the machine learning and deep learning world. The database is available on Yann LeCun’s website. If you have read a bit about neural networks before, you have probably come across his name. He is a French scientist, one of the pioneers of neural networks and one of the inventors of convolutional neural networks. He is now the director of AI at Facebook.
Character recognition is an emblematic problem for two reasons. Firstly, it is one of the first successes and industrial applications of neural networks: they have been used since the 1990s to read checks. Secondly, computer vision has always been a leading application domain for neural networks.
In this part, we are going to take a brief look at the MNIST database. Then, we are going to train some networks on it and, finally, we are going to explore a little how a neural network works.
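To make the database a little more concrete: the MNIST files use the simple binary IDX layout, a big-endian header followed by raw unsigned bytes. The sketch below shows one way such a buffer could be decoded with the standard library only. The function names `parse_idx_images` and `parse_idx_labels` are hypothetical and are not part of the library from part 1; the demo data is a tiny synthetic buffer, not real MNIST content.

```python
import struct


def parse_idx_images(data):
    """Decode an IDX image buffer into a list of images (lists of pixel rows)."""
    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:  # magic number of IDX image files
        raise ValueError("unexpected magic number %d, expected 2051" % magic)
    pixels = data[16:]
    if len(pixels) != count * rows * cols:
        raise ValueError("pixel data length does not match the header")
    size = rows * cols
    images = []
    for i in range(count):
        flat = pixels[i * size:(i + 1) * size]
        # Each pixel is one unsigned byte (0..255), stored row by row.
        images.append([list(flat[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


def parse_idx_labels(data):
    """Decode an IDX label buffer into a list of integer labels."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:  # magic number of IDX label files
        raise ValueError("unexpected magic number %d, expected 2049" % magic)
    labels = list(data[8:8 + count])
    if len(labels) != count:
        raise ValueError("label data length does not match the header")
    return labels


# Synthetic demo (assumption: two 2x2 "images", not real MNIST data).
fake_images = struct.pack(">IIII", 2051, 2, 2, 2) + bytes([0, 255, 128, 64, 1, 2, 3, 4])
fake_labels = struct.pack(">II", 2049, 2) + bytes([7, 3])

images = parse_idx_images(fake_images)
labels = parse_idx_labels(fake_labels)
print(images[0], labels[0])
```

The downloadable files are gzip-compressed, so in practice the bytes would first be read with something like `gzip.open(path, "rb").read()` before being passed to these functions.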
Welcome to this big tutorial on neural networks!
Our goal is to write our own deep learning framework like TensorFlow or Torch. We are going to learn in depth how neural networks work and all the mechanics behind them.
We will get our hands dirty and code everything! In this tutorial, we will use Python 3 and SciPy, but I hope that the code and the ideas are clear enough for you to adapt them to your favorite language.
First, here is the plan. In this part, we are going to quickly introduce neural networks and then introduce computational graphs in order to model them. At the end of this part, we are going to use our implementation to learn some non-linear functions.
In the second part, we will deal with a more serious problem. We are going to build an optical character recognition system on the MNIST database. It is a classic problem in machine learning, so we have to do it.
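To give a first idea of what a computational graph is, here is a minimal sketch with scalar values only. It is not the library built in this tutorial: the names `Node`, `add`, `mul` and `backward` are hypothetical and chosen for illustration. Each node stores its value, its parents with the local derivative towards each of them, and a gradient that is filled in by walking the graph backwards.

```python
class Node:
    """A scalar value in the graph, with links to the nodes it was computed from."""

    def __init__(self, value, parents=()):
        self.value = value
        # Tuple of (parent_node, d(self)/d(parent)) pairs.
        self.parents = parents
        self.grad = 0.0


def add(a, b):
    # d(a + b)/da = 1 and d(a + b)/db = 1
    return Node(a.value + b.value, ((a, 1.0), (b, 1.0)))


def mul(a, b):
    # d(a * b)/da = b and d(a * b)/db = a
    return Node(a.value * b.value, ((a, b.value), (b, a.value)))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node in the graph."""
    order, seen = [], set()

    def visit(node):
        # Depth-first traversal: parents are appended before their children.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Reverse topological order: a node's gradient is complete before it is
    # propagated to its parents (chain rule).
    for node in reversed(order):
        for parent, local_grad in node.parents:
            parent.grad += node.grad * local_grad


# y = w * x + b
x, w, b = Node(2.0), Node(3.0), Node(1.0)
y = add(mul(w, x), b)
backward(y)
print(y.value, w.grad, x.grad, b.grad)
```

Here the gradient of `y` with respect to `w` is the value of `x`, and the gradient with respect to `x` is the value of `w`, which is what the backward pass accumulates.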
Then, we will tackle recurrent neural networks and show how to model them with our library. To apply our new knowledge, we will try to learn a formal grammar generated by an automaton.
In part 4, we will go further with recurrent neural networks and introduce the well-known LSTM cell. We will briefly compare it with fully-connected recurrent neural networks.
To approach part 6, some more efficient optimization algorithms are necessary. Consequently, we will discuss them in part 5.
Have you ever read this fabulous article by Andrej Karpathy? Yes? Cool, because we are going to reproduce his results with our own library in part 6. Amazing, isn’t it?
Finally, parts 7 and 8 are going to be theoretical appendices for the most curious readers.
Is that all? Maybe not! Stay tuned!
If you are ready, let’s go!
Welcome to my new website!
I will try to keep it up to date and to write a post as soon as I have an idea or discover something interesting. Stay tuned!
A short word on the technologies which proudly power this website. As you can see, the website is graciously hosted by GitHub. Thanks guys! This website is completely static: I use Jekyll to generate the pages. By the way, I found this page very helpful for getting started with Jekyll. And the design is made with Bootstrap.
I hope you’ll find something interesting here!