This structure lets convolutional neural networks gradually increase the number of extracted image features while decreasing the spatial resolution. A neural network consists of three types of layers: the input layer (where we feed in our inputs), the hidden layers (where the processing happens), and the output layer (the results we obtain). You might wonder why we stack “layers” of neurons to build a neural network, and how we can determine the number of layers, or the number of nodes in each layer, that we need. Each node in the output layer represents one label, and that node turns on or off according to the strength of the signal it receives, which depends on the previous layer’s outputs and the parameters (weights) applied to them. On the number of hidden units, please refer to Trenn's paper from about 10 years ago: S. Trenn, "Multilayer Perceptrons: Approximation Order and Necessary Number of Hidden Units," IEEE Transactions on Neural Networks, vol. 