Artificial Neural Networks: Fundamentals of Machine Learning (Part 3)

Earlier parts of this series covered the machine learning methods Regression and Support Vector Machines (SVM). While SVM bears some resemblance to Regression Analysis, there is a method that stems directly from Logistic Regression: the Artificial Neural Network (ANN).

Artificial Neural Networks (ANNs) are a key component of modern machine learning, inspired by the biological neurons of the human brain. This article aims to provide a clear and straightforward explanation of ANNs, their components, and how they differ from logistic regression.

Components of an ANN

An ANN consists of three main layers: the input layer, hidden layers, and the output layer.

Input Layer

The input layer receives raw data, such as pixels in images.

Hidden Layers

Hidden layers are intermediate layers that process inputs by applying weights and activation functions. These layers capture nonlinear relationships and complex features.

Output Layer

The output layer produces the final result, such as classification probabilities or regression outputs.

Neurons in an ANN

Each "neuron" in these layers calculates a weighted sum of its inputs, applies an activation function, and passes the output forward to the next layer.

Learning in an ANN

Learning in an ANN takes place between the layers: the weights W at each layer are adjusted so that the network's output resembles the true class label when an input is passed through it. This learning process is driven by an optimization algorithm, such as Gradient Descent, which finds the best values for the weights of the ANN.

The Backpropagation Process

The backpropagation process in an ANN is responsible for distributing the error to each of the weights in the network. It begins by computing the error at the last layer of the network and then uses the chain rule of partial derivatives to compute the change in the weights at each layer.
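To make the chain-rule bookkeeping concrete, here is a minimal NumPy sketch of one training step for a tiny one-hidden-layer network with sigmoid activations and a squared-error loss. The layer sizes, learning rate, and loss are illustrative assumptions, not the article's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny network: 2 inputs -> 3 hidden units -> 1 output (sizes are arbitrary)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)

x = np.array([0.5, -0.3])   # one training example (illustrative)
y = np.array([1.0])         # its true class label
lr = 0.1                    # learning rate for gradient descent

# Forward pass: compute activations layer by layer
a1 = sigmoid(W1 @ x + b1)
a2 = sigmoid(W2 @ a1 + b2)

# Backward pass: start from the error at the last layer, then use the
# chain rule. With loss L = 0.5 * (a2 - y)^2 and sigmoid'(z) = a * (1 - a):
delta2 = (a2 - y) * a2 * (1 - a2)         # error at the output layer
delta1 = (W2.T @ delta2) * a1 * (1 - a1)  # error distributed to the hidden layer

# Gradient descent update on every weight and bias
W2 -= lr * np.outer(delta2, a1); b2 -= lr * delta2
W1 -= lr * np.outer(delta1, x);  b1 -= lr * delta1
```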

Regularization in ANNs

The objective function in ANNs also includes a regularization term that penalizes overly large weights, which helps the model avoid overfitting.
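The article does not specify which penalty it uses; assuming the common L2 (weight decay) choice, the total objective is simply the data loss plus a scaled sum of squared weights:

```python
import numpy as np

def l2_regularized_loss(data_loss, weights, lam=0.01):
    """Total objective = data loss + lam * sum of squared weights.

    `lam` controls the penalty strength: larger values shrink the weights
    more aggressively, reducing overfitting. The value 0.01 is an
    arbitrary illustration, not a recommendation.
    """
    penalty = sum(np.sum(W ** 2) for W in weights)
    return data_loss + lam * penalty
```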

Differences and Similarities with Logistic Regression

While both ANNs and logistic regression share fundamental concepts like weighted sums and sigmoid activations for classification outputs, there are some key differences:

  • Model type: ANNs are multi-layered models with nonlinear transformations via activation functions, while logistic regression is a linear model with a sigmoid function for binary classification.
  • Complexity: ANNs can capture highly nonlinear patterns due to multiple layers and neurons, while logistic regression captures only linear relationships between features and log-odds.
  • Architecture: ANNs are composed of input, hidden, and output layers, while logistic regression is a single-layer model without hidden layers.
  • Learning: ANNs learn through forward pass and backpropagation across multiple layers, while logistic regression estimates coefficients via maximum likelihood or gradient descent.
  • Output: ANNs can output complex, nonlinear decision boundaries, while logistic regression outputs probabilities via the logistic (sigmoid) function.

In essence, logistic regression can be considered a simple neural network without hidden layers, applying a linear transformation and sigmoid activation. ANNs generalize this by stacking multiple layers and nonlinear activations, enabling modeling of complex, nonlinear data relationships.
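This relationship can be seen directly in code. In the sketch below (weights and inputs would be supplied by the user; nothing here is from the article's own implementation), removing the hidden layer reduces the "network" to exactly the logistic regression model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_regression(x, w, b):
    """A neural network with zero hidden layers: one linear
    transformation followed by a sigmoid. This is exactly the
    logistic regression model."""
    return sigmoid(np.dot(w, x) + b)

def one_hidden_layer_ann(x, W1, b1, w2, b2):
    """Stacking a hidden layer with a nonlinear activation on top of the
    same building block is what enables nonlinear decision boundaries."""
    h = sigmoid(W1 @ x + b1)
    return sigmoid(np.dot(w2, h) + b2)
```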

Training and Classification

The ANN model is trained on the training set by running an optimization algorithm that finds the best values for the network's weights. Once the optimized weights are obtained, classification probabilities are computed for the corresponding test set, and class labels are derived by applying a threshold to those probabilities (e.g., >0.7).
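As an illustrative sketch of this train/predict/threshold workflow (using scikit-learn's MLPClassifier and a toy dataset rather than the article's own code, which is linked below):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# A toy nonlinear dataset standing in for the article's data
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train: the optimizer finds the best values for the network's weights
clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)

# Classify: probabilities for the test set, then a threshold for labels
proba = clf.predict_proba(X_test)[:, 1]
labels = (proba > 0.7).astype(int)   # threshold from the article's example
```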

Interactive Visualization Tool

For a more hands-on learning experience, readers can make use of an interactive visualization tool that can be run in their Jupyter notebook.

Further Reading

For those interested in diving deeper into ANNs, the code for this project can be found at: https://www.github.com/azad-academy/MLBasics-ANN.

The historical origin of ANNs dates back to the 1940s and 1950s with foundational work by Warren McCulloch and Walter Pitts, who proposed a model mimicking brain neuron activity using simplified logical units. This McCulloch-Pitts model demonstrated that neural activity could be modeled by formal logic, which laid the groundwork for later developments such as the perceptron in the 1950s and the subsequent rise of ANNs in the 1980s and beyond.

Support the Author

If you find this article helpful, you can support the author by becoming a Patreon Supporter at: https://www.patreon.com/azadacademy.

Follow the Author

You can follow the author on Substack at: https://azadwolf.substack.com.

Stay Updated

For updates on the latest developments in AI, you can follow the author on Twitter at: https://www.twitter.com/azaditech.

