What Are Neural Networks and How Do They Work?

If you keep up with tech news, you've probably come across the concept of neural networks (also known as neural nets).

In 2016, for example, Google's AlphaGo neural network beat one of the best professional Go players in the world in a 4-1 series. YouTube also announced that it would be using neural networks to better understand its videos. Dozens of other stories may come to mind.

But what exactly is a neural network? How does it work? And why is it so popular in machine learning?

A Computer Like a Brain

Modern neuroscientists often discuss the brain as a type of computer. Neural networks aim to do the opposite: build a computer that functions like a brain.

Of course, we only have a cursory understanding of the brain's extremely complex functions, but by creating a simplified simulation of how the brain processes data, we can build a type of computer that functions very differently from a standard one.

Computer processors process data serially ("in order"). They perform many operations on a set of data, one at a time. Parallel processing ("processing several streams at once") significantly speeds up the computer by using multiple processors that work side by side rather than one after another.

In the image below, the parallel processing example requires five different processors:

An artificial neural network (so called to distinguish it from the actual neural networks in the brain) has a fundamentally different structure. It's highly interconnected. This allows it to process data very quickly, learn from that data, and update its own internal structure to improve performance.

This high degree of interconnectedness has some astounding effects. For example, neural networks are very good at recognizing obscure patterns in data.

The Ability to Learn

The ability of a neural network to learn is its greatest strength. With standard computing architecture, a programmer has to develop an algorithm that tells the computer what to do with incoming data to make sure that the computer outputs the correct response.

An input-output response could be as simple as "when the A key is pressed, display 'A' on the screen" or as complicated as performing complex statistics. Neural networks, on the other hand, don't need the same kind of algorithms. Through learning mechanisms, they can essentially design their own algorithms that ensure they perform correctly.

It's important to note that because neural networks are software programs running on machines that use standard serial-processing hardware, current technology still imposes limits. Actually building a hardware version of a neural network is another problem entirely.

From Neurons to Nodes

Now that we've laid the groundwork for how neural networks function, we can start to look at some of the specifics. The basic structure of an artificial neural network looks like this:

Each of the circles is called a "node" and it simulates a single neuron. On the left are input nodes, in the middle are hidden nodes, and on the right are output nodes.

In very basic terms, the input nodes accept input values, which could be a binary 1 or 0, part of an RGB color value, the status of a chess piece, or anything else. These nodes represent the information flowing into the network.
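As a rough illustration of what "input values" can look like in practice, the sketch below turns a color and a string of binary digits into lists of numbers that input nodes could accept. The function names are hypothetical, and scaling each color channel to a value between 0 and 1 is an assumption (a common convention, not something this article specifies).

```python
def encode_pixel(red, green, blue):
    """Scale an 8-bit RGB color to three input values between 0 and 1."""
    return [red / 255, green / 255, blue / 255]


def encode_bits(bits):
    """Turn a string of binary digits into a list of 0/1 input values."""
    return [int(bit) for bit in bits]


# One input node per number in each list.
pixel_inputs = encode_pixel(255, 128, 0)
bit_inputs = encode_bits("1011")
```

Each number in the resulting list would be handed to its own input node.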

Each input node is connected to a number of hidden nodes (sometimes to every hidden node, sometimes to a subset). Input nodes take the information they're given and pass it along to the hidden layer.

For example, an input node might send a signal ("fire," in the parlance of neuroscience) if it receives a 1, and remain dormant if it receives a 0. Each hidden node has a threshold: if the sum of its inputs reaches a certain value, it fires.
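The firing rule just described can be sketched in a few lines. This is a minimal illustration, not a full neural network: the function name `node_fires` and the threshold of 2 are assumptions chosen for the example.

```python
def node_fires(inputs, threshold):
    """Return 1 if the summed inputs reach the threshold, otherwise 0."""
    return 1 if sum(inputs) >= threshold else 0


# Three input nodes, each sending 1 (fired) or 0 (dormant).
print(node_fires([1, 0, 1], threshold=2))  # prints 1
print(node_fires([1, 0, 0], threshold=2))  # prints 0
```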

From Synapses to Connections

Each connection, the equivalent of an anatomical synapse, is also given a specific weight, which allows the network to place a stronger emphasis on the action of a specific node. Here's an example:

As you can see, the weight of connection B is higher than those of connections A and C. Let's say hidden node 4 will only fire if it receives a total input of 2 or greater. That means that if node 1 or node 3 fires on its own, node 4 won't be triggered, but nodes 1 and 3 together would trigger it. Node 2 could also trigger node 4 on its own through connection B.
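Here is one way to write that example down. The exact weights are an assumption: the description only requires that B be heavy enough to reach the threshold alone while A and C need each other, so the sketch uses 1 for A and C and 2 for B. The names `WEIGHTS`, `THRESHOLD`, and `node4_fires` are hypothetical.

```python
# Assumed weights for the connections from nodes 1, 2, and 3 into node 4.
WEIGHTS = {"A": 1.0, "B": 2.0, "C": 1.0}
THRESHOLD = 2.0


def node4_fires(node1, node2, node3):
    """Each argument is 1 if that input node fired, 0 if it stayed dormant."""
    total = node1 * WEIGHTS["A"] + node2 * WEIGHTS["B"] + node3 * WEIGHTS["C"]
    return total >= THRESHOLD


print(node4_fires(1, 0, 0))  # node 1 alone: False
print(node4_fires(1, 0, 1))  # nodes 1 and 3 together: True
print(node4_fires(0, 1, 0))  # node 2 alone through connection B: True
```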

Let's take weather as a practical example. Say you design a simple neural network to determine whether there should be a winter storm warning.

Using the above connections and weights, node 4 might fire only if the temperature is below 0°F and winds are above 30 mph, or if there's more than a 70 percent chance of snow. Temperature would be fed into node 1, likelihood of snow into node 2, and winds into node 3. Now node 4 can take all of these into account when determining what signal to send to the output layer.
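The storm-warning node could be sketched like this. It assumes connection weights of 1, 2, and 1 for nodes 1, 2, and 3 and a threshold of 2 (values picked to match the behavior described, not taken from a real forecasting system), and the function name `storm_node_fires` is hypothetical.

```python
def storm_node_fires(temp_f, wind_mph, snow_chance):
    """Decide whether node 4 fires for one set of weather readings."""
    # Each input node fires (1) or stays dormant (0).
    node1 = 1 if temp_f < 0 else 0           # temperature below 0 degrees F
    node2 = 1 if snow_chance > 0.70 else 0   # more than 70 percent chance of snow
    node3 = 1 if wind_mph > 30 else 0        # winds above 30 mph
    # Assumed weights: 1 for nodes 1 and 3, 2 for node 2; assumed threshold: 2.
    total = node1 * 1.0 + node2 * 2.0 + node3 * 1.0
    return total >= 2.0


print(storm_node_fires(temp_f=-5, wind_mph=35, snow_chance=0.10))  # True
print(storm_node_fires(temp_f=-5, wind_mph=10, snow_chance=0.10))  # False
print(storm_node_fires(temp_f=20, wind_mph=5, snow_chance=0.80))   # True
```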

Better Than Simple Logic

Of course, this function could be implemented with basic AND/OR logic gates. But more complex neural networks, like the one below, are capable of significantly more complex operations.

Output layer nodes function in the same way as hidden layer ones: output nodes sum the input from the hidden layer, and if they reach a certain value, the output nodes fire and send specific signals. At the end of the process, the output layer will be sending a set of signals that indicates the result of the input.
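To make the layer-by-layer flow concrete, here is a minimal sketch of a network with two input nodes, two hidden nodes, and one output node. The weights and thresholds are hand-picked assumptions so that the output fires when exactly one input fires (the "exclusive or" function, which a single threshold node cannot compute on its own). All names are hypothetical.

```python
def layer_output(inputs, weights, thresholds):
    """Compute which nodes in one layer fire.

    weights[j][i] is the weight of the connection from input i to node j.
    """
    outputs = []
    for node_weights, threshold in zip(weights, thresholds):
        total = sum(w * x for w, x in zip(node_weights, inputs))
        outputs.append(1 if total >= threshold else 0)
    return outputs


def forward(inputs):
    """Pass signals from the input layer through the hidden layer to the output."""
    hidden = layer_output(inputs, HIDDEN_WEIGHTS, HIDDEN_THRESHOLDS)
    return layer_output(hidden, OUTPUT_WEIGHTS, OUTPUT_THRESHOLDS)


# Hand-picked values: the first hidden node fires if either input fires,
# the second only if both fire.
HIDDEN_WEIGHTS = [[1, 1], [1, 1]]
HIDDEN_THRESHOLDS = [1, 2]
# The output node fires for "either, but not both".
OUTPUT_WEIGHTS = [[1, -1]]
OUTPUT_THRESHOLDS = [1]

for pair in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pair, forward(pair))
```

The negative weight on the second hidden node is what lets the output node ignore the case where both inputs fire.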

While the network shown above is simple, deep neural networks can have many hidden layers and thousands or even millions of nodes.

Error Correction

The process, so far, is relatively simple. But where neural networks really shine is in learning. Most neural nets use a process called backpropagation, which sends error signals backward through the network so that the connection weights can be adjusted.

Before programmers deploy a neural network, they run it through a training phase in which it receives a set of inputs with known results. For example, a programmer might teach a neural network to recognize images. The input could be a picture of a car, and the correct output would be the word "car."

The programmer provides the image as input and sees what comes out of the output nodes. If the network responds with "airplane," the programmer tells the computer that it's incorrect.

The network then makes adjustments to its own connections, altering the weights of different links between nodes. This action is guided by a specific learning algorithm added to the network. The network continues to adjust connection weights until it provides the correct output.
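The adjust-until-correct loop can be illustrated with a single node. The sketch below uses the perceptron learning rule, a much simpler relative of backpropagation, rather than the learning algorithm of any particular network. The training data (the logical OR of two inputs), the learning rate, and the function names are all assumptions made for the example.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum plus bias is at least 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= 0 else 0


def train(examples, learning_rate=0.1, max_epochs=100):
    """Nudge the weights after every wrong answer until all answers are right."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                # Strengthen or weaken each connection in proportion to its input.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:
            break  # every training example now gives the correct output
    return weights, bias


# Inputs with known results: the logical OR of two values.
examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train(examples)
predictions = [predict(weights, bias, inputs) for inputs, _ in examples]
```

Nobody told the node what weights to use. It arrived at them by being corrected each time it gave the wrong answer.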

This is a simplification, but neural networks can learn highly complex operations using similar principles.
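For a network with a hidden layer, backpropagation works out how much each connection contributed to the error and adjusts it accordingly. The following is a minimal sketch under several assumptions: a tiny network with two inputs, two hidden nodes, and one output node, smooth "sigmoid" nodes instead of hard thresholds, squared error as the measure of wrongness, no bias terms, and a single training example. The starting weights and names are made up for illustration.

```python
import math


def sigmoid(x):
    """A smooth stand-in for the fire / don't-fire threshold."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_output):
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_output, hidden)))
    return hidden, output


def backprop_step(x, target, w_hidden, w_output, learning_rate=0.5):
    """Run one forward pass, then push the error backward and adjust weights."""
    hidden, output = forward(x, w_hidden, w_output)
    # Error signal at the output node.
    delta_out = (output - target) * output * (1.0 - output)
    # Error signal passed backward along each hidden-to-output connection.
    delta_hidden = [delta_out * w * h * (1.0 - h)
                    for w, h in zip(w_output, hidden)]
    new_w_output = [w - learning_rate * delta_out * h
                    for w, h in zip(w_output, hidden)]
    new_w_hidden = [[w - learning_rate * d * xi for w, xi in zip(ws, x)]
                    for ws, d in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_output


def loss(x, target, w_hidden, w_output):
    _, output = forward(x, w_hidden, w_output)
    return 0.5 * (output - target) ** 2


# One training example with a known result, and arbitrary starting weights.
x, target = [1.0, 0.0], 1.0
w_hidden = [[0.5, -0.4], [-0.3, 0.8]]
w_output = [0.2, -0.6]

loss_before = loss(x, target, w_hidden, w_output)
for _ in range(100):
    w_hidden, w_output = backprop_step(x, target, w_hidden, w_output)
loss_after = loss(x, target, w_hidden, w_output)
print(loss_before, loss_after)
```

After repeated steps, the error on this example should be smaller than where it started. A real training run would cycle through many examples rather than one.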

Continual Improvement

Learning doesn't have to stop once training is over, and this is where neural networks get really cool. Some networks continue to learn as they're used, integrating new information and tweaking the weights of different connections, becoming more and more effective and efficient at the task they were designed for.

This could be as simple as image recognition or as complex as playing Go.

In this way, neural networks are always changing and improving. And this can have surprising effects, resulting in networks that prioritize things a programmer wouldn't have thought to prioritize.

In addition to the process outlined above, which is called supervised learning, there's also another method: unsupervised learning.

In one common form of unsupervised learning, neural networks take an input and try to recreate it exactly in their output, using backpropagation to update their connections. This may sound like a fruitless exercise, but in this way, networks learn to extract useful features and generalize those features to improve their models.
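A network trained to reproduce its own input is often called an autoencoder. The sketch below is a deliberately tiny one, built on assumptions: two input values are squeezed through a single hidden node and expanded back to two output values, the nodes are linear (no thresholds), and the made-up data points all follow the pattern "second value is twice the first." Names, starting weights, and the learning rate are hypothetical.

```python
def reconstruct(x, encoder, decoder):
    code = sum(e * xi for e, xi in zip(encoder, x))  # compress 2 values into 1
    return code, [d * code for d in decoder]         # expand back to 2 values


def total_error(data, encoder, decoder):
    """Sum of squared differences between each input and its reconstruction."""
    return sum(
        0.5 * sum((r - xi) ** 2
                  for r, xi in zip(reconstruct(x, encoder, decoder)[1], x))
        for x in data
    )


def train(data, encoder, decoder, learning_rate=0.01, epochs=200):
    for _ in range(epochs):
        for x in data:
            code, recon = reconstruct(x, encoder, decoder)
            # The "correct answer" is simply the input itself.
            errors = [r - xi for r, xi in zip(recon, x)]
            # Error signal sent backward from the outputs to the hidden node.
            code_error = sum(err * d for err, d in zip(errors, decoder))
            decoder = [d - learning_rate * err * code
                       for d, err in zip(decoder, errors)]
            encoder = [e - learning_rate * code_error * xi
                       for e, xi in zip(encoder, x)]
    return encoder, decoder


data = [[1.0, 2.0], [2.0, 4.0], [-1.0, -2.0], [0.5, 1.0]]
encoder, decoder = [0.1, 0.1], [0.1, 0.1]

error_before = total_error(data, encoder, decoder)
encoder, decoder = train(data, encoder, decoder)
error_after = total_error(data, encoder, decoder)
print(error_before, error_after)
```

Because only one number can pass through the hidden node, the network can only succeed by discovering the pattern that ties the two input values together. That discovered pattern is the "useful feature."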

Issues of Depth

Backpropagation is a very effective way to teach neural networks... when they're only a few layers deep. As the number of hidden layers increases, the effectiveness of backpropagation decreases. This is a problem for deep networks. Using backpropagation, they're often no more effective than simple networks.
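One commonly cited reason for this, often called the vanishing gradient problem, is that the error signal shrinks each time it passes backward through a layer. The article does not name a specific cause, so treat this as one plausible illustration. The sketch assumes a chain of single sigmoid nodes with a connection weight of 1, where each layer can pass back at most a quarter of the signal it receives. The function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def signal_reaching_first_layer(depth, weight=1.0, pre_activation=0.0):
    """Size of a unit error signal after passing backward through `depth` layers."""
    s = sigmoid(pre_activation)
    per_layer_factor = weight * s * (1.0 - s)  # 0.25 at best when weight is 1
    return per_layer_factor ** depth


for depth in (1, 3, 10):
    print(depth, signal_reaching_first_layer(depth))
```

By ten layers, less than a millionth of the original signal remains, so the earliest layers barely learn.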

Scientists have come up with a number of solutions to this problem, the specifics of which are quite complicated and beyond the scope of this introductory piece. What many of these solutions attempt to do, in simple terms, is to decrease the complexity of the network by training it to "compress" the data.

To do this, the network learns to extract a smaller number of identifying features of the input, eventually becoming more efficient in its computations. In effect, the network is making generalizations and abstractions, much in the same way that humans learn.

After this learning, the network can prune nodes and connections that it deems less important. This makes the network more efficient and learning becomes easier.
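How a network decides which connections are "less important" varies. One simple criterion, assumed here purely for illustration, is to drop any connection whose weight is close to zero. The function name `prune`, the example weights, and the cutoff of 0.1 are all hypothetical.

```python
def prune(weights, threshold):
    """Set connections whose weight magnitude is below the threshold to 0."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


# Each row holds the incoming connection weights for one node.
weights = [[0.9, -0.02, 0.4],
           [0.01, -0.7, 0.05]]
pruned = prune(weights, threshold=0.1)
print(pruned)  # [[0.9, 0.0, 0.4], [0.0, -0.7, 0.0]]
```

A connection with a weight of 0 contributes nothing to the sum, so it can be skipped entirely.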

Neural Network Applications

So neural networks simulate how the brain learns by using multiple layers of nodes -- input, hidden, and output -- and they're able to learn both in supervised and unsupervised situations. Complex nets are able to make abstractions and generalize, making them more efficient and better able to learn.

What can we use these fascinating systems for?

In theory, we can use neural networks for almost anything. And you've probably been using them without realizing it. They're very common in speech and visual recognition, for example, because they can learn to pick out specific traits that sounds or images have in common.

So when you ask Siri where the nearest gas station is, your iPhone is putting your speech through a neural network to figure out what you're saying. There may be another neural network that learns to predict the sorts of things you're likely to ask for.

Self-driving cars might use neural networks to process visual data, thereby following road rules and avoiding collisions. Robots of all types can benefit from neural networks that help them learn to efficiently complete tasks. Computers can learn to play games like chess, Go, and Atari classics. If you've ever talked to a chatbot, there's a chance it was using a neural network to offer appropriate responses.

Internet search can benefit greatly from neural networks, as the highly efficient parallel processing model can churn through a lot of data quickly. A neural network could also learn your habits to personalize your search results or predict what you're going to search for in the near future. This prediction model would obviously be very valuable to marketers (and anyone else who needs to predict complex human behavior).

Image recognition, optical character recognition, stock market prediction, route-finding, big data processing, medical cost analysis, sales forecasting, video game AI... the possibilities are almost endless. The ability of neural networks to learn patterns, make generalizations, and successfully predict behavior makes them valuable in countless situations.

The Future of Neural Nets

Neural networks have advanced from very simple models to highly complex learning simulations. They're in our phones, our tablets, and running many of the web services we use. There are many other machine-learning systems out there.

But neural networks, because of their similarity (in a very simplified way) to the human brain, are some of the most fascinating. As we continue to develop and refine models, there's no telling what they'll be capable of.

Do you know of any interesting uses of neural networks? Do you have experience with them yourself? What do you find most interesting about this technology? Share your thoughts in the comments below!