Graph Theory in Neural Networks

March 23, 2026

Graph theory is the branch of discrete mathematics concerned with the study of graphs, their properties, and the theorems that govern them. In practice, these structures are used to model connections between objects.

Neural networks, on the other hand, are a tool and area of study within machine learning. Inspired by the structure and functioning of the human brain, neural networks are composed of layers of artificial neurons (nodes), connected to each other through edges of a given weight.

At first glance, both concepts may seem incompatible — perhaps vaguely related. However, advances in machine learning as a discipline have provided, over years of theoretical research, a conceptual and practical framework rigorous enough to develop what is known today as graph neural networks. These are formally defined as deep learning methods that operate over the domain of graphs. Their advantage lies precisely in the effectiveness of these structures for handling complex connections between data in the context of a non-Euclidean space.

The following definitions are considered:

Definition. A graph G is an ordered pair G = (V, E), formed by a finite, non-empty set of vertices V and a collection of pairs of vertices E, called edges.

Definition. A graph G = (V, E) is directed if the elements of E (edges) are ordered pairs of vertices; that is, e = (x, y), with x, y ∈ V.
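The two definitions above can be sketched directly in code. The vertex labels and edges below are illustrative, not taken from the text; the edge set uses ordered pairs, making the graph directed:

```python
# A graph G = (V, E): a finite, non-empty vertex set V and a
# collection of vertex pairs E. Ordered pairs make it directed.
V = {"a", "b", "c", "d"}
E = {("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")}

# Adjacency-list view: the successors of each vertex.
adj = {v: [] for v in V}
for x, y in E:
    adj[x].append(y)

print(sorted(adj["a"]))  # successors of "a"
```

The adjacency list is one of several equivalent representations; an adjacency matrix would encode the same edge set as a |V| × |V| array of 0s and 1s.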

Definition. A neural model is composed of three elements: (1) A set of connecting links (synapses), each carrying a weight. (2) An adder that sums the weighted input signals, acting as a linear combiner. (3) An activation function that limits the amplitude of the neuron's output.
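A minimal sketch of this three-element model, with illustrative inputs and weights (the specific values are assumptions for the example, not from the text):

```python
import math

def neuron(inputs, weights, bias, activation=math.tanh):
    """One artificial neuron, mirroring the definition above:
    (1) connecting links carry the weighted inputs,
    (2) a linear combiner sums them together with the bias,
    (3) a bounded activation function limits the output amplitude."""
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(v)

# Example: three inputs, three weights, one bias.
y = neuron([1.0, -2.0, 0.5], [0.4, 0.1, -0.6], bias=0.2)
```

Because tanh is bounded, the output y always lies in (-1, 1) regardless of how large the weighted sum grows.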

Definition. A neural network is a directed graph consisting of nodes interconnected by synaptic and activation links, satisfying four properties: (1) Each neuron is represented by a set of linear synaptic links, an externally applied bias, and a possible nonlinear activation link. The bias is represented by a synaptic link connected to an input fixed at +1+1. (2) The synaptic links of a neuron carry the weight of their respective inputs. (3) The weighted sum of the input signals defines the induced local field of the neuron in question. (4) The activation link squashes the induced local field of the neuron to produce an output.
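The four properties above can be illustrated with a tiny feedforward pass. The layer sizes and weight values below are hypothetical, chosen only to show the structure; note how each neuron's bias is encoded as a weight on an input fixed at +1, exactly as property (1) requires:

```python
import math

def layer(x, W, activation=math.tanh):
    """One layer of neurons. Each row of W holds a neuron's synaptic
    weights; the last entry is the bias, treated as a synaptic link
    to a fixed +1 input. The weighted sum is the neuron's induced
    local field, which the activation link squashes into an output."""
    x_plus = x + [1.0]  # append the fixed +1 bias input
    return [activation(sum(w * xi for w, xi in zip(row, x_plus)))
            for row in W]

# Hypothetical 2-2-1 network (weights for illustration only).
W1 = [[0.5, -0.3, 0.1],   # hidden neuron 1: w1, w2, bias
      [0.2,  0.8, -0.4]]  # hidden neuron 2: w1, w2, bias
W2 = [[1.0, -1.0, 0.0]]   # output neuron

h = layer([1.0, 2.0], W1)   # hidden activations
y = layer(h, W2)            # network output
```

Viewed as a directed graph, the vertices are the neurons (plus the fixed +1 inputs) and the weighted edges are the synaptic links, which is precisely the correspondence the definition draws.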

Non-IID and Non-Euclidean Data

Traditionally, neural networks are limited to independent, identically distributed data in Euclidean space (images and time series, for example). Their inputs are vectors, each representing a single point in the dataset, independent of the others. As such, relationships between data points do not influence the output of these models (feedforward neural networks, convolutional neural networks, etc.).

For this reason, graph neural networks are far better suited to highly dynamic and interconnected information. A GNN captures the structure of the graph by aggregating relevant information from vertex neighborhoods, leveraging these relationships to learn richer representations.
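The neighborhood aggregation described above can be sketched in a few lines. This is a simplified, assumed illustration of the core idea behind a GNN layer (mean aggregation over an undirected toy graph), not any specific published architecture:

```python
# One round of neighborhood aggregation: each vertex updates its
# feature vector to the mean of its own and its neighbors' features.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = [("a", "b"), ("b", "c"), ("a", "c")]  # undirected for simplicity

# Build neighborhoods (both directions of every edge).
neigh = {v: [] for v in features}
for x, y in edges:
    neigh[x].append(y)
    neigh[y].append(x)

def aggregate(v):
    """Mean of the features of v and its neighbors (self-loop included)."""
    group = [features[v]] + [features[u] for u in neigh[v]]
    return [sum(col) / len(group) for col in zip(*group)]

updated = {v: aggregate(v) for v in features}
```

Stacking several such rounds lets information propagate across multi-hop neighborhoods, which is how a GNN exploits graph structure that a feedforward or convolutional network simply cannot see.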