# Deep Learning 20: graph batching in PyTorch Geometric

PyG (PyTorch Geometric) is a deep learning library for GNNs (graph neural networks) built on top of PyTorch. To speed up computation, we want to process graphs in batches, even though the graphs have different “shapes”, that is, different numbers of nodes and edges.

In PyG, it is possible to pack the data in batches. According to the documentation, “Adjacency matrices are stacked in a diagonal fashion (creating a giant graph that holds multiple isolated subgraphs), and node and target features are simply concatenated in the node dimension.”
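To make the quoted description concrete, here is a minimal plain-Python sketch of the bookkeeping. It is not PyG's actual implementation, and the helper name `collate_graphs` is hypothetical. It concatenates the node features, shifts each graph's node ids by the number of nodes placed before it (which has the same effect as putting the adjacency matrices on a block diagonal), and records which graph each node belongs to.

```python
def collate_graphs(graphs):
    """Merge (x, edge_index) pairs into one big disconnected graph.

    Each graph is a tuple (x, edge_index): x is a list of per-node feature
    lists, edge_index is [sources, targets] with node ids local to the graph.
    """
    xs, sources, targets, batch = [], [], [], []
    offset = 0  # number of nodes already placed in the big graph
    for graph_id, (x, edge_index) in enumerate(graphs):
        xs.extend(x)  # concatenate along the node dimension
        # shift node ids so this graph occupies its own diagonal block
        sources.extend(s + offset for s in edge_index[0])
        targets.extend(t + offset for t in edge_index[1])
        batch.extend([graph_id] * len(x))  # node -> graph assignment
        offset += len(x)
    return xs, [sources, targets], batch


# two small toy graphs with 3 and 2 nodes (values are illustrative)
g1 = ([[1.0], [2.0], [3.0]], [[0, 1, 1, 2], [1, 0, 2, 1]])
g2 = ([[4.0], [5.0]], [[0, 1, 1], [1, 0, 1]])

x, edge_index, batch = collate_graphs([g1, g2])
print(edge_index)  # [[0, 1, 1, 2, 3, 4, 4], [1, 0, 2, 1, 4, 3, 4]]
print(batch)       # [0, 0, 0, 1, 1]
```

No edge connects a node of the first graph to a node of the second, so message passing on the merged graph treats the two graphs independently.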

### Use our own data

So to run in small batches, we need a list of node features (X) and a list of adjacency matrices (A). In PyG, we first pack each (X, A) pair into a `torch_geometric.data.Data` object, then put the list of `Data` objects into a `torch_geometric.loader.DataLoader`.

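```python
import torch

from torch_geometric.data import Batch
from torch_geometric.data import Data
from torch_geometric.loader import DataLoader

# make some toy data (the feature values are illustrative placeholders)
# graph 1: 3 nodes with 1 feature each, 4 directed edges
x1 = torch.tensor([[1.0], [2.0], [3.0]])
edge_index1 = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])

# graph 2: 2 nodes with 1 feature each, 3 directed edges
x2 = torch.tensor([[4.0], [5.0]])
edge_index2 = torch.tensor([[0, 1, 1], [1, 0, 1]])

# make Data objects
data1 = Data(x=x1, edge_index=edge_index1)
data2 = Data(x=x2, edge_index=edge_index2)

# put the list of Data objects into a DataLoader, one graph per batch
loader = DataLoader([data1, data2], batch_size=1, shuffle=False)

for batch in loader:
    print(batch)

'''outputs:
DataBatch(x=[3, 1], edge_index=[2, 4], batch=[3], ptr=[2])
DataBatch(x=[2, 1], edge_index=[2, 3], batch=[2], ptr=[2])
'''
```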

In the previous example, we create two graphs, defined by `x1`, `edge_index1` and `x2`, `edge_index2`. Note that the node feature matrix X has shape `(num_nodes, dim)`, while the adjacency information A is stored as an edge list of shape `(2, num_relations)` rather than as a dense matrix. The first row of A holds the source node ids and the second row holds the corresponding target node ids.
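As a small illustration of the `(2, num_relations)` layout, the plain-Python sketch below converts a dense adjacency matrix into the two-row source/target format. The helper `dense_to_edge_index` is hypothetical and written only for this post. It assumes that a nonzero entry `adj[i][j]` means an edge from node `i` to node `j`.

```python
def dense_to_edge_index(adj):
    """Return [sources, targets] for every nonzero entry adj[i][j]."""
    sources, targets = [], []
    for i, row in enumerate(adj):
        for j, value in enumerate(row):
            if value != 0:  # an edge from node i to node j
                sources.append(i)
                targets.append(j)
    return [sources, targets]


# dense adjacency of the 3-node path graph 0 - 1 - 2
# (undirected, so the matrix is symmetric and each edge appears twice)
adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

edge_index = dense_to_edge_index(adj)
print(edge_index)  # [[0, 1, 1, 2], [1, 0, 2, 1]]
```

The result has the same entries as `edge_index1` above: an undirected edge is stored as two directed edges, one in each direction.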

Then, to make a `Data` object, we pass in the node features and the adjacency matrix in edge-list form. In the dataloader, we set the batch size to 1, so each iteration yields a batch that holds a single graph.

### Pack in a single batch

If you do not want to use a for-loop and only need a single batch, here is the correct way to pack the data:

```python
# first get a dataloader