You have for exemple :
class My_model(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
# Call construtor of Class 
my_model = My_model()
It's important to differentie the class and objet.
The name's class start with capital letter in Python.
The constructor as you can see, it doesn't take a data/input parameter, alone the function forward have one.
After, for the training, you must to need :
- criterion who calcul error that model with the labels.
- It must have optimizer for back propagation algorythm
Exemple :
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
For end, you must to need, with a loop, this elements :
    # forward + backward + optimize
    outputs = net(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
Here, you have one iteration of back propagation.
Pytorch documentation
If you want thinking the inference in backpropagation, you can read how create a layer with pytorch and how the pytorch use autograph.
The tensor use Autograph for backpropagation. Exemple with Pytorch documentation
import torch
x = torch.ones(5)  # input tensor
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()
print(w.grad)
print(b.grad)
The resultat give the backpropagation, where the cross entropy criterion calcul the distance with model and label. The Tensor z is not unique matrice of value but a class with "memory the calcul" with w, b, x, y.
In the layer the gradiant use the forward function for this calcul or a function backward if necesserie.
Best regard