I used the example of classifying reviews with BERT described at the link. The code is written for the CPU and works fine, but slowly: in Google Colab, with a multilingual model, one epoch takes about 4 hours for me. If I replace the CPU with CUDA everywhere in the code, the same error that you ran into appears. I followed the guidelines given in the link, but then another error appears:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-0b35a5f74768> in <module>()
    268                   'labels': batch[2],
    269                   }
--> 270         inputs.to(device)
    271         outputs = model(**inputs)
    272 
AttributeError: 'dict' object has no attribute 'to'
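
As far as I can tell, the message means that `inputs` is a plain Python dict, and a dict has no `.to()` method; only the tensors stored inside it do. Here is a minimal sketch of a possible workaround, assuming every value in `inputs` that needs moving is a PyTorch tensor. The helper name `move_to_device` is my own and not part of the example or of any library:

```python
def move_to_device(batch, device):
    """Return a new dict in which every value that has a .to() method
    (e.g. a torch.Tensor) has been moved to `device`.

    Hypothetical helper: values without a .to() method are passed
    through unchanged.
    """
    return {
        key: value.to(device) if hasattr(value, "to") else value
        for key, value in batch.items()
    }

# In the loop from the traceback, this would replace `inputs.to(device)`:
#     inputs = move_to_device(inputs, device)
#     outputs = model(**inputs)
```

Note that `Tensor.to()` returns a new tensor instead of modifying the original in place, so the result has to be assigned back. The model itself also needs to be on the same device (`model.to(device)`), otherwise a device-mismatch error is likely to follow.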
 
    