Below are a few lines from the beginning of my OCR program. Timing them with the Time() function shows that these few lines account for 90% of a run's time. Unfortunately, I have run out of ideas for making them faster. How would you approach speeding this up?
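import pytesseract

# doc and ResultpageNumber are defined earlier in the program
for page_number, page_data in enumerate(doc, start=1):
    # image_to_string already returns text, so the encode/decode round trip is not needed
    txt = pytesseract.image_to_string(page_data, lang='eng')

    # enumerate replaces the manual Counter variable
    for counter, token in enumerate(txt.split()):
        ResultpageNumber.append([page_number, token, counter])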
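
One direction that might be worth trying, as a rough sketch rather than a measured fix: the time is most likely spent inside image_to_string and not in the token loop, and each page is processed independently of the others. pytesseract runs the Tesseract binary as an external process, so a thread pool can keep several pages in flight at once. The helper name ocr_tokens_parallel, the ocr_func parameter, and the max_workers value below are assumptions made for illustration; they are not part of the original program or of pytesseract.

```python
from concurrent.futures import ThreadPoolExecutor


def ocr_tokens_parallel(pages, ocr_func, max_workers=4):
    """Run ocr_func on every page concurrently.

    Returns rows shaped like the original loop's output:
    [page_number (1-based), token, token_index].
    ocr_func is any callable that takes one page and returns its text.
    """
    rows = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so page numbers stay correct
        texts = pool.map(ocr_func, pages)
        for page_number, txt in enumerate(texts, start=1):
            for index, token in enumerate(txt.split()):
                rows.append([page_number, token, index])
    return rows


# Hypothetical usage with the names from the question (needs pytesseract installed):
# import pytesseract
# ResultpageNumber = ocr_tokens_parallel(
#     doc, lambda page: pytesseract.image_to_string(page, lang='eng'))
```

Whether this helps depends on how many CPU cores are free and on how Tesseract itself was built, so it would need to be timed on the real documents. A max_workers value close to the number of cores is a reasonable first guess.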
 
    