I am currently trying to improve an OCR routine. The text I encounter is white on a varying background, so I want to turn the pure white text black and everything else white. Everything works fine until I need to invert the colours.
PIL's ImageOps.invert doesn't support this image mode, so I have to convert first, but I get bad results from that:
OSError: not supported for this image mode
My test image is this:
Which I can turn into:
But when I convert, invert and convert back, the image ends up with colours/grayscale again.
So, currently, I can't find a way to get the result I want:
If I run the OCR on the image with white text, I only get "Lampent used BS] gL [LL =e". With black text it reads perfectly fine.
Is there another way to invert my image? The only other approaches I found change one pixel at a time, with no good guidance for a beginner coder.
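For reference, the convert, invert, convert-back route could be sketched as below. This is only a minimal sketch: `invert_bilevel` and its `cutoff` parameter are made-up names, and it assumes the image is already in mode '1' (pure black and white), so the grayscale detour only ever sees the values 0 and 255.

```python
from PIL import Image, ImageOps


def invert_bilevel(im, cutoff=127):
    """Invert a black-and-white image by detouring through mode 'L'.

    `invert_bilevel` and `cutoff` are hypothetical names for illustration.
    """
    gray = im.convert('L')            # 8-bit grayscale, a mode ImageOps.invert accepts
    inverted = ImageOps.invert(gray)  # 0 becomes 255 and 255 becomes 0
    # Threshold again so the result is a pure black-and-white (mode '1') image
    return inverted.point(lambda x: 255 if x > cutoff else 0, mode='1')
```

Thresholding at the end, rather than leaving the image in 'L' or 'RGB', is what keeps the result strictly black and white.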
def readimg(image, write=False):
    import pytesseract
    from PIL import Image
    # opening an image from the source path
    if isinstance(image, str):
        img = Image.open(image)
        img = img.convert('RGBA')
    else:
        img = image
    img = img.convert('RGB')  # Worse results if not reconverted??
    img.show()
    # path where the tesseract module is installed
    pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'
    # converts the image to result and saves it into result variable
    result = pytesseract.image_to_string(img)
    if "\x0c" in result:  # Catch some garbage text (assumed here: Tesseract's trailing form feed)
        result = result[:-2]
    result = result.strip()  # Clean newlines
    # append the text to output.txt in the working directory
    if write:
        with open('output.txt', mode='a') as file:
            file.write(result)
    print(result)
    return result
def improve_img(image):
    from PIL import Image, ImageOps
    if isinstance(image, str):  # if given a file path, open it
        im = Image.open(image)
    else:
        im = image
    im = im.convert('RGBA')
    thresh = 254  # https://stackoverflow.com/questions/9506841/using-python-pil-to-turn-a-rgb-image-into-a-pure-black-and-white-image
    fn = lambda x: 255 if x > thresh else 0
    r = im.convert('L').point(fn, mode='1')
    #r = im.convert('RGB')
    #r = ImageOps.invert(r)
    #r = im.convert('L')
    #r.save("test.png")
    #r.show()
    return r
if __name__ == '__main__':
    test = improve_img('img/testtext1.png')
    readimg(test)
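
One possible way to avoid a separate invert step is to swap the two output values in the threshold function, so the white text comes out black directly. A minimal sketch, assuming the same threshold of 254 as in improve_img; `white_text_to_black` is a made-up name, not something from PIL:

```python
from PIL import Image


def white_text_to_black(im, thresh=254):
    """Map pixels brighter than `thresh` to black and all others to white.

    Hypothetical helper: the same idea as `fn` in improve_img,
    with the two output values swapped.
    """
    gray = im.convert('L')
    return gray.point(lambda x: 0 if x > thresh else 255, mode='1')
```

If this works as intended, the line `r = im.convert('L').point(fn, mode='1')` in improve_img could become `r = white_text_to_black(im)`, and no call to ImageOps.invert would be needed.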




