I tried to run this HTR model https://github.com/arthurflor23/handwritten-text-recognition but it gives me this error: "Invalid argument: Not enough time for target transition sequence". The problem, I think, is in ctc_batch_cost. My image dimensions are (137, 518) and the max_len of the text is 137. Any idea how I can solve this issue?
– Ali Mostafa
- did you check this out? https://stackoverflow.com/a/50719312/2287841 – Octav Feb 10 '21 at 16:53
- Yes, I checked this and I understand the problem, but I can't solve it. – Ali Mostafa Feb 10 '21 at 16:59
- So you do not have duplicated letters, which would increase the required label length? – Octav Feb 10 '21 at 17:02
- Do you mean in my text label? I think I do; if so, what can I do to solve it? – Ali Mostafa Feb 10 '21 at 17:13
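The duplicated-letter point in the comments matters because CTC must emit a blank between adjacent repeated characters, so such labels need extra time steps. A minimal sketch of that rule (the helper name is an assumption, not from the repository):

```python
def min_ctc_timesteps(label: str) -> int:
    """Minimum number of CTC time steps needed to emit `label`:
    its length plus one blank for every adjacent repeated pair."""
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats

print(min_ctc_timesteps("hello"))  # 'll' needs one extra blank -> 6
print(min_ctc_timesteps("abc"))    # no repeats -> 3
```

If any label's minimum exceeds the model's output time dimension, ctc_batch_cost raises exactly the "Not enough time for target transition sequence" error.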
 
1 Answer
I fixed the issue; it was due to the size of the input. Here is the model summary:
Layer (type)                 Output Shape              Param #   
=================================================================
input (InputLayer)           [(None, 1024, 128, 1)]    0         
_________________________________________________________________
conv2d (Conv2D)              (None, 1024, 64, 16)      160       
_________________________________________________________________
p_re_lu (PReLU)              (None, 1024, 64, 16)      16        
_________________________________________________________________
batch_normalization (BatchNo (None, 1024, 64, 16)      112       
_________________________________________________________________
full_gated_conv2d (FullGated (None, 1024, 64, 16)      4640      
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 1024, 64, 32)      4640      
_________________________________________________________________
p_re_lu_1 (PReLU)            (None, 1024, 64, 32)      32        
_________________________________________________________________
batch_normalization_1 (Batch (None, 1024, 64, 32)      224       
_________________________________________________________________
full_gated_conv2d_1 (FullGat (None, 1024, 64, 32)      18496     
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 512, 16, 40)       10280     
_________________________________________________________________
p_re_lu_2 (PReLU)            (None, 512, 16, 40)       40        
_________________________________________________________________
batch_normalization_2 (Batch (None, 512, 16, 40)       280       
_________________________________________________________________
full_gated_conv2d_2 (FullGat (None, 512, 16, 40)       28880     
_________________________________________________________________
dropout (Dropout)            (None, 512, 16, 40)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 512, 16, 48)       17328     
_________________________________________________________________
p_re_lu_3 (PReLU)            (None, 512, 16, 48)       48        
_________________________________________________________________
batch_normalization_3 (Batch (None, 512, 16, 48)       336       
_________________________________________________________________
full_gated_conv2d_3 (FullGat (None, 512, 16, 48)       41568     
_________________________________________________________________
dropout_1 (Dropout)          (None, 512, 16, 48)       0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 256, 4, 56)        21560     
_________________________________________________________________
p_re_lu_4 (PReLU)            (None, 256, 4, 56)        56        
_________________________________________________________________
batch_normalization_4 (Batch (None, 256, 4, 56)        392       
_________________________________________________________________
full_gated_conv2d_4 (FullGat (None, 256, 4, 56)        56560     
_________________________________________________________________
dropout_2 (Dropout)          (None, 256, 4, 56)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 256, 4, 64)        32320     
_________________________________________________________________
p_re_lu_5 (PReLU)            (None, 256, 4, 64)        64        
_________________________________________________________________
batch_normalization_5 (Batch (None, 256, 4, 64)        448       
_________________________________________________________________
reshape (Reshape)            (None, 256, 256)          0         
_________________________________________________________________
bidirectional (Bidirectional (None, 256, 256)          296448    
_________________________________________________________________
dense (Dense)                (None, 256, 256)          65792     
_________________________________________________________________
bidirectional_1 (Bidirection (None, 256, 256)          296448    
_________________________________________________________________
dense_1 (Dense)              (None, 256, 332)          85324     
=================================================================
Look at the final layer (dense_1): its second dimension is 256, which is the number of output time steps the model produces. CTC requires each label sequence to fit within those time steps, so your text label length must be <= 256, not more (strictly less when the label contains adjacent repeated characters, since CTC needs a blank between them). The problem comes from here.
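One way to catch this before training is to validate the encoded labels against the model's output time dimension. A minimal sketch in plain Python (the constant and function names are assumptions, not from the repository):

```python
# Second dimension of dense_1 in the summary above.
OUTPUT_TIME_STEPS = 256

def check_labels(labels, time_steps=OUTPUT_TIME_STEPS):
    """Raise early if any encoded label cannot fit in the model's
    output time steps, instead of failing inside ctc_batch_cost."""
    bad = [i for i, y in enumerate(labels) if len(y) > time_steps]
    if bad:
        raise ValueError(
            f"labels at indices {bad} are longer than {time_steps} time steps"
        )

check_labels([[1, 2, 3], [4] * 256])  # fits: longest label is exactly 256
```

Either truncate or filter out the over-long labels, or enlarge the input width so the model produces more time steps.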
– Ali Mostafa