Recently I've been working on table extraction, specifically with stream tables, and in this post I saw that Tabula handles this kind of extraction very well.
For example, when comparing Tabula vs Camelot on "budget.pdf", Tabula merges the last two columns in its output. That can be fixed with .str.split(' ', expand=True) on the merged column, and then combine, join or merge can be used to rebuild the original PDF table.
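A minimal sketch of that split-and-rebuild fix, using pandas. The dataframe here is a made-up stand-in for a Tabula result, and the column names and values are assumptions for illustration, not the actual contents of "budget.pdf".

```python
import pandas as pd

# Hypothetical stand-in for an extraction result in which the last two
# columns came out merged into a single string column.
df = pd.DataFrame({
    "Item": ["Salaries", "Travel"],
    "Budget Actual": ["1000 950", "200 180"],
})

# Split the merged column on the first space into two new columns.
parts = df["Budget Actual"].str.split(" ", n=1, expand=True)
parts.columns = ["Budget", "Actual"]

# Rebuild the table: drop the merged column and append the two split ones.
fixed = pd.concat([df.drop(columns="Budget Actual"), parts], axis=1)
print(fixed)
```

This only works cleanly when every row of the merged column holds exactly two space-separated values. The split values stay as strings, so they would still need converting with something like pd.to_numeric if numbers are expected.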
I noticed that when the gap between two columns is very small, they get merged into one. In the task I'm trying to solve, that is very common. I'm not sure how good my solution is, because in some of the examples I work on the columns are merged only in the middle of the dataframe, and then I have to rearrange the columns of the whole dataframe.
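For the case where the merge happens only in some rows, one possible approach is to repair just those rows instead of rearranging the whole dataframe. The sketch below assumes the merged value lands in the left column and leaves the right column empty. The helper name split_merged_rows and the sample data are hypothetical.

```python
import pandas as pd

def split_merged_rows(df, left, right, sep=" "):
    """Repair rows where `right` is empty and `left` holds two merged values.

    The text after the last `sep` in `left` is moved into `right`.
    Rows that do not match this pattern are left unchanged.
    """
    out = df.copy()
    text = out[left].astype(str)
    # A row counts as merged when the right cell is missing and the
    # left cell contains the separator.
    merged = out[right].isna() & text.str.contains(sep, regex=False)
    if merged.any():
        parts = text[merged].str.rsplit(sep, n=1, expand=True)
        out.loc[merged, left] = parts[0]
        out.loc[merged, right] = parts[1]
    return out

# Hypothetical extraction result: only the middle row was merged.
raw = pd.DataFrame({
    "Item": ["Salaries", "Travel", "Supplies"],
    "Budget": ["1000", "200 180", "50"],
    "Actual": ["950", None, "45"],
})
repaired = split_merged_rows(raw, "Budget", "Actual")
print(repaired)
```

This relies on the left column never legitimately containing the separator next to an empty right cell, so it would mis-split free-text cells. It also does not cover layouts where the merge shifts the remaining columns sideways.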
I would like to know if Tabula has a parameter that can be tuned to deal with that, like PDFMiner, where you can control the distances between values...