I extracted this text from a PDF invoice:
INVOCE DATE            Nº ITEM          CONTRACT DATA 
10/10/15           EN56000004567WWG      Standard Plan 3
  CONCEPT        AMOUNT       MONTHS   UNITPRIZE     PRIZE
CONCEPT AAA    47,101   MB      1,0    3,394074   159,86   Dollars
CONCEPT BBB    26,122   MB      1,0    3,394074    88,66   Dollars
CONCEPT CCC    37,101   MB      1,0    3,394074   125,92   Dollars
                       TOTAL       374,44 Dollars
This text is actually a table with several lines but only one column: the values are separated only by whitespace, and the number of whitespace characters differs on almost every line.
What I want is to get the amounts "47,101", "26,122" and "37,101", each with its own regex based on its concept. For example, regex1 gets "47,101" by looking for "CONCEPT AAA", and so on.
I have managed to get "CONCEPT AAA 47,101" using this R line:
regmatches(invoice,regexpr("\\bCONCEPT AAA\\s*([-,0-9]+)", invoice, perl=TRUE))
but I only want the number "47,101".
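One possible direction, sketched here under assumptions rather than as a definitive solution: with `perl=TRUE`, the PCRE `\K` escape drops everything matched before it from the reported match, so the same `regmatches`/`regexpr` call can return just the number. The `invoice` vector below is a hypothetical stand-in for the `readPDF` output (assumed to be one string per invoice line), and `amount_aaa` is an illustrative name.

```r
# Stand-in for the readPDF output (assumption: one string per invoice line).
invoice <- c(
  "CONCEPT AAA    47,101   MB      1,0    3,394074   159,86   Dollars",
  "CONCEPT BBB    26,122   MB      1,0    3,394074    88,66   Dollars",
  "CONCEPT CCC    37,101   MB      1,0    3,394074   125,92   Dollars",
  "                       TOTAL       374,44 Dollars"
)

# \\K resets the start of the reported match, so the label and the
# whitespace are still required to match but are not returned.
amount_aaa <- regmatches(
  invoice,
  regexpr("\\bCONCEPT AAA\\s+\\K[-,0-9]+", invoice, perl = TRUE)
)

amount_aaa
```

Swapping "CONCEPT AAA" for "CONCEPT BBB" or "CONCEPT CCC" gives the regex for the other concepts.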
ADDITIONAL INFO
To read the PDF I use the readPDF function from the tm package in R, which outputs this table as a character vector.
Because there are many invoices with slight differences in layout, I prefer to extract the data with regex rather than attempt a better PDF-to-table conversion.
BONUS:
I would then also like to get the prices for each concept: "159,86", "88,66", "125,92".
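For the prices, a rough sketch along the same lines is to skip a fixed number of whitespace-separated fields after the concept label before capturing the number. The helper `get_field` and its `skip` argument are hypothetical names introduced for illustration. It assumes the price is always the fifth field after the label (amount, unit, months, unit price, price) and that the concept label contains no regex metacharacters.

```r
# Stand-in for the readPDF output (assumption: one string per invoice line).
invoice <- c(
  "CONCEPT AAA    47,101   MB      1,0    3,394074   159,86   Dollars",
  "CONCEPT BBB    26,122   MB      1,0    3,394074    88,66   Dollars",
  "CONCEPT CCC    37,101   MB      1,0    3,394074   125,92   Dollars"
)

# Hypothetical helper: return the numeric field that follows `concept`,
# after skipping `skip` whitespace-separated fields.
get_field <- function(text, concept, skip = 0) {
  # (?:\\s+\\S+){n} consumes n fields; \\K then discards all of it
  # from the reported match, leaving only the number.
  pattern <- sprintf("\\b%s(?:\\s+\\S+){%d}\\s+\\K[-,0-9]+", concept, skip)
  regmatches(text, regexpr(pattern, text, perl = TRUE))
}

concepts <- c("CONCEPT AAA", "CONCEPT BBB", "CONCEPT CCC")

# skip = 0 gives the amount, skip = 4 gives the price.
amounts <- vapply(concepts, function(cn) get_field(invoice, cn, 0),
                  character(1), USE.NAMES = FALSE)
prices  <- vapply(concepts, function(cn) get_field(invoice, cn, 4),
                  character(1), USE.NAMES = FALSE)
```

Counting fields rather than characters is what makes this tolerant of the varying whitespace, but it would break on invoices where a field is missing or contains spaces.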
 
    