Hi I have dataframe with 2 columns :
+----------------------------------------+----------+
|                  Text                  | Key_word |
+----------------------------------------+----------+
| First random text tree cheese cat      | tree     |
| Second random text apple pie three     | text     |
| Third random text burger food brain    | brain    |
| Fourth random text nothing thing chips | random   |
+----------------------------------------+----------+
I want to generate a 3rd columns with a word appearing before the key_word from the text.
+----------------------------------------+----------+-------------------+--+
|                  Text                  | Key_word | word_bef_key_word |  |
+----------------------------------------+----------+-------------------+--+
| First random text tree cheese cat      | tree     | text              |  |
| Second random text apple pie three     | text     | random            |  |
| Third random text burger food brain    | brain    | food              |  |
| Fourth random text nothing thing chips | random   | Fourth            |  |
+----------------------------------------+----------+-------------------+--+
I tried this but it's not working
df2=df1.withColumn('word_bef_key_word',regexp_extract(df1.Text,('\\w+)'df1.key_word,1))
Here is the code to create a example of the dataframe
df = sqlCtx.createDataFrame(
    [
        ('First random text tree cheese cat' , 'tree'),
        ('Second random text apple pie three', 'text'),
        ('Third random text burger food brain' , 'brain'),
        ('Fourth random text nothing thing chips', 'random')
    ],
    ('Text', 'Key_word') 
)
 
    