I have a text file as shown below in the sample that I want to convert to a csv file (currently using Pandas).
The challenge is that I don't know in advance what the keys (column headers are) and in what order they are.
The final column order is not important.
Sample file:
name: john| dob: 10-06-1960| address: 4853 Radio Park Drive
name: jane| dob: 07-10-1973| address: 1537 Timbercrest Road| mobile: 706-289-6746
name: liam| dob: 12-08-1986| address: 4853 498 Fairmont Avenue| telephone number: 706-687-5021
name: chris| dob: 09-12-1965| address: 485 Green Avenue| state: California| Telephone Number: 510-855-5213
Desired output:
Name | dob        | address                  | mobile       | telephone number | state      |
-----+------------+--------------------------+--------------+------------------+------------+
john | 10-06-1960 | 4853 Radio Park Drive    |              |                  |            |
jane | 07-10-1973 | 1537 Timbercrest Road    | 706-289-6746 |                  |            |
liam | 12-08-1986 | 4853 498 Fairmont Avenue |              | 706-687-5021     |            |
chris| 09-12-1965 | 485 Green Avenue         |              | 510-855-5213     | California |
My Code:
import pandas as pd
df = pd.DataFrame()
file = open('D:\sample.log', 'r')
        
lines = file.readlines()
for line in lines:
    pairs = line.split('|')
    my_dict = {}
    for pair in pairs:
       key = pair.split(': ')[0].strip()
       value = pair.split(': ')[1].strip()
            
       my_dict[key] = value
       df.append(my_dict, ignore_index=True)
This way of appending is very slow. How can I make this faster.
Or is there a much better solution (for example via a json string)?
 
     
    