In Pandas, I have a dataframe, written from a csv. My end goal is to generate an XML schema from that CSV, because each of the items in the CSV correspond to a schema variable. The only solution (that I could think of) would be to read each item from that dataframe so that it generates a text file, with each value in the dataframe surrounded by a string.
TableName  Variable      Interpretation   Col4   Col5
CRASH      CRASH_ID      integer          1     
CRASH      SER_NO        range            0
CRASH      SER_NO        code             99999
CRASH      CRASH_MO_NO   code             1      January
CRASH      CRASH_MO_NO   code             2      February
Which would generate a text file that results in something along the lines of (using the first row as an example):
<table = "CRASH">
<name = "CRASH_ID">
<type = "integer">
<value = "1">
Where <table = >, <name = >, are all strings. They don't have to be formatted that way specifically (although that would be nice)-- I just need a faster way to generate this schema than typing it all out by hand from a CSV file.
It seems like the best way to do that would be to read through each row and generate a string while writing it to the output file. I've looked at the .iterrows() method, but that doesn't let me concatenate strings and tuples. I've also looked at some posts from other users, but their focus seems to be more on calculating things within dataframes, or changing the data itself, rather than generating a string from each row.
My current code is below. I understand that pandas is based off Numpy arrays, and that running "for i in df" loops is not an efficient method, but I am not really sure where to start.
EDIT: Some of the rows might need to loop through to display a certain way. For instance, the schema has multiple value codes that have strings attached:
<values>
<value code = "01">January</value>
<value code = "02">February</value>
<value code = "03">March</value>
</values>
I am thinking maybe I could group the values by "interpretation"? And then, if they have the "code" interpretation, I could do some kind of iteration through the group so that it displayed all the codes.
Here is my current code, for reference. I have updated it to reflect Randy's excellent suggestion below. I have also edited the above post to reflect some updated concerns.
import pandas as pd
text_file = open(r'oregon_output.txt', 'w')
df = pd.read_csv(r'oregon_2013_var_list.csv')
#selects only CRASH variables
crash = df['Col1'] == 'CRASH'
df_crash = df[crash]
#value which will be populated with code values from codebook
code_fill = " " 
#replaces NaN values in dataframe wih code_fill
df_crash.fillna(code_fill, inplace = True)
for row_id, row in df.iterrows():
    print '<variable>'
    for k, v in row.iterkv():
        if v is not None:
            print '<{} = "{}">'.format(k, v)
    print '</variable>'
    print
 
     
    
 
    