objective
I am trying to automatically generate an EDA report for each column in my dataframe, starting with value_counts().
problem
The problem is that my function doesn't return anything, so while it prints to the console, that same output never reaches my text file. Until now I have used it to generate syntax that I then run line by line in my IDE to look at all the variables, but that is not a very programmatic solution.
notes
Once this is working, I am going to add some syntax for graphs and the output of df.describe(), but for now I can't even get the basics of what I want.
Output doesn't have to be .txt, but I thought that would be easiest while getting this to work.
I tried
import pandas as pd
def EDA(df, name):
    df.name = name  # name == string version of df
    print('#', df.name)
    for val in df.columns:
        print('# ', val, '\n', df[val].value_counts(dropna=False), '\n', sep='')
        print(df[val].value_counts(dropna=False))
path = 'Data/nameofmyfile.csv'
# name of df
activeWD = pd.read_csv(path, skiprows=6)
f = open('Output/outtext.txt', 'a+', encoding='utf-8')
f.write(EDA(activeWD, 'activeWD'))
f.close()
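For reference, f.write() expects a string, and a function with no return statement returns None, so the call above has nothing to write. A minimal sketch of one way around this, assuming the report is built up as a string and returned instead of printed. The name build_eda_report, the sample frame, and the temporary output path are made up for illustration and stand in for the real data and Output/outtext.txt:

```python
import tempfile
from pathlib import Path

import pandas as pd


def build_eda_report(df, name):
    """Return the value_counts() report for every column as one string."""
    lines = [f"# {name}"]
    for col in df.columns:
        counts = df[col].value_counts(dropna=False)
        lines.append(f"# {col}")
        lines.append(counts.to_string())  # Series rendered as plain text
        lines.append("")                  # blank line between columns
    return "\n".join(lines)


# Hypothetical stand-in for the real CSV data.
sample = pd.DataFrame({"color": ["red", "blue", "red"], "size": [1, 2, 2]})
report = build_eda_report(sample, "sample")

# The returned string can be printed and written; the text is the same either way.
print(report)
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "outtext.txt"
    with open(out_path, "a", encoding="utf-8") as f:
        f.write(report)
    written = out_path.read_text(encoding="utf-8")
```

Returning the text keeps the function usable both interactively and for file output, which is why the sketch avoids print() inside the function.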
also tried
- various versions of replacing print with return:
def EDA(df, name):
    df.name = name  # name == string version of df
    print('#', df.name)
    for val in df.columns:
        print('# ', val, '\n', df[val].value_counts(dropna=False), '\n', sep='')
        return(df[val].value_counts(dropna=False))
- running the file from the Anaconda prompt:
Python Syntax\newdataEDA.5.py >> Output.outtext.txt
which results in the following codec error:
(base) C:\Users\auracoll\Analytic Projects\IDL Attrition>Python Syntax\newdatanewlife11.5.py >> Output.outtext.txt
sys:1: DtypeWarning: Columns (3,16,39,40,41,42,49) have mixed types. Specify dtype option on import or set low_memory=False.
Traceback (most recent call last):
  File "Syntax\newdatanewlife11.5.py", line 46, in <module>
    EDA(activeWD, name='activeWD')
  File "Syntax\newdatanewlife11.5.py", line 38, in EDA
    print(df[col].value_counts(dropna=False))
  File "C:\ProgramData\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 382-385: character maps to <undefined>
I tried encoding='utf-8' and encoding='ISO-8859-1', neither of which resolves this problem.
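The traceback points at print() encoding through cp1252, which suggests the failure is in the redirected standard output rather than in the file opened with encoding='utf-8'. A minimal sketch of that distinction, assuming the data contains characters outside cp1252. The sample text is made up, and the in-memory stream stands in for a file opened with encoding='utf-8':

```python
import io

# Hypothetical text containing characters that cp1252 cannot represent.
text = "caf\u00e9 \u4e2d\u6587"

# Encoding with cp1252 fails the same way the redirected print() does.
try:
    text.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# Printing to a UTF-8 text stream works; print(..., file=f) with a file
# opened via open(path, 'a', encoding='utf-8') behaves the same way.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
print(text, file=stream)
stream.flush()
decoded = raw.getvalue().decode("utf-8")

# On Python 3.7+, sys.stdout.reconfigure(encoding="utf-8") is another option
# when the output is redirected with >>; it is not run here.
```

Passing file=f to print(), or writing a returned string to the file directly, avoids depending on the console's code page at all.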
- I have tried to save intermediary variables, which come back as NoneType:
testvar = for val in df.columns: df[val].value_counts(dropna=False)
When I do this, testvar is a NoneType object of the builtins module.
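A for loop is a statement and does not produce a value, so it cannot be assigned to a variable. A minimal sketch of collecting the per-column results instead, assuming a dict keyed by column name is acceptable. The sample frame and the name counts_by_column are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the real data.
sample = pd.DataFrame({"color": ["red", "blue", "red"], "size": [1, 2, 2]})

# A dict comprehension is an expression, so its result can be stored.
counts_by_column = {
    col: sample[col].value_counts(dropna=False) for col in sample.columns
}

# Each entry is a Series that can be inspected or formatted later.
red_count = counts_by_column["color"]["red"]
```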
 
    