I need your help to automate branching a dictionary. I am iterating through lines of a big dataset with over 100 million lines. I split each line and select the parts of interest:
#I quickly wrote this to create a database so that you can test the script
fruits = ['apple','banana','citron']
cars = ['VW', 'Opel', 'Fiat']
countries = ['Bosnia','Egypt','USA','Ireland']
genomic_contexts = ['CDS', 'UTR5', 'UTR3', 'Intron']
database=[[fruits[random.randint(0,2)],cars[random.randint(0,2)],
countries[random.randint(0,3)],genomic_contexts[random.randint(0,3)]] 
for x in range(100)]
A_B_C_D_dict = {}
for line in database:
    line = line.split(',')
    A = line[0]
    B = line[1]
    C = line[2]
    D = line[3]
    #creating dict branch, if not existing yet and counting the combinations
    A_B_C_D_dict[A]=my_dict.get(A,{})
    A_B_C_D_dict[A][B]=my_dict[A].get(B,{})
    A_B_C_D_dict[A][B][C]=my_dict[A][B].get(C,{})
    A_B_C_D_dict[A][B][C][D]=my_dict[A][B][C].get(D,0)
    A_B_C_D_dict[A][B][C][D] += 1
Now I would like to define a function that does this automatically instead of always writing the branch manually (this should work for different branch lengths, not always 4)! My code should look like this:
for line in database:
    line = line.split(',')
    A = line[0]
    B = line[1]
    C = line[2]
    D = line[3]
    add_dict_branch('A_B_C_D_dict',0)
    A_B_C_D_dict[A][B][C][D] += 1
My try for such a function is the following but I might be humiliating myself:
def select_nest(dict_2,keys,last,counter=0):
    if last == 0:
        return dict_2
    if counter == last:
        return dict_2[globals()[keys[counter-1]]]
    else:
        return select_nested(
        dict_2[globals()[keys[counter-1]]],keys,last,counter+1)
def add_dict_branch(dict_1,end_type):
    if type(dict_1) != type(str()):
        raise KeyError(dict_1," should be string!")
    keys = dict_1.split('_')
    keys = keys[:len(keys)-1]
    for x in range(len(keys)):
        key = globals()[keys[x]]
        if x < len(keys)-1:
            select_nest(globals()[dict_1],keys,x)[key] = \
            select_nest(globals()[dict_1],keys,x).get(key,{})
        else:
            select_nest(globals()[dict_1],keys,x)[key] = \
            select_nest(globals()[dict_1],keys,x).get(key,end_type)
Any comments would be great, either pointing out my mistake or suggestions to new approaches. I really need to write my code with respect to performance. If it is slow I can't use it because of the several million iterations.
 
     
    