Given a list of purchase events (customer_id, item):
1-hammer
1-screwdriver
1-nails
2-hammer
2-nails
3-screws
3-screwdriver
4-nails
4-screws
I'm trying to build a data structure that tells me how many times an item was bought with another item. "With" here doesn't mean bought at the same time; it means bought by the same customer at any point since I started saving data. The result would look like:
{
       hammer : {screwdriver : 1, nails : 2}, 
  screwdriver : {hammer : 1, screws : 1, nails : 1}, 
       screws : {screwdriver : 1, nails : 1}, 
         nails : {hammer : 2, screws : 1, screwdriver : 1}
}
indicating that a hammer was bought with nails twice (persons 1 and 2) and with a screwdriver once (person 1), screws were bought with a screwdriver once (person 3), and so on.
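To pin down exactly what I mean, here is a minimal brute-force sketch that builds that structure from the sample events. The function name co_purchase_counts is just something I made up for illustration, and it assumes a customer who buys the same item more than once is only counted once for that item. It is meant as a reference for the expected output, not as the efficient solution I'm asking about.

```python
from collections import defaultdict
from itertools import permutations

# Sample events from above as (customer_id, item) tuples.
events = [
    (1, "hammer"), (1, "screwdriver"), (1, "nails"),
    (2, "hammer"), (2, "nails"),
    (3, "screws"), (3, "screwdriver"),
    (4, "nails"), (4, "screws"),
]

def co_purchase_counts(events):
    # Hypothetical helper: first collect each customer's distinct items.
    # Assumption: repeat purchases of the same item count only once.
    items_by_customer = defaultdict(set)
    for customer, item in events:
        items_by_customer[customer].add(item)

    # counts[a][b] = number of customers who bought both a and b.
    counts = defaultdict(lambda: defaultdict(int))
    for items in items_by_customer.values():
        # Every ordered pair of distinct items bought by this customer.
        for a, b in permutations(items, 2):
            counts[a][b] += 1

    # Convert to plain dicts so the result prints like the structure above.
    return {a: dict(bs) for a, bs in counts.items()}

result = co_purchase_counts(events)
```

The work per customer is quadratic in the number of distinct items that customer bought, which is fine for the sample but is part of why I'm asking about efficiency.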
My current approach is:
users = dict where userid is the key and a list of items bought is the value
usersForItem = dict where itemid is the key and a list of users who bought the item is the value
userlist = temporary list of users who have bought the current item
pseudo:
for each event(customer,item)(sorted by item):
  add user to users dict if not exists, and add the item to that user's list
  add item to usersForItem dict if not exists, and add the user to that item's list
----------
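users = {}         # user id -> list of items bought by that user
usersForItem = {}  # item id -> list of users who bought that item
items = []         # distinct items, in the order they are encountered
userlist = []      # users who bought the item currently being processed
last_item = None
# rows holds (item, user) pairs and must be sorted by item
for item, user in rows:
  # add the user to the users dict if they don't already exist,
  # then append the current item to the list of items bought by that user
  users.setdefault(user, []).append(item)
  if item != last_item:
    # we just started a new item, which means we just finished processing an item;
    # write the userlist for the last item to the usersForItem dictionary.
    if last_item is not None:
      usersForItem[last_item] = userlist
    userlist = [user]
    last_item = item
    items.append(item)
  else:
    userlist.append(user)
# store the userlist for the final item (skipped when rows is empty)
if last_item is not None:
  usersForItem[last_item] = userlist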
So, at this point, I have 2 dicts: who bought what, and what was bought by whom. Here's where it gets tricky. Now that usersForItem is populated, I loop through it, loop through each user who bought the item, and look at that user's other purchases. I acknowledge that this is not the most pythonic way of doing things. I'm trying to make sure I get the correct result (which I am) before getting fancy with the Python.
relatedItems = {}
# .items() works on both Python 2 and 3 (iteritems() is Python 2 only)
for key, listOfUsers in usersForItem.items():
  relatedItems[key] = {}
  related = []  # distinct items bought alongside key
  for ux in listOfUsers:
    for itemRead in users[ux]:
      if itemRead != key:
        if itemRead not in related:
          related.append(itemRead)
        # count one more co-purchase of itemRead with key
        relatedItems[key][itemRead] = relatedItems[key].get(itemRead, 0) + 1
  # pseudo: calc jaccard/tanimoto similarity between relatedItems[key] and its values
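The similarity step is only pseudocode above. Here is roughly what I have in mind, as a sketch rather than my actual code: it assumes the similarity between two items is computed over the sets of users who bought each one, i.e. |A ∩ B| / |A ∪ B|. The name jaccard and the sample dict are illustrative only.

```python
def jaccard(buyers_a, buyers_b):
    """Jaccard (Tanimoto) similarity of two collections of user ids."""
    a, b = set(buyers_a), set(buyers_b)
    union = a | b
    if not union:
        # Assumption: two items with no buyers at all get similarity 0.
        return 0.0
    return len(a & b) / float(len(union))

# Illustrative data in the same shape as usersForItem, from the sample events.
sample = {
    "hammer": [1, 2],
    "nails": [1, 2, 4],
    "screws": [3, 4],
}

hammer_vs_nails = jaccard(sample["hammer"], sample["nails"])
hammer_vs_screws = jaccard(sample["hammer"], sample["screws"])
```

The intersection size is the same number stored in relatedItems[key][itemRead], so in principle the similarity could be derived from the counts plus the length of each item's user list.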
Is there a more efficient way that I can be doing this? Additionally, if there is a proper academic name for this type of operation, I'd love to hear it.
edit: clarified to include the fact that I'm not restricting purchases to items bought together at the same time. Items can be bought at any time.
 
     
     
    