I have two data frames of volume by date. Both hold the same data, but one is filtered. I'd like to plot a trendline of the ratio of filtered to unfiltered volume on any given day, but I'm having a hard time reshaping the data frames so that they're comparable. Here's an example:
unFiltered <- data.frame(date = c("01-01-2015", "01-01-2015", "01-02-2015"), item = c("item1", "item2", "item1"), volume = c(100, 100, 50))
filtered <- data.frame(date = c("01-01-2015", "01-03-2015"), item = c("item1", "item1"), volume = c(10, 40))
From these data sets, I'd like to construct a third data set that is "The percentage of unfiltered item-volume that is being filtered". That is, I want a data frame that will look like this:
  date          item    percentage
1 "01-01-2015"  item1   .1
2 "01-01-2015"  item2   0
3 "01-02-2015"  item1   0
4 "01-02-2015"  item2   0
5 "01-03-2015"  item1   .8
6 "01-03-2015"  item2   0
(Note: neither data frame has 6 entries, but the resulting data frame should have one row for every combination of a unique date and a unique item.)
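To make the target concrete, here is a rough sketch of the computation described above, using only base R (`expand.grid` and `merge`). It rests on a few assumptions of mine: the names `grid` and `out` are just placeholders, `stringsAsFactors = FALSE` is added so the dates and items stay plain character strings, and any date/item combination missing from either data frame counts as 0. I don't know whether this is the idiomatic way to do it.

```r
# Sample data from above; stringsAsFactors = FALSE is an added assumption
# so that date and item are character vectors rather than factors.
unFiltered <- data.frame(date = c("01-01-2015", "01-01-2015", "01-02-2015"),
                         item = c("item1", "item2", "item1"),
                         volume = c(100, 100, 50),
                         stringsAsFactors = FALSE)
filtered <- data.frame(date = c("01-01-2015", "01-03-2015"),
                       item = c("item1", "item1"),
                       volume = c(10, 40),
                       stringsAsFactors = FALSE)

# Build every date/item combination that appears in either data frame.
grid <- expand.grid(date = sort(unique(c(unFiltered$date, filtered$date))),
                    item = sort(unique(c(unFiltered$item, filtered$item))),
                    stringsAsFactors = FALSE)

# Left-join both volume columns onto the grid; combinations that are
# absent from a data frame come through as NA.
out <- merge(grid, unFiltered, by = c("date", "item"), all.x = TRUE)
out <- merge(out, filtered, by = c("date", "item"), all.x = TRUE,
             suffixes = c(".unfiltered", ".filtered"))

# Ratio of filtered to unfiltered volume; treat undefined ratios as 0
# (an assumption, not something the data dictates).
out$percentage <- out$volume.filtered / out$volume.unfiltered
out$percentage[is.na(out$percentage)] <- 0

out <- out[, c("date", "item", "percentage")]
print(out)
```

One caveat: with the sample data exactly as typed, 01-03-2015 has no unfiltered volume, so this sketch gives 0 for that row rather than the .8 in the table. The .8 would only come out if the filtered 40 and the unfiltered 50 fell on the same date.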
Does anyone have any ideas? I've been stuck on this for ~2 hours, fumbling around with for loops, merges, joins, manually creating data frames, etc. If anyone has a solution, would you mind explaining what's going on in it, too? (I still kind of suck at R, and I often read code that someone has written without having any idea why it actually works.)