I have a set of N customers, indexed 0,...,N-1. Periodically, for some subset S of customers, I need to evaluate a function f(S). Computing f(S) is of linear complexity in |S|. The set S of customers is represented as an object of type std::vector<int>. The subsets that come up for evaluation can be of different size each time. [Since the order of customers in S does not matter, the set can as well be represented as an object of type std::set<int> or std::unordered_set<int>.]
In the underlying application, I may have the same subset S of customers come up multiple times for evaluation of f(S). Instead of incurring the needless linear complexity each time, I am looking to see if it would benefit from some sort of less computational burdensome lookup.
I am considering having a map of key-value pairs where the key is directly the vector of customers, std::vector<int> S and the value mapped to this key is f(S). That way, I am hoping that I can first check to see if a key already exists in the map, and if it does, I can look it up without having to compute f(.) again.
Having an std::map with std::vector as keys is well-defined. See, for e.g., here.
CPPReference indicates that map lookup time is logarithmic. But I suppose this is logarithmic in the number of keys where each key if of a constant length -- such as an int or a double, etc. How is the complexity affected where the key itself need not be of constant length and can be of arbitrary length upto size N?
Since the keys can themselves be of different sizes (subset of customers that come up for evaluation could be different each time), does this introduce any additional complexity in computing a hash function or the compare operation for the std::map? Is there any benefit to maintain the key as a binary array a fixed length N? This binary array is such that B_S[i]=1 if the ith customer is in set S, and it is 0 otherwise. Does this make the lookup any easier?
I am aware that ultimately the design choice between reevaluating f(S) each time versus using std::map would have to be done based on actual profiling of my application. However, before implementing both ideas (the std::map route is more difficult to code in my underlying application), I would like to know if there are any known pre-existing best-practices / benchmarks.