I'm embedding Ruby 2.2 into a C++ application and am trying to optimise a function of the form
VALUE toRubyString( const MyObject &o )
{
     const char *utf8 = get_string_rep_of( o );
     return rb_enc_str_new( utf8, strlen( utf8 ), rb_utf8_encoding() );
}
Profiling shows that toRubyString dominates the runtime in certain use cases. In those cases, the function gets called very often but only with a few different MyObject values. Hence, my idea is to cache the VALUE values in a std::map<MyObject, VALUE> or the like so that I can reuse them, along the lines of
std::map<MyObject, VALUE> cache;
VALUE toRubyString( const MyObject &o )
{
     std::map<MyObject, VALUE>::const_iterator it = cache.find( o );
     if ( it != cache.end() ) {
         return it->second;
     }
     const char *utf8 = get_string_rep_of( o );
     VALUE v = rb_enc_str_new( utf8, strlen( utf8 ), rb_utf8_encoding() );
     cache[o] = v;
     return v;
}
Alas, I noticed that with this modification, the Ruby interpreter eventually crashes, and the crashes disappear if I omit the return it->second; line (i.e. when the code refrains from reusing cached entries).
I suspect this is related to the garbage collector since it only happens after a couple thousand calls to the function, but even a
rb_gc_mark(v);
call (before adding the VALUE to the cache) didn't help. Does anybody have an idea as to what I might be missing here?
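
One direction that may be worth checking, as a sketch rather than a confirmed fix: rb_gc_mark only has an effect while the collector is running its mark phase, so a single call made at insertion time would not keep the cached strings alive across later collections. A cache living in C++ memory is invisible to the GC unless every stored VALUE is registered with it, for example via rb_gc_register_mark_object. The snippet below shows that "pin once on insertion" pattern in a form that compiles without the Ruby headers. PinnedCache, MakeFn and PinFn are hypothetical names introduced here for illustration. In the Ruby case the make hook would wrap rb_enc_str_new and the pin hook would call rb_gc_register_mark_object.

```cpp
#include <cstddef>
#include <map>

// Hypothetical helper (not part of the original code): a cache that calls a
// "pin" hook exactly once for every value it creates and stores.
template <typename Key, typename Value>
class PinnedCache {
public:
    typedef Value (*MakeFn)(const Key &);  // creates the value for a key
    typedef void (*PinFn)(Value);          // protects the value from the GC

    PinnedCache(MakeFn make, PinFn pin) : make_(make), pin_(pin) {}

    Value get(const Key &key) {
        typename std::map<Key, Value>::const_iterator it = cache_.find(key);
        if (it != cache_.end()) {
            return it->second;  // reuse the value that was pinned earlier
        }
        Value v = make_(key);
        pin_(v);                // with Ruby: rb_gc_register_mark_object(v)
        cache_[key] = v;
        return v;
    }

    std::size_t size() const { return cache_.size(); }

private:
    MakeFn make_;
    PinFn pin_;
    std::map<Key, Value> cache_;
};
```

Objects registered this way are never collected, which seems acceptable only because the set of distinct MyObject values is small. Since the cached string is shared between callers, it may also be sensible to freeze it (rb_obj_freeze) so that one caller cannot mutate what another one sees.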
 
    