It depends on what exactly is in the Fence function. In particular, it depends on what's between the fence and the rdtsc. It also depends on what comes after rdtsc.
Consider the lfence case where rdtsc is at the top of the timed region. Since Fence is invoked with a call instruction, there is probably a ret at the end of that function to return to the following rdtsc. This means that there is at least a ret between the lfence and the rdtsc. Most probably the ret here is of the form C3, which gets decoded and allocated into the reservation station as two uops on modern Intel and AMD processors. These uops load the return address from the stack and verify the prediction, so there is a true data dependency between them, and current processors don't use value prediction to break it.
If the load hits in the L1D and the DTLB or STLB, or if the value is forwarded from the store buffer (this is possible because lfence doesn't wait for the store buffer to drain), it's unlikely that there will be a difference between having lfence placed immediately before rdtsc and having a ret between the two instructions. But if the load takes a long time, rdtsc may have already executed, and later instructions would be in flight in the backend as well. After the load completes, there is still another uop from ret waiting in the RS to be executed. This uop consumes certain resources, could interfere with the other uops in the timed region, and may affect the measured time. Note that even with your simple Fence function, a hardware interrupt could occur just before the ret, making store forwarding impossible and possibly evicting the return address from the L1D. In any case, unless you hit a pathological instruction sequence in the timed region, this only matters if you really want extreme precision.
You'd normally want to place lfence immediately before rdtsc. You can use a macro instead of a function or force the compiler to inline the function if possible (but even then you still have to examine the generated asm code and make sure it's what you want).
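As a rough illustration of placing lfence immediately before rdtsc, here is a minimal sketch in C. It assumes GCC or Clang on x86-64 and uses the `_mm_lfence` and `__rdtsc` intrinsics from `<x86intrin.h>`; the helper name `fenced_rdtsc` is hypothetical and not taken from your code. The `always_inline` attribute is there so that no call/ret pair ends up between the fence and the timestamp read.

```c
#include <stdint.h>
#include <x86intrin.h>   /* GCC/Clang header providing _mm_lfence and __rdtsc */

/* Hypothetical helper: lfence followed directly by rdtsc.
 * Forced inlining keeps a ret from sitting between the two instructions. */
static inline __attribute__((always_inline)) uint64_t fenced_rdtsc(void)
{
    _mm_lfence();        /* earlier instructions complete locally first */
    return __rdtsc();    /* then read the time-stamp counter */
}

/* Hypothetical usage: time a small loop with the helper above. */
static uint64_t time_loop(unsigned iterations)
{
    volatile unsigned sink = 0;           /* volatile so the loop is not removed */
    uint64_t start = fenced_rdtsc();
    for (unsigned i = 0; i < iterations; i++)
        sink += i;
    uint64_t end = fenced_rdtsc();
    return end - start;                   /* elapsed TSC ticks, not core cycles */
}
```

Even with this, it is worth checking the disassembly (for example with `objdump -d`) to confirm that lfence and rdtsc are actually adjacent in the generated code.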
sfence doesn't interact with ret or rdtsc, so there are no ordering effects with respect to these instructions. mfence forces the load from ret to wait until most earlier memory-related operations reach the point of global observability or persistence. mfence and sfence alone don't serialize rdtsc.