(as you can see in line 4 of the assembly that the value stored in local variable x is not the address of the plt entry)
Huh?  The value isn't visible in the disassembly, only the location it's loaded from.  (In practice it's not loading a pointer to the PLT entry, but line 4 of the assembly doesn't tell you that; see footnote 1.)  Use objdump -dR to see dynamic relocations.
That's a load from memory using a RIP-relative addressing mode.  In this case it's loading a pointer to the real printf address in libc.  That pointer is stored in the Global Offset Table (GOT).
To make this work, the printf symbol gets "early binding" instead of lazy dynamic linking, avoiding PLT overhead for later uses of that function pointer.
Footnote 1: Although maybe you were basing that reasoning on the fact that it's a load instead of a RIP-relative LEA.  That pretty much does tell you it's not the PLT entry; part of the point of the PLT is to have an address that's a link-time constant for call rel32, which also enables LEA with a RIP+rel32 addressing mode.  The compiler would have used that if it wanted the PLT address in a register.
BTW, the PLT stub itself also uses the GOT entry for its memory-indirect jump; for symbols that are only used as function call targets, the GOT entry holds a pointer back to the PLT stub, to the push / jmp instructions that invoke the lazy dynamic linker to resolve that PLT entry.  i.e. to update the GOT entry.
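If it helps to see that lazy-binding dance without any linker magic, here's a toy model in plain C.  Everything in it (got_slot, lazy_resolver, add_via_plt, real_add) is a made-up name for illustration; the real mechanism is a memory-indirect jmp in the PLT stub plus the dynamic linker's resolver, not C function pointers.  It only shows the shape of the idea: the slot starts out pointing at the slow path, and the slow path overwrites the slot so later calls go straight to the target.

```c
#include <assert.h>

/* Toy model only: one "GOT slot" holding a function pointer.
 * All names here are hypothetical, not part of any real linker API. */
typedef int (*add_fn)(int, int);

/* Stands in for the real function living in a shared library. */
static int real_add(int a, int b) { return a + b; }

static int resolver_calls = 0;
static int lazy_resolver(int a, int b);

/* Before the first call, the slot points at the slow path,
 * like a GOT entry that still points back into the PLT stub. */
static add_fn got_slot = lazy_resolver;

/* Stands in for the push / jmp path into the lazy dynamic linker. */
static int lazy_resolver(int a, int b)
{
    assert(got_slot == lazy_resolver);  /* only reachable while unresolved */
    resolver_calls++;
    got_slot = real_add;                /* "resolve" = update the GOT entry */
    return got_slot(a, b);              /* then forward this first call */
}

/* Stands in for the PLT stub: an indirect jump through the slot. */
int add_via_plt(int a, int b) { return got_slot(a, b); }

/* How many times the slow path ran. */
int resolver_call_count(void) { return resolver_calls; }
```

Calling add_via_plt twice runs the resolver only once; every later call still pays for the indirection through the slot, which is the per-call overhead of the PLT.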
Don't all the calls to functions undefined in the executable go first through the plt for better performance
No, the PLT costs runtime performance by adding an extra level of indirection to every call.  gcc -fno-plt uses early binding instead of waiting for the first call, so it can inline the indirect call through the GOT right into each call site.
The PLT exists to avoid runtime fixups of call rel32 offsets during dynamic linking, and on 64-bit systems to allow reaching addresses that are more than 2GB away.  It also supports symbol interposition.  See https://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/ (written before -fno-plt existed; it's basically like one of the ideas he was suggesting).
The PLT's lazy binding can improve startup performance vs. early binding, but on modern systems where cache hits are very important, doing all the symbol lookups in one batch during startup is nice.
and for pic code?
Your code is PIC, or actually PIE (position-independent executable), which most distros configure GCC to produce by default.
I expected x to point to the address of the PLT entry of printf
If you use -fno-pie, then the address of the PLT entry is a link-time constant, and at compile time the compiler doesn't know whether you're going to link libc statically or dynamically.  So it uses mov $printf, %eax to get the function's address into a register, and if libc turns out to be a shared library, the linker can only convert that to mov $printf@plt, %eax.
See it on Godbolt.  (The Godbolt default is -fno-pie, unlike on most current Linux distros.)
# gcc9.2 -O3 -fpie    for your first block
        movq    printf@GOTPCREL(%rip), %rbp
        leaq    .LC0(%rip), %rdi
        xorl    %eax, %eax
        movq    %rbp, %rsi        # saved for later in rbp
        call    printf@PLT
vs.
# gcc9.2 -O3 -fno-pie
        movl    $printf, %esi          # linker converts this symbol reference to printf@plt
        movl    $.LC0, %edi
        xorl    %eax, %eax
        call    printf                 # will convert at link-time to printf@plt
      # next use also just uses mov-immediate to rematerialize, instead of saving a load result in a register.
So a PIE executable is actually more efficient for repeated use of function pointers to functions in standard libraries: the pointer holds the final address, not just the address of the PLT entry.
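For reference, C source along these lines is what I'd expect to compile to asm shaped like the two listings above.  This is a sketch, not the question's exact code: the function name take_printf_address and the typedef printf_fn are mine, and I'm assuming the original took printf's address into a local, printed it with %p, and used it again later.

```c
#include <stdio.h>

/* Hypothetical typedef for a pointer to a printf-like function. */
typedef int (*printf_fn)(const char *, ...);

/* Take printf's address once, print it, and hand it back for later use.
 * With -fpie the address comes from a GOT load (printf@GOTPCREL);
 * with -fno-pie it is a mov-immediate that the linker can point
 * at the PLT entry. */
printf_fn take_printf_address(void)
{
    printf_fn x = printf;

    /* Converting a function pointer to void * isn't strictly ISO C,
     * but it works on POSIX systems and is the usual way to print one. */
    printf("%p\n", (void *)x);
    return x;
}
```

Either way, C guarantees the pointer compares equal to printf and that calling through it works; what differs between the two builds is only which address that is (the real libc function vs. the executable's PLT entry).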
-fno-plt -fno-pie works more like PIE mode for taking function pointers.  Except it can still use $foo 32-bit immediates for the addresses of symbols in the same file, instead of a RIP-relative LEA.
# gcc9.2 -O3 -fno-plt -fno-pie
        movq    printf@GOTPCREL(%rip), %rbp    # saved for later in RBP
        movl    $.LC0, %edi
        xorl    %eax, %eax
        movq    %rbp, %rsi
        call    *printf@GOTPCREL(%rip)
  # pointers to static functions can use  mov $foo, %esi
It seems you need int foo(const char*,...) __attribute__((visibility("hidden"))); to tell the compiler it definitely doesn't need to go through the GOT for this symbol, with -fpie or -fno-plt.
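Spelled out as compilable C, that declaration looks something like the sketch below.  foo is just the placeholder name from the sentence above, and address_of_foo and variadic_fn are names I made up.  I've put a definition in the same file only so the example links on its own; in real use the definition would normally live in another translation unit of the same executable or shared object, which is the case where the attribute actually gives the compiler new information.  The attribute syntax is a GCC/Clang extension.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hidden visibility: a promise that foo is defined inside the same
 * executable or shared object, so its address never has to come from
 * the GOT and calls never need a PLT entry. */
int foo(const char *fmt, ...) __attribute__((visibility("hidden")));

/* Placeholder definition so this example is self-contained:
 * forwards to vprintf and returns the number of characters printed. */
int foo(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vprintf(fmt, ap);
    va_end(ap);
    return n;
}

/* Hypothetical typedef and helper: taking the address is where the
 * compiler chooses between a GOT load and LEA / mov-immediate. */
typedef int (*variadic_fn)(const char *, ...);

variadic_fn address_of_foo(void) { return foo; }
```

Compare the asm for address_of_foo with and without the attribute (and with the definition moved to a separate file) to see whether your compiler version drops the GOT load.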
Leaving it until link-time for the linker to convert symbol to symbol@plt if necessary allows the compiler to always use efficient 32-bit absolute immediates or RIP-relative addressing and only end up with PLT indirection for functions that turn out to be in a shared library.  But then you end up with pointers to PLT entries, instead of pointers to the final address.
If you were using Intel syntax, it would be mov rbp, QWORD PTR printf@GOTPCREL[rip] in GCC's output for this, if you look at asm instead of disassembly.
Looking at compiler output gives you significantly more information than just numeric offsets from RIP in plain objdump output.  -r to show relocation symbols helps some, but compiler output is generally better.  (Except you don't see that printf gets rewritten to printf@plt.)