First of all, you should usually replace inline asm (with intrinsics or pure C) instead of porting it.  https://gcc.gnu.org/wiki/DontUseInlineAsm
clang -fasm-blocks is mostly compatible with MSVC's inefficient inline asm syntax.  But it doesn't support returning a value by leaving it in EAX and then falling off the end of a non-void function.
So you have to write inline asm that puts the value in a named C variable and return that variable, which typically costs an extra store/reload and makes MSVC syntax even worse.  (That's pretty bad unless you're writing a whole loop in asm that amortizes the store/reload overhead of getting data into / out of the asm block.)  See What is the difference between 'asm', '__asm' and '__asm__'? for a comparison of how inefficient MSVC inline asm is when wrapping a single instruction.  It's less dumb inside functions with stack args when those functions don't inline, but that only happens if you're already making things inefficient (e.g. using legacy 32-bit calling conventions and not using link-time optimization to inline small functions).
When an asm-block input like A (see the foo example below) is a constant 1 in the caller, MSVC can substitute the immediate 1 into the asm when inlining, but clang can't.  Both defeat constant-propagation, but MSVC at least avoids bouncing constant inputs through a store/reload.  (As long as you only use it with instructions that can accept an immediate source operand.)
Clang accepts __asm, asm, or __asm__ to introduce an asm-block.  MSVC accepts __asm (2 underscores like clang) or _asm (more commonly used, but clang doesn't accept it).
So for existing MSVC code you probably want #define  _asm  __asm so your code can compile with both MSVC and clang, unless you need to make separate versions anyway.  Or use clang -D_asm=asm to set a CPP macro on the command line.
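A minimal guard like this (a sketch; it assumes your existing code spells the keyword _asm) keeps one source tree compiling under both compilers:

#ifndef _MSC_VER
#define _asm __asm      // let clang -fasm-blocks parse MSVC-style _asm blocks
#endif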
Example: compile with MSVC or with clang -fasm-blocks
(Don't forget to enable optimization: clang -fasm-blocks -O3 -march=native -flto -Wall.  Omit or modify -march=native if you want a binary that can run on earlier/other CPUs than your compile host.)
int a_global;
inline
long foo(int A, int B, int *arr) {
    int out;
    // You can't assume A will be in RDI: after inlining it prob. won't be
    __asm {
        mov   ecx, A                   // comment syntax
        add   dword ptr [a_global], 1
        mov   out, ecx
    }
    return out;
}
Compiling with x86-64 Linux clang 8.0 on Godbolt shows that clang can inline the wrapper function containing the inline-asm, and how much store/reload MSVC syntax entails (vs. GNU C inline asm which can take inputs and outputs in registers).
I'm using clang in Intel-syntax asm output mode, but it also compiles Intel-syntax asm blocks when it's outputting in AT&T syntax mode.  (Normally clang compiles straight to machine-code anyway, which it also does correctly.)
## The x86-64 System V ABI passes args in rdi, rsi, rdx, ...
# clang -O3 -fasm-blocks -Wall
foo(int, int, int*):
        mov     dword ptr [rsp - 4], edi        # compiler-generated store of register arg to the stack
        mov     ecx, dword ptr [rsp - 4]        # start of inline asm
        add     dword ptr [rip + a_global], 1
        mov     dword ptr [rsp - 8], ecx        # end of inline asm
        movsxd  rax, dword ptr [rsp - 8]        # reload `out` with sign-extension to long (64-bit) : compiler-generated
        ret
Notice how the compiler substituted [rsp - 4] and [rsp - 8] for the C local variables A and out in the asm source block.  And that a variable in static storage gets RIP-relative addressing.  GNU C inline asm doesn't do this: you declare %[name] operands with constraints, and the compiler substitutes whatever register or memory location it chose into your asm template.
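For contrast, a rough GNU C inline asm sketch of roughly the same operation might look like this (AT&T syntax; foo_gnu is a hypothetical name, not part of the example above).  The compiler picks a register for each "r" operand, so there's no forced store/reload through the stack:

long foo_gnu(int A) {
    int out;
    asm("movl %[a], %[o]\n\t"
        "addl $1, %[g]"
        : [o] "=r"(out), [g] "+m"(a_global)   // outputs: out in any register, a_global as a read/write memory operand
        : [a] "r"(A));                        // input: A in any register
    return out;
}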
We can even see clang inline that function twice into one caller, and optimize away the sign-extension to 64-bit because this function only returns int.
int caller() {
    return foo(1, 2, nullptr) + foo(1, 2, nullptr);
}
caller():                             # @caller()
        mov     dword ptr [rsp - 4], 1
        mov     ecx, dword ptr [rsp - 4]      # first inline asm
        add     dword ptr [rip + a_global], 1
        mov     dword ptr [rsp - 8], ecx
        mov     eax, dword ptr [rsp - 8]     # compiler-generated reload
        mov     dword ptr [rsp - 4], 1       # and store of A=1 again
        mov     ecx, dword ptr [rsp - 4]      # second inline asm
        add     dword ptr [rip + a_global], 1
        mov     dword ptr [rsp - 8], ecx
        add     eax, dword ptr [rsp - 8]     # compiler-generated reload
        ret
So we can see that just reading A from inline asm creates a missed-optimization: the compiler stores a 1 again even though the asm only read that input without modifying it.
I haven't done tests like assigning to or reading a_global before/between/after the asm statements to make sure the compiler "knows" that variable is modified by the asm statement.
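A hypothetical test could look like the sketch below (test_global_visibility is my name, not something from the Godbolt link): if the compiler wrongly assumed a_global was untouched by the asm block, it could constant-propagate 5 into the return value instead of reloading after the inlined asm.

int test_global_visibility() {
    a_global = 5;
    foo(1, 2, nullptr);      // the asm block increments a_global
    return a_global;         // should reload (or fold to 6), not assume it's still 5
}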
I also haven't tested passing a pointer into an asm block and looping over the pointed-to array, to see if it's like a "memory" clobber in GNU C inline asm.  I'd assume it is.
My Godbolt link also includes an example of falling off the end of a non-void function with a value in EAX.  That's supported by MSVC, but for clang it's UB as usual and breaks when inlining into a caller (strangely with no warning, even at -Wall).  You can see how x86 MSVC compiles it on my Godbolt link above.
Porting MSVC asm to GNU C inline asm is almost certainly the wrong choice.  Compiler support for optimizing intrinsics is very good, so you can usually get the compiler to generate good-quality efficient asm for you.
If you're going to do anything to existing hand-written asm, usually replacing it with pure C will be the most efficient, and certainly the most future-proof, path forward.  Code that can auto-vectorize to wider vectors in the future is always good.  But if you do need to manually vectorize for some tricky shuffling, then intrinsics are the way to go unless the compiler makes a mess of it somehow.
Look at the compiler-generated asm you get from intrinsics to make sure it's as good or better than the original.
If your code uses MMX (and thus needs EMMS), now is probably a good time to replace that MMX code with SSE2 intrinsics.  SSE2 is baseline for x86-64, and few Linux systems are running obsolete 32-bit kernels.
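For example, a typical MMX-era operation like adding packed 16-bit integers could become something like this SSE2 sketch (a hypothetical example, not code from the question):

#include <emmintrin.h>   // SSE2 intrinsics
#include <stdint.h>

void add8_epi16(int16_t *dst, const int16_t *a, const int16_t *b) {
    __m128i va = _mm_loadu_si128((const __m128i*)a);            // unaligned 128-bit loads
    __m128i vb = _mm_loadu_si128((const __m128i*)b);
    _mm_storeu_si128((__m128i*)dst, _mm_add_epi16(va, vb));     // 8 x 16-bit adds
    // No EMMS needed: SSE2 uses XMM registers, not the shared x87/MMX state.
}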