You don't need to specify an operand size for the memory operand:
just use movdqu xmm0, [rsi] and let xmm0 imply the 128-bit operand-size.
NASM supports SSE/AVX/AVX-512 instructions.
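As a minimal sketch of the above (the label copy16 and the use of rdi as a destination pointer are illustrative assumptions, not something from the question), a load and store with no size keyword looks like this:

```nasm
bits 64                     ; needed for flat binary output; harmless with -f elf64
section .text
global copy16
copy16:                     ; hypothetical helper: copy 16 bytes from [rsi] to [rdi]
    movdqu  xmm0, [rsi]     ; 128-bit unaligned load; the size is implied by xmm0
    movdqu  [rdi], xmm0     ; 128-bit unaligned store; again implied by xmm0
    ret
```

This should assemble with something like nasm -f elf64, with no operand-size keyword anywhere.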
If you do want to specify an operand-size, NASM's name for 128-bit is oword; that is what ndisasm prints if you assemble that instruction and then disassemble the resulting machine code. oword = oct-word = 8x 2-byte words = 16 bytes.
Note that GNU .intel_syntax noprefix (as used by objdump -drwC -Mintel) will use xmmword ptr, unlike NASM.
If you really want to use xmmword, %define xmmword oword at the top of your file.
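A short sketch of that alias, assuming you only want the xmmword keyword itself (NASM syntax has no ptr, so this does not make xmmword ptr [rsi] work):

```nasm
bits 64
%define xmmword oword             ; make xmmword a preprocessor alias for NASM's oword

section .text
    movdqu  xmm0, xmmword [rsi]   ; the preprocessor expands this to: movdqu xmm0, oword [rsi]
```

Because this is a plain single-line macro, the assembler proper only ever sees oword.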
The operand-size is always implied by the mnemonic and / or other register operands for all SSE/AVX/AVX-512 instructions; I can't think of any instructions where you need to specify qword vs. oword vs. yword or anything, the way you do with movsx eax, byte [rdi] vs. word [rdi]. Often it's the same size as the register, but there are exceptions with some shuffle / insert / extract instructions. For example:
- SSE2: pinsrw xmm0, [rdi], 3 loads a word and merges it into bytes 6 and 7 of xmm0.
- SSE2: movq [rdi], xmm0 stores the qword low half.
- SSE1: movhps [rdi], xmm0 stores the high qword.
- AVX1: vextractf128 [rdi], ymm0, 1 does a 128-bit store of the high half.
- AVX2: vpmovzxbw ymm0, [rdi] does packed byte->word zero extension from a 128-bit memory source operand.
- AVX-512F: vpmovdb [rdi]{k1}, zmm2 narrows dword to byte elements (with truncation; other versions do saturation) and does a 128-bit store, with masking at byte granularity. (One of the only ways to do byte-granularity masking without AVX-512BW, other than legacy-SSE maskmovdqu which has cache-evicting NT semantics. So I guess that makes it especially interesting for Xeon Phi KNL.)
You could specify the size explicitly on any of those (word, qword, or oword as appropriate) to make sure the size of the memory access is what you think it is, i.e. to have NASM check it for you.
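For instance, here is a sketch of several of the instructions listed above with the memory size written out. The sizes are the ones described in the list; I am assuming a reasonably recent NASM with AVX2 support, and the registers are just the ones from the examples:

```nasm
bits 64
section .text
    pinsrw       xmm0, word [rdi], 3     ; 16-bit load, merged into word element 3
    movq         qword [rdi], xmm0       ; 64-bit store of the low half
    movhps       qword [rdi], xmm0       ; 64-bit store of the high half
    vextractf128 oword [rdi], ymm0, 1    ; 128-bit store of the high lane
    vpmovzxbw    ymm0, oword [rdi]       ; 128-bit load, widened to fill ymm0
```

If a size keyword does not match what the instruction actually accesses (e.g. oword on the pinsrw line), NASM should refuse to assemble it, which is the whole point of writing the size out.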