-march and -m32 / -m64 gcc options are orthogonal.  64-bit mode doesn't support pushl.
gcc -march=i486 doesn't imply -m32, so gcc -m32 is still necessary to make gcc run the assembler as as --32.
Also, GCC doesn't pass on its -march= option to GAS, using it only for C->asm compilation.
By default, GAS accepts any instructions it knows about.  So gcc -m32 -c bswap.s works, and would also accept AVX512VBMI instructions like vpmultishiftqb (%ecx){1to8}, %zmm0, %zmm1 (broadcast-load and bitfield-extract) without further options.
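A quick way to check this yourself is sketched below. Only the file name bswap.s comes from the text above; the file contents, the temp directory, and the variable names are my own assumptions, and it assumes an x86-64 Linux host with gcc and GNU binutils installed.

```shell
# Sketch only: assumes an x86-64 Linux host with gcc and GNU binutils.
tmp=$(mktemp -d)

# Hypothetical contents for bswap.s (only the file name comes from the text).
cat > "$tmp/bswap.s" <<'EOF'
    .text
    .globl  swap
swap:
    pushl   %ebx            # 32-bit push: not encodable in 64-bit mode
    bswap   %eax            # available since the 486
    vpmultishiftqb (%ecx){1to8}, %zmm0, %zmm1   # AVX512VBMI
    popl    %ebx
    ret
EOF

# -m32 makes gcc run "as --32"; GAS needs no -march to accept all of this.
if gcc -m32 -c "$tmp/bswap.s" -o "$tmp/bswap32.o" 2>/dev/null; then
    with_m32=accepted
else
    with_m32=rejected
fi

# -march=i486 alone does not select 32-bit mode, so pushl/popl fail to assemble.
if gcc -march=i486 -c "$tmp/bswap.s" -o "$tmp/bswap64.o" 2>/dev/null; then
    without_m32=accepted
else
    without_m32=rejected
fi

echo "with -m32: $with_m32; with only -march=i486: $without_m32"
```

On a typical 64-bit setup I'd expect the first attempt to be accepted and the second to be rejected, but that depends on your gcc defaulting to 64-bit mode.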
This is basically the opposite of how GCC works when compiling C to asm, where it has a default target baseline (e.g. for 32-bit mode, often i686 or i686 + SSE2, allowing instructions like CMOV).
This makes some sense because in asm, instruction choice is governed by the source.  If you don't want to use new instructions for compat with old CPUs, that's up to you.  But for GCC, where a machine is generating asm, you might want portable binaries that can run on any CPU, or any CPU newer than some baseline.  Or a binary that will use everything your CPU has (-march=native), avoiding instructions your CPU doesn't support.
If you use new instructions via inline asm, you can still compile with gcc without a -march option.  (But normally it's better to use intrinsics to have GCC emit those instructions itself, so it knows what's going on.)
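As a rough illustration of the inline-asm point, here is a minimal sketch. The function name popcount_asm and the choice of popcnt as the "new" instruction are my own assumptions, picked because popcnt isn't in GCC's default x86-64 baseline; it assumes gcc on an x86-64 host.

```shell
# Sketch only: assumes gcc on an x86-64 host.
tmp=$(mktemp -d)

cat > "$tmp/inline.c" <<'EOF'
/* Hypothetical example function; popcnt is not in GCC's default baseline. */
unsigned popcount_asm(unsigned x)
{
    unsigned r;
    /* GCC pastes this template into its asm output without checking it against -march. */
    __asm__("popcnt %1, %0" : "=r"(r) : "r"(x) : "cc");
    return r;
}
EOF

# No -march and no -mpopcnt: GCC doesn't vet the template, and GAS accepts it by default.
if gcc -O2 -c "$tmp/inline.c" -o "$tmp/inline.o"; then
    inline_result=compiled
else
    inline_result=failed
fi
echo "inline asm without -march: $inline_result"
```

With the intrinsic route (e.g. __builtin_popcount plus -mpopcnt or a suitable -march), GCC knows which instruction it is emitting and can optimize around it, which the opaque asm template doesn't allow.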
If you want to tell GAS to impose limits, e.g. to catch mistakes like accidentally using cmov or cmpxchg8b when you intended your code to be able to run on a 486, the as -march=i486 command-line option or a .arch i486 directive in the source supports that.
(See the GAS manual; the microarchitecture names are similar to what gcc -march= accepts, except for recent Intel CPUs, where GCC accepts skylake but GAS would need something like corei7+avx2+fma+movbe+bmi2, and that's still incomplete.)
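Here's a minimal sketch of the .arch directive form, assuming an x86-64 Linux host with gcc and GNU binutils. The file name limit.s, the label f, and the variable name are hypothetical.

```shell
# Sketch only: assumes an x86-64 Linux host with gcc and GNU binutils.
tmp=$(mktemp -d)

cat > "$tmp/limit.s" <<'EOF'
    .arch i486              # ask GAS to reject anything newer than a 486
    .text
f:
    bswap   %eax            # fine: bswap exists on the 486
    cmovz   %ecx, %eax      # CMOV arrived with P6 (i686), so this should be an error
    ret
EOF

# Still needs -m32, because .arch i486 can't be combined with 64-bit mode.
if gcc -m32 -c "$tmp/limit.s" -o "$tmp/limit.o" 2>/dev/null; then
    arch_result=accepted
else
    arch_result=rejected
fi
echo ".arch i486 with cmov: $arch_result"
```

Putting the limit in the source means it travels with the file instead of depending on whoever runs the build remembering a command-line flag.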
To get GCC to run as --32 -march=i486, you use
gcc -c -m32 -Wa,-march=i486 foo.s
If you omit the -m32, you get Assembler messages:
Fatal error: 64bit mode not supported on 'i486'.
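To compare the three cases side by side, something like the following sketch should work on an x86-64 Linux host with gcc and GNU binutils. The helper function try, the file contents of foo.s, and the variable names are my own assumptions.

```shell
# Sketch only: assumes an x86-64 Linux host with gcc and GNU binutils.
tmp=$(mktemp -d)

# Hypothetical foo.s that uses an instruction the 486 doesn't have.
printf '%s\n' '    cmovz %ecx, %eax' '    ret' > "$tmp/foo.s"

# Hypothetical helper: run a command quietly and report whether it succeeded.
try() { if "$@" 2>/dev/null; then echo ok; else echo error; fi; }

default=$(try gcc -c -m32 "$tmp/foo.s" -o "$tmp/a.o")                    # no limit imposed
limited=$(try gcc -c -m32 -Wa,-march=i486 "$tmp/foo.s" -o "$tmp/b.o")   # cmov not allowed
no_m32=$(try gcc -c -Wa,-march=i486 "$tmp/foo.s" -o "$tmp/c.o")         # 64-bit mode + i486

echo "default: $default, -march=i486: $limited, -march=i486 without -m32: $no_m32"
```

Only the first invocation should succeed: the second trips over cmov, and the third hits the fatal error quoted above before it even looks at the instructions.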
Fun fact: GAS has lots of other x86 options that GCC doesn't set.  I'm showing the gcc -Wa,gas-option form; if you were running as --32 directly, you'd just use as --32 -Os or whatever.
- gcc -Wa,-Os - optimize your asm for size, e.g. shortening mov $1, %rax to mov $1, %eax because that's architecturally equivalent, or test $1, %eax (5 bytes) to test $1, %al (2 bytes).
- gcc -Wa,-mbranches-within-32B-boundaries - see "How can I mitigate the impact of the Intel jcc erratum on gcc?"
- gcc -Wa,-msse2avx - encode SSE instructions with VEX prefix.
- gcc -Wa,-muse-unaligned-vector-move - translate movaps to movups and so on.  (But it can't transparently turn paddb (%ecx), %xmm0 into something that doesn't require alignment, so it's probably only useful with AVX, if you want to relax the alignment requirements for a function.  In AVX, only vmovaps / vmovdqa load/store do alignment enforcement; memory source operands for ALU instructions are like vmovups.)
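If you want to see the -Os effect for yourself, here's a minimal sketch that assembles the test example from the list with and without the option and disassembles both. The file names are hypothetical, and it assumes an x86-64 Linux host with gcc and a binutils recent enough to have -Os.

```shell
# Sketch only: assumes an x86-64 Linux host with gcc, objdump, and a GAS that supports -Os.
tmp=$(mktemp -d)

# One instruction, so the disassembly is easy to compare.
printf '%s\n' '    test $1, %eax' > "$tmp/t.s"

# Same source, assembled as written and with GAS's size optimization enabled.
gcc -c -m32 "$tmp/t.s" -o "$tmp/plain.o" &&
gcc -c -m32 -Wa,-Os "$tmp/t.s" -o "$tmp/small.o" &&
objdump -d "$tmp/plain.o" "$tmp/small.o"
```

The disassembly of plain.o should show the 5-byte test with %eax, and small.o the 2-byte form using %al. The same pattern works for eyeballing what the other options in the list do to your code.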
I've never really wanted to use any of these options (except the workaround for Skylake's JCC-erratum performance pothole), but it's neat that they exist.