Yes, there are a few possible choices on x86.
XADD r/m, r
This instruction adds the second operand (r) to the first operand (r/m) and loads the second operand (r) with the original value of the first operand (r/m).
To use it, load the second operand with the amount of the increment (presumably 1 here) and make the first operand the memory location of what's being incremented.
On its own XADD is not atomic with respect to other processors; it must be preceded by the LOCK prefix, which makes the whole read-modify-write atomic.
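The InterlockedExchangeAdd() function in Microsoft Visual C++ does this (it returns the original value, like XADD) and, AFAIR, uses LOCK XADD if it's available (XADD has been available since the i80486). InterlockedAdd() is the closely related variant that returns the new value instead.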
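For a portable way to get the same fetch-and-add behaviour, here is a minimal sketch using C++11 std::atomic. The names fetch_and_add and xadd_self_check are made up for illustration and are not from the question. On x86, compilers typically lower fetch_add to LOCK XADD when the returned value is used, but that is a code generation detail and not something the standard guarantees.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically adds `amount` to `target` and returns the
// value that was stored there before the addition. This mirrors what
// LOCK XADD leaves in its register operand.
int fetch_and_add(std::atomic<int>& target, int amount) {
    return target.fetch_add(amount);
}

// Small usage check (single-threaded, so it only shows the return-value
// semantics, not the atomicity itself).
void xadd_self_check() {
    std::atomic<int> n{5};               // stands in for l.n
    int before = fetch_and_add(n, 1);    // the increment from the question
    assert(before == 5);                 // original value is returned
    assert(n.load() == 6);               // memory now holds the sum
}
```

If the old value is not needed, n.fetch_add(1) or ++n on the std::atomic<int> is enough.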
Another way is to use a loop with the CMPXCHG instruction...
Pseudocode:
    while (true)
    {
        int oldValue = l.n;
        int newValue = oldValue + 1;
        if (CAS(&l.n, newValue, oldValue) == oldValue)
            break;
    }
CAS stands for Compare And Swap (a common term in concurrent programming). It is a function that tries to atomically replace a value in memory with a new value. The replacement succeeds only when the value currently in memory is equal to the last supplied parameter, oldValue; otherwise it fails and memory is left unchanged.
CAS returns the original value from memory, so comparing the returned value with oldValue tells us whether the replacement succeeded. A failure (the returned value differs from oldValue) means that another thread changed the value in memory between our read of oldValue and our attempt to replace it with newValue. In that case we simply retry the whole procedure.
The CMPXCHG instruction (used with the LOCK prefix) is the x86 CAS.
In Microsoft Visual C++, InterlockedCompareExchange() uses CMPXCHG to implement CAS.
If XADD isn't available, InterlockedAdd() is implemented using the CAS/CMPXCHG/InterlockedCompareExchange().
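The retry loop from the pseudocode can be sketched in standard C++ as follows. This is an illustration under assumptions, not how any particular library implements it: cas_fetch_add and cas_self_check are hypothetical names, and std::atomic's compare_exchange_weak plays the role of CAS (on x86 it is typically compiled to LOCK CMPXCHG).

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: an atomic add built only from compare-and-swap.
// Returns the value that was in `target` before the addition.
int cas_fetch_add(std::atomic<int>& target, int amount) {
    int oldValue = target.load();
    // compare_exchange_weak stores oldValue + amount only if target still
    // holds oldValue. On failure it copies the current contents of target
    // into oldValue, so the next iteration recomputes the sum from a fresh
    // value. The weak form may also fail spuriously, which the loop absorbs.
    while (!target.compare_exchange_weak(oldValue, oldValue + amount)) {
        // nothing to do here: oldValue was refreshed by the failed attempt
    }
    return oldValue;
}

// Small usage check (single-threaded, shows the semantics only).
void cas_self_check() {
    std::atomic<int> n{41};              // stands in for l.n
    int before = cas_fetch_add(n, 1);
    assert(before == 41);
    assert(n.load() == 42);
}
```

Unlike the pseudocode, compare_exchange_weak returns a bool and reports the observed value through its first argument, so the explicit re-read of l.n at the top of the loop is not needed.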
On some other CPUs there can be other possibilities. Some allow atomic execution of a few adjacent instructions.