This is probably not the most efficient approach in general (although for this input it turns out to be faster than the other approaches presented here; see Timings below), but one option is to convert the rows of a and b to tuples and take their set difference. Note that a set is unordered, so the rows of c may not come out in their original order:
# Method 1
tmp_1 = [tuple(i) for i in a]    # -> [(1, 2), (1, 3), (1, 4)]
tmp_2 = [tuple(i) for i in b]    # -> [(1, 2), (1, 3)]
c = np.array(list(set(tmp_1).difference(tmp_2)))
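For reference, here is a self-contained sketch of Method 1. The values of a and b are assumed from the comments in the snippet above, and the call to sorted() is my addition to make the row order deterministic; it is not required for correctness.

```python
import numpy as np

# Inputs assumed from the comments in the snippet above.
a = np.array([[1, 2], [1, 3], [1, 4]])
b = np.array([[1, 2], [1, 3]])

# Rows must be hashable to go into a set, hence the tuples.
rows_a = {tuple(row) for row in a}
rows_b = {tuple(row) for row in b}

# Sets are unordered, so sort the surviving rows for a stable result.
c = np.array(sorted(rows_a - rows_b))
print(c)  # [[1 4]]
```

One caveat: duplicate rows in a are collapsed, because sets store each element only once.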
As noted by @EmiOB, this post offers some insights into why [d for d in a if d not in b] in your question does not work. Drawing from that post, you can use:
# Method 2
c = np.array([d for d in a if all(any(d != i) for i in b)])
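A runnable sketch of Method 2, again assuming the a and b from above. The helper name rows_not_in is hypothetical and only used for illustration. A row is kept only if it differs from every row of b in at least one position.

```python
import numpy as np

# Inputs assumed from the comments in Method 1.
a = np.array([[1, 2], [1, 3], [1, 4]])
b = np.array([[1, 2], [1, 3]])

def rows_not_in(a, b):
    """Return the rows of a that are not equal to any row of b."""
    # (row != other).any() is True when the two rows differ somewhere;
    # all(...) requires that to hold for every row of b.
    return np.array([row for row in a
                     if all((row != other).any() for other in b)])

c = rows_not_in(a, b)
print(c)  # [[1 4]]
```

Unlike the set-based version, this keeps the rows in their original order and does not drop duplicates.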
Remarks
The C implementation of array_contains(PyArrayObject *self, PyObject *el) in NumPy states that calling array_contains(self, el) is equivalent to
(self == el).any()
in Python, where self is a pointer to an array and el is a pointer to a Python object.
In other words:
- if arr is a numpy array and obj is some arbitrary Python object, then
obj in arr
is the same as
(arr == obj).any()
- if arr is a typical Python container such as a list, tuple, dictionary, and so on, then
obj in arr
is the same as
any(obj is _ or obj == _ for _ in arr)
(see membership test operations).
All of which is to say, the meaning of obj in arr is different depending on the type of arr.
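A small sketch of the two semantics side by side. The names arr and obj follow the text above; the value [1, 99] is an arbitrary choice of mine that shares only its first element with the rows of arr.

```python
import numpy as np

arr = np.array([[1, 2], [1, 3], [1, 4]])
obj = [1, 99]  # not a row of arr, but its first element matches column 0

# numpy array: evaluated as (arr == obj).any(), an element-wise test.
in_array = obj in arr

# Python list of lists: each row is compared to obj as a whole.
in_list = obj in arr.tolist()

print(in_array)  # True
print(in_list)   # False
```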
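This explains why the list comprehension that you proposed, [d for d in a if d not in b], does not have the desired effect: d not in b is evaluated as not (b == d).any(), which is False as soon as any single element of d equals the element in the same column of some row of b. The row [1, 4] shares its first element with the rows of b, so it is filtered out as well.
This can be confusing because it is tempting to reason that, since a numpy array is a sequence (though not a standard Python one), membership test semantics should be the same. This is not the case.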
Example:
a = np.array([[1,2],[1,3],[1,4]])
print((a == [1,2]).any())          # same as [1, 2] in a
# outputs True
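To see how this affects the comprehension from the question, here is a short sketch using the same a and the b assumed earlier. Every row of a has 1 in its first column, as does every row of b, so every row is treated as contained in b.

```python
import numpy as np

a = np.array([[1, 2], [1, 3], [1, 4]])
b = np.array([[1, 2], [1, 3]])

# The comprehension from the question: every row is rejected.
naive = [d for d in a if d not in b]
print(len(naive))  # 0

# The element-wise comparison behind `[1, 4] in b`:
mask = (b == np.array([1, 4]))
print(mask.tolist())  # [[True, False], [True, False]]
```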
Timings
For your input, I found Method 1 to be the fastest, followed by Method 2 (obtained from the post @EmiOB suggested), followed by @DanielF's approach. I would not be surprised if changing the input size changes the ordering of the timings, so take them with a grain of salt.
# Method 1
5.96 µs ± 8.92 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
# Method 2
6.45 µs ± 27.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
# @DanielF's answer
16.5 µs ± 276 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
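If you want to rerun the comparison on your own input, a minimal sketch with the standard timeit module could look like the following. The function names method_1 and method_2 are hypothetical wrappers around the two snippets above, the loop counts are arbitrary, and this is not necessarily how the figures above were produced. @DanielF's approach is left out because its code is not reproduced in this answer.

```python
import timeit
import numpy as np

a = np.array([[1, 2], [1, 3], [1, 4]])
b = np.array([[1, 2], [1, 3]])

def method_1(a, b):
    # Set difference on rows converted to tuples (row order not guaranteed).
    return np.array(list(set(map(tuple, a)).difference(map(tuple, b))))

def method_2(a, b):
    # Keep rows that differ from every row of b in at least one position.
    return np.array([d for d in a if all(any(d != i) for i in b)])

n_calls = 1000
for name, fn in [("Method 1", method_1), ("Method 2", method_2)]:
    # Best of 5 repeats, converted to microseconds per call.
    best = min(timeit.repeat(lambda: fn(a, b), number=n_calls, repeat=5))
    print(f"{name}: {best / n_calls * 1e6:.2f} µs per call")
```

Absolute numbers will differ between machines and NumPy versions, and the ranking may change for larger arrays.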