Edit 2020-11-21: (Added section starting here)
Reason for discrepancy:
A peek at the source code of `numpy.linspace` reveals the following:
- First, the `step` is calculated from the given `start`, `stop`, and `num`. There's only one way to do this, and there's nothing wrong or imprecise in how `step` is computed.
- Successive output values are produced by taking the base value `start` and adding increasingly higher multiples of `step` to it. So, for example, the first value is `start + (0 * step)`, the next value is `start + (1 * step)`, and so on. Again, nothing wrong, unusual, or imprecise here. Remember that the primary promises of `linspace()` are to ensure uniform spacing, and to output exactly the requested number of samples.
- If `endpoint=True` in the call to `linspace()`, the last output value is forced to equal the specified `stop` value. For most use-cases, this is the more desirable behavior, rather than outputting a computed value as the last value, no matter how cleverly that last value is computed.
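The mechanics above can be sketched in a few lines (a simplified model for illustration only, not NumPy's actual implementation; `linspace_sketch` is a made-up name):

```python
import numpy as np

def linspace_sketch(start, stop, num, endpoint=True):
    # step is computed once, up front, from start/stop/num
    div = (num - 1) if endpoint else num
    step = (stop - start) / div
    # each output value is start + i*step (not cumulative addition)
    out = start + np.arange(num) * step
    if endpoint:
        out[-1] = stop  # the last value is forced to the exact stop value
    return out

x = linspace_sketch(2110.35, 2149.05, 775)
print(len(x), x[0], x[-1])
```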
So, where is the imprecision coming from?
I can think of three possible sources for the kind of discrepancy being discussed here:
- Wrong expectation for the given input values of `start`, `stop` and `num`: in general, the first thing to check is whether the expected value is even consistent with the inputs in exact arithmetic. Here it is: with `start=2110.35`, `stop=2149.05`, and `num=775`, the ideal spacing is `(stop - start)/(num - 1) = 38.7/774 = 0.05` exactly, so in an ideal world with no FP limitations, `x[-3]` would indeed equal `2148.95`. (Had `num` been, say, `776`, the ideal spacing would be `38.7/775 ≈ 0.0499355`, and the expectation `x[-3] == 2148.95` would be wrong from the start.) So in this particular case the expectation is sound, and the discrepancy must come from the FP effects described next.
- The `start` and `stop` values themselves happen to be incapable of being represented exactly in FP. (Remember that most decimal fractions can be represented in FP only with some imprecision; the notable exceptions are fractions whose denominators are powers of 2.) So, with `start` and `stop`, what we see is not what the computer sees. This obviously affects the computation of `step`. The imprecision in this computed `step` is `(num - 1)` times smaller than the imprecision in `(stop - start)`. However, note that towards the end of the output sequence, this imprecision has an increasing impact, as increasingly higher multiples of `step` get added to `start`. Here's a demonstration of this:
    print (format(2110.35, ".40g"))
    print (format(2149.05, ".40g"))
    print (format(0.125, ".40g"))  # 0.125 is an FP-friendly value (it is 2**(-3))
outputs:
    2110.349999999999909050529822707176208496
    2149.050000000000181898940354585647583008
    0.125
- (As the OP is already aware) each computed output value, however precisely the computation is done, may or may not be exactly representable in FP. In practice, this inherent imprecision of FP will affect most of the generated output values.
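The effect of these imprecisions on `step` and on the output values can be checked directly (a quick demonstration, extending the one above):

```python
import numpy as np

start, stop, num = 2110.35, 2149.05, 775

# the computed step inherits the imprecision of start and stop:
step = (stop - start) / (num - 1)
print(format(step, ".40g"))  # close to, but not exactly, 0.05
print(step == 0.05)          # → False

# and the computed output values carry that imprecision too:
x = np.linspace(start, stop, num)
y = 2148.95
print(format(x[-3], ".40g"))
print(format(y, ".40g"))
print(x[-3] == y)
```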
Hypothetically speaking, it could be argued that the numpy developers could have inserted extra interventional logic: somehow figure out which of the output values can be represented exactly in FP, and for those positions, output forced true values rather than computed values. Such an intervention is in fact already done for the last output value, when `endpoint=True`. However, intervening everywhere would violate the primary promise of uniform spacing, in addition to affecting speed.
Another hypothetical debate is whether the successive values could be generated by cumulative addition, rather than by `start + i*step`. That is, the second value could be computed as `first + step`, the third as `second + step`, and so on. I doubt that would gain us any precision: the factors causing the imprecision now (the inherent limits of FP representation, and the imprecision in `start`, `stop` and `step`) would still apply, and each cumulative addition introduces one more rounding, so errors can actually accumulate as the sequence progresses. It would also probably be slower, since each value can be computed only after the previous one.
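One way to compare the two strategies empirically (a rough sketch, not NumPy's actual code; it relies on the fact that for these inputs the ideal grid has an exact spacing of 38.7/774 = 1/20, which lets us measure errors with exact rational arithmetic):

```python
from fractions import Fraction

import numpy as np

start, stop, num = 2110.35, 2149.05, 775
step = (stop - start) / (num - 1)

# strategy used by linspace: start + i*step (independent roundings per element)
direct = start + np.arange(num) * step

# hypothetical alternative: cumulative addition (one extra rounding per element)
cumulative = np.empty(num)
cumulative[0] = start
for i in range(1, num):
    cumulative[i] = cumulative[i - 1] + step

# measure each against the exact decimal grid, using exact rationals
exact = [Fraction("2110.35") + i * Fraction(1, 20) for i in range(num)]
err_direct = max(abs(Fraction(float(v)) - e) for v, e in zip(direct, exact))
err_cumulative = max(abs(Fraction(float(v)) - e) for v, e in zip(cumulative, exact))
print(float(err_direct), float(err_cumulative))
```

Errors in the cumulative variant can grow with the index, since each element stacks one more rounded operation on top of all the previous ones, and the loop also serializes the computation.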
To summarize: rather than the linspace algorithm being sub-optimal, the real culprits are the inherent limitations of FP representation, which affect `start`, `stop`, the computed `step`, and the computed output values -- plus, in general, the possibility of a wrong expectation (or wrong input), which is always worth ruling out first.
Edit 2020-11-21: (Added section ends here)
Your little test:

    print(y, type(y))

producing the output:

    2148.95 <class 'numpy.float64'>

does not necessarily prove that `y` contains exactly `2148.95`.
That's because, when you call `print(y)`, it internally invokes `str(y)`, and for floats `str()` produces a shortened decimal form (typically the shortest string that parses back to the same float), not the exact stored value.
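A quick way to see the difference between what `print` shows and what is actually stored:

```python
import numpy as np

y = float(np.linspace(2110.35, 2149.05, 775)[-3])
print(str(y))              # the short, round-trip decimal form that print() shows
print(format(y, ".40g"))   # the exact stored value, to 40 significant digits
print(float(str(y)) == y)  # → True: the short form parses back to the same double
```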
Suggested Actions:
- Re-check the expectation before changing the inputs: here, `num=775` already corresponds to an ideal step of exactly `0.05` (since `(2149.05 - 2110.35)/(775 - 1) = 38.7/774 = 0.05`), so the inputs match the expected spacing and need not be changed; the residual discrepancy is purely FP noise.
- To perform a test such as `if x[-3] <= y:`, do something like `if round(x[-3], 5) <= round(y, 5):` instead (here, I'm assuming that you care about no more than 5 digits after the decimal point).
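For instance (the value `x_val` below is a hypothetical stand-in for a computed `x[-3]`, chosen purely for illustration):

```python
import math

x_val = 2148.950000000001   # hypothetical stand-in for a computed x[-3]
y = 2148.95

print(x_val <= y)                            # → False: FP noise breaks the raw test
print(round(x_val, 5) <= round(y, 5))        # → True: compare at 5 decimal places
print(math.isclose(x_val, y, abs_tol=1e-9))  # → True: tolerance-based alternative
```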