My Observed Rules of Turtle Layering:
To illustrate my second point, consider this code:
from turtle import Turtle, Screen
a = Turtle(shape="square")
a.color("red")
a.width(6)
b = Turtle(shape="circle")
b.color("green")
b.width(3)
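b.goto(-300, 0)  # green line from the origin to the left
b.dot()
a.goto(-300, 0)  # red line over the same path
a.dot()
a.goto(300, 0)  # red line back across the window
b.goto(300, 0)  # green line drawn last, over the red one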
screen = Screen()
screen.exitonclick()
Run it and observe the result. On my system, the final goto() draws a long green line over the red one, but the green line disappears as soon as it has finished drawing. Comment out the two calls to dot() and run it again: now the green line remains over the red one. Next, replace the calls to dot() with stamp() or circle(5). Observe and formulate your own rule...
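To make those variants easier to compare, the experiment can be wrapped in a function that takes the name of the marking call. This is only a sketch: `layering_experiment`, `main` and the `mark` parameter are names I made up for illustration, and the turtles are passed in as arguments so the same sequence of calls can be replayed on any object with these methods.

```python
def layering_experiment(a, b, mark=None):
    """Replay the two-turtle experiment.

    `mark` is None (no marks), "dot", "stamp" or "circle".
    """
    a.color("red")
    a.width(6)
    b.color("green")
    b.width(3)

    for t in (b, a):  # b marks first, then a, as in the code above
        t.goto(-300, 0)
        if mark == "circle":
            t.circle(5)
        elif mark is not None:
            getattr(t, mark)()  # t.dot() or t.stamp()

    a.goto(300, 0)  # red line first
    b.goto(300, 0)  # green line drawn last


def main(mark="dot"):
    # Imported here so layering_experiment() can be exercised without a display.
    from turtle import Turtle, Screen

    layering_experiment(Turtle(shape="square"), Turtle(shape="circle"), mark)
    Screen().exitonclick()
```

Calling `main("dot")`, `main(None)`, `main("stamp")` and `main("circle")` in separate runs should reproduce the four cases described above; what you see may differ from what I see on my system.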
Now back to your example, which is badly flawed (you're actually manipulating three turtles, not two!). Here's my simplification:
from turtle import Turtle, Screen
tri = Turtle(shape="turtle")
tri.color("black")
tri.pu()
turtle = Turtle(shape="square")
turtle.shapesize(4)
turtle.color("pink")
turtle.pu()
def drag_handler(x, y):
    turtle.ondrag(None)  # disable the handler so drag events don't pile up during the move
    turtle.goto(x, y)
    turtle.ondrag(drag_handler)  # re-enable dragging
turtle.ondrag(drag_handler)
tri.bk(400)
while tri.distance(turtle) > 10:
    tri.setheading(tri.towards(turtle))
    tri.fd(5)
screen = Screen()
screen.mainloop()
You can tease tri by dragging the pink square until tri catches up with it. Ultimately, tri will land on top as long as the square isn't moving when tri catches it. If you drag the square over tri, the square will temporarily cover him, since it is the "last to arrive".
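If you'd rather not run the chase in a blocking while loop, one alternative is to drive it from a timer with `screen.ontimer()` and let `mainloop()` do the scheduling. This is a different structure from the code above, not a fix for the layering behavior, and I make no claim that it changes which turtle ends up on top. The helper names `chase_step` and `start_chase`, and the `delay_ms` value, are my own inventions for this sketch.

```python
def chase_step(chaser, target, step=5, catch_distance=10):
    """Move chaser one step toward target.

    Returns True once chaser is within catch_distance (no move is made then).
    """
    if chaser.distance(target) <= catch_distance:
        return True
    chaser.setheading(chaser.towards(target))
    chaser.fd(step)
    return False


def start_chase(screen, chaser, target, delay_ms=20):
    """Repeat chase_step on a timer until the chaser catches the target."""
    def tick():
        if not chase_step(chaser, target):
            screen.ontimer(tick, delay_ms)  # schedule the next step
    tick()
```

To try it, create `screen = Screen()` before the chase, replace the while loop with `start_chase(screen, tri, turtle)`, and keep `screen.mainloop()` as the last line.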