I have created a web bot that iterates over the pages of a website, e.g. example.com/?id=int, where int is some integer. The function fetches the raw HTML with the requests library, then hands it to parseAndWrite, which extracts a div and saves its value in a SQLite database:
def archive(initial_index, final_index):
    while True:
        try:
            for i in range(initial_index, final_index):
                res = requests.get('https://www.example.com/?id='+str(i))
                parseAndWrite(res.text)
                print(i, ' archived')
        except requests.exceptions.ConnectionError:
            print("[-] Connection lost. ")
            continue
        except:
            exit(1)
        break

archive(1, 10000)
My problem is that, after some time, the loop doesn't continue to 10000 but starts repeating from a random value, resulting in many duplicate records in the database. What is causing this inconsistency?
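
To make the behavior reproducible without hitting the real site, here is a minimal sketch with the same control flow as archive. The names fake_get, archive_sim and fail_on are hypothetical and not part of my bot: fake_get stands in for requests.get and fails once at chosen indices, and the built-in ConnectionError is used in place of requests.exceptions.ConnectionError. Instead of writing to the database, it records every index that gets processed, so duplicates are easy to see.

```python
def fake_get(i, fail_on):
    """Hypothetical stand-in for requests.get: fails once per index in fail_on."""
    if i in fail_on:
        fail_on.discard(i)  # fail only the first time this index is requested
        raise ConnectionError(f"simulated drop at id={i}")
    return f"<div>{i}</div>"


def archive_sim(initial_index, final_index, fail_on):
    """Same while/try/for structure as archive(), but returns processed indices."""
    processed = []
    fail_on = set(fail_on)  # work on a copy so the caller's set is untouched
    while True:
        try:
            for i in range(initial_index, final_index):
                fake_get(i, fail_on)
                processed.append(i)  # stands in for parseAndWrite(res.text)
        except ConnectionError:
            # 'continue' goes back to the top of the while loop, which
            # builds a fresh range() starting at initial_index
            continue
        break
    return processed


result = archive_sim(1, 6, fail_on={3})
print(result)  # [1, 2, 1, 2, 3, 4, 5]
```

In this simulation, a single simulated connection error at index 3 causes indices 1 and 2 to be processed twice. Here the repeats start from initial_index rather than from a random value, which may or may not match what happens against the real site.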