I recently learned that you can use `wget -r -P ./pdfs -A pdf http://example.com/` to recursively download PDF files from a website. However, this isn't cross-platform, since Windows doesn't ship with wget. I want to use Python to achieve the same thing. The only solutions I've seen are non-recursive, e.g. https://stackoverflow.com/a/54618327/3042018.
I would also like to be able to get just the names of the files without downloading them, so I can check whether a file has already been downloaded.
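To make the question concrete, here is a rough sketch of the recursive crawl I have in mind, using requests and BeautifulSoup (`collect_pdf_links` and `max_pages` are just names I made up, and I'm not sure this is the right approach):

```python
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def collect_pdf_links(start_url, max_pages=100):
    """Crawl pages under start_url and return the set of PDF URLs found."""
    domain = urlparse(start_url).netloc
    to_visit = [start_url]
    seen_pages = set()
    pdf_links = set()

    while to_visit and len(seen_pages) < max_pages:
        url = to_visit.pop()
        if url in seen_pages:
            continue
        seen_pages.add(url)

        response = requests.get(url, timeout=10)
        soup = BeautifulSoup(response.text, "html.parser")

        for anchor in soup.find_all("a", href=True):
            link = urljoin(url, anchor["href"])
            if urlparse(link).netloc != domain:
                continue  # stay on the same site, like wget -r does by default
            if link.lower().endswith(".pdf"):
                pdf_links.add(link)  # record the PDF URL without downloading it
            else:
                to_visit.append(link)  # queue other same-site pages for crawling

    return pdf_links
```

This at least gives me the file URLs without downloading anything, but I don't know if hand-rolling a crawler like this is sensible compared to using a dedicated library.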
There are so many tools available in Python. What is a good solution here? Should I use one of the "mainstream" packages like Scrapy or Selenium, or is plain requests enough? Which is the most suitable for this task, and how would I implement it?
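For the download step, something like the following is roughly what I imagine, assuming the link collection above (`download_missing` is again my own invention). It skips files whose names already exist locally, which covers my "already downloaded" check:

```python
import os

import requests


def download_missing(pdf_links, dest_dir="./pdfs"):
    """Download each PDF unless a file with the same name already exists."""
    os.makedirs(dest_dir, exist_ok=True)
    for link in pdf_links:
        filename = os.path.join(dest_dir, link.rsplit("/", 1)[-1])
        if os.path.exists(filename):
            continue  # already downloaded, skip it
        response = requests.get(link, timeout=10)
        with open(filename, "wb") as f:
            f.write(response.content)
```

Is this a reasonable direction, or is there a more idiomatic way with one of the established packages?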