2

I have the following page:

http://www.fda.gov/downloads/scienceresearch/fieldscience/laboratorymanual/ucm092156.pdf

I would like to find the pages on www.fda.gov that link to this page. How can I do that?

Norfeldt
  • 266

1 Answer

3
  1. You can use wget to recursively download the entire website:

    wget --recursive --page-requisites --html-extension --no-parent --domains www.fda.gov www.fda.gov

  2. You can then use grep to recursively search through all the downloaded files and list the pages that link to ucm092156.pdf:

    grep -r -l 'ucm092156\.pdf' www.fda.gov/
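
If you need to repeat the search for other files, step 2 can be wrapped in a small function. This is only a sketch: the name `find_linking_pages` is made up for illustration, and it assumes a grep that supports `--include` (GNU grep does) and that the mirrored pages were saved with an `.html` or `.htm` suffix, which the `--html-extension` option in step 1 should take care of.

```shell
# Hypothetical helper (not part of wget or grep): list the downloaded
# HTML files that mention a given file name.
# Usage: find_linking_pages <mirror-directory> <file-name>
find_linking_pages() {
    mirror_dir=$1
    target=$2
    # -r: recurse into subdirectories
    # -l: print only the names of files that contain a match
    # -F: treat the target as a fixed string, so "." is not a wildcard
    # --include: only search HTML files (assumes GNU grep or compatible)
    grep -r -l -F --include='*.html' --include='*.htm' -- "$target" "$mirror_dir" | sort
}
```

Calling `find_linking_pages www.fda.gov ucm092156.pdf` from the directory where wget was run should print one path per page that references the PDF. It matches the file name anywhere in the page, not only inside `href` attributes, so a page that merely mentions the name in text would also be listed.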