I am trying to crawl a web page that requires authentication. I can access the page in my browser when I am logged in. I am using the JSoup library (http://jsoup.org/) to fetch and parse the HTML.
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Crawler {
    public static void main(String[] args) throws IOException {
        // the URL must include the protocol (http://)
        Document doc = Jsoup.connect("http://www.secinfo.com/$/SEC/Filing.asp?T=r643.91Dx_2nx").get();
        // get page title
        String title = doc.title();
        System.out.println("title : " + title);
        // get all links
        Elements links = doc.select("a");
        for (Element link : links) {
            // get the value from href attribute
            System.out.println("\nlink : " + link.attr("href"));
        }
        System.out.println();
    }
}
Output :
title : SEC Info - Sign In
This returns the content of the sign-in page, not the page at the URL I am passing. I am registered on secinfo.com, and while running this program I am logged in to the site in my default browser, Firefox.
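
My current guess (I may be wrong) is that the login state lives in a session cookie stored by Firefox, and that a separate Java program starts with no cookies at all, so the site treats it as signed out. Below is a minimal offline sketch of that idea using only the JDK's java.net.CookieManager. The example.com URLs, the SESSION cookie name, and its value are made up for illustration; I do not know what secinfo.com actually uses.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class SessionCookieSketch {

    // Returns the Cookie header value this jar would send to the given URI,
    // or an empty string if the jar holds no matching cookies.
    static String cookieHeaderFor(CookieManager jar, URI uri) throws IOException {
        Map<String, List<String>> noHeaders = Collections.emptyMap();
        List<String> cookies = jar.get(uri, noHeaders).get("Cookie");
        return (cookies == null) ? "" : String.join("; ", cookies);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical URLs and cookie, for illustration only.
        URI login = URI.create("http://example.com/login");
        URI page = URI.create("http://example.com/protected");

        // Stands in for the browser: it received a session cookie at login.
        CookieManager browserJar = new CookieManager();
        browserJar.put(login, Collections.singletonMap(
                "Set-Cookie", Collections.singletonList("SESSION=abc123; Path=/")));

        // Stands in for the Java program: a fresh client with no cookies.
        CookieManager programJar = new CookieManager();

        System.out.println("browser sends: [" + cookieHeaderFor(browserJar, page) + "]");
        System.out.println("program sends: [" + cookieHeaderFor(programJar, page) + "]");
    }
}
```

If this guess is right, being logged in from Firefox would make no difference to my program, and I would presumably have to hand the session cookies to JSoup myself, for example via Connection.cookies(...) with the map returned by Connection.Response.cookies() after posting the login form. I have not verified this against secinfo.com.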