I have spent an entire day going around in circles trying to solve this problem. I've been through similar questions on SO and still had no luck. Apologies if I've missed an obvious solution in my searches. This is my first day trying out Python, so I am a complete newbie. I am trying to build a scraper for LinkedIn and I can't get past the login stage.

I have tried many different variations of the code, but here is the one I understand best:

from bs4 import BeautifulSoup
import urllib.request
import requests

client = requests.Session()
LOGIN_URL = 'https://www.linkedin.com/uas/login'

# get source code of the page
with urllib.request.urlopen('https://www.linkedin.com/uas/login') as url:
    s = url.read()
    print(s)

soup = BeautifulSoup(s, "html.parser")
print(s)

csrf = soup.find(id="loginCsrfParam-login")['value']

login_information = {
    'session_key': 'email@gmail.com',
    'session_password': 'password',
    'loginCsrfParam': csrf,
}

client.post(LOGIN_URL, data=login_information)

I am getting the error below and have no clue how to get around it:

Traceback (most recent call last):
  File "G:...\LinkedIn\testlogin3.py", line 16, in <module>
    csrf = soup.find(id="loginCsrfParam-login")['value']
TypeError: 'NoneType' object is not subscriptable

Is anyone able to provide any insight or help me correct the code? Thanks in advance.

Sonal
  • Look [here](https://stackoverflow.com/a/63541198/1705829) and change `loginCsrfParam-login` to `loginCsrfParam`. You also need a client.response(url) to print. – Timo Jan 30 '21 at 16:41

1 Answer

I believe this error is telling you that nothing with the ID "loginCsrfParam-login" was found in the page, so soup.find(...) returned None. You then try to pull ['value'] from that None, and None does not support indexing, which is what raises the TypeError. I believe that you are trying to do something like:

csrf = soup.select("input[loginCsrfParam-login]")

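This produces no errors when I run the code, although I am unsure whether it accomplishes the desired effect. Note that select returns a list, which is simply empty when nothing matches, so the absence of an error does not mean the token was found. Personally, though, I prefer to use the Selenium module for interacting with web pages because you can see how the page is responding.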
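
To make the difference between the two lookups concrete, here is a minimal sketch that runs against a small inline HTML string rather than the live page. The markup in SAMPLE_HTML and the helper name extract_csrf are assumptions made for illustration; the real login page may use different ids or field names.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the login page; the real page may differ.
SAMPLE_HTML = """
<form>
  <input type="hidden" name="loginCsrfParam" id="loginCsrfParam-login" value="abc123">
</form>
"""


def extract_csrf(html, field_id="loginCsrfParam-login"):
    """Return the 'value' of the element with the given id, or raise a clear error."""
    soup = BeautifulSoup(html, "html.parser")
    field = soup.find(id=field_id)  # None when no element has this id
    if field is None:
        raise ValueError(f"no element with id {field_id!r} found in the page")
    return field.get("value")  # None (not an exception) if the attribute is missing


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# "input[loginCsrfParam-login]" asks for an <input> that HAS AN ATTRIBUTE named
# loginCsrfParam-login, so it matches nothing here and select() returns an empty list.
matches = soup.select("input[loginCsrfParam-login]")
print(len(matches))

# Looking the field up by id finds it in the sample markup.
print(extract_csrf(SAMPLE_HTML))

# When the id is absent, the helper fails with a readable message
# instead of "'NoneType' object is not subscriptable".
try:
    extract_csrf("<form></form>")
except ValueError as exc:
    print("lookup failed:", exc)
```

If the lookup fails against the live page, searching the printed HTML for the field name is a quick way to see what the page actually contains. Separately, the question's code fetches the page with urllib but posts through the requests session, so any cookies set by the first request are not sent with the POST; fetching with client.get(LOGIN_URL) would keep both requests in the same session. Whether that alone is enough to sign in is not something this sketch verifies.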

Reedinationer
  • Thank you! Yes the error doesn't appear anymore but I still don't think I'm being signed in. I will look into Selenium, thanks for the tip. – Sonal Feb 08 '19 at 09:09