I have spent an entire day going around in circles trying to solve this problem. I've been through similar questions on SO and still have had no luck. Apologies if I've missed an obvious solution in my searches. This is my first day trying out Python, so I am a complete newbie. I am trying to build a scraper for LinkedIn and I can't get past the login stage.
I have tried many different variations of the code, but here is the one I understand best:
from bs4 import BeautifulSoup
import urllib.request
import requests
client = requests.Session()
LOGIN_URL = 'https://www.linkedin.com/uas/login'
# get source code of the page
with urllib.request.urlopen('https://www.linkedin.com/uas/login') as url:
    s = url.read()
print(s)
soup = BeautifulSoup(s, "html.parser")
print(s)
csrf = soup.find(id="loginCsrfParam-login")['value']
login_information = {
    'session_key': 'email@gmail.com',
    'session_password': 'password',
    'loginCsrfParam': csrf,
}
client.post(LOGIN_URL, data=login_information)
I am getting the error below and have no clue how to get around it:
Traceback (most recent call last):
  File "G:...\LinkedIn\testlogin3.py", line 16, in <module>
    csrf = soup.find(id="loginCsrfParam-login")['value']
TypeError: 'NoneType' object is not subscriptable
Is anyone able to provide any insight or help me correct the code? Thanks in advance.
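For anyone trying to reproduce this: as far as I can tell, the TypeError means the lookup found nothing, since `soup.find(...)` returns `None` when no element matches and `None[...]` cannot be subscripted. Below is a minimal sketch of how the lookup could be guarded and how the hidden fields that are actually on a page could be listed. It is not my real script. It uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra installs, and `HiddenInputCollector`, `find_csrf_token`, `SAMPLE_HTML`, and the field names in the sample markup are all made up for illustration. The real login page may look quite different.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect the name/value pairs of every <input type="hidden"> tag."""

    def __init__(self):
        super().__init__()
        self.hidden_inputs = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr_map = dict(attrs)
        input_type = (attr_map.get("type") or "").lower()
        name = attr_map.get("name")
        if input_type == "hidden" and name:
            self.hidden_inputs[name] = attr_map.get("value") or ""


def find_csrf_token(html, field_name):
    """Return the hidden field's value, or None if no such field exists."""
    collector = HiddenInputCollector()
    collector.feed(html)
    return collector.hidden_inputs.get(field_name)


# Hypothetical markup for illustration only; not copied from the real page.
SAMPLE_HTML = """
<form>
  <input type="hidden" name="csrfToken" value="abc123">
  <input type="text" name="session_key">
</form>
"""

token = find_csrf_token(SAMPLE_HTML, "loginCsrfParam")
if token is None:
    # Check for None before using the value instead of subscripting blindly.
    collector = HiddenInputCollector()
    collector.feed(SAMPLE_HTML)
    print("Field not found. Hidden fields on the page:",
          sorted(collector.hidden_inputs))
else:
    print("Token:", token)
```

Printing the hidden field names this way would at least show whether the id or name I am searching for exists in the HTML that the server returned.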