Error: TypeError: ‘NoneType’ object is not subscriptable

I have the code:

from bs4 import BeautifulSoup
import requests
head = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; AOL 9.0; Windows NT 5.1)'}
proxi = {
  'http': 'http://195.9.149.198:8081',
}
query = input('What are you searching for?: ')
number = input('How many pages: ')
url = 'http://www.google.com/search?q='
page = requests.get(url + query, headers=head, proxies=proxi)
for index in range(int(number)):
  soup = BeautifulSoup(page.text, "html.parser")
  next_page = soup.find("a", class_="fl")
  next_link = "https://www.google.com" + next_page["href"]
  h3 = soup.find_all("h3", class_="r")
  for elem in h3:
    elem = elem.contents[0]
    link = "https://www.google.com" + elem["href"]
    print(link)
  page = requests.get(next_link)

I worked with this code for half a day, sending many requests to parse URLs, and everything went fine except when I added inurl: to the query.

Otherwise the code worked flawlessly. But after a certain number of requests, I started getting TypeError: 'NoneType' object is not subscriptable every time, even without inurl:.

I understand that a captcha appears, with a message saying that suspicious traffic is coming from my network, and that Google blocks the requests because of it. I tried adding headers and sending the requests through a proxy server, but the error still appears. What should I do?


Answer 1, authority 100%

One possible solution is to wrap all the code inside the for loop in try ... except and sleep for a while where the error is handled. I think it is a good idea to increase this delay slightly after each captcha: for example, start at five seconds and add one second each time. A possible implementation:

from time import sleep
from bs4 import BeautifulSoup
import requests
head = {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 6.0; AOL 9.0; Windows NT 5.1)'}
proxi = {
  'http': 'http://195.9.149.198:8081',
}
time_to_sleep_when_captcha = 5
query = input('What are you searching for?: ')
number = input('How many pages: ')
url = 'http://www.google.com/search?q='
current_url = url + query
page = requests.get(current_url, headers=head, proxies=proxi)
for index in range(int(number)):
  try:
    soup = BeautifulSoup(page.text, "html.parser")
    # soup.find returns None when no matching link exists (for example on a
    # captcha page), so the subscript on the next line raises TypeError
    next_page = soup.find("a", class_="fl")
    next_link = "https://www.google.com" + next_page["href"]
    h3 = soup.find_all("h3", class_="r")
    for elem in h3:
      elem = elem.contents[0]
      link = "https://www.google.com" + elem["href"]
      print(link)
    current_url = next_link
    page = requests.get(current_url, headers=head, proxies=proxi)
  except TypeError:
    # Wait, lengthen the next wait, then request the same page again;
    # otherwise the loop would re-parse the same captcha page
    sleep(time_to_sleep_when_captcha)
    time_to_sleep_when_captcha += 1
    page = requests.get(current_url, headers=head, proxies=proxi)
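
The same idea (start at five seconds, add one second after each failure) can also be kept apart from the page loop, so that a captcha does not use up one of the requested pages. The following is only a sketch under assumptions: retry_with_backoff, fetch, max_attempts, first_delay and step are hypothetical names that do not come from the question, and fetch is assumed to return None whenever the page could not be parsed.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fetch: Callable[[], Optional[T]],
    max_attempts: int = 5,
    first_delay: float = 5.0,
    step: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call fetch() until it returns something other than None.

    Waits first_delay seconds after the first failure and step seconds
    longer after each further failure. Returns None if every attempt fails.
    """
    delay = first_delay
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_attempts - 1:  # do not sleep after the final attempt
            sleep(delay)
            delay += step
    return None
```

The sleep parameter exists only so the waiting can be replaced in a test; in normal use it stays at time.sleep. A longer delay does not guarantee that the captcha goes away, so the caller still has to handle a None result.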

Answer 2, authority 94%

TypeError: 'NoneType' object is not subscriptable

This error occurs when you try to index into an object whose value is None.

>>> t = None
>>> t[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not subscriptable

Check the indexing expression on the line reported in the traceback (the error message includes the line number).
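
In the question's code the indexed value is the result of soup.find(...), which returns None when no matching tag exists. One way to avoid the error is to test for None before indexing. A minimal sketch, assuming a hypothetical helper named href_or_none; a plain dict stands in for the tag here so that the snippet runs without Beautiful Soup installed:

```python
from typing import Any, Optional


def href_or_none(tag: Any) -> Optional[str]:
    """Return the tag's "href" value, or None when there is no tag or no href.

    `tag` is whatever the lookup returned: None when nothing matched,
    otherwise an object with a dict-style .get() method.
    """
    if tag is None:
        return None  # nothing was found, so there is nothing to index
    return tag.get("href")


# Stand-ins for the two cases: a matching link and a failed lookup.
print(href_or_none({"href": "/search?q=python&start=10"}))
print(href_or_none(None))
```

With Beautiful Soup the call would look like href_or_none(soup.find("a", class_="fl")), and a None result tells the caller that the page did not contain the expected link.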
