There is a website: https://egrul.nalog.ru/index.html
I want to write a parser that downloads extracts from it.
First, a POST request is sent with query = '<INN>'.
In response I get JSON of this form: {"t": "C639546DE6364BFC462A39C02862A0B22F85F1E317B359D71395917F336D71BAA0236A6C60146CE8DA52DA53239A290DC8D49DCF99ECD6F42984FCB851DAF1702B970C938627ED122A7A66BEDCA70A09", "captchaRequired": false}
After that, apparently, this request is made:
egrul.nalog.ru/search-result/{the "t" value received earlier}?r=1627021393883&_=1627021393883
My question is about r=1627021393883&_=1627021393883.
As I understand it, this is a time in Unix format. The time is not the current one, though: it gives either November or October. I cannot work out where it comes from. Where does the browser get it?
Please tell me.
Answer 1, Authority 100%
These parameters (r and _) are optional: the requests work without them.
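As for what the value is: a 13-digit number like this is usually a timestamp in milliseconds since the Unix epoch, not seconds. jQuery, for example, appends a `_` parameter of that kind when caching is disabled; that this site does the same is only an assumption. A minimal sketch for checking what the value from the question decodes to (the helper name `decode_ms_timestamp` is made up for illustration):

```python
from datetime import datetime, timedelta, timezone

def decode_ms_timestamp(value):
    """Interpret value as milliseconds since the Unix epoch (an assumption)."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # timedelta keeps the millisecond part exact, with no float rounding.
    return epoch + timedelta(milliseconds=int(value))

stamp = decode_ms_timestamp("1627021393883")
print(stamp.isoformat())
```

Read as milliseconds, the value falls on 23 July 2021 (UTC). Reading the same digits as seconds gives a meaningless date far in the future, which is a common source of confusion. Either way, the requests below simply leave both parameters out.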
import requests
url = 'https://egrul.nalog.ru'
url_1 = 'https://egrul.nalog.ru/search-result/'
url_2 = 'https://egrul.nalog.ru/vyp-download/'
inn = 21..98  # the INN is elided here; substitute the real value
# 1. Send the search query; the response carries the token "t".
r = requests.post(url, data={'query': inn})
print(r.json()['t'])
# 2. Fetch the search result for that token (no r or _ parameters).
r1 = requests.get(url_1 + r.json()['t'])
print(r1.json()['rows'][0]['n'])  # name
print(r1.json()['rows'][0]['t'])  # token for the extract
# 3. Download the extract and save it as a PDF.
r2 = requests.get(url_2 + r1.json()['rows'][0]['t'])
with open(f'{r1.json()["rows"][0]["n"]}_{inn}.pdf', 'wb') as f:
    f.write(r2.content)
This saves the PDF file under the name "<full name>_<INN>.pdf".
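One thing to watch with that file name: the "n" field is used as-is, and if it contains quotes, slashes or colons, opening the file can fail (Windows in particular rejects these characters). A small sketch of a clean-up step; the helper `safe_filename` and the sample values are hypothetical, not part of the site's API:

```python
import re

def safe_filename(name, inn, ext="pdf"):
    """Build '<name>_<inn>.<ext>' without characters Windows forbids in file names."""
    # Replace reserved characters and control characters with underscores.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", str(name))
    # Collapse runs of whitespace and drop leading/trailing spaces and dots.
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return f"{cleaned}_{inn}.{ext}"

# Made-up sample values, only to show the call.
print(safe_filename('A/B: "C"', 123))
```

The result can then be passed to open() in place of the f-string used above.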
UPD:
s = requests.Session()
s.get(url + '/index.html')  # open the page first so the session receives cookies
print(s.cookies)
# The session already sends its own cookies; passing them explicitly is redundant but harmless.
r = s.post(url, data={'query': inn}, cookies=s.cookies)
print(r.json()['t'])
r1 = s.get(url_1 + r.json()['t'], cookies=s.cookies)
print(r1.json()['rows'][0]['n'])
print(r1.json()['rows'][0]['t'])
r2 = s.get(url_2 + r1.json()['rows'][0]['t'], cookies=s.cookies)
with open(f'{r1.json()["rows"][0]["n"]}_{inn}.pdf', 'wb') as f:
    f.write(r2.content)
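If this is going to run for many INNs, it may help to wrap the three requests in a function and fail loudly when something is off. Below is a rough sketch, not a definitive implementation: the function name `fetch_extract` is made up, the endpoints and JSON keys ("t", "rows", "n", "captchaRequired") are the ones shown above, and the lowercase form key "query" is an assumption. The session is passed in, so any object with requests.Session-style post() and get() methods will do.

```python
def fetch_extract(session, inn, base="https://egrul.nalog.ru"):
    """Run the three requests from the answer and return (name, pdf_bytes)."""
    # Step 1: search by INN; the response carries the token "t".
    r = session.post(base, data={"query": inn})
    r.raise_for_status()
    search = r.json()
    if search.get("captchaRequired"):
        raise RuntimeError("the site asked for a captcha")

    # Step 2: fetch the result rows for that token.
    r1 = session.get(f"{base}/search-result/{search['t']}")
    r1.raise_for_status()
    rows = r1.json().get("rows") or []
    if not rows:
        raise LookupError(f"no rows returned for {inn}")

    # Step 3: download the extract for the first row.
    r2 = session.get(f"{base}/vyp-download/{rows[0]['t']}")
    r2.raise_for_status()
    return rows[0]["n"], r2.content
```

It would be called as name, pdf = fetch_extract(requests.Session(), inn), after which pdf is written to a file opened in 'wb' mode, as in the code above.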