
Parsing text from a website in Python


Good afternoon, everyone. How are the May holidays going? I have a question about parsing a website in Python. There is a page: https://www.rttnews.com/corpinfo/conferencecalls.aspx?date=04-may-2020 I need to extract the text of its blocks; I marked them in the screenshot as clearly as I could. My program parses only the first text and prints it, but I need the text not only from the first block (the top row) but also from the next 7. What do I need to change in the program so that it parses all 8 texts from the page, not just the first one?
Here is the code:

import requests
from bs4 import BeautifulSoup

def get_html(url):
    r = requests.get(url)
    r.encoding = 'utf8'
    return r.text

def get_link(html):
    soup = BeautifulSoup(html, 'lxml')
    # find() returns only the first matching element
    text = soup.find('div', {'class': 'ecoCalContent'}).find('div', {'class': 'tblContent5'}).text
    print(text)

get_link(get_html('https://www.rttnews.com/corpinfo/conferencecalls.aspx?date=04-may-2020'))
# Only the first text is printed, but all 8 from the web page are needed:
# https://www.rttnews.com/corpinfo/conferencecalls.aspx?date=04-may-2020

Thank you in advance!


Answer 1, Authority 100%

You can search for several classes at once. To do this, pass them as a list:

{'class': ['ecoCalContent', 'ecoCalAltContent']}

Example:

import requests
from bs4 import BeautifulSoup

url = 'https://www.rttnews.com/corpinfo/conferencecalls.aspx'
html = requests.get(url, params={'date': '04-may-2020'}).text
soup = BeautifulSoup(html, 'lxml')

# find_all() returns every matching row; the list matches either class
divs = soup.find_all('div', {'class': ['ecoCalContent', 'ecoCalAltContent']})
result = []
for div in divs:
    text = div.find('div', {'class': 'tblContent5'}).text
    result.append(text)

print(*result)

stdout:

Cirrus Logic Inc. (CRUS) will host a conference call at 5:00 PM ET on May 4, 2020, to discuss Q4 20 earnings results.
To access the live webcast, log on to www.cirrus.com
For a replay call, dial (416) 621-4642 or (800) 585-8367 (Access Code: 9199070).
 Varian Medical Systems (VAR) will host a conference call at 4:30 PM ET on May 4, 2020, to discuss Q2 20 earnings results.
To access the live webcast, log on to visit www.varian.com/investor American States Water Co. (AWR) will host a conference call at 2:00 PM ET on May 4, 2020, to discuss Q4 19 earnings results.
To access the live webcast, log on to to http://americanstateswatercompany.gcs-web.com/news-events/event-calendar
 WEC Energy Group. (WEC) will host a conference call at 2:00 PM ET on May 4, 2020, to discuss Q1 20 earnings results.
To access the live webcast, log on to wecenergygroup.com
To listen to the call, dial 877-683-2228 (US) or 647-689-5446 (International), Conference ID: 7898148.
For a replay call, dial 800-585-8367 (US) or 416-621-4642 (International), Conference ID: 7898148.
 Sempra Energy (SRE) will host a conference call at 12:00 PM ET on May 4, 2020, to discuss Q1 20 earnings results.
To access the live webcast, log on to sempra.com
For a replay call, dial (888) 203-1112 with passcode 8909332. Napco Security Technologies Inc. (NSSC) will host a conference call at 11:00 AM ET on May 4, 2020, to discuss Q3 20 earnings results.
To access the live webcast, log on to www.napcosecurity.com
To listen to the call, dial 1-877-407-4018 (US) or 1-201-689-8471 (International).
For a replay call, dial 1-844-512-2921 (US) or 1-412-317-6671 (International) with access code 13702840. Loews Corp. (L) will host a conference call at 11:00 AM ET on May 4, 2020, to discuss Q4 19 earnings results.
To access the live webcast, log on to www.loews.com
To listen to the call, dial (877) 692-2592 (US) or (973) 582-2757 (International) with conference ID number 6097643.
For a replay call, dial (800) 585-8367 (US) or (404) 537-3406 (International) with conference ID number 6097643.
  Wabtec Corporation (WAB) will host a conference call at 10:00 AM ET on May 4, 2020, to discuss Q1 20 earnings results.
To access the live webcast, log on to www.wabteccorp.com
For a replay call, dial 1-877-344-7529 or 1-412-317-0088 (Access Code: 10142434).
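
If you want to check the difference between find and find_all without hitting the site, here is a minimal offline sketch. The HTML string is made up for illustration: it only imitates alternating row classes, and the class names and the helper names first_only and extract_texts are assumptions, not something taken from the live page. It uses the built-in 'html.parser' backend so lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating rows with alternating classes.
# The class names are assumed for illustration, not copied from the live site.
HTML = """
<div class="ecoCalContent"><div class="tblContent5">Call A</div></div>
<div class="ecoCalAltContent"><div class="tblContent5">Call B</div></div>
<div class="ecoCalContent"><div class="tblContent5">Call C</div></div>
<div class="other"><div class="tblContent5">Not a call</div></div>
"""


def first_only(html):
    """Mimic the question's code: find() stops at the first matching row."""
    soup = BeautifulSoup(html, "html.parser")
    row = soup.find("div", {"class": "ecoCalContent"})
    return row.find("div", {"class": "tblContent5"}).get_text(strip=True)


def extract_texts(html):
    """Collect the text of every row whose class is any of the listed ones."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find_all("div", {"class": ["ecoCalContent", "ecoCalAltContent"]})
    texts = []
    for row in rows:
        cell = row.find("div", {"class": "tblContent5"})
        if cell is not None:  # skip rows that lack the inner block
            texts.append(cell.get_text(strip=True))
    return texts


if __name__ == "__main__":
    print(first_only(HTML))
    print(extract_texts(HTML))
```

The None check is there because find returns None when a row has no inner block, and calling .text on None would raise an AttributeError.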
