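I'm using Selenium in Python to scrape information from a website, but I've run into a problem: after I click the button that loads more rows into the table, the rows that appear have cells with the classes hidden-xs hidden-sm.
I can't seem to find a way to get the text of these elements. My code is below. Is there any way you can help me?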
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
import time
import pandas as pd
flight_Code=[]
Date=[]
Departure=[]
Arrival=[]
aircraft_code=[]
Code=["ph-bfy"]
headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'}
chrome_path= "C:/Users/hugol/Documents/chromedriver.exe"
chrome_options=Options()
#chrome_options.add_argument({'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'})
chrome_options.add_argument("--no-sandbox")
driver=webdriver.Chrome(chrome_path, options=chrome_options)
url="https://www.flightradar24.com/"
driver.get(url)
login_button=WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.ID, 'premiumOverlay')))
login_button.click()
username=WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.ID, 'fr24_SignInEmail')))
username.send_keys(*******)
password=WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.ID, 'fr24_SignInPassword')))
password.send_keys(*******)
login_button=WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.ID, 'fr24_SignIn')))
login_button.click()
time.sleep(2)
for i in Code:
    new_url="https://www.flightradar24.com/data/aircraft/"+i
    driver.get(new_url)
    more_button=WebDriverWait(driver, 1).until(EC.presence_of_element_located((By.ID, 'btn-load-earlier-flights')))
    more_button.click()
    # WebDriverWait(driver, 2).until(EC.presence_of_element_located((By.ID, 'tbl-datatable')))
    for row in driver.find_elements_by_class_name("data-row"):
        try:
            flight_code=row.find_element_by_class_name("fbold").text
        except NoSuchElementException:
            flight_code=''
        try:
            flight_date=row.find_element_by_class_name("row").text
        except NoSuchElementException:
            flight_date=''
        # find_elements returns a list, so a missing cell raises IndexError,
        # not NoSuchElementException
        try:
            flight_departure=row.find_elements_by_class_name("details")[4].text
        except IndexError:
            flight_departure=''
        try:
            flight_arrival=row.find_elements_by_class_name("details")[3].text
        except IndexError:
            flight_arrival=''
        flight_Code.append(flight_code)
        Date.append(flight_date)
        Departure.append(flight_departure)
        Arrival.append(flight_arrival)
        aircraft_code.append(i)
# build the table once all aircraft codes have been processed
df=pd.DataFrame({'Code': flight_Code,'Date': Date, 'Departure': Departure, 'Arrival': Arrival, 'Aircraft':aircraft_code})
The site's HTML looks like this:
![enter image description here](https://i.stack.imgur.com/dJ7Oe.png)
Thanks a lot!!!
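Instead of element.text, use element.get_attribute("textContent"). The text property only returns text that is rendered as visible, so cells that the hidden-xs hidden-sm classes hide at the current window width come back as empty strings, while textContent reads the DOM node regardless of visibility.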
flight_code=row.find_element_by_class_name("fbold").get_attribute("textContent")
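To apply the same idea to every cell, the lookup can be wrapped in a small helper. This is only a minimal sketch: `hidden_safe_text` is a hypothetical name, and it assumes nothing more than that the object passed in offers the `get_attribute` method of a Selenium WebElement. Whitespace is collapsed because `textContent` returns the raw text of the node, including indentation and newlines from the page source.

```python
def hidden_safe_text(element):
    """Return the element's text even if CSS currently hides it.

    `element` is assumed to be a Selenium WebElement (or any object with a
    compatible get_attribute method). Unlike `.text`, the DOM property
    `textContent` does not depend on the element being rendered.
    """
    raw = element.get_attribute("textContent")
    # get_attribute returns None when the property is not available.
    if raw is None:
        return ""
    # textContent keeps source indentation and newlines; collapse them
    # into single spaces and trim both ends.
    return " ".join(raw.split())
```

It would be used as `flight_code = hidden_safe_text(row.find_element_by_class_name("fbold"))`, keeping the existing try/except around the lookup.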
Update:
After clicking the "more" button, you need to wait for the elements to become visible. Use an explicit wait:
WebDriverWait(driver,10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,".data-row")))
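One caveat with the wait above: the rows that were on the page before the click are already visible, so the condition can be satisfied before the extra rows arrive. A possible alternative, offered here as an assumption and not as part of the original answer, is to wait until the number of `.data-row` elements grows. `click_and_wait_for_more_rows` is a hypothetical helper; `wait` is expected to be a `WebDriverWait` created by the caller, and the string `"css selector"` is the value behind `By.CSS_SELECTOR`.

```python
def click_and_wait_for_more_rows(driver, wait, button, row_selector=".data-row"):
    """Click `button`, then block until more rows match `row_selector`.

    Returns the full list of row elements once the count has grown.
    With a real WebDriverWait, `wait.until` raises selenium's
    TimeoutException if the count never increases.
    """
    # "css selector" is the string constant behind By.CSS_SELECTOR.
    rows_before = len(driver.find_elements("css selector", row_selector))
    button.click()

    def more_rows_loaded(d):
        rows = d.find_elements("css selector", row_selector)
        # A falsy return value tells WebDriverWait to poll again.
        return rows if len(rows) > rows_before else False

    return wait.until(more_rows_loaded)

# Hypothetical usage with the names from the question:
# rows = click_and_wait_for_more_rows(driver, WebDriverWait(driver, 10), more_button)
```

Comparing counts ties the wait to the click itself, which may be more reliable than a visibility check when some rows are already rendered.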