Author: ebenezer (三重劉德華)
Board: Python
Title: [Question] Problem scraping the Agoda website
Time: Tue Sep 19 00:27:35 2017
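Hi everyone,
I've recently been practicing web scraping with selenium, using the Agoda website as my
target to scrape the names of all hotels in Taipei City (the first page first, then moving
on to the next page). However, the run did not move to the next page, and only four hotel
names were printed.
Below are my code and the error output. Could you take a look and tell me what I did wrong? Thanks.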
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import re
import time
from bs4 import BeautifulSoup
browser = webdriver.Chrome()
browser.get('https://www.agoda.com/zh-tw/pages/agoda/default/DestinationSearchResult.aspx?asq=Ss5PXyh1QUNdFOc4lzIDoPF%2BRvl%2F2EATmGvZScKd0zV2eryqVzxofG%2BmP16PwMJIhn6ZcBnwwPWMfnG%2FN94g7S8wUgfCFrPYXSVII0eHKcYuF%2FAeuf3Ntuv%2F3UlVEtcg%2Fh%2Fe4idk67OoEy6KdHwUNum%2B3QacrQMDUE7JkJAfzu3W62o9bPbdQ8KZcSPiCaH9nxv16MZrgiZOZki0W6H9dQ%3D%3D&city=4951&cid=-999&tick=636413477452&isdym=true&searchterm=%E5%8F%B0%E5%8C%97&txtuuid=f685547f-5b5a-4507-bace-845d3ca9b6f0&pagetypeid=1&origin=TW&tag=&gclid=&aid=130243&userId=65319270-7a97-4746-8d25-e53b463a0ddf&languageId=20&sessionId=vlmk2yavkgwagnc4g2g12nii&storefrontId=3&currencyCode=TWD&htmlLanguage=zh-tw&trafficType=User&machineName=HK-AGWEB-2E07&cultureInfoName=zh-TW&textToSearch=%E5%8F%B0%E5%8C%97&guid=f685547f-5b5a-4507-bace-845d3ca9b6f0&checkIn=2017-09-27&checkOut=2017-09-28&los=1&rooms=1&adults=2&children=0&childages=&ckuid=65319270-7a97-4746-8d25-e53b463a0ddf&sort=agodaRecommended')
soup = BeautifulSoup(browser.page_source)
for ele in soup.select('.hotel-name span'):
    print(ele.text)
browser.find_element_by_id("paginationNext").click()
===========================================================================
藝宿商旅 - 板橋館 (Yi Su Hotel)
國聯大飯店 (United Hotel)
漾館時尚溫泉旅館 (Aquabella Hotel)
馥蘭朵烏來渡假酒店 (Volando Urai Spring Spa & Resort)
---------------------------------------------------------------------------
WebDriverException                        Traceback (most recent call last)
<ipython-input-1-ded7f6ded6ba> in <module>()
     11 for ele in soup.select('.hotel-name span'):
     12     print(ele.text)
---> 13 browser.find_element_by_id("paginationNext").click()

C:\Users\Mark Chen\Anaconda3\lib\site-packages\selenium\webdriver\remote\webelement.py in click(self)
     76     def click(self):
     77         """Clicks the element."""
---> 78         self._execute(Command.CLICK_ELEMENT)
     79
     80     def submit(self):

C:\Users\Mark Chen\Anaconda3\lib\site-packages\selenium\webdriver\remote\webelement.py in _execute(self, command, params)
    497             params = {}
    498         params['id'] = self._id
--> 499         return self._parent.execute(command, params)
    500
    501     def find_element(self, by=By.ID, value=None):

C:\Users\Mark Chen\Anaconda3\lib\site-packages\selenium\webdriver\remote\webdriver.py in execute(self, driver_command, params)
    295         response = self.command_executor.execute(driver_command, params)
    296         if response:
--> 297             self.error_handler.check_response(response)
    298             response['value'] = self._unwrap_value(
    299                 response.get('value', None))

C:\Users\Mark Chen\Anaconda3\lib\site-packages\selenium\webdriver\remote\errorhandler.py in check_response(self, response)
    192         elif exception_class == UnexpectedAlertPresentException and 'alert' in value:
    193             raise exception_class(message, screen, stacktrace, value['alert'].get('text'))
--> 194         raise exception_class(message, screen, stacktrace)
    195
    196     def _value_or_default(self, obj, key, default):

WebDriverException: Message: unknown error: Element <button id="paginationNext" class="btn btn-right" data-selenium="pagination-next-btn">...</button> is not clickable at point (928, 554). Other element would receive the click: <span class="price soft-red" data-selenium="display-price" data-currency="">...</span>
(Session info: chrome=60.0.3112.113)
(Driver info: chromedriver=2.32.498550 (9dec58e66c31bcc53a9ce3c7226f0c1c5810906a),platform=Windows NT 6.3.9600 x86_64)
--
※ Origin: 批踢踢實業坊 (ptt.cc), From: 61.230.121.52
※ Article URL: https://www.ptt.cc/bbs/Python/M.1505752057.A.3E9.html
→ billy0131: Doesn't the error msg say the button is not clickable at point..? 09/19 01:15
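→ ebenezer: When I locate the element with Google Chrome's developer tools, it does show 09/19 03:18
→ ebenezer: that the next-page element has id='paginationNext'. I don't know why it can't 09/19 03:20
→ ebenezer: be clicked. I've tried many times, including the wait approach, but it never works~ 09/19 03:22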
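→ s860134: There are many ways to click; you could try something other than element.click. 09/19 03:56
→ s860134: As I recall, either the action chain click or the JS click can solve this. 09/19 03:56
→ s860134: The error means the elements overlap, so another one gets the click first, right? (My understanding.) 09/19 03:57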
push amarco: try: WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.ID, "paginationNext"))) 09/19 06:13
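
The suggestions in the comments can be combined into one small helper. As far as I know, EC.element_to_be_clickable only checks that the element is displayed and enabled, not whether another element is covering it, which may be why waiting alone did not help here. The following is a minimal sketch under stated assumptions, not a verified fix for the Agoda page: it waits for the button, scrolls it into view, tries a normal click, and falls back to a JavaScript click (the "js" option s860134 mentions) if the normal click is rejected. The name click_with_fallback and its parameters are hypothetical, and the polling loop is a hand-written stand-in for WebDriverWait so that the sketch only relies on the driver's find_element and execute_script methods.

```python
import time

try:
    # Selenium's base exception; the "not clickable" error in the post is one of these.
    from selenium.common.exceptions import WebDriverException
except ImportError:
    # Assumption: fall back to Exception so the sketch can run without Selenium installed.
    WebDriverException = Exception


def click_with_fallback(driver, by, value, timeout=10.0, poll=0.5):
    """Hypothetical helper: wait for an element, then click it.

    Returns "native" if element.click() worked, or "javascript" if the
    click had to be issued through execute_script instead.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            element = driver.find_element(by, value)
            # Same two checks a "clickable" wait performs: visible and enabled.
            if element.is_displayed() and element.is_enabled():
                break
        except WebDriverException:
            pass  # element not found yet; keep polling
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "element %s=%r not ready after %s s" % (by, value, timeout)
            )
        time.sleep(poll)

    # Bring the button into the viewport; this alone may move it out from
    # under whatever was covering it.
    driver.execute_script("arguments[0].scrollIntoView(true);", element)
    try:
        element.click()
        return "native"
    except WebDriverException:
        # Another element would still receive the click, so dispatch the
        # click from JavaScript, which ignores what is drawn on top.
        driver.execute_script("arguments[0].click();", element)
        return "javascript"
```

With the objects from the post, this would be called as click_with_fallback(browser, By.ID, "paginationNext"). Note that the soup object built before the click still holds the first page, so browser.page_source has to be read and parsed again after the page changes.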