Board: R_Language
[Question type]: Asking for experience (I want to scrape web page data with R and would like to hear about your experience)
[Software familiarity]: Beginner (never written code before; R is my first language)
[Problem description]:
Running the example produces the following error message:
Error in open.connection(x, "rb") : HTTP error 503.
Scraping this page used to work. I wonder whether the site has started filtering crawlers. The page loads normally in Chrome.
[Code example]:
library(rvest)
url <- "https://www.wantgoo.com/stock/astock/agentstat2?stockno=1722"
DATA = read_html(url)
[Environment]:
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Traditional)_Taiwan.950 LC_CTYPE=Chinese (Traditional)_Taiwan.950
[3] LC_MONETARY=Chinese (Traditional)_Taiwan.950 LC_NUMERIC=C
[5] LC_TIME=Chinese (Traditional)_Taiwan.950

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] jsonlite_1.6 httr_1.4.1 rvest_0.3.4 xml2_1.2.2

loaded via a namespace (and not attached):
[1] compiler_3.6.1 magrittr_1.5 R6_2.4.0 tools_3.6.1 curl_4.2 Rcpp_1.0.2
[Keywords]: rvest, web scraping
--
※ Posted from: 批踢踢實業坊 (ptt.cc), from: 1.175.67.86 (Taiwan)
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1575130624.A.295.html
nyannyannyan: The site's terms of service appear to prohibit scraping, so it is probably blocking crawlers. 12/01 11:06