Hi everyone,

I've recently been pulling data from
http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw
mainly so I can download it and organize it. When I fetch the data (for stock 1240, say) and inspect the request headers in Firefox, this is what I see:

http://www.tpex.org.tw/web/emergingstock/single_historical/download.php
Host: www.tpex.org.tw
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:57.0) Gecko/20100101 Firefox/57.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-TW,zh;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Referer: http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw
Content-Type: application/x-www-form-urlencoded
Content-Length: 84
Cookie: _ga=GA1.3.582781261.1509173813; _gid=GA1.3.454443446.1513917119; _gat=1
Connection: keep-alive
Upgrade-Insecure-Requests: 1
year=106&month=12&stkno=1240&stkname=茂生農經&lang=zh-tw

The last line doesn't look like something I can put into the headers the way I do with the others. How should I handle it?

My code is below (Python 3.5):

#!/usr/bin/env python3
# -*- coding: utf8 -*-
import urllib.request
url="http://www.tpex.org.tw/web/emergingstock/single_historical/download.php"
headers={
    "Host":"www.tpex.org.tw",
    "User-Agent":"Mozilla/5.0 (Windows NT 6.1; rv:57.0) Gecko/20100101 Firefox/57.0",
    "Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language":"zh-TW,zh;q=0.8,en-US;q=0.5,en;q=0.3",
    "Accept-Encoding":"gzip, deflate",
    "Referer":"http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw",
    "Content-Type":"application/x-www-form-urlencoded",
    "Content-Length":"84",
    "Cookie":"_ga=GA1.3.582781261.1509173813; _gid=GA1.3.1976747965.1513496313; _gat=1",
    "Connection":"keep-alive",
    "Upgrade-Insecure-Requests":"1",
    # "year=106&month=12&stkno=1240&stkname=茂生農經&lang=zh-tw":""
}
req=urllib.request.Request(url,headers=headers)
response=urllib.request.urlopen(req)
print (str(response))

If I leave the last line out of the headers, the print output is
<http.client.HTTPResponse object at 0x02700B10>
I've searched online for a long time and still haven't found a good solution.

--
※ Posted from: PTT (ptt.cc), from: 60.250.205.229
※ Article URL: https://www.ptt.cc/bbs/Python/M.1513919785.A.C57.html
aweimeow: The last line is POST data, not a header 12/22 14:52
aweimeow: urllib.request.Request(url, headers=..., data=data) 12/22 14:56
aweimeow: Split the part starting with year on &, giving key-value pairs 12/22 14:56
aweimeow: Turn those into a dict, then pass it in as the data parameter 12/22 14:56
ansem: Thank you, my program runs now. 12/22 18:55
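
For anyone finding this thread later, here is a minimal sketch of what the suggestion above might look like in Python 3. It is not the original poster's final code, which was never shared. The helper names build_request and fetch are made up for illustration. One detail the comments leave implicit: in Python 3 the data argument has to be bytes, so the dict is first passed through urllib.parse.urlencode and then encoded. The sketch assumes the server accepts the stock name percent-encoded as UTF-8, which is consistent with the captured Content-Length of 84 but has not been verified against the site. Host and Content-Length are left out because urllib fills them in, and Accept-Encoding is left out because urllib does not decompress gzip responses on its own.

```python
import urllib.parse
import urllib.request

URL = "http://www.tpex.org.tw/web/emergingstock/single_historical/download.php"
REFERER = "http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw"


def build_request(year, month, stkno, stkname, lang="zh-tw"):
    """Build a POST request carrying the form fields as the request body.

    Hypothetical helper: the field names come from the captured request,
    the function itself is only an illustration.
    """
    form = {
        "year": year,
        "month": month,
        "stkno": stkno,
        "stkname": stkname,
        "lang": lang,
    }
    # urlencode percent-encodes non-ASCII text as UTF-8 by default and
    # returns a str; Request needs bytes, hence the final encode().
    data = urllib.parse.urlencode(form).encode("ascii")
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; rv:57.0) Gecko/20100101 Firefox/57.0",
        "Referer": REFERER,
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # Supplying data makes urllib send a POST instead of a GET.
    return urllib.request.Request(URL, data=data, headers=headers)


def fetch(req):
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(req) as response:
        return response.read()
```

Calling fetch(build_request("106", "12", "1240", "茂生農經")) would then return the body as bytes. Note that print(str(response)) in the original post only shows the response object itself; the content comes from response.read().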