Re: [Question] curl redirect problem on the Ruten auction site

Author: chaoms (小企鵝)

Board: PHP

Title: Re: [Question] curl redirect problem on the Ruten auction site

Time: Wed Jul 21 21:24:07 2010

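※ Quoting JohnGod21 (江神Johnson):
: I'm using
: <?php
: $url =
: "http://search.ruten.com.tw/search/s000.php?searchfrom=indexbar&k=wii&t=0&o=2";
: $ch = curl_init();
: curl_setopt($ch, CURLOPT_URL, $url);
: curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);// follow redirects
: curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);// force redirect
: curl_setopt($ch, CURLOPT_USERAGENT, "Google Bot");
: curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
: $content = curl_exec($ch);
: curl_close($ch);
: echo $content;
: ?>
: to fetch the source of the page I want, but what I get is the source of
: http://ppt.cc/B0za
: rather than the source of
: http://ppt.cc/bIU3
: even though I already use curl's redirect following to reach the final URL.
: Not giving up, I used
: $time = curl_getinfo($ch, CURLINFO_PRETRANSFER_TIME);
: to read the time from establishing the connection to being ready to transfer,
: which is about 0.55 seconds.
: Is there a function that makes curl read the page source a little later,
: so that this time gets longer?
: No matter how I redirect, I can't get the source of that later page.
: Thanks, everyone.

Hmm? It's not what you think.
The problem is that your request doesn't send a cookie, so you don't get the data you're after.
The page you did get back loads some JavaScript, and part of that script writes a fake image to generate a cookie, then redirects.
That's why everything looks fine in a browser: the browser parses and runs the script normally.
So just skip that step, send an arbitrary cookie, and fetch the data, like this:

<?php
$url =
"http://search.ruten.com.tw/search/s000.php?searchfrom=headbar&k=wii&t=0";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// Any value will do here; it only has to be present.
curl_setopt($ch, CURLOPT_COOKIE, "_ts_id=".urlencode("我是小企鵝"));
curl_setopt($ch, CURLOPT_USERAGENT, "Google Bot");
// Return the page as a string; without this, curl_exec() prints it and returns true.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$content = curl_exec($ch);
curl_close($ch);
echo $content;
?>

Ha, remember to change the _ts_id value.

--
My forum: TimClub http://www.timteam.org/
My blog: 94iPHPer http://94i.blogspot.com/
--
※ Origin: PTT (ptt.cc)
◆ From: 114.26.22.70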
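The cookie trick in the reply could be wrapped up roughly as follows. This is only a sketch: the helper names `build_cookie_header` and `fetch_with_cookies` are made up for illustration and appear nowhere in the thread, the type hints assume PHP 7.1 or later, and whether the site still accepts an arbitrary `_ts_id` is not something this code can confirm.

```php
<?php
// Hypothetical helper: turn name => value pairs into a Cookie header value
// such as "a=1; b=2". Values are URL-encoded, as in the reply's urlencode() call.
function build_cookie_header(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . urlencode((string) $value);
    }
    return implode('; ', $pairs);
}

// Hypothetical helper: fetch a URL while sending the given cookies.
// Returns the response body, or null if the transfer failed.
function fetch_with_cookies(string $url, array $cookies): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_COOKIE, build_cookie_header($cookies));
    curl_setopt($ch, CURLOPT_USERAGENT, 'Google Bot');
    // Return the body as a string instead of printing it directly.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Follow HTTP redirects (Location headers); this does not run JavaScript.
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}
```

Calling `fetch_with_cookies($url, ['_ts_id' => 'anything'])` with the search URL from the reply would then send the same kind of request as the reply's snippet. Keeping the header construction in its own function makes it easy to check the cookie string without touching the network.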

Push JohnGod21: That's pretty magical, thanks a lot for the help 07/21 22:50

Push dragonming: Thanks a lot 07/22 01:51

Push shadowjohn: Thanks a lot, I was racking my brain over this yesterday too~~ 07/22 09:44