Author: earlywinter (earlywinter)
Board: Python
Title: [Question] Problem while writing a scraper for FB fan page info
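I'm writing a scraper for FB data. After scraping, I want to write the data to a CSV file, but when I open the file in Jupyter it contains only the header row
and none of the data I collected, even though I'm sure the scraper is fetching data.
Also, if I want to write the file to the D: drive, how should I do that? Thanks in advance, everyone.
Here is the code I wrote: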
import requests
import pandas as pd
from dateutil.parser import parse  # parse() is used below; presumably dateutil's

# Assumed to be defined earlier (not shown in this post): token, res, page, information_list

# Walk through every page of posts.
while 'paging' in res.json():
    for index, information in enumerate(res.json()['data']):
        print('Scraping page {}, post {}'.format(page, index + 1))
        if 'message' in information:
            res_post = requests.get(
                'https://graph.facebook.com/v2.9/{}/likes?limit=1000&access_token={}'.format(
                    information['id'], token))
            # Columns shared by every row written for this post.
            post_row = [information['id'], information['message'],
                        parse(information['created_time']).date()]
            try:
                if 'next' not in res_post.json()['paging']:
                    # Single page of likes.
                    for likes in res_post.json()['data']:
                        information_list.append(post_row + [likes['id'], likes['name']])
                else:
                    # Several pages of likes: follow the 'next' links.
                    while 'paging' in res_post.json():
                        for likes in res_post.json()['data']:
                            information_list.append(post_row + [likes['id'], likes['name']])
                        if 'next' in res_post.json()['paging']:
                            res_post = requests.get(res_post.json()['paging']['next'])
                        else:
                            break
            except Exception:
                # e.g. the response has no 'paging' key; record the post without likes.
                information_list.append(post_row + ['NO', 'NO'])
    if 'next' in res.json()['paging']:
        res = requests.get(res.json()['paging']['next'])
        page += 1
    else:
        break

print('Scraping finished!')
df = pd.DataFrame(information_list,
                  columns=['Post ID', 'Post content', 'Post date', 'Liker ID', 'Liker name'])
# 台灣資料科學年會 = Taiwan Data Science Conference
df.to_csv('台灣資料科學年會.csv', index=False)
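On the D: drive question, here is a minimal sketch rather than a definitive answer. It assumes Windows and swaps in the standard-library csv module instead of pandas so it runs on its own. The helper name save_rows and the folder D:\fb_data mentioned below are hypothetical. The helper also returns the number of data rows written, which is one way to check whether the list was already empty when the file was saved.

```python
import csv
from pathlib import Path


def save_rows(rows, columns, out_path):
    """Write a header plus rows to out_path; return the number of data rows."""
    out_path = Path(out_path)
    # Create the target folder if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # utf-8-sig writes a BOM, which helps Excel on Windows detect UTF-8 text.
    with out_path.open('w', newline='', encoding='utf-8-sig') as f:
        writer = csv.writer(f)
        writer.writerow(columns)
        writer.writerows(rows)
    return len(rows)
```

On Windows this could be called as save_rows(information_list, columns, r'D:\fb_data\posts.csv'). The raw string (r'...') keeps the backslashes literal, and 'D:/fb_data/posts.csv' with forward slashes should work as well. With pandas, the same kind of path can be passed straight to to_csv, for example df.to_csv(r'D:\fb_data\posts.csv', index=False). If the helper returns 0, the list was empty at write time, which would produce a header-only file like the one described above.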
--
※ Origin: PTT (ptt.cc), From: 1.163.99.217
※ Article URL: https://www.ptt.cc/bbs/Python/M.1520767195.A.5DD.html
※ Edited: earlywinter (1.163.99.217), 03/11/2018 22:43:22
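→ HenryLiKing: Why did you delete the comments? 03/11 22:44
→ earlywinter: Sorry, I'm new to PTT. I replaced the code directly and thought the 03/12 02:49
→ earlywinter: comments below might confuse people, so I deleted them 03/12 02:50
→ Jyery: Deleting comments is a big taboo on every board 03/12 10:43
→ earlywinter: sorry.. lesson learned 03/12 11:47
→ sky800507: This code looks really familiar XD 03/12 20:29
→ earlywinter: I took it from someone's post online 03/14 15:55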