[Question] Cannot call to_excel after df.groups

Author: Pettitte1 (低調奢華有內涵)

Board: Python

Title: [Question] Cannot call to_excel after df.groups

Time: Sun Sep 11 17:27:40 2022

I have a dict that I got by running groupby and then taking .groups. When I try to call to_excel on that result, I get:

    AttributeError: 'PrettyDict' object has no attribute 'to_excel'

If I list the keys and values one by one with

    for k, v in df.items():
        print("key", k, "value:", v)

I get something like this:

    key 電機機械 value: Int64Index([1513, 1526, 2066], dtype='int64', name='stock_id')

The actual data is just key = 電機機械, value = [1513, 1526, 2066], yet Int64Index(...), dtype and so on are printed as well. What is the reason for this?

--
※ Origin: 批踢踢實業坊 (ptt.cc), from: 59.115.90.63 (Taiwan)
※ Article URL: https://www.ptt.cc/bbs/Python/M.1662888463.A.CC3.html
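A minimal sketch of what is going on, using toy data assumed for illustration (the industry names and stock ids are taken from the post; the small DataFrame itself is made up). `.groups` returns a dict-like object whose values are pandas Index objects holding the row labels of each group, which is why the Index repr shows up when a value is printed, and a dict has no `to_excel`. Pandas 2.0 and later print `Index([...], dtype='int64', ...)` instead of `Int64Index(...)`.

```python
import pandas as pd

# Toy stand-in for the merged table; stock_id is the index, as in the post.
data = pd.DataFrame(
    {"產業類別": ["電機機械", "電機機械", "電機機械", "化學工業"]},
    index=pd.Index([1513, 1526, 2066, 1712], name="stock_id"),
)

groups = data.groupby("產業類別").groups  # dict-like: label -> pandas Index

# Only DataFrame/Series objects have to_excel; this dict-like object does not.
has_to_excel = hasattr(groups, "to_excel")

# Index.tolist() drops the Index wrapper and leaves plain Python ints.
plain = {k: v.tolist() for k, v in groups.items()}
for k, v in plain.items():
    print("key", k, "value:", v)
```

After `tolist()`, each value prints as an ordinary list, without the dtype or name information.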

→ chang1248w: Probably because what you get back is a group object 09/11 17:38

→ chang1248w: You need to convert it back to a dataframe 09/11 17:38

My code is as follows:

    lists = [1218, 1402, 1444, 1513, 1526, 1536, 1712, 2066, 2231, 2305,
             2380, 2387, 2413, 2538, 2542, 2850, 3022, 3202, 3296, 3322,
             3346, 3680, 3710, 4114, 4402, 4743, 5353, 5498, 6279, 6285,
             6570, 6605, 8069, 8105, 8403, 9802, 9907]

    import pandas as pd
    import os

    os.chdir(r'C:\python\xlsx')
    a = pd.DataFrame(lists)
    a.set_axis(['stock_id'], axis='columns', inplace=True)  # name the single column
    b = pd.read_excel('產業類別.xlsx')  # lookup table with stock_id and 產業類別 columns
    data = a.merge(b, on='stock_id')
    data.set_index(['stock_id'], inplace=True)
    e = data.groupby('產業類別')
    f = e.groups  # dict-like: 產業類別 -> stock_id labels
    print(f)

If I print(f), I get this:

    {'光電業': [8069, 8105],
     '其他': [9802, 9907],
     '化學工業': [1712],
     '半導體業 ': [3680],
     '建材營造': [2538, 2542],
     '汽車工業': [1536, 2231, 3346, 6605],
     '生技醫療業': [4114, 4743, 8403],
     '紡織纖維': [1402, 1444, 4402],
     '通信網路業': [5353, 6285],
     '金融保險業': [2850],
     '電子零組件業': [2413, 3202, 3296, 3322, 3710, 5498, 6279],
     '電機機械': [1513, 1526, 2066],
     '電腦及週邊設備業': [2305, 2380, 2387, 3022, 6570],
     '食品工業': [1218]}

But if I pass it to DataFrame, the layout gets messed up and I still can't get it into Excel. With

    for k, v in f.items():
        print(k + ' = ', v)

the result looks like this:

    光電業 = Int64Index([8069, 8105], dtype='int64', name='stock_id')
    其他 = Int64Index([9802, 9907], dtype='int64', name='stock_id')
    化學工業 = Int64Index([1712], dtype='int64', name='stock_id')
    半導體業 = Int64Index([3680], dtype='int64', name='stock_id')
    建材營造 = Int64Index([2538, 2542], dtype='int64', name='stock_id')
    汽車工業 = Int64Index([1536, 2231, 3346, 6605], dtype='int64', name='stock_id')
    生技醫療業 = Int64Index([4114, 4743, 8403], dtype='int64', name='stock_id')
    紡織纖維 = Int64Index([1402, 1444, 4402], dtype='int64', name='stock_id')
    通信網路業 = Int64Index([5353, 6285], dtype='int64', name='stock_id')
    金融保險業 = Int64Index([2850], dtype='int64', name='stock_id')
    電子零組件業 = Int64Index([2413, 3202, 3296, 3322, 3710, 5498, 6279], dtype='int64', name='stock_id')
    電機機械 = Int64Index([1513, 1526, 2066], dtype='int64', name='stock_id')
    電腦及週邊設備業 = Int64Index([2305, 2380, 2387, 3022, 6570], dtype='int64', name='stock_id')
    食品工業 = Int64Index([1218], dtype='int64', name='stock_id')

which I can't get into Excel either. How should I convert it so that the (k + ' = ', v) result can be written to Excel? Thanks.

※ Edited: Pettitte1 (59.115.90.63 Taiwan), 09/11/2022 18:10:50

Push cloki: You'd... need another for loop to output the values in v, right? v is array-like (a pandas Int64Index, as the printed output shows) 09/11 20:09

→ cloki: Printing it directly will of course also print the data type and similar information 09/11 20:11
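A rough sketch of the loop idea above, under the assumption that one row per industry with the ids joined into a single text cell is acceptable. The two-entry `groups` dict is a hand-built stand-in for the real `.groups` result, and the `stock_ids` column name is made up for illustration.

```python
import pandas as pd

# Hand-built stand-in for the .groups result (values copied from the post).
groups = {
    "光電業": pd.Index([8069, 8105], name="stock_id"),
    "化學工業": pd.Index([1712], name="stock_id"),
}

rows = []
for k, v in groups.items():
    # Iterating over v yields the bare values, without the Index repr.
    rows.append({"產業類別": k, "stock_ids": ", ".join(str(x) for x in v)})

table = pd.DataFrame(rows)  # a real DataFrame, so it has to_excel
print(table)
```

Because `table` is a DataFrame rather than a dict, `table.to_excel(...)` is available on it.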

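→ lycantrope: I can't tell what output you want. Do you want to split by industry and save each one with its ids separately? 09/11 22:30

→ lycantrope: Or save everything as a single table straight to Excel? 09/11 22:30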

I want to save it to Excel like this:

        A        B          C
    1   光電業   化學工業
    2   8069     1712
    3   8105

Thanks.

※ Edited: Pettitte1 (27.247.201.78 Taiwan), 09/12/2022 08:53:45

→ KSJ: https://bit.ly/3d496eP 09/12 09:43

→ KSJ: pd.DataFrame(dict([(k, pd.Series(v)) for k, v in d.items()])) 09/12 09:43

→ KSJ: d is your groups 09/12 09:43
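The one-liner above can be tried out roughly as follows. This is a sketch with a hand-built two-entry `d` standing in for the real `.groups` result, written with a dict comprehension instead of `dict([...])`; the variable name `wide` is made up here.

```python
import pandas as pd

# Hand-built stand-in for the .groups result (values copied from the post).
d = {
    "光電業": pd.Index([8069, 8105], name="stock_id"),
    "化學工業": pd.Index([1712], name="stock_id"),
}

# Each pd.Series(v) gets a fresh 0..n-1 index, so the columns line up
# row by row and shorter groups are padded with NaN.
wide = pd.DataFrame({k: pd.Series(v) for k, v in d.items()})
print(wide)
```

This gives one column per industry, which matches the layout requested earlier in the thread. Columns that were padded with NaN become floating point; `wide.astype("Int64")` should turn them back into nullable integers if that matters for the spreadsheet.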

Push lycantrope: https://pastebin.com/QEE0H2Gs 09/12 11:05

Push lycantrope: https://pastebin.com/9Xq2p6r9 09/12 12:21

In the end I solved it by passing groupby.groups to pd.DataFrame.from_dict(groupby.groups, orient='index'), which turns it into a dataframe. Thanks for the help, everyone.

※ Edited: Pettitte1 (27.247.201.78 Taiwan), 09/14/2022 13:43:15
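A small sketch of that final approach on toy data (the three-row DataFrame is assumed for illustration). Converting each Index to a plain list before calling `from_dict` is an extra precaution added here, not something stated in the post, and the names `by_row` and `by_col` are made up.

```python
import pandas as pd

# Toy stand-in for the merged table; stock_id is the index, as in the post.
data = pd.DataFrame(
    {"產業類別": ["光電業", "光電業", "化學工業"]},
    index=pd.Index([8069, 8105, 1712], name="stock_id"),
)
groups = data.groupby("產業類別").groups

# orient='index' makes each dict key a row; shorter groups are padded with NaN.
by_row = pd.DataFrame.from_dict(
    {k: list(v) for k, v in groups.items()}, orient="index"
)

# Transposing gives one column per industry, the layout asked for in the thread.
by_col = by_row.T
print(by_col)
```

From here, something like `by_col.to_excel("groups.xlsx", index=False)` should write the sheet; the file name is hypothetical, and an Excel writer engine such as openpyxl needs to be installed.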