[Question] Mutual Information Max. vs Disentangled

Author: lifeowner (珍惜一分一秒)

Board: DataScience

Title: [Question] Mutual Information Max. vs Disentangled

Time: Thu Dec 1 16:23:45 2022

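Hi everyone: I've recently been reading about InfoGAN (and Beta-VAE). Both aim to make the latent variables (latent representation) of the data as disentangled as possible, so that a given attribute is expressed by one specific dimension of the latent variable. The approach I've seen so far is Mutual Information Maximization. However, I can't find (or don't understand) why MI Max --> disentangled representation. Is there an intuitive explanation, or has this been rigorously proven? Thanks in advance for any help.
--
※ Origin: 批踢踢實業坊 (ptt.cc), From: 223.137.22.220 (Taiwan)
※ Article URL: https://www.ptt.cc/bbs/DataScience/M.1669883027.A.C18.html
※ Edited: lifeowner (223.137.22.220 Taiwan), 12/01/2022 16:25:03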

→ chang1248w: arXiv:1802.04942 12/01 17:47

→ chang1248w: In your understanding, which variables is the MI computed between? 12/01 17:49

Image vs latent code
※ Edited: lifeowner (223.137.22.220 Taiwan), 12/01/2022 18:17:42

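→ chang1248w: InfoGAN's approach assumes the latent code itself is already disentangled 12/02 14:52

→ chang1248w: and the MI regularization term then forces the GAN to actually use the information in the latent code 12/02 14:54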
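
For reference, the MI regularization described in the two comments above can be written out as follows. This is a sketch in the notation of the InfoGAN paper as I recall it, not something stated in this thread: $V(D,G)$ is the usual GAN value function, $c$ is the latent code, $z$ is the remaining noise, $Q$ is an auxiliary network approximating the posterior over $c$, and $\lambda$ is the regularization weight.

```latex
\min_{G,Q}\max_{D}\; V_{\text{InfoGAN}}(D,G,Q) = V(D,G) - \lambda\, L_I(G,Q),
\qquad
L_I(G,Q) = \mathbb{E}_{c\sim P(c),\; x\sim G(z,c)}\bigl[\log Q(c\mid x)\bigr] + H(c)
\;\le\; I\bigl(c;\, G(z,c)\bigr)
```

Read this way, the term rewards the generator for producing outputs from which $c$ can be recovered. Any factorized structure in $c$ comes from the prior $P(c)$ chosen for it, not from the MI term itself.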

→ chang1248w: MI(X, Y) = KL(p(x,y)||p(x)p(y)) has maximum when 12/02 15:24

→ chang1248w: er, I mean minimum... when p(x,y)=p(x)p(y) 12/02 15:28

→ chang1248w: in that case the generator output and the latent code are completely 12/02 15:29

→ chang1248w: unrelated 12/02 15:30
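
A small numerical check of the point above: MI, written as KL(p(x,y)||p(x)p(y)), is zero exactly when the joint factorizes and grows as the variables become dependent. This is a minimal sketch for discrete distributions only. The function name `mutual_information` and the two toy joint tables are made up for illustration and have nothing to do with an actual GAN.

```python
import numpy as np

def mutual_information(joint):
    """MI(X, Y) = KL(p(x,y) || p(x)p(y)) in nats, for a discrete joint table."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()              # normalize to a valid distribution
    px = joint.sum(axis=1, keepdims=True)    # marginal p(x), shape (n, 1)
    py = joint.sum(axis=0, keepdims=True)    # marginal p(y), shape (1, m)
    indep = px * py                          # product of marginals p(x)p(y)
    mask = joint > 0                         # treat 0 * log 0 as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))

# Toy case 1: p(x,y) = p(x)p(y), so Y carries no information about X.
independent = np.outer([0.5, 0.5], [0.3, 0.7])
# Toy case 2: Y is fully determined by X.
dependent = np.array([[0.5, 0.0],
                      [0.0, 0.5]])

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
print(mi_independent, mi_dependent)
```

The independent table gives an MI of (numerically) zero, while the fully dependent table gives log 2 nats, the entropy of a fair binary variable.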

→ chang1248w: Also, intuitively, it is the VAE whose latent nodes minimize the MI among 12/02 15:33

→ chang1248w: themselves 12/02 15:34
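
One way to make "MI among the latent nodes" precise is the total correlation, assuming that is what the remark above refers to. The decomposition below is the one I recall from arXiv:1802.04942, cited earlier in the thread. Here $q(z\mid n)$ is the encoder for data point $n$, $q(z)$ is the aggregate posterior, and $p(z)$ is the factorized prior.

```latex
\mathbb{E}_{p(n)}\bigl[\mathrm{KL}\bigl(q(z\mid n)\,\|\,p(z)\bigr)\bigr]
= \underbrace{I_q(z;n)}_{\text{index-code MI}}
+ \underbrace{\mathrm{KL}\Bigl(q(z)\,\Big\|\,\prod_j q(z_j)\Bigr)}_{\text{total correlation}}
+ \underbrace{\sum_j \mathrm{KL}\bigl(q(z_j)\,\|\,p(z_j)\bigr)}_{\text{dimension-wise KL}}
```

The middle term is zero exactly when the latent dimensions are mutually independent under $q(z)$. Penalizing the KL term more heavily, as Beta-VAE does, therefore also pushes the total correlation down.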

→ truehero: arxiv 1811.12359v4 12/04 23:19

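→ pups003: I don't think it's quite as chang described; the paper does say MI maximization 02/15 23:39

→ pups003: My understanding is that it's not maximizing MI -> disentangled, 02/15 23:39

→ pups003: but rather that a categorical code was used 02/15 23:42

→ pups003: MI max between the latent and the output is generally used to mitigate mode collapse 02/15 23:43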