[Question] lxml.html percent-encodes Chinese URLs

Author: ggirls (哥)

Board: Python

Title: [Question] lxml.html percent-encodes Chinese URLs

Time: Fri Oct 7 21:48:31 2016

import lxml.html
e = lxml.html.fromstring('<a name="中文">什麼</a>')
print(lxml.html.tostring(e, encoding='unicode'))

Result:

<a name="%E4%B8%AD%E6%96%87">什麼</a>

The URL-type attribute values all get percent-encoded. That is probably what the spec calls for, but is there a way to tell it not to encode them?

--
※ Origin: 批踢踢實業坊 (ptt.cc), From: 223.136.175.128
※ Article URL: https://www.ptt.cc/bbs/Python/M.1475848114.A.48B.html

推 longlongint: If you are not going to send it over HTTP, there is no need to encode it 10/08 01:00

→ s860134: urllib.parse.unquote(result) Good thing you are on python3 10/08 01:00

→ s860134: On python2 you would have the mild annoyance of copying a chunk out of \Lib\urlparse.py 10/08 01:02
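
For anyone landing here later, the unquote suggestion above can be sketched roughly like this on Python 3. It is only a minimal sketch: `result` is hard-coded to the serialized string quoted in the original post so the snippet runs without lxml installed, and the names `result` and `restored` are just illustrative.

```python
from urllib.parse import unquote

# Assumption: this is the string that lxml.html.tostring(e, encoding='unicode')
# returned in the original post, pasted in as a literal.
result = '<a name="%E4%B8%AD%E6%96%87">什麼</a>'

# unquote() decodes %XX escapes as UTF-8 by default and leaves
# characters that are already non-ASCII untouched.
restored = unquote(result)

print(restored)  # <a name="中文">什麼</a>
```

One caveat: this unquotes the whole serialized document, so any literal percent escape that was meant to stay in the text or in another attribute (for example a `%20` inside an href) gets decoded as well. If that matters, it is probably safer to unquote only the attribute values you care about.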