Re: [Question] About tree structures

Author: kornelius (c9s)

Board: Perl

Title: Re: [Question] About tree structures

Time: Mon May 18 23:25:03 2009

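You can use pQuery, written by Ingy, or Web::Scraper, written by miyagawa.
Either one lets you pull out the information conveniently in just a few lines.

Here is the Web::Scraper synopsis:

    use URI;
    use Web::Scraper;

    my $ebay_auction = scraper {
        process "h3.ens>a",
            description => 'TEXT',
            url => '@href';
        process "td.ebcPr>span", price => "TEXT";
        process "div.ebPicture >a>img", image => '@src';
    };

    my $ebay = scraper {
        process "table.ebItemlist tr.single",
            "auctions[]" => $ebay_auction;
        result 'auctions';
    };

    my $res = $ebay->scrape( URI->new("http://search.ebay.com/apple-ipod-nano_W0QQssPageNameZWLRS") );

※ Quoting abcg5 (nothing):
: Here is my problem!
: The program I am writing needs
: to use a DOM tree structure.
: I am using Perl's built-in module HTML::TreeBuilder;
: first I call my $h = $tree->look_down('_tag', 'html'); to build the structure!
: Next I want to read the string inside each text node
: and process each one separately!!
: But the only ways I can find to read them are as_text and content_list.
: The former merges all the text nodes into a single string!
: So I cannot process the strings individually!
: The latter only returns references to the child level, so I have to walk down one level at a time!
: That is very inefficient! Whenever the HTML file has a complex structure, there are many levels to walk!
: Perl feels like a fairly mature language!
: So it seems unlikely that there is no way to read the content of individual text nodes directly!!
: That is why I am asking here!!
: Any guidance would be appreciated!

--
--
※ Origin: 批踢踢實業坊 (ptt.cc)
◆ From: 60.249.21.58
※ Edited: kornelius from: 60.249.21.58 (05/19 00:12)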
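
The eBay synopsis depends on a live page, so here is a minimal sketch of the same idea applied to the original question: getting each piece of text as its own string instead of one merged string. The HTML sample and the key name texts are made up for illustration, and it assumes Web::Scraper is installed from CPAN. It passes the HTML to scrape() as a plain string rather than a URI.

```perl
use strict;
use warnings;
use Web::Scraper;

# Hypothetical sample document, not from the original thread.
my $html = <<'HTML';
<html><body>
  <div id="outer">
    <p>first paragraph</p>
    <div id="inner">
      <p>second paragraph</p>
    </div>
  </div>
</body></html>
HTML

# "texts[]" collects one entry per matched <p>, at any nesting depth,
# so there is no need to walk the tree level by level.
my $paragraphs = scraper {
    process "p", "texts[]" => 'TEXT';
};

# scrape() also accepts an HTML string instead of a URI object.
my $res = $paragraphs->scrape($html);

# Each entry is the text of one matched element, as a separate string.
for my $text (@{ $res->{texts} }) {
    print "[$text]\n";
}
```

Because the selector matches by tag rather than by depth, the nested <p> is picked up the same way as the top-level one.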