root/trunk/plagger/assets/plugins/Filter-EntryFullText/japanese_chosun_com.yaml

Revision 1931 (checked in by otsune, 13 years ago)

Add EFT for japanese.chosun.com via http://www.mhatta.org/diary/?date=20070228#p01

# http://japanese.chosun.com/
author: mhatta
custom_feed_handle: http://japanese\.chosun\.com/
custom_feed_follow_link: /site/data/html_dir/\d{4}/\d{2}/\d{2}/\d+\.html
handle: http://japanese\.chosun\.com/site/data/html_dir/\d{4}/\d{2}/\d{2}/\d+\.html
extract: <!!--titlestart-->(.*?)<!!--titleend-->.*?<!!--subtitlestart-->(.*?)<!!--subtitleend-->
extract_capture: title subtitle
extract_xpath:
  body: //td[@class="news"]
extract_after_hook: |
  $data->{body} = $data->{subtitle} . '<br>' . $data->{body} if $data->{subtitle};
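
The following is a rough sketch, outside Plagger itself, of how the `handle`, `extract`, `extract_capture`, and `extract_after_hook` entries above might fit together. It is written in Python purely for illustration. The function names, the sample URL, and the sample HTML are hypothetical; the unanchored URL match and the dot-matches-newline flag are assumptions about how the patterns are applied, and the `extract_xpath` step is skipped, with the body passed in by hand.

```python
import re

# Patterns copied verbatim from the config above.
HANDLE = r"http://japanese\.chosun\.com/site/data/html_dir/\d{4}/\d{2}/\d{2}/\d+\.html"
EXTRACT = (r"<!!--titlestart-->(.*?)<!!--titleend-->.*?"
           r"<!!--subtitlestart-->(.*?)<!!--subtitleend-->")
EXTRACT_CAPTURE = "title subtitle".split()


def extract_fields(url, html):
    """Return captured fields keyed by name, or None if the URL or markup does not match."""
    # Assumption: the handle pattern is applied as an unanchored search.
    if re.search(HANDLE, url) is None:
        return None
    # Assumption: '.' may cross newlines, hence re.DOTALL.
    match = re.search(EXTRACT, html, re.DOTALL)
    if match is None:
        return None
    # Pair each name in extract_capture with its capture group, in order.
    return dict(zip(EXTRACT_CAPTURE, match.groups()))


def after_hook(data):
    """Mirror the Perl hook: prepend the subtitle and '<br>' to the body when a subtitle exists."""
    if data.get("subtitle"):
        data["body"] = data["subtitle"] + "<br>" + data.get("body", "")
    return data


# Hypothetical inputs, invented for this example.
sample_url = "http://japanese.chosun.com/site/data/html_dir/2007/02/28/20070228000001.html"
sample_html = (
    "<!!--titlestart-->Headline<!!--titleend-->\n"
    "<p>byline</p>\n"
    "<!!--subtitlestart-->Lead paragraph<!!--subtitleend-->"
)

fields = extract_fields(sample_url, sample_html)
fields["body"] = '<td class="news">Article text</td>'  # stands in for the XPath result
result = after_hook(fields)
```

With these inputs, `result["body"]` starts with the subtitle followed by `<br>`. An empty subtitle leaves the body untouched, which matches the `if $data->{subtitle}` guard in the hook.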