Ticket #95 (closed defect: fixed)

Opened 13 months ago

Last modified 13 months ago

rss parser: possible character encoding problem

Reported by: erwin
Owned by: somebody
Priority: major
Milestone: M5
Component: other
Version: 1.0
Keywords:
Cc:

Description

Some RSS feeds can't be parsed. The parser fails with "Content is not allowed in prolog.", even though the feed is valid according to feedvalidator.org.

It could have something to do with the encoding that is used to read the RSS feed.

Example of a feed that goes wrong: http://www.hbvl.be/syndication/service.ashx?src=limburg

Change History

Changed 13 months ago by joeri

This happens because the feed doesn't have the required <?xml version="1.0"?> prolog. One way to work around this is to read the feed manually instead of letting the XML parser fetch the URL. We can then check whether the prolog is present and add it if needed.
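
The workaround described above could be sketched in Java roughly as follows. This is only an illustration: the names PrologFixer and ensureXmlDeclaration are hypothetical and not taken from the project's code, and the sketch assumes the feed has already been read into a String.

```java
// Hypothetical helper, not part of the project's code base.
class PrologFixer {
    // Declaration to prepend when the feed has none (assumed default).
    static final String DEFAULT_DECLARATION = "<?xml version=\"1.0\"?>";

    // Returns the feed text unchanged if it already starts with an XML
    // declaration, otherwise returns it with DEFAULT_DECLARATION prepended.
    static String ensureXmlDeclaration(String feedText) {
        // "<?xml" must be followed by whitespace, so that processing
        // instructions such as "<?xml-stylesheet" are not mistaken for it.
        boolean hasDeclaration = feedText.startsWith("<?xml")
                && feedText.length() > 5
                && Character.isWhitespace(feedText.charAt(5));
        return hasDeclaration ? feedText : DEFAULT_DECLARATION + feedText;
    }
}
```

Note that a check like this does not remove a leading byte order mark. If the text starts with one, the parser would still see content before the declaration.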

Changed 13 months ago by joeri

  • status changed from new to closed
  • resolution set to fixed
  • milestone changed from M6 to M5

The problem was the BOM (byte order mark) in front of the feed XML. Instead of reading in the XML using a URLConnection and a BufferedReader, we now pass the URL string directly to the XML parser.
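
A minimal sketch of why this helps, assuming the standard JAXP parser that ships with the JDK. The class BomDemo and its methods are hypothetical, and an in-memory byte array stands in for the remote feed so the example needs no network access. When the parser receives raw bytes (as it does when it opens the URL itself), it detects and skips the UTF-8 BOM. When the bytes are first decoded to text, the BOM survives as the character U+FEFF and the parser sees it as content in front of the document.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of the project's code base.
class BomDemo {
    // Builds a tiny feed preceded by the UTF-8 byte order mark (EF BB BF).
    static byte[] feedWithBom() {
        byte[] bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
        byte[] body = "<rss version=\"2.0\"><channel><title>demo</title></channel></rss>"
                .getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[bom.length + body.length];
        System.arraycopy(bom, 0, out, 0, bom.length);
        System.arraycopy(body, 0, out, bom.length, body.length);
        return out;
    }

    // Returns true if the parser accepts the source, false on any parse error.
    static boolean parses(InputSource source) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    // The parser reads the raw bytes and handles the BOM itself.
    static boolean parsesAsBytes(byte[] data) {
        return parses(new InputSource(new ByteArrayInputStream(data)));
    }

    // The bytes are decoded first, so the BOM reaches the parser as U+FEFF.
    static boolean parsesAsDecodedText(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        return parses(new InputSource(new StringReader(text)));
    }

    public static void main(String[] args) {
        byte[] feed = feedWithBom();
        System.out.println("raw bytes parse:    " + parsesAsBytes(feed));
        System.out.println("decoded text parse: " + parsesAsDecodedText(feed));
    }
}
```

With a real feed, the equivalent of the byte-based path would be handing the URL string to DocumentBuilder.parse(String uri) rather than wrapping the connection in a reader first.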
