Ugh. I give up. Today a customer of ours at Broadchoice ran into a bug with an RSS feed. The feed was being generated by cffeed of course. What was interesting is that this was a whole new bug for me. Yes, another one. I swear I love ColdFusion. Really, I do. I just don't know why this one darn tag seems to trouble me so much. I swear this tag has a personal vendetta against me. So what went wrong this time?
<!--- The first line of the original listing was lost; this queryNew() call is an assumed reconstruction. --->
<cfset getEntries = queryNew("title,content,publisheddate")>

<cfset queryAddRow(getEntries)>
<cfset querySetCell(getEntries,"title", "LAST ENTRY")>
<cfset querySetCell(getEntries,"content", "<b>Test</b>")>
<cfset querySetCell(getEntries,"publisheddate", now())>

<cfset queryAddRow(getEntries)>
<cfset querySetCell(getEntries,"title", "LAST ENTRY2")>
<!--- chr(19) is the control character that triggers the error --->
<cfset querySetCell(getEntries,"content", "#chr(8220)#Test#chr(8220)# #chr(19)#")>
<cfset querySetCell(getEntries,"publisheddate", now())>

<cfset props = {version="rss_2.0",title="Test Feed",link="http://127.0.0.1",description="Test"}>

<cffeed action="create" properties="#props#" query="#getEntries#" xmlVar="result">

<cfcontent type="text/xml" reset="true"><cfoutput>#result#</cfoutput>
Running this throws the following error:

The input values might be invalid. The reason for exception is :
The data "Test X" is not legal for a JDOM character content: 0x13 is not a legal XML character.

As just an FYI, the "X" above was the literal bad character. I replaced it so as to not cause any possible problems in my own RSS feed here. Great. Not a legal XML character. Hey, I bet xmlFormat() will fix it, right? Of course not. As I said in the beginning. Ugh.

So to fix it, I modified the UDF mentioned in the earlier blog entry to just replace chr(19) with nothing. You know - I get that different encodings can impact what's valid in XML. But would it be that hard to ask cffeed to sniff the current settings and just remove what isn't valid? Especially since it will be (most likely) crap characters like funky quotes or the like? Seriously - am I the only one having so much trouble with cffeed?
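Dropping chr(19) alone only fixes this one feed. A broader filter would remove every character that falls outside the XML 1.0 "Char" production before the text reaches cffeed. What follows is a minimal sketch of that idea in Java (the layer ColdFusion runs on and where the JDOM error comes from). The class and method names are hypothetical, and this is not the UDF from the earlier entry, just one way the check could be written:

```java
// Hypothetical helper, not part of ColdFusion or JDOM.
final class XmlCharFilter {
    private XmlCharFilter() {}

    /** True if the code point is allowed by the XML 1.0 "Char" production. */
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD      // tab, line feed, carriage return
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the input with every illegal code point dropped. */
    static String stripIllegalXmlChars(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);               // handles surrogate pairs
            if (isLegalXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same content as the second test row: two chr(8220) quotes plus chr(19).
        String content = (char) 8220 + "Test" + (char) 8220 + " " + (char) 19;
        String clean = stripIllegalXmlChars(content);
        System.out.println(content.length() + " -> " + clean.length()); // prints "8 -> 7"
    }
}
```

Note that the curly quotes survive: chr(8220) is a perfectly legal XML character, so only the chr(19) control character is removed.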
Comment 1 written by Ryan Stille on 29 August 2008, at 1:25 PM
Common characters I encountered that caused problems were ascii codes 11, 8220, 8221, 8216, 8217, 8211, 8212, 8226, 8230, and 8482. Probably all from MS Word.
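One way to handle a list like this is to swap each code for a plain ASCII stand-in before the feed is built. Below is a minimal sketch in Java; the class name, method name, and every replacement string are illustrative assumptions, not taken from any CFLib UDF or Adobe code. Of the codes listed, only 11 (vertical tab) is actually illegal in XML 1.0; the rest are legal characters that tend to cause trouble when encodings get mixed up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: maps common MS Word characters to ASCII stand-ins.
final class WordCharCleaner {
    private static final Map<Character, String> REPLACEMENTS = new HashMap<>();
    static {
        REPLACEMENTS.put((char) 11, "");        // vertical tab: simply dropped
        REPLACEMENTS.put((char) 8220, "\"");    // left double quote
        REPLACEMENTS.put((char) 8221, "\"");    // right double quote
        REPLACEMENTS.put((char) 8216, "'");     // left single quote
        REPLACEMENTS.put((char) 8217, "'");     // right single quote
        REPLACEMENTS.put((char) 8211, "-");     // en dash
        REPLACEMENTS.put((char) 8212, "--");    // em dash
        REPLACEMENTS.put((char) 8226, "*");     // bullet
        REPLACEMENTS.put((char) 8230, "...");   // ellipsis
        REPLACEMENTS.put((char) 8482, "(TM)");  // trademark sign
    }

    private WordCharCleaner() {}

    /** Replaces each mapped character; everything else passes through unchanged. */
    static String clean(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String replacement = REPLACEMENTS.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = (char) 8220 + "Test" + (char) 8221 + (char) 8230;
        System.out.println(clean(input)); // prints "Test"... with straight quotes
    }
}
```

A lookup table keeps the list easy to extend when another odd character turns up in pasted content.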
Comment 2 written by Rob Brooks-Bilson on 29 August 2008, at 1:47 PM
Do you guys have paid support with Adobe? It seems to me that if you do, this is something that they could/should be able to fix for you and issue a hotfix for. I know you have a work-around, but it's stuff like this that Adobe should be issuing more hotfixes for between point releases.
Comment 3 written by Sean Corfield on 29 August 2008, at 1:57 PM
Comment 4 written by Raymond Camden on 29 August 2008, at 2:00 PM
@Sean - That sounds like a challenge to me. :) Going to try that.
Comment 5 written by Raymond Camden on 29 August 2008, at 2:40 PM
But anyway, I took one of their demos, and modified it to include bad character 19 in the output. It runs just fine. On display, it shows up invisible on my screen, but the point is, it runs just fine. Here is the code (again, copyright goes to the Groovy in Action folks, I modified other bits of the code as well while playing):
char bad = 19
def builder = new groovy.xml.MarkupBuilder()
builder.numbers {
    for(i in 10..15) {
        number (value: i, square: i*i, double:i*2, label:'Hard coded '+bad + ' more text') {
            for (j in 2..<i) {
                if(i % j == 0) {
                    factor (value:j)
                }
            }
        }
    }
}
I hope to heck that renders ok here.
Comment 6 written by Phillip Senn on 29 August 2008, at 3:30 PM
The Whale: Ahhh! Woooh! What's happening? Who am I? Why am I here? What's my purpose in life? What do I mean by who am I? Okay okay, calm down calm down get a grip now. Ooh, this is an interesting sensation. What is it? It's a sort of tingling in my... well I suppose I better start finding names for things. Let's call it a... tail! Yeah! Tail! And hey, what's this roaring sound, whooshing past what I'm suddenly gonna call my head? Wind! Is that a good name? It'll do. Yeah, this is really exciting. I'm dizzy with anticipation! Or is it the wind? There's an awful lot of that now isn't it? And what's this thing coming toward me very fast? So big and flat and round, it needs a big wide sounding name like 'Ow', 'Ownge', 'Round', 'Ground'! That's it! Ground! Ha! I wonder if it'll be friends with me? Hello Ground!
[dies]
http://www.imdb.com/title/tt0371724/quotes
Comment 7 written by Nicholas on 30 August 2008, at 10:54 AM
// remove anything outside of explicit hex range (x20-x7F=standard chars, x0A=line feed, x0D=carriage return)
rssXml = reReplace(rssXml,"[^\x20-\x7F\x0A\x0D]","","all");
Comment 8 written by Elliott Sprehn on 1 September 2008, at 1:05 AM
The XML specification [1] states that char(19), and all the ascii control characters are not allowed in well formed XML.
If Groovy's MarkupBuilder is actually outputting char(19) then Groovy is broken. It's possible it's stripping the characters for you though, you'd have to check. If it's not, then the XML document you just generated is invalid, and almost every other XML parsing library will choke on it.
Yeah it's a real pain that this stuff happens, but that's what you get from a strict validating language like XML. CF is just using Xerces [2] for XML processing, so the error you're seeing has really nothing to do with CF at all, but rather the fact that Xerces followed the XML specification.
[1] <http://www.w3.org/TR/xml11/#NT-RestrictedChar>
[2] <http://xerces.apache.org/xerces-j/>
Comment 9 written by Raymond Camden on 1 September 2008, at 7:43 AM
So here is a crazy question. If chr(19) is never allowed in well formed XML, why doesn't xmlFormat remove it? Or handle it? I'd call that a bug.
Comment 10 written by Elliott Sprehn on 1 September 2008, at 10:55 AM
Well, I'm not sure I'd call it a bug since XMLFormat() is fairly specific about what it escapes in the docs. Certainly it'd be useful if XMLFormat() removed all the restricted chars, but as it stands it's really more like HTMLEditFormat() than some kind of XMLSanitize().
Seems like CF just uses org.apache.commons.lang.StringEscapeUtils.escapeXml(), which isn't designed to remove all the restricted chars.
Why not file a wish ticket about it? :)
http://livedocs.adobe.com/coldfusion/8/functions_t...
Comment 11 written by Raymond Camden on 1 September 2008, at 11:04 AM
Comment 12 written by Adam Tuttle on 2 September 2008, at 8:01 AM
Comment 13 written by Phillip Senn on 2 September 2008, at 8:41 AM
Thank you.
I found a 10 minute interview with Twitter founder Biz Stone at
http://www.npr.org/templates/story/story.php?story...
Comment 14 written by Adam Tuttle on 2 September 2008, at 8:44 AM
Comment 15 written by Jose Galdamez on 24 August 2009, at 6:02 PM
That's when I found this UDF on CFLib.
DeMoronize
http://cflib.org/index.cfm?event=page.udfbyid&...
I don't know if it'd work in this situation, but I'd say the function name is definitely appropriate.
Comment 16 written by Charles on 3 September 2009, at 9:30 AM
Comment 17 written by Raymond Camden on 3 September 2009, at 9:31 AM