Thread: decode html special entities like &amp; to normal text?

Created on: 06/24/08 11:29 AM

Replies: 6

reincat


New Member


Joined: 06/24/08

Posts: 2

decode html special entities like &amp; to normal text?
06/24/08 11:29 AM

Is there a way to decode HTML special entities like &amp; and &eacute; to normal text? I need to clean an nvarchar field in an MSSQL database.

marcovandenoever


Member


Joined: 02/20/07

Posts: 82

RE: decode html special entities like &amp; to normal text?
06/24/08 4:31 PM

I think this script, which uses the find-and-replace functions, will do:

  <cfscript>
/**
 * Fixes text using Microsoft Latin-1 "Extensions", namely ASCII characters 128-160.
 * 
 * @param text 	 Text to be modified. (Required)
 * @return Returns a string. 
 * @author Shawn Porter (sporter@rit.net) 
 * @version 1, June 16, 2004 
 */
function DeMoronize (text) {
	var i = 0;

    // map incompatible non-ISO characters into plausible 
	// substitutes
	text = Replace(text, Chr(128), "&euro;", "All");

	text = Replace(text, Chr(130), ",", "All");
	text = Replace(text, Chr(131), "<em>f</em>", "All");
	text = Replace(text, Chr(132), ",,", "All");
	text = Replace(text, Chr(133), "...", "All");
		
	text = Replace(text, Chr(136), "^", "All");

	text = Replace(text, Chr(139), "<", "All");
	text = Replace(text, Chr(140), "Oe", "All");

	text = Replace(text, Chr(145), "`", "All");
	text = Replace(text, Chr(146), "'", "All");
	text = Replace(text, Chr(147), """", "All");
	text = Replace(text, Chr(148), """", "All");
	text = Replace(text, Chr(149), "*", "All");
	text = Replace(text, Chr(150), "-", "All");
	text = Replace(text, Chr(151), "--", "All");
	text = Replace(text, Chr(152), "~", "All");
	text = Replace(text, Chr(153), "&trade;", "All");

	text = Replace(text, Chr(155), ">", "All");
	text = Replace(text, Chr(156), "oe", "All");

	// remove any remaining ASCII 128-159 characters
	for (i = 128; i LTE 159; i = i + 1)
		text = Replace(text, Chr(i), "", "All");

	// map Latin-1 supplemental characters into
	// their &name; encoded substitutes
	text = Replace(text, Chr(160), "&nbsp;", "All");

	text = Replace(text, Chr(163), "&pound;", "All");

	text = Replace(text, Chr(169), "&copy;", "All");

	text = Replace(text, Chr(176), "&deg;", "All");

	// encode ASCII 160-255 using &#999; format
	for (i = 160; i LTE 255; i = i + 1)
		text = REReplace(text, "(#Chr(i)#)", "&###i#;", "All");
	
    // supply missing semicolon at end of numeric entities
	text = ReReplace(text, "&##([0-2][[:digit:]]{2})([^;])", "&##\1;\2", "All");
	
    // fix obscure numeric rendering of &lt; &gt; &amp;
	text = ReReplace(text, "&##038;", "&amp;", "All");
	text = ReReplace(text, "&##060;", "&lt;", "All");
	text = ReReplace(text, "&##062;", "&gt;", "All");

	// supply missing semicolon at the end of &amp; &quot;
	text = ReReplace(text, "&amp([^;])", "&amp;\1", "All");
	text = ReReplace(text, "&quot([^;])", "&quot;\1", "All");
	text = ReReplace(text, "<BR>","", "all");

	return text;
}
</cfscript>

Bradley


Member


Joined: 05/12/08

Posts: 90

RE: decode html special entities like &amp; to normal text?
06/25/08 2:09 PM

Uhhhh... hey guys...

Instead of that big huge script file, you could just enclose the special entities like so...

<cfoutput>
   #ToString("&amp;")#<br>
   #ToString("&eacute;")#
</cfoutput>

...or am I completely wrong and this isn't what you're looking for?

- Bradley
| http://bradleybeard.blogspot.com

marcovandenoever


Member


Joined: 02/20/07

Posts: 82

RE: decode html special entities like &amp; to normal text?
06/25/08 2:12 PM

Euhmmmm :) As I understand it, he wants to replace the codes? Well, at least there are enough options to work with now.

Bradley


Member


Joined: 05/12/08

Posts: 90

RE: decode html special entities like &amp; to normal text?
06/25/08 2:18 PM

Very true! It's always better to have more than one way to go than to be pigeonholed into one way forever.

Let us know how it goes...

- Bradley
| http://bradleybeard.blogspot.com

reincat


New Member


Joined: 06/24/08

Posts: 2

RE: decode html special entities like &amp; to normal text?
06/26/08 2:50 AM

Thanks for your suggestions. PHP has html_entity_decode, but in CF you have to write your own. I found this one on the net (I don't remember where), modified it, and it did the job well. Also, if you build a query and dump it with the before and after text, the special characters show up literally in the browser instead of being rendered, which is really helpful when debugging these characters.


<cffunction name="HtmlUnEditFormat" access="public" returntype="string" output="no" displayname="HtmlUnEditFormat" hint="Undo escaped characters">
<cfargument name="str" type="string" required="Yes" />
<cfscript>
var lEntities = "&##xE7;,&##xF4;,&##xE2;,Î,Ç,È,Ó,Ê,&OElig,Â,«,»,À,É,≤,ý,χ,∑,′,ÿ,∼,β,⌈,ñ,ß,„,´,·,–,ς,®,†,⊕,õ,η,⌉,ó,­,>,φ,∠,‏,α,∩,↓,υ,ℑ,³,ρ,é,¹,<,¢,¸,π,⊃,÷,ƒ,¿,ê, ,∅,∀, ,γ,¡,ø,¬,à,ð,ℵ,º,ψ,⊗,δ,ö,°,≅,ª,‹,♣,â,ò,ï,♦,æ,∧,◊,è,¾,&,⊄,ν,“,∈,ç,ˆ,©,á,§,—,ë,κ,∉,⌊,≥,ì,↔,∗,ô,∞,¦,∫,¯,½,¤,≈,λ,⁄,‘,…,œ,£,♥,−,ã,ε,∇,∃,ä,μ,¼, ,≡,•,←,«,‾,∨,€,µ,≠,∪,å,ι,í,⊥,¶,→,»,û,ο,‚,ϑ,∋,∂,”,℘,‰,²,σ,⋅,š,¥,ξ,±,ℜ,þ,⟩,ù,√,‍,∴,↑,×, ,θ,⌋,⊂,⊇,ü,’,ζ,™,î,ϖ,‌,⟨,˜,ú,¨,∝,ϒ,ω,↵,τ,⊆,›,∏,",‎,♠";
var lEntitiesChars = "ç,ô,â,Î,Ç,È,Ó,Ê,Œ,Â,«,»,À,É,?,ý,?,?,?,Ÿ,?,?,?,ñ,ß,„,´,·,–,?,®,‡,?,õ,?,?,ó,­,>,?,?,?,?,?,?,?,?,³,?,é,¹,<,¢,¸,?,?,÷,ƒ,¿,ê,?,?,?,?,?,¡,ø,¬,à,ð,?,º,?,?,?,ö,°,?,ª,‹,?,â,ò,ï,?,æ,?,?,è,¾,&,?,?,“,?,ç,ˆ,©,á,§,—,ë,?,?,?,?,ì,?,?,ô,?,¦,?,¯,½,¤,?,?,?,‘,…,œ,£,?,?,ã,?,?,?,ä,?,¼, ,?,•,?,«,?,?,€,µ,?,?,å,?,í,?,¶,?,»,û,?,‚,?,?,?,”,?,‰,²,?,?,š,¥,?,±,?,þ,?,ù,?,?,?,?,×,?,?,?,?,?,ü,’,?,™,î,?,?,?,˜,ú,¨,?,?,?,?,?,?,›,?,"",?,?";
</cfscript>
<cfreturn ReplaceList(arguments.str, lEntities, lEntitiesChars) />
</cffunction>
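
A minimal sketch of that before/after dump, in case it helps anyone. It assumes HtmlUnEditFormat above is already defined on the page with its entity list intact; the sample string, the variable names, and the column names "before" and "after" are made up for illustration.

```cfm
<!--- Hypothetical sample value; in practice this would come from the nvarchar column. --->
<cfset sample = "Caf&eacute; &amp; Cr&egrave;me">

<!--- Build a one-row query holding the text before and after decoding. --->
<cfset q = QueryNew("before,after", "varchar,varchar")>
<cfset QueryAddRow(q)>
<cfset QuerySetCell(q, "before", sample)>
<cfset QuerySetCell(q, "after", HtmlUnEditFormat(sample))>

<!--- Dump the query to compare the two columns side by side. --->
<cfdump var="#q#" label="Entity decoding check">
```

If the two columns look identical in the dump, the entity in question is probably missing from the lEntities list or spelled differently in the data (entity names are case-sensitive).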

marcovandenoever


Member


Joined: 02/20/07

Posts: 82

RE: decode html special entities like &amp; to normal text?
06/26/08 3:56 AM

Ah! Looks nice to me, thanks for sharing, will try this out some time.
