Hello. I need to take a UTF-8 string containing extended characters (e.g. trademark, curly quotes, etc.) and encode it for HTML, using either HTML named entities or XML numeric (Unicode) entities.
I've tried HttpUtility.HtmlEncode(), but from what I can gather, this function is really (at least originally) only intended to prevent cross-site scripting, and therefore only encodes some characters, leaving the others unchanged.
For example, when I encode the string "™®" I get back "™&#174;". In other words, only the registered symbol is encoded, while the trademark symbol is left unchanged.
Is there a proper method or way to encode a string such that all extended characters are properly encoded?
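To make the goal concrete, here is a rough sketch of the output I'm after, written as a hand-rolled loop. The class and method names (EntityEncoder, EncodeNonAscii) are made up for illustration and are not part of the framework; the sketch assumes that numeric entities are acceptable for everything outside ASCII and that the usual markup characters should be escaped as well.

```csharp
using System;
using System.Text;

static class EntityEncoder
{
    // Hypothetical helper, not a framework API: escapes the markup characters
    // and writes every non-ASCII character as a decimal numeric entity.
    public static string EncodeNonAscii(string input)
    {
        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            switch (c)
            {
                case '&': sb.Append("&amp;"); continue;
                case '<': sb.Append("&lt;"); continue;
                case '>': sb.Append("&gt;"); continue;
                case '"': sb.Append("&quot;"); continue;
            }

            if (c < 128)
            {
                // Plain ASCII passes through unchanged.
                sb.Append(c);
                continue;
            }

            int codePoint = c;
            // Combine a surrogate pair into a single code point so characters
            // outside the Basic Multilingual Plane yield one entity, not two.
            if (char.IsHighSurrogate(c) && i + 1 < input.Length
                && char.IsLowSurrogate(input[i + 1]))
            {
                codePoint = char.ConvertToUtf32(c, input[i + 1]);
                i++;
            }

            sb.Append("&#").Append(codePoint).Append(';');
        }
        return sb.ToString();
    }
}
```

With this, EntityEncoder.EncodeNonAscii("\u2122\u00AE") returns "&#8482;&#174;", which is the result I was expecting from HtmlEncode. I'd still prefer a built-in method if one exists.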