Recently I engaged a company to track my articles across the net. At this point we have submitted a few RSS feeds to them so they can load our content into their system for initial testing. There is nothing wrong with my RSS feeds: they are in standard XML format with UTF-8 encoding applied (yes, I made sure they were saved as such too). However, they cannot reach my content on the site because of the "&" characters in the URLs.

After some brief research, it seems that URLDecode is the best way to handle this problem, or alternatively URL rewriting through IIS.

So I guess my question is: which is best? And if URLDecode is the chosen method, what is the string that I have to pass to fix it?

Thanks in advance.
Sep 17 '07 #1
check out this article
Sep 17 '07 #2
Handy little article, thank you, but I guess I know less about URLDecode than I thought :)

Below I have pasted the code from my localvars.asp, which handles all the pages for my site. Do I need to add the URLDecode call here, or is it something that has to be added to every page?

(Sample link of how my site proceeds. Notice all the "&": in the RSS feed these are all read as "&amp;".)


&type=print&lang=en&vol=XXXXXX&cat=X&articleID=XXXXXX&mode=view&id=

  1. <% 
  2. DIM strContent, strFileID, strType, strLang, strVol, strCat, strArticleID, strKeyword,_ 
  3.         strThisRedirect, strDownload, strMode, strPageName, strBaseURL, strBaseQueryString, _
  4.         strhttp_host, strSection
  6. strSection = request("section")
  7. strhttp_host = "http://" & request("http_host")
  8. strContent = lcase(cstr(Request.QueryString("content")))
  9. strFileID = lcase(cstr(Request.QueryString("id")))
  10. strType = lcase(cstr(Request.QueryString("type")))
  11. if strType = "" and strFileID <> "" Then strType = "digital"
  12. strLang = lcase(Request.QueryString("lang"))
  13. if strLang = "" Then strLang = "en"
  14. strVol = lcase(cstr(Request.QueryString("vol")))
  15. 'if strVol = "" Then
  16. '     If strType = "digital" then 
  17. '             strVol = "current"
  18. '     Else
  19. '             strVol = ""
  20. '     End If     
  21. 'End If
  22. strCat = lcase(Request.QueryString("cat"))
  23. strArticleID = Request.QueryString("articleid")
  24. If strFileID <> "" and strArticleID = "" Then strArticleID = strFileID 
  25. 'Bridge code to support the absence of the content field in the url
  26. if strContent = "" Then
  27.      IF strArticleID <> "" Then 
  28.              strContent = "getcontent"
  29.      else
  30.         strContent = "defcontent"
  31.      End if
  32. End if
  33. strKeyword = Request.Form("keyword")
  34. strThisRedirect = lcase(Request("QUERY_STRING"))
  35. strDownload = lcase(Request.QueryString("download"))
  36. strMode = lcase(Request.QueryString("mode"))
  37. strPageName = "/scripts/" & lcase(Request.QueryString("pagename")) & ".asp"
  38. if strPageName = "/scripts/.asp" THEN strPageName = "/scripts/default_content.asp"
  39. strBaseURL = "default.asp?pagename=media"
  40. strBaseQueryString = "&amp;type=" & strType & "&amp;lang=" & strLang
  41. %>
Hope you can help with this.
Sep 18 '07 #3
Notice that your line 40 includes "&amp;". This is the problem (as far as I can tell). When we put query strings with several parameters into link URLs on a web page, we are supposed to write "&amp;" rather than "&" (according to the W3C). This protects against browsers misinterpreting part of a query string as a character entity (imagine a query string variable named "quot", and how some browsers might interpret "&quot"). So there is an inherent conflict between how you need to write your pages for a standard browser and how they should be written for this service.

In my experience, I haven't seen a browser slip up on this in years, so I would suggest erring in favor of the search service. I don't think any current browser will interpret a character entity that lacks the final semicolon, but I have seen this problem in the past, and some fossils do use older browsers. To be safe, make sure none of your query string variable names happen to match an HTML entity name, and then change all of your "&amp;"s to "&". This will not pass validation (it will definitely fail the W3C validator), but I would be very surprised if a browser ever got it wrong.
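
One way to reconcile the two requirements, offered as a rough sketch rather than a tested fix for this site, is to keep the query string in its raw form (plain "&") and apply HTML escaping only at the moment a link is written into a page. The example is in Python purely for illustration, not the site's ASP. The helper name `build_url` and the parameter values are made up. In classic ASP the corresponding output-time step would be `Server.HTMLEncode`.

```python
from html import escape, unescape
from urllib.parse import urlencode

BASE_URL = "default.asp?pagename=media"  # same base URL as line 39 of the posted code


def build_url(params):
    """Return the raw URL, joined with plain '&' (what a crawler should request)."""
    return BASE_URL + "&" + urlencode(params)


# Hypothetical parameter values, for illustration only.
raw_url = build_url({"type": "print", "lang": "en"})

# Escape only when the URL is written into HTML markup.
href = escape(raw_url, quote=True)
link = '<a href="' + href + '">article</a>'

print(raw_url)  # default.asp?pagename=media&type=print&lang=en
print(link)     # <a href="default.asp?pagename=media&amp;type=print&amp;lang=en">article</a>

# A browser undoes the escaping before requesting the page.
assert unescape(href) == raw_url
```

With this split, anything that needs the real address (redirects, a crawler, a log) gets the plain "&" form, while the markup itself still validates.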

Sep 18 '07 #4
Yes, I agree, but the problem lies with the company that crawls the net in search of the content. We submit an RSS feed to them, and in the links (and possibly the text at times) "&" is replaced with "&amp;" to make the feed valid. But when they load these feeds into their web crawler/sniffer, they cannot reach the content because the website (I think) does not handle full URL decoding.

And on line 40, I guess it's the website parsing those "&", but I should say it reads "&amp;blah" & strBlah.

I am in the process of trying a simple fix; I'll advise if it works.
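
To make the decoding steps in this situation concrete, here is a small sketch (Python chosen only for illustration; the feed fragment, host name and parameter values are invented). It shows that an XML parser turns "&amp;" back into "&" when reading a feed, that URL decoding alone leaves "&amp;" untouched, and what the query string looks like if a client requests the undecoded link.

```python
import xml.etree.ElementTree as ET
from html import unescape
from urllib.parse import unquote, urlsplit

# Invented RSS item; inside XML a literal "&" has to be written as "&amp;".
ITEM = (
    "<item><link>http://example.com/default.asp?pagename=media"
    "&amp;type=print&amp;lang=en</link></item>"
)


def split_query(url):
    """Split a URL's query string on '&' into a dict of name -> value."""
    query = urlsplit(url).query
    return dict(pair.split("=", 1) for pair in query.split("&"))


# 1. A real XML parser decodes the entity, so the link comes out with plain "&".
parsed_link = ET.fromstring(ITEM).findtext("link")

# 2. A consumer that copies the text between the tags without XML-decoding it.
raw_link = ITEM[len("<item><link>"):-len("</link></item>")]

# 3. URL decoding only touches %XX escapes, so "&amp;" survives it unchanged.
url_decoded = unquote(raw_link)

# 4. Entity decoding is the step that restores the plain "&".
entity_decoded = unescape(raw_link)

print(split_query(parsed_link))  # {'pagename': 'media', 'type': 'print', 'lang': 'en'}
print(split_query(url_decoded))  # {'pagename': 'media', 'amp;type': 'print', 'amp;lang': 'en'}
```

Under these assumptions, a request for the undecoded link would most likely reach the ASP page with parameters named `amp;type` and `amp;lang`, so `Request.QueryString("type")` would come back empty. That would point to entity decoding, not URLDecode, as the missing step.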
Sep 18 '07 #5
