I was browsing on Google earlier and found a site whose content had changed since it was indexed, so the page was no longer relevant to the search term. I know some sites, such as eBay, have sections on some pages that change on every page load: for example, a catalogue page with a New Items or Featured Items section in which 5 random items are selected from a catalogue of 500 each time. I come across this kind of thing quite regularly, which is what gave me this idea.
My idea is to have a "changeFreq" attribute for which the developer can specify any of the following:
- PageLoad[:URL]
- Daily[:hh:mm][:URL]
- Weekly[:Mon|Tue|Wed|Thu|Fri|Sat|Sun][:hh:mm][:URL]
- Monthly[:dd][:hh:mm][:URL]
- Yearly[:mm[/dd]][:URL]
So what does all this mean?
- PageLoad: search engines simply will not index these elements.
- [:hh:mm]: the time is optional. If no time is supplied, it defaults to midnight.
- Weekly day (Mon to Sun): the day is optional. If no day is supplied, it defaults to Monday.
- [:dd]: the day of the month. If it is more than the number of days in the month, it falls back to the last day of that month. If no day is specified, it defaults to the 1st.
- [:mm[/dd]]: the month and, optionally, the day. The same rules apply to the day, and if the month is specified on its own, the day defaults to the 1st.
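To make the defaults above concrete, here is a rough Python sketch of how a crawler or browser might parse a changeFreq value. Everything in it is my own assumption rather than part of the proposal: the `ChangeFreq` class and `parse_change_freq` function are made-up names, a Yearly value with no month is assumed to mean 1 January, and a URL is assumed never to start with an all-digit segment. That last rule is needed because a value like `Monthly:13:30` is ambiguous as written; the sketch reads two leading numbers as hh:mm, and one or three as including the day.

```python
import re
from dataclasses import dataclass
from typing import Optional

PERIODS = ("PageLoad", "Daily", "Weekly", "Monthly", "Yearly")
WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


@dataclass
class ChangeFreq:
    """Hypothetical parsed form of a changeFreq value, with the stated defaults."""
    period: str
    weekday: str = "Mon"       # Weekly: defaults to Monday
    month: int = 1             # Yearly: assumed default, not stated in the proposal
    day: int = 1               # Monthly/Yearly: defaults to the 1st
    hour: int = 0              # time defaults to midnight
    minute: int = 0
    url: Optional[str] = None  # optional URL used to refresh the element


def parse_change_freq(value: str) -> ChangeFreq:
    period, *tokens = value.split(":")
    if period not in PERIODS:
        raise ValueError(f"unknown changeFreq period: {period!r}")
    freq = ChangeFreq(period)

    if period == "Weekly" and tokens and tokens[0] in WEEKDAYS:
        freq.weekday = tokens.pop(0)
    elif period == "Yearly" and tokens:
        match = re.fullmatch(r"(\d{1,2})(?:/(\d{1,2}))?", tokens[0])
        if match:
            tokens.pop(0)
            freq.month = int(match.group(1))
            freq.day = int(match.group(2) or 1)

    if period in ("Daily", "Weekly", "Monthly"):
        # Count the leading all-digit tokens (at most dd, hh and mm).
        numeric = 0
        while numeric < min(3, len(tokens)) and tokens[numeric].isdigit():
            numeric += 1
        # Assumed rule: for Monthly, one or three numbers means a day comes first.
        if period == "Monthly" and numeric in (1, 3):
            freq.day = int(tokens.pop(0))
            numeric -= 1
        if numeric >= 2:
            freq.hour, freq.minute = int(tokens.pop(0)), int(tokens.pop(0))

    # Whatever is left is the URL; re-join it in case it contains colons.
    if tokens:
        freq.url = ":".join(tokens)
    return freq
```

With this sketch, `parse_change_freq("Daily:13:30:/Parts/TodayOnly.aspx")` gives a Daily schedule at 13:30 with the URL `/Parts/TodayOnly.aspx`, and `parse_change_freq("Weekly")` gives Monday at midnight with no URL. Range checking (for example rejecting hour 25) is left out to keep it short.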
Now for the clever bit: the [:URL]. This specifies the URL to load into the element when it is out of date. It serves two purposes: more relevant searches and better caching. I can hear you asking what difference this would make... it would basically allow search engines to re-index just a part of the page, and browsers to load most of a page from the cache but reload any out-dated parts. The URL is optional and, if it is not specified on an out-dated element, the whole page will be re-indexed or reloaded.
Oh, and if a URL is specified, this attribute could also be used to load any outdated parts of the page with jQuery and similar frameworks... other options, such as Hourly, Minutely, Secondly and so on, would have to be available for this.
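As a rough illustration of how a browser cache or crawler might decide that an element is out of date, here is a small Python sketch. It is only my guess at the logic: the function names `last_scheduled_change` and `is_stale` are hypothetical, times are naive datetimes in a single assumed time zone, weekdays are numbered 0 (Monday) to 6 (Sunday) as in Python's `datetime.weekday()`, and Yearly changes are assumed to happen at midnight unless a time is passed in.

```python
import calendar
from datetime import datetime, timedelta


def _clamped(year, month, day, hour, minute):
    # A day beyond the end of the month falls back to the month's last day.
    last_day = calendar.monthrange(year, month)[1]
    return datetime(year, month, min(day, last_day), hour, minute)


def last_scheduled_change(period, now, *, weekday=0, day=1, month=1, hour=0, minute=0):
    """Return the most recent scheduled change at or before `now`."""
    if period == "Daily":
        candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        return candidate if candidate <= now else candidate - timedelta(days=1)
    if period == "Weekly":
        days_back = (now.weekday() - weekday) % 7
        candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        candidate -= timedelta(days=days_back)
        return candidate if candidate <= now else candidate - timedelta(days=7)
    if period == "Monthly":
        candidate = _clamped(now.year, now.month, day, hour, minute)
        if candidate > now:
            # Step back one month, wrapping from January to December.
            if now.month > 1:
                year, prev_month = now.year, now.month - 1
            else:
                year, prev_month = now.year - 1, 12
            candidate = _clamped(year, prev_month, day, hour, minute)
        return candidate
    if period == "Yearly":
        candidate = _clamped(now.year, month, day, hour, minute)
        if candidate > now:
            candidate = _clamped(now.year - 1, month, day, hour, minute)
        return candidate
    raise ValueError(f"no schedule for period {period!r}")


def is_stale(period, fetched_at, now, **schedule):
    """True if a copy fetched at `fetched_at` predates the latest scheduled change."""
    if period == "PageLoad":
        return True  # changes on every load, so a stored copy is never current
    return fetched_at < last_scheduled_change(period, now, **schedule)
```

For a Daily:13:30 element, a copy fetched at 13:00 yesterday would be stale at noon today, while a copy fetched at 14:00 yesterday would not. A stale element with a URL would then be refreshed from that URL alone; one without a URL would trigger a full reload or re-index.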
Elements with no changeFreq specified inherit it from their parent element. Any elements within an element set to PageLoad will not be indexed (even if they have a different changeFreq specified). Consider the following code:
    <div changeFreq="Weekly:Mon:13:00">
      <div id="TodayOnlyOffers" changeFreq="Daily:13:30:/Parts/TodayOnly.aspx">
        Content will be updated with the URL /Parts/TodayOnly.aspx at 13:30 every day, but this tag will be replaced weekly every Monday at 13:00.
      </div>
      <div changeFreq="PageLoad">
        Content will not be indexed, and this tag will also be replaced weekly every Monday at 13:00.
        <div changeFreq="Daily:15:00:/Parts/Something-Else.aspx">
          This tag will also not be indexed because it is within a PageLoad tag. No matter what the changeFreq of this tag is set to, the search engine will still think it changes every page load. A solution would be to put the content outside this tag, but within the parent tag, within its own tag set to PageLoad and remove the changeFreq of the parent tag.
        </div>
      </div>
    </div>
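The inheritance rules could be resolved along these lines. This is only a sketch using Python's standard `html.parser`; the `ChangeFreqResolver` class and `resolve_change_freqs` function are hypothetical names of mine, and the `<p>` element in the sample is added purely to show plain inheritance. Note that `HTMLParser` lowercases attribute names, so `changeFreq` is looked up as `changefreq`.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ChangeFreqResolver(HTMLParser):
    """Records (tag, declared changeFreq, effective changeFreq) per element."""

    def __init__(self):
        super().__init__()
        self.stack = []     # effective changeFreq of each currently open element
        self.resolved = []  # results in document order

    def handle_starttag(self, tag, attrs):
        declared = dict(attrs).get("changefreq")
        parent = self.stack[-1] if self.stack else None
        if parent is not None and parent.startswith("PageLoad"):
            # Anything inside a PageLoad element is treated as PageLoad too.
            effective = "PageLoad"
        else:
            # Otherwise use the element's own value, or inherit the parent's.
            effective = declared or parent
        self.resolved.append((tag, declared, effective))
        if tag not in VOID_TAGS:
            self.stack.append(effective)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()


def resolve_change_freqs(markup):
    resolver = ChangeFreqResolver()
    resolver.feed(markup)
    resolver.close()
    return resolver.resolved


SAMPLE = """
<div changeFreq="Weekly:Mon:13:00">
  <div id="TodayOnlyOffers" changeFreq="Daily:13:30:/Parts/TodayOnly.aspx">offers</div>
  <div changeFreq="PageLoad">random items
    <div changeFreq="Daily:15:00:/Parts/Something-Else.aspx">nested</div>
  </div>
  <p>no changeFreq of its own</p>
</div>
"""

for tag, declared, effective in resolve_change_freqs(SAMPLE):
    indexed = "skipped" if (effective or "").startswith("PageLoad") else "indexed"
    print(f"<{tag}> declared={declared} effective={effective} ({indexed})")
```

Here the nested Daily:15:00 element resolves to PageLoad and is skipped, while the `<p>` inherits Weekly:Mon:13:00 from the outer element. The sketch assumes reasonably well-formed markup with matching closing tags.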
Thanks in advance.
Regards,
Richard