Sorting by Popularity?

This is... I guess more of a programming structure question than
anything.
How does one index the popularity of something? Overall usage? How
does recent-term popularity come in? Is there an algorithm for
weighting the popularity, or does one just index a popularity value for
forever, 1 week, 1 day, etc? And in that case, how does one know when
old 'hits' pass out of current consideration?

I mean- this sort of problem has existed for twenty years. I'm
TRUSTING (hoping) there is some sort of disgustingly elegant but
anti-intuitive solution for this sort of thing I'm just not seeing.

.....help?

-Derik

Feb 8 '06 #1
Filling out an application for Google, eh? ;)

My 0.02 ( in the scope of ranking web pages ):

I did an AI project where pages were dynamically ranked. I factored in
a few things:

1. direct clicks to a page
2. keywords within the page that match what I was searching for.
3. Relative position of keywords to other keywords on the page
4. Number of links from the page
5. Pages that linked to a clicked page
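
One way to combine factors like the five above is a weighted sum. Here is a rough sketch in Python; the signal names and the weights are made up for illustration and are not taken from the project described in this post:

```python
# Hypothetical weights, one per factor in the list above. Tune them for
# whatever is being ranked; these values are placeholders.
WEIGHTS = {
    "clicks": 1.0,           # 1. direct clicks to a page
    "keyword_matches": 2.0,  # 2. query keywords found in the page
    "proximity": 5.0,        # 3. closeness of keywords to each other (0..1)
    "outbound_links": 0.1,   # 4. number of links from the page
    "inbound_links": 0.5,    # 5. pages linking to a clicked page
}


def score(signals):
    """Weighted sum of whichever signals are present (missing ones count as 0)."""
    return sum(weight * signals.get(name, 0) for name, weight in WEIGHTS.items())


def rank(pages):
    """Return page names ordered from highest to lowest score."""
    return sorted(pages, key=lambda name: score(pages[name]), reverse=True)


pages = {
    "a": {"clicks": 10},
    "b": {"clicks": 3, "keyword_matches": 5},
}
print(rank(pages))
```

Changing the weights is how the same code could rank, say, medical pages differently from tech pages.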

For your algorithm, I think it depends on exactly what you are ranking.
For instance, I will likely rank medical information differently than
tech information. It depends on what your customers are looking for.

Nevertheless, it's a fun topic to discuss.

Feb 8 '06 #2
NC
Re********@aol.com wrote:

>How does one index the popularity of something?

However one pleases.

>I mean- this sort of problem has existed for twenty years.

In fact, the problem has existed for several hundred years (at least
since the start of book printing). The publishing industry, and later
the sound recording industry and movie/video industry all measure
popularity in a variety of ways.

Let's take movies, for example. You can measure a movie's popularity by
how many people saw it the first weekend after it came out, how
many people have seen it in movie theaters, how long it's been playing
in movie theaters, how many people purchased the DVD, and how many
people rented the DVD...

Records' popularity is measured in a similar variety of ways. You have
weekly charts that reflect "instant" popularity (current week's rating,
highest rating achieved, number of weeks the record stayed in the
chart), but you also have lifetime recognition -- the Gold (500,000
copies sold), Platinum (1,000,000 copies sold), and Diamond (10,000,000
copies sold) awards...

Cheers,
NC

Feb 8 '06 #3
>Filling out an application for Google, eh?

No, sub-genre web-directory. The idea is to let the members of a
fandom index the several thousand pages online by category (multiple
possible categories allowed per page) and have it auto-ranking.
(In-category rank separate from overall popularity, to prevent
selection depth wash-out on the top 10's.)

I take it your answer is- 'track several indices, and use an algorithm
to average them out in some sane way.'

Thus, I guess, I must ask again- is there an elegant way to track 'just
the hits in the last week,' essentially dropping the 'back-end' hits as
they elapse past the hit-time, WITHOUT keeping an individual record of
all hits over the course of the week?

-Derik
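
One technique often used for this kind of question is an exponentially decayed counter. Nobody in this thread proposes it, so treat the following as a suggestion and a sketch only. Each page stores just two numbers (a score and the time of the last update), and old hits fade out on their own instead of being dropped individually. Note that this is a weighting, not a strict seven-day window; the one-week half-life below is an assumption.

```python
import math

HALF_LIFE_SECONDS = 7 * 24 * 3600  # assumed: a hit counts half as much after a week


class DecayedCounter:
    """Popularity score in which every hit loses weight exponentially with age."""

    def __init__(self, half_life=HALF_LIFE_SECONDS):
        self.decay_rate = math.log(2) / half_life
        self.score = 0.0
        self.last_update = 0.0  # timestamp (seconds) of the most recent hit

    def value_at(self, now):
        """Score as seen at time `now` (expects now >= last_update)."""
        elapsed = now - self.last_update
        return self.score * math.exp(-self.decay_rate * elapsed)

    def hit(self, now):
        """Age the stored score up to `now`, then add one fresh hit."""
        self.score = self.value_at(now) + 1.0
        self.last_update = now
```

To rank pages, compare `value_at(now)` for each of them using the same `now`.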

Feb 8 '06 #4
NC
Re********@aol.com wrote:

>I guess, I must ask again- is there an elegant way to track 'just
>the hits in the last week,' essentially dropping the 'back-end' hits
>as they elapse past the hit-time, WITHOUT keeping an individual
>record of all hits over the course of the week?


You could try daily totals... Say, you have a MySQL table called hits:
id (INT): Page ID
date (DATE): Day for which you need to know the number of hits
hits (INT): Hit count

The table's primary key is a two-column index based on `id` and `date`.
Every time a page is accessed, it does something like this:

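$date = date('Y-m-d');
$id = (int) $pageId; // $pageId is a placeholder for the current page's ID
// Create today's row on the first hit, increment it on every later one.
// (A plain UPDATE would do nothing until the row exists.)
$query = "INSERT INTO `hits` (`id`, `date`, `hits`) VALUES ($id, '$date', 1) " .
"ON DUPLICATE KEY UPDATE `hits` = `hits` + 1";
mysqli_query($link, $query); // $link: an open mysqli connection (assumed)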
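
For anyone who wants to try the daily-totals idea without a MySQL server, here is a minimal sketch in Python with an in-memory SQLite database. The function names `open_db` and `record_hit` are made up for this example; the table layout follows the one described above. It creates the day's row on the first hit and increments it on later ones.

```python
import sqlite3
from datetime import date


def open_db():
    """In-memory database with the `hits` table keyed on (id, date)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hits ("
        " id INTEGER NOT NULL,"
        " date TEXT NOT NULL,"      # ISO day, e.g. '2006-02-08'
        " hits INTEGER NOT NULL DEFAULT 0,"
        " PRIMARY KEY (id, date))"
    )
    return conn


def record_hit(conn, page_id, day=None):
    """Count one hit for `page_id` on `day` (defaults to today)."""
    day = (day or date.today()).isoformat()
    with conn:  # commits both statements together
        # Make sure the row for this page and day exists...
        conn.execute(
            "INSERT OR IGNORE INTO hits (id, date, hits) VALUES (?, ?, 0)",
            (page_id, day),
        )
        # ...then bump its counter.
        conn.execute(
            "UPDATE hits SET hits = hits + 1 WHERE id = ? AND date = ?",
            (page_id, day),
        )
```

The table grows by at most one row per page per day, no matter how many hits arrive.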

Summing up the daily totals for the last seven days then becomes rather
trivial:

SELECT id, SUM(hits) AS score
FROM hits
WHERE date >= '[the first day of counting]'
GROUP BY id
ORDER BY score DESC;
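
The "first day of counting" can be computed rather than typed in. A sketch in Python with SQLite, assuming a `hits` table like the one above with dates stored as ISO strings; `top_pages` and `prune` are names invented for this example. The `prune` step is how old hits "pass out of consideration": rows older than the window are simply deleted.

```python
import sqlite3
from datetime import date, timedelta


def first_day(today, days):
    """ISO string of the oldest day still inside a `days`-long window."""
    return (today - timedelta(days=days - 1)).isoformat()


def top_pages(conn, today, days=7, limit=10):
    """Pages ranked by total hits over the last `days` days, today included."""
    return conn.execute(
        "SELECT id, SUM(hits) AS score FROM hits"
        " WHERE date >= ? GROUP BY id ORDER BY score DESC LIMIT ?",
        (first_day(today, days), limit),
    ).fetchall()


def prune(conn, today, days=7):
    """Delete daily totals that have fallen out of the window."""
    with conn:
        conn.execute("DELETE FROM hits WHERE date < ?", (first_day(today, days),))
```

ISO dates ('YYYY-MM-DD') sort correctly as plain strings, which is why the string comparison in the WHERE clause works here.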

Cheers,
NC

Feb 8 '06 #5
NC wrote:
>Re********@aol.com wrote:
>
>>I guess, I must ask again- is there an elegant way to track 'just
>>the hits in the last week,' essentially dropping the 'back-end' hits
>>as they elapse past the hit-time, WITHOUT keeping an individual
>>record of all hits over the course of the week?
>
>You could try daily totals... Say, you have a MySQL table called hits:

<snip Nikolai Chuvakhin's MySQL solution>

Say, for example, take a product search engine and its core DB
table:
products: id, name, hits

If the 'products' table is updated with the hit count live, it might
affect performance; but to rank the products we certainly need such a
table structure. I don't have any idea how it's done on some
high-traffic sites.
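
One common way to soften the cost of live updates is to buffer hits in memory and write them out in batches. This is only a sketch of that idea, not a description of how any particular high-traffic site works; `HitBuffer` and its `flush` callback are names invented for the example, and the code is not thread-safe and loses pending hits if the process dies.

```python
from collections import Counter


class HitBuffer:
    """Collects hits in memory and hands them to `flush` in batches."""

    def __init__(self, flush, max_pending=1000):
        self._flush = flush              # callable taking {product_id: hit_count}
        self._max_pending = max_pending  # flush once this many hits are waiting
        self._pending = Counter()
        self._count = 0

    def hit(self, product_id):
        self._pending[product_id] += 1
        self._count += 1
        if self._count >= self._max_pending:
            self.flush()

    def flush(self):
        """Send the pending counts on (e.g. as one UPDATE per product) and reset."""
        if not self._pending:
            return
        batch = dict(self._pending)
        self._pending = Counter()
        self._count = 0
        self._flush(batch)
```

With `max_pending=1000`, a thousand page views turn into one batch of writes, at most one per distinct product.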

But when I last dug into this subject, I read somewhere (or
mistakenly understood) that such sites don't rely on a DB. The product
data is stored in the filesystem with a hash-like system:

c:\products\01\product1.data
\02\product2.data

And it's been "indexed"--not sure how it's indexed. The same
technique is used by search engines(?). If anyone has any good
experience on this topic, kindly share the architecture. I'm very
curious.
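
For illustration, here is one way a layout like the one above could be computed. The bucketing rule (product ID modulo 100, zero-padded to two digits) is purely an assumption that happens to reproduce the two example paths; the post does not say how the real sites choose the directory.

```python
from pathlib import PureWindowsPath


def product_path(root, product_id, buckets=100):
    """Map a product ID to <root>\\<bucket>\\product<ID>.data.

    The bucket is the ID modulo `buckets`, so files spread evenly over a
    fixed number of subdirectories instead of piling up in one folder.
    """
    bucket = product_id % buckets
    return PureWindowsPath(root) / f"{bucket:02d}" / f"product{product_id}.data"


print(product_path(r"c:\products", 1))
print(product_path(r"c:\products", 2))
```

Looking up a product then needs no index at all: the path follows directly from the ID.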

--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/

Feb 8 '06 #6
