473,322 Members | 1,755 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,322 software developers and data experts.

faster count(*) or alternative

hi,

I have a table with several million rows.
Each row is simply the date and time a certain page was viewed.

eg
page1 1-1-00
page2 2-1-00
page1 16-1-00
page1 17-1-00
page2 19-1-00

I now need to find the most popular pages so I do this...

SELECT TOP 10 COUNT(*) AS mycount FROM tblPageViews
GROUP BY place ORDER BY COUNT(place) DESC

....which gives me the top 10 most viewed pages

big problem - this is getting slower and slower as the table grows and
grows.

what should I do?

is there an alternative?

I think I need all the rows (rather than a simple incrementing
counter) because I might want to say "what was the most popular page
on a certain day or within a certain period"

tia

Tim

Feb 25 '07 #1
6 5802
"C10B" <ts******@gmail.comwrote in message
news:11*********************@a75g2000cwd.googlegro ups.com...
hi,

I have a table with several million rows.
Each row is simply the date and time a certain page was viewed.
First question, do you have an index on the table.

That should help.
>
eg
page1 1-1-00
page2 2-1-00
page1 16-1-00
page1 17-1-00
page2 19-1-00

I now need to find the most popular pages so I do this...

SELECT TOP 10 COUNT(*) AS mycount FROM tblPageViews
GROUP BY place ORDER BY COUNT(place) DESC

...which gives me the top 10 most viewed pages

big problem - this is getting slower and slower as the table grows and
grows.

what should I do?

is there an alternative?

I think I need all the rows (rather than a simple incrementing
counter) because I might want to say "what was the most popular page
on a certain day or within a certain period"

tia

Tim


--
Greg Moore
SQL Server DBA Consulting
sql (at) greenms.com http://www.greenms.com
Feb 25 '07 #2
C10B (ts******@gmail.com) writes:
I have a table with several million rows.
Each row is simply the date and time a certain page was viewed.

eg
page1 1-1-00
page2 2-1-00
page1 16-1-00
page1 17-1-00
page2 19-1-00

I now need to find the most popular pages so I do this...

SELECT TOP 10 COUNT(*) AS mycount FROM tblPageViews
GROUP BY place ORDER BY COUNT(place) DESC

...which gives me the top 10 most viewed pages

big problem - this is getting slower and slower as the table grows and
grows.

what should I do?
A non-clustered index on place would help, as SQL Server then only
would have to scan that index, and not the entire difference. But it
seems that the table has two columns. In such case, the index would not
reduce execution time that much.

A better alternative may be to define an indexed view that maintains
the count:

CREATE TABLE pagehits (pageid varchar(20) NOT NULL,
viewtime datetime NOT NULL,
PRIMARY KEY (pageid, viewtime)
)
go
CREATE VIEW pagecount WITH SCHEMABINDING AS
SELECT pageid, cnt = COUNT_BIG(*)
FROM dbo.pagehits
GROUP BY pageid
go
CREATE UNIQUE CLUSTERED INDEX pagecount_ix ON pagecount(pageid)
go
SELECT TOP 10 pageid, cnt
FROM pagecount WITH (NOEXPAND)
ORDER BY cnt
go
DROP VIEW pagecount
DROP TABLE pagehits

I added the NOEXPAND hint to the query, since it's only on Enterprise
Edition, the optimizer considers indexed views.

Note that this could have effect on performance when writing to the table.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Feb 25 '07 #3
Fastest returning query would be on a separate table that maintains the
count to the granularity you would require in your output (week, day, or
hour). A query that requires more than hourly granularity can still be done
on the source table.

For each new week|day|hour add a row for each page to be tracked. This can
be done dynamically during monitoring but it is better to do it ahead of
time.

Then put an insert trigger on your million row table that will increment the
count on the proper row of the table.

An alternative would be a temp table that contains a subset copy of the
source data for the time range most likely to queried. For example a table
that contains the last 31 days of data, the 32nd day of data
deleted/archived every night.
"C10B" <ts******@gmail.comwrote in message
news:11*********************@a75g2000cwd.googlegro ups.com...
hi,

I have a table with several million rows.
Each row is simply the date and time a certain page was viewed.

eg
page1 1-1-00
page2 2-1-00
page1 16-1-00
page1 17-1-00
page2 19-1-00

I now need to find the most popular pages so I do this...

SELECT TOP 10 COUNT(*) AS mycount FROM tblPageViews
GROUP BY place ORDER BY COUNT(place) DESC

...which gives me the top 10 most viewed pages

big problem - this is getting slower and slower as the table grows and
grows.

what should I do?

is there an alternative?

I think I need all the rows (rather than a simple incrementing
counter) because I might want to say "what was the most popular page
on a certain day or within a certain period"

tia

Tim

Feb 27 '07 #4
erland - you are quite right, I could have worded my first post a
little better

greg - I don't know what a DDL is

russ - thanks
Feb 27 '07 #5
C10B (ts******@gmail.com) writes:
greg - I don't know what a DDL is
DDL = Data Definition Language. That is, CREATE TABLE and CREATE INDEX and
the like.

If I had had a penny for every time someone says DDL in answer, without the
person asking having no idea what it means, I would be a rich man now.
russ - thanks
What Russ proposed is the same idea that I proposed, but rather than relying
SQL Server updating an indexed view, he suggested a separate table that you
update through a trigger.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Feb 27 '07 #6

"Erland Sommarskog" <es****@sommarskog.sewrote in message
news:Xn**********************@127.0.0.1...
C10B (ts******@gmail.com) writes:
>greg - I don't know what a DDL is

DDL = Data Definition Language. That is, CREATE TABLE and CREATE INDEX and
the like.

If I had had a penny for every time someone says DDL in answer, without
the
person asking having no idea what it means, I would be a rich man now.
>russ - thanks

What Russ proposed is the same idea that I proposed, but rather than
relying
SQL Server updating an indexed view, he suggested a separate table that
you
update through a trigger.
Indexed view is probably best, I just like having more control.
>
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

Feb 28 '07 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

19
by: pkilambi | last post by:
I wrote this function which does the following: after readling lines from file.It splits and finds the word occurences through a hash table...for some reason this is quite slow..can some one...
10
by: Willem | last post by:
Looking for some opinions on alternatives to programming with Access. I find that quite often I need to loop through my recordsets (first to last) performing calculations and was wondering if...
2
by: mark | last post by:
I've got an 83-page report which contains an Microsoft Graph 8 Chart, but the report takes FOREVER to format, probably because the row source is a bit complex: TRANSFORM...
43
by: Mountain Bikn' Guy | last post by:
I have a situation where an app writes data of various types (primitives and objects) into a single dimensional array of objects. (This array eventually becomes a row in a data table, but that's...
8
by: Scott Emick | last post by:
I am using the following to compute distances between two lat/long coordinates for a store locator - (VB .NET 2003) it seems to take a long time to iterate through like 100-150 locations -...
5
by: wackyphill | last post by:
If you were doing paging of results on a web page and were interested in grabbing say records 10-20 of a result set. But also wanted to know the total # of records in the result set (so you could...
10
by: Extremest | last post by:
I know there are ways to make this a lot faster. Any newsreader does this in seconds. I don't know how they do it and I am very new to c#. If anyone knows a faster way please let me know. All...
4
by: Sonnich | last post by:
Hi I have a costum function for a special search, which sort strings. This is currently the place where I can save a lot of time (~70%) if possible. So, which is faster: for($j =...
8
by: C10B | last post by:
hi, I have a table with several million rows. Each row is simply the date and time a certain page was viewed. eg page1 1-1-00 page2 2-1-00 page1 16-1-00 page1 17-1-00
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Defcon1945 | last post by:
I'm trying to learn Python using Pycharm but import shutil doesn't work
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.