473,387 Members | 3,033 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

finding largest numbers

Hi,
I have, suppose 1000 numbers, in a file. I have to find out 5
largest numbers among them without sorting. Can you please give me an
efficient idea to do this? My idea is to put those numbers into a
binary tree and to find the largest numbers. How else can we do it?

Regards

Jun 26 '06 #1
19 8549
qsort & bsearch<stdlib.h> perhaps better.
ramu wrote:
Hi,
I have, suppose 1000 numbers, in a file. I have to find out 5
largest numbers among them without sorting. Can you please give me an
efficient idea to do this? My idea is to put those numbers into a
binary tree and to find the largest numbers. How else can we do it?

Regards


Jun 26 '06 #2
ph*****@gmail.com said:
ramu wrote:
Hi,
I have, suppose 1000 numbers, in a file. I have to find out 5
largest numbers among them without sorting. Can you please give me an
efficient idea to do this? My idea is to put those numbers into a
binary tree and to find the largest numbers. How else can we do it?
qsort & bsearch<stdlib.h> perhaps better.


Which syllable of "without sorting" were you struggling with?

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Jun 26 '06 #3
ramu said:
Hi,
I have, suppose 1000 numbers, in a file. I have to find out 5
largest numbers among them without sorting. Can you please give me an
efficient idea to do this? My idea is to put those numbers into a
binary tree and to find the largest numbers. How else can we do it?


A binary tree would basically be a sorting technique, which you say you're
not allowed to do.

Just set up an array m of five numbers, and set them all to INT_MIN.

Then do something like this:

count = 0;
while(successfully_got_next_number_in_file_into_n)
{
++count;
c = 0;
for(j = 0; c == 0 && j < 5; j++)
{
if(n > m[j])
{
m[j] = n;
c = 1;
}
}
}
if(count < 5)
{
you will still have some INT_MIN entries in n, which you should disregard
when reporting the results of the program.
}

If you are allowed to keep m sorted, there is a way to reduce the number of
comparisons still further, but my answer assumes you are not allowed to do
any sorting at all.
--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Jun 26 '06 #4
ph*****@gmail.com writes:
ramu wrote:
I have, suppose 1000 numbers, in a file. I have to find out 5
largest numbers among them without sorting. Can you please give me an
efficient idea to do this? My idea is to put those numbers into a
binary tree and to find the largest numbers. How else can we do it?


qsort & bsearch<stdlib.h> perhaps better.


Pleaes don't top-post. I've corrected it here.
See <http://www.caliburn.nl/topposting.html>.

Since the original question said "without sorting", I don't think
qsort() is going to be part of any solution.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 26 '06 #5
ramu (in 11**********************@p79g2000cwp.googlegroups. com) said:

| I have, suppose 1000 numbers, in a file. I have to find out 5
| largest numbers among them without sorting. Can you please give me
| an efficient idea to do this? My idea is to put those numbers into a
| binary tree and to find the largest numbers. How else can we do it?

Initialize five variables (or five elements of an array) to a value
less than or equal to the smallest possible number in the file.

Make a single pass through the file, counting the numbers you're
checking, and if any number is larger than the smallest number of the
five, replace the smaller number with the larger number you found in
the file.

At the end of the file, make sure that you counted to at least five.
Your five values should be the five largest values from the file.

--
Morris Dovey
DeSoto Solar
DeSoto, Iowa USA
http://www.iedu.com/DeSoto
Jun 26 '06 #6
ramu wrote:
I have, suppose 1000 numbers, in a file. I have to find out 5
largest numbers among them without sorting. Can you please give me an
efficient idea to do this?


Starting with the first 5 numbers, enter them into a heap with the
smallest value at the root. Scan through the array. If the entry is
larger than the heap root, replace the root with the entry, then
re-heapify. When you are finished scanning, the heap contains the 5
largest values. A heap is partially ordered, so you are never fully
sorting either all entries or the heap.

--
Thad
Jun 26 '06 #7
Richard Heathfield <in*****@invalid.invalid> writes:
ph*****@gmail.com said:
ramu wrote:
Hi,
I have, suppose 1000 numbers, in a file. I have to find out 5
largest numbers among them without sorting. Can you please give me an
efficient idea to do this? My idea is to put those numbers into a
binary tree and to find the largest numbers. How else can we do it?

qsort & bsearch<stdlib.h> perhaps better.


Which syllable of "without sorting" were you struggling with?


With a name like "phus.lu" probably most of them I would have
thought. Otherwise I would guess the "out" syllable bit when combined
with the syllable "with" to form the word "without". Since the OP
obviously didnt even understand what "without sorting" meant and
proposed a binary tree then its not so out of the question to suggest
qsort too. Or?

And, I might suggest, the poster was suggesting that qsort was better
than using a binary tree. And hes right...
Jun 26 '06 #8
Richard G. Riley said:

<snip>

And, I might suggest, the poster was suggesting that qsort was better
than using a binary tree. And hes right...


Well, bear in mind that the data is coming in from file, and there might not
be sufficient RAM to store all the numbers contiguously. Bye-bye array.

Of course, there might not be sufficient RAM to store all the numbers, full
stop. Bye bye binary tree.

On reflection, the method I suggested is borken too. One has no option but
to at least keep /that/ part sorted.

So it will be something like:

int m[] = { INT_MIN, INT_MIN, INT_MIN, INT_MIN, INT_MIN, INT_MIN };
int j;
unsigned long count = 0;

while(you manage to retrieve n from the file)
{
++count;
for(j = 5; j > 0; j--)
{
if(n > m[j - 1])
{
m[j] = m[j - 1];
m[j - 1] = n;
}
else
{
j = 0;
}
}
}

if(count > 5) count = 5;

printf("In ascending order:\n");
while(count--)
{
printf(" %d", m[count]);
}
putchar('\n');

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Jun 26 '06 #9
ramu posted:
Hi,
I have, suppose 1000 numbers, in a file. I have to find out 5
largest numbers among them without sorting. Can you please give me an
efficient idea to do this? My idea is to put those numbers into a
binary tree and to find the largest numbers. How else can we do it?

Regards


Maybe something like:
(Unchecked code, likely to contain a thousand little errors)
#include <string.h>
int global_array[1000];
/* Lets pretend they have random (but legitimate) values */
unsigned const magic = 5;
typedef struct IntsArray {
int array[magic];
} IntsArray;
void ShiftDown( int * const p,
unsigned const quantity,
unsigned const places )
{
int * const q = p + places;

memmove( p, q, quantity );
}

IntsArray GetTopX( const int *p, const int * const p_over )
{
IntsArray fi = {};

int *pi =
fi.array + (sizeof(fi.array) / sizeof(*fi.array) - 1);

do
{
for( unsigned i = 0; i != magic; ++i, --pi )
{
if ( *p > *pi )
{
ShiftDown( fi.array, 5 - i, 5 - i );
/* Probably an error on the above line */
}
}
} while (p != p_over);
}
int main()
{
IntsArray ia = GetTopX( global_array, global_array + 1000 );
}

--

Frederick Gotham
Jun 26 '06 #10
In article <11**********************@p79g2000cwp.googlegroups .com>,
ramu <ra******@gmail.com> wrote:
I have, suppose 1000 numbers, in a file. I have to find out 5
largest numbers among them without sorting. Can you please give me an
efficient idea to do this?


Find the largest number. Find the next largest number. Etc.

This will take about 5n comparisons. Obviously it requires at least
n comparisons. How efficient do you need?

-- Richard
Jun 26 '06 #11
Morris Dovey <mr*****@iedu.com> wrote:
Make a single pass through the file, counting the numbers you're
checking, and if any number is larger than the smallest number of the
five, replace the smaller number with the larger number you found in
the file.


Determining which number of the five is smallest necessarily involves
an operation resembling sorting, which the OP was explicitly forbidden
to use.

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Jun 26 '06 #12
"ramu" <ra******@gmail.com> wrote in message
news:11**********************@p79g2000cwp.googlegr oups.com...
Hi,
I have, suppose 1000 numbers, in a file. I have to find out 5
largest numbers among them without sorting. Can you please give me an
efficient idea to do this? My idea is to put those numbers into a
binary tree and to find the largest numbers. How else can we do it?


Your question is better suited to news:comp.programming.

The answer to your question is called Quickselect() and is described in
detail in
T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to
Algorithms, Cambridge: The MIT Press, 1990.


Jun 26 '06 #13
Frederick Gotham <fg*******@SPAM.com> writes:
[...]
(Unchecked code, likely to contain a thousand little errors) [...] unsigned const magic = 5;
typedef struct IntsArray {
int array[magic];
} IntsArray;


The "array" member is a variable length array, so this won't work in
C90. (I'm not certain that a VLA is allowed as a struct member even
in C99.)

Counterintuitively, "magic", even though it's declared const, is not a
constant expression.

You can avoid this either by declaring magic as a macro (preferably MAGIC):
#define MAGIC 5
or, if you don't mind abusing enumerated types, as an enumeration constant:
enum { MAGIC = 5 };

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 26 '06 #14
Keith Thompson posted:
Frederick Gotham <fg*******@SPAM.com> writes:
[...]
(Unchecked code, likely to contain a thousand little errors) [...]
unsigned const magic = 5;
typedef struct IntsArray {
int array[magic];
} IntsArray;


The "array" member is a variable length array, so this won't work in
C90. (I'm not certain that a VLA is allowed as a struct member even
in C99.)

Counterintuitively, "magic", even though it's declared const, is not a
constant expression.

C++ habits getting the better of me. (A const object in C++ can act as a
compile-time constant if it is initialised with a compile-time constant).

You can avoid this either by declaring magic as a macro (preferably
MAGIC):
#define MAGIC 5
or, if you don't mind abusing enumerated types, as an enumeration
constant:
enum { MAGIC = 5 };

It seems that there's variety in opinion when it comes to using enum's
for constants. Some, like yourself, seem to view it as abuse, but I like
to think that it's just making use of all the functionality we're given
in the language.
The results of using an enum for constants is well-defined, so I
don't see a problem.
Also, as I've said before, I avoid macros wherever possible.
--

Frederick Gotham
Jun 26 '06 #15
Frederick Gotham <fg*******@SPAM.com> writes:
Keith Thompson posted: [...]
You can avoid this either by declaring magic as a macro (preferably
MAGIC):
#define MAGIC 5
or, if you don't mind abusing enumerated types, as an enumeration
constant:
enum { MAGIC = 5 };


It seems that there's variety in opinion when it comes to using enum's
for constants. Some, like yourself, seem to view it as abuse, but I like
to think that it's just making use of all the functionality we're given
in the language.


My use of the term "abuse" is half jocular. It's an abuse in the
sense that it's not consistent with the originally intended use of the
construct. I actually think it's a *good* abuse.
The results of using an enum for constants is well-defined, so I
don't see a problem.
Nor do I (except that it's limited to type int.
Also, as I've said before, I avoid macros wherever possible.


I merely avoid them whenever practical.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Jun 26 '06 #16
Christopher Benson-Manica (in e7**********@chessie.cirr.com) said:

| Morris Dovey <mr*****@iedu.com> wrote:
|
|| Make a single pass through the file, counting the numbers you're
|| checking, and if any number is larger than the smallest number of
|| the five, replace the smaller number with the larger number you
|| found in the file.
|
| Determining which number of the five is smallest necessarily
| involves an operation resembling sorting, which the OP was
| explicitly forbidden to use.

Tsk-tsk. I'm not even remotely encouraging the OP to re-order
anything - only to compare values and copy when appropriate...

Do you have a solution that doesn't compare _any_ values? :-D

--
Morris Dovey
DeSoto Solar
DeSoto, Iowa USA
http://www.iedu.com/DeSoto
Jun 26 '06 #17
#define NUM_OF_BIGGEST 5

int i,j,num,max[NUM_OF_BIGGEST+1]; /*if u want only 5 biggest
numbers*/
max[0] = 0xffff;

for( i = 1;i <= NUM_OF_BIGGEST; i++)

for( j = 1;j <= 1000; j++)
{
get_next_num(&num);
if(max[i] < num && num < max[i-1] )
max[i] = num;

}
Start_reading_the_file_once_again();
}

print_number_from max[1] to max[5];

Jun 27 '06 #18
Morris Dovey <mr*****@iedu.com> wrote:
Do you have a solution that doesn't compare _any_ values? :-D


It's unfortunate that bogo-sort is technically a sorting algorithm :-)

(Point taken!)

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
Jun 27 '06 #19
/*
** This solves the general case for the selection problem in average case
** linear time.
** D. Corbit.
** This code is explicitly granted to the public domain.
*/
#include <stdlib.h>
typedef double Etype;

extern Etype RandomSelect(Etype * A, size_t p, size_t r, size_t i);
extern size_t RandRange(size_t a, size_t b);
extern size_t RandomPartition(Etype * A, size_t p, size_t r);
extern size_t Partition(Etype * A, size_t p, size_t r);

/*
**
** In the following code, every reference to CLR means:
**
** "Introduction to Algorithms"
** By Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest
** ISBN 0-07-013143-0
*/
/*
** CLR, page 187
*/
Etype RandomSelect(Etype A[], size_t p, size_t r, size_t i)
{
size_t q,
k;
if (p == r)
return A[p];
q = RandomPartition(A, p, r);
k = q - p + 1;

if (i <= k)
return RandomSelect(A, p, q, i);
else
return RandomSelect(A, q + 1, r, i - k);
}

size_t RandRange(size_t a, size_t b)
{
size_t c = (size_t) ((double) rand() / ((double) RAND_MAX + 1)
* (b - a));
return c + a;
}

/*
** CLR, page 162
*/
size_t RandomPartition(Etype A[], size_t p, size_t r)
{
size_t i = RandRange(p, r);
Etype Temp;
Temp = A[p];
A[p] = A[i];
A[i] = Temp;
return Partition(A, p, r);
}

/*
** CLR, page 154
*/
size_t Partition(Etype A[], size_t p, size_t r)
{
Etype x,
temp;
size_t i,
j;

x = A[p];
i = p - 1;
j = r + 1;

for (;;) {
do {
j--;
} while (!(A[j] <= x));
do {
i++;
} while (!(A[i] >= x));
if (i < j) {
temp = A[i];
A[i] = A[j];
A[j] = temp;
} else
return j;
}
}
Jun 27 '06 #20

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

12
by: e271828 | last post by:
Hi, I'm helping to work a developer tool that verifies a given HTML element has a given attribute (e.g., that all LABEL elements have a FOR attribute, all INPUT elements have an ID attribute,...
25
by: Subra | last post by:
Hi, What is the best way to find the 1000 largest numbers from the file having hell lot of entries ? Can you please help me to find out the way ? Do I need to go for B+ trees ?? Please help,...
5
by: Ramdas | last post by:
I am doing some HTML scrapping for a side project. I need a method using sgmllib or HTMLParser to parse an HTML file and get line nos of all the tags I tried a few things, but I am just not...
3
by: Bouzy | last post by:
I have a list of words and am trying to replace all the numbers in my list with whitespace. for word in words: numbers = re.search('+', word) word = clearup(word) if word in...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.