473,324 Members | 2,124 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,324 software developers and data experts.

Finding Repeated and Missing Numbers

Athough this might not be directly relayed to C syntax, I thought some
of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?

2. How would you find the only missing element in that array? Can you
think of a way to do it while iterating through the array only once. Is
overflow a problem in the solution? Why not?

For both the problems, the solution I can think of is to sort
the elements and then compare consecutive elements (in O(nlogn) time).
Summing the numbers didn't seem of much use. Creating another such
array and then using the elements as indices may not be that neat of an
approach from a memory perspective, I felt.

Any other ways, folks?

Thanks,
Harsha

Nov 15 '05 #1
9 11657
On 26 Jun 2005 22:51:10 -0700, "Harsha Srinath" <ha*******@gmail.com>
wrote:
Athough this might not be directly relayed to C syntax, I thought some
of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?

2. How would you find the only missing element in that array? Can you
think of a way to do it while iterating through the array only once. Is
overflow a problem in the solution? Why not?

For both the problems, the solution I can think of is to sort
the elements and then compare consecutive elements (in O(nlogn) time).
Summing the numbers didn't seem of much use. Creating another such
array and then using the elements as indices may not be that neat of an
approach from a memory perspective, I felt.

Any other ways, folks?


OT. That said, there is missing information. If array one has
1000001 entries and array two has 999999 entries (as implied by the
description) then summing works. The sum in case one is 500005000000
[n*(n+1)/2] plus the extra element and in case two minus the missing
element. The catch is that the summation may trigger overflows. Your
task is write the code for multiple precision integer arithmetic.

Richard Harter, cr*@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Save the Earth now!!
It's the only planet with chocolate.
Nov 15 '05 #2
> >1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?

2. How would you find the only missing element in that array? Can you
think of a way to do it while iterating through the array only once. Is
overflow a problem in the solution? Why not?

For both the problems, the solution I can think of is to sort
the elements and then compare consecutive elements (in O(nlogn) time).
Summing the numbers didn't seem of much use. Creating another such
array and then using the elements as indices may not be that neat of an
approach from a memory perspective, I felt.

Any other ways, folks?


OT. That said, there is missing information. If array one has
1000001 entries and array two has 999999 entries (as implied by the
description) then summing works. The sum in case one is 500005000000
[n*(n+1)/2] plus the extra element and in case two minus the missing
element. The catch is that the summation may trigger overflows. Your
task is write the code for multiple precision integer arithmetic.


Sorry. You are right, there are 1,000,001 elements and 999,999 entires
in the two cases respectively.

And if not summation, any other way??

Thanks,
Harsha Srinath

Nov 15 '05 #3
Harsha Srinath wrote:
Athough this might not be directly relayed to C syntax, I thought some
of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?
unsigned long i, missing, sum, Sum = 1000000ul / 2 * 1000001ul;
for (i = 0, sum = 0; i < 1000001; i++)
sum += array[i];
missing = sum - Sum;
2. How would you find the only missing element in that array? Can you
think of a way to do it while iterating through the array only once.
unsigned long i, missing, sum, Sum = 1000000ul / 2 * 1000001ul;
for (i = 0, sum = 0; i < 999999; i++)
sum += array[i];
missing = Sum - sum;
Is overflow a problem in the solution? Why not?


Summation works fine; if you're using unsigned arithmetic, then you
can't (generally) overflow. The real problem is determining why you
have such arrays in the first place. ;)

--
Peter

Nov 15 '05 #4

"Harsha Srinath" <ha*******@gmail.com> wrote in message news:11**********************@o13g2000cwo.googlegr oups.com...
Athough this might not be directly relayed to C syntax, I thought some
of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?


The phrase "using little extra memory" makes me wonder if
the OP's friend who set this puzzle has in mind using
an array of a million bits (or a million of any data type)
and setting each bit as the corresponding number is seen.
If a bit is already set to one, then that number has been
seen before. At the end, any remaining zero bits represent
missing numbers.

--
John.
Nov 15 '05 #5
Harsha Srinath wrote:
And if not summation, any other way??


The XOR of all of the integers from 1 to 999999 is zero.

/* BEGIN new.c */

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

#define EXTRA 998000
#define MISSING 999000
#define SIZE 1000000

int main(void)
{
long unsigned *array;
long unsigned index;
long unsigned result;
const long unsigned size = SIZE;

assert(SIZE == 1000000);
array = malloc((SIZE + 1) * sizeof *array);
if (array == NULL) {
puts("I'm tired");
exit(EXIT_FAILURE);
}
for (index = 0; index != SIZE; ++index) {
array[index] = index + 1;
}
array[index] = EXTRA;
result = 0;
for (index = 0; index != SIZE + 1; ++index) {
result ^= array[index];
}
free(array);
printf("The extra number is %lu\n", result ^ SIZE);
array = malloc((SIZE - 1) * sizeof *array);
if (array == NULL) {
puts("I'm tired");
exit(EXIT_FAILURE);
}
for (index = 0; index != SIZE - 1; ++index) {
array[index] = index + 1;
}
if (size > MISSING) {
array[MISSING - 1] = SIZE;
}
result = 0;
for (index = 0; index != SIZE - 1; ++index) {
result ^= array[index];
}
free(array);
printf("The missing number is %lu\n", result ^ SIZE);
return 0;
}

/* END new.c */

--
pete
Nov 15 '05 #6
John L wrote:
"Harsha Srinath" <ha*******@gmail.com> wrote in message news:11**********************@o13g2000cwo.googlegr oups.com...
Athough this might not be directly relayed to C syntax, I thought some
of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?


The phrase "using little extra memory" makes me wonder if
the OP's friend who set this puzzle has in mind using
an array of a million bits (or a million of any data type)
and setting each bit as the corresponding number is seen.
If a bit is already set to one, then that number has been
seen before. At the end, any remaining zero bits represent
missing numbers.


Yup. Thats exactly what made me also think of an possible alternative
to the summation method.

Nov 15 '05 #7
Richard Harter wrote:
"Harsha Srinath" <ha*******@gmail.com> wrote:
Athough this might not be directly relayed to C syntax, I thought
some of you may find this an interesting problem :-

1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way
to do it using little extra memory?

2. How would you find the only missing element in that array? Can
you think of a way to do it while iterating through the array
only once. Is overflow a problem in the solution? Why not?

For both the problems, the solution I can think of is to sort
the elements and then compare consecutive elements (in O(nlogn)
time). Summing the numbers didn't seem of much use. Creating
another such array and then using the elements as indices may not
be that neat of an approach from a memory perspective, I felt.

Any other ways, folks?


OT. That said, there is missing information. If array one has
1000001 entries and array two has 999999 entries (as implied by
the description) then summing works. The sum in case one is
500005000000 [n*(n+1)/2] plus the extra element and in case two
minus the missing element. The catch is that the summation may
trigger overflows. Your task is write the code for multiple
precision integer arithmetic.


As long as M is greater than 1000000 you can use modulo M
arithmetic throughout and avoid all overflows. To put it on-topic,
this is fairly automatic if using C unsigned longs.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
Nov 15 '05 #8
(I had to restore some of the quotes here.)

In article <11**********************@o13g2000cwo.googlegroups .com>
"Harsha Srinath" <ha*******@gmail.com> wrote:
1.One number, in an array integers from 1 to 1,000,000 is present
twice. How can you determine which one? Can you think of a way to do
it using little extra memory?
[and again, but with one number missing]

(We just had this puzzle posted.)

In article <42****************@news.sbtc.net>
From: cr*@tiac.net (Richard Harter) replied:OT. That said, there is missing information. If array one has
1000001 entries and array two has 999999 entries (as implied by the
description) then summing works. ... The catch is that the summation
may trigger overflows. Your task is write the code for multiple
precision integer arithmetic.
Actually, you need not go that far. C99's "long long" is guaranteed
to hold the result without overflow (since as you noted the sum is
(1000000*1000001)/2 or 500000500000, and LLONG_MAX must be at least
9223372036854775807, which is considerably larger). But even C99
is not required:

In article <11**********************@g44g2000cwa.googlegroups .com>,
Harsha Srinath <ha*******@gmail.com> wrote:Sorry. You are right, there are 1,000,001 elements and 999,999 entires
in the two cases respectively.

And if not summation, any other way??


Use "unsigned" and the properties of Galois fields. Unsigned
arithmetic implements GF(2-sup-k), where k is the number of bits
in the unsigned integer type.

unsigned sum;
size_t i;

for (sum = 0, i = 0; i < 1000001; i++)
sum += arr[i];
sum -= (1000000U * 1000001U / 2U);
printf("the missing number is %u\n", sum);

If addition is not to your liking, you can use subtraction (which
is of course just addition in disguise), or xor.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 15 '05 #9
On Mon, 27 Jun 2005 07:16:14 +0000, Chris Torek wrote:
(I had to restore some of the quotes here.)

In article <11**********************@o13g2000cwo.googlegroups .com>
"Harsha Srinath" <ha*******@gmail.com> wrote:


....
And if not summation, any other way??


Use "unsigned" and the properties of Galois fields. Unsigned
arithmetic implements GF(2-sup-k), where k is the number of bits
in the unsigned integer type.

unsigned sum;
size_t i;

for (sum = 0, i = 0; i < 1000001; i++)
sum += arr[i];
sum -= (1000000U * 1000001U / 2U);
printf("the missing number is %u\n", sum);

If addition is not to your liking, you can use subtraction (which
is of course just addition in disguise), or xor.


However these are all forms of summation.

Lawrence

Nov 15 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
by: David Berry | last post by:
Hi All. I'm trying to write an ASP page that shows me the UNIQUE account number for a customer (so I can pass it to another page) based on a search criteria. For example, I want to do a select...
25
by: Subra | last post by:
Hi, What is the best way to find the 1000 largest numbers from the file having hell lot of entries ? Can you please help me to find out the way ? Do I need to go for B+ trees ?? Please help,...
1
by: Silver8 | last post by:
Dear Expert, I have a problem on creating a function for the user to search any missing numbers in a table. The program may request the user to find for the start and end number and thus comparing...
8
by: Engrm | last post by:
In Access, I have a table of customers, each cust has sequence of numbers that are supposed to be in sequential order and some are missing, i want list of all missing sequence number of each...
20
by: kpfunf | last post by:
Hey all, I have a table of receipts used (Table Name: receipts, Field Name: receiptNumber) and a table with receipts issued (Table Name: Receipts Batches, Field Names: BatchID, BegRng, EndRng)....
5
by: maury | last post by:
Hello, I have a DB table with data filled from a weather sensor probe, I have one row every 10 minutes and the data fields is not in DateTime format but in string format: yyyyMMddHHmm So for...
2
by: arnuld | last post by:
Again, my friend, who does not much access to internet, has given me a working program and I need your views on improvement and design. I put some error checking but my experience with C has made...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
1
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.