
hash increases after exists function

P: 2
Hello All,

I've run into a problem I can't solve myself because I don't know exactly what Perl does when I use the exists function.

My script does the following:
First I use a database to build up a hash, which then holds around 800,000 values divided over 2,300 keys.
Then I read a file and check whether each component exists in that hash. If the value exists, the script adds 1 to the number of times the value was found. A value can exist in combination with multiple keys.
When I count the number of values before and after the counting step, I get different numbers, which should, in my view, be impossible.

A piece of my code looks like this:
    $chrompos = $chromosome."_".$position;
    $test_loc = $position+$lengthreads-1;
    $test_pos = $chromosome."_".$test_loc;
    foreach $exon_id (keys %exon_hash){
        if (exists $exon_hash{$exon_id}{$chrompos}){
            if (exists $exon_hash{$exon_id}{$test_pos}){
                for ($i=0;$i<$lengthreads;$i++){
                    $next_position = $position+$i;
                    $x = $chromosome."_".$next_position;
                    $exon_hash{$exon_id}{$x}{amount}++;
                }
            }
        }
    }

When I print out a chromosomal location ($chrompos), the value has changed after the second exists check. This happens only in very rare cases, but when it does, values get added to my hash.

Does anyone know what goes wrong here and how to solve it?
Thanks in advance.

Regards
Karel
Oct 20 '08 #1
6 Replies


KevinADC
Expert 2.5K+
P: 4,059
Maybe you want to start this loop at 1 instead of 0:

    for ($i=0;$i<$lengthreads;$i++){

When $i is 0, adding it to $position leaves the position unchanged, so $x is the same key as $chrompos, and you then increment:

    $exon_hash{$exon_id}{$x}{amount}++;

So it looks possible that you count that key twice for the same value. I could be totally wrong, but try using 1 as the initial value instead of 0:

    for ($i=1;$i<$lengthreads;$i++){
Oct 20 '08 #2
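
To make the effect of the start index concrete, here is a minimal sketch that lists the keys the counting loop would touch for one read. The read coordinates and the helper name `visited_keys` are assumptions chosen for illustration, not values from the original script.

```perl
use strict;
use warnings;

# Hypothetical read: chromosome "chr1", start position 100, read length 3.
my ($chromosome, $position, $lengthreads) = ('chr1', 100, 3);

# Return the keys the counting loop would build for a given start index.
sub visited_keys {
    my ($start) = @_;
    my @keys;
    for my $i ($start .. $lengthreads - 1) {
        push @keys, $chromosome . "_" . ($position + $i);
    }
    return @keys;
}

print "start 0: ", join(" ", visited_keys(0)), "\n";   # start 0: chr1_100 chr1_101 chr1_102
print "start 1: ", join(" ", visited_keys(1)), "\n";   # start 1: chr1_101 chr1_102
```

In this sketch each key is built once per read with either start index; starting at 1 simply leaves the first position of the read uncounted.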

P: 2
I tried what you suggested, but it didn't solve the problem: some extra entries still suddenly get created in my hash.

Any other suggestions?

Regards
Karel
Oct 21 '08 #3

numberwhun
Expert Mod 2.5K+
P: 3,503
Can you show what is in your hash and what was expected? Also, can you show your data source?

Regards,

Jeff
Oct 21 '08 #4

KevinADC
Expert 2.5K+
P: 4,059
OK, let's look at these two lines:

    if (exists $exon_hash{$exon_id}{$chrompos}){
        if (exists $exon_hash{$exon_id}{$test_pos}){

When you check for the existence of the key $chrompos in the first line, if the key $exon_id did not already exist it will spring into "life". The same goes for the next line: if the $test_pos key did not exist, $exon_id will spring into life if it did not already exist. This is called autovivification. The only key that does not get autovivified is the deepest key ($chrompos and $test_pos in this case).
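
A minimal sketch of the behaviour described above, using an empty toy hash. The key names `exonA`, `exonB` and `chr1_100` are made up for illustration.

```perl
use strict;
use warnings;

my %h;   # empty toy hash (hypothetical)

# Testing a second-level key creates the missing first-level key as a side effect.
my $found = exists $h{exonA}{chr1_100};

print "found: ", ($found ? "yes" : "no"), "\n";                                   # found: no
print "exonA exists now: ", (exists $h{exonA} ? "yes" : "no"), "\n";              # exonA exists now: yes
print "chr1_100 exists: ", (exists $h{exonA}{chr1_100} ? "yes" : "no"), "\n";     # chr1_100 exists: no

# Checking each level in turn avoids the side effect.
my $safe = exists $h{exonB} && exists $h{exonB}{chr1_100};
print "exonB exists now: ", (exists $h{exonB} ? "yes" : "no"), "\n";              # exonB exists now: no
```

Only the intermediate level is created; the deepest key being tested is not.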

I don't know if that is the problem, but I can't tell what else it might be just by looking at the code you posted, beyond the two suggestions I have now given you.
Oct 21 '08 #5

P: 14
If we consider the top-level foreach loop here, autovivification should not arise:

    foreach $exon_id (keys %exon_hash)
    {
        if (exists $exon_hash{$exon_id}{$chrompos})
        {
            if (exists $exon_hash{$exon_id}{$test_pos}){

The loop takes only keys that already exist in %exon_hash, and the first exists check then tests a second-level key under a first-level key that is already present. The same holds for the next exists check. So in this tight check, where the code moves level by level through the keys without skipping one, I feel autovivification should not come into the picture.

Please correct me if my understanding is wrong.

Regards,
Pawan
Oct 23 '08 #6
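
Pawan's point can be checked with a small sketch, assuming a toy hash in place of the real data. The ids, positions and the helper `total_positions` are hypothetical.

```perl
use strict;
use warnings;

# Toy stand-in for the real hash (hypothetical ids and positions).
my %exon_hash = (
    exon1 => { chr1_100 => { amount => 0 } },
    exon2 => { chr1_200 => { amount => 0 } },
);

# Total number of second-level keys across all first-level keys.
sub total_positions {
    my $total = 0;
    $total += scalar keys %{ $exon_hash{$_} } for keys %exon_hash;
    return $total;
}

my $before = total_positions();

# First-level keys come from keys(), so every $exon_id already exists.
for my $exon_id (keys %exon_hash) {
    my $hit = exists $exon_hash{$exon_id}{chr1_999};   # position absent everywhere
}

my $after = total_positions();
print "before: $before, after: $after\n";   # before: 2, after: 2
```

The counts match because exists never creates the deepest key it tests, and the level above it is already present.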

KevinADC
Expert 2.5K+
P: 4,059
After reading your clear explanation above, I agree with you. Autovivification should not be occurring as the loop advances from the level-one key to the level-two key. As long as a level is not skipped, everything should be OK.

Good observation Pawan,
Kevin
Oct 23 '08 #7
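
The thread leaves the source of the extra entries open. One candidate worth checking, offered as an assumption rather than a confirmed diagnosis, is the increment itself: `$exon_hash{$exon_id}{$x}{amount}++` creates `{$x}` when that position is missing, which could happen if both ends of a read exist under an exon id but a position in between does not. The sketch below uses a hypothetical helper `count_read` and toy data with such a gap.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the loop in the question; $guard toggles an extra check.
sub count_read {
    my ($exons, $chromosome, $position, $lengthreads, $guard) = @_;
    my $last = $position + $lengthreads - 1;
    for my $exon_id (keys %$exons) {
        my $positions = $exons->{$exon_id};
        next unless exists $positions->{"${chromosome}_$position"};
        next unless exists $positions->{"${chromosome}_$last"};
        for my $next_position ($position .. $last) {
            my $x = "${chromosome}_$next_position";
            next if $guard && !exists $positions->{$x};
            $positions->{$x}{amount}++;   # creates {$x} if it is missing
        }
    }
}

# Toy exon with a gap: positions 100 and 102 exist, 101 does not (made-up data).
sub toy {
    return +{ exon1 => { chr1_100 => { amount => 0 }, chr1_102 => { amount => 0 } } };
}

my $unguarded = toy();
count_read($unguarded, 'chr1', 100, 3, 0);
print scalar(keys %{ $unguarded->{exon1} }), "\n";   # 3: chr1_101 was created

my $guarded = toy();
count_read($guarded, 'chr1', 100, 3, 1);
print scalar(keys %{ $guarded->{exon1} }), "\n";     # 2: no new entry
```

Whether skipping a missing position or rejecting the whole read is the right response depends on the data, which the thread does not show.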
