hash increases after exists function

Newbie
 
Join Date: Oct 2008
Posts: 2
#1: Oct 20 '08
Hello All,

I've run into a problem I can't solve myself because I don't know exactly what Perl does when I use the exists function.

My script does the following:
First I use a database to build up a hash; this hash then holds around 800,000 values spread over 2,300 keys.
Then I read a file and check whether each component exists in that hash. If the value exists, the script simply adds 1 to the number of times the value was found. A value can exist in combination with multiple keys.
When I count the number of values before and after the counting step, I get different numbers, which in my view should be impossible.

A piece of my code looks like this:

    $chrompos = $chromosome."_".$position;
    $test_loc = $position+$lengthreads-1;
    $test_pos = $chromosome."_".$test_loc;
    foreach $exon_id (keys %exon_hash){
        if (exists $exon_hash{$exon_id}{$chrompos}){
            if (exists $exon_hash{$exon_id}{$test_pos}){
                for ($i=0;$i<$lengthreads;$i++){
                    $next_position = $position+$i;
                    $x = $chromosome."_".$next_position;
                    $exon_hash{$exon_id}{$x}{amount}++;
                }
            }
        }
    }

When I print out a chromosomal location ($chrompos), the value has changed after the second exists check. This happens only in very rare cases, but when it does, values get added to my hash.

Does anyone know what goes wrong here and how to solve it?
Thanks in advance.

Regards
Karel

KevinADC
Expert
 
Join Date: Jan 2007
Location: Southern California USA
Posts: 4,091
#2: Oct 20 '08

Maybe you want to start this loop at 1 instead of 0:

    for ($i=0;$i<$lengthreads;$i++){

When $i is 0, adding it to $position leaves the position unchanged, so $x is the same key as $chrompos, and then you increment the value of:

    $exon_hash{$exon_id}{$x}{amount}++;

So it looks possible that you might count the above key twice for the same value. I could be totally wrong, but try using 1 as the initial value instead of 0:

    for ($i=1;$i<$lengthreads;$i++){

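To check which keys the inner loop actually touches, here is a small sketch with made-up values (chr1, position 100 and read length 4 are assumptions, not Karel's data, and keys_touched is a hypothetical helper). It lists the keys built when the loop starts at 0 and when it starts at 1.

```perl
use strict;
use warnings;

# Made-up values for illustration only.
my ($chromosome, $position, $lengthreads) = ('chr1', 100, 4);

# Hypothetical helper: returns the keys the inner loop would build
# when $i runs from $start up to $lengthreads - 1.
sub keys_touched {
    my ($start) = @_;
    my @keys;
    for my $i ($start .. $lengthreads - 1) {
        push @keys, $chromosome . "_" . ($position + $i);
    }
    return @keys;
}

my @from_zero = keys_touched(0);
my @from_one  = keys_touched(1);

print "start at 0: @from_zero\n";
print "start at 1: @from_one\n";
```

In this sketch each key is built once per pass. Starting at 0 covers every position from $chrompos up to $test_pos, while starting at 1 leaves $chrompos itself uncounted.
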
Newbie
 
Join Date: Oct 2008
Posts: 2
#3: Oct 21 '08

I tried what you suggested, but it didn't solve the problem: some extra entries still suddenly got created in my hash.

Any other suggestions?

Regards
Karel
numberwhun
Site Moderator
 
Join Date: May 2007
Location: New Hampshire
Posts: 2,572
#4: Oct 21 '08

Can you show what is in your hash and what was expected? Also, can you show your data source?

Regards,

Jeff
KevinADC
Expert
 
Join Date: Jan 2007
Location: Southern California USA
Posts: 4,091
#5: Oct 21 '08

OK, let's look at these two lines:

    if (exists $exon_hash{$exon_id}{$chrompos}){
        if (exists $exon_hash{$exon_id}{$test_pos}){

When you check for the existence of the key $chrompos in the first line, the key $exon_id will spring into "life" if it did not already exist. The same applies to the next line: when you check for $test_pos, $exon_id will spring into life if it did not already exist. This is called autovivification. The only key that does not get autovivified is the deepest key ($chrompos and $test_pos in this case).

I don't know if that is the problem, but I can't tell what else the problem might be just by looking at the code you posted, beyond the two suggestions I have now given you.
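
The autovivification behaviour described above can be reproduced in a few lines. This is a minimal sketch with made-up keys (exon1, exon2 and chr1_100 are assumptions, not Karel's data): it tests a second-level key under a first-level key that is absent and then checks whether the first-level key has appeared.

```perl
use strict;
use warnings;

# Made-up data: only exon1 is present at the top level.
my %exon_hash = ( exon1 => { chr1_100 => { amount => 0 } } );

# Testing a second-level key under an absent first-level key
# creates the first-level key as an empty hash reference.
my $found = exists $exon_hash{exon2}{chr1_100};

my $found_text    = $found ? 'yes' : 'no';
my $vivified_text = exists $exon_hash{exon2} ? 'yes' : 'no';

print "found: $found_text\n";
print "exon2 now exists: $vivified_text\n";
print "top-level keys: ", scalar(keys %exon_hash), "\n";
```

The lookup itself reports that the position was not found, yet the top-level key count goes from 1 to 2 because exon2 was created along the way.
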
pawanrpandey
Newbie
 
Join Date: Feb 2007
Location: Bangalore
Posts: 11
#6: Oct 23 '08

If we consider the top-level foreach loop, autovivification should not arise here:

    foreach $exon_id (keys %exon_hash)
    {
        if (exists $exon_hash{$exon_id}{$chrompos})
        {
            if (exists $exon_hash{$exon_id}{$test_pos}){

The loop takes only keys that already exist in %exon_hash, so the first exists check tests a second-level key under a first-level key that is already present, and the same holds for the next exists check. So I feel that in this tight check mode, where the key levels are navigated in order (1, 2, 3, ...), autovivification should not come into the picture.

Please correct me if my understanding is wrong.

Regards,
Pawan
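
Pawan's argument can be checked by counting entries before and after the two exists tests. The sketch below uses made-up data and a hypothetical helper, count_entries, which returns the number of top-level keys and the total number of second-level entries; none of these names come from Karel's script.

```perl
use strict;
use warnings;

# Made-up data: two exons, three positions in total.
my %exon_hash = (
    exon1 => { chr1_100 => { amount => 0 }, chr1_101 => { amount => 0 } },
    exon2 => { chr1_500 => { amount => 0 } },
);

# Hypothetical helper: counts top-level keys and second-level entries.
sub count_entries {
    my ($hash) = @_;
    my $second_level = 0;
    $second_level += scalar keys %{ $hash->{$_} } for keys %$hash;
    return (scalar(keys %$hash), $second_level);
}

my @before = count_entries(\%exon_hash);

my ($chrompos, $test_pos) = ('chr1_100', 'chr1_999');
foreach my $exon_id (keys %exon_hash) {
    # $exon_id always exists here, so only the deepest key is tested.
    if (exists $exon_hash{$exon_id}{$chrompos}) {
        if (exists $exon_hash{$exon_id}{$test_pos}) {
            print "both positions found in $exon_id\n";
        }
    }
}

my @after = count_entries(\%exon_hash);
print "before: @before, after: @after\n";
```

Both counts stay the same in this sketch, even though chr1_999 is in neither exon, because exists never creates the deepest key it tests.
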
KevinADC
Expert
 
Join Date: Jan 2007
Location: Southern California USA
Posts: 4,091
#7: Oct 23 '08

After reading your clear explanation above, I agree with you. Autovivification should not be occurring as the loop advances from the level-one key to the level-two key. As long as a level is not skipped, everything should be OK.

Good observation, Pawan,
Kevin
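
One place in the posted code where new entries can still appear, even when both exists checks pass, is the increment line: $exon_hash{$exon_id}{$x}{amount}++ creates $exon_hash{$exon_id}{$x} when that position is not already in the hash. Whether this is what happens with Karel's data is an assumption, since the thread does not show the data. The sketch below uses made-up positions with a gap between the first and last position of the read, and compares the unguarded increment with a guarded variant that skips positions not already present.

```perl
use strict;
use warnings;

# Made-up data: positions 100 and 103 exist, 101 and 102 do not.
my %exon_hash = (
    exon1 => { chr1_100 => { amount => 0 }, chr1_103 => { amount => 0 } },
);
my ($chromosome, $position, $lengthreads) = ('chr1', 100, 4);

my $before = scalar keys %{ $exon_hash{exon1} };

# Unguarded increment, as in the posted loop: missing positions are created.
for my $i (0 .. $lengthreads - 1) {
    my $x = $chromosome . "_" . ($position + $i);
    $exon_hash{exon1}{$x}{amount}++;
}
my $after_unguarded = scalar keys %{ $exon_hash{exon1} };

# Undo the additions so the guarded variant starts from the same data.
delete @{ $exon_hash{exon1} }{qw(chr1_101 chr1_102)};

# Guarded variant (an assumption about the intended behaviour):
# only count positions that are already in the hash.
for my $i (0 .. $lengthreads - 1) {
    my $x = $chromosome . "_" . ($position + $i);
    next unless exists $exon_hash{exon1}{$x};
    $exon_hash{exon1}{$x}{amount}++;
}
my $after_guarded = scalar keys %{ $exon_hash{exon1} };

print "before: $before, unguarded: $after_unguarded, guarded: $after_guarded\n";
```

If reads can span positions that were never loaded from the database, a guard like this keeps the number of entries stable. Whether skipping those positions is the right behaviour depends on what the counts are meant to represent.
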