hash increases after exists function

Hello All,

I've run into a problem I can't solve myself, because I don't know exactly what Perl does when I use the exists function.

My script does the following:
First I use a database to build up a hash, which then holds around 800,000 values divided over 2,300 keys.
Then I read a file and check whether each component exists in that hash. If the value exists, the script simply adds 1 to the number of times the value was found. A value can exist in combination with multiple keys.
When I count the number of values before and after this counting step, I get different numbers, which in my view should be impossible.

A piece of my code looks like this:

    $chrompos = $chromosome."_".$position;
    $test_loc = $position+$lengthreads-1;
    $test_pos = $chromosome."_".$test_loc;
    foreach $exon_id (keys %exon_hash) {
        if (exists $exon_hash{$exon_id}{$chrompos}) {
            if (exists $exon_hash{$exon_id}{$test_pos}) {
                for ($i=0; $i<$lengthreads; $i++) {
                    $next_position = $position+$i;
                    $x = $chromosome."_".$next_position;
                    $exon_hash{$exon_id}{$x}{amount}++;
                }
            }
        }
    }

When I print out a chromosomal location ($chrompos), the value has changed after the second exists call. This only happens in very rare cases, but when it happens, values get added to my hash.

Does anyone know what goes wrong here and how to solve it?
Thanks in advance.

Regards
Karel
Oct 20 '08 #1
KevinADC
4,059 Expert 2GB
Maybe you want to start this loop at 1 instead of 0:

    for ($i=0;$i<$lengthreads;$i++){

If you add 0 to $position, $next_position is the same as $position, so $x is the same as $chrompos, and then you increment the value of:

    $exon_hash{$exon_id}{$x}{amount}++;

So it looks possible that you might count the above key twice for the same value. I could be totally wrong, but try using 1 as the initial value instead of 0:

    for ($i=1;$i<$lengthreads;$i++){
Oct 20 '08 #2
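The suggestion above hinges on which keys the inner loop visits. As a minimal sketch, assuming made-up sample values for $chromosome, $position and $lengthreads (they are not from Karel's data), this lists the keys the loop builds when it starts at 0:

```perl
use strict;
use warnings;

# Hypothetical sample values, chosen only for illustration.
my $chromosome  = 'chr1';
my $position    = 100;
my $lengthreads = 4;

my $chrompos = $chromosome . "_" . $position;
my $test_pos = $chromosome . "_" . ($position + $lengthreads - 1);

# Collect the keys the inner loop would touch when it starts at 0.
my @touched;
for (my $i = 0; $i < $lengthreads; $i++) {
    push @touched, $chromosome . "_" . ($position + $i);
}

print "first: $touched[0]\n";     # same string as $chrompos
print "last:  $touched[-1]\n";    # same string as $test_pos
print "count: ", scalar(@touched), "\n";
```

Under these assumptions each position is built exactly once, running from $chrompos to $test_pos. Starting the loop at 1 would leave out $chrompos itself, so it changes what gets counted and does not remove a double count.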
I tried what you suggested, but it didn't solve the problem: some extra entries still suddenly get created in my hash.

Any other suggestions?

Regards
Karel
Oct 21 '08 #3
numberwhun
3,509 Expert Mod 2GB
Can you show what is in your hash and what was expected? Also, can you show your data source?

Regards,

Jeff
Oct 21 '08 #4
KevinADC
4,059 Expert 2GB
OK, let's look at these two lines:

    if (exists $exon_hash{$exon_id}{$chrompos}){
        if (exists $exon_hash{$exon_id}{$test_pos}){

When you check for the existence of the key $chrompos in the first line, the key $exon_id will spring into "life" if it did not already exist. The same applies to the next line: if the key $exon_id did not already exist, checking for $test_pos will create it. This is called autovivification. The only key that does not get autovivified is the deepest key ($chrompos and $test_pos in this case).

I don't know if that is the problem, but I can't tell what else the problem might be just by looking at the code you posted, besides the two suggestions I have now given you.
Oct 21 '08 #5
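The behaviour described above can be reproduced in a few lines. This is only a sketch: the keys exonX and exonY and the small sample hash are hypothetical and stand in for a first-level key that is not present.

```perl
use strict;
use warnings;

my %exon_hash = ( exon1 => { chr1_100 => { amount => 0 } } );

# 'exonX' is a hypothetical first-level key that is NOT in the hash.
my $found = exists $exon_hash{exonX}{chr1_100};

# The test itself is false, but the first-level key now exists
# and points to an empty hash reference.
print "found: ", ($found ? "yes" : "no"), "\n";
print "exonX vivified: ", (exists $exon_hash{exonX} ? "yes" : "no"), "\n";
print "first-level keys: ", scalar(keys %exon_hash), "\n";

# Guarded form: test each level in turn so nothing is created.
my $safe = exists $exon_hash{exonY} && exists $exon_hash{exonY}{chr1_100};
print "exonY vivified: ", (exists $exon_hash{exonY} ? "yes" : "no"), "\n";
```

Testing the outer level first, as in the last statement, is one common way to keep a lookup from creating intermediate entries.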
If we consider the top-level foreach loop here, autovivification should not arise:

    foreach $exon_id (keys %exon_hash)
    {
        if (exists $exon_hash{$exon_id}{$chrompos})
        {
            if (exists $exon_hash{$exon_id}{$test_pos}){

The loop only takes keys from %exon_hash that already exist in the hash, so when the first exists check looks up a second-level key, the first-level key is already there. The same holds for the next exists check. With this step-by-step navigation through key levels 1, 2, 3 and so on, I feel autovivification should not come into the picture.

Please correct me if my understanding is wrong.

Regards,
Pawan
Oct 23 '08 #6
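Pawan's reasoning can be checked with a small sketch. The data below is hypothetical, shaped like the two-level hash described in the thread, and count_keys is a helper made up for this example.

```perl
use strict;
use warnings;

# Hypothetical two-level data in the shape described in the thread.
my %exon_hash = (
    exon1 => { chr1_100 => { amount => 0 }, chr1_101 => { amount => 0 } },
    exon2 => { chr1_500 => { amount => 0 } },
);

# Return the number of first-level keys and the total number of
# second-level keys.
sub count_keys {
    my ($h) = @_;
    my $second = 0;
    $second += scalar keys %{ $h->{$_} } for keys %$h;
    return (scalar(keys %$h), $second);
}

my @before = count_keys(\%exon_hash);

# Probe a second-level key that no exon has, using only first-level
# keys that come from the hash itself.
foreach my $exon_id (keys %exon_hash) {
    my $hit = exists $exon_hash{$exon_id}{chr1_999};
}

my @after = count_keys(\%exon_hash);
print "before: @before\n";
print "after:  @after\n";
```

Both lines print the same counts here, which fits the argument that the two exists checks alone do not add entries when the first-level key comes from keys %exon_hash.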
KevinADC
4,059 Expert 2GB
After reading your clear explanation above, I agree with you. Autovivification should not be occurring as the loop advances from the level-one key to the level-two key. As long as a level is not skipped, everything should be OK.

Good observation Pawan,
Kevin
Oct 23 '08 #7
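One line the thread does not examine is the increment inside the inner loop, which writes through a second-level key that is never tested with exists. Whether that matters for Karel's data is not known. The sketch below only shows the mechanism, assuming a hypothetical exon that contains the first and last position of a read but lacks one position in between.

```perl
use strict;
use warnings;

# Hypothetical exon: has chr1_100 and chr1_102 but not chr1_101.
my %exon_hash = (
    exon1 => {
        chr1_100 => { amount => 0 },
        chr1_102 => { amount => 0 },
    },
);

my ($chromosome, $position, $lengthreads) = ('chr1', 100, 3);

my $before = scalar keys %{ $exon_hash{exon1} };

for (my $i = 0; $i < $lengthreads; $i++) {
    my $x = $chromosome . "_" . ($position + $i);
    # Unguarded: creates $exon_hash{exon1}{$x} if it is absent.
    $exon_hash{exon1}{$x}{amount}++;
}

my $after = scalar keys %{ $exon_hash{exon1} };
print "second-level keys before: $before, after: $after\n";

# Guarded variant on a fresh copy of the same hypothetical data:
# only count positions that are already present.
my %guarded = (
    exon1 => { chr1_100 => { amount => 0 }, chr1_102 => { amount => 0 } },
);
for (my $i = 0; $i < $lengthreads; $i++) {
    my $x = $chromosome . "_" . ($position + $i);
    $guarded{exon1}{$x}{amount}++ if exists $guarded{exon1}{$x};
}
print "guarded second-level keys: ", scalar(keys %{ $guarded{exon1} }), "\n";
```

If every position between $chrompos and $test_pos is always present under the same exon, the unguarded increment creates nothing. Rare gaps in the data would match the "only in very rare cases" symptom, but that remains an assumption.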