
How to get a while loop to recognize an extracted number

P: 14
I am writing a fairly simple script that is supposed to print out lines from a txt file and group them based on where I want them. The general format of the files I am using the script on is as follows:

662376 [S] |GCGG | | |
662375 [S] |CGCC | | |
662374 [S] |GCGG | | |
662373 [H] |CATC | | |
662371 [Y] |TCCC | | |
662369 [M] |CACCC| | |
662367 [Y] |TCTTT| | |
662365 [S] |GCGGG| | |
662364 [R] |AGAAA| | |
662360 [M] |CACCC| | |
662359 [W] |ATAAA| | |
662358 [S] |CGCCC| | |
662357 [M] |ACAAA| | |
662356 [R] |GAAAG| | |
662354 [S] |GCGGG| | |
662353 [R] |AGAAA| | |
662352 [R] |GAGGG| | |
662350 [K] |TGTTT| | |
662349 [Y] |CTCCC| | |
662348 [M] |ACAAA| | |
662347 [R] |GAGGG| | |


and here is the code:

print "What is the start of the ORF?\n";
$orf_beg = <STDIN>;

open (IN, '/home/asession/pracperl/bin/SNPYNR+-1kb.txt') || die "nope\n";

while (<IN>) {

    chomp;

    $position = substr $_, 1, 11;               # extract the position
    $type_of_polymorphism = substr $_, 13, 3;   # extract the polymorphism type, same as ref if no polymorphism
    $ref_seq = substr $_, 17, 1;                # extract the cerevisiae ref sequence
    $para_seq = substr $_, 18, 1;               # extract the paradoxus sequence
    $mika_seq = substr $_, 19, 1;               # extract the mika sequence
    $kudr_seq = substr $_, 20, 1;               # extract the kudr sequence
    $baya_seq = substr $_, 21, 1;               # extract the bayanus sequence


    $baya_seq =~ s/\s+/x/g;
    #$position =~ tr/ //d;

    foreach ($position) {
        while ($postion > $orf_beg) {
        print "$position $ref_seq $para_seq $baya_seq\n";
}}

}
close (IN);

    print "\n";

Basically, the problem I am having is that the while loop within the foreach loop is not recognizing the position variable as a number. I tried removing the white space, thinking that was the issue, but it did not work. Any suggestions?
Jul 17 '08 #1
16 Replies


KevinADC
Expert 2.5K+
P: 4,059
If this is supposed to be input to the perl program, remove the newline from the end:

print "What is the start of the ORF?\n";

should be:

print "Enter the start of the ORF: ";
chomp($orf_beg = <STDIN>);
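
A minimal sketch of what chomp changes here, assuming the user types 662350 and presses Enter. The input is simulated with an in-memory filehandle instead of STDIN, and the names $typed, $fake_stdin and $raw are made up for the illustration:

```perl
use strict;
use warnings;

# Simulate a line typed at the prompt: the digits plus the newline from Enter.
my $typed = "662350\n";
open(my $fake_stdin, '<', \$typed) or die "cannot open in-memory handle: $!";

my $raw     = <$fake_stdin>;   # what $orf_beg = <STDIN> would hold
my $orf_beg = $raw;
chomp($orf_beg);               # strip the trailing newline

printf "raw length:     %d\n", length $raw;       # six digits plus "\n"
printf "chomped length: %d\n", length $orf_beg;   # six digits only
```

With the newline left on, a string test such as eq or gt sees a different value than the bare digits.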
Jul 17 '08 #2

P: 14
That's not the variable that Perl isn't recognizing; $position is the problem. I defined $orf_beg before opening the filehandle so that the loop wouldn't ask me to input the start of the ORF on every iteration.
Jul 17 '08 #3

eWish
Expert 100+
P: 971
Are you certain that $position has a number in it? Have you checked the value of $position?

--Kevin
Jul 17 '08 #4

numberwhun
Expert Mod 2.5K+
P: 3,503
> Are you certain that $position has a number in it? Have you checked the value of $position?
>
> --Kevin

To find out, you could run your script in the Perl debugger and check all the values to make sure. To start the debugger, use:

perl -d <scriptname>

The debugger is a little daunting at first, but it's not bad once you read a tutorial; even the one at Perldoc is doable.
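
A lighter-weight check than the debugger, sketched under the assumption that the data lines are fixed-width with one leading space (the sample line below is hypothetical). It prints the extracted value with visible delimiters, using the core Data::Dumper module:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq = 1;   # show control characters as escapes
$Data::Dumper::Terse = 1;   # print the bare value without a "$VAR1 =" prefix

# Hypothetical sample line in a fixed-width layout (leading space assumed).
my $line     = " 662376     [S] |GCGG | |     |";
my $position = substr $line, 1, 10;

# Brackets make leading and trailing blanks visible in a plain print.
print "position = [$position]\n";

# Dumper quotes the value, so padding or a stray newline cannot hide.
print Dumper($position);
```

If the brackets show padding or unexpected characters around the digits, the substr offsets are the first thing to look at.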

Regards,

Jeff
Jul 17 '08 #5

nithinpes
Expert 100+
P: 410
> Are you certain that $position has a number in it? Have you checked the value of $position?
>
> --Kevin

sessmurda,

I was wondering about the logic in this loop:

    foreach ($position) {
        while ($postion > $orf_beg) {
        print "$position $ref_seq $para_seq $baya_seq\n";
}}

Inside the while ($postion > $orf_beg) {} loop, you are not modifying the value of $position. So, if the condition is true, this will result in an infinite loop.

You should be using an if() statement instead of a while() loop. However, check that $position does indeed have a number in it by printing out the value.
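
A minimal sketch of the if() version of the test. The @positions list and the threshold are hypothetical stand-ins for the values read from the file and from STDIN:

```perl
use strict;
use warnings;

# Hypothetical positions standing in for the values read from the file.
my @positions = (662376, 662350, 662347);
my $orf_beg   = 662350;

my @kept;
for my $position (@positions) {
    # "if" tests the condition once per record and moves on.
    # A "while" here would never end for 662376, because nothing
    # inside the body changes $position or $orf_beg.
    if ($position > $orf_beg) {
        push @kept, $position;
        print "$position\n";
    }
}
```

Only the records that pass the test are printed, and each record is examined exactly once.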

- Nithin
Jul 18 '08 #6

KevinADC
Expert 2.5K+
P: 4,059
This line:

$position = substr $_, 1, 11;

with this input from the file:

662376 [S] |GCGG | | |

will result in:

62376 [S] |

Is that really what you want?
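
A short sketch of why that happens: substr offsets start at 0, so an offset of 1 skips the first character. The line below is the sample as it appears in the first post, with its spacing collapsed. The regex at the end is an alternative offered as an assumption, not something proposed in the thread:

```perl
use strict;
use warnings;

# The sample line as it appeared in the first post (spacing collapsed).
my $line = "662376 [S] |GCGG | | |";

# substr offsets start at 0, so offset 1 skips the first character.
my $from_one  = substr $line, 1, 11;
my $from_zero = substr $line, 0, 6;

print "[$from_one]\n";    # [62376 [S] |]
print "[$from_zero]\n";   # [662376]

# Alternative that does not depend on column counts: capture the
# leading number, with an optional decimal part.
my ($position) = $line =~ /^\s*(\d+(?:\.\d+)?)/;
print "[$position]\n";    # [662376]
```

Whether offset 1 is right depends on whether the real file has a leading space on each line, which the collapsed sample cannot show.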
Jul 18 '08 #7

numberwhun
Expert Mod 2.5K+
P: 3,503
> sessmurda,
>
> I was wondering about the logic in this loop:
>
>     foreach ($position) {
>         while ($postion > $orf_beg) {
>         print "$position $ref_seq $para_seq $baya_seq\n";
> }}
>
> Inside the while ($postion > $orf_beg) {} loop, you are not modifying the value of $position. So, if the condition is true, this will result in an infinite loop.
>
> You should be using an if() statement instead of a while() loop. However, check that $position does indeed have a number in it by printing out the value.
>
> - Nithin

Also, correct me if I am wrong, but I don't see that $position was defined as an array, so the foreach will only make one pass.

Regards,

Jeff
Jul 18 '08 #8

nithinpes
Expert 100+
P: 410
> Also, correct me if I am wrong, but I don't see that $position was defined as an array, so the foreach will only make one pass.
>
> Regards,
>
> Jeff

Thanks for the addition, Jeff. I missed it totally :)

The foreach loop serves no specific purpose in that section of the script.
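
A small sketch of that point, with a hypothetical counter ($passes) added to make the number of iterations visible:

```perl
use strict;
use warnings;

my $position = 662376;
my $passes   = 0;

# A one-element list: the body runs once, with $_ aliased to $position.
foreach ($position) {
    $passes++;
    print "pass $passes: $_\n";
}

print "total passes: $passes\n";   # total passes: 1
```

The outer while (<IN>) loop already visits every line of the file, so the inner foreach adds nothing.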
Jul 18 '08 #9

P: 14
I am currently working on the foreach and while/if suggestions you guys provided, thanks! To answer some questions: the position is a number. It doesn't show in the text I copied and pasted, but there is a lot of white space between the last digit of the position and the sequence characters, to allow for the fact that on occasion the position will need decimals due to the format of the file. I checked all of the substring column numbers before posting, and they are all correct. I originally had the if statement inside the loop, but it kept asking me for the polymorphism on every iteration, so I incorrectly thought while was what I should be using. Also, when I tried pushing the $position variable into an array, it would not move past the first position, even when I tried the shift function. When I use the script I posted without the while/if line, and have $position acted on by the foreach loop, it works. It's just that I want to catalogue the positions based on where I specify, then output each group to its own file, which I will work on after I get the script working.

Thanks again for the help.
Jul 18 '08 #10

KevinADC
Expert 2.5K+
P: 4,059
Use the code tags around your data to preserve the spacing.
Jul 18 '08 #11

P: 14
Sorry for the late response, I had a conference away in LA. After playing with the script and turning on warnings, I have found that whether I print out only $position, skip STDIN for $orf_beg, or redefine $position so that it does not include decimals, the script tells me this:

Use of uninitialized value in string gt at gamma.pl line 25, <IN> line 1.
Use of uninitialized value in string gt at gamma.pl line 25, <IN> line 2.

etc. etc. So I figure that even though I see only numbers whenever I print out the position variable, it is not reading the position as a number, maybe because of the way I created it with substr. That is my only guess, and I was wondering how I would go about making sure that the gt statement works. I tried reformatting $position with sprintf so that all of the values become numbers rounded to the nearest thousandth, but that did not work out. I also took out the foreach statement as suggested; the new script is below:

#!/usr/bin/perl -w


print "What is the start of the ORF?";
$orf_beg = <STDIN>;

open (IN, '/home/asession/pracperl/bin/SNPYNR+-1kb.txt') || die "nope\n";

while (<IN>) {

    chomp;

    $position = substr $_, 1, 10;               # extract the position
    $type_of_polymorphism = substr $_, 13, 3;   # extract the polymorphism type, same as ref if no polymorphism
    $ref_seq = substr $_, 17, 1;                # extract the cerevisiae ref sequence
    $para_seq = substr $_, 18, 1;               # extract the paradoxus sequence
    $mika_seq = substr $_, 19, 1;               # extract the mika sequence
    $kudr_seq = substr $_, 20, 1;               # extract the kudr sequence
    $baya_seq = substr $_, 21, 1;               # extract the bayanus sequence


    $baya_seq =~ s/\s+/x/g;                     # replace all blanks with 'x'
    $position =~ tr/ //d;                       # remove white space

    if ($postion gt $orf_beg) {
        print "$position $ref_seq $para_seq $baya_seq\n";
    #print "$position\n";

}}
close (IN);

    print "\n";
Jul 22 '08 #12

KevinADC
Expert 2.5K+
P: 4,059
I need to see your data, the formatted data. Post it and put code tags around it.
Jul 22 '08 #13

P: 14
My bad, here is the data:

 662376     [S] |GCGG | |     |
 662375     [S] |CGCC | |     |
 662374     [S] |GCGG | |     |
 662373     [H] |CATC | |     |
 662371     [Y] |TCCC | |     |
 662369     [M] |CACCC| |     |
 662367     [Y] |TCTTT| |     |
 662365     [S] |GCGGG| |     |
 662364     [R] |AGAAA| |     |
 662360     [M] |CACCC| |     |
 662359     [W] |ATAAA| |     |
 662358     [S] |CGCCC| |     |
 662357     [M] |ACAAA| |     |
 662356     [R] |GAAAG| |     |
 662354     [S] |GCGGG| |     |
 662353     [R] |AGAAA| |     |
 662352     [R] |GAGGG| |     |
 662350     [K] |TGTTT| |     |
 662349     [Y] |CTCCC| |     |
 662348     [M] |ACAAA| |     |
 662347     [R] |GAGGG| |     |
 662346     [b] |GCCTC| |     |
 662345     [M] |CACCC| |     |
 662344     [b] |CGTCC| |     |
 662343.002 [c] |---C-| |     |
 662343.001 [a] |---A-| |     |
 662343     [K] |GTTGG| |     |
Jul 23 '08 #14

nithinpes
Expert 100+
P: 410
> Sorry for the late response, I had a conference away in LA. After playing with the script and turning on warnings, I have found that whether I print out only $position, skip STDIN for $orf_beg, or redefine $position so that it does not include decimals, the script tells me this:
>
> Use of uninitialized value in string gt at gamma.pl line 25, <IN> line 1.
> Use of uninitialized value in string gt at gamma.pl line 25, <IN> line 2.

The error you are getting is because of a typo in this line:

if ($postion gt $orf_beg) {

Look out for the missing 'i' in $position.
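
A minimal sketch of what the typo does at run time, assuming hypothetical values for the position and the threshold. Strict is left off on purpose to mirror the script above, and the uninitialized warning is silenced only to keep the demonstration quiet:

```perl
use warnings;
# Deliberately no "use strict" here, to mirror the script in the thread.

# Declared with "our" only so that warnings does not flag names used once.
our ($position, $postion, $orf_beg);
$position = 662376;
$orf_beg  = 662350;

# The misspelled variable was never assigned, so it is undef.
my $state = defined $postion ? "defined" : "undef";
print "\$postion is $state\n";

my $misspelled;
{
    no warnings 'uninitialized';   # silence the warning the thread shows
    # undef acts as the empty string under gt, so this test is always false.
    $misspelled = ($postion gt $orf_beg) ? "true" : "false";
}
print "misspelled test: $misspelled\n";

# The intended comparison, on the correctly spelled variable.
my $intended = ($position > $orf_beg) ? "true" : "false";
print "intended test: $intended\n";
```

Because the misspelled test is always false, the print inside the if never runs, which matches a script that produces warnings but no output lines.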
Jul 23 '08 #15

KevinADC
Expert 2.5K+
P: 4,059
> The error you are getting is because of a typo in this line:
>
> if ($postion gt $orf_beg) {
>
> Look out for the missing 'i' in $position.

Good catch, and it reminds me of why I don't like to look over people's code that does not use "strict".
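
A sketch of the same filter written under strict, so that a misspelled variable name stops the script at compile time with an error beginning 'Global symbol "$postion" requires explicit package name'. The threshold value, the in-memory @lines list and the @kept array are assumptions for the illustration. The sample lines follow the fixed-width layout posted earlier, with one leading space. The numeric > operator is also an assumption, chosen because the positions are numbers:

```perl
use strict;
use warnings;

# Hypothetical threshold; the thread reads this from STDIN.
my $orf_beg = 662345;

# A few lines in the layout posted in the thread (leading space assumed).
my @lines = (
    " 662347     [R] |GAGGG| |     |",
    " 662345     [M] |CACCC| |     |",
    " 662343.002 [c] |---C-| |     |",
);

my @kept;
for my $line (@lines) {
    my $position = substr $line, 1, 10;    # extract the position
    my $ref_seq  = substr $line, 17, 1;    # cerevisiae reference base
    my $para_seq = substr $line, 18, 1;    # paradoxus base
    my $baya_seq = substr $line, 21, 1;    # bayanus base

    $position =~ tr/ //d;                  # remove padding
    $baya_seq =~ s/\s+/x/g;                # replace a blank with 'x'

    # ">" compares numbers; "gt" would compare the values as strings.
    if ($position > $orf_beg) {
        push @kept, $position;
        print "$position $ref_seq $para_seq $baya_seq\n";
    }
}
```

Every variable is declared with my, so a misspelling such as $postion is rejected before any line of the file is read.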
Jul 23 '08 #16

P: 14
Ah, now I feel like an idiot, but thanks for all the suggestions, guys, and especially thanks for your patience, KevinADC.
Jul 23 '08 #17
