
Usage of shift command

P: 14
So I'm fairly new to Perl and am looking for help with the usage of the shift command, which I think will help me solve a problem I've been having. Here is my script as is:
    #!/usr/bin/perl
    #print "Where is the output file located?";
    #chomp($file_q = <STDIN>);
    $file_q = "/home/asession/pracperl/x.txt";

    open(FILE, $file_q) ||die "nope\n";
    while(<FILE>){

    my %data = ();
    my @genes = ();
    while (my $line=<FILE>) {
       $line =~ tr/ //d;                  # remove the spaces
       my ($var1, $var2, $var3) = unpack("A4A6A1",$line);
       push @genes, $var1;
       $data{$var1} .= $var3;             # creates a hash

    foreach my $g (@genes) {
       print "$g ", substr($data{$g},0,10), "\n";
    }}

    close(FILE);

where the format of the data in the x.txt file is as follows:
cere 662376 G
para 662376 C
baya 662376 x
cere 662375 C
para 662375 G
baya 662375 x
cere 662374 G
para 662374 C
baya 662374 x
cere 662373 C
para 662373 A
baya 662373 x
cere 662372 A
para 662372 A
baya 662372 x
cere 662371 T
para 662371 C
baya 662371 x

and it goes on for tens of thousands of lines. Basically I want to align the sequence of the last letter, which is what the script does, but it'll only work for the first 10 lines (if I specify more it'll slowly do more and more, but everything only fits on 1 line). I think the shift command will allow me to empty the substring when it is printing and fill it with another 10 each time, so that it'll print 10 nucleotides at a time down to the end of the sequence, but I cannot find the correct syntax to use this command. Any ideas?
Jul 3 '08 #1
8 Replies


numberwhun
Expert Mod 2.5K+
P: 3,503
There are a couple of things in your code that I wanted to point out. First is this line:

    my ($var1, $var2, $var3) = unpack("A4A6A1",$line);

Instead of taking the time to figure out the template that you needed, you could have simply written that line like so:

    my ($var1, $var2, $var3) = split(/\s+/, $line);

That simply splits each line using whitespace as the separator and puts each piece in its respective variable.

As for the following line:

    $data{$var1} .= $var3;          # creates a hash

You have that comment there and that kind of bothered me. This doesn't create a hash as I believe you expected (but I could be wrong). What it does is perform a concatenation: it takes $data{$var1} and appends $var3 to it. The ".=" is a shortcut for concatenation, meaning:

    $data{$var1} = $data{$var1} . $var3

Does that make sense? Is that what you were wanting? If not, then you need to take the period out from in front of the equals sign.
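
To make the effect of ".=" concrete, here is a minimal sketch that builds one string per name from a few rows of the sample data in the question. The helper name collect_bases is made up for illustration, and it uses split rather than the original unpack template.

```perl
use strict;
use warnings;

# Hypothetical helper: append the last column of each record to the
# string stored under that record's name.
sub collect_bases {
    my @records = @_;
    my %data;
    for my $record (@records) {
        my ($name, $pos, $base) = split ' ', $record;
        $data{$name} .= $base;    # shorthand for $data{$name} = $data{$name} . $base
    }
    return %data;
}

# A few rows copied from the sample data in the question.
my %data = collect_bases(
    'cere 662376 G', 'para 662376 C', 'baya 662376 x',
    'cere 662375 C', 'para 662375 G', 'baya 662375 x',
);
print "$_ $data{$_}\n" for sort keys %data;
```

Each key ends up holding the letters for that name in the order the lines were read, which is why the string keeps growing as more of the file is processed.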

One suggestion I also have, since you are working with so much data, is to put the following at the beginning of your script after the shebang line but before the rest of the script:

[code]
$| = 1;
[/code]

What that will do is force the output buffer to be flushed after every print instead of holding output back. I don't know if it will help, but it's a good idea when dealing with a lot of data or when you want to see what is going on as it's happening.
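
For placement, a small sketch of where the statement goes. Only the single line `$| = 1;` belongs in the script; the [code] tags are forum markup, not Perl. The progress loop is just an illustration and is not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # autoflush the currently selected output handle (STDOUT by default)

print "working";
for my $step (1 .. 3) {
    print ".";    # shows up right away instead of waiting for the buffer to fill
    sleep 1;
}
print " done\n";
```

Without the `$| = 1;` line, the dots may all appear at once at the end when output goes to a terminal.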

Regards,

Jeff
Jul 3 '08 #2

P: 14
Thanks, I've been fiddling with your suggestions. The only problem is that the final bit of code you said to insert after the shebang line just forces the script to give me a lot of errors instead of doing anything. Basically I just want it to not repeat the first 10 characters of sequence over again, and I don't know exactly how to set up the shift function. Any help with that would be great.
Jul 3 '08 #3

numberwhun
Expert Mod 2.5K+
P: 3,503
All shift really does is pull the first item off an array and return it, thus shortening the array by 1 element. If you don't name an array, it works on @_ inside a subroutine, or on @ARGV if you are not inside of a subroutine. You can use it like this:

    my $element = shift;

to use the default array, or

    my $element = shift(@arrayname);

to use it on a specific array. If you need to know how to use any function in Perl, go to the source, Perldoc.
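
A short sketch of both forms, using a made-up list of bases and a hypothetical subroutine first_arg:

```perl
use strict;
use warnings;

my @bases = qw(G C G C A T);    # made-up data for illustration

my $first = shift @bases;       # removes and returns the first element
print "$first (@bases)\n";      # the array is now one element shorter

sub first_arg {
    my $arg = shift;            # inside a sub, a bare shift reads from @_
    return $arg;
}
print first_arg('cere', 'para'), "\n";
```

Note that shift works on arrays, not on strings, so it will not remove characters from the front of a scalar such as $data{$g}.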

As for the variable, I left off the quotes around the 1; put them around it and see if it helps. What errors were thrown, by the way? Can you post them here?

Regards,

Jeff
Jul 3 '08 #4

P: 14
The errors were

Scalar found where operator expected at alpha.pl line 4, near "$|"
(Missing semicolon on previous line?)
Bareword found where operator expected at alpha.pl line 9, near "$file_q = "/home"
(Might be a runaway multi-line // string starting on line 5)
(Missing operator before home?)
String found where operator expected at alpha.pl line 11, near "open(FILE, $file_q) ||die ""
(Might be a runaway multi-line "" string starting on line 9)
(Missing semicolon on previous line?)
Bareword found where operator expected at alpha.pl line 11, near "open(FILE, $file_q) ||die "nope"
Backslash found where operator expected at alpha.pl line 11, near "nope\"
String found where operator expected at alpha.pl line 18, near "my ($var1, $var2, $var3) = unpack(""
(Might be a runaway multi-line "" string starting on line 11)
(Missing semicolon on previous line?)
Bareword found where operator expected at alpha.pl line 18, near "my ($var1, $var2, $var3) = unpack("A4A6A1"
(Do you need to predeclare my?)
String found where operator expected at alpha.pl line 23, near "print ""
(Might be a runaway multi-line "" string starting on line 18)
(Missing semicolon on previous line?)
syntax error at alpha.pl line 4, near "$| "
syntax error at alpha.pl line 9, near "$file_q = "/home"

So if shift isn't going to help, is there any way to clear the array every time I have it print and return the following values, again and again until it's done?
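
One possible way to get that behaviour, as a sketch rather than a fix for the script above: four-argument substr returns the first N characters of a string and deletes them in the same step, which is roughly what shift does for an array. The subroutine name chunks, the width of 10, and the sequence are all made up for illustration, and the width is assumed to be a positive number.

```perl
use strict;
use warnings;

# Hypothetical helper: cut a string into pieces of at most $width characters.
# Assumes $width is a positive integer.
sub chunks {
    my ($seq, $width) = @_;
    my @pieces;
    while (length $seq) {
        # Return the first $width characters and replace them with '',
        # so the next pass starts where this one stopped.
        push @pieces, substr($seq, 0, $width, '');
    }
    return @pieces;
}

my %data = (cere => 'GCGCATGCGCATGCGCATGCGCA');    # made-up sequence
for my $gene (sort keys %data) {
    print "$gene $_\n" for chunks($data{$gene}, 10);
}
```

Because chunks works on its own copy of the string, the value stored in %data is left untouched. The printing would need to happen after the whole file has been read, once each string is complete.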
Jul 3 '08 #5

KevinADC
Expert 2.5K+
P: 4,059
If the fields are fixed width, unpack is way more efficient than split, orders of magnitude more efficient. split() should be used when records have a known delimiter with unknown data between the delimiters. Even using tr/// to get rid of the spaces and then unpack should be more efficient than split().
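
For comparison, a sketch of both approaches on one line of the sample data. The subroutine names are made up, and the template here uses "x" to skip the single space between fields instead of deleting the spaces with tr/// first. It only shows that the two give the same fields; it makes no timing measurement (the core Benchmark module could be used for that).

```perl
use strict;
use warnings;

# Fixed-width approach: 4 characters, skip 1, 6 characters, skip 1, 1 character.
sub fields_by_unpack {
    my ($line) = @_;
    return unpack 'A4 x A6 x A1', $line;
}

# Delimiter approach: split on runs of whitespace.
sub fields_by_split {
    my ($line) = @_;
    return split ' ', $line;
}

my $line = 'cere 662376 G';
print join('|', fields_by_unpack($line)), "\n";
print join('|', fields_by_split($line)), "\n";
```

The unpack version depends on every record having exactly this layout; the split version only depends on the fields being separated by whitespace.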

I wrote the code he posted. It's from his other thread.
Jul 3 '08 #6

numberwhun
Expert Mod 2.5K+
P: 3,503
Ok, that explains it. See, I haven't had the chance to use unpack but now know what to use it for. I guess they should have explained where they got their code from. Just trying to help.

Jeff
Jul 3 '08 #7

KevinADC
Expert 2.5K+
P: 4,059
pack and unpack might be the most underused functions in Perl, probably because they are a bit hard to understand; I know I don't fully understand how to use them. There is a good tutorial on perldoc:

http://perldoc.perl.org/perlpacktut.html

He didn't mention where he got the code from over here either:

devshed

If you compare the code I wrote for him originally with the code he has now posted, you'll see he has altered it so it no longer works.
Jul 4 '08 #8

numberwhun
Expert Mod 2.5K+
P: 3,503
Thanks for that link! I found that and was able to figure out what your unpack was doing and it makes total sense.

Unfortunately, making changes and making code unworkable doesn't make our job any easier.

Regards,

Jeff
Jul 5 '08 #9
