I've just started programming in Perl and have written a few successful scripts, but I have a quick question about how to do two things.
First here is a script that I wrote recently that works for what it is supposed to do, but is not quite what I want.
#!/usr/bin/perl

$file_q = "x.txt";

open(FILE, $file_q) || die "nope\n";

while (<FILE>) {
    @line = split(/\s+/, $_);
    if ($line[0] =~ /cere/) {
        push(@wanted_lines, $line[2]);
    }
}

close(FILE);

print "@wanted_lines\n";
Basically what I need to do is to extract the nth character of each line beginning with 'cere' and push the output of that into an array. I will repeat that for some other strings as well. Then from there I need to be able to only print n characters per line so that I can say print 100 cere characters, then 100 a characters, then 100 b characters in a format similar to this:
cere-xxxxxxxxxxxxxxx xxxxxxxxxxxxxxx xxxxxxxxxxxx
aaaa-yyyyyyyyyyyyyyy yyyyyyyyyyyyyyy yyyyyyyyyyyy
bbbb-zzzzzzzzzzzzzzz zzzzzzzzzzzzzzz zzzzzzzzzzzz
any help is greatly appreciated!
KevinADC 4,059
Recognized Expert Specialist
Hard to say without seeing your data, but here is something you can maybe chew on:
use strict;
use warnings;

my $file_q = "x.txt";
my @wanted_lines = ();

open(my $fh, '<', $file_q) or die "nope: $!\n";
while (<$fh>) {
    if (/^cere/) {   # line begins with cere
        # up to 100 characters, starting at offset 5
        push @wanted_lines, substr($_, 5, 100);
    }
}
close($fh);

print "@wanted_lines\n";

Look up substr() and how to use it.
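For reference, a minimal sketch of the three common ways to call substr(). The sample line is made up for illustration and is not taken from the poster's file; offsets count from 0.

```perl
use strict;
use warnings;

my $line = "cere-ACGTACGTAC";   # made-up sample line, not real data

# substr(EXPR, OFFSET, LENGTH): OFFSET counts from 0
my $nth   = substr($line, 5, 1);   # the single character at offset 5
my $chunk = substr($line, 5, 4);   # four characters starting at offset 5
my $rest  = substr($line, 5);      # everything from offset 5 to the end

print "$nth $chunk $rest\n";       # prints "A ACGT ACGTACGTAC"
```

Asking for more characters than remain (as in substr($_, 5, 100) on a short line) simply returns what is left, so the 100 acts as an upper limit.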
Basically the format of my data is like this, but the real file contains closer to 10,000 lines.
cere 662376 G
para 662376 C
baya 662376 x
cere 662375 C
para 662375 G
baya 662375 x
cere 662374 G
para 662374 C
baya 662374 x
cere 662373 C
para 662373 A
baya 662373 x
cere 662372 A
para 662372 A
baya 662372 x
cere 662371 T
para 662371 C
baya 662371 x
cere 662370 G
para 662370 G
baya 662370 x
cere 662369 C
para 662369 A
baya 662369 C
cere 662368 A
para 662368 A
baya 662368 A
cere 662367 T
para 662367 C
baya 662367 T
cere 662366 C
para 662366 C
baya 662366 C
cere 662365 G
para 662365 C
baya 662365 G
cere 662364 A
para 662364 G
baya 662364 A
cere 662363 C
para 662363 C
baya 662363 C
cere 662362 G
para 662362 G
baya 662362 G
cere 662361 T
para 662361 T
baya 662361 T
cere 662360 C
para 662360 A
baya 662360 C
cere 662359 A
para 662359 T
baya 662359 A
cere 662358 C
para 662358 G
baya 662358 C
I've been using the substring function, but the main thing is that I want to align all the cere against all the para against all the baya, in a format similar to my first post, while printing only a certain number of characters per line, because 1) it's so long, and 2) I have to do this for many different outputs. The problem I've been having with just the substring function is that it'll list all of the cere points, then all of another, whereas I want them aligned so that I can compare.
KevinADC 4,059
Recognized Expert Specialist
Just going by the sample data, I wrote this:
use strict;
use warnings;

my %data  = ();
my @genes = ();

while (my $line = <DATA>) {
    $line =~ tr/ //d;   # remove the spaces
    my ($var1, $var2, $var3) = unpack("A4A6A1", $line);   # unpack is very efficient
    # to maintain order: record each name once, the first time it is seen.
    # Can be omitted if order is not important
    push @genes, $var1 unless exists $data{$var1};
    $data{$var1} .= $var3;   # builds one sequence string per name in a hash
}

foreach my $g (@genes) {
    print "$g ", substr($data{$g}, 0, 10), "\n";
}

__DATA__
cere 662376 G
para 662376 C
baya 662376 x
cere 662375 C
para 662375 G
baya 662375 x
cere 662374 G
para 662374 C
baya 662374 x
cere 662373 C
para 662373 A
baya 662373 x
cere 662372 A
para 662372 A
baya 662372 x
cere 662371 T
para 662371 C
baya 662371 x
cere 662370 G
para 662370 G
baya 662370 x
cere 662369 C
para 662369 A
baya 662369 C
cere 662368 A
para 662368 A
baya 662368 A
cere 662367 T
para 662367 C
baya 662367 T
cere 662366 C
para 662366 C
baya 662366 C
cere 662365 G
para 662365 C
baya 662365 G
cere 662364 A
para 662364 G
baya 662364 A
cere 662363 C
para 662363 C
baya 662363 C
cere 662362 G
para 662362 G
baya 662362 G
cere 662361 T
para 662361 T
baya 662361 T
cere 662360 C
para 662360 A
baya 662360 C
cere 662359 A
para 662359 T
baya 662359 A
cere 662358 C
para 662358 G
baya 662358 C
Thanks! I've done a bit more manipulation to get it to do exactly what I want, your help is greatly appreciated!
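The thread does not show the final script. As a rough sketch only, here is one way the wrapped, aligned layout from the first post could be produced once each name's characters have been collected into a string, as in the %data hash above. The helper name format_blocks, its parameters, and the sample sequences are all hypothetical, chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the aligned, wrapped layout as one string.
#   $seq   - hash ref, name => sequence string
#   $names - array ref, names in the order they should be printed
#   $width - sequence characters per output line
#   $group - characters per space-separated group within a line
sub format_blocks {
    my ($seq, $names, $width, $group) = @_;

    # the longest sequence decides how many blocks are needed
    my $longest = 0;
    for my $n (@$names) {
        my $len = length $seq->{$n};
        $longest = $len if $len > $longest;
    }

    my $out = '';
    for (my $start = 0; $start < $longest; $start += $width) {
        for my $n (@$names) {
            my $s = $seq->{$n};
            # a sequence that ended before this block contributes nothing
            my $chunk = $start < length($s) ? substr($s, $start, $width) : '';
            # split the chunk into groups, e.g. "GCGCAT" -> ("GCG", "CAT")
            my @groups = unpack "(A$group)*", $chunk;
            $out .= "$n-" . join(' ', @groups) . "\n";
        }
        $out .= "\n";   # blank line between blocks
    }
    return $out;
}

# Made-up sequences, only to show the layout
my %seq = (
    cere => 'GCGCATGCAT',
    para => 'CGCAACGAAC',
    baya => 'xxxxxxxCAT',
);

my $layout = format_blocks(\%seq, [qw(cere para baya)], 6, 3);
print $layout;
```

With a width of 100 and a group size of 15 this would give lines shaped like the ones in the first post. The columns line up here only because every name is four characters long; names of different lengths would need padding, for example with sprintf.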