Need some hints on speeding up

I only do occasional Perl programming and most things I write are
short processes. I have something I'm working on that is scanning a
text file with about 15 million lines and trying to extract matches
from another text file, which has about 170 entries. The second text
file is read into an array. The process then scans through the big
file for certain possible patterns, which it finds in about 1 out of
25 lines. When it finds one, it loops through the array trying to
find a match there, and then writes a couple of lines to another
text file.

It also writes to the screen a summary line about every 25th record.

When I run this it takes anywhere from 1.5 to 7.5 hours. To avoid the
7.5-hour case, it seems I have to reboot first and use Ctrl+Alt+Del to
close almost everything else.

But even 1.5 hours is too long, since I need to be able to run this
with different sets of data several times a day.

One thought I had is that writing a summary progress line to the screen
(which helps me judge how far along it is) may be slowing things down.

Running Windows 98 SE and the latest version of ActivePerl.
Jul 19 '05 #1
In article <oe********************************@4ax.com>, Spamtrap
<oc*******@sneakemail.com> wrote:
> [snip: original post quoted in full]

People are going to need a little bit more information to help you.

First of all, comp.lang.perl is a defunct newsgroup. You would do
better to post to comp.lang.perl.misc.

What kind of hardware are you using? Your program may be limited by CPU
speed, disk I/O speed, or memory size. Can you upgrade your hardware if
that proves to be the limiting factor?

How long does it take to read through the 15M-line file? That gives you
a baseline for the minimum amount of time it will take to process the
file. If your full program takes considerably more than that, you may
be using a slow search algorithm or have other problems. Printing will
slow down your program somewhat, but probably not by a significant amount
unless you are printing a great deal. Cut down the amount and see.
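
A minimal sketch of that baseline measurement, assuming the big file is
plain text that is read line by line. The sub name time_read_only, the
$report_every parameter, and the command-line file argument are
placeholders, not anything from the original program. It does nothing
but read and count lines, and it prints progress far less often than
every 25th record:

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Read every line of $path and do nothing else, so the elapsed time
# approximates the I/O floor for any program that scans the file.
# $report_every is a hypothetical knob for how often to print progress.
sub time_read_only {
    my ($path, $report_every) = @_;
    $report_every ||= 100_000;

    open my $in, '<', $path or die "Cannot open $path: $!";
    my $start = time;
    my $count = 0;
    while (my $line = <$in>) {
        $count++;
        print STDERR "read $count lines\n" if $count % $report_every == 0;
    }
    close $in;
    return ($count, time - $start);
}

# Runs only when a file name is given on the command line.
if (@ARGV) {
    my ($lines, $seconds) = time_read_only($ARGV[0]);
    printf "%d lines in %.1f seconds\n", $lines, $seconds;
}
```

If the full program takes many times longer than this read-only pass,
the extra time is going into the matching or the output, not into
reading the file.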

The best thing you can do is post a minimal, complete program (to
comp.lang.perl.misc) that people can inspect. Normally, you would want
a runnable program, but since you can't post the big text file or even
the 170 line pattern file, you may have to rely on code inspection
rather than profiling. Be sure to make your posted program readable.

You might want to profile your program. Check out 'perldoc -q profile'.
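
Short of a full profiler, the core Benchmark module can compare two
candidate search strategies on a small sample. The sketch below is an
assumption about the shape of the problem, not the original poster's
code: it treats the 170 entries as literal strings (the made-up
@entries list) and compares a loop over the array against a single
precompiled alternation. Both subs and the sample data are hypothetical.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical stand-ins for the 170-entry file and one candidate line.
# Fixed-width names ensure no entry is a prefix of another.
my @entries = map { sprintf 'item%03d', $_ } 1 .. 170;
my $line    = 'some text mentioning item170 near the end';

# Strategy 1: test each entry in turn, as a loop over the array would.
sub find_by_loop {
    my ($text) = @_;
    for my $entry (@entries) {
        return $entry if index($text, $entry) >= 0;
    }
    return;
}

# Strategy 2: join all entries into one alternation, compiled once.
my $alternation = join '|', map { quotemeta } @entries;
my $any_entry   = qr/($alternation)/;

sub find_by_regex {
    my ($text) = @_;
    return $1 if $text =~ $any_entry;
    return;
}

# Run each strategy for about one CPU second and print a comparison table.
cmpthese(-1, {
    loop  => sub { find_by_loop($line) },
    regex => sub { find_by_regex($line) },
});
```

Which strategy wins depends on the Perl version and on the real data,
so the numbers are only meaningful when measured on your own machine
with a representative sample of lines.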
Jul 19 '05 #2
