Bytes IT Community

How to reduce the processing time when reading gzipped files

Hi all,

In a directory there are about 10 gzipped files, totalling roughly 15 GB.

From each file I have to retrieve every line that doesn't contain the text "ORA" and write those lines to another big file.

I got it working, but it takes about 5 minutes to complete the process, and I have to process 7 directories at a time, so in total it is taking far too long.

I wrote the code as:

#!/usr/bin/perl
use strict;
use warnings;

# Collect every gzipped file in the directory.
my @filenames = </home/dir/*.gz>;

open(my $out, '>', 'bigfile') or die "Cannot open bigfile: $!";
foreach my $file (@filenames) {
    # Decompress on the fly rather than unpacking to disk.
    open(my $in, '-|', "gzcat $file") or die "Cannot read $file: $!";
    while (my $line = <$in>) {
        # Skip lines beginning with "ORA" and blank lines.
        # (The original pattern /^ORA | ^$/ contained literal
        # spaces, so it never matched the intended lines.)
        next if $line =~ /^ORA|^\s*$/;
        print $out $line;
    }
    close $in;
}
close $out;
This is only for one directory; there are seven directories like this.

If anyone knows a better way to do this, in order to reduce the run time, please help me, as I am new to Perl.

Thanks & regards,
Manogna.
Mar 5 '08 #1
2 Replies


I realize this isn't exactly a Perl answer, but why not simply:

zgrep -vh ^ORA dirname/*.gz > bigfile

or if you don't have zgrep:

gzip -dc dirname/*.gz | grep -v ^ORA > bigfile

As to having multiple directories, it is not clear if you want each to be processed in sequence and appended to the single bigfile, or if you want them each to be processed in parallel and put into their own bigfile.
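If the directories are to be processed in parallel, each into its own big file, one minimal sketch is to background one pipeline per directory and then wait for them all. The directory names below (dir1 ... dir7) are placeholders, not the poster's real paths:

```shell
#!/bin/sh
# Run one filter pipeline per directory in the background, each writing
# its own bigfile, then block until every job has finished.
for d in dir1 dir2 dir3 dir4 dir5 dir6 dir7; do
    (
        gzip -dc "$d"/*.gz | grep -v '^ORA' > "$d/bigfile"
    ) &
done
wait    # all seven pipelines have completed at this point
```

Since the work is dominated by decompression and I/O, running the directories concurrently helps mainly when they sit on independent disks or the machine has spare CPU for gzip.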
Mar 5 '08 #2

Thank you very much for your response.

I want to write the data by parallel execution, each directory's files going to its own big file.

I tried your code and it works properly, but I want to use regular expressions in it.
I tried the following, but it is not working properly:

zegrep -vh "^ORA |^\s*$ |read_time|" dirname/*.gz > bigfile
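The likely problem is the pattern itself: the spaces before each `|` are matched literally, and the trailing `|` creates an empty alternative that matches every line, so with `-v` nothing at all is written. A corrected form might look like this (using `[[:space:]]` instead of `\s`, since `\s` is a GNU extension that not every egrep supports):

```shell
# Exclude lines starting with "ORA", blank/whitespace-only lines,
# and lines containing "read_time".  No stray spaces, no trailing "|".
zegrep -vh '^ORA|^[[:space:]]*$|read_time' dirname/*.gz > bigfile
```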

I need to write all the data from each directory to its respective big file as fast as possible.

I also want to parse the lines and keep only the first three fields before writing to the big file.

Is it possible?
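Assuming the fields are whitespace-separated (the thread doesn't show a sample line, so that is a guess), awk can do both the filtering and the field extraction in one pass:

```shell
# Drop "ORA" lines and blank lines, then print only the first
# three whitespace-separated fields of each surviving line.
gzip -dc dirname/*.gz \
    | awk '!/^ORA/ && NF { print $1, $2, $3 }' > bigfile
```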


Please help me.

Thanks and regards,
Manogna.
Mar 6 '08 #3
