
[perl-python] find & replace strings for all files in a dir

Suppose you want to find and replace a string in all files in a
directory.
Here's the code:

# -*- coding: utf-8 -*-
# Python

import os,sys

mydir= '/Users/t/web'

findStr='<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 FINAL//EN">'
repStr='<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'

def replaceStringInFile(findStr,repStr,filePath):
    "replaces all findStr by repStr in file filePath"
    tempName=filePath+'~~~'
    input = open(filePath)
    output = open(tempName,'w')

    for s in input:
        output.write(s.replace(findStr,repStr))
    output.close()
    input.close()
    os.rename(tempName,filePath)
    print filePath

def myfun(dummy, dirr, filess):
    for child in filess:
        if '.html' == os.path.splitext(child)[1] and os.path.isfile(dirr+'/'+child):
            replaceStringInFile(findStr,repStr,dirr+'/'+child)

os.path.walk(mydir, myfun, 3)
Note that files will be overwritten.
Be sure to back up the folder before you run it.
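
A minimal sketch of that backup step, assuming Python 3 and the standard library's shutil.copytree. The function name backup_folder and the timestamped destination name are illustrative choices, not part of the original post.

```python
import shutil
import time
from pathlib import Path


def backup_folder(src, backup_parent):
    """Copy the whole tree at src into backup_parent and return the new path.

    The copy gets a timestamp suffix (an arbitrary naming choice for this
    sketch) so that an earlier backup is not overwritten.
    """
    src = Path(src)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_parent) / f"{src.name}-{stamp}"
    # copytree creates dest (and missing parents) and fails if dest exists.
    shutil.copytree(src, dest)
    return dest
```

For example, backup_folder('/Users/t/web', '/Users/t/backups') would leave a full copy under /Users/t/backups before any file is touched.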

Try editing the code to suit your needs.
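
The script above targets Python 2 (print statement, os.path.walk). A rough Python 3 equivalent follows, offered as a sketch and not as the author's code. It swaps in os.walk and os.replace, reads each file whole instead of line by line, and only rewrites files that actually contain the string. The names replace_string_in_file and replace_in_tree are illustrative, and UTF-8 is an assumed default encoding.

```python
import os


def replace_string_in_file(find_str, rep_str, file_path, encoding="utf-8"):
    """Replace every find_str with rep_str in file_path; return the count."""
    # newline="" keeps the file's original line endings untouched.
    with open(file_path, encoding=encoding, newline="") as src:
        text = src.read()
    count = text.count(find_str)
    if count:
        temp_name = file_path + "~~~"
        with open(temp_name, "w", encoding=encoding, newline="") as dst:
            dst.write(text.replace(find_str, rep_str))
        # os.replace overwrites the target in one step, also on Windows.
        os.replace(temp_name, file_path)
    return count


def replace_in_tree(top, find_str, rep_str, suffix=".html"):
    """Walk top recursively; rewrite regular files whose extension is suffix.

    Returns the list of paths that were changed.
    """
    changed = []
    for dir_path, _dir_names, file_names in os.walk(top):
        for name in file_names:
            path = os.path.join(dir_path, name)
            if os.path.splitext(name)[1] == suffix and os.path.isfile(path):
                if replace_string_in_file(find_str, rep_str, path):
                    changed.append(path)
    return changed
```

With the variables from the original script, the call would be replace_in_tree(mydir, findStr, repStr). Files are still overwritten in place, so the backup advice applies here too.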

previous tips can be found at:
http://xahlee.org/perl-python/python.html

---------------------------------------
The following is a Perl version I wrote a few years ago.
Note: if regex is turned on, correctness is not guaranteed.
It is very difficult, if not impossible, in Perl to move regex patterns
around and preserve their meaning.
#!/usr/local/bin/perl

=pod

Description:
This script does find and replace on a given folder recursively.

Features:
* Multiple find and replace string pairs can be given.
* The find/replace strings can be set to regex or literal.
* Files can be filtered according to file name suffix matching or other
  criteria.
* Backup copies of original files will be made at a user-specified
  folder that preserves all folder structures of the original folder.
* A report will be generated that indicates which files have been
  changed, how many changes, and the total number of files changed.
* Files will retain their owner/group/permissions settings.

Usage:
1. Edit the parts under the section '#-- arguments --'.
2. Edit the subroutine fileFilterQ to set which files will be checked or
   skipped.

To do:
* In the report, print the strings that are changed, possibly with
  surrounding lines.
* Allow just find without replace.
* Add the GNU syntax for unix command prompt.
* Report if the backup directory exists already, or provide a toggle to
  overwrite, or some other smarties.

Date created: 2000/02
Author: Xah

=cut

#-- modules --

use strict;
use File::Find;
use File::Path;
use File::Copy;
use Data::Dumper;

#-- arguments --

# the folder to be searched.
my $folderPath = q[/Users/t/web/UnixResource_dir];

# this is the backup folder path.
my $backupFolderPath = q[/Users/t/xxxb];

my %findReplaceH = (
q[<pre><a href="freebooks.html">back to Unix Pestilence</a><pre>]=>q[<pre>? Back to <a href="freebooks.html">Unix Pestilence</a></pre>],
);

# $useRegexQ has values 1 or 0. If 1, interprets the pairs in %findReplaceH
# to be regex.
my $useRegexQ = 0;

# in bytes. larger files will be skipped
my $fileSizeLimit = 500 * 1000;
#-- globals --

$folderPath =~ s[/$][]; # e.g. '/home/joe/public_html'
$backupFolderPath =~ s[/$][]; # e.g. '/tmp/joe_back';

$folderPath =~ m[/(\w+)$];
my $previousDir = $`; # e.g. '/home/joe'
my $lastDir = $1; # e.g. 'public_html'
my $backupRoot = $backupFolderPath . '/' . $1; # e.g. '/tmp/joe_back/public_html'

my $refLargeFiles = [];
my $totalFileChangedCount = 0;

#-- subroutines --

# fileFilterQ($fullFilePath) returns true if the file is desired.
sub fileFilterQ ($) {
    my $fileName = $_[0];

    if ((-s $fileName) > $fileSizeLimit) {
        push (@$refLargeFiles, $fileName);
        return 0;
    };
    if ($fileName =~ m{\.html$}) {
        print "processing: $fileName\n";
        return 1;
    };

    ## if (-d $fileName) {return 0;}; # directory
    ## if (not (-T $fileName)) {return 0;}; # not text file

    return 0;
};

# go through each file, accumulate a hash.
sub processFile {
    my $currentFile = $File::Find::name; # full path spec
    my $currentDir = $File::Find::dir;
    my $currentFileName = $_;

    if (not fileFilterQ($currentFile)) {
        return 1;
    }

    # open file. Read in the whole file.
    if (not(open FILE, "<$currentFile")) {die("Error opening file: $!");};
    my $wholeFileString;
    {local $/ = undef; $wholeFileString = <FILE>;};
    if (not(close(FILE))) {die("Error closing file: $!");};

    # do the replacement.
    my $replaceCount = 0;

    foreach my $key1 (keys %findReplaceH) {
        my $pattern = ($useRegexQ ? $key1 : quotemeta($key1));
        $replaceCount = $replaceCount + ($wholeFileString =~ s/$pattern/$findReplaceH{$key1}/g);
    };

    if ($replaceCount > 0) { # replacement has happened
        $totalFileChangedCount++;
        # do backup
        # make a directory in the backup path, make a backup copy.
        # \Q...\E matches the folder path literally, anchored at the start.
        my $pathAdd = $currentDir; $pathAdd =~ s[^\Q$folderPath\E][];
        mkpath("$backupRoot/$pathAdd", 0, 0777);
        copy($currentFile, "$backupRoot/$pathAdd/$currentFileName") or
            die "error: copying file failed on $currentFile\n$!";

        # write to the original
        # get the file mode.
        my ($mode, $uid, $gid) = (stat($currentFile))[2,4,5];

        # write out a new file.
        if (not(open OUTFILE, ">$currentFile")) {die("Error opening file: $!");};
        print OUTFILE $wholeFileString;
        if (not(close(OUTFILE))) {die("Error closing file: $!");};

        # set the file mode (permission bits only) and ownership.
        chmod($mode & 07777, $currentFile);
        chown($uid, $gid, $currentFile);

        # single quotes keep the separator literal; in a double-quoted
        # string $* and $@ would be interpolated as variables.
        print '-----^$*%$@#-------------------------------', "\n";
        print "$replaceCount replacements made at\n";
        print "$currentFile\n";
    }

};
#-- main body --

find(\&processFile, $folderPath);

print "--------------------------------------------\n\n\n";
print "Total of $totalFileChangedCount files changed.\n";

if (scalar @$refLargeFiles > 0) {
    print "The following large files are skipped:\n";
    print Dumper($refLargeFiles);
}
__END__
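
For comparison with the Python script at the top of the post, the core of the Perl version (several find/replace pairs, each treated either as a literal string or as a regex, with a running count of replacements) might look like this in Python 3. This is a sketch and not a port of the script: replace_pairs and its parameters are made-up names, and re.escape stands in for Perl's quotemeta.

```python
import re


def replace_pairs(text, pairs, use_regex=False):
    """Apply each (find, replacement) pair in the dict pairs to text.

    Returns (new_text, total_count). With use_regex=False the find string
    is escaped with re.escape and the replacement is inserted verbatim.
    With use_regex=True both are handed to re.subn unchanged, so group
    references in the replacement are expanded.
    """
    total = 0
    for find, rep in pairs.items():
        if use_regex:
            pattern, repl = find, rep
        else:
            # A function replacement is inserted as-is, with no escape
            # processing, so backslashes in rep stay literal.
            pattern, repl = re.escape(find), (lambda _match, r=rep: r)
        text, count = re.subn(pattern, repl, text)
        total += count
    return text, total


# Literal mode: only the exact text "a.b" matches.
print(replace_pairs("a.b a+b", {"a.b": "X"}))                  # ('X a+b', 1)
# Regex mode: "." matches any character, so "a+b" is replaced too.
print(replace_pairs("a.b a+b", {"a.b": "X"}, use_regex=True))  # ('X X', 2)
```

The two calls show why turning regex on can change the result for the same pair: the find string stops being plain text and starts being a pattern.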
Xah
xa*@xahlee.org
http://xahlee.org/PageTwo_dir/more.html

Jul 18 '05 #1
Xah Lee wrote:
> suppose you want to do find & replace of string of all files in a
> directory.
> here's the code:
> [snip]
> Xah
> xa*@xahlee.org
> http://xahlee.org/PageTwo_dir/more.html


When are you going to take the hint (from everybody in
comp.lang.perl.misc and comp.lang.python) to stop posting! Your posts
do not help anybody and will only hurt a beginner. *PLEASE STOP POSTING*!

--
k g a b e r t (@at@) x m i s s i o n (.dot.) c o m

* After "extensive" research, I noticed
* that yy******@yahoo.com received 12
* spam e-mail messages after just two
* posts on usenet groups. If you want
* to email me, use the "encrypted"
* email address at the beginning of my
* signature.
Jul 18 '05 #2
