Bytes IT Community

What language to manipulate text files

I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge.

What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt; any matching files in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt

2. Process each file as follows:
Here is a simplified example of what I want as input and output.

------------------------------------- input
.......................... 'several unknown lines of text file
Get apples from apples shop
Get oranges from oranges shop
Get plums from plums shop
Get pears from pears shop
Eat from apples, oranges,
plums, pears 'whitespace at start of line is unimportant
.......................... 'more unknown lines of text file
Chapter 1
Several lines of text about apples in here
Chapter 2
Several lines of text about oranges in here
Chapter 3
Several lines of text about plums in here
Chapter 4
Several lines of text about pears in here

------------------------------------- output
.......................... 'several unknown lines of text file
Get apples from apples shop
Get oranges from oranges shop
Get plums from plums shop
Get pears from pears shop
Get bagels from bagels shop 'the Get lines...
Get donuts from donuts shop 'can be in any order
Eat from apples, bagels, oranges,
plums, donuts, pears 'whitespace at start of line is unimportant
.......................... 'more unknown lines of text file
Chapter 1
Several lines of text about apples in here
Chapter 2
Several lines of text about bagels in here
Chapter 3
Several lines of text about oranges in here
Chapter 4
Several lines of text about plums in here
Chapter 5
Several lines of text about donuts in here
Chapter 6
Several lines of text about pears in here

Summary:
I have added two new items to Get;
I have put them into the comma-delimited list after searching for a
particular fruit to put each one after;
The Chapters are renumbered to match their position in the
comma-delimited list.
The "several lines of text" about each new item can be pulled from a
new_foods.txt file (or a bagels.txt and a donuts.txt file).

My first objective is to process the files as described.
My second objective is to learn the best language for this sort of text
manipulation. The language should run on Windows 98, XP and Linux.

Would Python be best, or would a macro-scripting thing like AutoHotKey
work?
I thought about Perl, but think I would learn bad habits and have hard
to read code.

Thanks, Ross

Jul 19 '05 #1
10 Replies


On Saturday 11 June 2005 11:37 pm, ross wrote:
I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge. [...]
Would Python be best, or would a macro-scripting thing like AutoHotKey
work?
I thought about Perl, but think I would learn bad habits and have hard
to read code.


Both Perl and Python are *extremely* good at this kind of work. This is
pretty much what inspired Perl, and Python implements most of the same
toolset. You will solve many of these kinds of problems using "regular
expressions" (built-in first-class object in Perl, created from strings in
Python using the "re" module).

No surprise of course that I would choose Python. Mainly because of what
it provides beyond regular expressions. Many simple cases can be handled
with string methods in Python (check the Sequence types information in the
built-ins section of the Library Reference -- also look at the "string" module,
though it's usually easier to use the string methods approach).
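
As a rough illustration of the two approaches mentioned above (not code from this thread), the sketch below pulls the item names out of the sample "Get ... from ... shop" lines with the "re" module, and splits the comma-delimited "Eat from" list with plain string methods. The function names are made up for the example, and it assumes a current Python 3 interpreter.

```python
import re

# Pattern for lines such as "Get apples from apples shop".
# The backreference \1 requires the same word in both positions.
GET_LINE = re.compile(r"^\s*Get (\w+) from \1 shop\s*$")


def parse_get_items(lines):
    """Return the item names found in 'Get X from X shop' lines."""
    items = []
    for line in lines:
        match = GET_LINE.match(line)
        if match:
            items.append(match.group(1))
    return items


def parse_eat_list(text):
    """Split a (possibly multi-line) 'Eat from a, b, c' list using string methods only."""
    body = text.strip()
    prefix = "Eat from"
    if body.startswith(prefix):
        body = body[len(prefix):]
    # Split on commas, strip surrounding whitespace and newlines, drop empty pieces.
    return [part.strip() for part in body.split(",") if part.strip()]


sample = [
    "Get apples from apples shop",
    "Get oranges from oranges shop",
    "Some unrelated line",
]
print(parse_get_items(sample))
print(parse_eat_list("Eat from apples, oranges,\n   plums, pears"))
```

The regular expression earns its keep where the line has structure to check (the same word twice); the list splitting needs nothing beyond strip() and split().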

You will probably end up with more readable code using Python and
take less time to develop sufficient proficiency to do the job with it.
--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks http://www.anansispaceworks.com

Jul 19 '05 #2

Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R

Jul 19 '05 #3

ross wrote:
I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge.

What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt
This should get you started:

import errno
from path import path # http://www.jorendorff.com/articles/python/path/

dst_dirpath = path("EATING")

# create dst_dirpath
try:
    dst_dirpath.makedirs() # make destination directory and its parents
except OSError as err: # error!
    # fine if it already exists and is a directory; otherwise re-raise
    if err.errno != errno.EEXIST or not dst_dirpath.isdir():
        raise

for filepath in path(".").walkfiles("*FOOD*.txt"):
    infile = open(filepath)
    # open the copy for writing
    outfile = open(dst_dirpath.joinpath(filepath.namebase + "_COPY.txt"), "w")

    # ...do processing here...
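
One way the processing step could look, offered only as a sketch: it works on data that has already been parsed out of the file, and both helper names and the chapter_text dictionary layout are assumptions made for illustration. It inserts each new item after a named existing item in the "Eat from" list, then regenerates the chapter headings so that the numbers follow the list order, as in the expected output from the question.

```python
def insert_after(items, new_item, after_item):
    """Return a copy of items with new_item placed right after after_item."""
    position = items.index(after_item) + 1  # raises ValueError if after_item is missing
    return items[:position] + [new_item] + items[position:]


def build_chapters(items, chapter_text):
    """Return numbered chapter lines in the order given by items.

    chapter_text maps each item name to its list of text lines
    (a hypothetical layout chosen for this sketch).
    """
    lines = []
    for number, item in enumerate(items, start=1):
        lines.append("Chapter %d" % number)
        lines.extend(chapter_text[item])
    return lines


eat_list = ["apples", "oranges", "plums", "pears"]
eat_list = insert_after(eat_list, "bagels", "apples")
eat_list = insert_after(eat_list, "donuts", "plums")

# Stand-in chapter bodies; real ones would come from the input file
# and from new_foods.txt.
chapter_text = dict(
    (item, ["Several lines of text about %s in here" % item]) for item in eat_list
)

for line in build_chapters(eat_list, chapter_text):
    print(line)
```

Keeping the item list as the single source of order means the chapter numbers never have to be edited by hand; they are simply rewritten from the list each time.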
My first objective is to process the files as described.
My second objective is to learn the best language for this sort of text
manipulation. The language should run on Windows 98, XP and Linux.

Would Python be best, or would a macro-scripting thing like AutoHotKey
work?


Personally, I'd use Python, but what do you expect when you ask here?
--
Michael Hoffman
Jul 19 '05 #4

Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R


What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?

Ross

Jul 19 '05 #5

ross wrote:
Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R


What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?


"What programming language is best for x" questions can be asked in
comp.programming and/or comp.lang.misc , and possibly in a
domain-specific newsgroup if it exists, for example
sci.math.num-analysis if x = scientific computing. The resulting
debates contain both heat and light :).

Jul 19 '05 #6

Hi Roose,

Actually, it is a good thing, because it lets those who know Python
show both the benefits and the weaknesses of the language.
Sure, the attitude here will be "Yes, it's a great language." Yet, at
the same time, it lets the poster see potential benefits of Python
that he or she may not have been aware of.

If we don't let others know about the benefits of Python, who will?

Brian
---
Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R

Jul 19 '05 #7

ross <ro*******@gmail.com> writes:
What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt


Bash?

for f in *FOOD*.txt; do cp "$f" "EATING/${f%.txt}COPY.txt"; done

Or "mmv", a Linux utility (the -c option copies instead of moving):

mmv -c '*FOOD*.txt' 'EATING/#1FOOD#2COPY.txt'

For the rest, I personally would choose Python.
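
For readers without a working Bash (for example on plain Windows), the same copy step can be sketched with Python's standard library alone. This is a minimal sketch, not code from the thread: the function name copy_food_files and its parameters are invented for the example, and it assumes Python 3.2 or later for the exist_ok argument. Unlike the one-liners above, it walks every subfolder and gives each folder its own EATING subfolder, as the original question describes.

```python
import fnmatch
import os
import shutil


def copy_food_files(root, pattern="*FOOD*.txt", subdir="EATING", suffix="COPY"):
    """Copy matching files in every folder under root into that folder's own subdir.

    Returns the list of destination paths that were written.
    """
    copied = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into destination folders left by an earlier run.
        if subdir in dirnames:
            dirnames.remove(subdir)
        matches = fnmatch.filter(filenames, pattern)
        if not matches:
            continue
        dest_dir = os.path.join(dirpath, subdir)
        os.makedirs(dest_dir, exist_ok=True)
        for name in matches:
            base, ext = os.path.splitext(name)
            dest = os.path.join(dest_dir, base + suffix + ext)
            shutil.copy(os.path.join(dirpath, name), dest)
            copied.append(dest)
    return copied
```

Calling copy_food_files(".") processes the current folder and everything below it. Note that fnmatch follows the platform's case rules, so the pattern is case-insensitive on Windows and case-sensitive on Linux.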

Dan
Jul 19 '05 #8

Jim


Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.


It will, however, have the side effect of helping people who google for
it tomorrow. I've often found an answer, several months old, that people
on a group had taken the trouble to write out patiently, and which was a
big help to me. In this case I can imagine a person who has heard that
Python is in a class of languages like Perl and Ruby, and who googles
around with some keywords to get some idea of whether it can solve
their problem.

Jim

Jul 19 '05 #9

In article <11*********************@z14g2000cwz.googlegroups.com>,
<be*******@aol.com> wrote:
ross wrote:
Roose wrote:
> Why do people keep asking what language to use for certain things in the
> Python newsgroup? Obviously the answer is going to be biased.
>
> Not that it's a bad thing because I love Python, but it doesn't make sense
> if you honestly want an objective opinion.
>
> R


What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?


"What programming language is best for x" questions can be asked in
comp.programming and/or comp.lang.misc , and possibly in a
domain-specific newsgroup if it exists, for example
sci.math.num-analysis if x = scientific computing. The resulting
debates contain both heat and light :).


comp.lang.python is actually a fine place to ask such questions,
I submit, for reasons the original poster could not have known:
clp includes quite a few deeply-experienced commentators, and the
ethos of clp favors accuracy over invective far more than some
other newsgroups nominally better focused on general questions.
Jul 19 '05 #10

I tried Bash on Cygwin, but did not know enough about setting up the
environment to get it working.
Instead I got an excellent answer from alt.msdos.batch which used the
FOR IN DO command.
My next job is to learn Python.
Ross

Jul 19 '05 #11
