What language to manipulate text files

I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge.

What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt

2. Process each file as follows:
Here is a simplified example of what I want as input and output.

------------------------------------- input
.......................... 'several unknown lines of text file
Get apples from apples shop
Get oranges from oranges shop
Get plums from plums shop
Get pears from pears shop
Eat from apples, oranges,
plums, pears 'whitespace at start of line is unimportant
.......................... 'more unknown lines of text file
Chapter 1
Several lines of text about apples in here
Chapter 2
Several lines of text about oranges in here
Chapter 3
Several lines of text about plums in here
Chapter 4
Several lines of text about pears in here

------------------------------------- output
.......................... 'several unknown lines of text file
Get apples from apples shop
Get oranges from oranges shop
Get plums from plums shop
Get pears from pears shop
Get bagels from bagels shop 'the Get lines...
Get donuts from donuts shop 'can be in any order
Eat from apples, bagels, oranges,
plums, donuts, pears 'whitespace at start of line is unimportant
.......................... 'more unknown lines of text file
Chapter 1
Several lines of text about apples in here
Chapter 2
Several lines of text about bagels in here
Chapter 3
Several lines of text about oranges in here
Chapter 4
Several lines of text about plums in here
Chapter 5
Several lines of text about donuts in here
Chapter 6
Several lines of text about pears in here

Summary:
I have added two new items to Get;
I have put them into the comma-delimited list after searching for a
particular fruit to put each one after;
The Chapters are renumbered to match their position in the
comma-delimited list.
The "several lines of text" about each new item can be pulled from a
new_foods.txt file (or a bagels.txt and a donuts.txt file).

My first objective is to process the files as described.
My second objective is to learn the best language for this sort of text
manipulation. The language should run on Windows 98, XP and Linux.

Would Python be best, or would a macro-scripting thing like AutoHotKey
work?
I thought about Perl, but think I would learn bad habits and have hard
to read code.

Thanks, Ross

Jul 19 '05 #1
On Saturday 11 June 2005 11:37 pm, ross wrote:
I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge. [...]
Would Python be best, or would a macro-scripting thing like AutoHotKey
work?
I thought about Perl, but think I would learn bad habits and have hard
to read code.


Both Perl and Python are *extremely* good at this kind of work. This is
pretty much what inspired Perl, and Python implements most of the same
toolset. You will solve many of these kinds of problems using "regular
expressions" (built-in first-class object in Perl, created from strings in
Python using the "re" module).

No surprise of course that I would choose Python. Mainly because of what
it provides beyond regular expressions. Many simple cases can be handled
with string methods in Python (check the Sequence types information in the
built-ins section of the Library Reference -- also look at the "string" module,
though it's usually easier to use the string methods approach).

You will probably end up with more readable code using Python and
take less time to develop sufficient proficiency to do the job with it.
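
As a rough illustration of the two approaches mentioned above, here is a minimal sketch that pulls the item name out of a line such as "Get apples from apples shop" (the format shown in the original question), once with plain string methods and once with the "re" module. The function names and the exact pattern are assumptions made up for this example, not anything from the original post.

```python
import re

# Hypothetical pattern for lines such as "Get apples from apples shop".
GET_LINE = re.compile(r"^\s*Get\s+(\w+)\s+from\s+(\w+)\s+shop\s*$")


def parse_get_line_str(line):
    """Return the item named on a 'Get' line using only string methods."""
    words = line.split()  # split() with no argument ignores leading whitespace
    if (len(words) == 5 and words[0] == "Get"
            and words[2] == "from" and words[4] == "shop"):
        return words[1]
    return None


def parse_get_line_re(line):
    """Return the item named on a 'Get' line using a regular expression."""
    match = GET_LINE.match(line)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = "  Get apples from apples shop"
    print(parse_get_line_str(sample))
    print(parse_get_line_re(sample))
```

For a format this regular, the string-method version is probably enough; the regular expression starts to pay off when the lines vary more.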
--
Terry Hancock ( hancock at anansispaceworks.com )
Anansi Spaceworks http://www.anansispaceworks.com

Jul 19 '05 #2
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R

Jul 19 '05 #3
ross wrote:
I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge.

What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt
This should get you started:

    import errno
    from path import path  # http://www.jorendorff.com/articles/python/path/

    dst_dirpath = path("EATING")

    # create dst_dirpath
    try:
        dst_dirpath.makedirs()  # make destination directory and its parents
    except OSError as err:  # error!
        # fine if it already exists and is a directory;
        # anything else is a real error, so re-raise it
        if err.errno != errno.EEXIST or not dst_dirpath.isdir():
            raise

    for filepath in path(".").walkfiles("*FOOD*.txt"):
        infile = open(filepath)
        outfile = open(dst_dirpath.joinpath(filepath.namebase + "_COPY.txt"), "w")

        # ...do processing here...
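
For readers on a current Python 3, the same copy step can be sketched with the standard library alone. This is only an illustration under stated assumptions: it uses `pathlib` and `shutil` instead of the third-party `path` module above, the function name `copy_food_files` and its parameters are made up for this example, and it creates an EATING folder next to each matching file, as the original question describes.

```python
import shutil
from pathlib import Path


def copy_food_files(root, pattern="*FOOD*.txt", subdir="EATING", suffix="COPY"):
    """Copy each matching file into an EATING subfolder next to it.

    Returns the list of copies that were created.
    """
    root = Path(root)
    # Collect the matches first so new copies are not picked up mid-walk,
    # and skip anything already sitting in an EATING folder.
    matches = [p for p in root.rglob(pattern)
               if p.is_file() and p.parent.name != subdir]
    copies = []
    for src in matches:
        dst_dir = src.parent / subdir
        dst_dir.mkdir(exist_ok=True)  # no error if it already exists
        dst = dst_dir / (src.stem + suffix + src.suffix)
        shutil.copy2(src, dst)  # copies contents and timestamps
        copies.append(dst)
    return copies


if __name__ == "__main__":
    for copy in copy_food_files("."):
        print(copy)
```

Skipping files whose parent folder is already called EATING keeps a second run from copying the copies.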
My first objective is to process the files as described.
My second objective is to learn the best language for this sort of text
manipulation. The language should run on Windows 98, XP and Linux.

Would Python be best, or would a macro-scripting thing like AutoHotKey
work?


Personally, I'd use Python, but what do you expect when you ask here?
--
Michael Hoffman
Jul 19 '05 #4
Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R


What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?

Ross

Jul 19 '05 #5
ross wrote:
Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R


What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?


"What programming language is best for x" questions can be asked in
comp.programming and/or comp.lang.misc , and possibly in a
domain-specific newsgroup if it exists, for example
sci.math.num-analysis if x = scientific computing. The resulting
debates contain both heat and light :).

Jul 19 '05 #6
Hi Roose,

Actually, it is a good thing because it allows those who know the Python
language to be able to show the benefits and weaknesses of the language.
Sure, the attitude here will be "Yes, it's a great language." Yet, at
the same time, it also enables the poster to be able to see potential
benefits to Python that he or she may not have been aware of.

If we don't let others know about the benefits of Python, who will?

Brian
---
Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R

Jul 19 '05 #7
ross <ro*******@gmail.com> writes:
What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt


Bash?

for f in *FOOD*.txt; do cp "$f" "EATING/${f%.txt}COPY.txt"; done

Or "mmv", a Linux utility (with -c to copy rather than move):

mmv -c '*FOOD*.txt' 'EATING/#1FOOD#2COPY.txt'

For the rest, I would personally choose Python.
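
A minimal sketch of what "the rest" might look like in Python, using the sample data from the original question. It assumes the "Eat from" list has already been extracted as one string and that the new chapter text has already been spliced in under a "Chapter N" heading with any number; the function names are hypothetical and chosen only for this example.

```python
import re

# A heading line such as "Chapter 3", with optional surrounding whitespace.
CHAPTER = re.compile(r"^\s*Chapter\s+\d+\s*$")


def parse_eat_list(text):
    """Split 'Eat from a, b,' plus a continuation line into a list of names."""
    body = text.split("Eat from", 1)[1]
    return [name.strip() for name in body.split(",") if name.strip()]


def insert_after(items, new_item, anchor):
    """Return a copy of items with new_item placed right after anchor."""
    position = items.index(anchor) + 1  # raises ValueError if anchor is absent
    return items[:position] + [new_item] + items[position:]


def renumber_chapters(lines):
    """Renumber 'Chapter N' headings as 1, 2, 3, ... in order of appearance."""
    result = []
    number = 0
    for line in lines:
        if CHAPTER.match(line):
            number += 1
            result.append("Chapter %d" % number)
        else:
            result.append(line)
    return result


if __name__ == "__main__":
    foods = parse_eat_list("Eat from apples, oranges,\n   plums, pears")
    foods = insert_after(foods, "bagels", "apples")
    foods = insert_after(foods, "donuts", "plums")
    print("Eat from " + ", ".join(foods))
```

With the sample input this yields the order apples, bagels, oranges, plums, donuts, pears, which matches the output in the question. Writing the list back across two lines and locating it in the file are left out of the sketch.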

Dan
Jul 19 '05 #8
Jim


Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.


It will, however, have the side-effect of helping people who google for
it tomorrow. I've often found a months-old answer that people on a
group had taken the trouble to write out patiently, and it was a
big help to me. In this case I can imagine a person who has heard that
Python is in a class of languages like Perl and Ruby, and who googles
around with some keywords to get some idea of whether it can solve
their problem.

Jim

Jul 19 '05 #9
In article <11*********************@z14g2000cwz.googlegroups.com>,
<be*******@aol.com> wrote:
ross wrote:
Roose wrote:
> Why do people keep asking what language to use for certain things in the
> Python newsgroup? Obviously the answer is going to be biased.
>
> Not that it's a bad thing because I love Python, but it doesn't make sense
> if you honestly want an objective opinion.
>
> R


What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?


"What programming language is best for x" questions can be asked in
comp.programming and/or comp.lang.misc , and possibly in a
domain-specific newsgroup if it exists, for example
sci.math.num-analysis if x = scientific computing. The resulting
debates contain both heat and light :).


comp.lang.python is actually a fine place to ask such questions,
I submit, for reasons the original poster could not have known:
clp includes quite a few deeply-experienced commentators, and the
ethos of clp favors accuracy over invective far more than some
other newsgroups nominally better focused on general questions.
Jul 19 '05 #10
I tried Bash on Cygwin, but did not know enough about setting up the
environment to get it working.
Instead I got an excellent answer from alt.msdos.batch which used the
FOR IN DO command.
My next job is to learn Python.
Ross

Jul 19 '05 #11
