473,756 Members | 1,818 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

What language to manipulate text files

I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge.

What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt

2. Process each file as follows:
Here is a simplified example of what I want as input and output.

------------------------------------- input
............... ........... 'several unknown lines of text file
Get apples from apples shop
Get oranges from oranges shop
Get plums from plums shop
Get pears from pears shop
Eat from apples, oranges,
plums, pears 'whitespace at start of line is unimportant
............... ........... 'more unknown lines of text file
Chapter 1
Several lines of text about apples in here
Chapter 2
Several lines of text about oranges in here
Chapter 3
Several lines of text about plums in here
Chapter 4
Several lines of text about pears in here

------------------------------------- output
............... ........... 'several unknown lines of text file
Get apples from apples shop
Get oranges from oranges shop
Get plums from plums shop
Get pears from pears shop
Get bagels from bagels shop 'the Get lines...
Get donuts from donuts shop 'can be in any order
Eat from apples, bagels, oranges,
plums, donuts, pears 'whitespace at start of line is unimportant
............... ........... 'more unknown lines of text file
Chapter 1
Several lines of text about apples in here
Chapter 2
Several lines of text about bagels in here
Chapter 3
Several lines of text about oranges in here
Chapter 4
Several lines of text about plums in here
Chapter 5
Several lines of text about donuts in here
Chapter 6
Several lines of text about pears in here

Summary:
I have added two new items to Get;
I have put them into the comma-delimited list after searching for a
particular fruit to put each one after;
The Chapters are renumbered to match their position in the
comma-delimited list.
The "several lines of text" about each new item can be pulled from a
new_foods.txt file (or a bagels.txt and a donuts.txt file).

My first objective is to process the files as described.
My second objective is to learn the best language for this sort of text
manipulation. The language should run on Windows 98, XP and Linux.

Would Python be best, or would a macro-scripting thing like AutoHotKey
work?
I thought about Perl, but think I would learn bad habits and have hard
to read code.

Thanks, Ross

Jul 19 '05 #1
10 3724
On Saturday 11 June 2005 11:37 pm, ross wrote:
I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge. [...]
Would Python be best, or would a macro-scripting thing like AutoHotKey
work?
I thought about Perl, but think I would learn bad habits and have hard
to read code.


Both Perl and Python are *extremely* good at this kind of work. This is
pretty much what inspired Perl, and Python implements most of the same
toolset. You will solve many of these kinds of problems using "regular
expressions" (built-in first-class object in Perl, created from strings in
Python using the "re" module).

No surprise of course that I would choose Python. Mainly because of what
it provides beyond regular expressions. Many simple cases can be handled
with string methods in Python (check the Sequence types information in the
built-ins section of the Library Reference -- also look at the "string" module,
though it's usually easier to use the string methods approach).

You will probably end up with more readable code using Python and
take less time to develop sufficient proficiency to do the job with it.
--
Terry Hancock ( hancock at anansispacework s.com )
Anansi Spaceworks http://www.anansispaceworks.com

Jul 19 '05 #2
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R

ross wrote:
I want to do some tricky text file manipulation on many files, but
have only a little programming knowledge.

What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt

2. Process each file as follows:
Here is a simplified example of what I want as input and output.

------------------------------------- input
............... .......... 'several unknown lines of text file
Get apples from apples shop
Get oranges from oranges shop
Get plums from plums shop
Get pears from pears shop
Eat from apples, oranges,
plums, pears 'whitespace at start of line is unimportant
............... .......... 'more unknown lines of text file
Chapter 1
Several lines of text about apples in here
Chapter 2
Several lines of text about oranges in here
Chapter 3
Several lines of text about plums in here
Chapter 4
Several lines of text about pears in here

------------------------------------- output
............... .......... 'several unknown lines of text file
Get apples from apples shop
Get oranges from oranges shop
Get plums from plums shop
Get pears from pears shop
Get bagels from bagels shop 'the Get lines...
Get donuts from donuts shop 'can be in any order
Eat from apples, bagels, oranges,
plums, donuts, pears 'whitespace at start of line is unimportant
............... .......... 'more unknown lines of text file
Chapter 1
Several lines of text about apples in here
Chapter 2
Several lines of text about bagels in here
Chapter 3
Several lines of text about oranges in here
Chapter 4
Several lines of text about plums in here
Chapter 5
Several lines of text about donuts in here
Chapter 6
Several lines of text about pears in here

Summary:
I have added two new items to Get;
I have put them into the comma-delimited list after searching for a
particular fruit to put each one after;
The Chapters are renumbered to match their position in the
comma-delimited list.
The "several lines of text" about each new item can be pulled from a
new_foods.txt file (or a bagels.txt and a donuts.txt file).

My first objective is to process the files as described.
My second objective is to learn the best language for this sort of
text manipulation. The language should run on Windows 98, XP and
Linux.

Would Python be best, or would a macro-scripting thing like AutoHotKey
work?
I thought about Perl, but think I would learn bad habits and have hard
to read code.

Thanks, Ross

Jul 19 '05 #3
ross wrote:
I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge.

What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt
This should get you started:

import errno
from path import path # http://www.jorendorff.com/articles/python/path/

dst_dirpath = path("EATING")

# create dst_dirpath
try:
dst_dirpath.mak edirs() # make destination directory and its parents
except OSError, err: # error!
if err.errno = errno.EEXIST: # might just be that it already exists
if not dst_dirpath.isd ir(): # and it's a directory
raise # if not, raise an exception

for filepath in path(".").walkf iles("*FOOD*.tx t"):
infile = file(filepath)
outfile = file(dst_dirpat h.joinpath(file path.namebase+" _COPY.txt"))

...do processing here...
My first objective is to process the files as described.
My second objective is to learn the best language for this sort of text
manipulation. The language should run on Windows 98, XP and Linux.

Would Python be best, or would a macro-scripting thing like AutoHotKey
work?


Personally, I'd use Python, but what do you expect when you ask here?
--
Michael Hoffman
Jul 19 '05 #4
Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R


What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?

Ross

Jul 19 '05 #5
ross wrote:
Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R


What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?


"What programming language is best for x" questions can be asked in
comp.programmin g and/or comp.lang.misc , and possibly in a
domain-specific newsgroup if it exists, for example
sci.math.num-analysis if x = scientific computing. The resulting
debates contain both heat and light :).

Jul 19 '05 #6
Hi Roose,

Actually, it is a good thing because it allows those who know the Python
language to be able to show the benefits and weaknesses of the language.
Sure, the attitude here will be "Yes, it's a great language." Yet, at
the same time, it also enables the poster to be able to see potential
benefits to Python that he or she may not of been aware of.

If we don't let others know about the benefits of Python, who will?

Brian
---
Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R

Jul 19 '05 #7
ross <ro*******@gmai l.com> writes:
What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt


Bash?

for f in *FOOD*.txt; do cp ${f} EATING/${f}COPY.txt; done

Or "mmv", a linux utility:

mmv '*FOOD*.txt' 'EATING/#1FOOD#2COPY.tx t'

For the rest, I personally for choose python.

Dan
Jul 19 '05 #8
Jim


Roose wrote:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.


It will, however, have the side-effect of helping people who google for
it tomorrow. I've often found a several months old answer that people
on a group had taken the trouble of patiently answering, which was a
big help to me. In this case I can imagine a person who has heard that
Python is in a class of languages like Perl and Ruby, and who googles
around with some keywords to get some idea of whether it can solve
their problem.

Jim

Jul 19 '05 #9
In article <11************ *********@z14g2 000cwz.googlegr oups.com>,
<be*******@aol. com> wrote:
ross wrote:
Roose wrote:
> Why do people keep asking what language to use for certain things in the
> Python newsgroup? Obviously the answer is going to biased.
>
> Not that it's a bad thing because I love Python, but it doesn't make sense
> if you honestly want an objective opinion.
>
> R


What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?


"What programming language is best for x" questions can be asked in
comp.programmi ng and/or comp.lang.misc , and possibly in a
domain-specific newsgroup if it exists, for example
sci.math.num-analysis if x = scientific computing. The resulting
debates contain both heat and light :).


comp.lang.pytho n is actually a fine place to ask such questions,
I submit, for reasons the original poster could not have known:
clp includes quite a few deeply-experienced commentators, and the
ethos of clp favors accuracy over invective far more than some
other newsgroups nominally better focused on general questions.
Jul 19 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
4522
by: Anonymous | last post by:
I need to create an application that will do fairly simple text manipulation on 20,000 files in text format (html files). The files exist both on my Windows machine and on a FreeBSD server. I prefer to do the manipulation on my machine where it's easier to create backup copies, recover from programming errors and so on, then upload the files to the server. All I'm doing is extracting certain elements from each file and creating a different...
220
19128
by: Brandon J. Van Every | last post by:
What's better about Ruby than Python? I'm sure there's something. What is it? This is not a troll. I'm language shopping and I want people's answers. I don't know beans about Ruby or have any preconceived ideas about it. I have noticed, however, that every programmer I talk to who's aware of Python is also talking about Ruby. So it seems that Ruby has the potential to compete with and displace Python. I'm curious on what basis it...
24
2411
by: Xah Lee | last post by:
in computer languages, often a function definition looks like this: subroutine f (x1, x2, ...) { variables ... do this or that } in advanced languages such as LISP family, it is not uncommon to define functions inside a function. For example:
56
3759
by: Xah Lee | last post by:
What are OOP's Jargons and Complexities Xah Lee, 20050128 The Rise of Classes, Methods, Objects In computer languages, often a function definition looks like this: subroutine f (x1, x2, ...) { variables ... do this or that }
6
1910
by: GoCMS | last post by:
Hi, guys: I am trying debug other people(who has left company)'s ASP code, and had difficulty understanding the use of a hidden asp page. The application has an index page, like MyIndex.asp which has nothing but a couple of other asp files, like <frameset COLS="0,0,52%, *"> <frame NAME="hidden" SRC="MyHidden.asp"> <frame NAME="content" SRC="MyContent.asp"> </frameset>
121
10141
by: typingcat | last post by:
First of all, I'm an Asian and I need to input Japanese, Korean and so on. I've tried many PHP IDEs today, but almost non of them supported Unicode (UTF-8) file. I've found that the only Unicode support IDEs are DreamWeaver 8 and Zend PHP Studio. DreamWeaver provides full support for Unicode. However, DreamWeaver is a web editor rather than a PHP IDE. It only supports basic IntelliSense (or code completion) and doesn't have anything...
23
3647
by: Xah Lee | last post by:
The Concepts and Confusions of Pre-fix, In-fix, Post-fix and Fully Functional Notations Xah Lee, 2006-03-15 Let me summarize: The LISP notation, is a functional notation, and is not a so-called pre-fix notation or algebraic notation. Algebraic notations have the concept of operators, meaning, symbols placed around arguments. In algebraic in-fix notation, different
669
26159
by: Xah Lee | last post by:
in March, i posted a essay “What is Expressiveness in a Computer Language”, archived at: http://xahlee.org/perl-python/what_is_expresiveness.html I was informed then that there is a academic paper written on this subject. On the Expressive Power of Programming Languages, by Matthias Felleisen, 1990. http://www.ccs.neu.edu/home/cobbe/pl-seminar-jr/notes/2003-sep-26/expressive-slides.pdf
5
2006
by: gerryR | last post by:
Not sure where to post this as I don't know what language it applies to (yet) Basically I work in IT and often have to manipulate folder structures or large amounts of text files and am looking for the best way of automating this procedure specifically through a programming language. I did Pascal, and C++ in college (several years ago now) so I'm sure the basics of coding will come back to me once I start. Question
0
9456
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, well explore What is ONU, What Is Router, ONU & Routers main usage, and What is the difference between ONU and Router. Lets take a closer look ! Part I. Meaning of...
0
10034
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9872
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
9843
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8713
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development projectplanning, coding, testing, and deploymentwithout human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7248
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupr who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6534
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5142
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
3805
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.