Formatting text files

1 New Member
I'm starting a new project where I have to take data from one text file and put it into another in a specified format (I have a 27-page document outlining that format). I am new to Python and was wondering if anyone had some tips on useful ways to parse text files using Python?
Jan 5 '07 #1
bvdet
2,851 Recognized Expert Moderator Specialist
I'm no expert on parsing data, but regular expressions (module re) are a powerful tool.
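As a rough illustration of the re suggestion, here is a minimal sketch that pulls fields out of a line with a compiled pattern. The line layout (name=... x=... y=...) and the names POINT_RE and parse_point are invented for this example; they are not taken from the original poster's format document.

```python
import re

# Hypothetical line layout, e.g. "name=P1 x=10.5 y=-3.25"
POINT_RE = re.compile(
    r"name=(?P<name>\w+)\s+"
    r"x=(?P<x>-?\d+(?:\.\d+)?)\s+"
    r"y=(?P<y>-?\d+(?:\.\d+)?)"
)


def parse_point(line):
    """Return a dict of fields, or None if the line does not match."""
    match = POINT_RE.search(line)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "x": float(match.group("x")),
        "y": float(match.group("y")),
    }


print(parse_point("name=P1 x=10.5 y=-3.25"))
print(parse_point("not a point line"))
```

Named groups keep the extraction readable when a format has many fields, and returning None for non-matching lines lets the caller decide whether to skip or report them.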
Jan 5 '07 #2
bartonc
6,596 Recognized Expert Expert
Python has many powerful and easy to use tools for dealing with text. So many, in fact, that picking the tools is sometimes the hard part. After getting a grasp on the language syntax and structures, parsing text is fairly easy to implement.
Jan 6 '07 #3
badech
16 New Member
Hello,
I hope this is useful:
http://ibiblio.org/obp/thinkCS/pytho...ml/chap11.html

You can also see:
http://users.info.unicaen.fr/~fhoube...honcours9B.pdf

http://www.cifen.ulg.ac.be/inforef/s...otes_hyper.pdf
Chapter 9 (page 112), but it's in French.
Jan 6 '07 #4
dshimer
136 Recognized Expert New Member
In my mind it also depends a great deal on the structure of the text. I do this task constantly and because of the setting I am in I deal primarily with two types of data files.

One is text or numeric data that is fairly structured: each line may contain a variety of numbers or strings delimited by a character or white space. For example, to describe a point in space, a file may have hundreds of lines with
Name,X,Y,Z,R1,R2,R3,Desc
where Name is the name of a point, X, Y, and Z are geographic coordinates, the R's are some kind of real numbers, and Desc is a text description. When working with these kinds of files I find it easiest to take the data from the input file and create a list of lists, then just loop over the lists, outputting the data in the new format. I usually grab the whole file using readlines(), then run it through a function I call lines2lists, which works on data separated by whitespace but could easily be modified to add a delimiter variable.

    def lines2lists(AListOfDataLines):
        '''
        Function readlines returns an entire file with each line as a string in a
        list of data.  This function will convert each string into a list of words,
        then return a list of lists.
        Example:            lines2lists(['first line','the second line','or sentences'])
        Would return:       [['first','line'],['the','second','line'],['or','sentences']]
        '''
        DataList=[]
        for Line in AListOfDataLines:
            # split() with no argument splits on any run of whitespace
            DataList.append(Line.split())
        return DataList

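The delimiter variation mentioned above might look something like this. It is only a sketch: the delimiter parameter, the blank-line skipping, and the sample data are my own assumptions rather than part of the original function.

```python
def lines2lists(lines, delimiter=None):
    """Split each line into a list of fields.

    delimiter=None keeps the original behaviour (split on any whitespace);
    passing a string such as "," splits on that string instead.
    """
    data = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines (an assumption; remove to keep them)
        data.append([field.strip() for field in line.split(delimiter)])
    return data


# Made-up sample lines in the Name,X,Y,Z,R1,R2,R3,Desc layout
sample = ["P1,100.0,200.0,50.0,0.1,0.2,0.3,corner post\n",
          "P2,110.0,205.0,51.5,0.0,0.0,0.0,fence line\n"]

print(lines2lists(sample, ","))
print(lines2lists(["first line", "the second line"]))
```

If fields can contain the delimiter inside quotes, the standard library's csv module is likely a better fit than a plain split.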
I usually combine it with readlines(), for example
    Data=utilitymodule.lines2lists(AnOpenFile.readlines())
Where Data is the resulting list in which the example above would look something like the following.
[[Name,X,Y,Z,R1,R2,R3,Desc],[Name,X,Y,Z,R1,R2,R3,Desc],[Name,X,Y,Z,R1,R2,R3,Desc]]

Then for each element in Data I format and output the elements of each coordinate (in this case).
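The format-and-output step could be sketched as follows. The target layout here (fixed-width columns, three decimals for coordinates, two for the R values) is purely hypothetical, as are the function name format_point and the sample records; the real layout would come from the format specification.

```python
def format_point(fields):
    """Format one [Name, X, Y, Z, R1, R2, R3, Desc...] record as one line.

    Column widths and precision are invented for illustration.
    """
    # Whitespace splitting breaks a multi-word description into several
    # fields, so everything after the seventh field is rejoined.
    name, x, y, z, r1, r2, r3, *desc_words = fields
    coords = "".join("{:12.3f}".format(float(v)) for v in (x, y, z))
    reals = "".join("{:8.2f}".format(float(v)) for v in (r1, r2, r3))
    return "{:<10}{}{} {}".format(name, coords, reals, " ".join(desc_words))


# Made-up records, shaped like the list of lists described above
data = [["P1", "100.0", "200.0", "50.0", "0.1", "0.2", "0.3", "corner"],
        ["P2", "110.0", "205.0", "51.5", "0.0", "0.0", "0.0", "fence", "line"]]

for record in data:
    print(format_point(record))
```

To write to a file instead of the screen, the same loop can call write() on an open output file and append a newline to each formatted line.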

The other type of file I work with a lot is one in which there may be multiple elements of a particular dataset but they are on different lines of the file. Using the example above it might look like.
Name1 aString
Name2 aString
X1 aNumber
X2 aNumber
Y1 aNumber
Y2 aNumber
and so on....

In this case I still read in the lines and build lists from them, then use something like the count() method to test for a value in a particular list. If the test value exists, I grab the other member of the list, which is the actual data, and append it to the list of data. In pseudocode it would be something like:

If InputString.count(test value like Name1) is true, then datastring.append(the value associated with Name1). Then, when all the input strings have been parsed, for each element of datastring, format and output the values.
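A runnable version of that idea might look like the sketch below. It is not the exact method described: it compares the first word of each line against a set of wanted keys instead of using count(), and it stores results in a dict rather than a list. The function name collect_values and the sample lines are made up for the example.

```python
def collect_values(lines, wanted_keys):
    """Gather the value that follows each wanted key.

    Each line is assumed to look like "Key value", with the key first.
    Returns a dict mapping key -> value for the keys that were found.
    """
    found = {}
    for line in lines:
        parts = line.split(None, 1)  # split off the first word only
        if len(parts) == 2 and parts[0] in wanted_keys:
            found[parts[0]] = parts[1].strip()
    return found


# Made-up input in the "one element per line" layout
lines = ["Name1 north corner\n", "Name2 south corner\n",
         "X1 100.0\n", "X2 110.0\n", "Y1 200.0\n", "Y2 205.0\n"]

values = collect_values(lines, {"Name1", "X1", "Y1"})
print("{},{},{}".format(values["Name1"], values["X1"], values["Y1"]))
```

Matching the whole first word avoids one pitfall of a substring test: a count() check for "X1" would also succeed on a line that starts with "X10".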

There are probably easier ways to do this, but most of the time I may only need to write a program every couple of weeks, may only have 5 minutes notice and need to have it done very quickly. Since there is so much power in lists I tend to stick with the functions I know and love and can hack together quickly.
Jan 9 '07 #5
bvdet
2,851 Recognized Expert Moderator Specialist
dshimer,
Whatever works for you in an efficient manner IS the easiest way. Here are two things that I learned on this forum:
Expand|Select|Wrap|Line Numbers
  1. f = open(dlg1.import_file, "r")
  2. # Files can be used as iterators. Internally, for calls file next() method.
  3. for item in f:
  4.     ....do stuff....
Expand|Select|Wrap|Line Numbers
  1. # If a sub-string is in a string, evaluate True
  2. if "subject_text" in item.lower():
  3.     pt = re.split('[:;,]', item)
Maybe you can use these sometime.
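Put together, the two tips might be used like this. It is a small sketch only: io.StringIO stands in for a real open file so the example runs on its own, and the sample text is invented.

```python
import io
import re

# io.StringIO stands in for a real file so the example runs without one
f = io.StringIO("Subject_Text: alpha; beta, gamma\n"
                "other line: ignored\n")

matches = []
for item in f:  # iterating a file yields one line at a time
    if "subject_text" in item.lower():
        # split on any of : ; , and trim whitespace from each piece
        pt = [piece.strip() for piece in re.split("[:;,]", item)]
        matches.append(pt)

print(matches)
```

Iterating over the file object reads one line at a time, so the whole file never has to sit in memory the way it does with readlines().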
Jan 9 '07 #6
