basic python questions

I have a simple assignment for school but am unsure how to proceed. The
assignment is to read in a text file, split out the words, and list in
alphabetical order which lines each word appears on. I have the basic
outline of the program done, which is:

def Xref(filename):
    try:
        fp = open(filename, "r")
        lines = fp.readlines()
        fp.close()
    except:
        raise "Couldn't read input file \"%s\"" % filename
    dict = {}
    for line_num in xrange(len(lines)):
        if lines[line_num] == "": continue
        words = lines[line_num].split()
        for word in words:
            if not dict.has_key(word):
                dict[word] = []
            if line_num+1 not in dict[word]:
                dict[word].append(line_num+1)
    return dict

My questions are: how do I easily strip out punctuation marks, and how do
I sort the list? If there is anything else that I am doing wrong in this
code, pointing it out would be a big help.

Nov 18 '06 #1
Hi,
on first reading, you have a naked except clause that catches all
exceptions. You might want to try your program on a non-existent file
to find out the actual exception you need to trap for that error
message. Do you want the program to continue if you have no input file?

If you have not covered regular expressions (often called REs), then one
way of getting rid of punctuation is to turn the problem on its head.
Create a string of all the characters that you consider valid in
words, then go through each input line discarding any character not *in*
the string. Use the doctored line for word extraction.
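
A minimal sketch of that keep-only-valid-characters idea, written for current Python 3 rather than the 2.x used in this thread. The name `keep_valid` and the exact contents of `VALID_CHARS` are assumptions for illustration; put in whatever characters you consider part of a word. Whitespace has to stay in the valid set, otherwise the words run together.

```python
import string

# Assumption: letters, digits and the apostrophe count as word characters.
# Whitespace is kept so that split() can still separate the words.
VALID_CHARS = string.ascii_letters + string.digits + "'" + string.whitespace


def keep_valid(line, valid=VALID_CHARS):
    """Return line with every character that is not in valid discarded."""
    return "".join(ch for ch in line if ch in valid)


doctored = keep_valid("Hello, world! It's 9am.\n")
words = doctored.split()
print(words)
```

The doctored line goes through `split()` exactly as before, so the rest of the function would not need to change.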

help(sorted) will start you off on sorting in Python. Other
documentation sources have a lot more.

P.S. I have not run the code myself.
P.P.S. Where is the function's docstring?
P.P.P.S. You might want to read up on enumerate. It gives another way
to do things when you want an index as well as each item from an
iterable, but remember, the index given starts from zero.

Oh, and welcome to comp.lang.python :-)

- Paddy.

Nov 18 '06 #2
In <11*********************@j44g2000cwa.googlegroups.com>,
na*******@gmail.com wrote:
> def Xref(filename):
>     try:
>         fp = open(filename, "r")
>         lines = fp.readlines()
>         fp.close()
>     except:
>         raise "Couldn't read input file \"%s\"" % filename
>     dict = {}
>     for line_num in xrange(len(lines)):

Instead of reading the file completely into a list you can iterate over
the (open) file object and the `enumerate()` function can be used to get
an index number for each line.

>         if lines[line_num] == "": continue

Take a look at the lines you've read and you'll see why the ``continue``
is never executed.

>         words = lines[line_num].split()
>         for word in words:
>             if not dict.has_key(word):
>                 dict[word] = []
>             if line_num+1 not in dict[word]:
>                 dict[word].append(line_num+1)

Instead of dealing with words that appear more than once in a line you may
use a `set()` to remove duplicates before entering the loop.
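
The three suggestions above could be combined roughly like this. It is only a sketch in current Python 3, not the poster's code: the lower-case name `xref`, the choice to pass in an open file object, and the `io.StringIO` stand-in for a real file are all assumptions made so the example runs on its own.

```python
import io


def xref(fp):
    """Map each word to the list of line numbers (ascending) it appears on."""
    index = {}
    # Iterating over the open file yields one line at a time;
    # enumerate(..., start=1) numbers the lines from 1 instead of 0.
    for line_num, line in enumerate(fp, start=1):
        # A blank line arrives as "\n", not "", so test it after stripping.
        if not line.strip():
            continue
        # set() drops words repeated within the same line.
        for word in set(line.split()):
            index.setdefault(word, []).append(line_num)
    return index


# io.StringIO stands in for an open text file so the sketch is self-contained.
sample = io.StringIO("the cat\n\nthe dog the end\n")
result = xref(sample)
print(result)
```

Because each line is turned into a set first, the "is this line number already in the list" check from the original loop is no longer needed.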

Ciao,
Marc 'BlackJack' Rintsch
Nov 18 '06 #3
na*******@gmail.com wrote:
> I have a simple assignment for school but am unsure where to go. The
> assignment is to read in a text file, split out the words and say which
> line each word appears in alphabetical order. I have the basic outline
> of the program done which is:

looks like an excellent start to me.

> def Xref(filename):
>     try:
>         fp = open(filename, "r")
>         lines = fp.readlines()
>         fp.close()
>     except:
>         raise "Couldn't read input file \"%s\"" % filename
>     dict = {}
>     for line_num in xrange(len(lines)):
>         if lines[line_num] == "": continue
>         words = lines[line_num].split()
>         for word in words:
>             if not dict.has_key(word):
>                 dict[word] = []
>             if line_num+1 not in dict[word]:
>                 dict[word].append(line_num+1)
>     return dict
>
> My question is, how do I easily parse out punctuation marks

it depends a bit how you define the term "word".

if you're using regular text, with a limited set of punctuation
characters, you can simply do e.g.

    word = word.strip(".,!?:;")
    if not word:
        continue

inside the "for word" loop. this won't handle such characters if they
appear inside words, but that's probably good enough for your task.

another, slightly more advanced approach is to use regular expressions,
such as re.findall("\w+") to get a list of all alphanumeric "words" in
the text. that'll have other drawbacks (e.g. it'll split up words like
"couldn't" and "cross-reference", unless you tweak the regexp), and is
probably overkill.
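
To see the difference between the two approaches, here is a small comparison in current Python 3. The sample sentence and the tweaked pattern `[\w'-]+` are assumptions for illustration, one possible way of "tweaking the regexp", not something taken from the thread.

```python
import re

line = "Well, it couldn't be a cross-reference, could it?"

# Approach 1: split on whitespace, then strip punctuation from both ends.
stripped = [w.strip(".,!?:;") for w in line.split()]
stripped = [w for w in stripped if w]  # skip tokens that were only punctuation

# Approach 2: \w+ matches runs of letters, digits and underscores only,
# so apostrophes and hyphens split a word apart.
plain = re.findall(r"\w+", line)

# Tweaked pattern (an assumption): also allow ' and - inside a word.
tweaked = re.findall(r"[\w'-]+", line)

print(stripped)
print(plain)
print(tweaked)
```

With this input the strip approach and the tweaked pattern give the same word list, while the plain `\w+` pattern breaks "couldn't" into "couldn" and "t".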

> and how do I sort the list

how to sort the dictionary when printing the cross-reference, you mean?
just use "sorted" on the dictionary; that'll get you a sorted list
of the keys.

sorted(dict)

to avoid duplicates and simplify sorting, you probably want to normalize
the case of the words you add to the dictionary, e.g. by converting all
words to lowercase.
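
A short sketch of both points in current Python 3, using a made-up two-line input (the `lines` list and the name `index` are assumptions for illustration). It shows why case matters for sorting, then lower-cases each word as it is added.

```python
# Upper-case letters sort before lower-case ones, which is rarely what
# you want in an alphabetical cross-reference.
print(sorted(["the", "Cat", "apple"]))  # ['Cat', 'apple', 'the']

lines = ["The cat", "the Dog"]  # hypothetical input lines
index = {}
for line_num, line in enumerate(lines, start=1):
    for word in line.split():
        # Normalize case on the way in so "The" and "the" share one entry.
        index.setdefault(word.lower(), []).append(line_num)

# sorted() on a dictionary returns a sorted list of its keys.
for word in sorted(index):
    print(word, ":", index[word])
```
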
> if there anything else that I am doing wrong in this code

there's plenty of things that can be tweaked and tuned and written in a
slightly shorter way by an experienced Python programmer, but assuming
that this is a general programming assignment, I don't see anything
seriously "wrong" in your code (just make sure you test it on a file
that doesn't exist before you hand it in)

</F>

Nov 18 '06 #4
<na*******@gmail.com> wrote in message
news:11*********************@j44g2000cwa.googlegroups.com...
> I have a simple assignment for school but am unsure where to go. The
> assignment is to read in a text file, split out the words and say which
> line each word appears in alphabetical order. I have the basic outline
> of the program done which is:

And in general, this is one of the best "can anyone help me with my
homework?" posts I've ever seen.
A. You told us up front that it was your homework.
B. You made an honest stab at the solution before posting, and posted the
actual code.
C. You ended with some specific questions on things that didn't work or that
you wanted to improve.

Your current program looks like at least A- material. Add use of sorted and
enumerate, and handle that exception a little better, and you're getting
into A+ territory.

Out of curiosity, what school are you attending that is teaching Python, and
under what course of study?

-- Paul
Nov 18 '06 #5
I am currently going to school at Utah Valley State College; the course
I am taking is Analysis of Programming Languages. It's an upper-division
course, but our teacher wanted to teach us Python as part of it and
spent about 2-3 weeks on Python, which has been good. I currently work
with .NET, and it is fun to see what other languages have and what
syntax they use.

Nov 18 '06 #6
I have taken the comments and think I have implemented most of them. My
only question is how to use enumerate. Here is what I did; I have tried
a couple of things but was unable to figure out how to get the line
number.

def Xref(filename):
    try:
        fp = open(filename, "r")
    except:
        raise "Couldn't read input file \"%s\"" % filename
    dict = {}
    line_num=0
    for words in iter(fp.readline,""):
        words = set(words.split())
        line_num = line_num+1
        for word in words:
            word = word.strip(".,!?:;")
            if not dict.has_key(word):
                dict[word] = []
            dict[word].append(line_num)
    fp.close()
    keys = sorted(dict);
    for key in keys:
        print key," : ", dict[key]
    return dict

Nov 18 '06 #7
tom
na*******@gmail.com wrote:
> I have taken the comments and think I have implemented most. My only
> question is how to use the enumerator. Here is what I did, I have tried
> a couple of things but was unable to figure out how to get the line
> number.

Try this in the interpreter,

l = [5,4,3,2,1]
for count, i in enumerate(l):
    print count, i

Nov 18 '06 #8
tom
tom wrote:
> na*******@gmail.com wrote:
>> I have taken the comments and think I have implemented most. My only
>> question is how to use the enumerator. Here is what I did, I have tried
>> a couple of things but was unable to figure out how to get the line
>> number.
> Try this in the interpreter,
>
> l = [5,4,3,2,1]
> for count, i in enumerate(l):
>     print count, i

you could do it like this.

for count, line in enumerate(fp):
    for word in line.split():
        etc...

filehandles are iterators themselves.

don't take my word for it though, i'm kinda new to all this too :)
Nov 18 '06 #9
na*******@gmail.com wrote:
> I have taken the comments and think I have implemented most.

Unfortunately, no.

> My only question is how to use the enumerator. Here is what I did, I
> have tried a couple of things but was unable to figure out how to get
> the line number.
>
> def Xref(filename):
>     try:
>         fp = open(filename, "r")
>     except:
>         raise "Couldn't read input file \"%s\"" % filename

You still have that catch-all except in there.
This will produce subtle bugs when you e.g. misspell a variable name:

filename = '/tmp/foo'
try:
    f = open(fliename, 'r')
except:
    raise "can't open filename"

Please notice the misspelled 'fliename'.

This, OTOH, will give you more clues about what really goes wrong:

filename = '/tmp/foo'
try:
    f = open(fliename, 'r')
except IOError:
    raise "can't open filename"

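
For anyone trying this on a current Python 3, here is a hedged sketch of the same point. The helper names `open_checked` and `open_with_typo`, the use of `RuntimeError`, and the file path are assumptions for illustration. Two things differ from the 2.x code above: `IOError` is now an alias of `OSError`, and raising a plain string is no longer accepted, so a real exception object is raised instead.

```python
def open_checked(filename):
    """Open filename for reading, translating only I/O failures."""
    try:
        return open(filename, "r")
    except OSError as exc:  # IOError is an alias of OSError in Python 3
        # Raise a real exception object; string exceptions are not
        # accepted by current Python versions.
        raise RuntimeError("Couldn't read input file %r" % filename) from exc


def open_with_typo(filename):
    """Same idea, but with the misspelled name from the example above."""
    try:
        return open(fliename, "r")  # deliberate typo: raises NameError
    except OSError:
        raise RuntimeError("can't open %r" % filename)


# Assumption: this path does not exist on the machine running the sketch.
for func in (open_checked, open_with_typo):
    try:
        func("/no/such/dir/no_such_file.txt")
    except Exception as exc:
        print(func.__name__, "->", type(exc).__name__)
```

The missing file is reported as the friendly error, while the typo surfaces as a `NameError` instead of being disguised as a file problem.
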
Diez
Nov 18 '06 #10
