Help with inverted dictionary

I'm new to Python and I'm struggling. I have a text file (*.txt) with
a couple thousand entries, each on their own line (similar to a phone
book). How do I make a script to create something like an inverted
dictionary that will allow me to call "robert" and create a new text
file of all of the lines that contain "robert"?
Thanks so much.

Jul 21 '05 #1
Hello,

First, I'm not entirely clear about your problem, but you can try the
following steps:
1. Read your file into a list (list1)
2. Use a regex to find 'robert' in every member of list1 and add the
matching members to list2
3. Write list2 out to a file

pujo

Jul 21 '05 #2
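import re

name = "Robert"
# Read the file and strip the trailing newline from each line.
with open('phonebook.txt', 'r') as f:
    lines = [line.rstrip("\n") for line in f]
# Case-insensitive search; re.escape guards against special characters in name.
pat = re.compile(re.escape(name), re.I)
related_lines = [line for line in lines if pat.search(line)]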

And then you write the lines in related_lines to a file. I don't really
write text to files much so, um, yeah.
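
The missing last step could look something like the sketch below. It assumes the output file is named after the search term; the helper name `write_lines`, the file name `robert.txt`, and the sample data are all hypothetical and only there for illustration.

```python
def write_lines(lines, out_path):
    """Write each string in lines to out_path, one per line."""
    with open(out_path, "w", encoding="utf-8") as out:
        for line in lines:
            out.write(line + "\n")


# Hypothetical sample data standing in for related_lines from the snippet above.
related_lines = ["Robert Smith 555-0101", "Alice Roberts 555-0102"]
write_lines(related_lines, "robert.txt")
```

Opening the file in "w" mode replaces any existing robert.txt; use "a" instead if results should accumulate across runs.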

Jul 21 '05 #3
OK, so my problem is I have a text file with all of these instances,
for example 5000 facts about animals. I need to go through the file
and put all of the facts (lines) that contain the word lion into a file
called lion.txt. If I come across an animal or other word for which a
file does not yet exist I need to create a file for that word and put
all instances of that word into that file. I realize that this should
probably create 30,000 files or so. Any help would be REALLY
appreciated. Thanks. Reece

Jul 21 '05 #4
I think you need to get a database. Anyways, if anything, it should
create no more than 5,000 files, since 5,000 facts shouldn't talk about
30,000 animals. There have been a few discussions about looking at
files in directories though, if you want to look at those.

Jul 21 '05 #5
I will eventually transfer to a database, but is there any way for now
you could help me make the text files? Thank you so much. Reece

Jul 21 '05 #6
Oh, I seem to have missed the part saying 'or other word'. Are you
doing this for every single word in the file?

Jul 21 '05 #9
Yes, I am. Does that make it harder?

Jul 21 '05 #10
On Tue, 12 Jul 2005 08:25:50 -0700, rorley wrote:
OK, so my problem is I have a text file with all of these instances,
for example 5000 facts about animals. I need to go through the file
and put all of the facts (lines) that contain the word lion into a file
called lion.txt. If I come across an animal or other word for which a
file does not yet exist I need to create a file for that word and put
all instances of that word into that file. I realize that this should
probably create 30,000 files or so. Any help would be REALLY
appreciated. Thanks. Reece


Sounds like homework to me...

Start by breaking the big problem down into little problems:

Step 1: read the data from the file

You do that with something like this:

data = open("MyFile.txt", "r").read()

Notice I said *something like* -- that's a hint that you want to change
that to something slightly different.

Step 2: grab each line, one at a time

Somehow you want to read lines (hint! hint!) from the file, so that you
have a list of text lines in data. How do you read lines (hint!) from a
file in Python?

Once you do that, data should look something like this:

["lions are mammals\n", "lions eat meat\n", "sheep eat grass\n"]

So you can work with each line in data with:

for line in data:
    do_something(line)

Step 3: grab each word from the line

I'll make this one easy for you:

words = line.split()

words now looks like: ["lions", "are", "mammals"]
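
One thing to watch for: `split()` only splits on whitespace, so punctuation and capitalisation stay attached to the words ("meat," and "Lions" would become their own keys). A minimal sketch of one way to normalise the words, assuming lowercase keys with surrounding punctuation stripped are what you want (the helper name `split_words` is hypothetical):

```python
import string


def split_words(line):
    """Split on whitespace, then lowercase and strip surrounding punctuation."""
    words = (w.strip(string.punctuation).lower() for w in line.split())
    # Drop anything that was punctuation only and is now empty.
    return [w for w in words if w]


print(split_words("Lions eat meat, mostly.\n"))
# prints ['lions', 'eat', 'meat', 'mostly']
```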

Step 4: for each word, open a file:

This one is also easy:

for word in words:
    fp = open(word, "w")
    fp.write(all the other words)  # pseudocode: decide what to write here
    fp.close()

Hint: this code won't quite do what you want. You need to change a few
things.

Does this help? Is that enough to get started? See how far you get, and
then come back for more help.
--
Steven.

Jul 21 '05 #11
Not quite homework but a special project. Thanks for the advice. I'll
let you know if I run into any more stumbling blocks. Reece

Jul 21 '05 #12
As Steven said, this looks too much like homework.
But what the heck, I am also learning Python. So I wrote a small
program. A very small program. I am fairly new to Python, and I am stunned
each time to see how small programs like this can be.

Since I am also learning, can somebody comment if anything here is not
Pythonesque?

dictwords = dict()
# Map each word to the set of lines that contain it.
for line in open('testfile.txt', 'r'):
    for word in line.rstrip('\n').split():
        dictwords.setdefault(word, set()).add(line.rstrip('\n'))
# Write one file per word, named after the word itself.
for word, lines in dictwords.items():
    with open(word, 'w') as out:
        out.write('\n'.join(lines))
Jul 21 '05 #13
Thanks for the hints, I think I've figured it out. I've only been
using Python for 2 days so I really needed the direction. If you were
curious, this is not homework but an attempt to use the ConceptNet data
(it's an MIT AI project) to make a website in a Wiki-like format that
would allow the data to be edited on the fly. I'll ask again if I need
more help. You guys are great. Reece

Jul 21 '05 #14
ro****@gmail.com wrote:
I will eventually transfer to a database, but is there any way for now
you could help me make the text files? Thank you so much. Reece


No. There is utterly no reason why you should create 5000 or 30000 text
files. While you are waiting to get a clue about databases, do it in
Python, in memory. It should take very little time to read your
5000-fact file into memory, index the data appropriately, and run some
queries, e.g. list all facts about "lion".
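
A minimal sketch of what that in-memory index might look like, assuming one fact per line and whitespace-separated words. The function name `build_index` and the sample facts are hypothetical; in practice you would pass an open file object instead of the list.

```python
from collections import defaultdict


def build_index(lines):
    """Map each lowercased word to the list of lines that contain it."""
    index = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        # set() so a line is stored once per word even if the word repeats in it.
        for word in set(line.lower().split()):
            index[word].append(line)
    return index


# Hypothetical sample data standing in for the 5000-fact file.
facts = ["lions are mammals\n", "lions eat meat\n", "sheep eat grass\n"]
index = build_index(facts)
print(index["lions"])          # prints ['lions are mammals', 'lions eat meat']
print(index.get("zebra", []))  # prints []
```

This matches whole words only, so a query for "lion" would not find lines that mention only "lions"; whether to treat those as the same key is a design decision the sketch leaves open.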

Jul 21 '05 #15
On Wed, 13 Jul 2005 11:38:44 +1000, John Machin wrote:
ro****@gmail.com wrote:
I will eventually transfer to a database, but is there any way for now
you could help me make the text files? Thank you so much. Reece


No. There is utterly no reason why you should create 5000 or 30000 text
files.


There is one possible reason: if it is a homework assignment, and creating
all those files is part of the assignment.

(I've seen stupider ideas, but not by much.)

--
Steven.

Jul 21 '05 #16
On 12 Jul 2005 10:26:38 -0700, ro****@gmail.com declaimed the following
in comp.lang.python:
Thanks for the hints, I think I've figured it out. I've only been
using Python for 2 days so I really needed the direction. If you were
curious, this is not homework but an attempt to use the ConceptNet data
(it's an MIT AI project) to make a website in a Wiki-like format that
would allow the data to be edited on the fly. I'll ask again if I need
more help. You guys are great. Reece

You may still want to consider filtering the words you use as
keys... Do you REALLY want something like a KWIC file just for words
like a, in, of, the, etc.?
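
A small sketch of that kind of filtering, assuming a hand-written stop-word set. The list here is tiny and purely illustrative (a real one would be much longer), and the helper name `keywords` is hypothetical.

```python
# A tiny, hypothetical stop-word list; extend it to suit your data.
STOP_WORDS = {"a", "an", "in", "of", "the", "are", "is"}


def keywords(line, stop_words=STOP_WORDS):
    """Return the set of lowercased words in line that are not stop words."""
    return {w for w in line.lower().split() if w not in stop_words}


print(sorted(keywords("The lion is the king of the jungle")))
# prints ['jungle', 'king', 'lion']
```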

--
Wulfraed  Dennis Lee Bieber  KD6MOG
wl*****@ix.netcom.com | wu******@dm.net
Bestiaria Support Staff
Home Page: <http://www.dm.net/~wulfraed/>
Overflow Page: <http://wlfraed.home.netcom.com/>

Jul 21 '05 #17
