Help with inverted dictionary

rorley

I'm new to Python and I'm struggling. I have a text file (*.txt) with
a couple thousand entries, each on their own line (similar to a phone
book). How do I make a script to create something like an inverted
dictionary that will allow me to call "robert" and create a new text
file of all of the lines that contain "robert"?
Thanks so much.

Jul 21 '05 #1

ajikoe

Hello,

I'm not entirely clear about your problem, but you could take the following
steps:
1. Read your file into a list (list1)
2. Use a regex to find 'robert' in every member of list1 and add each
matching line to list2
3. Write list2 out to a file
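
The three steps above can be sketched roughly as follows. This is only one possible reading of them: the function name filter_lines and its parameters are made up for illustration, and the names list1 and list2 are kept from the steps.

```python
import re


def filter_lines(src_path, dst_path, name):
    """Copy every line of src_path that mentions `name` into dst_path."""
    # Step 1: read the file into a list of lines (newlines stripped).
    with open(src_path, encoding="utf-8") as src:
        list1 = [line.rstrip("\n") for line in src]

    # Step 2: keep the lines where the name matches, ignoring case.
    # re.escape() guards against regex metacharacters in the name.
    pattern = re.compile(re.escape(name), re.IGNORECASE)
    list2 = [line for line in list1 if pattern.search(line)]

    # Step 3: write the matches out, one per line.
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(line + "\n" for line in list2)
    return list2
```

Note that a plain substring match like this also picks up longer names such as "Roberta"; whether that is wanted depends on the data.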

pujo

Jul 21 '05 #2

Devan L

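import re

name = "Robert"
# Read the phone book, dropping the trailing newline from each line.
with open('phonebook.txt', 'r') as f:
    lines = [line.rstrip("\n") for line in f]
# re.I makes the match case-insensitive, so "robert" matches too.
pat = re.compile(name, re.I)
related_lines = [line for line in lines if pat.search(line)]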

And then you write the lines in related_lines to a file. I don't really
write text to files much so, um, yeah.
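
For the writing step, a minimal sketch might look like this. The helper name write_lines is hypothetical; the one detail worth noting is that the newlines stripped while reading have to be put back.

```python
def write_lines(path, lines):
    """Write each string in `lines` to `path`, one per line."""
    # The lines lost their "\n" when they were read, so add it back here.
    with open(path, "w", encoding="utf-8") as out:
        for line in lines:
            out.write(line + "\n")
```

With the code above, this would be called as write_lines("robert.txt", related_lines), where the output file name is just an example.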

Jul 21 '05 #3

rorley

OK, so my problem is I have a text file with all of these instances,
for example 5000 facts about animals. I need to go through the file
and put all of the facts (lines) that contain the word lion into a file
called lion.txt. If I come across an animal or other word for which a
file does not yet exist I need to create a file for that word and put
all instances of that word into that file. I realize that this should
probably create 30,000 files or so. Any help would be REALLY
appreciated. Thanks. Reece

Jul 21 '05 #4

Devan L

I think you need to get a database. Anyways, if anything, it should
create no more than 5,000 files, since 5,000 facts shouldn't talk about
30,000 animals. There have been a few discussions about looking at
files in directories though, if you want to look at those.

Jul 21 '05 #5

rorley

I will eventually transfer this to a database, but is there any way for now
you could help me make the text files? Thank you so much. Reece

Jul 21 '05 #6

Devan L

Oh, I seem to have missed the part saying 'or other word'. Are you
doing this for every single word in the file?

Jul 21 '05 #9

rorley

Yes, I am. Does that make it harder?

Jul 21 '05 #10

Steven D'Aprano

On Tue, 12 Jul 2005 08:25:50 -0700, rorley wrote:

OK, so my problem is I have a text file with all of these instances,
for example 5000 facts about animals. I need to go through the file
and put all of the facts (lines) that contain the word lion into a file
called lion.txt. If I come across an animal or other word for which a
file does not yet exist I need to create a file for that word and put
all instances of that word into that file. I realize that this should
probably create 30,000 files or so. Any help would be REALLY
appreciated. Thanks. Reece

Sounds like homework to me...

Start by breaking the big problem down into little problems:

Step 1: read the data from the file

You do that with something like this:

data = open("MyFile.txt", "r").read()

Notice I said *something like* -- that's a hint that you want to change
that to something slightly different.

Step 2: grab each line, one at a time

Somehow you want to read lines (hint! hint!) from the file, so that you
have a list of text lines in data. How do you read lines (hint!) from a
file in Python?

Once you do that, data should look something like this:

["lions are mammals\n", "lions eat meat\n", "sheep eat grass\n"]

So you can work with each line in data with:

for line in data:
    do_something(line)

Step 3: grab each word from the line

I'll make this one easy for you:

words = line.split()

words now looks like: ["lions", "are", "mammals"]
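
One thing split() does not handle is punctuation and capitalisation: "Lions," and "lions" would count as different words. A possible way around that, assuming a word is simply a run of letters, digits or apostrophes (the helper name words_in is made up):

```python
import re


def words_in(line):
    """Return the lowercase words in a line, without punctuation."""
    # Assumption: a "word" is a run of letters, digits or apostrophes.
    return re.findall(r"[a-z0-9']+", line.lower())
```

Whether this normalisation is appropriate depends on the data; for plain lowercase facts, line.split() is enough.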

Step 4: for each word, open a file:

This one is also easy:

for word in words:
    fp = open(word, "w")
    fp.write(all the other words)
    fp.close()

Hint: this code won't quite do what you want. You need to change a few
things.
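
For readers who want to check their answer, here is one possible reading of what needs changing. It is a sketch, not necessarily what the hint had in mind: it assumes the whole fact should be written rather than the other words, that files are named <word>.txt, and that every word is safe to use as a file name. The names append_facts and out_dir are hypothetical.

```python
import os


def append_facts(lines, out_dir):
    """Append each line to <word>.txt for every distinct word it contains."""
    os.makedirs(out_dir, exist_ok=True)
    for line in lines:
        fact = line.rstrip("\n")
        # set() avoids writing the same fact twice for a repeated word.
        for word in set(fact.split()):
            path = os.path.join(out_dir, word + ".txt")
            # "a" appends; "w" would wipe out earlier facts for this word.
            with open(path, "a", encoding="utf-8") as fp:
                fp.write(fact + "\n")
```

Because the files are opened in append mode, running this twice on the same data duplicates every fact, so the output directory should start out empty.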

Does this help? Is that enough to get started? See how far you get, and
then come back for more help.
--
Steven.

Jul 21 '05 #11

rorley

Not quite homework but a special project. Thanks for the advice. I'll
let you know if I run into any more stumbling blocks. Reece

Jul 21 '05 #12

Dark Cowherd

As Steven said, this looks too much like homework.
But what the heck, I am also learning Python, so I wrote a small
program. A very small program. I am fairly new to Python, and I am stunned
each time to see how small programs like this can be.

Since I am also learning, can somebody comment if anything here is not
Pythonesque?

dictwords = dict()
for line in open('testfile.txt','r'):
    for word in line.rstrip('\n').split():
        dictwords.setdefault(word,set()).update((line.rstrip('\n'),))
for wordfound in dictwords.items():
    open(wordfound[0],'w').write('\n'.join(wordfound[1]))
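
As one possible answer to the question about style, the same idea could be written with collections.defaultdict, tuple unpacking and with-blocks so that files are closed promptly. This is a sketch only: the function names and the out_dir parameter are made up, and it relies on dicts keeping insertion order (Python 3.7 and later) so that facts come out in file order rather than in arbitrary set order.

```python
import os
from collections import defaultdict


def group_facts(lines):
    """Map each word to the facts that contain it, in first-seen order."""
    # A dict with None values acts as an ordered set of facts.
    groups = defaultdict(dict)
    for line in lines:
        fact = line.rstrip("\n")
        for word in fact.split():
            groups[word][fact] = None
    return groups


def write_groups(groups, out_dir="."):
    """Write one <word>.txt file per word, one fact per line."""
    os.makedirs(out_dir, exist_ok=True)
    for word, facts in groups.items():
        path = os.path.join(out_dir, word + ".txt")
        with open(path, "w", encoding="utf-8") as out:
            out.write("\n".join(facts) + "\n")
```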

Jul 21 '05 #13

rorley

Thanks for the hints, I think I've figured it out. I've only been
using Python for 2 days so I really needed the direction. If you were
curious, this is not homework but an attempt to use the ConceptNet data
(it's an MIT AI project) to make a website in a Wiki-like format that
would allow the data to be edited on the fly. I'll ask again if I need
more help. You guys are great. Reece

Jul 21 '05 #14

John Machin

ro****@gmail.com wrote:

I will eventually transfer this to a database, but is there any way for now
you could help me make the text files? Thank you so much. Reece

No. There is utterly no reason why you should create 5000 or 30000 text
files. While you are waiting to get a clue about databases, do it in
Python, in memory. It should only take a very tiny time to suck your
5000-fact file into memory, index the data appropriately, and do some
queries e.g. list all facts about "lion".
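
A rough sketch of what indexing the data in memory could look like, under the assumption that the facts fit comfortably in a list and that matching whole lowercase words is good enough. The names build_index and query are hypothetical.

```python
from collections import defaultdict


def build_index(facts):
    """Map each lowercase word to the positions of the facts that use it."""
    index = defaultdict(set)
    for position, fact in enumerate(facts):
        for word in fact.lower().split():
            index[word].add(position)
    return index


def query(index, facts, *words):
    """Return the facts containing every one of `words`, in file order."""
    if not words:
        return []
    # Intersect the position sets; a word that was never seen matches nothing.
    hits = set.intersection(*(index.get(w.lower(), set()) for w in words))
    return [facts[i] for i in sorted(hits)]
```

Storing positions rather than copies of the lines keeps the index small, and the same structure answers multi-word queries such as query(index, facts, "lions", "eat"). Note that this matches exact words only, so "lion" would not find a fact that says "lions".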

Jul 21 '05 #15

Steven D'Aprano

On Wed, 13 Jul 2005 11:38:44 +1000, John Machin wrote:

ro****@gmail.com wrote:
I will eventually transfer this to a database, but is there any way for now
you could help me make the text files? Thank you so much. Reece

No. There is utterly no reason why you should create 5000 or 30000 text
files.

There is one possible reason: if it is a homework assignment, and creating
all those files is part of the assignment.

(I've seen stupider ideas, but not by much.)

--
Steven.

Jul 21 '05 #16

Dennis Lee Bieber

On 12 Jul 2005 10:26:38 -0700, ro****@gmail.com declaimed the following
in comp.lang.python:

Thanks for the hints, I think I've figured it out. I've only been
using Python for 2 days so I really needed the direction. If you were
curious, this is not homework but an attempt to use the ConceptNet data
(it's an MIT AI project) to make a website in a Wiki-like format that
would allow the data to be edited on the fly. I'll ask again if I need
more help. You guys are great. Reece

You may still want to consider filtering the words you use as
keys... Do you REALLY want something like a KWIC file just for words
like: a, in, of, the, etc.?
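
Such a filter could be as small as the sketch below. The stop-word list is a made-up starting point, not a standard one, and the helper name key_words is hypothetical.

```python
# Hypothetical starting list; extend it to suit the data.
STOP_WORDS = frozenset({"a", "an", "and", "in", "of", "the", "to"})


def key_words(line, stop_words=STOP_WORDS):
    """Return the lowercase words in a line that are worth using as keys."""
    return [w for w in line.lower().split() if w not in stop_words]
```

Only the words returned by key_words would then get a file (or an index entry) of their own.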

--
wl*****@ix.netcom.com | Wulfraed Dennis Lee Bieber KD6MOG
wu******@dm.net | Bestiaria Support Staff
Home Page: <http://www.dm.net/~wulfraed/>
Overflow Page: <http://wlfraed.home.netcom.com/>

Jul 21 '05 #17
