Bytes | Software Development & Data Engineering Community
grabbing random words

Jay
How would I be able to grab random words from an internet source? I'd
like to grab a random word from a comprehensive internet dictionary.
What would be the best source and the best way to go about this?
Thanks.

(Sorry if this sounds/is super noobish.)

Sep 23 '06 #1
Jay wrote:
> How would I be able to grab random words from an internet source?
> I'd like to grab a random word from a comprehensive internet
> dictionary. What would be the best source and the best way to go
> about this?

The *best* source would be a function of the internet dictionary
that selects a random word and passes it to you. Otherwise you'd
have to read quite a large number of words and select one yourself.

> (Sorry if this sounds/is super noobish.)

It's quite difficult to make readable and complete questions (also
with a meaningful subject) sound noobish ;)

Regards,
Björn

--
BOFH excuse #265:

The mouse escaped.

Sep 23 '06 #2
Jay:
> How would I be able to grab random words from an internet source? I'd
> like to grab a random word from a comprehensive internet dictionary.
> What would be the best source and the best way to go about this?

Why do you need to grab them from the net?
A simpler solution seems to be to keep a local file containing the
sequence of words. You can find some open source word lists like this.
Then you can read all the words into a list, and use random.choice to
take one of them randomly. If you don't want to keep the whole
dictionary/lexicon (which can be up to 20 MB if it's a lexicon) in
memory you can (knowing the length of the file) seek to a random
position, read 20-30 bytes, and take the word inside it (or you can
create a dictionary file where each word is stored in a fixed number of
chars, so you can seek exactly to a single word).
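
The fixed-width variant described above could be sketched roughly like this. It is only an illustration under assumptions: the record width (RECORD_LEN), the helper names build_fixed_width_file and random_word, and the ASCII-only word list are all made up for the example and do not come from the post.

```python
import os
import random

RECORD_LEN = 32  # assumed width; must be at least as long as the longest word


def build_fixed_width_file(words, path):
    """Write each word padded with spaces to exactly RECORD_LEN bytes."""
    with open(path, "wb") as f:
        for word in words:
            encoded = word.encode("ascii")
            if len(encoded) > RECORD_LEN:
                raise ValueError(f"word too long for record: {word!r}")
            f.write(encoded.ljust(RECORD_LEN, b" "))


def random_word(path):
    """Seek straight to one random record; the rest of the file is never read."""
    n_records = os.path.getsize(path) // RECORD_LEN
    if n_records == 0:
        raise ValueError("word file is empty")
    index = random.randrange(n_records)
    with open(path, "rb") as f:
        f.seek(index * RECORD_LEN)
        return f.read(RECORD_LEN).rstrip(b" ").decode("ascii")
```

Because every record has the same size, each word is equally likely. Seeking to an arbitrary byte in an ordinary one-word-per-line file would instead favour words that follow long words.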

Bye,
bearophile

Sep 23 '06 #3
Another approach would be to just scrape a CS's random (5.75 x 10^30)
word haiku generator. ;)

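# Python 2 code: urllib.urlopen and the print statement are not
# available in Python 3.
import urllib
import libxml2
import random

uri = 'http://www.cs.indiana.edu/cgi-bin/haiku'

# Fetch the generated haiku page.
sock = urllib.urlopen(uri)
data = sock.read()
sock.close()

# Parse the HTML and collect the text of the links; the [8:-3] slice
# presumably drops the links around the haiku words on this page.
doc = libxml2.htmlParseDoc(data, None)
words = [p.content for p in doc.xpathEval('//a')[8:-3]]
doc.freeDoc()

print random.choice(words)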

Regards,
Jordan

Sep 23 '06 #4
On Sat, 23 Sep 2006 04:37:31 -0700, MonkeeSage wrote:
> Another approach would be to just scrape a CS's random (5.75 x 10^30)
> word haiku generator. ;)

That isn't 5.75e30 words, it is the number of possible haikus. There
aren't that many words in all human languages combined.

Standard English working vocabulary is about 800 words in typical daily
use, and 5000 words that most people can understand. Particularly
well-read people might understand a dozen times that, about 60,000 words.
The total number of words in English is hard to count, but the Oxford
English Dictionary estimates about three quarters of a million words.

http://www.askoxford.com/asktheexper...sh/numberwords
Call it a million; and let's say that there are, or have ever been, a
million distinct human languages (which is surely a large overestimate,
even including dialects and pidgins). That gives only a "mere" 10**12
words, about a million million million times smaller than the number of
haikus.
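
The estimate above is easy to check with a few lines of Python. All three inputs are the rough figures from this thread, not measured values, and the variable names are just for illustration.

```python
# Rough figures taken from the discussion above; none are measured values.
words_per_language = 10**6   # "call it a million"
languages = 10**6            # deliberate overestimate
possible_haikus = 5.75e30    # figure quoted for the haiku generator

total_words = words_per_language * languages  # 10**12
ratio = possible_haikus / total_words

print(f"total words:     {total_words:.0e}")
print(f"haikus per word: {ratio:.2e}")
```

The ratio comes out at a few times 10**18, which is what "about a million million million" (10**18) is rounding to.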

(Note however that there are languages like Finnish which allow you to
stick together words into a single "word" of indefinite length, sort of as
if we could say in English "therearelanguageswhichallowyou" to
"sticktogetherwordsintoasinglewordofindefinitelength". Such languages
might be said to have an infinite number of words, in some sense.)

--
Steven.

Sep 24 '06 #5
Jay wrote:
> How would I be able to grab random words from an internet source? I'd
> like to grab a random word from a comprehensive internet dictionary.
> What would be the best source and the best way to go about this?

Here's a source that gives you a random word:
http://www.zokutou.co.uk/randomword/

Frode

Sep 24 '06 #6
Jay,

Your problem is specific to a particular internet dictionary provider.
UNLESS

1) The dictionary page has some specific link that gets you a
random word, OR

2) After you click through a couple of word definitions you find in
the URLs of the pages that the words are indexed using integers and
there are no gaps in the sequence, OR

3) The dictionary somehow exposes its database for all to access,

THEN you cannot really get random words from it.
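
Case 2 could look something like the sketch below. Everything site-specific in it is hypothetical: the URL pattern, the entry count, and the function name are invented for illustration and do not describe any real dictionary.

```python
import random

# Hypothetical values: neither the URL pattern nor the entry count comes
# from a real dictionary site.
URL_TEMPLATE = "http://dictionary.example/entry/{entry_id}"
ENTRY_COUNT = 50000  # assumed highest entry number, with no gaps below it


def random_entry_url(rng=random):
    """Return the URL of one uniformly chosen entry, numbered 1..ENTRY_COUNT."""
    entry_id = rng.randint(1, ENTRY_COUNT)
    return URL_TEMPLATE.format(entry_id=entry_id)
```

This only works if the numbering really has no gaps. Otherwise some requests would hit missing pages and would have to be retried.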

If you need random words, find yourself lists of such words online
(sites devoted to natural language processing or linguistics might have
them), then load them into a list and randomly choose among the
indices of the list to get your words.
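
A minimal sketch of that approach, assuming a plain text file with one word per line. The file layout and the function names load_words and pick_word are assumptions for the example, not something specified in the thread.

```python
import random


def load_words(path):
    """Read one word per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def pick_word(words):
    """Choose by random index; random.choice(words) does the same thing."""
    return words[random.randrange(len(words))]
```

Many Unix-like systems ship a suitable list at /usr/share/dict/words, so where that file exists, pick_word(load_words("/usr/share/dict/words")) would be one way to try it.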

Nick V.
Jay wrote:
> How would I be able to grab random words from an internet source? I'd
> like to grab a random word from a comprehensive internet dictionary.
> What would be the best source and the best way to go about this?
> Thanks.
>
> (Sorry if this sounds/is super noobish.)
Sep 24 '06 #7
Steven D'Aprano wrote:
> That isn't 5.75e30 words, it is the number of possible haikus. There
> aren't that many words in all human languages combined.

Doh! This is why _I'm_ not a computer scientist. I'm kinda slow. ;)

> (Note however that there are languages like Finnish which allow you to
> stick together words into a single "word" of indefinite length, sort of as
> if we could say in English "therearelanguageswhichallowyou" to
> "sticktogetherwordsintoasinglewordofindefinitelength". Such languages
> might be said to have an infinite number of words, in some sense.)

Imagine an agglutinating (sp?) programming language, heh!

Regards,
Jordan

Sep 24 '06 #8
