Hi,
I'm a postgraduate and my project deals with a fair bit of text
analysis. I'm looking for libraries and tools that are geared
towards text analysis (and text engineering). So far, the most
comprehensive toolkit in Python for my purpose is NLTK (Natural Language
Toolkit) by Edward Loper and Steven Bird, followed by mxTextTools. Are
there any OSS tools out there that are more comprehensive than NLTK?
In the Java world, there is GATE (General Architecture for Text
Engineering) and it seems very impressive. Is there something like that
for Python?
Thanks in advance.
Cheers
Maurice
In article <ma**************************************@python.org>,
Maurice Ling <ma*********@acm.org> wrote:
The book "Text Processing in Python" by David Mertz, available online
at http://gnosis.cx/TPiP/ , may be helpful.
Maurice Ling wrote:
> So far, the most comprehensive toolkit in Python for my purpose is NLTK
> (Natural Language Toolkit) by Edward Loper and Steven Bird, followed by
> mxTextTools. Are there any OSS tools out there that are more
> comprehensive than NLTK?
You might try http://web.media.mit.edu/~hugo/montylingua/
"Liu, Hugo (2004). MontyLingua: An end-to-end natural
language processor with common sense. Available
at: web.media.mit.edu/~hugo/montylingua."
Mark Winrock wrote:
> You might try http://web.media.mit.edu/~hugo/montylingua/
Thanks Mark. I've downloaded MontyLingua and it looks pretty cool. To
me, it seems geared to people like myself who need
something to process written text but do not need the hardcore nuts and
bolts of a computational linguist. NLTK is more of a nuts-and-bolts
toolkit. GATE still seems more advanced than MontyLingua, but to a
different end.
Is there anyone in this forum who is using or has used MontyLingua and
is happy to comment more on it? I'm happy to get more opinions.
Thanks and cheers
Maurice
"Maurice LING" <ma*********@acm.org> wrote in message
news:42**************@acm.org...
> Say I code my stuff in Jython (importing Java libraries) in a file "text.py"
Just to be clear, Jython is not a separate language that you code *in*, but
a separate implementation that you may code slightly differently *for*.
> ... Will there be any issues when I try to import text.py into CPython?
If text.py is written in an appropriate version of Python, it will itself
cause no problem. However, when it imports Java code files, as opposed to
CPython bytecode files, CPython will choke.
Terry J. Reedy
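As an illustration of the point above (not part of the original posts), here is a minimal sketch of how a module such as text.py could guard its Java imports so that it still loads under CPython. The function names `running_on_jython` and `load_java_string_class` are hypothetical; `platform.python_implementation()` is the standard-library call that reports which implementation is running, and `java.lang.String` merely stands in for whatever Java library the module really needs.

```python
import platform


def running_on_jython():
    """Report whether this interpreter is Jython (hypothetical helper)."""
    return platform.python_implementation() == "Jython"


def load_java_string_class():
    """Return java.lang.String when it is importable, otherwise None."""
    try:
        # Only an implementation running on the JVM, such as Jython, can
        # resolve this import; CPython normally raises ImportError here.
        from java.lang import String
    except ImportError:
        return None
    return String


if __name__ == "__main__":
    print(platform.python_implementation())
    print(load_java_string_class() is not None)
```

The guard only lets the module load everywhere. Any code path that actually needs the Java library still requires Jython and a JVM.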
Maurice Ling wrote:
> In the Java world, there is GATE (General Architecture for Text Engineering) and it seems very impressive. Is there something like that for Python?
I worked with GATE this last summer and really hated it. I can't decide
whether that was just my growing distaste for Java or actually the GATE
API. Anyway, if you're looking for something like GATE that (in my
experience) runs significantly faster, you should look at Ellogon
(www.ellogon.org). It's written in C and Tcl, with C++, Java, Perl, and
Python bindings. And I believe that if you have any software already
written for GATE, Ellogon can run those modules directly. I've
personally never done so; all my modules are written in Python (often
simple wrappers for things like MXPOST, MXTerminator, Charniak's parser,
etc.). I find the Python interface simple and easy to use, and they've
added a number of my suggestions to the API in the last release.
STeVe
Terry Reedy wrote:
> "Maurice LING" <ma*********@acm.org> wrote in message news:42**************@acm.org...
>> Say I code my stuff in Jython (importing Java libraries) in a file "text.py"
> Just to be clear, Jython is not a separate language that you code *in*, but a separate implementation that you may code slightly differently *for*.
Yes, I do get this point. Jython is just an implementation of the
Python virtual machine using Java. I do note that there are some
differences, such as that Jython can only handle pure Python modules.
However, I'm not enough of a language expert to tell apart the language
differences between these two implementations of Python, Jython and
CPython. If someone cares to enlighten me, it will be my pleasure to
consult. TIA.
>> ... Will there be any issues when I try to import text.py into CPython?
> If text.py is written in an appropriate version of Python, it will itself cause no problem. However, when it imports Java code files, as opposed to CPython bytecode files, CPython will choke.
In my example, the file "text.py" is coded in Jython, importing Java
libraries. I do get that I cannot import Java jar files directly into
CPython. What I do not get is what is so special about Jython that
it can "fool" CPython into using Java libraries... or is it that there will
always be a need for both a Java virtual machine and a Python virtual
machine when I use Java libraries in Jython... and import Jython-coded
files into CPython....
Cheers
Maurice
On Mon, 04 Apr 2005 09:36:32 +1000, Maurice LING <ma*********@acm.org>
declaimed the following in comp.lang.python:
> Yes, I do get this point rightly. Jython is just an implementation of
> Python virtual machine using Java. I do note that there are some
Pardon? I thought Jython directly used the Java VM... It is not a
Python VM at all. It's the same language at the source level, but a
totally different back-end.
Hence, it requires the JVM to be able to run anything that
imports a Java library. Pure Python (source code) is compatible because
the two implementations will "compile" it into either JVM byte code
(Jython) or classic Python byte code (CPython).
The CPython /run time/ has no facilities for interpreting JVM
byte code and cannot, therefore, process Java library imports.
Similarly, the JVM has no facilities for interfacing with CPython
compiled libraries.
--
wl*****@ix.netcom.com | Wulfraed Dennis Lee Bieber KD6MOG
wu******@dm.net | Bestiaria Support Staff
Home Page: <http://www.dm.net/~wulfraed/>
Overflow Page: <http://wlfraed.home.netcom.com/>
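The difference in compilation targets can be seen from the CPython side with the standard `dis` module. This is a small sketch, not taken from the thread, and it assumes CPython 3.4 or later; `count_words` is a made-up function used only as something to disassemble. Jython would compile the same source to JVM bytecode instead, which `dis` does not show.

```python
import dis


def count_words(text):
    """Pure-Python function: valid source for CPython and Jython alike."""
    return len(text.split())


# CPython compiled the function body to its own bytecode, stored as bytes.
raw_bytecode = count_words.__code__.co_code

# Collect the opcode names CPython chose. They differ between CPython
# versions, and a JVM cannot execute any of them.
opcode_names = [ins.opname for ins in dis.get_instructions(count_words)]

print(len(raw_bytecode), "bytes of CPython bytecode")
print(opcode_names)
```

The source of `count_words` is portable between the two implementations; the compiled form is not.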
Maurice LING wrote:
> Terry Reedy wrote:
>> Just to be clear, Jython is not a separate language that you code *in*, but a separate implementation that you may code slightly differently *for*.
> Yes, I do get this point. Jython is just an implementation of the Python virtual machine using Java. I do note that there are some differences, such as that Jython can only handle pure Python modules.
That's not strictly correct. The Python virtual machine isn't
implemented at all in Jython; instead, the JVM is used as the compilation
target.
> In my example, the file "text.py" is coded in Jython, importing Java libraries. I do get that I cannot import Java jar files directly into CPython. What I do not get is what is so special about Jython that it can "fool" CPython into using Java libraries... or is it that there will always be a need for both a Java virtual machine and a Python virtual machine when I use Java libraries in Jython... and import Jython-coded files into CPython....
Jython is pretty much a Python interpreter that compiles Python into JVM
bytecodes. Consequently the amount of "trickery" involved is rather
less, though clearly there is some (automated conversion between Java
and Python data types where appropriate, and automated signature-based
selection of the appropriate Java method being the two most obvious).
regards
Steve
--
Steve Holden +1 703 861 4237 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/
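The signature-based selection Steve mentions can be pictured with a toy model in plain Python. This sketch is an illustration under loud assumptions, not Jython's actual algorithm: the `OVERLOADS` table, `call_overloaded`, and the three `describe_*` functions are all hypothetical, and real Jython also converts values between Python and Java types, which this sketch does not attempt.

```python
# Toy model of choosing among same-named methods by argument types.
# This is NOT how Jython is implemented; every name here is hypothetical.


def describe_int(value):
    return "int overload: %d" % value


def describe_float(value):
    return "double overload: %.1f" % value


def describe_str(value):
    return "String overload: %s" % value


# Each entry pairs a tuple of expected argument types with an
# implementation, the way a Java class may offer several methods
# that share one name but differ in parameter types.
OVERLOADS = [
    ((int,), describe_int),
    ((float,), describe_float),
    ((str,), describe_str),
]


def call_overloaded(*args):
    """Call the first overload whose parameter types match the arguments."""
    for param_types, implementation in OVERLOADS:
        if len(param_types) == len(args) and all(
            type(arg) is expected for arg, expected in zip(args, param_types)
        ):
            return implementation(*args)
    raise TypeError("no overload matches %r" % (args,))


print(call_overloaded(3))
print(call_overloaded("three"))
print(call_overloaded(3.0))
```

The caller writes one call and the dispatcher picks the target from the argument types, which is the convenience the post describes Jython providing automatically for Java methods.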