Java vs Perl for specific tasks

Hello,

I have a rather odd question. My company is an all java/oracle shop.
We do everything in Java... no matter what it is... parsing of text
files, messaging, GUI, you name it. My question is this... is Perl so
much better at parsing text files and outputting that we would see a
substantial speed increase? We process about 10 million records in
flat files a day for reformatting before putting them in a DB.

Also, when it comes to Unix threading... which one would be better off,
Java or perl? Essentially, we would break the 10 million down into 10
files... each file is done in a separate thread... The program also
has to keep a hashmap of keys to make sure we don't include duplicate
records and it must connect to oracle every once in a while... is
switching to perl worth it considering the investment and know-how we
have in java? This is the only portion of the code we would consider
switching to perl...

ideas?

Thanks,

Joie
Jul 17 '05 #1
John Smith wrote:
I have a rather odd question. My company is an all java/oracle shop.
We do everything in Java... no matter what it is... parsing of text
files, messaging, GUI, you name it. My question is this... is Perl so
much better at parsing text files and outputting that we would see a
substantial speed increase? We process about 10 million records in
flat files a day for reformatting before putting them in a DB.
Parsing text with regular expressions is built into Perl; working
with regular expressions is part of the language itself. Ten years ago
that was quite unique.

Since Java 1.4, regular expressions are also built into Java
(java.util.regex), or else you could use the regexp library from Apache.

So the only difference left is syntax. With Java you have to type more
code.

Perl also has some very nice regex features, like non-greedy matching,
but by now the other regex libraries have adopted the Perl features.

So, all things considered, Perl does not have much of an advantage over
Java.
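
To make the non-greedy point concrete, here is a minimal sketch using java.util.regex. The class name, the helper method and the sample record are invented for illustration; they do not come from the original poster's data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the record layout below is invented for illustration.
class ReluctantMatchDemo {

    // Greedy: ".*" grabs as much as possible, so the match runs to the LAST '>'.
    private static final Pattern GREEDY = Pattern.compile("<(.*)>");

    // Reluctant (what Perl calls non-greedy): ".*?" stops at the FIRST '>'.
    private static final Pattern RELUCTANT = Pattern.compile("<(.*?)>");

    // Returns capture group 1 of the first match, or null if nothing matches.
    static String firstGroup(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "<id=42><name=Joie>";
        System.out.println(firstGroup(GREEDY, line));    // id=42><name=Joie
        System.out.println(firstGroup(RELUCTANT, line)); // id=42
    }
}
```

The pattern syntax is the same one Perl uses; the extra typing is the Pattern/Matcher plumbing around it.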
Also, when it comes to Unix threading... which one would be better off
Java or perl? Essentially, we would break the 10 million down into 10
files... each file is done in a separate thread... The program also
has to keep a hashmap of keys to make sure we don't include duplicate
records and it must connect to oracle every once in a while... is
switching to perl worth it considering the investment and know-how we
have in java? This is the only portion of the code we would consider
switching to perl...


Java has nice thread support. Don't know about Perl.
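
For what it's worth, the "one thread per file plus a shared set of keys" design from the question could look roughly like the sketch below. It assumes a much newer JDK than the 1.4.2 mentioned in this thread (java.util.concurrent arrived in Java 5, and ConcurrentHashMap.newKeySet() and List.of() are later still). The class name, the '|' delimited record format and the key extraction are all assumptions, and in-memory lists stand in for the ten input files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: class name, record format and key extraction are assumptions.
class DedupWorkers {

    // Assumption: the record key is everything before the first '|'.
    static String keyOf(String record) {
        int bar = record.indexOf('|');
        return bar < 0 ? record : record.substring(0, bar);
    }

    // Processes each chunk on its own thread and returns how many records were
    // kept, i.e. whose key had not been seen before in ANY chunk.
    static int countUnique(List<List<String>> chunks) {
        // Thread-safe set; add() returns false if the key is already present.
        Set<String> seenKeys = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, chunks.size()));
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<String> chunk : chunks) {
                results.add(pool.submit(() -> {
                    int kept = 0;
                    for (String record : chunk) {
                        if (seenKeys.add(keyOf(record))) {
                            kept++; // first occurrence: reformat and queue for the DB here
                        }
                    }
                    return kept;
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                try {
                    total += f.get(); // blocks until that worker has finished
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException("worker failed", e);
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> chunks = List.of(
                List.of("1|alpha", "2|beta"),
                List.of("2|beta-again", "3|gamma"));
        System.out.println(countUnique(chunks)); // 3 (unique keys: 1, 2, 3)
    }
}
```

The shared set is the only point of contention between the workers, so whether ten threads actually help depends on how much per-record work happens outside of it.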

Edwin Martin

--
http://www.bitstorm.org/edwin/en/
Jul 17 '05 #2
li*************@yahoo.com (John Smith) wrote in message news:<24**************************@posting.google.com>...
Hello,

I have a rather odd question.
Odd indeed.
My company is an all java/oracle shop.
We do everything in Java... no matter what it is... parsing of text
files, messaging, GUI, you name it. My question is this... is Perl so
much better at parsing text files and outputting that we would see a
substantial speed increase? We process about 10 million records in
flat files a day for reformatting before putting them in a DB.
That might be something you could write some test cases for. regex
engines are complex enough that it can be difficult to predict speed
without trying things out on real data. And you are the one with the
real data you would be using it on.

There are some differences in worst case handling in current versions,
I don't remember what they are. You might go over to perl.org and hunt
for a mail list to get opinions of people with real experience with
perl. (I'm just a wannabee, myself.) They could also point you in the
right directions to get good sample code to work with.
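
A test case of that kind does not need to be elaborate. The sketch below is one possible starting point on the Java side, not a rigorous benchmark: the record layout, the pattern and the synthetic data generator are all made up, and the generated lines should be swapped for lines read from a real input file before drawing any conclusions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical timing harness; the record layout and pattern are made up.
class RegexTiming {

    // Assumed flat-file layout: numeric id, name, amount, separated by '|'.
    static final Pattern RECORD = Pattern.compile("^(\\d+)\\|([^|]*)\\|(\\d+\\.\\d{2})$");

    // Counts the lines that match; returning the count keeps the work observable.
    static int countMatches(List<String> lines) {
        Matcher m = RECORD.matcher("");
        int matched = 0;
        for (String line : lines) {
            if (m.reset(line).matches()) { // reuse one Matcher instead of allocating per line
                matched++;
            }
        }
        return matched;
    }

    // Stand-in for real data: substitute lines read from an actual input file.
    static List<String> syntheticLines(int n) {
        List<String> lines = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            lines.add(i + "|name" + i + "|" + (i % 1000) + ".50");
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = syntheticLines(1_000_000);
        countMatches(lines); // warm-up pass to give the JIT a chance to compile the loop
        long start = System.nanoTime();
        int matched = countMatches(lines);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(matched + " records matched in " + elapsedMs + " ms");
    }
}
```

Running the equivalent loop in Perl against the same file gives a like-for-like number, which is more useful than any general claim about either regex engine.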
Also, when it comes to Unix threading... which one would be better off
Java or perl?
Which UNIX? Which libraries?

Which Java? come to think of it ...
Essentially, we would break the 10 million down into 10
files... each file is done in a separate thread... The program also
has to keep a hashmap of keys to make sure we don't include duplicate
records and it must connect to oracle every once in a while... is
switching to perl worth it considering the investment and know-how we
have in java? This is the only portion of the code we would consider
switching to perl...


Not knowing why you are asking these questions makes it difficult to
come up with reasons you might want to investigate perl. But I think
it could be worth your while to investigate, just because I personally
find perl easier to write in (whether clean code or sloppy), and also
because CPAN has a huge depository of useful stuff not yet matched by
the Java community. RegExes are not the half of what perl is.

Switching, of course, is the wrong question to ask. Think in terms of
filling out your toolbox a bit.

JDZ
Jul 17 '05 #3
jo*******************@yahoo.com (Joseph Daniel Zukiger) wrote in message news:<d1**************************@posting.google.com>...
li*************@yahoo.com (John Smith) wrote in message news:<24**************************@posting.google.com>...
Hello,

I have a rather odd question.
Odd indeed.
My company is an all java/oracle shop.
We do everything in Java... no matter what it is... parsing of text
files, messaging, GUI, you name it. My question is this... is Perl so
much better at parsing text files and outputting that we would see a
substantial speed increase? We process about 10 million records in
flat files a day for reformatting before putting them in a DB.


That might be something you could write some test cases for. regex
engines are complex enough that it can be difficult to predict speed
without trying things out on real data. And you are the one with the
real data you would be using it on.

There are some differences in worst case handling in current versions,
I don't remember what they are. You might go over to perl.org and hunt
for a mail list to get opinions of people with real experience with
perl. (I'm just a wannabee, myself.) They could also point you in the
right directions to get good sample code to work with.
Also, when it comes to Unix threading... which one would be better off
Java or perl?


Which UNIX? Which libraries?

Which Java? come to think of it ...


We would be deploying the solution on a linux box, a suse, redhat,
etc...

And it would be using Java 1.4.2 (1.5 once it gets to a 1.5.1) or perl
5.8...

Essentially, we would break the 10 million down into 10
files... each file is done in a separate thread... The program also
has to keep a hashmap of keys to make sure we don't include duplicate
records and it must connect to oracle every once in a while... is
switching to perl worth it considering the investment and know-how we
have in java? This is the only portion of the code we would consider
switching to perl...


Not knowing why you are asking these questions makes it difficult to
come up with reasons you might want to investigate perl. But I think
it could be worth your while to investigate, just because I personally
find perl easier to write in (whether clean code or sloppy), and also
because CPAN has a huge depository of useful stuff not yet matched by
the Java community. RegExes are not the half of what perl is.

Switching, of course, is the wrong question to ask. Think in terms of
filling out your toolbox a bit.


Honestly, I am the type of person who hates to get away from something
I know, but for some reason this task seemed like a good reason to
check out something else... Co-workers suggested we write it in C, but
perl is just easier on the eyes and I basically said we go perl or
stay java. The other issue is that Java in general runs best on Solaris...
but when we were given a linux box to do this stuff we started asking
if we should stick with java for this specific task.
ideas?
JDZ

Jul 17 '05 #4
I don't have a lot of experience writing multithreaded applications
in Perl. If multiple threads are important, I haven't a clue about
how well it would stack up compared to Java. Probably not as well.

But Perl might be a way to go. Perl has a more flexible object model
than Java does. Once you get used to it, it's easy to like.

I've reviewed Java 1.4's regular expression capabilities. It is
a clean implementation but it doesn't approach what you can do in Perl.

The Jakarta ORO package gives you all the regular expression
stuff you get with Perl. It is also less verbose than Java 1.4.

If you develop something in Perl, then later decide to translate
it line for line into Java, ORO lets you do that.

Jul 17 '05 #5
li*************@yahoo.com (John Smith) wrote in message news:<24**************************@posting.google.com>...
jo*******************@yahoo.com (Joseph Daniel Zukiger) wrote in message news:<d1**************************@posting.google.com>...
li*************@yahoo.com (John Smith) wrote in message news:<24**************************@posting.google.com>...
Hello,

I have a rather odd question.
Odd indeed.
My company is an all java/oracle shop.
We do everything in Java... no matter what it is... parsing of text
files, messaging, GUI, you name it. My question is this... is Perl so
much better at parsing text files and outputting that we would see a
substantial speed increase? We process about 10 million records in
flat files a day for reformatting before putting them in a DB.


That might be something you could write some test cases for. regex
engines are complex enough that it can be difficult to predict speed
without trying things out on real data. And you are the one with the
real data you would be using it on.

There are some differences in worst case handling in current versions,
I don't remember what they are. You might go over to perl.org and hunt
for a mail list to get opinions of people with real experience with
perl. (I'm just a wannabee, myself.) They could also point you in the
right directions to get good sample code to work with.
Also, when it comes to Unix threading... which one would be better off
Java or perl?


Which UNIX? Which libraries?

Which Java? come to think of it ...


We would be deploying the solution on a linux box, a suse, redhat,
etc...

And it would be using Java 1.4.2 (1.5 once it gets to a 1.5.1) or perl
5.8...


Well, perl 6 looks to be not nearly as close as Java 5. Not sure what
that means, although perl 6 is supposed to clean perl up a bit and
java 5 is supposed to cover some of the inflexibility that has made it
verbose.
Essentially, we would break the 10 million down into 10
files... each file is done in a separate thread... The program also
has to keep a hashmap of keys to make sure we don't include duplicate
records and it must connect to oracle every once in a while... is
switching to perl worth it considering the investment and know-how we
have in java? This is the only portion of the code we would consider
switching to perl...


Not knowing why you are asking these questions makes it difficult to
come up with reasons you might want to investigate perl. But I think
it could be worth your while to investigate, just because I personally
find perl easier to write in (whether clean code or sloppy), and also
because CPAN has a huge depository of useful stuff not yet matched by
the Java community. RegExes are not the half of what perl is.

Switching, of course, is the wrong question to ask. Think in terms of
filling out your toolbox a bit.


Honestly, I am the type of person who hates to get away from something
I know, but for some reason this task seemed like a good reason to
check out something else... co-workers suggested we write it in C, but
perl is just easier on the eyes and I basically said we go perl or
stay java.


Interesting thought.
The other issue is that Java in general runs best on Solaris...
but when we were given a linux box to do this stuff we started asking
if we should stick with java for this specific task.
If you're using RedHat supported stuff, I hear you'll need to compile
perl yourselves to get the full advantage of 5.8. No big deal, I do it
myself from time to time. If you are used to handling version upgrades
in Java, you'll have some idea of what to expect.
ideas?


Well, look around CPAN and the MLs you can find at perl.org, and see
what you find that looks related to the task. Other than that, still
too abstract for me to say much.
Jul 17 '05 #6
My rule of thumb is that if the code is short (~ 100 lines), I
do it in Perl. Else, I do it in Java. Java is better with regard
to code readability.

Binh

Jul 17 '05 #7