Bytes IT Community

Java vs Perl for specific tasks

Hello,

I have a rather odd question. My company is an all Java/Oracle shop.
We do everything in Java... no matter what it is... parsing of text
files, messaging, GUI, you name it. My question is this... is Perl so
much better at parsing text files and outputting that we would see a
substantial speed increase? We process about 10 million records in
flat files a day for reformatting before putting them in a DB.

Also, when it comes to Unix threading... which one would be better,
Java or Perl? Essentially, we would break the 10 million down into 10
files... each file is done in a separate thread... The program also
has to keep a hashmap of keys to make sure we don't include duplicate
records, and it must connect to Oracle every once in a while... is
switching to Perl worth it considering the investment and know-how we
have in Java? This is the only portion of the code we would consider
switching to Perl...

ideas?

Thanks,

Joie
Jul 17 '05 #1
6 Replies


John Smith wrote:
I have a rather odd question. My company is an all Java/Oracle shop.
We do everything in Java... no matter what it is... parsing of text
files, messaging, GUI, you name it. My question is this... is Perl so
much better at parsing text files and outputting that we would see a
substantial speed increase? We process about 10 million records in
flat files a day for reformatting before putting them in a DB.

Parsing text with regular expressions is built into Perl; regular
expressions are part of the language itself. Ten years ago that was
quite unique.

With Java 1.4, regular expressions are also built into Java
(java.util.regex), or else you could use a regexp library from Apache.

So the only difference left is syntax. With Java you have to type more
code.

Perl also has some very nice regex features, like non-greedy matching,
but by now the other regex libraries have adopted the Perl features.

So, all things considered, Perl does not have much of an advantage over Java.
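
To make the point about non-greedy matching concrete, here is a minimal sketch using java.util.regex. The record layout (pipe-separated fields, some of them quoted) and the class name QuotedField are made up for illustration; they are not from the original poster's data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example: pull the first quoted field out of a flat-file record.
class QuotedField {
    // Non-greedy: ".*?" stops at the first closing quote.
    // The Perl equivalent would be /"(.*?)"/
    static final Pattern LAZY = Pattern.compile("\"(.*?)\"");

    // Greedy: ".*" runs to the last quote on the line.
    static final Pattern GREEDY = Pattern.compile("\"(.*)\"");

    // Returns the first captured group, or null if the pattern is not found.
    static String firstMatch(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "1001|\"Smith\"|\"NY\"|42.50";
        System.out.println(firstMatch(LAZY, line));   // Smith
        System.out.println(firstMatch(GREEDY, line)); // Smith"|"NY
    }
}
```

The Java version needs a Pattern, a Matcher, and escaped backslashes and quotes where Perl needs one line, which is the "more typing" difference mentioned above. Compiling the Pattern once and reusing it matters when the same expression runs over millions of records.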

Also, when it comes to Unix threading... which one would be better,
Java or Perl? Essentially, we would break the 10 million down into 10
files... each file is done in a separate thread... The program also
has to keep a hashmap of keys to make sure we don't include duplicate
records, and it must connect to Oracle every once in a while... is
switching to Perl worth it considering the investment and know-how we
have in Java? This is the only portion of the code we would consider
switching to Perl...


Java has nice thread support. Don't know about Perl.
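
As a rough sketch of what the Java side of that design could look like: one worker per input chunk, with a shared thread-safe set of keys to drop duplicates. Everything here is an assumption for illustration: the class name DedupWorkers, the "key is everything before the first '|'" record format, and in-memory lists standing in for the ten files. It also assumes a current JDK; java.util.concurrent only arrived with Java 5, so on 1.4.2 you would fall back to plain Thread objects and a synchronized set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DedupWorkers {
    // Hypothetical record format: the key is everything before the first '|'.
    static String keyOf(String record) {
        int bar = record.indexOf('|');
        return bar < 0 ? record : record.substring(0, bar);
    }

    // Each inner list stands in for one input file.
    // Returns how many records survive de-duplication across all chunks.
    static int countUnique(List<List<String>> chunks) {
        // Thread-safe set shared by all workers; add() returns false for a repeat key.
        Set<String> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, chunks.size()));
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<String> chunk : chunks) {
                results.add(pool.submit(() -> {
                    int kept = 0;
                    for (String record : chunk) {
                        if (seen.add(keyOf(record))) {
                            kept++; // first time this key was seen: keep the record
                        }
                    }
                    return kept;
                }));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                try {
                    total += result.get(); // wait for each worker to finish
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException("worker failed", e);
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Which copy of a duplicated record survives depends on thread timing, so if "first occurrence wins" matters across files, the de-duplication step would need to be ordered some other way.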

Edwin Martin

--
http://www.bitstorm.org/edwin/en/
Jul 17 '05 #2

li*************@yahoo.com (John Smith) wrote in message news:<24**************************@posting.google. com>...
Hello,

I have a rather odd question.
Odd indeed.
My company is an all Java/Oracle shop.
We do everything in Java... no matter what it is... parsing of text
files, messaging, GUI, you name it. My question is this... is Perl so
much better at parsing text files and outputting that we would see a
substantial speed increase? We process about 10 million records in
flat files a day for reformatting before putting them in a DB.

That might be something you could write some test cases for. Regex
engines are complex enough that it can be difficult to predict speed
without trying things out on real data. And you are the one with the
real data you would be using it on.

There are some differences in worst-case handling in current versions,
but I don't remember what they are. You might go over to perl.org and hunt
for a mailing list to get opinions from people with real experience with
Perl. (I'm just a wannabe, myself.) They could also point you in the
right direction to get good sample code to work with.
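
A test case along those lines might look like the sketch below, which times two ways of extracting a record key: a precompiled regex split versus a plain indexOf. The class name ParseTiming, the synthetic record format, and the record count are all assumptions; this is naive wall-clock timing, and the numbers only mean something when run against your real data and your real expressions.

```java
import java.util.regex.Pattern;

class ParseTiming {
    // Compiled once; recompiling per record would distort the comparison.
    private static final Pattern BAR = Pattern.compile("\\|");

    // Regex approach: split into at most two pieces and keep the first.
    static String keyByRegex(String record) {
        return BAR.split(record, 2)[0];
    }

    // Plain string approach: everything before the first '|'.
    static String keyByIndexOf(String record) {
        int bar = record.indexOf('|');
        return bar < 0 ? record : record.substring(0, bar);
    }

    // Synthetic stand-in for real data, e.g. "17|name17|17.00".
    static String[] sampleRecords(int n) {
        String[] records = new String[n];
        for (int i = 0; i < n; i++) {
            records[i] = i + "|name" + i + "|" + (i % 100) + ".00";
        }
        return records;
    }

    public static void main(String[] args) {
        String[] records = sampleRecords(1_000_000);
        // Repeat a few rounds so the JIT has warmed up before you trust a number.
        for (int round = 1; round <= 5; round++) {
            long t0 = System.nanoTime();
            long regexChars = 0;
            for (String r : records) regexChars += keyByRegex(r).length();
            long t1 = System.nanoTime();
            long indexChars = 0;
            for (String r : records) indexChars += keyByIndexOf(r).length();
            long t2 = System.nanoTime();
            System.out.printf("round %d: regex %d ms, indexOf %d ms (checksums %d / %d)%n",
                    round, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000,
                    regexChars, indexChars);
        }
    }
}
```

The same loop written in Perl and run over the same file gives a like-for-like comparison; the checksums are there so both variants do real work and can be checked against each other.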

Also, when it comes to Unix threading... which one would be better,
Java or Perl?

Which UNIX? Which libraries?

Which Java? Come to think of it...

Essentially, we would break the 10 million down into 10
files... each file is done in a separate thread... The program also
has to keep a hashmap of keys to make sure we don't include duplicate
records, and it must connect to Oracle every once in a while... is
switching to Perl worth it considering the investment and know-how we
have in Java? This is the only portion of the code we would consider
switching to Perl...


Not knowing why you are asking these questions makes it difficult to
come up with reasons you might want to investigate Perl. But I think
it could be worth your while to investigate, just because I personally
find Perl easier to write in (whether clean code or sloppy), and also
because CPAN has a huge repository of useful stuff not yet matched by
the Java community. Regexes are not the half of what Perl is.

Switching, of course, is the wrong question to ask. Think in terms of
filling out your toolbox a bit.

JDZ
Jul 17 '05 #3

jo*******************@yahoo.com (Joseph Daniel Zukiger) wrote in message news:<d1**************************@posting.google. com>...
li*************@yahoo.com (John Smith) wrote in message news:<24**************************@posting.google. com>...
Also, when it comes to Unix threading... which one would be better off
Java or perl?


Which UNIX? Which libraries?

Which Java? Come to think of it...


We would be deploying the solution on a Linux box (SUSE, Red Hat,
etc.)

And it would be using Java 1.4.2 (1.5 once it gets to a 1.5.1) or Perl
5.8...


Switching, of course, is the wrong question to ask. Think in terms of
filling out your toolbox a bit.


Honestly, I am the type of person who hates to get away from something
I know, but for some reason this task seemed like a good reason to
check out something else... co-workers suggested we write it in C, but
Perl is just easier on the eyes, and I basically said we go Perl or
stay with Java. The other issue is that Java in general runs best on Solaris...
but when we were given a Linux box to do this stuff, we started asking
if we should stick with Java for this specific task.

Jul 17 '05 #4

I haven't a lot of experience writing multithreaded applications
in Perl. If multiple threads are important, I haven't a clue about
how well it would stack up compared to Java. Probably not as well.

But Perl might be a way to go. Perl has a more flexible object model
than Java does. Once you get used to it, it's easy to like.

I've reviewed Java 1.4's regular expression capabilities. It is
a clean implementation, but it doesn't approach what you can do in Perl.

The Jakarta ORO package gives you all the regular expression
stuff you get with Perl. It is also less verbose than Java 1.4.

If you develop something in Perl and later decide to translate
it line for line into Java, ORO lets you do that.


Jul 17 '05 #5

li*************@yahoo.com (John Smith) wrote in message news:<24**************************@posting.google. com>...
jo*******************@yahoo.com (Joseph Daniel Zukiger) wrote in message news:<d1**************************@posting.google. com>...
li*************@yahoo.com (John Smith) wrote in message news:<24**************************@posting.google. com>...

We would be deploying the solution on a Linux box (SUSE, Red Hat,
etc.)

And it would be using Java 1.4.2 (1.5 once it gets to a 1.5.1) or Perl
5.8...


Well, Perl 6 looks to be not nearly as close to release as Java 5. Not sure what
that means, although Perl 6 is supposed to clean Perl up a bit and
Java 5 is supposed to address some of the inflexibility that has made Java
verbose.


Honestly, I am the type of person who hates to get away from something
I know, but for some reason this task seemed like a good reason to
check out something else... co-workers suggested we write it in C, but
perl is just easier on the eyes and I basically said we go perl or
stay java.


Interesting thought.

The other issue is that Java in general runs best on Solaris...
but when we were given a Linux box to do this stuff, we started asking
if we should stick with Java for this specific task.

If you're using Red Hat-supported packages, I hear you'll need to compile
Perl yourselves to get the full advantage of 5.8. No big deal, I do it
myself from time to time. If you are used to handling version upgrades
in Java, you'll have some idea of what to expect.
ideas?


Well, look around CPAN and the mailing lists you can find at perl.org, and see
what you find that looks related to the task. Other than that, it's still
too abstract for me to say much.
Jul 17 '05 #6

My rule of thumb is that if the code is short (~100 lines), I
do it in Perl. Otherwise, I do it in Java. Java is better with regard
to code readability.

Binh


Jul 17 '05 #7
