Reading Java byte[] data stream over standard input

Hello,
I am using Hadoop Streaming with a BinaryInputStream. This basically
sends a stream of bytes (the Java type is `private byte[] bytes`) to
my Python program.

I ran a test like this:

    import sys

    # Read stdin in 100-byte chunks until end of input.
    while 1:
        x = sys.stdin.read(100)
        if x:
            print x
        else:
            break

Now, the incoming data is binary (though mine is actually just ASCII
text), but the output is not what I expect. I expect, e.g.,

all/86000/114.310.151.209.60370-121.110.5.176.113\n62485.9718
118.010.241.12 60370 128.210.5.176

However, I get a 1 before "all" and a 4 just after the \n and before the 6.

My question is: how do I read binary data (Java's byte stream) from
stdin? Or is this actually what I'm getting?

Thanks
Sapsi
Jun 27 '08 #1
I should also mention that several binary values are popping up in
between the text. This behavior is not expected for the input stream.

Thanks
Sapsi
Jun 27 '08 #2
On Sun, 18 May 2008 22:11:33 -0700, sapsi wrote:
> I am using Hadoop Streaming with a BinaryInputStream. This basically
> sends a stream of bytes (the Java type is `private byte[] bytes`) to
> my Python program.
>
> I ran a test like this:
>
>     while 1:
>         x = sys.stdin.read(100)
>         if x:
>             print x
>         else:
>             break
>
> Now, the incoming data is binary (though mine is actually just ASCII
> text), but the output is not what I expect. I expect, e.g.,
>
> all/86000/114.310.151.209.60370-121.110.5.176.113\n62485.9718
> 118.010.241.12 60370 128.210.5.176
>
> However, I get a 1 before "all" and a 4 just after the \n and before the 6.
>
> My question is: how do I read binary data (Java's byte stream) from
> stdin? Or is this actually what I'm getting?

If there's extra data in `x` then it was sent to stdin. Maybe there's
some extra information like string length, Java type information, or
checksums encoded in that data!?

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #3
Yes, that could be the case. Browsing through Hadoop's source, I see
that stdin in the above code is reading from a piped Java DataOutputStream.
I read about a library on the net, Javadata.py, that reads this, but it
has disappeared.
What is involved in reading from a DataOutputStream?

Thank you
Sapsi
Jun 27 '08 #4
On Mon, 19 May 2008 00:14:25 -0700, sapsi wrote:
> Yes, that could be the case. Browsing through Hadoop's source, I see
> that stdin in the above code is reading from a piped Java DataOutputStream.
> I read about a library on the net, Javadata.py, that reads this, but it
> has disappeared.
> What is involved in reading from a DataOutputStream?

According to the Java docs of `DataInput` and `DataOutput` it is quite
simple. Most methods just seem to write the necessary bytes for the
primitive types except `writeUTF()` which prefixes the string data with
length information.

So if you are not writing Strings, then Hadoop itself seems to be adding
some extra information to the stream.
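
To make the `DataOutput` layout concrete, here is a minimal Python 3 sketch. It assumes the bytes really were produced by `writeUTF()` and `writeInt()`, which this thread does not confirm for Hadoop. The helper names `read_exact`, `read_utf` and `read_int` are made up for illustration and the sample bytes are invented.

```python
import io
import struct


def read_exact(stream, n):
    """Read exactly n bytes from a binary stream or raise EOFError."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("expected %d bytes, got %d" % (n, len(data)))
    return data


def read_int(stream):
    # DataOutput.writeInt() writes 4 bytes, big-endian, signed.
    return struct.unpack(">i", read_exact(stream, 4))[0]


def read_utf(stream):
    # DataOutput.writeUTF() writes a 2-byte unsigned big-endian byte count,
    # followed by that many bytes of Java's "modified UTF-8".
    (length,) = struct.unpack(">H", read_exact(stream, 2))
    raw = read_exact(stream, length)
    # Assumption: the text is plain ASCII, where modified UTF-8 and
    # standard UTF-8 are identical. NUL and characters outside the BMP
    # are encoded differently by Java and would need extra handling.
    return raw.decode("utf-8")


if __name__ == "__main__":
    # Invented sample: writeUTF("all") followed by writeInt(86000).
    sample = io.BytesIO(b"\x00\x03all" + struct.pack(">i", 86000))
    print(read_utf(sample), read_int(sample))
```

For real input, the same helpers could be pointed at `sys.stdin.buffer` instead of the `io.BytesIO` sample. If the stray bytes do not fit this layout, they are probably framing added by Hadoop rather than by `DataOutputStream`.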

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #5
On 19 May, 06:11, sapsi <saptarshi.g...@gmail.com> wrote:
> Hello,
> I am using Hadoop Streaming with a BinaryInputStream. This basically
> sends a stream of bytes (the Java type is `private byte[] bytes`) to
> my Python program.
>
> I ran a test like this:
>
>     while 1:
>         x = sys.stdin.read(100)
>         if x:
>             print x
>         else:
>             break
>
> Now, the incoming data is binary (though mine is actually just ASCII
> text), but the output is not what I expect. I expect, e.g.,
>
> all/86000/114.310.151.209.60370-121.110.5.176.113\n62485.9718
> 118.010.241.12 60370 128.210.5.176
>
> However, I get a 1 before "all" and a 4 just after the \n and before the 6.
>
> My question is: how do I read binary data (Java's byte stream) from
> stdin? Or is this actually what I'm getting?
>
> Thanks
> Sapsi

In the past I've sent binary data to a Java applet that read it through
a DataInputStream, using xdrlib from the standard library. I'd expect
that it would work in the reverse direction too, so I suggest you have a
look at that.
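
A rough sketch of why the XDR approach lines up with Java for fixed-width values: XDR encodes an int as 4 big-endian bytes and a double as 8 big-endian IEEE 754 bytes, which is the same layout `writeInt()` and `writeDouble()` use. Since `xdrlib` was deprecated in Python 3.11 and removed in 3.13, the sketch below uses `struct` instead. The function names are hypothetical and the values are taken from the sample data only as illustration.

```python
import struct


def pack_int(value):
    # Same bytes as an XDR int and as Java's DataOutput.writeInt().
    return struct.pack(">i", value)


def pack_double(value):
    # Same bytes as an XDR double and as Java's DataOutput.writeDouble().
    return struct.pack(">d", value)


def unpack_int(data, offset=0):
    # Reverse direction: decode 4 big-endian bytes written by Java.
    return struct.unpack_from(">i", data, offset)[0]


def unpack_double(data, offset=0):
    # Reverse direction: decode 8 big-endian IEEE 754 bytes.
    return struct.unpack_from(">d", data, offset)[0]


if __name__ == "__main__":
    payload = pack_int(60370) + pack_double(62485.9718)
    print(unpack_int(payload), unpack_double(payload, 4))
```

Strings are where the two formats part ways: XDR uses a 4-byte length and pads to a multiple of 4 bytes, while `writeUTF()` uses a 2-byte length and no padding, so an XDR string decoder would not read `writeUTF()` output correctly.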

Giles
Jun 27 '08 #6
sapsi wrote:
> I should also mention that several binary values are popping up in
> between the text. This behavior is not expected for the input stream.
>
> > Now, the incoming data is binary (though mine is actually just ASCII
> > text), but the output is not what I expect. I expect, e.g.,
> >
> > all/86000/114.310.151.209.60370-121.110.5.176.113\n62485.9718
> > 118.010.241.12 60370 128.210.5.176
> >
> > However, I get a 1 before "all" and a 4 just after the \n and before the 6.
> >
> > My question is: how do I read binary data (Java's byte stream) from
> > stdin? Or is this actually what I'm getting?

Consider changing `print x` to `print repr(x)`. That gives you a much
better chance of understanding what the extra or unexpected bytes are.
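
For anyone trying this on Python 3, a minimal sketch of the same idea, assuming the stream is read in binary mode so that no decoding or newline translation hides the stray bytes. `dump_chunks` is a hypothetical helper name, not something from the thread.

```python
import io


def dump_chunks(stream, size=100):
    """Yield repr() of each chunk read from a binary stream.

    repr() shows non-printable bytes as escapes such as \\x01,
    so length prefixes or type markers become visible.
    """
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield repr(chunk)


if __name__ == "__main__":
    # Invented sample bytes, only to show what the output looks like.
    sample = io.BytesIO(b"\x01all/86000\n\x0462485")
    for line in dump_chunks(sample):
        print(line)
```

To inspect real input, pass `sys.stdin.buffer` (after `import sys`) instead of the `io.BytesIO` sample. It is the binary view of stdin in Python 3.
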
Jun 27 '08 #7
