Sending both binary data and strings over the same stream


I've bumped into a slight problem now, and I just don't seem to know how to fix it. What I want to do is the following:
Send over a socket:
1. Number of files to be sent (not as an integer, just as a string)
then for each file to be sent:
2. Length of Filename (again as a string)
3. Filename
4. File as binary data.

I grabbed my Core Java book and figured it'd be easily doable using a buffered data input/output stream, as I can cast it to a buffered data stream for the binary data in the file, and to the data stream for sending/receiving strings (is that even a valid way of thinking?)

The whole 'plan' just started crumbling down when I tried to implement it and it said that the DataInputStream method is deprecated. And I can't use a Reader as I must be able to receive binary data...

Now, I think it'd be a grave mistake (please do correct me if I'm wrong) to hand the same input/output stream over to a reader, and also construct a BufferedInput/OutputStream with the same input/output stream as is being used by the reader. And on second thought, the same problem probably occurs when wrapping the stream as a DataInputStream(new BufferedInputStream(socket.getInputStream())), so I'll just scratch that plan too...

So... I'm kinda stuck here... How can I send both binary data and strings over the same socket, without making it so complicated that it's really difficult to construct a correctly working C++ counterpart that uses the same protocol and can communicate with the Java implementation?

Thanks in advance.

Feb 8 '09 #1
Hi Tom!

I checked the DataInputStream API and it seems that the readLine() method is deprecated, but not the whole class.

Also I think, it shouldn't be a problem to reuse a stream in the way you suggested.

Oh, and for sending Strings, you could also use an ObjectOutputStream / ObjectInputStream (although that might be problematic when trying to create a C++ counterpart, I really don't know that).

Feb 8 '09 #2

Thanks for the reply :).

What I'm not sure about with the reusing of a stream is both the closing of the stream, and the buffered data.
If data enters on the original stream, is it copied to, for example both the buffer for the BufferedInputStream and to the buffer of a BufferedReader, or is it only present in one of the buffers ?
The latter case would be quite problematic as I would never know where my data is at. If it's the former then I should know exactly how many bytes to skip in each stream as I read from the other stream, but that shouldn't be too hard.
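To make the buffering question concrete, here is a minimal sketch (the class and method names are made up for illustration). It assumes the wire data is a text line followed by four binary bytes, and checks how much is left on the raw stream after a BufferedReader has read only the line. The data is not copied to both wrappers: whatever the reader pulls into its own buffer is gone from the underlying stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original posts.
final class ReadAheadDemo {

    // Returns how many bytes are still available on the raw stream after a
    // BufferedReader wrapped around it has read a single line.
    static int bytesLeftAfterReadLine(byte[] wire) throws IOException {
        InputStream raw = new ByteArrayInputStream(wire);
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(raw, StandardCharsets.US_ASCII));
        reader.readLine(); // consumes the line, plus whatever the reader read ahead
        return raw.available();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        wire.write("name.bin\n".getBytes(StandardCharsets.US_ASCII));
        wire.write(new byte[] {1, 2, 3, 4}); // stand-in for binary file content
        System.out.println("bytes left for a binary reader: "
                + bytesLeftAfterReadLine(wire.toByteArray()));
    }
}
```

If fewer than four bytes are reported as left, the reader has swallowed part of the binary payload, which is why mixing a Reader and a byte stream on the same socket is risky.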

Another thing I was thinking about is, how about if I take a pure byte stream and just cast it to chars for all the things I know are chars (for example until the result of a cast of 2 bytes results in a newline) and then interpret the next part as it is supposed to be.
But tbh, I could do that in C++, but I have no idea whatsoever how to do this in Java.

Do you have any ideas about that?

Thanks in advance,
Feb 9 '09 #3
If you just send the binary file you should be able to get both its name and length after sending the file, right?
Feb 9 '09 #4
But I can't send the Java File object, since we have to 'interface' with C++. Or isn't that what you meant?
The thing is also, we want it to be possible to send, for example, 10 files one after another without having to open up new connections.

So far what I'm thinking that might work is just sending bytes over and casting everything to chars, but I think it'll be a bit inefficient. But I think it might work...

So, just using a raw BufferedOutput/InputStream and getting the bytes from the strings, and then on the receiving end just cast every 1 or 2 bytes (according to the Java standard) to a character (if that's possible, sigh, C++ is so much easier :P) and separate based on newlines that way.

When the protocol then says that the next part should be interpreted as a binary file, e.g. after reading the file length using the method described above, I don't interpret any of the bytes and just write them to a file, until the whole file has been received. Then I start casting again for the next filename length.

Would that be a doable approach ?

Thanks in advance
Feb 9 '09 #5
No need to over-complicate things: when Strings are written/read by a stream those Strings are encoded/decoded. ASCII Strings (each char <= 0x7f) encode to a single byte per char in UTF-8. UTF-8 decoding (on the C++ side) isn't much trouble either; I suggest a simple protocol:

0,1: file name length (in bytes) in big-endian format
2 ... n: file name UTF-8 encoded
n+1 ... n+4: length of the file in big-endian format
n+5 ...: binary content of the file

The C++ end shouldn't have any trouble with this data format. All you need is a simple OutputStream on the Java sending side. The String class itself can take care of the encoding (UTF-8)
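The layout above could be written on the Java side roughly like this. It is only a sketch under a few assumptions: the class and method names are hypothetical, the file content is already in memory as a byte array, and StandardCharsets (Java 7 and later) is available. DataOutputStream writes short and int values in big-endian order, which matches the format suggested above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the names are illustrative only.
final class FileFrameWriter {

    // Writes one file as: 2-byte name length, UTF-8 name,
    // 4-byte content length, raw content.
    static void writeFile(OutputStream out, String fileName, byte[] content)
            throws IOException {
        byte[] nameBytes = fileName.getBytes(StandardCharsets.UTF_8);
        if (nameBytes.length > 0xFFFF) {
            throw new IOException("File name too long for a 2-byte length field");
        }
        DataOutputStream data = new DataOutputStream(out);
        data.writeShort(nameBytes.length); // bytes 0,1: big-endian name length
        data.write(nameBytes);             // bytes 2..n: UTF-8 encoded name
        data.writeInt(content.length);     // next 4 bytes: big-endian file length
        data.write(content);               // binary content, written untouched
        data.flush();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeFile(sink, "a.txt", new byte[] {1, 2, 3});
        System.out.println(sink.size() + " bytes written");
    }
}
```

Note that the length field counts the encoded bytes of the name, not the number of chars in the String; the two only coincide for ASCII names.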

kind regards,

Feb 9 '09 #6
Oh, the standard encoded size of a char is 1 byte ? I thought it was 2 bytes... If it's 1 byte that indeed simplifies the matter a bit. And I didn't know the String class itself took care of the encoding.
Thanks a lot Jos, I'll try and get it fixed that way. I'll post here if another corpse jumps out of the closet on my line of thought.

Thanks a lot already !

Feb 10 '09 #7
The UTF-8 encoding scheme encodes the characters 0x00 - 0x7f to single bytes in the same range: 0x00 - 0x7f. All the ASCII characters happen to be in that range, so they get encoded to themselves.

Internally a char takes up two bytes in Java; all chars do. When you write them to an OutputStream they are encoded because streams write bytes, not chars.
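A small sketch of the difference between the number of chars in a String and the number of bytes it encodes to (the class name and the sample file names are made up for illustration):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original posts.
final class EncodedLengthDemo {

    // Number of bytes the string occupies on the wire when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "report.txt";
        String accented = "r\u00e9sum\u00e9.txt"; // contains two non-ASCII chars

        // ASCII: 10 chars encode to 10 bytes.
        System.out.println(ascii.length() + " chars, " + utf8Length(ascii) + " bytes");
        // Non-ASCII: 10 chars encode to 12 bytes, each accented char takes 2.
        System.out.println(accented.length() + " chars, " + utf8Length(accented) + " bytes");
    }
}
```

This is why a length prefix should count encoded bytes whenever non-ASCII names are possible.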

kind regards,

Feb 10 '09 #8

As it seems the people working on the C++ counterpart are not sending the lengths of the filenames and the lengths of the files are being sent as integers (4 bytes long).
Now, keeping in mind that characters are encoded in UTF-8 by default, I've thought up the following 'draft' implementation:

Both the fReader and fWriter are actually DataInput/OutputStreams.

    /*
     * (non-Javadoc)
     * 
     * @see firefile.shared.net.ISocket#readString()
     */
    public String readString() throws IOException {
        // The sender writes the string length first, as a 4-byte big-endian int.
        int length = this.fReader.readInt();

        System.out.print("Length received: ");
        System.out.println(length);

        // One byte per character: this only works for single-byte (ASCII) text.
        final char[] ch = new char[length];
        for (int i = 0; i < length; i++) {
            final int tmp = this.fReader.read();
            if (tmp == -1) {
                throw new IOException("Stream ended prematurely.");
            } else {
                ch[i] = (char) tmp;
            }
        }
        final String in = new String(ch);

        System.out.print("readString returned: ");
        System.out.println(in);

        return in;
    }

    /*
     * (non-Javadoc)
     * 
     * @see firefile.shared.net.ISocket#sendString(java.lang.String)
     */
    public void sendString(final String msg) throws IOException {
        // msg.length() equals the byte count only for ASCII text, because
        // writeBytes() sends just the low byte of each char.
        this.fWriter.writeInt(msg.length());
        this.fWriter.writeBytes(msg);
        // A single flush after the whole message is enough.
        this.flush();
    }
So, I'm thinking this will do what I want it to do.
Feb 10 '09 #9

I just wanted to let you know that I've found it :). The 'final' version (except for the debug output) is much like the above. I've used DataInputStreams and DataOutputStreams to be able to easily send and receive integers.
To receive characters, I just read the bytes one by one until all have been read ( as advertised by the length ) and cast them to characters which works nicely.
For the binary data I also read bytes one at a time but I just immediately write them using a BufferedOutputStream to a file. So all problems have been solved, and it's working perfectly with the c++ counterparts.
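Reading one byte at a time works, but the same transfer can be done in chunks. The following is only a sketch, not the code used in the project: the class and method names are hypothetical and the 8192-byte buffer size is an arbitrary choice. It copies exactly the advertised number of bytes and leaves the rest of the stream untouched for the next file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of the original posts.
final class BoundedCopy {

    // Copies exactly 'count' bytes from in to out, never reading past them.
    static void copyExactly(InputStream in, OutputStream out, long count)
            throws IOException {
        byte[] buffer = new byte[8192]; // assumed buffer size
        long remaining = count;
        while (remaining > 0) {
            int chunk = (int) Math.min(buffer.length, remaining);
            int read = in.read(buffer, 0, chunk);
            if (read == -1) {
                throw new EOFException("Stream ended with " + remaining + " bytes missing");
            }
            out.write(buffer, 0, read);
            remaining -= read;
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
        ByteArrayOutputStream file = new ByteArrayOutputStream();
        copyExactly(in, file, 6);
        System.out.println(file.size() + " copied, " + in.available() + " left on the stream");
    }
}
```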

Following code is the implementation for receiving a single file. (with debug output though)

  1. public void receiveFile(String uri) throws NetworkException, FileException {
  2.         System.out.println("receivefile() start");
  3.         try {            
  4.             //    length of filename + filename
  5.             final int fileNameLength = in.readInt();
  6.             System.out.print("Filename length: ");
  7.             System.out.println(fileNameLength);
  8.             final char[] ch = new char[fileNameLength];
  9.             for (int i = 0; i < fileNameLength; i++) {
  10.                 final int tmp = in.read();
  11.                 if (tmp == -1) {
  12.                     throw new IOException("End of stream prematurely ended.");
  13.                 } else {
  14.                     ch[i] = (char) tmp;
  15.                 }
  16.             }
  17.             String fileName = new String(ch);
  18.             fileName = resolveNameCollisions(uri, fileName);
  19.             System.out.print("Filename - after collision resolving: ");
  20.             System.out.println(fileName);
  21.             BufferedOutputStream bos = new BufferedOutputStream(new FileOutputStream(uri+pathSeparator+fileName)); 
  23.             // length of file + file
  24.             final int fileSize = in.readInt();
  25.             for(int i = 0; i < fileSize; ++i){
  26.                 final byte tmp = in.readByte();
  27.                 bos.write(tmp);
  28.             }
  29.             bos.flush();
  30.             bos.close();
  31.             System.out.println("recieveFile() - end");
  32.         } catch (IOException e) {
  33.             throw new NetworkException(e);
  34.         }        
  35.     }
I'd like to thank you a lot for all the help! You've really enlightened me on the whole character encoding problem, which was somewhat the largest 'black hole' for me, so thanks for shedding some light on that.
And thanks to r035198x and Nepomuk too !

greets !
Feb 10 '09 #10
Hi Tom!
Why not consider an encoding method such as ASN.1 or XML?
Feb 13 '09 #11

As a matter of fact, we do use XML for message passing. I don't know what ASN.1 is though.
It's just, we can't send files in XML messages...
We needed to send mixed datatypes (lengths, characters and binary data) over one socket.
Here, lengths also include the length of the XML message being sent.
Feb 13 '09 #12
Hi again!
Perhaps you could use TLV (Tag/Length/Value or Type/Length/Value) notation.
Typical usage: a tag (1 byte) for each datatype, a length as a 2-byte sequence (usually big-endian), and then the data itself. For exchanging strings, use String's getBytes() method. See example.txt for details.
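As a rough illustration of the TLV idea (this is not the content of example.txt; the class name and the tag values are invented for the sketch), one record could be written like this. It passes an explicit UTF-8 charset to getBytes() so that the result does not depend on the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical TLV helper; the tag values are assumptions, not a standard.
final class Tlv {

    static final int TAG_STRING = 1;
    static final int TAG_FILE = 2;

    // Writes one record: 1-byte tag, 2-byte big-endian length, then the value.
    static void write(DataOutputStream out, int tag, byte[] value) throws IOException {
        if (value.length > 0xFFFF) {
            throw new IOException("Value too long for a 2-byte length field");
        }
        out.writeByte(tag);
        out.writeShort(value.length);
        out.write(value);
    }

    // Encodes a string as a single TLV record, using UTF-8 for the value.
    static byte[] encodeString(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        write(new DataOutputStream(buffer), TAG_STRING,
                text.getBytes(StandardCharsets.UTF_8));
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(encodeString("hi").length + " bytes in the record");
    }
}
```

With a 2-byte length field a single value is limited to 65535 bytes, so larger files would need a wider length field or several records.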
Feb 13 '09 #13

Hm, that indeed does look nice :). Though the only difference from what we're doing now is the type field; our protocol specifies that the fields are sent in a fixed order instead, so it looks much the same as what we are doing :).

Only, for sending out integers we use a DataOutputStream sending 4-byte integers, big-endian. But the rest is quite the same.
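For reference, a small sketch of what a 4-byte big-endian integer looks like on the wire (the class and method names are hypothetical). The decodeInt method spells out the shift arithmetic that a non-Java peer would have to apply to the four received bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class, not part of the original posts.
final class BigEndianDemo {

    // Encodes an int the way DataOutputStream.writeInt does: high byte first.
    static byte[] encodeInt(int value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        new DataOutputStream(buffer).writeInt(value);
        return buffer.toByteArray();
    }

    // Rebuilds the int from four big-endian bytes by hand.
    static int decodeInt(byte[] b) {
        return ((b[0] & 0xFF) << 24)
                | ((b[1] & 0xFF) << 16)
                | ((b[2] & 0xFF) << 8)
                | (b[3] & 0xFF);
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = encodeInt(258); // 258 = 0x00000102
        System.out.println(wire[0] + " " + wire[1] + " " + wire[2] + " " + wire[3]);
        System.out.println(decodeInt(wire));
    }
}
```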

I don't think we'll be changing it anymore though, as it is working correctly as is for now, and we're all satisfied with how it works. And it's going perfectly between C++ and Java now, so if we keep it like that we can move on to the next steps, coding more business logic. I appreciate the help though, if only I had known that sooner :)

Thanks a lot.
Kind Regards,

Feb 14 '09 #14
Thanks for the kind words :)
But be careful with encodings. If filenames contain non-ASCII characters, on the C++ side you'll need special libraries to encode/decode these strings properly; they should exist for every platform.
Feb 14 '09 #15

Yeah, we figured we'd just go for the simplest policy of not supporting non-ASCII file names :P. It's not really a prerequisite either, so we can easily skimp on some of the difficulties there :).

Kind Regards,
Feb 14 '09 #16
If you change your lines #8 and #14 to 'byte' instead of 'char' your entire class is prepared for Unicode names. The sender has to send the length of the file name measured in bytes of the encoded name. Your String constructor will take care of the decoding (if the encoding and decoding match).
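A sketch of that change, under some assumptions: the class and method names are hypothetical, the sender is assumed to encode names as UTF-8 and to send the length as a 4-byte int counting encoded bytes, and the 4096-byte upper bound is an arbitrary sanity limit. readFully replaces the byte-by-byte loop and throws EOFException if the stream ends early.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the poster's actual class.
final class NameReader {

    // Reads a 4-byte length followed by that many UTF-8 encoded bytes.
    static String readName(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > 4096) { // assumed sanity limit
            throw new IOException("Implausible name length: " + length);
        }
        byte[] encoded = new byte[length]; // a byte array instead of a char array
        in.readFully(encoded);             // blocks until all bytes have arrived
        return new String(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] name = "r\u00e9sum\u00e9.txt".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        out.writeInt(name.length); // length in encoded bytes, not chars
        out.write(name);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readName(in));
    }
}
```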

kind regards,

Feb 15 '09 #17
    final char[] ch = new char[fileNameLength];
    for (int i = 0; i < fileNameLength; i++) {
        final int tmp = in.read();
        if (tmp == -1) {
            throw new IOException("End of stream prematurely ended.");
        } else {
            ch[i] = (char) tmp;
        }
    }
One thing I'm wondering about then is... How do I determine how long the array of bytes for the filename has to be ? I'm assuming I'd just arbitrarily have to use a multiplier depending on the encoding used.

But I'm glad to know it is actually that simple, when we've got the basics done, I'll pitch the idea to the other group members and see if we will implement it or not :). Cause Java isn't the problem with the encoding then, it'll depend on the C++ side then.
Feb 15 '09 #18
I assumed the sender sends the length of the file name (measured in bytes). According to your code you assume the same.

kind regards,

Feb 15 '09 #19
Hiiiiiiiiiiiii i want know how to post a query
Feb 15 '09 #20
Ah yeah, of course. My bad.
Thanks :)
Feb 15 '09 #21
