Connecting Tech Pros Worldwide Help | Site Map

what random binary data looks like

  #1  
Old April 16th, 2009, 08:35 PM
Member
 
Join Date: Apr 2007
Posts: 53
We are always being told random binary data cannot be compressed,so i'm wondering what does random data look like.I think it would consist of short runs of 1's and 0's.Any ideas.Thanks
  #2  
Old April 16th, 2009, 08:46 PM
Dormilich's Avatar
Moderator
 
Join Date: Aug 2008
Location: Leipzig, Germany
Posts: 3,485
Provided Answers: 9

re: what random binary data looks like


in the end—just a totally random distribution of 1's and 0's. if you can't make out any pattern, there is no way for an algorithm to reduce its complexity.
  #3  
Old April 16th, 2009, 09:06 PM
JosAH's Avatar
Expert
 
Join Date: Mar 2007
Posts: 10,634
Provided Answers: 2

re: what random binary data looks like


Quote:
Originally Posted by dynamo View Post
We are always being told random binary data cannot be compressed,so i'm wondering what does random data look like.I think it would consist of short runs of 1's and 0's.Any ideas.Thanks
In an infinit sequence of random bits any finit sub-sequence is as likely as the other; this means that short runs of 1's and 0's are as likely as long runs of 1's or 0's. If you represent those bits on a 2 dimensional matrix it'll look like white noise you see on a television screen without a signal (on the old ones it does ;-)

kind regards,

Jos
  #4  
Old April 18th, 2009, 01:15 PM
Member
 
Join Date: Apr 2007
Posts: 53

re: what random binary data looks like


Quote:
Originally Posted by JosAH View Post
In an infinit sequence of random bits any finit sub-sequence is as likely as the other; this means that short runs of 1's and 0's are as likely as long runs of 1's or 0's. If you represent those bits on a 2 dimensional matrix it'll look like white noise you see on a television screen without a signal (on the old ones it does ;-)

kind regards,

Jos
thanks for replying.I agree with your description of truly random data, but if one applied an rle compression algorithm( or most other ones) to data don't you expect that the compressed data will consist of short runs of 1's and zero's.
  #5  
Old April 21st, 2009, 05:27 PM
Moderator
 
Join Date: Mar 2006
Posts: 1,103

re: what random binary data looks like


I think we should be careful to define 'random'.
If we mean random, as in randomly generated, there is a good chance that the bytes can be compressed.

If we mean random, as in entropy, chaos, then a sequence of bytes with maximum entropy cannot be compressed.


Prefer usage of 'unordered' to 'random' in our current context.
  #6  
Old June 8th, 2009, 04:35 PM
RedSon's Avatar
Site Moderator
 
Join Date: Jan 2007
Location: America
Posts: 3,362

re: what random binary data looks like


Quote:
Originally Posted by jkmyoung View Post
I think we should be careful to define 'random'.
If we mean random, as in randomly generated, there is a good chance that the bytes can be compressed.

If we mean random, as in entropy, chaos, then a sequence of bytes with maximum entropy cannot be compressed.


Prefer usage of 'unordered' to 'random' in our current context.
I think it's safe to assume that the OP meant 'actual' randomness, as in entropy, given that any pseudo randomly generated number or string of numbers could be compressed into it's original generation formula.
  #7  
Old June 8th, 2009, 09:23 PM
Banfa's Avatar
AdministratorVoR
 
Join Date: Feb 2006
Location: South West UK
Posts: 6,117
Provided Answers: 6

re: what random binary data looks like


Quote:
Originally Posted by dynamo View Post
thanks for replying.I agree with your description of truly random data, but if one applied an rle compression algorithm( or most other ones) to data don't you expect that the compressed data will consist of short runs of 1's and zero's.
I am not quite sure what you are trying to say but if you applied a RLE algorithm to a truly random stream of data then your "compressed" data would most likely be larger than the original stream.

Treating the data as a stream of bytes remember in RLE you either have the RLE every byte or you have to mark the the sequences that are RLE. In the first case given any byte to encode then 255/256 (or 99.6%) of the time the byte that follows is different and I have to encode 1 byte into 2. With other 0.4% of the time if the byte following the second byte is different then 2 bytes is encoded in 2 bytes so there is no compression. It is only if I have 3 bytes (or more) the same that I actually get compression. There is only a 0.0015% chance of this.

Similarly if I choose to mark the compressed runs of data using some byte value M then a minimum RLE code is 3 bytes (M, 1 byte containing the count, 1 byte containing the value). That means that to compress the data I need a run of at least 4 bytes which only occurs 0.000006% of the time. And offseting that every time M appears in the original stream without M following it (about 0.38% probability) I have to find a way to mark it, normally by repeating it twice in the output stream.

The point is that RLE only really works as a compression algorithm when you know that mainly what you are dealing with is long runs of data that is the same so that you are guaranteed good compression; random data does not meet this caveat you are equally likely to get short runs of data as long ones.


This basic point is true of every compression algorithm. Any given algorithm is tuned to compress data of a given pattern (or patterns). Data not of that pattern will at best not compress under the algorithm and in at least some worst cases the algorithm will cause the data to be expanded.

That is why if you have data you wish to compress it is important to choose the right algorithm for the patterns in the data.

Random binary data by its definition has no pattern and therefore you are unable to choose a algorithm that will compress it. Any algorithm you chose will probably compress some parts of the stream but it will also certainly expand other parts of it.
  #8  
Old June 9th, 2009, 09:31 AM
JosAH's Avatar
Expert
 
Join Date: Mar 2007
Posts: 10,634
Provided Answers: 2

re: what random binary data looks like


For an example of truly random byte values read this URL:

http://www.fourmilab.ch/cgi-bin/uncg...es=128&fmt=bin

it reads 128 bytes of random data (given some nutrino bombardment); it'll block sending new data after a couple of times (dos attack prevention).

kind regards,

Jos
Reply