what random binary data looks like - Algorithms / Advanced Math

We are always being told that random binary data cannot be compressed, so I'm wondering what random data looks like. I think it would consist of short runs of 1's and 0's. Any ideas? Thanks.

Apr 16 '09 #1

Dormilich

8,658

Expert Mod 8TB

In the end, just a totally random distribution of 1's and 0's. If you can't make out any pattern, there is no way for an algorithm to reduce its complexity.

Apr 16 '09 #2

JosAH

11,448

Expert 8TB

@dynamo
In an infinite sequence of random bits, any finite sub-sequence is as likely as any other of the same length; this means that a long run of 1's or 0's is exactly as likely as any other particular pattern of that length, so long runs do turn up alongside the short ones. If you represent those bits on a 2-dimensional matrix it'll look like the white noise you see on a television screen without a signal (on the old ones it does ;-)
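
To see how run lengths are actually spread in random bits, here is a minimal sketch. It assumes Python, and it uses a seeded pseudo-random generator as a repeatable stand-in for a true random source; the function name `run_length_histogram` is made up for this illustration.

```python
import random
from collections import Counter
from itertools import groupby


def run_length_histogram(bits):
    """Count how many runs of each length occur in a sequence of bits."""
    return Counter(len(list(group)) for _, group in groupby(bits))


# Assumption: a seeded generator stands in for a true random source,
# only so that the example gives the same result on every run.
rng = random.Random(2009)
bits = [rng.getrandbits(1) for _ in range(100_000)]

histogram = run_length_histogram(bits)
total_runs = sum(histogram.values())
for length in sorted(histogram)[:8]:
    share = histogram[length] / total_runs
    print(f"runs of length {length}: {share:.3f} (ideal {0.5 ** length:.3f})")
```

For fair, independent bits the share of runs of length k should come out near 1/2^k, so about half of all runs have length 1 and long runs still appear, just less often.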

kind regards,

Jos

Apr 16 '09 #3

dynamo

@JosAH
Thanks for replying. I agree with your description of truly random data, but if one applied an RLE compression algorithm (or most other ones) to data, wouldn't you expect the compressed data to consist of short runs of 1's and 0's?

Apr 18 '09 #4

jkmyoung

2,057

Expert 2GB

I think we should be careful to define 'random'.
If we mean random, as in randomly generated, there is a good chance that the bytes can be compressed.

If we mean random, as in entropy, chaos, then a sequence of bytes with maximum entropy cannot be compressed.

Prefer usage of 'unordered' to 'random' in our current context.
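
One way to put a number on 'entropy' is the Shannon entropy of the byte frequencies. The sketch below assumes Python; `byte_entropy` is a hypothetical helper name, and it only looks at how often each byte value occurs, not at the order of the bytes.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Empirical Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(1/p) over the byte values that actually occur.
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )


print(byte_entropy(b"aaaaaaaa"))        # 0.0: a single symbol, fully predictable
print(byte_entropy(bytes(range(256))))  # 8.0: every byte value equally frequent
```

A value of 8 bits per byte is necessary for incompressibility but not sufficient: the bytes 0 to 255 in ascending order score 8.0 here and are still trivially describable, because this measure ignores ordering.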

Apr 21 '09 #5

RedSon

5,000

Expert 4TB

@jkmyoung
I think it's safe to assume that the OP meant 'actual' randomness, as in entropy, given that any pseudo-randomly generated number or string of numbers could be compressed into its original generation formula.
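
As a small illustration of that point, assuming Python's built-in Mersenne Twister generator and zlib as a typical general-purpose compressor (`pseudo_random_bytes` is a name invented for this sketch):

```python
import random
import zlib


def pseudo_random_bytes(seed: int, length: int) -> bytes:
    """Regenerate the same byte stream from a seed (Mersenne Twister)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(length))


seed, length = 12345, 10_000
stream = pseudo_random_bytes(seed, length)

# A general-purpose compressor finds no pattern it can exploit...
print(len(stream), "->", len(zlib.compress(stream, 9)))
# ...yet the pair (seed, length) reproduces every byte exactly.
print(pseudo_random_bytes(seed, length) == stream)
```

The "compressed form" here is the generator plus two small integers, which only works because the receiver knows exactly which generation formula was used.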

Jun 8 '09 #6

Banfa

9,065

Expert Mod 8TB

@dynamo
I am not quite sure what you are trying to say, but if you applied an RLE algorithm to a truly random stream of data then your "compressed" data would most likely be larger than the original stream.

Treating the data as a stream of bytes, remember that in RLE you either have to run-length encode every byte or you have to mark the sequences that are RLE. In the first case, given any byte to encode, 255/256 (or 99.6%) of the time the byte that follows is different and I have to encode 1 byte into 2. The other 0.4% of the time, if the byte following the second byte is different, then 2 bytes are encoded in 2 bytes, so there is no compression. It is only if I have 3 bytes (or more) the same that I actually get compression. There is only a 0.0015% chance of this.
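
The first case can be tried out with a minimal sketch. It assumes Python, a (count, value) byte pair for every run with counts capped at 255, and a seeded pseudo-random stream standing in for truly random data; `rle_encode` and `rle_decode` are names made up for the illustration.

```python
import random


def rle_encode(data: bytes) -> bytes:
    """Encode every run as a (count, value) byte pair; counts cap at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Invert rle_encode by expanding each (count, value) pair."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        out += bytes([encoded[k + 1]]) * encoded[k]
    return bytes(out)


# Assumption: seeded pseudo-random bytes stand in for a truly random stream.
rng = random.Random(7)
noise = bytes(rng.getrandbits(8) for _ in range(100_000))
print(len(rle_encode(noise)) / len(noise))   # close to 2: the stream nearly doubles
print(len(rle_encode(b"\x00" * 100_000)))    # one long run shrinks to 786 bytes
```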

Similarly, if I choose to mark the compressed runs of data using some byte value M then a minimum RLE code is 3 bytes (M, 1 byte containing the count, 1 byte containing the value). That means that to compress the data I need a run of at least 4 bytes, which only occurs 0.000006% of the time. And offsetting that, every time M appears in the original stream without M following it (about 0.39% probability) I have to find a way to mark it, normally by writing it twice in the output stream.
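
A sketch of the marker variant, under stated assumptions: Python, M chosen as 0x00, runs of 4 to 255 bytes written as (M, count, value), and a literal M written as M M. Picking 0x00 is a convenience of this sketch, since a run count is never 0 and so the decoder can tell the two cases apart.

```python
import random

MARKER = 0x00   # assumed marker value M; the decoder below relies on it being 0
MIN_RUN = 4     # a 3-byte code only pays off for runs of 4 or more


def marker_rle_encode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        if run >= MIN_RUN:
            out += bytes((MARKER, run, data[i]))   # M, count, value
        elif data[i] == MARKER:
            out += bytes((MARKER, MARKER)) * run   # each literal M is doubled
        else:
            out += data[i:i + run]                 # short runs pass through
        i += run
    return bytes(out)


def marker_rle_decode(encoded: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] != MARKER:
            out.append(encoded[i])
            i += 1
        elif encoded[i + 1] == MARKER:             # doubled marker: literal M
            out.append(MARKER)
            i += 2
        else:                                      # M, count, value
            out += bytes([encoded[i + 2]]) * encoded[i + 1]
            i += 3
    return bytes(out)


# Assumption: seeded pseudo-random bytes stand in for a truly random stream.
rng = random.Random(11)
noise = bytes(rng.getrandbits(8) for _ in range(100_000))
print(len(marker_rle_encode(noise)) - len(noise))  # net growth in bytes
```

On random input the growth is roughly one extra byte per occurrence of M, since runs of 4 or more almost never show up to pay it back.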

The point is that RLE only really works as a compression algorithm when you know that what you are mainly dealing with is long runs of identical data, so that you are guaranteed good compression. Random data does not meet this condition: each extra byte in a run is 256 times less likely, so short runs vastly outnumber long ones.

This basic point is true of every compression algorithm. Any given algorithm is tuned to compress data of a given pattern (or patterns). Data not of that pattern will at best not compress under the algorithm and in at least some worst cases the algorithm will cause the data to be expanded.

That is why if you have data you wish to compress it is important to choose the right algorithm for the patterns in the data.

Random binary data by its definition has no pattern and therefore you are unable to choose an algorithm that will compress it. Any algorithm you choose will probably compress some parts of the stream, but it will almost certainly expand other parts of it.
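
The same effect shows up with a general-purpose compressor. A minimal sketch, assuming Python's zlib module as the compressor and seeded pseudo-random bytes as a stand-in for truly random data:

```python
import random
import zlib

# Assumption: seeded pseudo-random bytes stand in for a truly random stream.
rng = random.Random(42)
noise = bytes(rng.getrandbits(8) for _ in range(65_536))
patterned = b"0123456789abcdef" * 4_096   # same length, obvious pattern

for label, data in (("random", noise), ("patterned", patterned)):
    packed = zlib.compress(data, 9)
    print(f"{label}: {len(data)} -> {len(packed)} bytes")
```

The patterned input shrinks to a tiny fraction of its size, while the random input comes out slightly larger than it went in. A counting argument says this has to happen somewhere: there are 2^n bit strings of length n but only 2^n - 1 strings shorter than that, so no lossless scheme can shorten every input.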

Jun 8 '09 #7

JosAH

11,448

Expert 8TB

For an example of truly random byte values read this URL:

http://www.fourmilab.ch/cgi-bin/uncg...es=128&fmt=bin

It reads 128 bytes of random data (given some neutrino bombardment); it'll block sending new data after a couple of times (DoS attack prevention).

kind regards,

Jos

Jun 9 '09 #8
