473,756 Members | 3,663 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

xor: how come so slow?

Hi,
I'm trying to encode a byte data. Let's not focus on the process of
encoding; in fact, I want to emphasize that the method
create_random_b lock takes 0.5s to be executed (even Java it's faster) on
a Dual-Core 3.0Ghz machine:

took 46.746999979s, avg: 0.46746999979s

Thus I suppose that the xor operation between bytes raise the execution
time to 0.5; why I suppose that?
Because in Python there's no support for bytes and even for xoring
bytes, so I used a workaround:
I cycle on the two strings to be xor-red
for every char in the strings
convert one char on integer and then xor them; (ord)
insert one char in the result, transforming the previous integer
in char (chr)

I suppose that ord() and char() are the main problems of my
implementation, but I don't know either way to xor two bytes of data
(bytes are represented as strings).
For more information, see the code attached.

How should I decrease the execution time?

Thank you
from __future__ import division
import random
import time
import sha
import os

class Encoder(object) :
def create_random_b lock(self, data, seed, blocksize):
number_of_block s = int(len(data)/blocksize)
random.seed(see d)
random_block = ['0'] * blocksize
for index in range(number_of _blocks):
if int(random.getr andbits(1)) == 1:
block = data[blocksize*index :blocksize*inde x+blocksize]
for bit in range(len(block )):
random_block[bit] =
chr(ord(random_ block[bit])^ord(block[bit])) # workaround per fare xor
bit a bit di str; xor e' solo supportato per int -ord
return ''.join(random_ block)
x = Encoder()
piece = os.urandom(1024 *1024)
blocksize = 16384
t1 = time.time()
for l in range(100):
seed = random.getrandb its(32)
block = x.create_random _block(piece, seed, blocksize)
t2 = time.time()
print 'took ' + str(t2-t1) + 's, avg: ' + str((t2-t1)/100.0) + 's'
Oct 15 '08
21 2497
[I think these attributions are right]
Steven D'Aprano <st***@REMOVE-THIS-cybersource.com .auwrote:
>On Fri, 17 Oct 2008 22:45:19 +1300, Lawrence D'Oliveiro wrote:
>In message <01************ **********@news .astraweb.com>, Steven D'Aprano
wrote:
>>... why do you say that xoring random data with other random data
produces less randomness than you started with?
blocksize <= number_of_block s * blocksize
I must be thick, because that looks like a non sequitor to me. I don't
see the relevance.
Lawrence originally said something along the lines of this just
being a way of taking some random data and producing "less random
data". You're reading it as "(less random) data". The intent (I
assume) is for it to be read as "less (random data)".

Maybe it should be "fewer random data". After all, each byte in
the block is discrete.

--
\S -- si***@chiark.gr eenend.org.uk -- http://www.chaos.org.uk/~sion/
"Frankly I have no feelings towards penguins one way or the other"
-- Arthur C. Clarke
her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump
Oct 17 '08 #11
In message <Jl*******@news .chiark.greenen d.org.uk>, Sion Arrowsmith wrote:
Maybe it should be "fewer random data".
Except these days we tend to think of "data" being, say, more like "flour"
than "bees", so it's "less data", like "less flour", rather than
like "fewer bees". :)
After all, each byte in the block is discrete.
Data can come in fractional bits. That's how compression works.
Oct 17 '08 #12
On Fri, 17 Oct 2008 13:59:27 +0100, Sion Arrowsmith wrote:
[I think these attributions are right] Steven D'Aprano
<st***@REMOVE-THIS-cybersource.com .auwrote:
>>On Fri, 17 Oct 2008 22:45:19 +1300, Lawrence D'Oliveiro wrote:
>>In message <01************ **********@news .astraweb.com>, Steven
D'Aprano wrote:
... why do you say that xoring random data with other random data
produces less randomness than you started with?
blocksize <= number_of_block s * blocksize
I must be thick, because that looks like a non sequitor to me. I don't
see the relevance.

Lawrence originally said something along the lines of this just being a
way of taking some random data and producing "less random data". You're
reading it as "(less random) data". The intent (I assume) is for it to
be read as "less (random data)".

Ah. That was how I read it.

Maybe it should be "fewer random data". After all, each byte in the
block is discrete.

Or "a smaller amount of random data".

--
Steven
Oct 19 '08 #13
On Sat, 18 Oct 2008 09:16:11 +1300, Lawrence D'Oliveiro wrote:
In message <Jl*******@news .chiark.greenen d.org.uk>, Sion Arrowsmith
wrote:
>Maybe it should be "fewer random data".

Except these days we tend to think of "data" being, say, more like
"flour" than "bees", so it's "less data", like "less flour", rather than
like "fewer bees". :)
>After all, each byte in the block is discrete.

Data can come in fractional bits. That's how compression works.
Compression works by removing redundant data so the same amount of useful
information requires fewer bits. If you don't believe me, try compressing
a single bit and see if you get a "fractional bit". I leave it to you to
choose whether the bit should be on or off.

That's why compressing data twice doesn't gain much, if anything. Try
compressing a typical JPEG image, chances are it won't get any smaller.
That's because JPEGs already remove the redundancy in the data,
compressing it. With no redundancy, there can be no compression.

Due to the pigeon-hole principle, any reversible compression scheme must
cause some data to be expanded rather than compressed. (If not, then
there must be two different pieces of data that compress to the same
thing, which means the scheme is not reversible.) If there were
fractional bits, that wouldn't apply, because you can always divide a
pigeon-hole (a bit) into two smaller pigeon-holes.

--
Steven
Oct 19 '08 #14
In message <01************ **********@news .astraweb.com>, Steven D'Aprano
wrote:
On Sat, 18 Oct 2008 09:16:11 +1300, Lawrence D'Oliveiro wrote:
>Data can come in fractional bits. That's how compression works.

If you don't believe me, try compressing a single bit and see if you get
a "fractional bit".
If both states of the bit are not equally likely, then you do indeed have a
fractional bit, since

nrbits = (- logbase2(P[bit = 0]) - logbase2(P[bit = 1])) / 2
Oct 19 '08 #15
Steven D'Aprano <st***@REMOVE-THIS-cybersource.com .auwrote:
>
On Fri, 17 Oct 2008 20:51:37 +1300, Lawrence D'Oliveiro wrote:
>Is piece really meant to be random? If so, your create_random_b lock
function isn't achieving much--xoring random data together isn't going
to produce anything more exciting than less random data than you started
with.

Hmmm... why do you say that xoring random data with other random data
produces less randomness than you started with?

I'm not saying that you're wrong, and certainly it is pointless since
you're not going to improve on the randomness of /dev/urandom without a
lot of work. But less random?
For those who got a bit lost here, I'd would point out that Knuth[1] has an
excellent chapter on random numbers that includes a detailed discussion of
this effect. His net takeaway is that most of the things people do to
increase randomness actually have exactly the opposite effect.
-----
[1] Knuth, Donald. The Art of Computer Programming, Volume 2,
Seminumerical Algorithms.
--
Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Oct 19 '08 #16
On Sun, 19 Oct 2008 16:38:37 +1300, Lawrence D'Oliveiro wrote:
In message <01************ **********@news .astraweb.com>, Steven D'Aprano
wrote:
>On Sat, 18 Oct 2008 09:16:11 +1300, Lawrence D'Oliveiro wrote:
>>Data can come in fractional bits. That's how compression works.

If you don't believe me, try compressing a single bit and see if you
get a "fractional bit".

If both states of the bit are not equally likely, then you do indeed
have a fractional bit, since

nrbits = (- logbase2(P[bit = 0]) - logbase2(P[bit = 1])) / 2

That's an arithmetic mean of the logarithms. It doesn't imply that there
are fractional bits any more than an average family having 2.3 children
implies that there are 0.3 of a child wandering around the streets.

Using the Shannon measure of information, you can have messages which
contain fractional information (technically, "surprisal" ), when measured
in bits. But that doesn't imply the existence of fractional bits. Look at
it this way: consider a barter economy where I agree to swap 5 chickens
for 2 axes. So each axe is equivalent to 2.5 chickens. But that doesn't
imply that there is such a thing as 0.5 of a chicken -- at least not a
*live* chicken. While I can blithely talk about bartering fractional
chickens, in practice when I actually go to make good on my promise, it
must be an integer number of chickens.

Similarly, we can talk about messages containing fractional bits of
information, but when we actually store or transmit that message in
practice, we can only use integer numbers of bits.

As Wikipedia puts it:

It is important to differentiate between the use of "bit" in referring to
a discrete storage unit and the use of "bit" in referring to a
statistical unit of information. The bit, as a discrete storage unit, can
by definition store only 0 or 1. A statistical bit is the amount of
information that, on average[citation needed], can be stored in a
discrete bit. ... If these two ideas need to be distinguished, sometimes
the name bit is used when discussing data storage while shannon is used
for the statistical bit.

http://en.wikipedia.org/wiki/Bit

--
Steven
Oct 19 '08 #17
On Sun, 19 Oct 2008 04:38:04 +0000, Tim Roberts wrote:
Steven D'Aprano <st***@REMOVE-THIS-cybersource.com .auwrote:
>>
On Fri, 17 Oct 2008 20:51:37 +1300, Lawrence D'Oliveiro wrote:
>>Is piece really meant to be random? If so, your create_random_b lock
function isn't achieving much--xoring random data together isn't going
to produce anything more exciting than less random data than you
started with.

Hmmm... why do you say that xoring random data with other random data
produces less randomness than you started with?

I'm not saying that you're wrong, and certainly it is pointless since
you're not going to improve on the randomness of /dev/urandom without a
lot of work. But less random?

For those who got a bit lost here, I'd would point out that Knuth[1] has
an excellent chapter on random numbers that includes a detailed
discussion of this effect. His net takeaway is that most of the things
people do to increase randomness actually have exactly the opposite
effect.
I don't doubt it at all. But xoring random data with more random data?
I'm guessing that if the two sources of data are independent and from the
same distribution, then xoring them is pointless but not harmful. Here's
a rough-and-ready test which suggests there's little harm in it:

>>import os, math
def rand_data(size) :
.... return [ord(c) for c in os.urandom(size )]
....
>>def mean(data):
.... return sum(data)/len(data)
....
>>def stdev(data):
.... return math.sqrt( mean([x**2 for x in data]) - mean(data)**2 )
....
>>A = rand_data(1000) # good random data
B = rand_data(1000) # more good random data
AB = [a^b for (a,b) in zip(A, B)] # is this still good random data?
assert len(AB) == len(A) == len(B)

mean(A), stdev(A)
(126, 73.918874450305 31)
>>mean(B), stdev(B)
(128, 74.242844773082 339)
>>mean(AB), stdev(AB)
(129, 74.390859653589 16)
Note: I wouldn't take the above terribly seriously. Mean and standard
deviation alone are terrible measures of the randomness of data. But this
does suggest that any deviation from uniform randomness will be quite
subtle.

--
Steven
Oct 19 '08 #18
On Oct 19, 7:13*am, Dennis Lee Bieber <wlfr...@ix.net com.comwrote:
On Sun, 19 Oct 2008 04:38:04 GMT, Tim Roberts <t...@probo.com declaimed
the following in comp.lang.pytho n:
For those who got a bit lost here, I'd would point out that Knuth[1] has an
excellent chapter on random numbers that includes a detailed discussionof
this effect. *His net takeaway is that most of the things people do to
increase randomness actually have exactly the opposite effect.

* * * * Some decade I'll have to obtain his volumes... But they've never
shown up in a $60 special offer from a book club (unlike the compact
editions of the OED) <G>.

* * * * And while XOR may seem significant, just consider die rolls...

* * * * If each "byte" were one die roll, you'd expect a nearly even
distribution... (for a 6 sided die, 1/6 would have each value). But
using the sum of two die, your begin to get a bell curve: 2 and 12
appear 1/36 of the time (each), but 7 occurs 6/36 of the time. Use three
die, and it gets worse: 3 and 18 occur 1/216, "10.5" occurs much more
often...
That should be one die, two dice, etc. :-)
Oct 19 '08 #19
Lawrence D'Oliveiro wrote:
In message <01************ **********@news .astraweb.com>, Steven D'Aprano
wrote:
>On Sat, 18 Oct 2008 09:16:11 +1300, Lawrence D'Oliveiro wrote:
>>Data can come in fractional bits. That's how compression works.
If you don't believe me, try compressing a single bit and see if you get
a "fractional bit".

If both states of the bit are not equally likely, then you do indeed have a
fractional bit, since

nrbits = (- logbase2(P[bit = 0]) - logbase2(P[bit = 1])) / 2
What's happening here is that the two different meanings of "bit" are
being confused. A bit is both a binary digit and a measure of information.

Obviously you can't create a bit stream with half a bit in it.

In a coding system where all messages of N binary digits are equally
likely then each message contains N bits of information content. This is
the theoretical upper bound on the information content.

In most practical systems, however, the messages have differing
probabilities; then an N-binary-digit message conveys less than N bits
of information, as Lawrence indicated above. Fractional bits are
perfectly valid as a measure of information content.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Oct 19 '08 #20

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
5249
by: Mike Mc | last post by:
Can anyone tell me an easy way to hide numbers using XOR as mask, i.e. I want to write a program that will get a number, then it will create two random numbers that when they are XOR'ed together will create the original number. e.g. Original number = 65 The program will then come up with two random numbers that fit the bill, i.e. 150 & 215.
50
6177
by: dataangel | last post by:
I wrote a function to compare whether two strings are "similar" because I'm using python to make a small text adventure engine and I want to it to be responsive to slight mispellings, like "inevtory" and the like. To save time the first thing my function does is check if I'm comparing an empty vs. a non-empty string, because they are to be never considered similar. Right now I have to write out the check like this: if str1 and str2: if...
20
400
by: JJK | last post by:
I'd tried to port an application from vb6 to vb.net 2003. While testing is I'd got a error in the XOR operator. This is an example: dim c as Byte .... mask = 190 Fileget(1,c) c = c XOR mask Fileput(2,c) Run 1:
80
35129
by: Christopher Benson-Manica | last post by:
Of course one can get the effect with appropriate use of existing operators, but a ^^ operator would make for nice symmetry (as well as useful to me in something I'm working on). Am I the only one who would find it useful? -- Christopher Benson-Manica | I *should* know what I'm talking about - if I ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
5
6661
by: robert.h.lowe | last post by:
Hi, I'm looking for a computationally fast means of XOR folding the individual bytes of an unsigned 32-bit int, and an unsigned 16-bit int. Can I point a char * at the address of each of the int's (in succession), and safely access each byte? Danger, danger, Will Robinsin!?? Any better ideas?? I'm assuming this is faster than a series of 8-bit right shifts followed by ORs with 255 as input to the XOR. E.g.
6
1365
by: Chris | last post by:
Hi, Can someone explain the use of XOR? Thanks
11
5048
by: John Williams | last post by:
I've written a simple program to do XOR encryption as my first foray into understanding how encryption works. The code compiles fine, however it segmentation faults on every run. using gdb to debug it let me narrow the problem down to the Cipher function I think it faults at line 84 or 85. The program makes it's first read/cipher/write pass without issue but the second pass kills it. Using gdb to print the variables left showed me the...
0
9456
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9273
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10032
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
8712
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
6534
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5141
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
3805
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3358
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2666
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.