473,569 Members | 3,054 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

XORing long strings opimization?

def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) and type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for c1 in s1:
for c2 in s2:
output += chr(ord(c1) ^ ord(c2))
return output

This way is very slow for large strings.
Anyone know of a better and faster way to do it?

Jul 18 '05 #1
17 2264
Noen wrote:
def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) and type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for c1 in s1:
for c2 in s2:
output += chr(ord(c1) ^ ord(c2))
return output

This way is very slow for large strings.
Anyone know of a better and faster way to do it?


Before we start optimizing:

len(XOR(s1, s2)) == len(s1) * len(s2)

Bug or feature?

Peter
Jul 18 '05 #2
Peter Otten wrote:
Noen wrote:

def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) and type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for c1 in s1:
for c2 in s2:
output += chr(ord(c1) ^ ord(c2))
return output

This way is very slow for large strings.
Anyone know of a better and faster way to do it?

Before we start optimizing:

len(XOR(s1, s2)) == len(s1) * len(s2)

Bug or feature?

Peter


Oh, didnt notice that as I wrote it. Thanks...
Anyway, this is how it should be.

def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) and type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for i in range(len(s1)):
output += chr(ord(s1[i]) ^ ord(s2[i]))
return output
Jul 18 '05 #3
Noen wrote:
Oh, didnt notice that as I wrote it. Thanks...
Anyway, this is how it should be.

def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) and type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for i in range(len(s1)):
output += chr(ord(s1[i]) ^ ord(s2[i]))
return output


Oops, I missed one:
xor2.XOR(1, "")

Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "xor2.py", line 7, in XOR
if len(s1) != len(s2):
TypeError: len() of unsized object

Did you expect this?

Peter
Jul 18 '05 #4

"Noen" <no***********@ na.no> wrote in message
news:i8******** *************@j uliett.dax.net. ..
def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) and type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for i in range(len(s1)):
output += chr(ord(s1[i]) ^ ord(s2[i]))
return output


You fell into the O(n**2) immutable sequence repeated addition trap.
For O(n) behavior, try output = [] and return ''.join(output) .
Replacing output+=stuff with output.append(s tuff) might be faster
still.

Terry J. Reedy
Jul 18 '05 #5
Peter Otten wrote:
Noen wrote:

Oh, didnt notice that as I wrote it. Thanks...
Anyway, this is how it should be.

def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) and type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for i in range(len(s1)):
output += chr(ord(s1[i]) ^ ord(s2[i]))
return output

Oops, I missed one:

xor2.XOR( 1, "")


Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "xor2.py", line 7, in XOR
if len(s1) != len(s2):
TypeError: len() of unsized object

Did you expect this?

Peter

nope, thought the Argument check would do it, but I used and opeartor
instead of or. Well, here is XOR v.0.3b :P Any comments how to make it
faster?

def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) or type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for i in range(len(s1)):
output += chr(ord(s1[i]) ^ ord(s2[i]))
return output
Jul 18 '05 #6
Noen wrote:
def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) or type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for i in range(len(s1)):
output += chr(ord(s1[i]) ^ ord(s2[i]))
return output


I keep nagging:
xor3.XOR("", 1)

Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "xor3.py", line 7, in XOR
if len(s1) != len(s2):
TypeError: len() of unsized object

Must be in a nasty mood this evening :-)
Seriously, I think that "first make it right, then make it fast" is a
reasonable design principle.
You can check the type

if type(s1) != type("") or type(s2) != type(""):
....

or

if not isinstance(s1, str) or not isinstance(s2, str):
....

or even

if not type(s1) == type(s2) == type(""):
....

but personally I would omit the check altogether. The second check could be
relaxed to len(s1) <= len(s2), assuming that s1 is the data to be encrypted
and s2 the random character sequence that makes your encryption NSA-proof.
For (much) weaker encryption schemes, you could omit it completely and
cycle over the characters of s2 (not recommended).

As to speed, I would suppose the following to be somewhat faster:

def xor(s1, s2):
return "".join([chr(ord(c1) ^ ord(c2)) for c1, c2 in itertools.izip( s1,
s2)])

It favours "".join(charact er list) over consecutive s += tail operations,
which need to build "many" strings as intermediate results.

I think, further speedups could be achieved building a lookup table for all
character combinations.

Peter

Jul 18 '05 #7
"Peter Otten" <__*******@web. de> schrieb im Newsbeitrag
news:bo******** *****@news.t-online.com...
Noen wrote:
Oh, didnt notice that as I wrote it. Thanks...
Anyway, this is how it should be.

def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) and type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for i in range(len(s1)):
output += chr(ord(s1[i]) ^ ord(s2[i]))
return output


Oops, I missed one:


Therese even more ;)
if (type(s1) and type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
i guess he meant: if type(s1) != type("") or type(s2) != type(""):
raise TypeError, "Arguments are not strings"


Ciao Ulrich
Jul 18 '05 #8
"Noen" <no***********@ na.no> wrote in message
news:ie******** *************@j uliett.dax.net. ..
nope, thought the Argument check would do it, but I used and opeartor
instead of or. Well, here is XOR v.0.3b :P Any comments how to make it
faster?


How about using the struct module?
Caveat: I don't know if struct limits the repeat count.

import struct
def XOR(data, key):
if len(data) != len(key):
raise ValueError, "data and key not of equal length"
ldata = struct.unpack(' =%sb' % len(data), data)
lkey = struct.unpack(' =%sb' % len(key), key)
lxored = [d^k for d,k in zip(ldata, lkey)]
xored = struct.pack('=% sb' % len(lxored), *lxored)
return xored

Just an idea.
--
Francis Avila

Jul 18 '05 #9
Peter Otten wrote:
Noen wrote:
Oh, didnt notice that as I wrote it. Thanks...
Anyway, this is how it should be.

def XOR(s1,s2):
""" XOR string s1 with s2 """
output = ""
# Argument check
if (type(s1) and type(s2)) != type(""):
raise TypeError, "Arguments are not strings"
if len(s1) != len(s2):
raise ValueError, "Size differs in strings"
# Xoring
for i in range(len(s1)):
output += chr(ord(s1[i]) ^ ord(s2[i]))
return output


Oops, I missed one:
xor2.XOR(1, "")

Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "xor2.py", line 7, in XOR
if len(s1) != len(s2):
TypeError: len() of unsized object


Checks apart -- on to optimization as per subject:

[alex@lancelot krb5module]$ timeit.py -c -s'from string import letters as S'
'"".join([chr(ord(S[i])^ord(S[i])) for i in xrange(len(S))])'
10000 loops, best of 3: 113 usec per loop

i'm not sure i can do much better with strings. arrays are cool, though:

[alex@lancelot krb5module]$ timeit.py -c -s'from string import letters as S'
-s'import array' 'x=array.array( "b",S)' 'for i,v in enumerate(x): x[i]^=v'
'x.tostring()'
10000 loops, best of 3: 57 usec per loop

Morale: when you find yourself fiddling with lots of chr and ord, and care
about performance, think about "array('b', ... -- it could buy you an easy
speedup by a factor of 2, like here.
Alex


Jul 18 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
2509
by: lawrence | last post by:
"Garp" <garp7@no7.blueyonder.co.uk> wrote in message news:<_vpuc.1424$j_3.13346038@news-text.cableinet.net>... > "lawrence" <lkrubner@geocities.com> wrote in message > news:da7e68e8.0405300907.3c8eabf7@posting.google.com... > > This reg ex will find me any block of strings 40 or more characters in > > length without a white space, yes? > > >...
83
3391
by: kartik | last post by:
there seems to be a serious problem with allowing numbers to grow in a nearly unbounded manner, as int/long unification does: it hides bugs. most of the time, i expect my numbers to be small. 2**31 is good enough for most uses of variables, and when more is needed, 2**63 should do most of the time. granted, unification allows code to work...
4
2296
by: Abhishek Jha | last post by:
You can implement very very long integer using strings. Strings have no limit and using your own code implementation you can store the numbers of as much length as you want..........for more details contact at abhishekjha10@yahoo.co.in
7
7583
by: Eric | last post by:
Hi All, I need to XOR two same-length Strings against each other. I'm assuming that, in order to do so, I'll need to convert each String to a BitArray. Thus, my question is this: is there an easy way to convert a String to a BitArray (and back again)? I explained the ultimate goal (XORing two Strings) so that, if anyone has a better...
3
2369
by: ajaksu | last post by:
Hello c.l.p.ers :) Running long(Decimal) is pretty slow, and the conversion is based on strings. I'm trying to figure out whether there is a good reason for using strings like in decimal.py (that reason would be bound to bite me down the road). This converts Decimal to long and is much faster in my test system (PIII 650MHz, but was written...
73
7373
by: Yevgen Muntyan | last post by:
Hey, I was reading C99 Rationale, and it has the following two QUIET CHANGE paragraphs: 6.5.3.4: "With the introduction of the long long and extended integer types, the sizeof operator may yield a value that exceeds the range of an unsigned long." 6.5.6: "With the introduction of the long long and extended integer
10
2812
by: Mo | last post by:
Hi, I am trying to write a code to build a string 768 characters long. This string is going to be written to a file which is then read by another application. The format of the string is already defined. For example Firstname starts at position 20, Last Name at position 30, address at position 40 etc... I am reading these data from sql...
5
5533
by: David Mathog | last post by:
I'm looking at a program which stores perl scripts in an array. Each script is stored as a single entry in that array, and the whole set of them live in a single header file (escaped out the wazoo to get the perl code intact through the C preprocessor.) The issue is that many of these strings are quite long, which causes gcc to throw these...
2
1594
by: Jean-Paul Calderone | last post by:
On Fri, 5 Sep 2008 14:24:16 -0500, Robert Dailey <rcdailey@gmail.comwrote: mystring = ( "This is a very long string that " "spans multiple lines and does " "not include line breaks or tabs " "from the source file between " "the strings partitions.") Jean-Paul
0
7619
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
0
7930
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. ...
0
8138
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
0
7983
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the...
0
6290
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
0
5228
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
0
3651
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
2118
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
0
950
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.