problems with base64

Karl Pech

Hi all,

I'm trying to write a program which can read in files in the following
format:
sos_encoded.txt:
---
begin-base64 644 sos.txt
UGxlYXNlLCBoZWxwIG1lIQ==
---

and convert them to "clear byte code". For example if you take the file
sos_encoded.txt and use my program on it you should get the following:
sos.txt:
---
Please, help me!
---
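
To make the intended conversion concrete, here is a small sketch (written for Python 3, which is an assumption; the program below targets Python 2) of how a single group of four base64 characters maps to at most three bytes. The names `ALPHABET` and `decode_group` are hypothetical and only serve this illustration.

```python
# Standard base64 alphabet: the position of a character is its 6-bit value.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def decode_group(group):
    """Decode exactly four base64 characters into one to three bytes."""
    if len(group) != 4:
        raise ValueError("a base64 group must have exactly 4 characters")
    padding = group.count("=")
    # Pack four 6-bit values into one 24-bit integer; '=' contributes zeros.
    bits = 0
    for ch in group:
        bits = (bits << 6) | (0 if ch == "=" else ALPHABET.index(ch))
    # Split the 24 bits into three bytes and drop one byte per '='.
    raw = bytes([(bits >> 16) & 255, (bits >> 8) & 255, bits & 255])
    return raw[:3 - padding]


print(decode_group("UGxl"))  # first group of the sample line -> b'Ple'
print(decode_group("IQ=="))  # last group, two padding chars -> b'!'
```

The 24-character sample line therefore consists of six such groups, and the two `=` signs in the last group mean that only one of its three bytes is real data.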

Unfortunately, if I try to convert large files (> 1.5 MB) that did not
contain any "human readable text" before they were encoded, I get back
corrupted files.

This is the source of my program:
---
import string

def extract_base64(source):
    source_ = source
    all_chars = string.maketrans('','') # create 256-ASCII-char-table

    # delete all base64-chars from source-copy
    source_without_base64_signs = source_.translate(all_chars, string.letters+string.digits+"+/=")

    # delete all chars, which remained in the changed source-copy, from the first copy
    # and return this new copy
    # --> all base64-chars remain
    return source_.translate(all_chars, source_without_base64_signs)

def convert_to_8bits(source):
    base64_table = {'A' : 0, 'N' : 13, 'a' : 26, 'n' : 39, '0' : 52,
                    'B' : 1, 'O' : 14, 'b' : 27, 'o' : 40, '1' : 53,
                    'C' : 2, 'P' : 15, 'c' : 28, 'p' : 41, '2' : 54,
                    'D' : 3, 'Q' : 16, 'd' : 29, 'q' : 42, '3' : 55,
                    'E' : 4, 'R' : 17, 'e' : 30, 'r' : 43, '4' : 56,
                    'F' : 5, 'S' : 18, 'f' : 31, 's' : 44, '5' : 57,
                    'G' : 6, 'T' : 19, 'g' : 32, 't' : 45, '6' : 58,
                    'H' : 7, 'U' : 20, 'h' : 33, 'u' : 46, '7' : 59,
                    'I' : 8, 'V' : 21, 'i' : 34, 'v' : 47, '8' : 60,
                    'J' : 9, 'W' : 22, 'j' : 35, 'w' : 48, '9' : 61,
                    'K' : 10, 'X' : 23, 'k' : 36, 'x' : 49, '+' : 62,
                    'L' : 11, 'Y' : 24, 'l' : 37, 'y' : 50, '/' : 63,
                    'M' : 12, 'Z' : 25, 'm' : 38, 'z' : 51, '=' : 0}

    result_ = []

    # fill an integer with four 6-bit-blocks from left to right
    box_ = int( (base64_table[source[0]] << 26)\
              + (base64_table[source[1]] << 20)\
              + (base64_table[source[2]] << 14)\
              + (base64_table[source[3]] << 8) )

    # get 8-bit-blocks out of the integer starting with the first 6-bit-block we have
    # inserted plus the two highest bits from the second 6-bit-block
    result_ += chr((box_ >> 24) & 255) + chr((box_ >> 16) & 255) + chr((box_ >> 8) & 255)

    # strip possible zeros from decoded result
    del result_[len(result_)-source.count('='):len(result_)]

    return result_

# open source file in binary-mode
fsource = open(raw_input("Please specify the source file that should be decoded: "), "rb")

# read in first line of the file and split it in 2+n "whitespace-blocks"
_1stline = fsource.readline().split()

# delete the first two blocks ("begin ..." and "644 ...")
del _1stline[0:2]

# join the other blocks to the target-filename
targetname = string.join(_1stline)
ftarget = open(targetname, "wb")

# read in the remainder of the file in 4-byte-blocks and write the results in 3-byte-blocks
# into the target file

while 1 == 1:
    source = ''
    while len(source) < 4:
        source += fsource.read(4)
        if source == '':
            break

        # reduce byte-code to base64-chars
        source = extract_base64(source)

    if source == '':
        break

    # convert 6-bit-blocks to 8-bit-blocks
    clear_text = convert_to_8bits(source)

    ftarget.writelines(clear_text)

ftarget.close()
fsource.close()

print "file "+targetname+" has been written!"
---

Unfortunately I can't use python's standard base64-module since
this whole task is an exercise. :(

And I don't see any logical problems in my code. I think I really
need some more eyes to look over this. So you are my "last hope"! ;)
Perhaps you can give me a hint.
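
One possible direction, as a hedged sketch rather than a definitive fix: filter out the non-base64 characters (such as line breaks) first, and only then cut the remaining characters into groups of four, carrying any incomplete group over to the next read. Reading four raw bytes at a time and filtering afterwards can leave the groups misaligned as soon as a line break falls inside a block. The sketch is written for Python 3 and avoids the `base64` module; `decode_group` and `decode_stream` are hypothetical names, and stopping at a `====` trailer line is an assumption about the input format, since the sample file above does not show one.

```python
import io

ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)
VALUES = {ch: i for i, ch in enumerate(ALPHABET)}


def decode_group(group):
    """Decode four base64 characters into one to three bytes."""
    padding = group.count("=")
    bits = 0
    for ch in group:
        bits = (bits << 6) | VALUES.get(ch, 0)  # '=' contributes zero bits
    raw = bytes([(bits >> 16) & 255, (bits >> 8) & 255, bits & 255])
    return raw[:3 - padding]


def decode_stream(src, dst, chunk_size=4096):
    """Decode base64 text from src (text stream) into dst (binary stream)."""
    pending = ""  # filtered characters that do not yet form a full group
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # Keep only base64 characters; newlines and other noise are dropped.
        pending += "".join(ch for ch in chunk if ch in VALUES or ch == "=")
        # Decode complete groups only; the remainder waits for the next read.
        usable = len(pending) - len(pending) % 4
        for i in range(0, usable, 4):
            group = pending[i:i + 4]
            if group == "====":  # assumed trailer line: stop decoding
                return
            dst.write(decode_group(group))
        pending = pending[usable:]
    if pending:
        raise ValueError("input ended in the middle of a 4-character group")


# Usage: skip the "begin-base64 ..." header line, then decode the rest.
src = io.StringIO("begin-base64 644 sos.txt\nUGxlYXNlLCBo\nZWxwIG1lIQ==\n====\n")
src.readline()
dst = io.BytesIO()
decode_stream(src, dst, chunk_size=5)  # odd chunk size to exercise the carry
print(dst.getvalue())
```

Because alignment is tracked on the filtered characters rather than on raw file offsets, the chunk size no longer matters for correctness, so large reads can be used for speed.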

Thank you very much!

Regards
Karl

Jul 18 '05 #1


Byron

Hi Karl,

I don't know if this is much help for you, but have you tried using the
following:

import base64
print base64.decodestring("UGxlYXNlLCBoZWxwIG1lIQ==")
print base64.encodestring("Please, help me!")
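
For readers on Python 3, a rough equivalent of the two calls above is sketched below. `base64.decodestring` and `base64.encodestring` were removed in Python 3.9, so the sketch uses `b64decode` and `b64encode` instead. Note that `b64encode` does not append a trailing newline the way `encodestring` did.

```python
import base64

# Decoding accepts an ASCII string (or bytes) and returns bytes.
decoded = base64.b64decode("UGxlYXNlLCBoZWxwIG1lIQ==")
print(decoded)

# Encoding takes bytes and returns bytes, without a trailing newline.
encoded = base64.b64encode(b"Please, help me!")
print(encoded)
```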

Byron
Jul 18 '05 #2

Karl Pech

"Byron" <De*********@netscape.net> wrote in
news:29******************@newsread1.news.pas.earthlink.net...

Hi Byron,

I don't know if this is much help for you, but have you tried using the
following:

import base64
print base64.decodestring("UGxlYXNlLCBoZWxwIG1lIQ==")
print base64.encodestring("Please, help me!")

Thank you very much for your answer! I'm glad that somebody answered me,
because my problem is slowly but surely getting worse. :((
I found the mistake in the previous code: it was a problem with read().
I think it should have been read(1). Anyway, I deleted that old version of my
program since it was very slow for base64 files > 0.5 MB. I have now written the
program below, but it doesn't work at all. It seems to me that
some variables that read_in_data should use are somehow out of scope.
The idea of the program is: "Read in blocks of 5 bytes of base64 data. Since
5 % 4 != 0, we get an IndexError exception in the "while len(source) < 4" loop;
we catch this exception, set the source file pointer 1 byte back, because we
would lose one byte otherwise, and read in the next 4 bytes of the source file."
Unfortunately I don't know how to solve this Unbound...Error-Thing. |((
I think I'm really stuck now. @{

And as I said, since this task is an exercise for this weekend I'm not allowed to
use the standard base64-module. Well, I have to code the base64-decoder myself. :(((

Anyway, Thanks, Byron.

---
import string

def convert_to_8bits(source):
    base64_table = {'A' : 0, 'N' : 13, 'a' : 26, 'n' : 39, '0' : 52,
                    'B' : 1, 'O' : 14, 'b' : 27, 'o' : 40, '1' : 53,
                    'C' : 2, 'P' : 15, 'c' : 28, 'p' : 41, '2' : 54,
                    'D' : 3, 'Q' : 16, 'd' : 29, 'q' : 42, '3' : 55,
                    'E' : 4, 'R' : 17, 'e' : 30, 'r' : 43, '4' : 56,
                    'F' : 5, 'S' : 18, 'f' : 31, 's' : 44, '5' : 57,
                    'G' : 6, 'T' : 19, 'g' : 32, 't' : 45, '6' : 58,
                    'H' : 7, 'U' : 20, 'h' : 33, 'u' : 46, '7' : 59,
                    'I' : 8, 'V' : 21, 'i' : 34, 'v' : 47, '8' : 60,
                    'J' : 9, 'W' : 22, 'j' : 35, 'w' : 48, '9' : 61,
                    'K' : 10, 'X' : 23, 'k' : 36, 'x' : 49, '+' : 62,
                    'L' : 11, 'Y' : 24, 'l' : 37, 'y' : 50, '/' : 63,
                    'M' : 12, 'Z' : 25, 'm' : 38, 'z' : 51, '=' : 0}

    result_ = []

    box_ = int( (base64_table[source[0]] << 26)\
              + (base64_table[source[1]] << 20)\
              + (base64_table[source[2]] << 14)\
              + (base64_table[source[3]] << 8) )

    result_ += chr((box_ >> 24) & 255) + chr((box_ >> 16) & 255) + chr((box_ >> 8) & 255)

    return result_

fsource = open(raw_input("Please specify the base64 file that should be decoded: "), "rb")

_1stline = fsource.readline().split()

del _1stline[0:2]

targetname = string.join(_1stline)
ftarget = open(targetname, "wb")

global source_file_data
global target_file_buffer
global filling_counter

source_file_data = fsource.read(5)
target_file_buffer = []
filling_counter = 0

def read_in_data():
    for i in range(0, len(source_file_data)):
        print i
        source = []
        while len(source) < 4:
            source += source_file_data[i]

            if source_file_data[i] == '=':
                filling_counter += 1

            if not(ord('a') <= ord(source_file_data[i]) <= ord('z')) and\
               not(ord('A') <= ord(source_file_data[i]) <= ord('Z')) and\
               not(ord('0') <= ord(source_file_data[i]) <= ord('9'))\
               and not(ord(source_file_data[i]) in [ord('+'), ord('/'), ord('=')]):
                del source[len(source)-1]

            i += 1

        clear_text = convert_to_8bits(source)

        target_file_buffer += clear_text

try:
    read_in_data()
except IndexError:
    print "exception"
    fsource.seek(-1, 1)
    source_file_data = fsource.read(5)
    read_in_data()

ftarget.writelines(target_file_buffer)
ftarget.seek(filling_counter-1, 2)
ftarget.truncate()

ftarget.close()
fsource.close()

print "decoded file "+targetname+" has been written!"
---
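
The "out of scope" symptom described above looks like Python's UnboundLocalError: an assignment inside a function, including `+=`, makes the name local to that function, and a `global` statement at module level has no effect on this. The following is a minimal sketch (Python 3 syntax; `counter`, `buffer`, `broken` and `fixed` are hypothetical names, not taken from the program) of the failure and of one way around it.

```python
counter = 0
buffer = []


def broken():
    # `counter += 1` assigns to `counter`, so Python treats it as a local
    # variable here; reading it before the assignment raises an error.
    counter += 1


def fixed():
    global counter                   # must be declared inside the function
    counter += 1
    buffer.extend(["P", "l", "e"])   # in-place mutation needs no declaration


try:
    broken()
except UnboundLocalError as exc:
    print("broken() failed:", exc)

fixed()
print(counter, buffer)
```

An alternative design that avoids globals altogether is to pass the data into the function and return the decoded result, which also makes the function easier to test on its own.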

Jul 18 '05 #3
