Hi all,
I need some help with the following issue, which I can't seem to solve.
I have a binary (PCL) file.
In this file I want to search for specific codes (like <0C>). I have
tried to solve it by reading the file character by character, but this
is very slow. For large files (>10 MB) in particular it takes quite
some time.
Does anyone have a hint/clue/solution for this?
Thanks already!
Jeroen
What does the searching have to do with the reading? 10 MB easily fits
into the main memory of a decent PC, so just do

    contents = open("file").read()  # yes I know I should close the file...
    print(contents.find('\x0c'))

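For reference, a rough Python 3 version of the same idea. The function name `first_and_count` is made up for this sketch, and the file is opened in binary mode on the assumption that the PCL data should be treated as raw bytes.

```python
def first_and_count(path, needle=b"\x0c"):
    """Return (offset of the first needle or -1, number of needles) for a file."""
    # Read the whole file at once; a file of around 10 MB fits in memory.
    with open(path, "rb") as handle:
        contents = handle.read()
    # bytes.find and bytes.count scan in C, so no Python-level loop is needed.
    return contents.find(needle), contents.count(needle)
```

The count is the number of form feeds, which is what the page count would be derived from.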
Diez
True. But there is another issue attached to the one I described.
Once I know how often this code occurs, I know the number of pages in the
file. After that I would like to be able to extract a given range of
data:
if file x contains 20 <0C> codes, I would like, for example, to extract
instance 5 through instance 12 from the file.
The reason I want to do this: 0C stands for a page break in the PCL
language. This way I would be able to extract a certain range of
pages from the file.
And? Finding the respective indices with

    positions = []
    # first hit, or -1 if there is none
    last_needle_position = contents.find(needle)
    while last_needle_position != -1:
        positions.append(last_needle_position)
        # continue searching just past the previous hit
        last_needle_position = contents.find(needle, last_needle_position + 1)

will find all the page breaks. Then just slice contents appropriately.
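The slicing step could look roughly like this. It is only a sketch: `slice_pages` is a hypothetical helper, pages are assumed to be numbered from 1, and each page is assumed to end with its own page-break byte.

```python
def slice_pages(contents, positions, first, last):
    """Return pages first..last (1-based, inclusive) from contents.

    positions holds the offsets of the page-break bytes in ascending order;
    page n is taken to end at positions[n - 1], break byte included.
    """
    if not 1 <= first <= last <= len(positions):
        raise ValueError("page range outside 1..%d" % len(positions))
    # Page 1 starts at offset 0; any later page starts right after the
    # break that ends the page before it.
    start = 0 if first == 1 else positions[first - 2] + 1
    end = positions[last - 1] + 1
    return contents[start:end]
```

With `positions` collected by the loop above, `slice_pages(contents, positions, 5, 12)` would return pages 5 through 12. Any setup commands that sit before the first extracted page are not carried over, which a real PCL job may well need.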
Did you read the Python tutorial?
diez
Maybe splitting at '\x0c', selecting/slicing the wanted pages and joining
them again is enough, depending on the size of the files and the available
memory, of course.
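A minimal sketch of that split/select/join approach, assuming Python 3 bytes, 1-based page numbers and a form feed at the end of every page (the name `extract_pages_by_split` is invented for the example):

```python
def extract_pages_by_split(contents, first, last, separator=b"\x0c"):
    """Return pages first..last (1-based, inclusive), each ending in separator."""
    # Splitting drops the separators, so they are put back when joining.
    pages = contents.split(separator)
    selected = pages[first - 1:last]
    if not selected:
        raise ValueError("no pages in the requested range")
    return separator.join(selected) + separator
```

Splitting copies the data, so memory use is roughly twice the file size while this runs.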
One problem I see is that '\x0c' may not always be the page end. It may
occur in "rastered image" data too, I guess.
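One way to guard against that might be to skip over raster payloads while scanning. The sketch below rests on a loud assumption: raster data arrives only through the plain PCL 5 transfer commands `ESC * b # W` and `ESC * b # V`, where `#` is the number of binary bytes that follow. Combined escape sequences and other commands that carry binary data (font downloads, for instance) are not handled, so treat it as a starting point rather than a PCL parser. The name `find_page_breaks` is made up.

```python
import re

# Either a raster transfer command (group 1 = payload length) or a form feed.
TOKEN = re.compile(rb"\x1b\*b(\d+)[WV]|\x0c")

def find_page_breaks(data):
    """Return offsets of form feeds that are not inside a raster payload."""
    offsets = []
    pos = 0
    while True:
        match = TOKEN.search(data, pos)
        if match is None:
            break
        if match.group(1) is not None:
            # Raster transfer: jump over the command and its binary payload.
            pos = match.end() + int(match.group(1))
        else:
            offsets.append(match.start())
            pos = match.end()
    return offsets
```

For example, in `b"page1\x0c\x1b*b3W\x0c\x0c\x0cpage2\x0c"` the three 0x0C bytes after `ESC*b3W` are payload, so only the form feeds at offsets 5 and 19 are reported.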
Ciao,
Marc 'BlackJack' Rintsch
Hi,
your last comment is something I have noticed as well. There are a number
of cases where this happens, so I also have to deal with it.
I will dive into this on Monday, after this hot weekend.
Cheers,
Jeroen
On Jun 8, 2:07 am, "Diez B. Roggisch" <d...@nospam.web.de> wrote:
> ...
> What has the searching to do with the reading? 10MB easily fit into the
> main memory of a decent PC, so just do
>     contents = open("file").read() # yes I know I should close the file...
>     print(contents.find('\x0c'))
> Diez

Better make that open("file", "rb").
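To see why the mode matters, here is a small Python 3 sketch (the helper `read_both_ways` and the `latin-1` encoding are choices made for this illustration only). It writes some bytes to a temporary file and reads them back in binary and in text mode.

```python
import os
import tempfile

def read_both_ways(raw):
    """Write raw bytes to a temporary file and read them back in both modes."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(raw)
        # Binary mode returns the bytes exactly as stored.
        with open(path, "rb") as handle:
            as_bytes = handle.read()
        # Text mode decodes the bytes and, by default, translates line endings.
        with open(path, "r", encoding="latin-1") as handle:
            as_text = handle.read()
    finally:
        os.remove(path)
    return as_bytes, as_text
```

For input `b"a\r\nb\x0c"` the binary read finds the form feed at offset 4, while the text read has collapsed `\r\n` to `\n` and reports offset 3, so offsets found in text mode no longer match the file on disk.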