Need for efficient method for File Parsing and copying

Tom
I am trying to update another developer's code, and am stuck in a
kludge.

It works like this:

A user uploads a file from the web page. Our code reads from a
NetworkStream to a FileStream, and writes out a temporary file. Then
the file is parsed (the file contains HTML data + the actual binary
file that was transmitted). As part of the parsing, once the main
binary file is to be parsed out, it is written with a new FileStream,
byte by byte, to a new location on disk.

This was great when the file was 24 MB, but now it is 70 MB, and our
Windows CE device cannot hold 2 copies of this large file. Not to
mention that the additional copy takes a while to perform.

To me it seems that there are two methods to solve this, the first
being to read the NetworkStream in real time until the actual binary file
shows up, then write it directly to a FileStream. The second method, which to me
seems simpler, is to just take the large temp file, and truncate off
the non-binary data.

By the way, I cannot use a MemoryStream, as I have allocated most of
the device memory to storage, so there is not enough free memory.

Anyone have any advice??

Tom Kuhn
Aug 27 '08 #1
On Wed, 27 Aug 2008 10:08:47 -0700, Tom <ku****@gmail.com> wrote:
> [...]
> To me it seems that there are two methods to solve this, the first
> being to read the NetworkStream in real time until the actual binary file
> shows up, then write it directly to a FileStream. The second method, which to me
> seems simpler, is to just take the large temp file, and truncate off
> the non-binary data.

Well, the second method would definitely be simpler if the file system
allowed you to truncate a file from the beginning.

But, it doesn't. You'd have to rewrite the entire file in-place, shifting
the binary data down within the file, and then truncate the end (which you
can do).
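
For illustration, the shift-and-truncate approach described above could look roughly like the following. This is only a sketch, written in Python as language-neutral pseudocode rather than the .NET code the thread is about; the function name, the `offset` parameter (the position where the binary data starts) and the chunk size are all assumptions.

```python
def strip_prefix_in_place(path, offset, chunk_size=64 * 1024):
    """Drop the first `offset` bytes of the file at `path` without making a second copy.

    The payload is shifted toward the start of the file one chunk at a time,
    then the now-unused tail is cut off.
    """
    with open(path, "r+b") as f:
        read_pos = offset   # where the next unread payload bytes live
        write_pos = 0       # where they should end up
        while True:
            f.seek(read_pos)
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # write_pos never passes read_pos, so unread data is never overwritten
            f.seek(write_pos)
            f.write(chunk)
            read_pos += len(chunk)
            write_pos += len(chunk)
        # Truncating from the end is the part the file system does allow.
        f.truncate(write_pos)
```

Note that this still reads and rewrites every payload byte, so it saves disk space but not time.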

"Simpler" is in the eye of the beholder, but personally I'd just go with
your first suggestion: process the incoming stream as it's read, writing
only the data you need to the file.

If you've already got code that takes the FileStream, parses that and
emits another FileStream, it should be simple enough to change that code
to just take the NetworkStream as input instead. Ideally, you would have
written the code so that it actually just takes a Stream and not a
FileStream. In that case, there's practically nothing to change. Just
pass a different Stream to the code.
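
A rough sketch of that idea follows, again in Python as a stand-in for the .NET code. It assumes, hypothetically, that the binary data starts right after a known byte sequence (`marker`) in the incoming data; the actual layout of the POST data is not given in this thread, and anything that follows the binary data (such as a trailing boundary) is not handled. The function only relies on `read` and `write`, so the source can be a network stream or a file.

```python
def copy_after_marker(src, dst, marker, chunk_size=8192):
    """Copy everything after the first `marker` in `src` to `dst`.

    `src` and `dst` are any binary file-like objects. Returns the number of
    payload bytes written, or -1 if the marker never appears.
    """
    if not marker:
        raise ValueError("marker must not be empty")
    keep = len(marker) - 1   # bytes to carry over in case the marker straddles two reads
    tail = b""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return -1        # stream ended before the payload started
        window = tail + chunk
        idx = window.find(marker)
        if idx >= 0:
            rest = window[idx + len(marker):]
            break
        tail = window[-keep:] if keep else b""
    # Everything from here on is payload: stream it straight to the destination.
    dst.write(rest)
    written = len(rest)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return written
        dst.write(chunk)
        written += len(chunk)
```

Only the small carry-over buffer and one chunk are held in memory at a time, and no temporary copy of the upload is written to disk.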

But if not, it would just be a matter of changing the FileStream
variable(s) to Stream variable(s). Again, ideally you would not have
used any of the members specific to FileStream, but if you did, it should not be
that hard to rework the code so that it doesn't (either just don't do
whatever it was you were doing with them, or provide the same information
some other way as needed).

Of course, the simplest approach is to somehow redefine the problem so
that you can just upgrade the storage space to accommodate the larger data
size. :)

Pete
Aug 27 '08 #2
> A user uploads a file from the web page.

Upload or download?
From the description below it seems that you are getting the file from
a web server to a Win CE device.

> Our code reads from a
> NetworkStream to a FileStream, and writes out a temporary file. Then
> the file is parsed (the file contains HTML data + the actual binary
> file that was transmitted). As part of the parsing, once the main
> binary file is to be parsed out, it is written with a new FileStream,
> byte by byte, to a new location on disk.
>
> This was great when the file was 24 MB, but now it is 70 MB, and our
> Windows CE device cannot hold 2 copies of this large file. Not to
> mention that the additional copy takes a while to perform.

You can delete the old file. If that is not possible because you need at
least one working copy, then you can try to zip it. Download the new
file (it should also be zipped), and only when the new file has
downloaded OK do you delete the old one and unzip the new one.

Note that with files that big it might take a LONG time to download &
process all the info. Maybe it's time to change the delivery method of
your app.

> Anyone have any advice??

As I said above, maybe it's time to rethink your strategy. 70 MB of
data alone is a LOT, especially when some devices have only 128 MB.
Aug 28 '08 #3
Tom
Zipping... Now that's an idea that for some reason never crossed my
mind! The file that we are copying is a full Win CE flash file system
image and is 63 MB unzipped... Well, I zipped it and WOW, only 12 MB.
Since we already handle unzipping on the device, it should be no
problem to go this route. Good idea.
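
As a rough illustration of why this helps with disk space: only the small archive plus the one extracted image need to exist on disk at the same time, and the decompression can run in fixed-size chunks so memory use stays low. The sketch below uses Python's standard `zipfile` module purely as a stand-in; how the device actually unzips is not described in this thread, and the function and parameter names are assumptions.

```python
import shutil
import zipfile


def extract_member(zip_path, member, dest_path, chunk_size=64 * 1024):
    """Decompress one archive member to `dest_path`, one chunk at a time."""
    with zipfile.ZipFile(zip_path) as zf:
        with zf.open(member) as src, open(dest_path, "wb") as dst:
            # copyfileobj reads and writes `chunk_size` bytes per step,
            # so the whole image is never held in memory.
            shutil.copyfileobj(src, dst, chunk_size)
```

Once the extracted file has been verified, the archive can be deleted to free the remaining space.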

As for what we were doing, the file was transferred via HTTP POST from
a user's computer. The NetworkStream grabbed the POST data and built a
temp file with the file along with the POST data in it. I was trying
to analyse the data in real time and find the beginning of the actual
file in a byte-by-byte process reading from the stream.

I think this zip solution will work out well.

Thanks,
Tom


On Aug 28, 12:18 pm, "Ignacio Machin ( .NET/ C# MVP )"
<ignacio.mac...@gmail.com> wrote:
> > A user uploads a file from the web page.
>
> Upload or download?
> From the description below it seems that you are getting the file from
> a web server to a Win CE device.
>
> > Our code reads from a
> > NetworkStream to a FileStream, and writes out a temporary file. Then
> > the file is parsed (the file contains HTML data + the actual binary
> > file that was transmitted). As part of the parsing, once the main
> > binary file is to be parsed out, it is written with a new FileStream,
> > byte by byte, to a new location on disk.
> >
> > This was great when the file was 24 MB, but now it is 70 MB, and our
> > Windows CE device cannot hold 2 copies of this large file. Not to
> > mention that the additional copy takes a while to perform.
>
> You can delete the old file. If that is not possible because you need at
> least one working copy, then you can try to zip it. Download the new
> file (it should also be zipped), and only when the new file has
> downloaded OK do you delete the old one and unzip the new one.
>
> Note that with files that big it might take a LONG time to download &
> process all the info. Maybe it's time to change the delivery method of
> your app.
>
> > Anyone have any advice??
>
> As I said above, maybe it's time to rethink your strategy. 70 MB of
> data alone is a LOT, especially when some devices have only 128 MB.
Aug 28 '08 #4
On Aug 28, 1:36 pm, Tom <kuh...@gmail.com> wrote:
> I think this zip solution will work out well.
>
> Thanks,
> Tom

How do you do zipping / unzipping on the device?
Sep 12 '08 #5
