using in-memory zlib deflate from c# (with max performance :-)

Hi, I need to implement in-memory zlib compression in c# to replace an
old c++ app.

Pre-requisites..
1) The performance must be FAST (with source memory sizes from a few k
to a meg).
2) The output must match exactly compression generated with the c++
zlib.org code with default compression.

zlib.org c# code is as slow as hell! Not sure if I'm doing anything
wrong but it crawls - on the plus side it matches what's generated
with the c++ libraries exactly.

System.IO.Compression deflate is fast, as is SharpZipLib, but the output
is different from the zlib.org output.

Any help would be GREATLY appreciated!

Cheers,

Tom
Jun 30 '08 #1
On Jun 30, 10:45 am, tombrog...@googlemail.com wrote:
> Hi, I need to implement in-memory zlib compression in c# to replace an
> old c++ app.
>
> Pre-requisites..
> 1) The performance must be FAST (with source memory sizes from a few k
> to a meg).
> 2) The output must match exactly compression generated with the c++
> zlib.org code with default compression.

That second point is an odd one, and likely to give you issues. What's
the basis of that requirement? Obviously the decompressed data should
be the same, but do you really need the compressed version to be
identical?

Jon
Jun 30 '08 #2
On 30 Jun, 11:14, "Jon Skeet [C# MVP]" <sk...@pobox.com> wrote:
> On Jun 30, 10:45 am, tombrog...@googlemail.com wrote:
> > Hi, I need to implement in-memory zlib compression in c# to replace an
> > old c++ app.
> > Pre-requisites..
> > 1) The performance must be FAST (with source memory sizes from a few k
> > to a meg).
> > 2) The output must match exactly compression generated with the c++
> > zlib.org code with default compression.
>
> That second point is an odd one, and likely to give you issues. What's
> the basis of that requirement? Obviously the decompressed data should
> be the same, but do you really need the compressed version to be
> identical?
>
> Jon

Hi, unfortunately yes.

I'm compressing the data then writing it to a file (the file has an
extremely proprietary format that means I can't just compress into it
directly).

The file will then be read by another c++ process.

Obviously if both match the deflate spec then c++ will be able to read
it, but my solution will be a lot more "acceptable" if the output
files are the same for c# and c++.

Thanks,

Tom
Jun 30 '08 #3
On Jun 30, 11:20 am, tombrog...@googlemail.com wrote:
> > That second point is an odd one, and likely to give you issues. What's
> > the basis of that requirement? Obviously the decompressed data should
> > be the same, but do you really need the compressed version to be
> > identical?
>
> Hi, unfortunately yes.
>
> I'm compressing the data then writing it to a file (the file has an
> extremely proprietary format that means I can't just compress into it
> directly).
>
> The file will then be read by another c++ process.
>
> Obviously if both match the deflate spec then c++ will be able to read
> it, but my solution will be a lot more "acceptable" if the output
> files are the same for c# and c++.

In that case you may find yourself digging into the zlib.org code in
the normal profiling kind of way. It's unlikely that other compressors
will produce *exactly* the same output, although you can try tweaking
options (window sizes etc) to see if that will help.
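
One option worth checking before anything else is the container format. The following is a minimal sketch, not a statement about what any particular C# library emits: it uses Python only because its standard `zlib` module wraps the zlib.org C library, and the helper names `deflate` and `strip_zlib_wrapper` are made up for illustration. A zlib stream (RFC 1950) is a raw deflate stream (RFC 1951) with a 2-byte header in front and a 4-byte Adler-32 checksum behind, so a compressor that writes raw deflate cannot match byte-for-byte until the wrapper is accounted for.

```python
import zlib


def deflate(data: bytes, wbits: int) -> bytes:
    """Compress at the default level with an explicit window-bits setting.

    wbits=15  -> zlib container (header + Adler-32 trailer)
    wbits=-15 -> raw deflate, no header or trailer
    """
    comp = zlib.compressobj(level=zlib.Z_DEFAULT_COMPRESSION,
                            method=zlib.DEFLATED,
                            wbits=wbits)
    return comp.compress(data) + comp.flush()


def strip_zlib_wrapper(stream: bytes) -> bytes:
    """Drop the 2-byte zlib header and the 4-byte Adler-32 trailer."""
    return stream[2:-4]


sample = b"structure payload " * 500       # illustrative input only
wrapped = deflate(sample, zlib.MAX_WBITS)   # zlib format
raw = deflate(sample, -zlib.MAX_WBITS)      # raw deflate format

print("zlib header bytes:", wrapped[:2].hex())
print("sizes (zlib, raw):", len(wrapped), len(raw))
print("bodies identical: ", strip_zlib_wrapper(wrapped) == raw)
```

If the bodies still differ once the wrapper is accounted for, the two compressors are most likely making different match choices, and no amount of option tweaking is guaranteed to line them up.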

Personally I'd try to push back on the "identical output" requirement,
satisfying myself instead with a comprehensive set of tests for the
"compress and then uncompress" cycle. I realise that may be futile in
some situations, but it may be worth pointing out that if the C++ zlib
code is ever patched that may well change the output in a harmless
manner too.
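
A round-trip test of that kind could look roughly like the sketch below. It is written in Python against the standard `zlib` module purely for illustration; the payloads and the names `round_trips` and `sample_payloads` are assumptions, and in the real setup the compress side would be the C# code and the decompress side the C++ reader.

```python
import os
import zlib


def round_trips(payload: bytes) -> bool:
    """True if compressing and then decompressing returns the original bytes."""
    return zlib.decompress(zlib.compress(payload)) == payload


def sample_payloads():
    """Hypothetical inputs covering the size range mentioned in the thread."""
    yield b""                              # empty input
    yield b"\x00" * 4096                   # a few KB, highly compressible
    yield os.urandom(64 * 1024)            # random data, barely compressible
    yield b"0123456789abcdef" * 65536      # 1 MiB of repeating text


failures = [len(p) for p in sample_payloads() if not round_trips(p)]
print("payload sizes that failed the round trip:", failures)
```

The point of the design is that the test checks the property the C++ reader actually depends on (the decompressed bytes), not the exact compressed bytes.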

Jon
Jun 30 '08 #4
On 30 Jun, 11:45, "Jon Skeet [C# MVP]" <sk...@pobox.com> wrote:
> On Jun 30, 11:20 am, tombrog...@googlemail.com wrote:
> > > That second point is an odd one, and likely to give you issues. What's
> > > the basis of that requirement? Obviously the decompressed data should
> > > be the same, but do you really need the compressed version to be
> > > identical?
> > Hi, unfortunately yes.
> > I'm compressing the data then writing it to a file (the file has an
> > extremely proprietary format that means I can't just compress into it
> > directly).
> > The file will then be read by another c++ process.
> > Obviously if both match the deflate spec then c++ will be able to read
> > it, but my solution will be a lot more "acceptable" if the output
> > files are the same for c# and c++.
>
> In that case you may find yourself digging into the zlib.org code in
> the normal profiling kind of way. It's unlikely that other compressors
> will produce *exactly* the same output, although you can try tweaking
> options (window sizes etc) to see if that will help.
>
> Personally I'd try to push back on the "identical output" requirement,
> satisfying myself instead with a comprehensive set of tests for the
> "compress and then uncompress" cycle. I realise that may be futile in
> some situations, but it may be worth pointing out that if the C++ zlib
> code is ever patched that may well change the output in a harmless
> manner too.
>
> Jon

Cheers Jon, I'll do that.

You don't know any way to get maximum performance do you?

I'm having to iterate through many structures (up to 1 million),
compressing them one at a time, with a source data size ranging from a
few kbytes to a meg.

Do you think threading would help? (it will run on multi processor
machines).

Thanks,

Tom
Jun 30 '08 #5
On Jun 30, 12:08 pm, tombrog...@googlemail.com wrote:
> > Personally I'd try to push back on the "identical output" requirement,
> > satisfying myself instead with a comprehensive set of tests for the
> > "compress and then uncompress" cycle. I realise that may be futile in
> > some situations, but it may be worth pointing out that if the C++ zlib
> > code is ever patched that may well change the output in a harmless
> > manner too.
>
> Cheers Jon, I'll do that.
>
> You don't know any way to get maximum performance do you?

Find a bottleneck, squish it. Lather, rinse, repeat :)

The exact details of squishing the bottleneck depend on the kind of
bottleneck, but basically profiling is your friend. Don't expect a
profiler to necessarily give you accurate results - the various
techniques used by different profilers always skew results, but you
can still use them a lot to help. (Basically you need to make sure
you've got a benchmark which runs in release mode, not under a
profiler, to see the *actual* improvements gained by making changes
suggested by the profiler.)
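
As a rough illustration of such a benchmark, here is a small timing harness. It is only a sketch in Python with the standard `zlib` module: the two candidates, the payload and the names `best_time` and `compress_in_chunks` are invented for the example, and the numbers it prints mean nothing for a C# implementation. The shape is what matters: time the real code path directly, several times, outside any profiler.

```python
import time
import zlib


def compress_in_chunks(data: bytes, chunk: int = 64 * 1024) -> bytes:
    """Hypothetical alternative: feed the compressor 64 KiB at a time."""
    comp = zlib.compressobj()
    parts = [comp.compress(data[i:i + chunk])
             for i in range(0, len(data), chunk)]
    parts.append(comp.flush())
    return b"".join(parts)


def best_time(func, payload: bytes, repeats: int = 5) -> float:
    """Smallest wall-clock time in seconds over several runs of func(payload)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(payload)
        best = min(best, time.perf_counter() - start)
    return best


payload = b"0123456789abcdef" * 65536   # 1 MiB, illustrative only
candidates = {"one-shot": zlib.compress, "chunked": compress_in_chunks}
for name, func in candidates.items():
    print(f"{name:>8}: {best_time(func, payload) * 1000:.2f} ms")
```

Taking the best of several runs reduces noise from other processes; rerun the same harness after each change the profiler suggests and keep the change only if the number actually improves.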

> I'm having to iterate through many structures (up to 1 million),
> compressing them one at a time, with a source data size ranging from a
> few kbytes to a meg.
>
> Do you think threading would help? (it will run on multi processor
> machines).

Threading should help in that case, if you've got a naturally parallel
system - if you can compress two data sources independently, without
caring about which ends up being written first, for instance. If it's
not naturally parallel it may be harder, but still feasible.
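
For the naturally parallel case, the idea can be sketched as follows. This is an illustration in Python using the standard `zlib` and `concurrent.futures` modules, not the Parallel Extensions API; the buffer contents, the worker count and the name `compress_all` are assumptions. Each buffer is compressed independently with no shared state, and the results come back in input order so the writer can stay single-threaded and deterministic.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_all(buffers, workers: int = 4):
    """Compress independent buffers on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, buffers))


# Hypothetical stand-ins for the structures: 32 buffers of growing size.
buffers = [bytes([i % 251]) * (4096 + 1000 * i) for i in range(32)]

parallel = compress_all(buffers)
sequential = [zlib.compress(b) for b in buffers]
print("same bytes as the sequential run:", parallel == sequential)
```

Because each buffer is compressed on its own, threading changes only the timing, not the compressed bytes; whether it is actually faster on a given machine is something to measure with a benchmark.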

If you're not close to release and don't mind using beta software,
Parallel Extensions makes life a lot simpler in my experience.

Jon
Jun 30 '08 #6
