Dealing with large files (random access)

Hi,

This post is the 'sequel' ;) to the "Data Oriented vs Object Oriented
Design" post, but it can be read on its own. I will just quote the
beginning of my previous message to lay out the problem:


New to .NET, I'm working on a WinForms client application using VS
2005 beta 2. My data storage needs are as follows:

(1) Small files (0 < length < 10 MB), containing lots of small
'objects' that need to be loaded in memory at runtime, in order to
guarantee a short access time to each 'object'. Each collection of
these 'objects' should bind easily to a DataGridView AND provide
*filtering* and *sorting* capability.

(2) 'Mid-large' files (0 < length < 100 MB), containing lots of
'mid-large objects'. Such files shouldn't be fully loaded in memory,
but the class that deals with them must support random access to
individual 'objects', as well as adding, removing and modifying them.

In fact, a file of type (1) contains an index of the large 'objects'
stored in a corresponding file of type (2). Each index will be used
to fill a DataGridView, and each entry (row) will provide a
'reference' to the 'big object' stored in the corresponding type (2)
file (in addition to some other object-specific details that will of
course be displayed in the DataGridView).

Since this is a Windows client application, using a database with a
data provider is out of the question.

This post deals with type (2) files.

My question will go straight to the point: what is the most efficient
approach to deal with such files?

I haven't thought much about it yet. Again, using a serialized List<T>
(or any other object collection) would be fine, but randomly accessing
an object without loading the whole list into memory is easier said
than done.
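
One possible shape for such a file, sketched here in Python for brevity
(in .NET the same idea would use FileStream.Seek with a BinaryReader):
each 'big object' is stored as a length-prefixed record, and the small
index file keeps an (offset, length) pair per object. The record layout
and the names append_record, read_record and demo are assumptions made
for illustration, not anything defined in this thread.

```python
import os
import struct
import tempfile

# Hypothetical layout: each record is a 4-byte little-endian length prefix
# followed by the payload bytes. This is an assumed format for illustration.
_LEN = struct.Struct("<I")


def append_record(path, payload):
    """Append one record to the data file; return (offset, length) for the index."""
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)  # make the position explicit before tell()
        offset = f.tell()
        f.write(_LEN.pack(len(payload)))
        f.write(payload)
    return offset, len(payload)


def read_record(path, offset):
    """Seek straight to one record and read only that record's bytes."""
    with open(path, "rb") as f:
        f.seek(offset)
        (length,) = _LEN.unpack(f.read(_LEN.size))
        return f.read(length)


def demo():
    """Write three records, then fetch the middle one through the index."""
    fd, path = tempfile.mkstemp(suffix=".dat")
    os.close(fd)
    try:
        index = {}  # key -> (offset, length); the type (1) file would hold this
        for key, payload in [("a", b"first"), ("b", b"second object"), ("c", b"third")]:
            index[key] = append_record(path, payload)
        offset, _ = index["b"]
        return read_record(path, offset), index
    finally:
        os.remove(path)
```

Reading is the easy half. Removing a record, or modifying one so that it
grows, cannot be done in place with this layout; one common approach is
to append the new version, point the index entry at it, and compact the
file from time to time.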

I'm sure you experienced people will have some great starting ideas ;)

Nov 17 '05 #1
Interesting assumption:

"Since this is a Windows client application, using a database with a
data provider is out of the question."


Why in the world would you say that? Access databases and MSDE databases
are both completely free to distribute with your Windows client application.
Download WebMatrix from the Microsoft site and you get MSDE.

For a good writeup of MSDE and comparisons with other tools, see
http://www.builderau.com.au/architec...9130338,00.htm

To see how this plays with the next version of MS software, see
http://www.itwriting.com/sqlyukon2.php

Honestly, the idea of writing something new to do all this is difficult for
me to rationalize.

Your notion of packing all your small files into a big one is not better.
Your small files cannot grow without rewriting the large one. Why not use
the file system... that's what it's for!
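
A minimal sketch of the file-system approach, assuming one file per
object in a single directory (Python is used for brevity; the class
FileObjectStore, its method names and the ".bin" extension are all
hypothetical). It shows the point made above: one object can grow or be
deleted without any other object being rewritten.

```python
import os
import tempfile


class FileObjectStore:
    """Hypothetical store that keeps each object in its own file."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        # Assumption: keys are safe file names (no path separators).
        return os.path.join(self.root, key + ".bin")

    def put(self, key, payload):
        # Write to a temporary file, then swap it in with os.replace.
        tmp = self._path(key) + ".tmp"
        with open(tmp, "wb") as f:
            f.write(payload)
        os.replace(tmp, self._path(key))

    def get(self, key):
        with open(self._path(key), "rb") as f:
            return f.read()

    def delete(self, key):
        os.remove(self._path(key))

    def keys(self):
        # Strip the 4-character ".bin" suffix to recover the keys.
        return sorted(n[:-4] for n in os.listdir(self.root) if n.endswith(".bin"))


def demo_store():
    """Grow one object and delete another; neither touches the other's file."""
    with tempfile.TemporaryDirectory() as root:
        store = FileObjectStore(root)
        store.put("obj1", b"small")
        store.put("obj2", b"other")
        store.put("obj1", b"now much larger than before")
        store.delete("obj2")
        return store.get("obj1"), store.keys()
```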

--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--
Nov 17 '05 #2
