Best way to delete the first and the last records of a very big file

I am reading some very large files greater than 10 GB. Some of the files
(not all) contain a header and footer record identified by "***" in the
first three characters of the record. I need to delete the header or
footer record before reading the file into a database. What's the best
way to do this in C#?
Any help appreciated.

Joe

*** Sent via Developersdex http://www.developersdex.com ***
Nov 17 '05 #1
If you're able to loop through the records in the very large files, skip
processing whenever the record you're on starts with "***".

using System.Text.RegularExpressions;
....

foreach (string s in records) { // example code
    Match m = Regex.Match(s, "^***");

    if (m.Success) { // this is a header or footer
        continue;
    } else {
        processRecord();
    }
}
Nov 17 '05 #2
In article <#t**************@TK2MSFTNGP12.phx.gbl>,
jeremiah johnson <na*******@gmail.com> wrote:

: booksnore wrote:

: > I am reading some very large files greater than 10 GB. Some of the files
: > (not all) contain a header and footer record identified by "***" in the
: > first three characters of the record. I need to delete the header or
: > footer record before reading the file into a database. Whats the best
: > way to do this in C#?
: > Any help appreciated.
:
: if you're able to loop through the records in the very large
: files, skip processing when you see that the record you're on
: starts with "***".
:
: using System.Text.RegularExpressions;
: ...
:
: foreach (string s in records) { // example code
: Match m = Regex.Match(s, "^***");

Of course, that should be @"^\*\*\*"; otherwise, you'll get an exception
at runtime.

: if (m.Success) { // this is a header or footer
: continue;
: } else {
: processRecord();
: }

If you like guards instead, you could write

if (Regex.IsMatch(input, @"^\*\*\*"))
    continue;

ObJonSkeet: You could also write the test as

if (input.IndexOf("***") == 0)
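
For anyone wondering why the unescaped pattern fails: `*` is a quantifier in .NET regular expressions, so it has to be escaped to match a literal asterisk. Here is a minimal sketch, assuming the marker is exactly the three characters "***". The class and method names are made up for illustration and are not from the thread. `Regex.Escape` builds the escaped pattern so the backslashes do not have to be typed by hand.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class MarkerPattern
{
    // Regex.Escape("***") yields \*\*\*, so the full pattern is ^\*\*\*
    public static readonly Regex Marker =
        new Regex("^" + Regex.Escape("***"), RegexOptions.CultureInvariant);

    // Returns true if the regex parser rejects the pattern.
    public static bool FailsToParse(string pattern)
    {
        try
        {
            new Regex(pattern);
            return false;
        }
        catch (ArgumentException)
        {
            // Invalid patterns surface as ArgumentException (or a subclass).
            return true;
        }
    }
}
```

With this, `MarkerPattern.Marker.IsMatch(line)` is true for a header or footer line, while `MarkerPattern.FailsToParse("^***")` reports that the unescaped pattern is rejected.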

Greg
--
The idea that the vote of a people, no matter how nearly unanimous, makes or
creates or determines what is right or just becomes as absurd and
unacceptable as the idea that right and justice are simply whatever a king
says they are. -- Robert Welch
Nov 17 '05 #3
That depends on how you are "reading the file into a database". If you are
parsing the file yourself, simply skip the input records you want to (as
identified above as having "***" in the first 3 characters). In other
words, do you really need to DELETE the records or just ignore them?
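
If ignoring is enough, a rough sketch of the idea might look like the following. It assumes one record per line and treats every line that starts with the marker as a header or footer, not only the first and last ones. `RecordReader` and `DataRecords` are hypothetical names, not an existing API. Reading line by line keeps memory use flat, which matters for files larger than 10 GB.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration only.
public static class RecordReader
{
    // Lazily yields every line except those that start with the marker.
    public static IEnumerable<string> DataRecords(TextReader reader, string marker = "***")
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.StartsWith(marker, StringComparison.Ordinal))
                continue; // header or footer record: ignore it

            yield return line;
        }
    }
}
```

To use it, open the file with a `StreamReader` inside a `using` block and `foreach` over `RecordReader.DataRecords(reader)`, handing each record to whatever code loads it into the database.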

--
Gordon Smith (eMVP)
-- Avnet Applied Computing Solutions
Nov 17 '05 #4
Greg Bacon <gb****@hiwaay.net> wrote:
> : if (m.Success) { // this is a header or footer
> : continue;
> : } else {
> : processRecord();
> : }
>
> If you like guards instead, you could write
>
> if (Regex.IsMatch(input, @"^\*\*\*"))
>     continue;
>
> ObJonSkeet: You could also write the test as
>
> if (input.IndexOf("***") == 0)

Or, even more readably:

if (input.StartsWith("***"))

I challenge anyone to claim with a straight face that the regex is a
more readable option than the above...
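
One detail worth spelling out, as a sketch rather than a recommendation: the single-argument `StartsWith` and `IndexOf` overloads use culture-sensitive comparison, so for a fixed ASCII marker it is probably safer to pass `StringComparison.Ordinal`. The `MarkerTests` class below is hypothetical and only shows that the two tests give the same answer.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class MarkerTests
{
    private const string Marker = "***";

    // Looks only at the start of the line, comparing raw char values.
    public static bool ByStartsWith(string line)
    {
        return line.StartsWith(Marker, StringComparison.Ordinal);
    }

    // Same answer, but searches the whole line when the marker is absent.
    public static bool ByIndexOf(string line)
    {
        return line.IndexOf(Marker, StringComparison.Ordinal) == 0;
    }
}
```

On ordinary data records `ByIndexOf` has to scan the entire line before giving up, so `ByStartsWith` should do less work per record as well as reading better.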

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Nov 17 '05 #5
Thank you for the replies. Yes, I will be ignoring the records rather
than deleting them. I'm testing your suggestions now to see which
performs best on very large input.

Thank you again

Joe

Nov 23 '05 #6
