Need to know the best method to perform sort merge of log file con

I'm trying to write a text log file processor but am having significant
performance issues.

* On average, there are about 100-200 files to process, each about 1 MB in
size.
* Typically there are ~600k to 1M lines to process in total. Every line in
each log file typically contains a date and a time, followed by a textual
message.
* Unfortunately, not all log files have the same format/layout; e.g. some
may have only month/day while others have year/month/day; some may have
time with milliseconds (12:34:56.768) while others have only the standard
time format (12:34:56).
* Log file contents from all files need to be sort merged to support either
viewing or post-merge parsing/searches/whatever.

I have tried loading all of the file contents into memory, reformatting on
the fly to get dates, times, etc. into the same format, and performing a sort
merge by comparing date and time stamps per line, which took forever (20-30
minutes on average). I also tried using MS Log Parser, which did the job
fairly well but completely consumed the CPU (and still took a while).

Surely there has to be a better approach without requiring a ton of memory
and thrashing disk I/O to read, reformat, and sort merge text log files. Any
suggestions and/or code examples?

Thanks,

Matt
Dec 26 '05 #1
Mesterak,

I will answer by way of a question. Why do you think people developed
databases?

First there were random-access databases, and because of the maintenance
problems with those, people arrived at what are now relational databases.

Have a look at SQL Express.

Your alternative is to do it yourself and create binary files; however, be
aware of what I wrote about maintenance.

I hope this helps,

Cor
Dec 26 '05 #2
I don't need to retain the data, so what maintenance is required?
None. Surely there has to be some algorithm and/or best-practice method out
there for doing this sort of thing without requiring a database.

Honestly, you have replied to three of my posts supplying equally vague
answers. I do appreciate your replies but am not finding the responses
beneficial in reaching solutions. You mention doing it myself and creating
binary files. Ok, so how would this be done with some details and through
code examples?
Dec 26 '05 #3
Mesterak,

"mesterak" wrote:
Honestly, you have replied to three of my posts supplying equally vague
answers. I do appreciate your replies but am not finding the responses
beneficial in reaching solutions. You mention doing it myself and creating
binary files. Ok, so how would this be done with some details and through
code examples?

I was busy searching for the linefeed character for you. However, if this is
your idea, feel free.

But honestly, don't ask a newsgroup something if you are sure you know
the answers better. Most people don't want to be in a newsgroup where the
text replying directly to a given solution is so bad that it can be
thought afterwards that it was their idea.

Good luck,

Cor
Dec 26 '05 #4
There was absolutely no need to take it personally. I was thanking you for
responding but was requesting more detail beyond your "hinting" at a
solution path. You are the MVP with the answers, yes? If I knew the
answers, I wouldn't post questions. So again, I thank you for responding and
only ask for more precision in your answers... only you know what you are
thinking, so please elaborate when responding or "break it down" for
us newbies...

Kindly,

Matt
Dec 26 '05 #5
Hi Matt,

"mesterak" wrote:
There was absolutely no need to take it personally. I was thanking you for
responding but was requesting more detail beyond your "hinting" at a
solution path. You are the MVP with the answers, yes? If I knew the
answers, I wouldn't post questions. So again, I thank you for responding and
only ask for more precision in your answers... only you know what you are
thinking, so please elaborate when responding or "break it down" for
us newbies...


That's not what Cor meant. He means that there are better tools for doing
what you do without having to code your own engine. Try to reuse as much as
possible. Joining is a *natural* operation in databases. You write a simple
query, 3-4 lines long, and get the results in a table, whereas otherwise you
would code an entire engine only to get the same results.
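
To make the "simple query" point concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module with an in-memory database instead of SQL Express, purely so the example is self-contained. The table name, column names, and the merged_rows helper are made up for illustration, and the sketch assumes each line's timestamp has already been normalized to one sortable text layout (YYYY-MM-DD HH:MM:SS.mmm).

```python
import sqlite3


def merged_rows(*sources):
    """Load rows from several logs into one table and return them in time order.

    Each source is an iterable of (timestamp, source_name, message) tuples.
    Assumption: every timestamp is text in the same fixed-width layout, so
    sorting the text sorts the rows chronologically.
    """
    conn = sqlite3.connect(":memory:")  # nothing is retained on disk
    conn.execute("CREATE TABLE log (ts TEXT, source TEXT, message TEXT)")
    for rows in sources:
        conn.executemany("INSERT INTO log VALUES (?, ?, ?)", rows)
    # The whole "sort merge" is this one query.
    query = "SELECT ts, source, message FROM log ORDER BY ts, source"
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Since the data does not need to be retained, an in-memory or temporary database avoids any maintenance; the database is only used as a sorting engine and discarded afterwards.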

Anyway, using a database may not be an option for you. In that case you
could use the ODBC driver for text files and join (merge) the logs that way.
It supports several date/time formats, among other things. You will have to
write a short descriptor file for the log file formats; it describes each
log file as a table to the ODBC driver. Search the net or the respective
newsgroups for more info. Then you write a SQL query that gets the data the
way you want it.

If that isn't an option, then there is not much more to say. Write it by hand.
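
For the write-it-by-hand route, a rough sketch in Python might look like the following. It is not the code referred to elsewhere in this thread, just one possible approach: parse each line's timestamp once, sort each file's records on their own, then combine the sorted streams with a k-way merge (heapq.merge from the standard library). The exact layouts ("year/month/day" or "month/day", a space, then the time), the default_year parameter, and the function names are all assumptions made for illustration.

```python
import heapq
from datetime import datetime

# Assumed time layouts: with and without fractional seconds.
FORMATS = ("%Y/%m/%d %H:%M:%S.%f", "%Y/%m/%d %H:%M:%S")


def parse_line(line, default_year=2005):
    """Split 'date time message' and return (datetime, message).

    Assumption: a date with a single slash is month/day, and the missing
    year is supplied by default_year.
    """
    date_part, time_part, message = line.rstrip("\n").split(" ", 2)
    if date_part.count("/") == 1:
        date_part = f"{default_year}/{date_part}"
    stamp = f"{date_part} {time_part}"
    for fmt in FORMATS:
        try:
            return datetime.strptime(stamp, fmt), message
        except ValueError:
            continue  # try the next layout
    raise ValueError("unrecognized timestamp: " + stamp)


def sorted_records(lines):
    """Parse every line of one log and sort its records by timestamp."""
    return sorted(parse_line(line) for line in lines)


def merge_logs(*line_groups):
    """Lazily merge several logs into one stream ordered by timestamp."""
    return heapq.merge(*(sorted_records(group) for group in line_groups))
```

Each element of line_groups can be an open file object, since iterating a file yields its lines. If every log file is already in chronological order, which is common but not stated in the question, the per-file sort could be dropped and the parsed lines fed to heapq.merge directly, so that only one pending line per file is held in memory at a time.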

Kind regards,
--
Tom Tempelaere.
Dec 29 '05 #6
Thanks, I will check it out ;)

-Matt
Dec 29 '05 #7
Tom,

Besides, this is about my other "vague" answers: Matt is talking about the
code from Jon (which Matt uses now), which, apart from the buffer part, is
based entirely on the same idea as my first answer to him.

The only difference is that Jon wrote it out in code for him.

Cor
Dec 29 '05 #8
Whatever...
Dec 29 '05 #9
