473,385 Members | 1,326 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

CSV File Import

Good morning all,

I have an issue where I am importing a large CSV into Visual Studio 2005
- Visual Basic; as always the creators of the CSV have included the
details of the data at the end file (Some 1.6 million records from the
beginning). I am doing some evaluation on the file before I undertake
the importation, currently this is being done by reading all rows until
I get to the end then returning the last 10 or so rows; does anyone know
how I can read from the bottom of the file up to negate the 1.6 million
cycles of the read procedure before I can get to the data I need.

Any assistance you may be able to provide would be greatly appreciated.

Yours sincerely,

William Foster

*** Sent via Developersdex http://www.developersdex.com ***
Mar 17 '06 #1
11 2038
William Foster wrote:
Good morning all,

I have an issue where I am importing a large CSV into Visual Studio 2005
- Visual Basic; as always the creators of the CSV have included the
details of the data at the end file (Some 1.6 million records from the
beginning). I am doing some evaluation on the file before I undertake
the importation, currently this is being done by reading all rows until
I get to the end then returning the last 10 or so rows; does anyone know
how I can read from the bottom of the file up to negate the 1.6 million
cycles of the read procedure before I can get to the data I need.

Any assistance you may be able to provide would be greatly appreciated.

Yours sincerely,

William Foster

*** Sent via Developersdex http://www.developersdex.com ***


There's probably a simpler way, but I've done it before by first opening
the file in Binary mode, and as CSV files are generally line-delimited
by CrLf (Hex 0D 0A) characters, I start by obtaining the file length and
reading backwards (using SEEK) to find the second-last CrLf.

Between the second-last and the last CrLf would be your header
information you were looking for.

I trust this helps, although as already said, maybe someone else has a
better way.

ShaneO

There are 10 kinds of people - Those who understand Binary and those who
don't.
Mar 17 '06 #2

"William Foster" <no****@devdex.com> wrote in message
news:%2***************@TK2MSFTNGP12.phx.gbl...
Good morning all,

I have an issue where I am importing a large CSV into Visual Studio 2005
- Visual Basic; as always the creators of the CSV have included the
details of the data at the end file (Some 1.6 million records from the
beginning). I am doing some evaluation on the file before I undertake
the importation, currently this is being done by reading all rows until
I get to the end then returning the last 10 or so rows; does anyone know
how I can read from the bottom of the file up to negate the 1.6 million
cycles of the read procedure before I can get to the data I need.


Have you tried MSWord?
Mar 17 '06 #3
Shane,

Thanks for your assistance, I will give that a go, it sounds a bit
awkward, but it is much better than the way I am doing it.

Thank you !

Yours sincerely,

William Foster

*** Sent via Developersdex http://www.developersdex.com ***
Mar 17 '06 #4
Homer,

I was previously using Microsoft Excel, but I need to move to Visual
Studio for a number of reasons. In Microsoft Word I would still need to
seek the last record by reading all rows anyway wouldn't I ? Or is there
a quick way ?

Thanks for you assistance.

Yours sincerely,

William Foster

*** Sent via Developersdex http://www.developersdex.com ***
Mar 17 '06 #5
William Foster wrote:
Shane,

Thanks for your assistance, I will give that a go, it sounds a bit
awkward, but it is much better than the way I am doing it.

Thank you !

Yours sincerely,

William Foster

*** Sent via Developersdex http://www.developersdex.com ***


Hello William,

For what it's worth, I've found the code I mentioned earlier.

You'll have to excuse the syntax, it was just a quick and dirty routine
I needed for a one-off.

If you copy & paste it (watch for line-wrapping!), then set the
filename, you'll find it will display your Header data as required.
Dim blPossibleEOL As Boolean = False
Dim byTemp As Byte = 0
Dim iFileNum As UInt16 = FreeFile()
Dim sA As String = "YOUR FILENAME HERE!!!"
Dim sHeader As String = ""

Dim iFileEnd As Integer = FileLen(sA) - 2
Dim iFileSeekPosition As Integer = iFileEnd
FileOpen(iFileNum, sA, OpenMode.Binary, OpenAccess.Read)

Do
FileGet(iFileNum, byTemp, iFileSeekPosition)
If blPossibleEOL Then
If byTemp = &HD Then
sHeader = StrDup(iFileEnd - (iFileSeekPosition + 1), " ")
FileGet(iFileNum, sHeader, iFileSeekPosition + 2)
Exit Do
End If
ElseIf byTemp = &HA Then
blPossibleEOL = True
Else
blPossibleEOL = False
End If
iFileSeekPosition -= 1
Loop
FileClose(iFileNum)

Debug.Print(sHeader)
Hope this helps.

ShaneO

There are 10 kinds of people - Those who understand Binary and those who
don't.
Mar 17 '06 #6

"William Foster" <no****@devdex.com> wrote in message
news:eU**************@TK2MSFTNGP11.phx.gbl...
Homer,

I was previously using Microsoft Excel, but I need to move to Visual
Studio for a number of reasons. In Microsoft Word I would still need to
seek the last record by reading all rows anyway wouldn't I ? Or is there
a quick way ?


I believe you could just go <CTRL>-<END> and get the last several lines.

Mar 17 '06 #7
Shane,

Thanks a lot ! That will save me a fair bit of work.

Yours sincerely,

William Foster

*** Sent via Developersdex http://www.developersdex.com ***
Mar 17 '06 #8
Homer,

Thanks for the tip, however, I need to do it progammatically as it is
part of a larger program to clean a file, this first bit just assesses
the data. There are in excess of 1.6 million records, with
73 fields per record so opening it up into Microsoft Word and then
getting the program to scroll to the bottom with your method will result
in a computer crash. Kind of the way that Microsoft Excel is currently
dying whcih is why I am moving to Visual Studio.

Thanks anyway !

Yours sincerely,

William Foster

*** Sent via Developersdex http://www.developersdex.com ***
Mar 17 '06 #9

"William Foster" <no****@devdex.com> wrote in message
news:eB**************@tk2msftngp13.phx.gbl...
Thanks for the tip, however, I need to do it progammatically as it is
part of a larger program to clean a file, this first bit just assesses
the data. There are in excess of 1.6 million records, with
73 fields per record so opening it up into Microsoft Word and then
getting the program to scroll to the bottom with your method will result
in a computer crash. Kind of the way that Microsoft Excel is currently
dying whcih is why I am moving to Visual Studio.


OK. If it was a one off I'd try Word. You could also do this with vi on a
Unix system.

Otherwise the best method is to open the file, go to the end, back up some
and then read the data in. This is what Word and vi can do as both can
handle files far larger than main memory.

Mar 17 '06 #10
Shane,

I tried that routine out and it worked beautifully; it was amazingly
quick to search through on of my data files which was in excess of
700,000 rows.

I have just one more question, now that I have the last row, do you have
any ideas on how to capture the last sixteen rows? I have tried a few
different things with your existing code but found that the FileGet must
work on some type of algorithm as it doesn't read sequentially and in
order to find the end of the file it only did 26 loops within the whole
file.

Thank you very much for your assistance so far it has been invaluable.

Yours sincerely,

William Foster

*** Sent via Developersdex http://www.developersdex.com ***
Mar 18 '06 #11
William Foster wrote:
Shane,

I tried that routine out and it worked beautifully; it was amazingly
quick to search through on of my data files which was in excess of
700,000 rows.

I have just one more question, now that I have the last row, do you have
any ideas on how to capture the last sixteen rows? I have tried a few
different things with your existing code but found that the FileGet must
work on some type of algorithm as it doesn't read sequentially and in
order to find the end of the file it only did 26 loops within the whole
file.

Thank you very much for your assistance so far it has been invaluable.

Yours sincerely,

William Foster

*** Sent via Developersdex http://www.developersdex.com ***


William,

The following code will do what you want. (Watch for line-wrapping!) -
Dim blPossibleEOL As Boolean = False
Dim byTemp As Byte = 0
Dim iFileNum As UInt16 = FreeFile()
Dim sA As String = "YOUR FILENAME HERE!!"
Dim sHeader As String = ""
Dim iCounter As UInt16 = 0

Dim iFileEnd As Integer = FileLen(sA) - 2
Dim iFileSeekPosition As Integer = iFileEnd
FileOpen(iFileNum, sA, OpenMode.Binary, OpenAccess.Read)

Do
FileGet(iFileNum, byTemp, iFileSeekPosition)
If blPossibleEOL Then
If byTemp = &HD Then
iCounter += 1
If iCounter = 26 Then 'Change this for whatever number of
lines you are wanting.
sHeader = StrDup(iFileEnd - (iFileSeekPosition + 1), " ")
FileGet(iFileNum, sHeader, iFileSeekPosition + 2)
Exit Do
Else
blPossibleEOL = False
End If
Else
blPossibleEOL = False
End If
ElseIf byTemp = &HA Then
blPossibleEOL = True
Else
blPossibleEOL = False
End If
iFileSeekPosition -= 1
Loop
FileClose(iFileNum)

Debug.Print(sHeader)

One thing to be careful of is that I have declared both "iFileEnd" and
"iFileSeekPosition" as Integer variables. If the CSV file you're
reading is larger that approx 2GB you'll need to change that to Long
Integer (UInt64).

Hope this helps.

ShaneO

There are 10 kinds of people - Those who understand Binary and those who
don't.
Mar 18 '06 #12

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

6
by: Pekka Niiranen | last post by:
Hi, I have used the following example from win32 extensions: -----SCRIPT STARTS---- import win32file import win32con import win32security import pywintypes
2
by: Cigdem | last post by:
Hello, I am trying to parse the XML files that the user selects(XML files are on anoher OS400 system called "wkdis3"). But i am permenantly getting that error: Directory0: \\wkdis3\ROOT\home...
1
by: DierkErdmann | last post by:
Hi ! I am trying to create an exe file using pyinstaller. Running the created exe-File gives the error message "" Traceback (most recent call last): File "<string>", line 8, in <module> File...
0
by: lovecarole | last post by:
hi, i am the student who should write a program about reading wav file and do the DFT. actually i don't know how to read data of the wav song and save it into the array... if i want to read...
4
by: AshishMishra16 | last post by:
HI friends, I am using the Flex to upload files to server. I m getting all the details about the file, but I m not able to upload it to Server. Here is the code i m using for both flex & for...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.