469,593 Members | 2,105 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,593 developers. It's quick & easy.

Import large text file to a MS Access database

hi there,

i have a huge large text file (350.000 lines) that i want to import to
a MS Acccess Database, of course i don't want to use Access, but do it
with C#.

i already have tried the AddRow method or Insert, reading each line of
the text file, the problem of course is velocity, it would take more
than 4 hours to add all lines/records to the database.

any suggestion?
than you

Mar 3 '07 #1
14 10430
On 3 Mar 2007 08:36:21 -0800, mfrsousa wrote:
hi there,

i have a huge large text file (350.000 lines) that i want to import to
a MS Acccess Database, of course i don't want to use Access, but do it
with C#.

i already have tried the AddRow method or Insert, reading each line of
the text file, the problem of course is velocity, it would take more
than 4 hours to add all lines/records to the database.

any suggestion?

than you
A bit more info would help ... do you want each line to go into its own
row in the database or all the lines into one field
--
Bits.Bytes
http://bytes.thinkersroom.com
Mar 3 '07 #2
"mfrsousa" <mf******@gmail.comwrote in message
news:11*********************@t69g2000cwt.googlegro ups.com...
i have a huge large text file (350.000 lines) that i want to import to
a MS Acccess Database, of course i don't want to use Access, but do it
with C#.

i already have tried the AddRow method or Insert, reading each line of
the text file, the problem of course is velocity, it would take more
than 4 hours to add all lines/records to the database.
First of all, there is no such thing as an "Access database" per se -
Microsoft Access, like several other Micrsoft applications, uses the Jet
database format...

You say you don't want to use Access - is there any particular reason for
this...? I strongly suggest you open up the database in Access, open the
table in question, then import the text file using the Get External Data
functionality and time how long it takes - I'm willing to bet it will be
nothing like 4 hours...

Obviously if Access isn't installed, then your options are pretty much
limited to ADO.NET and the Jet OleDb provider. However, you might like to
consider using transactions to "batch" the imports into groups of, say
10,000 records - I've achieved massive speed gains with this technique in
the past...
Mar 3 '07 #3
On 3 Mar 2007 08:36:21 -0800, "mfrsousa" <mf******@gmail.comwrote:
>hi there,

i have a huge large text file (350.000 lines) that i want to import to
a MS Acccess Database, of course i don't want to use Access, but do it
with C#.

i already have tried the AddRow method or Insert, reading each line of
the text file, the problem of course is velocity, it would take more
than 4 hours to add all lines/records to the database.

any suggestion?
For a bulk load of a database, such as you describe, it is often
better to switch off indexing on the affected tables before starting
the load and then rebuilding the indexes from scratch after the bulk
load has finished.

rossum

Mar 3 '07 #4

rossum wrote:
For a bulk load of a database, such as you describe, it is often
better to switch off indexing on the affected tables before starting
the load and then rebuilding the indexes from scratch after the bulk
load has finished.

rossum
Right i was just reading this bulk thing (sorry my ignorance), the
problem is that i have tried something like this:
/************************************** CODE
*******************************/
string conn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=|
DataDirectory|\\db.mdb";
OleDbConnection oleconn = new OleDbConnection(conn);
oleconn.Open();

string sqlstr = "BULK INSERT INTO tblExtData (ext_disc, ext_num,
ext_track, ext_artist) FROM 'c:\\data1.lst')";
OleDbCommand olecom = new OleDbCommand(sqlstr, oleconn);
olecom.ExecuteNonQuery();
/************************************** CODE
*******************************/
and i get this error

Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE',
'SELECT', or 'UPDATE'.
i think that the OleDB doesnt support the bulk command

could you point to some bulk reading or explain better please

thanks

Mar 3 '07 #5

Mark Rae wrote:
First of all, there is no such thing as an "Access database" per se -
Microsoft Access, like several other Micrsoft applications, uses the Jet
database format...

You say you don't want to use Access - is there any particular reason for
this...? I strongly suggest you open up the database in Access, open the
table in question, then import the text file using the Get External Data
functionality and time how long it takes - I'm willing to bet it will be
nothing like 4 hours...

Obviously if Access isn't installed, then your options are pretty much
limited to ADO.NET and the Jet OleDb provider. However, you might like to
consider using transactions to "batch" the imports into groups of, say
10,000 records - I've achieved massive speed gains with this technique in
the past...
Yes i'm developing an applicattion so i don't expect that everyone
will have the MAccess installed to make this operation. I will have a
look on the batch imports.

Thanks

Mar 3 '07 #6
On 3 Mar 2007 12:14:57 -0800, "mfrsousa" <mf******@gmail.comwrote:
>
rossum wrote:
>For a bulk load of a database, such as you describe, it is often
better to switch off indexing on the affected tables before starting
the load and then rebuilding the indexes from scratch after the bulk
load has finished.

rossum

Right i was just reading this bulk thing (sorry my ignorance), the
problem is that i have tried something like this:
/************************************** CODE
*******************************/
string conn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=|
DataDirectory|\\db.mdb";
OleDbConnection oleconn = new OleDbConnection(conn);
oleconn.Open();

string sqlstr = "BULK INSERT INTO tblExtData (ext_disc, ext_num,
ext_track, ext_artist) FROM 'c:\\data1.lst')";
OleDbCommand olecom = new OleDbCommand(sqlstr, oleconn);
olecom.ExecuteNonQuery();
/************************************** CODE
*******************************/
and i get this error

Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE',
'SELECT', or 'UPDATE'.
i think that the OleDB doesnt support the bulk command

could you point to some bulk reading or explain better please

thanks
I do not know MS SQL, all my experience was with Oracle SQL some time
ago.

Basically the sequence is:

Delete all indexes for tables to be loaded
Load data into tables
Rebuild/create indexes for tables just loaded

If you load data into an indexed table then each line inserted into
the table also requires an update to the index(es) for that table. By
dropping all the indexing before a bulk load you do not have the
overhead of re-indexing during the run.

Rebuilding the indexes from scratch is faster than trying to keep them
up to date while loading 350,000 lines into a table.

HTH

rossum

Mar 4 '07 #7

"mfrsousa" <mf******@gmail.comwrote in message
news:11**********************@z35g2000cwz.googlegr oups.com...
>
rossum wrote:
>For a bulk load of a database, such as you describe, it is often
better to switch off indexing on the affected tables before starting
the load and then rebuilding the indexes from scratch after the bulk
load has finished.

rossum

Right i was just reading this bulk thing (sorry my ignorance), the
problem is that i have tried something like this:
/************************************** CODE
*******************************/
string conn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=|
DataDirectory|\\db.mdb";
OleDbConnection oleconn = new OleDbConnection(conn);
oleconn.Open();

string sqlstr = "BULK INSERT INTO tblExtData (ext_disc, ext_num,
ext_track, ext_artist) FROM 'c:\\data1.lst')";
OleDbCommand olecom = new OleDbCommand(sqlstr, oleconn);
olecom.ExecuteNonQuery();
/************************************** CODE
*******************************/
and i get this error

Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE',
'SELECT', or 'UPDATE'.
i think that the OleDB doesnt support the bulk command

could you point to some bulk reading or explain better please

thanks
You are correct, I do not believe Access supports BULK Insert. SQLServer
does, but that's not much help to you.

Your choices are to try to figure out how to use OLE Automation to get
Access to import the file, or to write a line-by-line import. (First option
will be much faster). You might try posting your question to
microsoft.public.framework.adonet unless there's someone over there who can
help you who doesn't read this forum.

Robin S.
Mar 4 '07 #8
"rossum" <ro******@coldmail.comwrote in message
news:v9********************************@4ax.com...
If you load data into an indexed table then each line inserted into
the table also requires an update to the index(es) for that table. By
dropping all the indexing before a bulk load you do not have the
overhead of re-indexing during the run.
Or perform the writes within a transaction - indexes are only updated when
the transaction is committed...
Mar 4 '07 #9
On Mar 4, 7:48 pm, "RobinS" <Rob...@NoSpam.yah.nonewrote:
>
You are correct, I do not believe Access supports BULK Insert. SQLServer
does, but that's not much help to you.

Your choices are to try to figure out how to use OLE Automation to get
Access to import the file, or to write a line-by-line import. (First option
will be much faster). You might try posting your question to
microsoft.public.framework.adonet unless there's someone over there who can
help you who doesn't read this forum.

Robin S.
thanks

Mar 4 '07 #10
Does Access support transactions? I didn't think it did. The real answer to
his question is "use SQLServer instead".

Robin S.
---------------------------------------------
"Mark Rae" <ma**@markNOSPAMrae.comwrote in message
news:%2****************@TK2MSFTNGP04.phx.gbl...
"rossum" <ro******@coldmail.comwrote in message
news:v9********************************@4ax.com...
>If you load data into an indexed table then each line inserted into
the table also requires an update to the index(es) for that table. By
dropping all the indexing before a bulk load you do not have the
overhead of re-indexing during the run.

Or perform the writes within a transaction - indexes are only updated
when the transaction is committed...

Mar 5 '07 #11
"RobinS" <Ro****@NoSpam.yah.nonewrote in message
news:28******************************@comcast.com. ..
Does Access support transactions?
Yep - ever since version 1.0...
http://msdn2.microsoft.com/en-us/lib...ffice.10).aspx
Mar 5 '07 #12
"Mark Rae" <ma**@markNOSPAMrae.comwrote in message
news:e$****************@TK2MSFTNGP05.phx.gbl...
"RobinS" <Ro****@NoSpam.yah.nonewrote in message
news:28******************************@comcast.com. ..
>Does Access support transactions?

Yep - ever since version 1.0...
http://msdn2.microsoft.com/en-us/lib...ffice.10).aspx
Well, it supports it with DAO. The question is, does it support doing
transactions with ADO.Net? I thought I had seen a post in the
microsoft.public.dotnet.framework.adonet group that said it didn't, but I
could be wrong. Wouldn't be the first time; won't be the last. ;-)

Robin S.
Mar 5 '07 #13
Going back in time to ADO we created a link to the text file (via Odbc Text
driver) and then issued a INSERT/SELECT FROM to copy the data from the link
into database proper ... for example;

CREATE "ADOX.Catalog" w_Catalog NO-ERROR.
w_Catalog:ActiveConnection = "<connection-string-to-mdb-database>".
CREATE "ADOX.Table" w_Table.
w_Table:Name = "zzData".
w_Table:ParentCatalog = w_Catalog.
w_Table:Properties("Jet OLEDB:Link Datasource") =
"<path-in-which-test-file-is-located>".
w_Table:Properties("Jet OLEDB:Remote Table Name") = "<name-of-test-file,
eg WSReport#txt>".
w_Table:Properties("Jet OLEDB:Create Link") = True.
w_Table:Properties("Jet OLEDB:Link Provider String")= "TEXT;HDR=No".
w_Catalog:Tables:APPEND(w_Table).
.... (and then )
CREATE "ADODB.Connection" w_DB NO-ERROR.
w_Sql = "INSERT INTO Booking (<fields>) SELECT <fieldsFROM zzData WHERE
<filter-if-required>".
w_DB:EXECUTE(w_Sql, OUTPUT w_Cnt, 0).
w_Sql = 'DROP TABLE zzData'.
w_DB:EXECUTE(w_Sql, OUTPUT w_Cnt, 0).

The code is from a non .Net 4gl, so "some" conversion is required, I have
placed <around most of the parameter elements - hopefully it will give you
some ideas. The down side is the use of the ADOX com for the schema link
(the "INSERT/SELECT FROM" statement you can do through managed code) - not
sure how much of an issue it will be though.
"rossum" <ro******@coldmail.comwrote in message
news:v9********************************@4ax.com...
On 3 Mar 2007 12:14:57 -0800, "mfrsousa" <mf******@gmail.comwrote:

rossum wrote:
For a bulk load of a database, such as you describe, it is often
better to switch off indexing on the affected tables before starting
the load and then rebuilding the indexes from scratch after the bulk
load has finished.

rossum
Right i was just reading this bulk thing (sorry my ignorance), the
problem is that i have tried something like this:
/************************************** CODE
*******************************/
string conn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=|
DataDirectory|\\db.mdb";
OleDbConnection oleconn = new OleDbConnection(conn);
oleconn.Open();

string sqlstr = "BULK INSERT INTO tblExtData (ext_disc, ext_num,
ext_track, ext_artist) FROM 'c:\\data1.lst')";
OleDbCommand olecom = new OleDbCommand(sqlstr, oleconn);
olecom.ExecuteNonQuery();
/************************************** CODE
*******************************/
and i get this error

Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE',
'SELECT', or 'UPDATE'.
i think that the OleDB doesnt support the bulk command

could you point to some bulk reading or explain better please

thanks
I do not know MS SQL, all my experience was with Oracle SQL some time
ago.

Basically the sequence is:

Delete all indexes for tables to be loaded
Load data into tables
Rebuild/create indexes for tables just loaded

If you load data into an indexed table then each line inserted into
the table also requires an update to the index(es) for that table. By
dropping all the indexing before a bulk load you do not have the
overhead of re-indexing during the run.

Rebuilding the indexes from scratch is faster than trying to keep them
up to date while loading 350,000 lines into a table.

HTH

rossum

Mar 6 '07 #14
On Mar 6, 6:29 am, "Colin Stutley" <Colin.Stut...@ws.com.auwrote:
Going back in time to ADO we created a link to the text file (via Odbc Text
driver) and then issued a INSERT/SELECT FROM to copy the data from the link
into database proper ... for example;

CREATE "ADOX.Catalog" w_Catalog NO-ERROR.
w_Catalog:ActiveConnection = "<connection-string-to-mdb-database>".
CREATE "ADOX.Table" w_Table.
w_Table:Name = "zzData".
w_Table:ParentCatalog = w_Catalog.
w_Table:Properties("Jet OLEDB:Link Datasource") =
"<path-in-which-test-file-is-located>".
w_Table:Properties("Jet OLEDB:Remote Table Name") = "<name-of-test-file,
eg WSReport#txt>".
w_Table:Properties("Jet OLEDB:Create Link") = True.
w_Table:Properties("Jet OLEDB:Link Provider String")= "TEXT;HDR=No".
w_Catalog:Tables:APPEND(w_Table).
... (and then )
CREATE "ADODB.Connection" w_DB NO-ERROR.
w_Sql = "INSERT INTO Booking (<fields>) SELECT <fieldsFROM zzData WHERE
<filter-if-required>".
w_DB:EXECUTE(w_Sql, OUTPUT w_Cnt, 0).
w_Sql = 'DROP TABLE zzData'.
w_DB:EXECUTE(w_Sql, OUTPUT w_Cnt, 0).

The code is from a non .Net 4gl, so "some" conversion is required, I have
placed <around most of the parameter elements - hopefully it will give you
some ideas. The down side is the use of the ADOX com for the schema link
(the "INSERT/SELECT FROM" statement you can do through managed code) - not
sure how much of an issue it will be though.

"rossum" <rossu...@coldmail.comwrote in message

news:v9********************************@4ax.com...
On 3 Mar 2007 12:14:57 -0800, "mfrsousa" <mfrso...@gmail.comwrote:
>rossum wrote:
>For a bulk load of a database, such as you describe, it is often
>better to switch off indexing on the affected tables before starting
>the load and then rebuilding the indexes from scratch after the bulk
>load has finished.
>rossum
>Right i was just reading this bulk thing (sorry my ignorance), the
>problem is that i have tried something like this:
>/************************************** CODE
>*******************************/
>string conn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=|
>DataDirectory|\\db.mdb";
>OleDbConnection oleconn = new OleDbConnection(conn);
>oleconn.Open();
>string sqlstr = "BULK INSERT INTO tblExtData (ext_disc, ext_num,
>ext_track, ext_artist) FROM 'c:\\data1.lst')";
>OleDbCommand olecom = new OleDbCommand(sqlstr, oleconn);
>olecom.ExecuteNonQuery();
>/************************************** CODE
>*******************************/
>and i get this error
>Invalid SQL statement; expected 'DELETE', 'INSERT', 'PROCEDURE',
>'SELECT', or 'UPDATE'.
i think that the OleDB doesnt support the bulk command
>could you point to some bulk reading or explain better please
>thanks
I do not know MS SQL, all my experience was with Oracle SQL some time
ago.
Basically the sequence is:
Delete all indexes for tables to be loaded
Load data into tables
Rebuild/create indexes for tables just loaded
If you load data into an indexed table then each line inserted into
the table also requires an update to the index(es) for that table. By
dropping all the indexing before a bulk load you do not have the
overhead of re-indexing during the run.
Rebuilding the indexes from scratch is faster than trying to keep them
up to date while loading 350,000 lines into a table.
HTH
rossum
Solution is here, thank you all for the help
http://groups.google.com/group/micro...60421b4295430b

Mar 6 '07 #15

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

8 posts views Thread by Rune Johansen | last post: by
reply views Thread by adrian GREEMAN | last post: by
6 posts views Thread by jcrouse | last post: by
reply views Thread by Edwin.Madari | last post: by
reply views Thread by suresh191 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.