Reading textfiles line by line

jamait
Guest
 
Posts: n/a
#1: Jul 22 '05
Hi all,

I'm trying to read a text file into a DataTable.

I'm not sure how to split up the information, though: regex or substrings?

Sample:
Col1 Col2 Col3 Col4
A0012430 REKAL TVÄTTMEDEL EKOMAX 0,5L ST 75.9000
A0012550 REKAL TVÄTTMEDEL BIOKULÖR 20KGST 1727.0000

Notice how the 2nd row has merged Col2 and Col3. There is no delimiter at
all, but the position seems to be the same depending on the text in col2. The
original file contains approx. 5000 lines that I want to use to update a SQL
table. The above problem occurs at many positions in the text file.

I have successfully read data from the text file using this code:

<code>
using (StreamReader sr = File.OpenText(fileName))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        if (line.Length > 17)
        {
            DataRow dr = m_Data.NewRow();
            dr["Col1"] = line.Substring(0, 17).Trim();
            // How do I extract these three columns?
            // dr["Col2"] = ?
            // dr["Col3"] = ?
            // dr["Col4"] = ?
            m_Data.Rows.Add(dr);
        }
    }
}
</code>

Any help appreciated!

/Martin


Patrice
Guest
 
Posts: n/a
#2: Jul 22 '05

re: Reading textfiles line by line


I'm not sure what you meant by "the position seems to be the same depending
on the text in col2".

What if you display this file using a fixed-width font such as Courier New?
Are all the fields aligned?

To me this looks like a fixed-width file: each column always occupies the
same range of characters on every line (which may not be immediately visible
in a proportional font).

Patrice



Dave
Guest
 
Posts: n/a
#3: Jul 22 '05

re: Reading textfiles line by line


> The original file include approx. 5000 lines

Is this a one-time operation? If you are using SQL Server, you can write a simple DTS package to do the transformation, or just use
the Import Data command in Enterprise Manager.

--
Dave Sexton
dave@www..jwaonline..com


jamait
Guest
 
Posts: n/a
#4: Jul 22 '05

re: Reading textfiles line by line


Hi again,

It is not a one-time operation. What I really want to do is split each
line into 4 parts.

The first field is an ID, the second is a description, the third is the unit,
in the format of a known char array, and the last column is the price of the
product.
Unfortunately, when the file is opened as plain text and read line by line,
the 2nd and 3rd columns merge on some of the rows.
The merging only occurs where the description is long enough, probably at the
length of the longest description in the file. The unit is then appended to
the description without any delimiter, which is the main problem.
The other columns in the file are separated by more than one space.

I am thinking of using some kind of regular expression to extract the
information line by line, but I am not sure how to split the 2nd and 3rd
columns.
A0012430 REKAL TVÄTTMEDEL EKOMAX 0,5L ST 75.9000
A0012550 REKAL TVÄTTMEDEL BIOKULÖR 20KGST 1727.0000

I have also updated the StreamReader to use the default encoding so that the
Swedish characters display properly.


Help...

/Martin
StreamReader sr = new StreamReader(file, System.Text.Encoding.Default);
string line = "";
while ((line = sr.ReadLine()) != null)
{
    if (line.Length > 17)
    {
        DataRow dr = m_Data.NewRow();
        dr["Col1"] = line.Substring(0, 17).Trim();
        dr["Col2"] = ?
        dr["Col3"] = ?
        dr["Col4"] = ?
        m_Data.Rows.Add(dr);
    }
}
sr.Close();


Patrice
Guest
 
Posts: n/a
#5: Jul 22 '05

re: Reading textfiles line by line


Sorry, but I'm afraid I still don't see the exact problem. The two lines
you showed us use the same format.

There is NO separator. Each field has a *fixed* width:

Copy the two lines below:
A0012430 REKAL TVÄTTMEDEL EKOMAX 0,5L ST 75.9000
A0012550 REKAL TVÄTTMEDEL BIOKULÖR 20KGST 1727.0000

Paste them into Notepad and use the Courier New font.

You'll see that 5L and 20KG start at the same position (and are always 4
characters wide). ST and ST start at the same position too.
The last field doesn't, but it is right-justified in a field that begins at
some unknown position. This is the only problem I see: you'll have to find
out the *fixed* length of the third field so that you can start reading the
4th field from the correct position.

Patrice
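
Because the price is right-justified, one way to sidestep the unknown start position is to read it from the end of the line instead. The following is a minimal C# sketch, assuming the price is always the last whitespace-separated token and uses a dot as the decimal separator, as in the two sample lines. `PriceField` and `ParseTrailingPrice` are hypothetical names, not something from the thread.

```csharp
using System;
using System.Globalization;

static class PriceField
{
    // Returns the last whitespace-separated token of the line as a decimal,
    // or null when that token is not a number.
    // Assumption: the price is the final field and uses '.' as decimal separator.
    public static decimal? ParseTrailingPrice(string line)
    {
        string trimmed = line.TrimEnd();
        int lastGap = trimmed.LastIndexOfAny(new[] { ' ', '\t' });
        string token = trimmed.Substring(lastGap + 1); // whole line when there is no gap

        decimal price;
        bool ok = decimal.TryParse(
            token,
            NumberStyles.AllowDecimalPoint | NumberStyles.AllowLeadingSign,
            CultureInfo.InvariantCulture,
            out price);
        return ok ? price : (decimal?)null;
    }
}
```

Parsing with `CultureInfo.InvariantCulture` is deliberate: on a machine with Swedish regional settings the default decimal separator is a comma, so a culture-sensitive parse of `75.9000` would likely fail or give the wrong value.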




jamait
Guest
 
Posts: n/a
#6: Jul 22 '05

re: Reading textfiles line by line


So by getting the column positions from the first row, I can use them for
the following lines?

I think the column width is set by the maximum length of the 2nd column, so
it would probably change if the description in any of the rows were longer.
The file is an export from a different software package that I am not able
to change.

Is there a way to replace 2 or more spaces with a separator, or is there a
better way to split the columns?

Thanks

/Martin
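
On the literal question: runs of two or more spaces can be replaced, or split on directly, with a regular expression. The sketch below is only an illustration; `SpaceRuns` and its methods are hypothetical names, and the `;` separator assumes that character never occurs in the data. It cannot separate rows where the description and the unit touch, because there is no space between them to match.

```csharp
using System.Text.RegularExpressions;

static class SpaceRuns
{
    // Replaces every run of two or more spaces with a single ';'.
    // Assumption: ';' never appears in the data itself.
    public static string ReplaceSpaceRuns(string line)
    {
        return Regex.Replace(line.Trim(), " {2,}", ";");
    }

    // Splits directly on runs of two or more spaces, without a separator.
    public static string[] SplitOnSpaceRuns(string line)
    {
        return Regex.Split(line.Trim(), " {2,}");
    }
}
```

With invented spacing (the forum collapsed the original whitespace), a line such as `A0012430  REKAL TVÄTTMEDEL EKOMAX 0,5L  ST   75.9000` splits into four parts, while a merged line such as `A0012550  REKAL TVÄTTMEDEL BIOKULÖR 20KGST  1727.0000` yields only three, with `ST` still attached to the description.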
Chris Mayers
Guest
 
Posts: n/a
#7: Jul 22 '05

re: Reading textfiles line by line


Hi,

As other people have pointed out, this is a FIXED WIDTH file. Something like
this should do the job:

dr["Col1"] = line.Substring(0, 17).Trim();
dr["Col2"] = line.Substring(17, 30).Trim();
// May need to adjust the length of Col3 and the starting position of Col4.
// Not enough data to be sure.
dr["Col3"] = line.Substring(47, 2).Trim();
dr["Col4"] = line.Substring(49).Trim();

If you give me some more sample lines, I can be more specific.

Cheers,

Chris.
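
A slightly fuller sketch of the Substring approach above, with a guard so that short or truncated lines do not make `Substring` throw an `ArgumentOutOfRangeException`. The offsets (17, 30, 2) are the guesses from this post and are not verified against the real file; `FixedWidth`, `Slice` and `ParseLine` are hypothetical names.

```csharp
using System;

static class FixedWidth
{
    // Like Substring + Trim, but returns what is available (possibly an
    // empty string) when the line is shorter than start + length.
    public static string Slice(string line, int start, int length)
    {
        if (start >= line.Length) return string.Empty;
        int available = Math.Min(length, line.Length - start);
        return line.Substring(start, available).Trim();
    }

    // Assumed layout: Col1 = 17 chars, Col2 = 30 chars, Col3 = 2 chars,
    // Col4 = the rest of the line. Adjust once the real widths are known.
    public static string[] ParseLine(string line)
    {
        return new[]
        {
            Slice(line, 0, 17),
            Slice(line, 17, 30),
            Slice(line, 47, 2),
            Slice(line, 49, int.MaxValue)
        };
    }
}
```

As displayed in the thread, "REKAL TVÄTTMEDEL BIOKULÖR 20KG" is exactly 30 characters long. If the description field really is 30 wide, that would explain why this row runs straight into "ST" while the shorter description on the other row is followed by padding.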



Patrice
Guest
 
Posts: n/a
#8: Jul 22 '05

re: Reading textfiles line by line


IMO it will never change. For example, if I have a product that handles a
first field of 10 chars and a second of 4 chars, the vendor could decide to
export its data as a text file using 10 chars for the first field and 4 chars
for the second.

Whatever the data is, this will never change and each line will always have
14 chars. There is never a need for more characters, as the first field can't
hold more than 10 and the second can't hold more than 4.

IMO this is how this file works.

These are known as "fixed-size" field files (the size of each field never
changes), as opposed to "delimited" files, in which there is a delimiter
between fields.

Patrice
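
The 10 + 4 example above can be sketched as a small table-driven splitter. This assumes the field widths are known up front; `FixedSizeRecord` and `Split` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

static class FixedSizeRecord
{
    // Cuts a record into consecutive fields of the given widths and trims
    // the padding. Fields that fall beyond the end of the record come back
    // as empty strings.
    public static List<string> Split(string record, int[] widths)
    {
        var fields = new List<string>();
        int pos = 0;
        foreach (int width in widths)
        {
            int take = Math.Max(0, Math.Min(width, record.Length - pos));
            fields.Add(take > 0 ? record.Substring(pos, take).Trim() : string.Empty);
            pos += width;
        }
        return fields;
    }
}
```

With widths of 10 and 4, the record `WIDGET    12  ` gives `WIDGET` and `12`, and `ABCDEFGHIJ1234` gives `ABCDEFGHIJ` and `1234` even though no space separates the two values, which is the same situation as the "merged" columns in the original file.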

