Split large text file by number of lines?

Hello,

I'm a beginner in VB.NET. What I would like to do is as
follows.

I have a text file (a list of names, one name per line) which
is about 350,000 lines long. I would like to split it, creating a new
file every, let's say, 20,000 lines, so the output directory would
look something like this:

File1: 1-20000 lines of the original file
File2: 20001-40000 lines of the original file
File3: 40001-60000 lines of the original file

etc.

Can it be done simply? One form with a field to enter the number of
lines, a button to load a text file, and a "Start" button...

Thanks in advance

Feb 21 '07 #1
Yes.

Read the source file line by line

Write each line to the target file

After each nth line, close the target file and open a new one (with a
different name of course).
Feb 21 '07 #2
This code I have written works, but it takes some time to complete (about 50
seconds for a 1 MB text file).

Maybe it would be better to do a "read all" and then use the Split(str, vbCrLf)
function. Anyway, this should do it.

Add a textbox and a button. This was created in VB.NET 2005 (the free
version from Microsoft).

Private Sub Button1_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button1.Click

    TextSplitter()

End Sub

Sub TextSplitter()

    ' Read the chunk size once instead of re-parsing the textbox for every line
    Dim LinesPerFile As Integer = CInt(TextBox1.Text)

    Dim sb As New System.Text.StringBuilder
    Dim LineCounter As Integer = 0
    Dim FileNumber As Integer = 0

    Me.Text = "processing file... "

    ' Open the source file and assign it to a stream reader
    Using AsciiStreamReader As IO.StreamReader = _
            IO.File.OpenText("C:\HugeSourceTextFile1.txt")

        While Not AsciiStreamReader.EndOfStream

            sb.Append(AsciiStreamReader.ReadLine() & vbCrLf)
            LineCounter += 1

            If LineCounter = LinesPerFile OrElse AsciiStreamReader.EndOfStream Then

                ' Write the data stored in the StringBuilder (sb);
                ' WriteAllText creates, writes and closes the file
                FileNumber += 1
                IO.File.WriteAllText("C:\" & "File " & FileNumber & ".txt", _
                    sb.ToString(), System.Text.Encoding.ASCII)

                ' Reset the line count and clear the sb string
                LineCounter = 0
                sb.Length = 0

                ' Let the form process pending window messages once per file
                Application.DoEvents()

            End If

        End While

    End Using

    Me.Text = "Complete: created " & FileNumber & " files"

End Sub
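
The "read all, then split" idea mentioned above could look roughly like the sketch below. It assumes the whole file fits comfortably in memory, which should hold for a few hundred thousand short names. The names SplitInMemory, sourcePath, outputPrefix and linesPerFile, and the file names in Main, are made up for illustration and are not part of the posted code. IO.File.ReadAllLines does the reading and the splitting in one call, and it reads and writes UTF-8 by default rather than ASCII.

```vbnet
Imports System
Imports System.IO

Module SplitInMemoryDemo

    ' Hypothetical helper: loads every line into memory, then writes
    ' the lines back out in chunks of linesPerFile lines each.
    Sub SplitInMemory(ByVal sourcePath As String, ByVal outputPrefix As String, _
                      ByVal linesPerFile As Integer)

        If linesPerFile < 1 Then
            Throw New ArgumentOutOfRangeException("linesPerFile")
        End If

        Dim lines As String() = File.ReadAllLines(sourcePath)
        Dim fileNumber As Integer = 0

        For start As Integer = 0 To lines.Length - 1 Step linesPerFile
            ' The last chunk may be shorter than linesPerFile
            Dim count As Integer = Math.Min(linesPerFile, lines.Length - start)
            Dim chunk(count - 1) As String
            Array.Copy(lines, start, chunk, 0, count)

            fileNumber += 1
            File.WriteAllLines(outputPrefix & fileNumber & ".txt", chunk)
        Next

    End Sub

    Sub Main()
        ' Assumed file names, for illustration only
        SplitInMemory("names.txt", "File", 20000)
    End Sub

End Module
```

The trade-off is memory for simplicity: nothing is written until the entire source file has been read, so for inputs of several gigabytes a line-by-line approach is the safer choice.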

"Stephany Young" <noone@localhost> wrote in message
news:%2****************@TK2MSFTNGP05.phx.gbl...
> Yes.
>
> Read the source file line by line
>
> Write each line to the target file
>
> After each nth line, close the target file and open a new one (with a
> different name of course).
Feb 22 '07 #3
"Michael M." <no****@mike.com> wrote:
> This code I have written works, but it takes some time to
> complete (about 50 seconds for a 1 MB text file).
>
> Maybe it would be better to do a "read all" and then use the
> Split(str, vbCrLf) function. Anyway, this should do it.
>
> Add a textbox and a button. This was created in VB.NET 2005 (the
> free version from Microsoft).

Suggestion (untested):

Sub TextSplitter()

    Dim fsIN, fsOut As IO.FileStream
    Dim sr As IO.StreamReader
    Dim sw As IO.StreamWriter
    Dim OutCount As Integer

    fsIN = New IO.FileStream( _
        "infile.txt", IO.FileMode.Open, IO.FileAccess.Read _
    )

    sr = New IO.StreamReader(fsIN, System.Text.Encoding.Default)

    Do
        Dim Line As String
        ' Declared inside the loop block, but VB.NET keeps its value
        ' between iterations
        Dim LineCount As Integer

        ' ReadLine returns Nothing at the end of the input
        Line = sr.ReadLine()
        If Line Is Nothing Then Exit Do

        ' No output file open: start the next one
        If fsOut Is Nothing Then
            OutCount += 1

            fsOut = New IO.FileStream( _
                "outfile" & OutCount & ".txt", _
                IO.FileMode.CreateNew, IO.FileAccess.Write _
            )

            sw = New IO.StreamWriter(fsOut, System.Text.Encoding.Default)
            LineCount = 0
        End If

        sw.WriteLine(Line)
        LineCount += 1

        ' Chunk is full: close it so the next line opens a new file
        If LineCount = 20000 Then
            sw.Close()
            fsOut = Nothing
        End If
    Loop

    ' Close the last, partially filled file
    If fsOut IsNot Nothing Then
        sw.Close()
    End If

    fsIN.Close()

End Sub

Be aware that Encoding.ASCII supports only 7-bit characters.
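
To illustrate the 7-bit limitation, here is a small sketch (the sample name and the module name are assumptions chosen for the demonstration). A name containing a character outside the ASCII range loses that character when encoded with Encoding.ASCII, while UTF-8 preserves it.

```vbnet
Imports System
Imports System.Text

Module AsciiDemo

    Sub Main()
        ' U+017D is a capital Z with caron; it is built from its code
        ' point so the source file itself stays plain ASCII
        Dim name As String = Convert.ToChar(&H17D) & "eljko"

        ' ASCII covers only code points 0-127, so the first character
        ' cannot be encoded and is replaced by "?"
        Dim asciiBytes As Byte() = Encoding.ASCII.GetBytes(name)
        Console.WriteLine(Encoding.ASCII.GetString(asciiBytes))   ' ?eljko

        ' UTF-8 round-trips the original text unchanged
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(name)
        Console.WriteLine(Encoding.UTF8.GetString(utf8Bytes) = name)   ' True
    End Sub

End Module
```

For a list of names this matters as soon as any name contains an accented or otherwise non-English letter.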
Armin

Feb 22 '07 #4
I'm going to opt for an OOP solution which isn't quite so dependent upon all
the inputs being fixed. Personally I've learned that "specs change", and
planning for change saves the client money, which makes for a happy client.

So try the other solutions out and then try this one. Note that you can
change the input file name, set the output file names, set the line count
(independently per file), and produce more (or fewer) than 3 files by
calling the Copy() method as many times as you want.

Personally I'd add some error handling before I tried to sell it, and I might
add a way to indicate that the end of the input file was reached. While
it should cause no harm, it seems pointless to keep calling Copy() if the
end-of-file was already reached.

The reason there are only 30 lines indicated in my example is that's all I
wanted to type into my test.

Tom

Dim oCopier As Copier = New Copier()

With oCopier
    .Open("infile.txt")
    .Copy("file1.txt", 10)
    .Copy("file2.txt", 10)
    .Copy("file3.txt", 10)
    .Close()
End With

Public Class Copier

    Private fs As IO.FileStream
    Private sr As IO.StreamReader

    Public Sub Open(ByVal file As String)
        fs = New IO.FileStream(file, IO.FileMode.Open, IO.FileAccess.Read)
        sr = New IO.StreamReader(fs, System.Text.Encoding.Default)
    End Sub

    Public Sub Close()
        ' Closing the reader also closes the underlying file stream
        sr.Close()
    End Sub

    Public Sub Copy(ByVal file As String, ByVal max As Int32)

        Dim fsOut As IO.FileStream = New IO.FileStream(file, _
            IO.FileMode.CreateNew, IO.FileAccess.Write)
        Dim sw As IO.StreamWriter = New IO.StreamWriter(fsOut, _
            System.Text.Encoding.Default)

        Dim input As String
        Dim count As Int32 = 0

        ' Copy at most max lines. Counting only after a successful write
        ' ensures the max-th line is written rather than read and dropped.
        While count < max

            input = sr.ReadLine()
            If input Is Nothing Then Exit While   ' end of the input file

            sw.WriteLine(input)
            count += 1

        End While

        sw.Close()

    End Sub

End Class
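
One possible way to add the end-of-file indication mentioned above is to have the copy routine return how many lines it actually wrote. The sketch below is a standalone variant, not Tom's class: CopyLines, its parameters and the file names are hypothetical, and it uses StreamWriter's default UTF-8 encoding and overwrites existing files, unlike the Encoding.Default and FileMode.CreateNew choices above.

```vbnet
Imports System
Imports System.IO

Module CopyUntilEndDemo

    ' Hypothetical variant of Copy(): writes up to max lines from reader
    ' into outputPath and returns the number of lines actually written.
    Function CopyLines(ByVal reader As TextReader, ByVal outputPath As String, _
                       ByVal max As Integer) As Integer
        Dim count As Integer = 0
        Using writer As New StreamWriter(outputPath)
            While count < max
                Dim line As String = reader.ReadLine()
                If line Is Nothing Then Exit While   ' end of input
                writer.WriteLine(line)
                count += 1
            End While
        End Using
        Return count
    End Function

    Sub Main()
        ' Assumed file names, for illustration only
        Using reader As New StreamReader("infile.txt")
            Dim fileNumber As Integer = 1
            ' A count below 10 means the input ran out, so stop calling
            While CopyLines(reader, "file" & fileNumber & ".txt", 10) = 10
                fileNumber += 1
            End While
        End Using
    End Sub

End Module
```

One caveat with this design: when the input length is an exact multiple of the chunk size, the final call creates an empty file before it reports zero lines, so a real version might delete that file or check for more input first.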


Feb 22 '07 #5
You know that databases have partitioning, right?

Keep everything in one table and then you can use RANK, or you can
filter, search, anything you want to do.

And it doesn't matter whether you have 20k records or 4m...

I mean, seriously, why reinvent the wheel? Did I mention why reinvent
the wheel?

Feb 22 '07 #6

Armin Zingler wrote:
> Suggestion (untested):
>
> Sub TextSplitter()
> [...]
> End Sub

Thanks man, this code does exactly what I need, and it's pretty fast.

Mar 1 '07 #7
