Help: ArrayList won't store, and how to remove duplicates?

Hi, I'm still having some problems getting my head round this language.
A couple of things don't seem to work for me. First, I am trying to
obtain a count of the number of words in a string, so I am using the
Split function with ' ', but how do I get it to take punctuation marks
such as ',', '.' etc. into account?

Also, I am then trying to add the contents of an array of strings to an
ArrayList, but only if the string isn't already there. I was using the
ArrayList.Contains method, but it is still adding duplicates. Finally,
I was trying to sort my ArrayList in alphabetical order using
ArrayList.Sort(), but that doesn't seem to sort it fully. Any ideas on
where I am going wrong? I have posted my code below. One final
question: before I add my strings of words to my ArrayList, is it
possible to capitalise the first letter in each word? Thanks.

public static void ParseFile(string input, string output)
{
    System.IO.TextReader r = System.IO.File.OpenText(@"C:\input.txt");
    System.IO.TextWriter w = System.IO.File.CreateText(@"C:\output.txt");
    string s = r.ReadToEnd();

    ArrayList myList = new ArrayList();
    int numOccur = 0;

    string[] mySplit = s.Split(' ');
    Console.WriteLine("Num of words is " + mySplit.Length);
    for (int x = 0; x < mySplit.Length; x++)
    {
        if (!myList.Contains(mySplit[x]))
        {
            myList.Add(mySplit[x]);
        }
    }
    myList.Sort();

    foreach (string item in myList)
    {
        w.Write(item + "\n");
    }
    r.Close();
    w.Close();
}
Nov 16 '05 #1
Split() is a very simple method. It can split on a single character or
on multiple characters (have a look at the Split(char[]) overload), but if
you have, for instance, quoted strings within which delimiters must be
ignored, as with:

Fred,Smith,"Stockton, MD"

... or other special cases, then you'll have to write your own routine using
string manipulation methods and/or Regex.
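
As a rough sketch of the Regex route for plain word counting (not the quoted-field case), the snippet below pulls out runs of letters and digits, allowing a single apostrophe or hyphen inside a word. The class name WordSplitter and the pattern itself are assumptions made for illustration; they are not from the original code, and the pattern only handles ASCII letters.

```csharp
using System;
using System.Text.RegularExpressions;

static class WordSplitter
{
    // Hypothetical helper: a "word" is a run of ASCII letters/digits,
    // optionally joined by single apostrophes or hyphens (man's, well-nigh).
    private const string WordPattern = @"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*";

    public static string[] Words(string text)
    {
        MatchCollection matches = Regex.Matches(text, WordPattern);
        string[] words = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            words[i] = matches[i].Value;
        }
        return words;
    }
}
```

Because this collects matches rather than splitting on delimiters, punctuation and repeated spaces never produce empty "words". A number such as 140,000,000 would still come out as three tokens with this pattern.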

Your code looks fine offhand. If you can provide your input.txt file and a
sample of the bad output that would be helpful in diagnosing your problem.

Incidentally, foreach visits elements in whatever order the collection's
enumerator yields them; for arrays and ArrayLists that is index order, so a
sorted ArrayList should come out sorted. Still, before concluding that the
ArrayList is not really sorted, I'd check with a for loop or look at it in
the debugger.

As for capitalizing the first letter of each word -- you have to write a
method for that yourself. Simplistically, something like:

string ToProperCase(string strWord) {

    if (strWord.Length == 0) {
        return "";
    } else if (strWord.Length == 1) {
        return strWord.ToUpper();
    } else {
        return Char.ToUpper(strWord[0]).ToString() + strWord.Substring(1);
    }

}

--Bob
Nov 16 '05 #2
Hi, my split still doesn't seem to be working correctly. I have now
specified it to split by s.Split(' ',',','.',';'); but now I am
getting more words than are actually there. I will post the input
file, and hope you can point me in the right direction on how I should
split. Also, I am still getting duplicates in the output file; any idea
what this could be? I will post a sample of this. Many thanks.

Input.txt:

No one would have believed in the last years of the nineteenth
century that this world was being watched keenly and closely by
intelligences greater than man's and yet as mortal as his own; that as
men busied themselves about their various concerns they were
scrutinised and studied, perhaps almost as narrowly as a man with a
microscope might scrutinise the transient creatures that swarm and
multiply in a drop of water. With infinite complacency men went to
and fro over this globe about their little affairs, serene in their
assurance of their empire over matter. It is possible that the
infusoria under the microscope do the same. No one gave a thought to
the older worlds of space as sources of human danger, or thought of
them only to dismiss the idea of life upon them as impossible or
improbable. It is curious to recall some of the mental habits of
those departed days. At most terrestrial men fancied there might be
other men upon Mars, perhaps inferior to themselves and ready to
welcome a missionary enterprise. Yet across the gulf of space, minds
that are to our minds as ours are to those of the beasts that perish,
intellects vast and cool and unsympathetic, regarded this earth with
envious eyes, and slowly and surely drew their plans against us. And
early in the twentieth century came the great disillusionment.

The planet Mars, I scarcely need remind the reader, revolves about the
sun at a mean distance of 140,000,000 miles, and the light and heat it
receives from the sun is barely half of that received by this world.
It must be, if the nebular hypothesis has any truth, older than our
world; and long before this earth ceased to be molten, life upon its
surface must have begun its course. The fact that it is scarcely one
seventh of the volume of the earth must have accelerated its cooling
to the temperature at which life could begin. It has air and water
and all that is necessary for the support of animated existence.

Yet so vain is man, and so blinded by his vanity, that no writer,
up to the very end of the nineteenth century, expressed any idea that
intelligent life might have developed there far, or indeed at all,
beyond its earthly level. Nor was it generally understood that since
Mars is older than our earth, with scarcely a quarter of the
superficial area and remoter from the sun, it necessarily follows that
it is not only more distant from time's beginning but nearer its end.

The secular cooling that must someday overtake our planet has
already gone far indeed with our neighbour. Its physical condition is
still largely a mystery, but we know now that even in its equatorial
region the midday temperature barely approaches that of our coldest
winter. Its air is much more attenuated than ours, its oceans have
shrunk until they cover but a third of its surface, and as its slow
seasons change huge snowcaps gather and melt about either pole and
periodically inundate its temperate zones. That last stage of
exhaustion, which to us is still incredibly remote, has become a
present-day problem for the inhabitants of Mars. The immediate
pressure of necessity has brightened their intellects, enlarged their
powers, and hardened their hearts. And looking across space with
instruments, and intelligences such as we have scarcely dreamed of,
they see, at its nearest distance only 35,000,000 of miles sunward of
them, a morning star of hope, our own warmer planet, green with
vegetation and grey with water, with a cloudy atmosphere eloquent of
fertility, with glimpses through its drifting cloud wisps of broad
stretches of populous country and narrow, navy-crowded seas.

And we men, the creatures who inhabit this earth, must be to them
at least as alien and lowly as are the monkeys and lemurs to us. The
intellectual side of man already admits that life is an incessant
struggle for existence, and it would seem that this too is the belief
of the minds upon Mars. Their world is far gone in its cooling and
this world is still crowded with life, but crowded only with what they
regard as inferior animals. To carry warfare sunward is, indeed,
their only escape from the destruction that, generation after
generation, creeps upon them.

And before we judge of them too harshly we must remember what
ruthless and utter destruction our own species has wrought, not only
upon animals, such as the vanished bison and the dodo, but upon its
inferior races. The Tasmanians, in spite of their human likeness,
were entirely swept out of existence in a war of extermination waged
by European immigrants, in the space of fifty years. Are we such
apostles of mercy as to complain if the Martians warred in the same
spirit?

The Martians seem to have calculated their descent with amazing
subtlety--their mathematical learning is evidently far in excess of
ours--and to have carried out their preparations with a well-nigh
perfect unanimity.
Output.txt:

1
1
The 1

and 1

at 1

beyond 1

came 1

fluctuating 1

flying 1

indicated 1

intellects 1

It 1

made 1

Ogilvy 1

People 1

softened 1

their 1

they 1

three 1

up 1

visible 1

were 1

with 1
" 1
"

1
"as
flaming 1
"The 1
(the 1
000 1
140 1
1894 1
2 1
35 1
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A 39
A
flame 1
A
heavy 1
A
microscope 1
A
party 1
A
present-day 1
A
scrutiny 1
About 8
About 8
About 8

Nov 16 '05 #3
If you want to split on multiple characters you have to pass a character array, e.g.,

s.Split(new char[] {' ',',','.',';'});

I assume that's what you mean.

I know that this will parse "140,000,000" into three words, and I expect it to parse "matter.  It" (with two spaces before "It") into "matter", two empty strings and "It": the "word" between the period and the first space, and the "word" between the two spaces, would both be considered words, I think. As a side note, in VS 2005 (CLR 2.0) there will be an option you can set to not return empty strings if you wish. It's managed by a new enum called StringSplitOptions and some new overloads of the Split() method.
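
To make the empty-string behaviour concrete, here is a small sketch comparing the two calls. It assumes .NET 2.0 or later, where the Split(char[], StringSplitOptions) overload is available; the class and method names (SplitDemo, SplitKeepingEmpties, SplitDroppingEmpties) are made up for this example.

```csharp
using System;

static class SplitDemo
{
    // The same four separators used in the post above.
    private static readonly char[] Separators = { ' ', ',', '.', ';' };

    // Plain Split: adjacent separators yield empty strings in the result.
    public static string[] SplitKeepingEmpties(string text)
    {
        return text.Split(Separators);
    }

    // RemoveEmptyEntries drops those empty strings.
    public static string[] SplitDroppingEmpties(string text)
    {
        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

Called on "matter.  It" (two spaces), the first method returns four elements ("matter", "", "", "It") and the second returns two. Empty entries of this kind are one likely reason a Split-based count comes out higher than a word processor's count.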

I will try to take a closer look at this tonight, but I may run out of time, and I'll be out of town and out of touch until Wed.

--Bob

Nov 16 '05 #4
Bob Grommes <bo*@bobgrommes.com> wrote:
If you want to split on multiple characters you have to pass a
character array, e.g.,

s.Split(new char[] {' ',',','.',';'});

I assume that's what you mean.


No need for that, as s.Split takes a params char[] parameter.
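
A quick sketch of what that means in practice: because the parameter is declared as params char[], the compiler builds the array for you, so the two calls below should behave identically. ParamsDemo and its method names are invented for this illustration.

```csharp
using System;

static class ParamsDemo
{
    // Separators passed directly; the compiler wraps them in a char[].
    public static string[] WithParams(string s)
    {
        return s.Split(' ', ',', '.', ';');
    }

    // Separators passed as an explicit array.
    public static string[] WithArray(string s)
    {
        return s.Split(new char[] { ' ', ',', '.', ';' });
    }
}
```

For an input such as "a,b c.d;e", both methods return the same five one-letter strings.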

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #5
steve smith <bo**********@hotmail.com> wrote:
Hi, my split still doesn't seem to be working correctly. I have now
specified it to split by s.Split(' ',',','.',';'); but now I am
getting more words than are actually there. I will post the input
file, and hope you can point me in the right direction on how I should
split. Also, I am still getting duplicates in the output file; any idea
what this could be? I will post a sample of this. Many thanks.


Hmm. That's very odd - I've tried your code and it nearly works fine. I
added '\r' and '\n' to the split chars, as otherwise you get the start
of each new line as a word in itself. Here's what the results look
like:
000
140
35
a
about
accelerated
across
admits
affairs
after
against
air
alien
all
almost
already
amazing
an
and
And
animals
animated
any
apostles
approaches
are
Are
area
as
assurance
at
At
atmosphere
attenuated
barely
be
beasts
become
before
begin
beginning
begun
being
belief
believed
beyond
bison
blinded

(etc)

Could you post the *exact* code you used to generate the output you
posted? The code you wrote in your first post doesn't include numbers,
so I assume you've changed the code somewhat anyway.

Here's the code I used:

using System;
using System.IO;
using System.Collections;

class Test
{

    public static void Main()
    {
        System.IO.TextReader r = System.IO.File.OpenText("input.txt");
        System.IO.TextWriter w = System.IO.File.CreateText("output.txt");
        string s = r.ReadToEnd();

        ArrayList myList = new ArrayList();

        string[] mySplit = s.Split(' ',',','.',';','\r','\n');
        Console.WriteLine("Num of words is " + mySplit.Length);
        for (int x = 0; x < mySplit.Length; x++)
        {
            if (!myList.Contains(mySplit[x]))
            {
                myList.Add(mySplit[x]);
            }
        }
        myList.Sort();

        foreach (string item in myList)
        {
            w.Write(item + "\n");
        }
        r.Close();
        w.Close();
    }
}

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #6
> arraylist.contains method, but it is still adding duplicates, and

Contains will tell you if an object is already in the list; it does not
look at the values of the objects. You should look at the StringCollection
class. The Contains method of this class actually looks at the value.

Etienne Boucher
Nov 16 '05 #7
Etienne Boucher <et*****@novat.qc.ca> wrote:
arraylist.contains method, but it is still adding duplicates, and


The Contains will tell you if an object is already in the list, it does not
look at the values of the objects.


Yes it does - ArrayList.Contains calls Equals on either the object
you're looking for with an argument of each of the references within
the list, or vice versa. Here's a program to show that:

using System;
using System.Collections;

class Foo
{
    public override bool Equals (object o)
    {
        Console.WriteLine ("Equals called");
        return true;
    }
}

class Test
{
    public static void Main()
    {
        ArrayList al = new ArrayList();

        Foo f = new Foo();
        al.Add(f);
        Foo g = new Foo();
        Console.WriteLine (al.Contains(g));
    }
}

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #8
Thanks for proving me wrong, it's great to have you around Jon. I'm curious
however, doesn't...

string s1 = "1";
string s2 = "2";
(object) s1 == (object) s2;

call Object.Equals? Why does it still call String.Equals with Contains? Mmm,
I just tested myself with...

static void Main(string[] args)
{
    string s1 = "1";
    string s2 = "1";
    Console.WriteLine(Test(s1, s2));
}

static bool Test(object o1, object o2)
{
    return o1.Equals(o2);
}

and it does call String.Equals. I always thought both references had to be
of type string for that to happen.

Etienne Boucher

Nov 16 '05 #9
Etienne Boucher <et*****@novat.qc.ca> wrote:
Thanks for proving me wrong, it's great to have you around Jon. I'm curious
however, doesn't...

string s1 = "1";
string s2 = "2";
(object) s1 == (object) s2;

call Object.Equals?
No, because the normal reference type equality operator is equivalent to
calling Object.ReferenceEquals.
Why does it still call String.Equals with Contains?
Because Contains doesn't use

if (x==y)

it uses

if (x.Equals(y))
Mmm, I just tested myself with...

static void Main(string[] args)
{
string s1 = "1";
string s2 = "1";
Console.WriteLine(Test(s1, s2));
}

static bool Test(object o1, object o2)
{
return o1.Equals(o2);
}

and it does call String.Equals. I always thought both references had to be
of type string for that to happen.


Both expressions have to be string for x==y to call the overloaded
equality operator, which is equivalent to calling String.Equals.

o1.Equals(o2) is calling the overridden Object.Equals(Object), however,
which is just a normal overridden method call.
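
A small sketch of the difference, under one assumption worth stating: the second string is built at run time so that it is a separate object from the literal (two identical literals are interned and would share a reference, which hides the effect). EqualityDemo and its method names are hypothetical, chosen for this example only.

```csharp
using System;

static class EqualityDemo
{
    // Builds "abc" at run time, so it is a different object from the literal "abc".
    public static string MakeAbc()
    {
        return new string(new char[] { 'a', 'b', 'c' });
    }

    // Both operands have compile-time type object: == compares references.
    public static bool ReferenceCompare(object a, object b)
    {
        return a == b;
    }

    // Virtual call: dispatches to String.Equals when a is a string at run time.
    public static bool VirtualEquals(object a, object b)
    {
        return a.Equals(b);
    }

    // Both operands have compile-time type string: the overloaded == is used.
    public static bool StringOperator(string a, string b)
    {
        return a == b;
    }
}
```

With "abc" and MakeAbc() as arguments, ReferenceCompare returns false while VirtualEquals and StringOperator both return true. Operators are chosen from compile-time types, whereas Equals is an ordinary virtual method resolved at run time, which is why Contains finds equal strings.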

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #10
Hi, many thanks for all the replies; however, I am still getting some
problems. My sort seems to be working fine, as the words come out as they
should. However, in my output file it seems my Contains check does
not work, as I am getting the same word repeating many times.
Also, a word count in Word reveals there are 2218 words, whereas my
program says there are 2789 words. I have since changed the program to
capitalise the first letter of each word and also to put a count of
the number of occurrences of each word. The code and a sample of the
output file are posted below; any help with this is much appreciated. Thanks.

namespace ExamProblem
{
    using System;
    using System.IO;
    using System.Collections;

    class Start
    {
        [STAThread]
        static void Main(string[] args)
        {
            Parser.ParseFile("c:/input.txt", "c:/output.txt");
        }
    }

    class Parser
    {
        public static void ParseFile(string input, string output)
        {
            System.IO.TextReader r = System.IO.File.OpenText(@"C:\input.txt");
            System.IO.TextWriter w = System.IO.File.CreateText(@"C:\output.txt");
            string s = r.ReadToEnd();

            ArrayList myList = new ArrayList();
            int numOccur = 0;
            string[] mySplit = s.Split(' ',',','.',';','\r','\n');
            Console.WriteLine("Num of words is " + mySplit.Length);
            for (int x = 0; x < mySplit.Length; x++)
            {
                if (!myList.Contains(mySplit[x]))
                {
                    string insertWord = Parser.ToProperCase(mySplit[x]);
                    myList.Add(insertWord);
                }
            }
            myList.Sort();

            foreach (string item in myList)
            {
                numOccur = Parser.Occurs(item, myList);
                w.Write(item + " " + numOccur + "\n");
            }
            r.Close();
            w.Close();
        }

        public static int Occurs(string strSearch, ArrayList myList)
        {
            int intCount = 0;
            foreach (string find in myList)
            {
                if (find == strSearch)
                {
                    intCount++;
                }
            }
            return intCount;
        }

        public static string ToProperCase(string strWord)
        {
            if (strWord.Length == 0)
            {
                return "";
            }
            else if (strWord.Length == 1)
            {
                return strWord.ToUpper();
            }
            else
            {
                return Char.ToUpper(strWord[0]).ToString() + strWord.Substring(1);
            }
        }
    }
}
Output file:

1
" 1
"as 1
"The 1
(the 1
000 1
10 1
140 1
1894 1
2 1
35 1
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
A 45
About 9
About 9
About 9
About 9
About 9
About 9
About 9
About 9
About 9
Abundance 1
Accelerated 1
Across 4
Across 4
Across 4
Across 4
Activity 1
Adjacent 1
Admits 1
Advance 1
Affairs 1
After 5

Nov 16 '05 #11
steve smith <bo**********@hotmail.com> wrote:
Hi, many thanks for all the replies; however, I am still getting some
problems. My sort seems to be working fine, as the words come out as they
should. However, in my output file it seems my Contains check does
not work, as I am getting the same word repeating many times.
Also, a word count in Word reveals there are 2218 words, whereas my
program says there are 2789 words. I have since changed the program to
capitalise the first letter of each word and also to put a count of
the number of occurrences of each word. The code and a sample of the
output file are posted below; any help with this is much appreciated. Thanks.


The problem is that you're seeing whether the non-proper-cased word is
in the ArrayList, then proper-casing it and adding it. So, for
instance, you never add "a" to the ArrayList, only "A" - so every time
"a" comes up, it will check whether that's in the list, never find it,
and add it.

If you change the middle bit of your code to:

string proper = ToProperCase(mySplit[x]);
if (!myList.Contains(proper))
{
    myList.Add(proper);
}

then you'll only get each word once.

(Of course, you then need to change to a different way of determining
how many occurrences there are.)
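
One possible way to do that counting, offered only as a sketch: keep a running total per proper-cased word in a sorted dictionary instead of storing words in a list and counting afterwards. This assumes .NET 2.0 or later (generics and StringSplitOptions); WordCounter and Count are names invented here, and the ordinal comparer is a choice made for predictable ordering, so the sort order may differ slightly from ArrayList.Sort().

```csharp
using System;
using System.Collections.Generic;

static class WordCounter
{
    // Same separators as the code in the thread.
    private static readonly char[] Separators = { ' ', ',', '.', ';', '\r', '\n' };

    // Same idea as ToProperCase in the thread: upper-case the first character.
    private static string ToProperCase(string word)
    {
        if (word.Length == 0)
        {
            return "";
        }
        return Char.ToUpper(word[0]).ToString() + word.Substring(1);
    }

    // Returns each distinct proper-cased word with its number of occurrences,
    // ordered by key.
    public static SortedDictionary<string, int> Count(string text)
    {
        SortedDictionary<string, int> counts =
            new SortedDictionary<string, int>(StringComparer.Ordinal);

        string[] parts = text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
        foreach (string raw in parts)
        {
            string word = ToProperCase(raw);
            int seen;
            counts.TryGetValue(word, out seen); // seen is 0 when the word is new
            counts[word] = seen + 1;
        }
        return counts;
    }
}
```

Each word is proper-cased before it is looked up, so "a" and "A" share one entry, and every word appears once with its total. Writing the result out is then a single loop over the dictionary's key/value pairs.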

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #12
