Problems with parsing uploaded csv file contents

codesid
6 New Member
I could not find anything on the web about this issue, so I am opening a discussion in the hope that someone can help me out or suggest a better way to handle it.

Environment:
Windows XP Pro, VS2003, .NET 1.1, C#

The Case:
When we read the contents of a csv file from a "multipart/form-data" POST, the contents arrive with strange chars in them, similar to double spaces but not actually spaces (very tricky to clean up). I searched for this case but did not find anything related. So I managed to solve it in a very dirty way, which I am embarrassed to show here (lol).

There are basically 2 ways to upload files (without using any COM components):

1) using Request.BinaryRead (manual handling with posted data)
2) using HtmlInputFile.PostedFile (using .NET web controls)

I've tested my case for both and the results are the same.

E.g.

piece of the file = [bar foo,bar@foo.com,]
after obtaining the info from the post = [
b a r f o o , b a r @ f o o . c o m , ]

When I run a simple script to separate every element of the content I receive, this is what it looks like:

[ ] [
](this is a new line content) [ ] [b] [ ] [a] [ ] [r] [ ] [ ] [ ] [f] [ ] [o] [ ] [o] [ ] [,] [ ] [b] [ ] [a] [ ] [r] [ ] [@] [ ] [f] [ ] [o] [ ] [o] [ ] [.] [ ] [c] [ ] [o] [ ] [m] [ ] [,] [ ]

For your information, the script:

// line is one line read from the csv file
for (int i = 0; i < line.Length; i++)
{
    // Take one character at a time and print it between brackets
    string digit = line.Substring(i, 1);
    Response.Write(" [" + digit + "] ");
}
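One way to find out what the strange chars actually are is to print each character's numeric code instead of the character itself. The following is only a minimal sketch: `CharDump` and `Dump` are hypothetical names that are not part of the original script, and the code sticks to calls that should be available in .NET 1.1.

```csharp
using System;
using System.Text;

public class CharDump
{
    // Hypothetical helper: returns every character of the line as a
    // four-digit hex code in brackets, separated by single spaces.
    public static string Dump(string line)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < line.Length; i++)
        {
            if (i > 0)
            {
                sb.Append(' ');
            }
            // Cast the char to int to get its UTF-16 code unit value
            sb.Append('[');
            sb.Append(((int)line[i]).ToString("X4"));
            sb.Append(']');
        }
        return sb.ToString();
    }
}
```

In the page this could be called as `Response.Write(CharDump.Dump(line));`. If the codes between the letters come out as `0000`, the file is probably UTF-16 text that was decoded one byte at a time. That is an assumption to verify, not something this thread confirms.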

Note: Not all csv files have this problem. I see it with files from Google's export features (Gmail, orkut, etc).

The funny thing is that when I display the contents on a web page, everything looks OK, because the strange chars do not show up... I first discovered this when I wrote a script to store the csv contents in a database automatically... the data looked very strange, because all the strange spaces were there in the database, including the "new line", which does not disappear even if you replace it with something else.

When I tried to compare those spaces against known values, I could not find anything that would match or clean them:

line = line.Replace(" ", "") -> does not work
digit.Equals(string.Empty) -> always returns false
"" + digit == "" -> is false
"" + digit == " " -> is false

So I lost hope of identifying this char, which is neither empty nor blank, and went for the hash code instead. That apparently solved the problem but, as I said before, I would never advise anyone to do something like this.

So I simply read everything char by char and drop the offending ones...

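// Note: GetHashCode() values are not guaranteed to be stable across .NET
// versions or platforms, so 5381 and 177583 only identify the two unwanted
// chars on this particular setup.
string correctedLine = "";
for (int i = 0; i < line.Length; i++)
{
    string digit = line.Substring(i, 1);
    // Keep every char except the two whose hash codes match the strange ones
    if (digit.GetHashCode() != 5381 && digit.GetHashCode() != 177583)
    {
        correctedLine += digit;
    }
}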
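A less fragile alternative to comparing hash codes might look like the sketch below. It rests on an assumption the thread does not confirm: that the file is UTF-16 text, so the unwanted chars are U+0000 between the letters and possibly a U+FEFF byte order mark at the start. `CsvDecoding`, `ReadUtf16` and `StripNulls` are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public class CsvDecoding
{
    // Assumption: the upload is UTF-16. Decode the raw stream as UTF-16 LE,
    // letting StreamReader switch encoding if a byte order mark says otherwise.
    public static string ReadUtf16(Stream input)
    {
        using (StreamReader reader = new StreamReader(input, Encoding.Unicode, true))
        {
            return reader.ReadToEnd();
        }
    }

    // Fallback for text that was already decoded with the wrong encoding:
    // drop U+0000 and U+FEFF by comparing char values, not hash codes.
    public static string StripNulls(string line)
    {
        StringBuilder sb = new StringBuilder(line.Length);
        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c != '\0' && c != '\uFEFF')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

With the web control approach, the stream passed to `ReadUtf16` would presumably be `PostedFile.InputStream`. Decoding at the source is usually preferable to stripping afterwards, because stripping U+0000 only gives the right text when every character in the file fits in a single byte.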

Has anyone ever seen this?

Thanks in advance.
Dec 31 '06 #1
codesid
6 New Member
Hasn't anyone run into a similar problem so far?
Jan 17 '07 #2
bplacker
121 New Member
Are the strange characters 'line breaks'? Try comparing against or replacing vbcrlf or something like that. I remember that when dealing with CSVs in Java, the character or character string for line breaks was something strange.
Jan 17 '07 #3
