Bytes | Software Development & Data Engineering Community

Problems with parsing uploaded csv file contents

codesid
6 New Member
I could not find anything on the web about this issue, so I am opening a discussion here in the hope that someone can help me out or suggest a better way to handle it.

Environment:
Windows XP Pro, VS2003, .NET 1.1, C#

The Case:
When we read the contents of a CSV file uploaded in a "multipart/form-data" POST, the contents arrive with strange characters that look like extra spaces but are not actually spaces (which makes them very tricky to clean up). I searched for this case but found nothing related, so I solved it in a very dirty way, which I am embarrassed to show here (lol).

Without using any COM components, we can basically handle uploaded files in two ways:

1) using Request.BinaryRead (handling the posted data manually)
2) using HtmlInputFile.PostedFile (using .NET web controls)

I've tested my case for both and the results are the same.

E.g.

piece of the file = [bar foo,bar@foo.com,]
after obtaining the info from the post = [
b a r f o o , b a r @ f o o . c o m , ]

When I run a simple script that separates each element of the content I receive, this is what it looks like:

[ ] [
](this is a new line content) [ ] [b] [ ] [a] [ ] [r] [ ] [ ] [ ] [f] [ ] [o] [ ] [o] [ ] [,] [ ] [b] [ ] [a] [ ] [r] [ ] [@] [ ] [f] [ ] [o] [ ] [o] [ ] [.] [ ] [c] [ ] [o] [ ] [m] [ ] [,] [ ]

For your information, the script:

// line is the line from the csv file
for (int i = 0; i < line.Length; i++)
{
    string digit = line.Substring(i, 1);
    Response.Write(" [" + digit + "] ");
}
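
Printing each character between brackets hides what it actually is, because control characters render as nothing. A minimal sketch of a more revealing diagnostic writes each character's numeric code instead. The names `CharDump` and `Describe` are hypothetical, and the sample string assumes the invisible characters are NUL (U+0000), which is only a guess.

```csharp
using System;
using System.Text;

public class CharDump
{
    // Returns each character of the input as a 4-digit hex code,
    // separated by spaces, so invisible characters become visible.
    public static string Describe(string line)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < line.Length; i++)
        {
            if (i > 0) sb.Append(' ');
            sb.Append(((int)line[i]).ToString("X4"));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Assumed sample: "ba" with a NUL character after each letter.
        Console.WriteLine(Describe("b\0a\0")); // prints "0062 0000 0061 0000"
    }
}
```

In the page itself, `Response.Write(CharDump.Describe(line))` would show which code actually sits between the letters.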

Note: Not all CSV files have this problem. I got these from Google's export features (Gmail, orkut, etc.).

The funny thing is that when I display the contents on a web page, everything looks fine, because the strange characters do not show up. I first discovered the problem when I wrote a script to store the CSV contents in a database automatically: the data looked very strange, because all the strange spaces were there in the database, including the "new line", which does not disappear even if you replace it with something else.

When I tried comparing against those spaces, I could not find anything that would match or remove them:

line = line.Replace(" ", "") -> does not work
digit.Equals(string.Empty) -> always returns false
"" + digit == "" -> is false
"" + digit == " " -> is false

So I gave up hope of identifying this character, which is neither empty nor blank, and went for the hash code instead. That apparently solved the problem but, as I said before, I would never advise anyone to do something like this.

So I simply read everything character by character and filter out the strange ones:

string correctedLine = "";
for (int i = 0; i < line.Length; i++)
{
    string digit = line.Substring(i, 1);
    if (digit.GetHashCode() != 5381 && digit.GetHashCode() != 177583)
    {
        correctedLine += digit;
    }
}
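
One possible explanation, offered as an assumption rather than a diagnosis, is that the exported file is UTF-16 text being decoded as a single-byte encoding. Every ASCII letter would then be followed by a NUL character (U+0000), which would fit the bracketed output shown earlier in this post and would explain why the comparisons against " " and "" all fail. If that is the case, filtering by hash code is unnecessary, and decoding the raw bytes with the right encoding should be enough. A minimal sketch, with hypothetical names (`UploadDecoder`, `Decode`), picks the encoding from the byte order mark and falls back to UTF-8:

```csharp
using System;
using System.Text;

public class UploadDecoder
{
    // Decodes raw uploaded bytes, choosing the encoding from the byte
    // order mark (BOM) if one is present. Falling back to UTF-8 when
    // there is no BOM is an assumption of this sketch.
    public static string Decode(byte[] data)
    {
        if (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
            return Encoding.Unicode.GetString(data, 2, data.Length - 2);          // UTF-16 LE
        if (data.Length >= 2 && data[0] == 0xFE && data[1] == 0xFF)
            return Encoding.BigEndianUnicode.GetString(data, 2, data.Length - 2); // UTF-16 BE
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
            return Encoding.UTF8.GetString(data, 3, data.Length - 3);             // UTF-8 with BOM
        return Encoding.UTF8.GetString(data);
    }

    public static void Main()
    {
        // "bar" as UTF-16 LE with a BOM, standing in for the uploaded file.
        byte[] sample = { 0xFF, 0xFE, 0x62, 0x00, 0x61, 0x00, 0x72, 0x00 };
        Console.WriteLine(Decode(sample)); // prints "bar"
    }
}
```

In the upload handler the bytes would come from `PostedFile.InputStream`. Alternatively, a `StreamReader` created over that stream with `detectEncodingFromByteOrderMarks` set to true performs similar detection.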

Has anyone ever seen this?

Thanks in advance.
Dec 31 '06 #1
codesid
6 New Member
Hasn't anyone run into a similar problem so far?
Jan 17 '07 #2
bplacker
121 New Member
Are the strange characters line breaks? Try comparing against or replacing vbCrLf or something like that. I remember that when I was dealing with CSVs in Java, the character or character sequence for line breaks was something strange.
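
In C#, the counterpart of VB's vbCrLf is the string "\r\n". A small sketch of that suggestion, with a hypothetical helper name (`LineBreaks.Strip`), removes carriage returns and line feeds. It would not by itself remove other invisible characters such as NUL.

```csharp
using System;

public class LineBreaks
{
    // Removes carriage return and line feed characters from a string.
    public static string Strip(string line)
    {
        return line.Replace("\r", "").Replace("\n", "");
    }

    public static void Main()
    {
        string cleaned = Strip("bar foo,bar@foo.com,\r\n");
        Console.WriteLine("[" + cleaned + "]"); // prints "[bar foo,bar@foo.com,]"
    }
}
```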
Jan 17 '07 #3
