Mapforce: mapping to CSV without column header line inserts hex FF FE FF FE

Hi Group,

In Mapforce 2005 R3, when mapping to CSV with the "First row contains
field names" option UN-checked on the CSV target component settings,
the characters (hex) FF FE FF FE are inserted in the beginning of the
first line when running Java code autogenerated by Mapforce.

In the output tab of the Mapforce application, this problem doesn't
occur. I've not checked whether it occurs when running C#, C++ or XSLT
autogenerated code.

I've encountered this problem when mapping XML to CSV and CSV to CSV.

Does anyone know whether this is a known bug? Is it fixed in a
later release?
Any known workarounds?

Not holding my breath,

Lukas

Dec 9 '05 #1
Correction:

My editor was displaying those bytes incorrectly.
The bytes inserted are actually:

EF BB BF

Dec 12 '05 #2
Lukas wrote:
Hi Group,

In Mapforce 2005 R3, when mapping to CSV with the "First row contains
field names" option UN-checked on the CSV target component settings,
the characters (hex) FF FE FF FE are inserted in the beginning of the
first line when running Java code autogenerated by Mapforce.

In the output tab of the Mapforce application, this problem doesn't
occur. I've not checked whether it occurs when running C#, C++ or XSLT
autogenerated code.

I've encountered this problem when mapping XML to CSV and CSV to CSV.

Does anyone know whether this is a known bug? Is it fixed in a
later release?
Any known workarounds?


It's not a bug, it's part of XML. It's the Byte Order Mark (BOM) which
is designed to signal to a processor before processing starts which
16-bit character encoding is in use. It's being output because your
processor is emitting UCS-2 which is probably unnecessary unless you
are using a very wide range of character repertoire planes. Check the
Mapforce output settings and switch to UTF-8 instead.
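
A minimal sketch of that signalling role, written in Python rather than in the Java that Mapforce generates: it compares the first bytes of a buffer with the marks defined in the standard codecs module. The names BOM_TABLE and sniff_bom are invented for this illustration, and the check only looks at a prefix, so treat it as a heuristic and not as a full encoding detector.

```python
import codecs

# BOM signatures paired with the codec name Python uses for them.
# Longer signatures come first: the UTF-32 LE mark begins with the
# same two bytes (FF FE) as the UTF-16 LE mark.
BOM_TABLE = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is found."""
    for bom, name in BOM_TABLE:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


print(sniff_bom(b"\xff\xfea\x00"))        # starts with the UTF-16 LE mark
print(sniff_bom(b"\xef\xbb\xbfa,b\r\n"))  # starts with the UTF-8 mark
print(sniff_bom(b"a,b\r\n"))              # no mark at all
```

A reader that finds no mark still has to fall back on something else, such as the encoding named in the XML declaration or a configured default.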

///Peter
--
See FAQ: http://xml.silmaril.ie/appendix/glossary/#bom

Dec 12 '05 #3
In article <11**********************@g14g2000cwa.googlegroups.com>,
Lukas <lu*******@yahoo.com> wrote:
My editor was displaying those bytes incorrectly.
The bytes inserted are actually:

EF BB BF


I can't help you directly, but EF BB BF is the UTF-8 code for a
byte-order mark (or "BOM"). Maybe you can look that up in the manual
for your software.
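
This is easy to check outside Mapforce. A small Python sketch, assuming nothing beyond the standard library (hex_dump is a helper name made up here):

```python
import codecs


def hex_dump(data: bytes) -> str:
    """Format bytes as upper-case hex pairs separated by spaces."""
    return " ".join(f"{b:02X}" for b in data)


bom = "\ufeff"  # U+FEFF, the code point used as the byte-order mark

utf8_bytes = bom.encode("utf-8")
print(hex_dump(utf8_bytes))            # EF BB BF
print(utf8_bytes == codecs.BOM_UTF8)   # True

# Decoding the three bytes as UTF-8 gives back the single code point.
print(f"U+{ord(utf8_bytes.decode('utf-8')):04X}")  # U+FEFF
```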

-- Richard
Dec 13 '05 #4
Sorry for the confusion. The sequence was actually EF BB BF (UTF-8 BOM,
as Richard notes).

What confuses me about the UTF-8 BOM issue:

A) In XML: Since I'm using UTF-8, which is a 7 bit encoding, and the
xml processing instruction says so explicitly, why would I want to have
nasty binary at the start of my document?

B)
* In Text (CSV): some articles claim that Windows Notepad handles the
BOM gracefully, but in our project the issue would not even have been
raised if our editors had not displayed spurious characters;
... "ï»¿" (if you view this in ISO 8859-1) in Notepad, a dot in
Ultraedit 8.2. When switching to hex in Ultraedit, completely wrong
values are being displayed throughout the doc.

* The issue did not occur when (in Mapforce) the option "First row
contains field names" was checked for the output CSV, although we
viewed the output files with the same editors.

* Mapforce ITSELF doesn't handle the BOM gracefully. If the CSV output
with BOM from one Mapforce code-gen mapping is fed as input to another,
the BOM is visible in the first field and trips up functions operating
on that field.
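
The last point is easy to reproduce, and one possible workaround on the reading side is to strip the mark while decoding. The sketch below uses Python and its "utf-8-sig" codec purely as an illustration; the sample bytes and the first_field helper are hypothetical, and this is not how Mapforce's generated code reads its input.

```python
import csv
import io

# Hypothetical CSV bytes of the kind a BOM-writing generator might produce.
raw = b"\xef\xbb\xbf1001,Smith\r\n1002,Jones\r\n"


def first_field(data: bytes, encoding: str) -> str:
    """Decode the bytes and return the first field of the first row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.reader(text))[0]


# Plain "utf-8" keeps U+FEFF glued to the front of the first field.
print(repr(first_field(raw, "utf-8")))

# "utf-8-sig" drops a leading BOM if one is present.
print(repr(first_field(raw, "utf-8-sig")))
```

The "utf-8-sig" codec also accepts input that has no BOM, so the same reader works on both kinds of file.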

Dec 14 '05 #5
Sorry, something doesn't display in my last post. It's meant to read:

...

"ï»¿"

(if you view this in ISO 8859-1) in Notepad, a dot ...

Dec 14 '05 #6
In article <11**********************@g44g2000cwa.googlegroups.com>,
Lukas <lu*******@yahoo.com> wrote:
A) In XML: Since I'm using UTF-8, which is a 7 bit encoding, and the
xml processing instruction says so explicitly, why would I want to have
nasty binary at the start of my document?

UTF-8 is not a 7-bit encoding! It corresponds to ASCII for characters
up to 127, but uses bytes with the high bit set to encode the rest of
Unicode.
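
A short Python sketch of that point, using a few sample characters chosen only for illustration (utf8_hex and has_high_bit are made-up helper names):

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of one character as spaced upper-case hex."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


def has_high_bit(ch: str) -> bool:
    """True if any UTF-8 byte of the character has its top bit set."""
    return any(b & 0x80 for b in ch.encode("utf-8"))


samples = [
    "A",       # U+0041, plain ASCII: one byte, 41
    "\u00e9",  # U+00E9, e with acute: C3 A9
    "\u20ac",  # U+20AC, euro sign: E2 82 AC
    "\ufeff",  # U+FEFF, the BOM itself: EF BB BF
]

for ch in samples:
    print(f"U+{ord(ch):04X}  {utf8_hex(ch):<9} high bit set: {has_high_bit(ch)}")
```

Only the ASCII character fits in 7 bits; every other sample needs bytes of 0x80 or above.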

* In Text (CSV): some articles claim that Windows Notepad handles the
BOM gracefully, but in our project the issue would not even have been
raised if our editors had not displayed spurious characters;
.. "ï»¿" (if you view this in ISO 8859-1) in Notepad


I don't know anything about Notepad, but if you see those characters -
i with diaeresis, double greater-than, inverted question mark - it
means that the program is interpreting the document as 8859-1 rather
than UTF-8. Of course, the whole point of the UTF-8 BOM is to let it
know that it's in UTF-8!
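
The effect can be reproduced in a few lines of Python by decoding the same three bytes both ways (the variable names are only illustrative):

```python
bom_bytes = b"\xef\xbb\xbf"

# Read as ISO 8859-1, every byte becomes its own character.
as_latin1 = bom_bytes.decode("iso-8859-1")
print([f"U+{ord(c):04X}" for c in as_latin1])  # ['U+00EF', 'U+00BB', 'U+00BF']

# Read as UTF-8, the three bytes form a single code point, U+FEFF.
as_utf8 = bom_bytes.decode("utf-8")
print(len(as_utf8), f"U+{ord(as_utf8):04X}")   # 1 U+FEFF
```

U+00EF, U+00BB and U+00BF are exactly the i with diaeresis, the double angle quote and the inverted question mark described above.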

-- Richard
Dec 14 '05 #7
Lukas wrote:
Sorry for the confusion. The sequence was actually EF BB BF (UTF-8
BOM, as Richard notes).

What confuses me about the UTF-8 BOM issue:

A) In XML: Since I'm using UTF-8, which is a 7 bit encoding,

Whoah there. UTF-8 uses all 8 bits in the byte. Where did you get the
information that it's 7-bit? The only 7-bit encoding in widespread
use is US-ASCII.

and the
xml processing instruction says so explicitly, why would I want to
have nasty binary at the start of my document?

To identify that it is UTF-8 as opposed to UTF-16 or UTF-32.
If your XML software can't handle it, it's broken and should be
replaced.

B)
* In Text (CSV): some articles claim that Windows Notepad handles the
BOM gracefully, but in our project the issue would not even have been
raised if our editors had not displayed spurious characters;
.. "ï»¿" (if you view this in ISO 8859-1) in Notepad, a dot in
Ultraedit 8.2. When switching to hex in Ultraedit, completely wrong
values are being displayed throughout the doc.

While most plaintext editors will display ASCII or ISO-8859-1
adequately, large numbers of them spit blood when faced with anything
else. Notepad is suitable for shopping lists and not much else.

* The issue did not occur when (in Mapforce) the option "First row
contains field names" was checked for the output CSV, although we
viewed the output files with the same editors.

* Mapforce ITSELF doesn't handle the BOM gracefully. If the CSV output
with BOM from one Mapforce code-gen mapping is fed as input to
another, the BOM is visible in the first field and trips up functions
operating on that field.


Sounds like Mapforce is broken and you should complain to the vendor.

///Peter
--
XML FAQ: http://xml.silmaril.ie/

Dec 14 '05 #8
In <dn**********@pc-news.cogsci.ed.ac.uk>, on 12/14/2005
at 12:59 PM, ri*****@cogsci.ed.ac.uk (Richard Tobin) said:
I don't know anything about Notepad, but if you see those characters -
i with diaeresis, double greater-than, inverted question mark - it
means that the program is interpreting the document as 8859-1 rather
than UTF-8. Of course, the whole point of the UTF-8 BOM is to let it
know that it's in UTF-8!


Why would you need a BOM for UTF-8? It's only needed for characters
larger than an octet, e.g., UTF-16, raw UCS4.

--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail. Reply to
domain Patriot dot net user shmuel+news to contact me. Do not
reply to sp******@library.lspace.org

Dec 19 '05 #9
In article <43***************************@news.patriot.net>,
Shmuel (Seymour J.) Metz <sp******@library.lspace.org.invalid> wrote:
Why would you need a BOM for UTF-8? It's only needed for characters
larger than an octet, e.g., UTF-16, raw UCS4.


It also serves to indicate the encoding, as well as which byte-order
variant.

-- Richard
Dec 19 '05 #10
In <do***********@pc-news.cogsci.ed.ac.uk>, on 12/19/2005
at 08:58 PM, ri*****@cogsci.ed.ac.uk (Richard Tobin) said:
It also serves to indicate the encoding, as well as which byte-order
variant


What byte-order variant? UTF-8 uses a stream of 8-bit bytes (octets),
not a stream of 16-bit bytes; there is no byte ordering issue. The BOM
is needed for UTF-16 and raw Unicode, not for UTF-8.

--
Shmuel (Seymour J.) Metz, SysProg and JOAT <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail. Reply to
domain Patriot dot net user shmuel+news to contact me. Do not
reply to sp******@library.lspace.org

Jan 3 '06 #11
In article <43***************************@news.patriot.net>,
Shmuel (Seymour J.) Metz <sp******@library.lspace.org.invalid> wrote:
It also serves to indicate the encoding, as well as which byte-order
variant

What byte-order variant? UTF-8 uses a stream of 8-bit bytes (octets),
not a stream of 16-bit bytes; there is no byte ordering issue.


The obvious use of a BOM - as the name implies - is to indicate which
byte order variant of an encoding is being used. It is *also* used to
indicate the encoding itself. Obviously for UTF-8 only this second
function is relevant.
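
Both functions can be seen by serialising the same code point, U+FEFF, in different encodings. A minimal Python sketch (hex_dump is a helper name invented for this example):

```python
import codecs


def hex_dump(data: bytes) -> str:
    """Format bytes as upper-case hex pairs separated by spaces."""
    return " ".join(f"{b:02X}" for b in data)


mark = "\ufeff"

# UTF-16 has two serialisations of the mark, one per byte order.
print("utf-16-le", hex_dump(mark.encode("utf-16-le")))  # FF FE
print("utf-16-be", hex_dump(mark.encode("utf-16-be")))  # FE FF

# UTF-8 has exactly one, so there the mark can only act as a signature.
print("utf-8    ", hex_dump(mark.encode("utf-8")))      # EF BB BF

# The two UTF-16 marks are byte-swapped copies of each other.
print(codecs.BOM_UTF16_LE == codecs.BOM_UTF16_BE[::-1])  # True
```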

-- Richard
Jan 4 '06 #12
