literal escape sequence conversion to raw

I need to convert escape sequences entered into my program to the actual
code.

For example, \r becomes 0x0d
I have looked over the FAQ, and searched the web, with no results.
Is there a function that can do this, or do I need to use predefined
constants or a table of the values?

Below is a sample program.
When run with the input w\rx, I want to see the output::

input 77 5c 72 78 a

output 77 d 78

Thanks,

Walter

#include <stdio.h>

#include <stdarg.h>

int main (void)

{

char input[80];

char output[80];

int i;

fgets (input, 79, stdin);

printf ("\ninput ");

for (i = 0; i < strlen (input); i++) {

printf ("%x ", input[i]); }

strcpy (output, "");

sprintf (output, "%s", input);

printf ("\noutput ");

for (i = 0; i < strlen (output); i++) {

printf ("%x ", output[i]); }

exit (0);

}

When run, I want to get this output:

w\rx

input 77 5c 72 78 a

output 77 d 78


Nov 14 '05 #1
On Mon, 05 Jan 2004 15:44:29 -0500, Walter L. Preuninger II wrote:
I need to convert escape sequences entered into my program to the actual
code.

For example, \r becomes 0x0d


As simple as this looks, I recommend using a small state machine. The one
displayed at the bottom of my homepage is not too dissimilar:

http://www.ioplex.com/~miallen/

Mike
Nov 14 '05 #2
"Walter L. Preuninger II" <wa*****@texramp.net> writes:
I need to convert escape sequences entered into my program to the actual
code.

For example, \r becomes 0x0d
I have looked over the FAQ, and searched the web, with no results.
Is there a function that can do this, or do I need to use predefined
constants or a table of the values?


Just write a function to do it. You don't have to know the
numeric values that escape sequences map to, because you can use
the escape sequences themselves to do the mapping. e.g.
    if (in[0] == '\\') {
        switch (in[1]) {
        case 'n': *out++ = '\n'; break;
        case 'r': *out++ = '\r'; break;
        case 't': *out++ = '\t'; break;
        ...
        }
        ...
    }
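
For reference, the fragment above could be fleshed out along the following lines. This is only a sketch: the function name unescape_simple, its return value, and the choice to copy unknown escapes through unchanged are assumptions, not part of the original reply. It covers the simple letter escapes only, not octal or hex ones.

```c
#include <stddef.h>

/* Copy `in` to `out`, replacing two-character sequences such as
 * backslash + 'r' with the single character '\r'.
 * `out` must have room for strlen(in) + 1 bytes; the result is never
 * longer than the input. Returns the number of characters written,
 * not counting the terminating '\0'. */
size_t unescape_simple(char *out, const char *in)
{
    char *start = out;

    while (*in != '\0') {
        if (in[0] == '\\' && in[1] != '\0') {
            switch (in[1]) {
            case 'a':  *out++ = '\a'; break;
            case 'b':  *out++ = '\b'; break;
            case 'f':  *out++ = '\f'; break;
            case 'n':  *out++ = '\n'; break;
            case 'r':  *out++ = '\r'; break;
            case 't':  *out++ = '\t'; break;
            case 'v':  *out++ = '\v'; break;
            case '\\': *out++ = '\\'; break;
            default:
                /* Unknown escape (assumption): keep both characters. */
                *out++ = in[0];
                *out++ = in[1];
                break;
            }
            in += 2;
        } else {
            /* Ordinary character, or a lone backslash at the very end. */
            *out++ = *in++;
        }
    }
    *out = '\0';
    return (size_t)(out - start);
}
```

Because the right-hand sides are written as '\r', '\n' and so on, the code never mentions a numeric value, so it does not depend on the character set in use.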
--
"The lusers I know are so clueless, that if they were dipped in clue
musk and dropped in the middle of pack of horny clues, on clue prom
night during clue happy hour, they still couldn't get a clue."
--Michael Girdwood, in the monastery
Nov 14 '05 #3

"Walter L. Preuninger II" <wa*****@texramp.net> wrote in message
I need to convert escape sequences entered into my program to the
actual code.

For example, \r becomes 0x0d
So you want to allow the user to enter a string containing C escape
sequences?
I have looked over the FAQ, and searched the web, with no results.
Is there a function that can do this, or do I need to use predefined
constants or a table of the values?
There is no ANSI function that converts a string containing '\' characters
to the C escaped equivalent. Obviously every compiler contains such a
function, but it isn't publicly available.
Below is a sample program.
When run with the input w\rx, I want to see the output::

input 77 5c 72 78 a

output 77 d 78
Not sure what you are looking for here. Are you saying you want the space
character to escape to a hexadecimal ASCII code? This is possible, but not
very sensible.
#include <stdio.h>
#include <stdarg.h>

int main (void)

{

char input[80];

char output[80];

int i;

fgets (input, 79, stdin);
This is actually not much of an improvement on gets(). What do you propose
to do on over-long input?
printf ("\ninput ");

for (i = 0; i < strlen (input); i++) {
This is an O(n*n) algorithm. OK, it is only a demonstration program on short
input, but strlen() will be called on every iteration, and will step through
the whole string each time.
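
One way to avoid the repeated strlen() calls is to compute the length once before the loop. A minimal sketch, where dump_hex is a hypothetical helper name and the FILE * parameter is an assumption made so the output can go somewhere other than stdout:

```c
#include <stdio.h>
#include <string.h>

/* Write the bytes of `s` to `fp` as space-separated hex values.
 * strlen() is called once, so the loop is linear in the length. */
void dump_hex(FILE *fp, const char *s)
{
    size_t len = strlen(s);   /* hoisted out of the loop condition */
    size_t i;

    for (i = 0; i < len; i++) {
        /* Go through unsigned char so bytes above 0x7f print as two digits. */
        fprintf(fp, "%x ", (unsigned int)(unsigned char)s[i]);
    }
}
```

In the test program this would be called as dump_hex(stdout, input).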
printf ("%x ", input[i]); }

strcpy (output, "");

sprintf (output, "%s", input);

printf ("\noutput ");

for (i = 0; i < strlen (output); i++) {

printf ("%x ", output[i]); }

exit (0);

}

Try writing a function

void escapestring(char *out, const char *in)
{
}

which detects C-style escapes and writes the corrected string to out.
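
A possible body for escapestring(), offered as a sketch rather than as what the poster had in mind. Beyond a few letter escapes it also accepts numeric ones. The two-digit limit on hex escapes and the pass-through of unknown escapes are assumptions made here to keep the example small (a C compiler consumes every hex digit that follows \x).

```c
#include <ctype.h>
#include <stdlib.h>

/* Convert C-style escapes in `in` and write the result to `out`.
 * Handles \n \r \t \\, octal \ooo (1 to 3 digits) and hex \xH or \xHH.
 * `out` must have room for strlen(in) + 1 bytes.
 * Note: an escape such as \0 stores a '\0' in the middle of `out`,
 * and octal values above UCHAR_MAX are not checked. */
void escapestring(char *out, const char *in)
{
    while (*in != '\0') {
        if (*in != '\\' || in[1] == '\0') {
            *out++ = *in++;              /* ordinary character */
            continue;
        }
        in++;                            /* skip the backslash */
        if (*in == 'x' && isxdigit((unsigned char)in[1])) {
            char digits[3] = {0};
            int n = 0;

            in++;                        /* skip the 'x' */
            while (n < 2 && isxdigit((unsigned char)*in))
                digits[n++] = *in++;
            *out++ = (char)strtol(digits, NULL, 16);
        } else if (*in >= '0' && *in <= '7') {
            int value = 0;
            int n = 0;

            while (n < 3 && *in >= '0' && *in <= '7') {
                value = value * 8 + (*in++ - '0');
                n++;
            }
            *out++ = (char)value;
        } else {
            switch (*in) {
            case 'n':  *out++ = '\n'; break;
            case 'r':  *out++ = '\r'; break;
            case 't':  *out++ = '\t'; break;
            case '\\': *out++ = '\\'; break;
            default:                     /* unknown: keep as typed */
                *out++ = '\\';
                *out++ = *in;
                break;
            }
            in++;
        }
    }
    *out = '\0';
}
```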
Nov 14 '05 #4
"Walter L. Preuninger II" wrote:

I need to convert escape sequences entered into my program to the actual
code.

For example, \r becomes 0x0d
I have looked over the FAQ, and searched the web, with no results.
Is there a function that can do this, or do I need to use predefined
constants or a table of the values?


There is no Standard library function to accomplish
this. That's not too surprising, really: the backslash
method of denoting special characters is a convention of
the way C source code is written, not anything intrinsic
in the nature of the special characters themselves. The
translation is provided by the compiler, and is just one
of many compile-time activities that lack run-time analogs.

(Of course, the fact that some operation occurs at
compile time is not a compelling reason not to support it
at run time. For example, the character sequence 314e-2
in C source is compiled into a poor approximation to pi,
and this same transformation is also accomplished by the
run-time strtod() function, among others. 314e-2 is
understandable outside a C context, which may be why its
run-time translation is provided for while the conversion
of \r\n to CR-LF is not -- but even that rationale breaks
down a bit in light of the library's support for blatant
C-isms like 0xA and 012. "No prior art" may be the only
definitive reason -- and "no prior art" may also indicate
that the transformation isn't of wide interest.)

That said, it's pretty easy to perform the translation
yourself if you really need it. Pseudocode:

    char *p;
    char ch;

    for (p = input_string; *p != '\0'; ++p) {
        if (*p != '\\') {
            /* ordinary character represents itself */
            emit_as_output (*p);
        }
        else {
            /* backslash modifies next character */
            ch = translate[ (unsigned char)(*++p) ];
            if (ch != '\0') {
                /* recognized an escape sequence */
                emit_as_output (ch);
            }
            else {
                /* garbage after the backslash */
                complain_bitterly();
                --p; /* restart the scan */
            }
        }
    }

The magic is in the translate[] array, which could be
initialized once at the start of the program:

#include <limits.h>
char translate[1+UCHAR_MAX];
...
translate['r'] = '\r';
translate['n'] = '\n';
...
translate['\\'] = '\\';

... thus avoiding any hard-wired assumptions about the numeric
values of these special characters (ASCII is not the world's
only character encoding, you know).
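
The pseudocode and table above might be combined into something compilable like this. It is a sketch under assumptions: init_translate and translate_escapes are hypothetical names, output goes to a caller-supplied buffer instead of emit_as_output(), and "complaining" is reduced to counting the bad escapes.

```c
#include <limits.h>

/* Zero-initialized, so any entry not set below means "not an escape". */
static char translate[1 + UCHAR_MAX];

/* Fill in the table once, before the first call to translate_escapes(). */
void init_translate(void)
{
    translate['a']  = '\a';
    translate['b']  = '\b';
    translate['f']  = '\f';
    translate['n']  = '\n';
    translate['r']  = '\r';
    translate['t']  = '\t';
    translate['v']  = '\v';
    translate['\\'] = '\\';
}

/* Copy `in` to `out`, translating backslash escapes via the table.
 * `out` must have room for strlen(in) + 1 bytes.
 * Returns the number of unrecognized escapes; for those, the backslash
 * is dropped and the following character is rescanned as ordinary text. */
int translate_escapes(char *out, const char *in)
{
    const char *p;
    int errors = 0;

    for (p = in; *p != '\0'; ++p) {
        if (*p != '\\') {
            *out++ = *p;                 /* ordinary character */
        } else {
            char ch = translate[(unsigned char)*++p];

            if (ch != '\0') {
                *out++ = ch;             /* recognized escape */
            } else {
                ++errors;                /* garbage after the backslash */
                --p;                     /* restart the scan */
            }
        }
    }
    *out = '\0';
    return errors;
}
```

A backslash at the very end of the input is counted as an error too, since translate['\0'] is zero and the loop then stops at the terminator.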

--
Er*********@sun.com
Nov 14 '05 #5

"Malcolm" <ma*****@55bank.freeserve.co.uk> wrote in message
news:bt**********@newsg1.svr.pol.co.uk...

Thanks to Michael B Allen, Ben Pfaff and yourself for such a quick response.

My intended program is to scan the OFAC SDN (Specially Designated
Nationals, the "terrorist list") file. But I want the program to work on
differently formatted files. So this question was for the soup bowl option
of allowing the user to specify what terminates a line/record. Some files
are CR, some LF, some are CRLF, and I even know of one text file that is
terminated with 0xFF

So my thought is to provide a command line switch that accepts \r, \a, \r\n,
and hex or octal codes (--delim "\r0xff" etc)
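
A rough sketch of how such a --delim argument might be decoded. The syntax is a guess at what is described above, not a specification: backslash escapes, numeric codes in any form strtol() accepts with base 0 (0xff, 012, 255), and everything else taken literally. parse_delim is a hypothetical name.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Decode a delimiter argument such as "\r0xff" into raw bytes.
 * Returns the number of bytes stored in `out`, or -1 on an unknown
 * escape, a numeric code outside 0..UCHAR_MAX, or too many bytes. */
int parse_delim(const char *arg, unsigned char *out, size_t outsize)
{
    size_t n = 0;

    while (*arg != '\0') {
        unsigned char byte;

        if (*arg == '\\') {
            switch (arg[1]) {
            case 'a':  byte = '\a'; break;
            case 'n':  byte = '\n'; break;
            case 'r':  byte = '\r'; break;
            case 't':  byte = '\t'; break;
            case '\\': byte = '\\'; break;
            default:   return -1;        /* includes a trailing backslash */
            }
            arg += 2;
        } else if (isdigit((unsigned char)*arg)) {
            char *end;
            long value = strtol(arg, &end, 0);   /* base 0: 0x.. hex, 0.. octal */

            if (value < 0 || value > UCHAR_MAX)
                return -1;
            byte = (unsigned char)value;
            arg = end;
        } else {
            byte = (unsigned char)*arg++;        /* literal character */
        }

        if (n >= outsize)
            return -1;
        out[n++] = byte;
    }
    return (int)n;
}
```

With this reading, "\r0xff" yields the two bytes 0x0d and 0xff. One limitation of the guessed syntax is that two numeric codes cannot be written back to back, because strtol() would read them as a single number.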

The sample code I posted produces the input line, I wanted to see the output
line like I showed, not what the program would have given me, where the
output line would have mirrored the input line.

In theory, the input never goes over 64 characters per line, and I will
detect longer length lines in my code. The program listed was just a test
program, and I never optimize test or proof of concept code.

Thanks for the valuable input!
Walter
Nov 14 '05 #6
Walter L. Preuninger II wrote:
I need to convert escape sequences entered into my program to the actual
code.

For example, \r becomes 0x0d
I don't understand. I can think of at least 2 sensible meanings for '\r'
in this context - do you mean that your program actually reads the two
character sequence '\' 'r'? Or that your program reads the character
that C represents by '\r'? Also, when you say it "becomes 0x0d" do you
mean that you want to convert it to the sequence of characters '0', 'x',
'0', 'd' or that you want to map it to the integer value 0x0d (13 in
decimal)? Also, where does the resulting sequence or value come from? As
far as we can tell, 0x0d is completely arbitrary. Is it supposed to be
the value that represents '\r' in some character set? If so, is it a
specific character set (and which one?) or will you use whatever the
execution character set of your implementation is? Note that different
implementations may use different execution character sets (all the
world is not ASCII).

When asking questions here, try to be as precise as possible.
I have looked over the FAQ, and searched the web, with no results.
Is there a function that can do this, or do I need to use predefined
constants or a table of the values?
There is no standard function (or there doesn't seem to be, based on the
possibilities I can think of for what you may be trying to do). If you
are mapping characters to the values that represent them in some
specific character set, you can do so portably only by using a table or
mapping of some sort. If you are using the execution character set then
you can simply use the value of the character that was read (interpreted
as an integer instead of a character).

Below is a sample program.
When run with the input w\rx, I want to see the output::

input 77 5c 72 78 a

output 77 d 78

Thanks,

Walter

#include <stdio.h>

#include <stdarg.h>
All these extra blank lines are annoying. Please don't do that in the
future. I've removed some of them.

You don't seem to be using anything from <stdarg.h>.

int main (void)
{

char input[80];
Please use sane indenting. Bad indenting (or lack of indenting) makes
the code much more difficult to read.

char output[80];
int i;
Judging from how you use 'i', it should probably be a size_t rather than
an int.

fgets (input, 79, stdin);
fgets() expects the entire buffer size as the 2nd argument, and reads no
more than one fewer than the number specified. In other words, unless
you are saving the last character for some reason, you should use 80 as
the second argument. Better yet, use sizeof(input) for easier maintenance.
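
A sketch of that fgets() call wrapped in a helper that also reports over-long lines. The name read_line and its return convention are assumptions for illustration only.

```c
#include <stdio.h>
#include <string.h>

/* Read one line from `fp` into `buf` (capacity `size`, as from sizeof).
 * Returns  1 if the whole line fit (the trailing newline is removed),
 *          0 if the line was too long (the excess is read and discarded),
 *         -1 on end of file or read error. */
int read_line(FILE *fp, char *buf, size_t size)
{
    char *nl;
    int c;
    int lost = 0;

    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    nl = strchr(buf, '\n');
    if (nl != NULL) {
        *nl = '\0';
        return 1;
    }

    /* No newline stored: either the input ended or the line did not fit.
     * Consume the rest of the line and note whether anything was dropped. */
    while ((c = getc(fp)) != EOF && c != '\n')
        lost = 1;

    return lost ? 0 : 1;
}
```

In the program under discussion the call would look like read_line(stdin, input, sizeof input), so the size stays correct if the array declaration changes.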

printf ("\ninput ");

for (i = 0; i < strlen (input); i++) {
You need to #include <string.h> for strlen.

printf ("%x ", input[i]); }
The %x format specifier tells printf to expect an argument of type
unsigned int. Are you absolutely positive that char promotes to unsigned
int on your implementation (and, if so, are you positive that you don't
care about portability)? Note that this would imply that CHAR_MAX >
INT_MAX, making it impossible to implement several standard functions
correctly, and thus is not possible on a hosted implementation.

So in short, you are almost certainly passing the wrong type here. You
can fix it by casting to unsigned int:

printf("%x ", (unsigned int)input[i]);
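
One caveat worth illustrating: where plain char is signed, a negative char converted straight to unsigned int becomes a very large value, so a byte such as 0xFF would typically print as ffffffff with a 32-bit int. Converting through unsigned char first keeps the value in the range 0..UCHAR_MAX. A small sketch, assuming 8-bit bytes and a C99 library for snprintf(); format_hex_byte is a hypothetical name:

```c
#include <stdio.h>

/* Format one character as lowercase hex into `dst`.
 * The conversion to unsigned char happens first, so the result is the
 * byte value even on implementations where plain char is signed.
 * Returns what snprintf() returns: the number of characters needed. */
int format_hex_byte(char *dst, size_t size, char c)
{
    return snprintf(dst, size, "%x", (unsigned int)(unsigned char)c);
}
```

The same double conversion can be used directly in the loop: printf("%x ", (unsigned int)(unsigned char)input[i]);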

strcpy (output, "");
You could say

output[0] = '\0';

instead. You need <string.h> for strcpy().

sprintf (output, "%s", input);
printf ("\noutput ");

for (i = 0; i < strlen (output); i++) {

printf ("%x ", output[i]); }
Same error as with the other printf().

exit (0);
You need to #include <stdlib.h> for exit().

Also, a portable program must terminate (non-empty) text streams with a
newline character. In other words, you should do

printf("\n");

or something equivalent before terminating your program if you aren't
sure that the most recent output to stdout ended with a newline. (An
exception being if you've used freopen() to change stdout to binary
mode. I believe this is only possible in C99, and even then it is
implementation-defined what changes in mode are permitted, and under
what circumstances.)

}


-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.
Nov 14 '05 #7
