literal escape sequence conversion to raw

I need to convert escape sequences entered into my program to the actual
code.

For example, \r becomes 0x0d
I have looked over the FAQ, and searched the web, with no results.
Is there a function that can do this, or do I need to use predefined
constants or a table of the values?

Below is a sample program.
When run with the input w\rx, I want to see the output:

input 77 5c 72 78 a

output 77 d 78

Thanks,

Walter

#include <stdio.h>

#include <stdarg.h>

int main (void)

{

char input[80];

char output[80];

int i;

fgets (input, 79, stdin);

printf ("\ninput ");

for (i = 0; i < strlen (input); i++) {

printf ("%x ", input[i]); }

strcpy (output, "");

sprintf (output, "%s", input);

printf ("\noutput ");

for (i = 0; i < strlen (output); i++) {

printf ("%x ", output[i]); }

exit (0);

}

When run, I want to get this output:

w\rx

input 77 5c 72 78 a

output 77 d 78


Nov 14 '05 #1
On Mon, 05 Jan 2004 15:44:29 -0500, Walter L. Preuninger II wrote:
> I need to convert escape sequences entered into my program to the actual
> code.
>
> For example, \r becomes 0x0d


As simple as this looks, I recommend using a small state machine. The one
displayed at the bottom of my homepage is not too dissimilar:

http://www.ioplex.com/~miallen/
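
For readers who cannot reach that page, here is a minimal sketch of what a state machine for this job could look like. It is not the code from the link above: the name unescape_sm, the two-state enum, and the decision to pass unrecognized sequences through unchanged are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical two-state scanner: NORMAL copies characters,
 * ESCAPE interprets the character that follows a backslash. */
enum state { NORMAL, ESCAPE };

/* Writes the converted string to out, which must be at least as large
 * as in (terminator included), and returns the output length. */
size_t unescape_sm(char *out, const char *in)
{
    enum state st = NORMAL;
    size_t n = 0;

    for (; *in != '\0'; ++in) {
        if (st == NORMAL) {
            if (*in == '\\')
                st = ESCAPE;            /* hold the backslash back for now */
            else
                out[n++] = *in;
        } else {
            switch (*in) {
            case 'n':  out[n++] = '\n'; break;
            case 'r':  out[n++] = '\r'; break;
            case 't':  out[n++] = '\t'; break;
            case '\\': out[n++] = '\\'; break;
            default:                    /* assumption: keep unknown escapes as typed */
                out[n++] = '\\';
                out[n++] = *in;
                break;
            }
            st = NORMAL;
        }
    }
    if (st == ESCAPE)                   /* lone backslash at end of input */
        out[n++] = '\\';
    out[n] = '\0';
    return n;
}
```

Because every escape shrinks or preserves the length, the output never needs more room than the input did.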

Mike
Nov 14 '05 #2
"Walter L. Preuninger II" <wa*****@texramp.net> writes:
> I need to convert escape sequences entered into my program to the actual
> code.
>
> For example, \r becomes 0x0d
> I have looked over the FAQ, and searched the web, with no results.
> Is there a function that can do this, or do I need to use predefined
> constants or a table of the values?


Just write a function to do it. You don't have to know the
numeric values that escape sequences map to, because you can use
the escape sequences themselves to do the mapping. e.g.
    if (in[0] == '\\') {
        switch (in[1]) {
        case 'n': *out++ = '\n'; break;
        case 'r': *out++ = '\r'; break;
        case 't': *out++ = '\t'; break;
        ...
        }
        ...
    }
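
To illustrate the point about letting the compiler supply the values, here is a sketch of a helper that covers the full set of single-character escapes defined by the C language. The name simple_escape and the -1 "not an escape" return value are assumptions for this example, not part of the snippet above.

```c
/* Hypothetical helper: maps the character after a backslash to the
 * character that escape denotes, or returns -1 if it is not one of
 * the simple escape sequences of the C language. No numeric character
 * codes appear anywhere; the compiler fills them in. */
int simple_escape(int c)
{
    switch (c) {
    case 'a':  return '\a';
    case 'b':  return '\b';
    case 'f':  return '\f';
    case 'n':  return '\n';
    case 'r':  return '\r';
    case 't':  return '\t';
    case 'v':  return '\v';
    case '\\': return '\\';
    case '\'': return '\'';
    case '"':  return '"';
    case '?':  return '\?';
    default:   return -1;
    }
}
```

A caller would test in[0] for a backslash, pass in[1] to the helper, and decide for itself what to do when -1 comes back.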
--
"The lusers I know are so clueless, that if they were dipped in clue
musk and dropped in the middle of pack of horny clues, on clue prom
night during clue happy hour, they still couldn't get a clue."
--Michael Girdwood, in the monastery
Nov 14 '05 #3

"Walter L. Preuninger II" <wa*****@texramp.net> wrote in message
> I need to convert escape sequences entered into my program to the
> actual code.
>
> For example, \r becomes 0x0d

So you want to allow the user to enter a string containing C escape
sequences?

> I have looked over the FAQ, and searched the web, with no results.
> Is there a function that can do this, or do I need to use predefined
> constants or a table of the values?

There is no ANSI function that converts a string containing '\' characters
to the C escaped equivalent. Obviously every compiler contains such a
function, but it isn't publicly available.

> Below is a sample program.
> When run with the input w\rx, I want to see the output:
>
> input 77 5c 72 78 a
>
> output 77 d 78

Not sure what you are looking for here. Are you saying you want the space
character to escape to a hexadecimal ASCII code? This is possible, but not
very sensible.

> #include <stdio.h>
> #include <stdarg.h>
>
> int main (void)
> {
> char input[80];
> char output[80];
> int i;
>
> fgets (input, 79, stdin);

This is actually not much of an improvement on gets(). What do you propose
to do on over-long input?

> printf ("\ninput ");
>
> for (i = 0; i < strlen (input); i++) {

This is an O(n*n) algorithm. OK, it is only a demonstration program on short
input, but strlen() will be called on every iteration, and will step through
the string.

> printf ("%x ", input[i]); }
>
> strcpy (output, "");
> sprintf (output, "%s", input);
> printf ("\noutput ");
>
> for (i = 0; i < strlen (output); i++) {
> printf ("%x ", output[i]); }
>
> exit (0);
> }

Try writing a function

void escapestring(char *out, const char *in)
{
}

Which detects C-style escapes and writes the corrected string to out.
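
On the over-long input question raised earlier in this post: one common way to notice it with fgets() is to check whether the stored line ends in a newline. The sketch below assumes a hypothetical helper named read_line and a policy of discarding the excess; other policies (growing the buffer, reporting an error) are equally reasonable.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. Reads one line into buf (size bytes).
 * Returns  1 if a complete line was read (newline stripped),
 *          0 if the line was too long and its remainder was discarded,
 *         -1 on end of file or read error with nothing read. */
int read_line(char *buf, size_t size, FILE *fp)
{
    size_t len;
    int c;

    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }

    /* No newline was stored: look at what follows in the stream. */
    c = getc(fp);
    if (c == EOF || c == '\n')
        return 1;       /* line fit exactly, or the file lacks a final newline */

    while ((c = getc(fp)) != EOF && c != '\n')
        ;               /* discard the rest of the over-long line */
    return 0;
}
```

The return value lets the caller tell a truncated line apart from a short final line, which a bare fgets() call does not.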
Nov 14 '05 #4
"Walter L. Preuninger II" wrote:

> I need to convert escape sequences entered into my program to the actual
> code.
>
> For example, \r becomes 0x0d
> I have looked over the FAQ, and searched the web, with no results.
> Is there a function that can do this, or do I need to use predefined
> constants or a table of the values?


There is no Standard library function to accomplish
this. That's not too surprising, really: the backslash
method of denoting special characters is a convention of
the way C source code is written, not anything intrinsic
in the nature of the special characters themselves. The
translation is provided by the compiler, and is just one
of many compile-time activities that lack run-time analogs.

(Of course, the fact that some operation occurs at
compile time is not a compelling reason not to support it
at run time. For example, the character sequence 314e-2
in C source is compiled into a poor approximation to pi,
and this same transformation is also accomplished by the
run-time strtod() function, among others. 314e-2 is
understandable outside a C context, which may be why its
run-time translation is provided for while the conversion
of \r\n to CR-LF is not -- but even that rationale breaks
down a bit in light of the library's support for blatant
C-isms like 0xA and 012. "No prior art" may be the only
definitive reason -- and "no prior art" may also indicate
that the transformation isn't of wide interest.)

That said, it's pretty easy to perform the translation
yourself if you really need it. Pseudocode:

    char *p;
    char ch;

    for (p = input_string; *p != '\0'; ++p) {
        if (*p != '\\') {
            /* ordinary character represents itself */
            emit_as_output (*p);
        }
        else {
            /* backslash modifies next character */
            ch = translate[ (unsigned char)(*++p) ];
            if (ch != '\0') {
                /* recognized an escape sequence */
                emit_as_output (ch);
            }
            else {
                /* garbage after the backslash */
                complain_bitterly();
                --p; /* restart the scan */
            }
        }
    }

The magic is in the translate[] array, which could be
initialized once at the start of the program:

#include <limits.h>
char translate[1+UCHAR_MAX];
...
translate['r'] = '\r';
translate['n'] = '\n';
...
translate['\\'] = '\\';

... thus avoiding any hard-wired assumptions about the numeric
values of these special characters (ASCII is not the world's
only character encoding, you know).
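
For anyone who wants to compile the idea, the pseudocode and table above could be turned into something like the following. This is a sketch under assumptions: the names init_translate and unescape_table are invented here, output goes to a caller-supplied buffer instead of emit_as_output(), and an unrecognized escape makes the function give up with -1 instead of complaining and rescanning.

```c
#include <limits.h>
#include <stddef.h>

/* Lookup table indexed by the character that follows a backslash.
 * A zero entry means "not a recognized escape". */
static char translate[1 + UCHAR_MAX];

static void init_translate(void)
{
    translate['a']  = '\a';
    translate['b']  = '\b';
    translate['f']  = '\f';
    translate['n']  = '\n';
    translate['r']  = '\r';
    translate['t']  = '\t';
    translate['v']  = '\v';
    translate['\\'] = '\\';
}

/* Hypothetical wrapper around the loop shown above. Returns the number
 * of characters written to out, or -1 if a backslash is followed by
 * something the table does not recognize (including end of string).
 * out must be at least as large as in. */
long unescape_table(char *out, const char *in)
{
    const char *p;
    size_t n = 0;

    init_translate();       /* cheap enough to repeat in a sketch */
    for (p = in; *p != '\0'; ++p) {
        if (*p != '\\') {
            out[n++] = *p;  /* ordinary character represents itself */
        } else {
            char ch = translate[(unsigned char)*++p];
            if (ch == '\0') {
                out[n] = '\0';
                return -1;  /* garbage after the backslash */
            }
            out[n++] = ch;
        }
    }
    out[n] = '\0';
    return (long)n;
}
```

One consequence of using zero as the "unrecognized" marker is that the table cannot express an escape whose value is itself zero; that case would need a separate flag array.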

--
Er*********@sun.com
Nov 14 '05 #5

"Malcolm" <ma*****@55bank.freeserve.co.uk> wrote in message
news:bt**********@newsg1.svr.pol.co.uk...

Thanks to Michael B Allen, Ben Pfaff and yourself for such a quick response.

My intended program is to scan the OFAC SDN (Specially Designated
Nationals, the "terrorist list") file. But I want the program to work on
differently formatted files. So this question was for the soup bowl option
of allowing the user to specify what terminates a line/record. Some files
are CR, some LF, some are CRLF, and I even know of one text file that is
terminated with 0xFF.

So my thought is to provide a command line switch that accepts \r, \a, \r\n,
and hex or octal codes (--delim "\r0xff" etc)
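
A sketch of how such a delimiter argument might be parsed follows. The syntax is an assumption: it uses the C-style forms \xHH and \ooo for numeric codes, not the bare 0xff shown in the example above, because a bare 0xff cannot be told apart from the four literal characters. The names parse_delim and hex_value are invented for illustration. The result is returned as a byte count, not a terminated string, so that a zero byte can be part of a delimiter.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Value of a hexadecimal digit, or -1. Uses a lookup string instead of
 * arithmetic on letters, so it does not depend on ASCII. */
static int hex_value(int c)
{
    static const char digits[] = "0123456789abcdef";
    const char *p;

    if (c == '\0')
        return -1;
    p = strchr(digits, tolower(c));
    return p ? (int)(p - digits) : -1;
}

/* Hypothetical parser for a delimiter specification such as "\r\n".
 * Stores the raw bytes in out (at least strlen(spec) bytes) and returns
 * their count, or -1 on a malformed or out-of-range escape. */
long parse_delim(unsigned char *out, const char *spec)
{
    size_t n = 0;

    while (*spec != '\0') {
        int c = (unsigned char)*spec++;

        if (c != '\\') {
            out[n++] = (unsigned char)c;
            continue;
        }
        c = (unsigned char)*spec++;
        if (c == 'x') {                     /* \xH or \xHH */
            int v = hex_value((unsigned char)*spec);
            int d;
            if (v < 0)
                return -1;
            ++spec;
            d = hex_value((unsigned char)*spec);
            if (d >= 0) {
                v = v * 16 + d;
                ++spec;
            }
            out[n++] = (unsigned char)v;
        } else if (c >= '0' && c <= '7') {  /* one to three octal digits */
            int v = c - '0';
            int k;
            for (k = 0; k < 2 && *spec >= '0' && *spec <= '7'; ++k)
                v = v * 8 + (*spec++ - '0');
            if (v > UCHAR_MAX)
                return -1;
            out[n++] = (unsigned char)v;
        } else {
            switch (c) {
            case 'a':  out[n++] = '\a'; break;
            case 'n':  out[n++] = '\n'; break;
            case 'r':  out[n++] = '\r'; break;
            case 't':  out[n++] = '\t'; break;
            case '\\': out[n++] = '\\'; break;
            default:   return -1;           /* unknown escape or trailing backslash */
            }
        }
    }
    return (long)n;
}
```

With this syntax the 0xFF-terminated file would be handled by passing --delim "\xff" or --delim "\377".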

The sample code I posted produces the input line correctly. The output line
I showed is what I wanted to see, not what the program actually prints,
which simply mirrors the input line.

In theory, the input never goes over 64 characters per line, and I will
detect longer length lines in my code. The program listed was just a test
program, and I never optimize test or proof of concept code.

Thanks for the valuable input!
Walter
Nov 14 '05 #6
Walter L. Preuninger II wrote:
> I need to convert escape sequences entered into my program to the actual
> code.
>
> For example, \r becomes 0x0d

I don't understand. I can think of at least 2 sensible meanings for '\r'
in this context - do you mean that your program actually reads the two
character sequence '\' 'r'? Or that your program reads the character
that C represents by '\r'? Also, when you say it "becomes 0x0d" do you
mean that you want to convert it to the sequence of characters '0', 'x',
'0', 'd' or that you want to map it to the integer value 0x0d (13 in
decimal)? Also, where does the resulting sequence or value come from? As
far as we can tell, 0x0d is completely arbitrary. Is it supposed to be
the value that represents '\r' in some character set? If so, is it a
specific character set (and which one?) or will you use whatever the
execution character set of your implementation is? Note that different
implementations may use different execution character sets (all the
world is not ASCII).

When asking questions here, try to be as precise as possible.

> I have looked over the FAQ, and searched the web, with no results.
> Is there a function that can do this, or do I need to use predefined
> constants or a table of the values?

There is no standard function (or there doesn't seem to be, based on the
possibilities I can think of for what you may be trying to do). If you
are mapping characters to the values that represent them in some
specific character set, you can do so portably only by using a table or
mapping of some sort. If you are using the execution character set then
you can simply use the value of the character that was read (interpreted
as an integer instead of a character).

> Below is a sample program.
> When run with the input w\rx, I want to see the output:
>
> input 77 5c 72 78 a
>
> output 77 d 78
>
> Thanks,
>
> Walter
>
> #include <stdio.h>
>
> #include <stdarg.h>

All these extra blank lines are annoying. Please don't do that in the
future. I've removed some of them.

You don't seem to be using anything from <stdarg.h>.

> int main (void)
> {
>
> char input[80];

Please use sane indenting. Bad indenting (or lack of indenting) makes
the code much more difficult to read.

> char output[80];
> int i;

Judging from how you use 'i', it should probably be a size_t rather than
an int.

> fgets (input, 79, stdin);

fgets() expects the entire buffer size as the 2nd argument, and reads no
more than one fewer than the number specified. In other words, unless
you are saving the last character for some reason, you should use 80 as
the second argument. Better yet, use sizeof(input) for easier maintenance.

> printf ("\ninput ");
>
> for (i = 0; i < strlen (input); i++) {

You need to #include <string.h> for strlen.

> printf ("%x ", input[i]); }

The %x format specifier tells printf to expect an argument of type
unsigned int. Are you absolutely positive that char promotes to unsigned
int on your implementation (and, if so, are you positive that you don't
care about portability)? Note that this would imply that CHAR_MAX >
INT_MAX, making it impossible to implement several standard functions
correctly, and thus is not possible on a hosted implementation.

So in short, you are almost certainly passing the wrong type here. You
can fix it by casting to unsigned int:

    printf("%x ", (unsigned int)input[i]);

> strcpy (output, "");

You could say

    output[0] = '\0';

instead. You need <string.h> for strcpy().

> sprintf (output, "%s", input);
> printf ("\noutput ");
>
> for (i = 0; i < strlen (output); i++) {
>
> printf ("%x ", output[i]); }

Same error as with the other printf().

> exit (0);

You need to #include <stdlib.h> for exit().

Also, a portable program must terminate (non-empty) text streams with a
newline character. In other words, you should do

printf("\n");

or something equivalent before terminating your program if you aren't
sure that the most recent output to stdout ended with a newline. (An
exception being if you've used freopen() to change stdout to binary
mode. I believe this is only possible in C99, and even then it is
implementation-defined what changes in mode are permitted, and under
what circumstances.)

> }
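
Pulling the corrections in this post together, the hex-dump part of the test program might be written along these lines. This is only a sketch: the helper name dump_hex is invented, and routing the cast through unsigned char is an addition beyond what was suggested above, made so that a negative char value does not print as a long run of f digits.

```c
#include <stdio.h>
#include <string.h>   /* strlen */

/* Hypothetical helper: prints label, then each character of s in hex,
 * then a newline so the text stream ends properly. Returns the number
 * of characters dumped. */
size_t dump_hex(FILE *out, const char *label, const char *s)
{
    size_t i;                     /* size_t to match strlen's result */
    size_t len = strlen(s);       /* computed once, not on every iteration */

    fprintf(out, "%s ", label);
    for (i = 0; i < len; i++) {
        /* %x expects unsigned int; go through unsigned char first */
        fprintf(out, "%x ", (unsigned int)(unsigned char)s[i]);
    }
    fprintf(out, "\n");
    return len;
}
```

In main() this would be called as dump_hex(stdout, "input", input) after a successful fgets(input, sizeof input, stdin), and main() can simply return 0, which removes the need for exit() and <stdlib.h>.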


-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.
Nov 14 '05 #7
