remove punctuation in C

7
hey,
can anyone help me write up a function to remove space, semicolons, etc?
ty.
Nov 21 '08 #1
donbock
2,426 Expert 2GB
Do you have any suggestions for how you might recognize punctuation or space characters in a string?
Nov 21 '08 #2
kpdp
7
So, I think when you are using the remove_punc function, you have to write the code so that it keeps the characters while removing the punctuation.
Nov 21 '08 #3
oler1s
671 Expert 512MB
So, I think when you are using the remove_punc function, you have to write the code so that it keeps the characters while removing the punctuation.
Yes, presumably it’s why you called it a remove punctuation function, as opposed to retain punctuation function or something. We aren’t sure what your question is though. You requested help, but that’s not necessary. We are all here to help.

But what is your question?
Nov 21 '08 #4
kpdp
7
I'm sorry. I have just started learning programming so I am trying to teach myself the basics with functions and strings.
My question is: how do I write something like "My name is Peter" and get back "mynameispeter"?
How do I use the function?
Nov 21 '08 #5
donbock
2,426 Expert 2GB
I'm sorry. I have just started learning programming so I am trying to teach myself the basics with functions and strings.
My question is: how do I write something like "My name is Peter" and get back "mynameispeter"?
How do I use the function?
First let's specify the interface to this function. It needs an input string and it needs to return a string. The input string is easy to handle: one of the function arguments is a pointer to the input string. I can think of three ways to handle the output string: (1) since the length of the output string can never be greater than the length of the input string, you can overwrite the input string; (2) you can malloc a buffer for the output string; and (3) the caller can pass you a buffer to use for the output string. Options (2) and (3) have failure modes (malloc can fail in #2; and the passed buffer can be too small in #3), so some method is needed to distinguish success from failure. Which of these sounds like what you want to do? I suggest you pick an interface and then create a corresponding function prototype.
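As a rough sketch of option (3), assuming a caller-supplied buffer and an int status code: the name remove_punc_copy, its parameter order, and the 0 / -1 return convention are made up for illustration and are not from this thread. It drops whatever isspace() or ispunct() reports and fails cleanly when the buffer is too small.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical name and signature, not from the thread.
 * Copies src into dst, skipping space and punctuation characters.
 * Returns 0 on success, -1 if dst (dst_size bytes) is too small. */
int remove_punc_copy(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;
        if (isspace(c) || ispunct(c))
            continue;                 /* drop this character */
        if (out + 1 >= dst_size)
            return -1;                /* no room for c plus the terminator */
        dst[out++] = (char)c;
    }
    if (dst_size == 0)
        return -1;                    /* cannot even store the terminator */
    dst[out] = '\0';
    return 0;
}
```

A 14-byte buffer is just enough for "My name is Peter" (13 kept characters plus the terminator); with anything smaller the call returns -1 and the contents of dst should be treated as unspecified.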
Nov 21 '08 #6
oler1s
671 Expert 512MB
My question is: how do I write something like "My name is Peter" and get back "mynameispeter"?
There are two aspects to that question. One is the issue of code: what exact code do you need to write? The other is the algorithm, which is partly independent of the language and code. How do you approach something like this algorithmically? So you need to sit and think about algorithms. What data structure are you dealing with? How would you do the work on paper? What does pseudocode or plain English instructions for the algorithm look like? That’s your primary question.
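One way to answer those questions for a C string, sketched under assumptions: treat the string as an array of char, walk it once with a read position, and copy each character worth keeping to a write position that never gets ahead of the reader. The function name squeeze_alnum_lower is hypothetical, and the lowercasing is included only because the example output "mynameispeter" is all lowercase.

```c
#include <ctype.h>

/* Hypothetical helper: keeps only letters and digits, lowercased,
 * overwriting the string in place. */
void squeeze_alnum_lower(char *s)
{
    char *write = s;                       /* next position to fill */

    for (const char *read = s; *read != '\0'; ++read) {
        unsigned char c = (unsigned char)*read;
        if (isalnum(c))
            *write++ = (char)tolower(c);   /* keep, lowercased */
    }
    *write = '\0';                         /* terminate the shortened string */
}
```

Because the output can never be longer than the input, no extra buffer is needed; the string must be writable, though, so a string literal cannot be passed directly.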
Nov 21 '08 #7
@kpdp
I haven't done this in C but have done this in C++, Java and C#.

pseudocode:

#include <string>

std::string someFunction(const std::string& string) {
    std::string str;
    // Loop through the string passed in, character by character.
    for (std::string::size_type i = 0; i < string.size(); i++) {
        char j = string[i];
        // These numbers are ASCII values: 65-90 are uppercase letters,
        // 97-122 lowercase letters, 48-57 digits, and 32 is the space.
        if ((j >= 65 && j <= 90) || (j >= 97 && j <= 122) || j == 32 || (j >= 48 && j <= 57)) {
            str += string[i];
        }
    }
    return str;
}

This is C++ code... it may even compile, if called correctly...

For string manipulations, ASCII values are the way to go!

I'm not a pro, but this algorithm should work in C as well!
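For comparison, a hedged C rendering of the same idea. C has no growable string type, so this sketch allocates the result with malloc (option 2 from post #6); the name keep_alnum_space is invented here, and it uses isalnum() rather than numeric codes.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C counterpart of the C++ sketch: returns a newly
 * allocated copy of `in` holding only letters, digits and spaces,
 * or NULL if malloc fails. The caller must free() the result. */
char *keep_alnum_space(const char *in)
{
    char *out = malloc(strlen(in) + 1);    /* output is never longer than input */
    if (out == NULL)
        return NULL;

    size_t n = 0;
    for (; *in != '\0'; ++in) {
        unsigned char c = (unsigned char)*in;
        if (isalnum(c) || c == ' ')
            out[n++] = (char)c;            /* keep this character */
    }
    out[n] = '\0';
    return out;
}
```

The NULL return is the failure signal, so a caller has to check it before using the result and free() the string afterwards.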

drjay
Nov 23 '08 #8
JosAH
11,448 Expert 8TB
@drjay1627
No they are not.

kind regards,

Jos
Nov 23 '08 #9
@JosAH
why not??? what do you suggest?
Nov 23 '08 #10
boxfish
469 Expert 256MB
@drjay1627
Because the ASCII values of characters may be different on different platforms. It is safer to use the actual letters, like this:
char j = string[i];
if((j >= 'a' && j <= 'z') || (j >= 'A' && j <= 'Z') || j == ' ' || (j >= '0' && j <= '9')) {
    str += string[i];
}
But the best thing to do is use isalpha() and friends. They're in ctype.h, or cctype in C++.
unsigned char j = string[i];  // the ctype functions expect an unsigned char value
if(isalpha(j) || isspace(j) || isdigit(j)) {
    str += string[i];
}
islower() and isupper() may be useful too.
Hope this helps.
Edit:
Also check out ispunct().
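Since every ctype.h classifier has the same int(int) shape, one possible C sketch takes the test as a function pointer, so the same loop can strip punctuation, whitespace, or anything else. The helper name remove_if_char is an assumption, not a standard function.

```c
#include <ctype.h>

/* Hypothetical helper: removes, in place, every character for which
 * drop() returns nonzero. Any <ctype.h> classifier such as ispunct
 * or isspace matches the int (*)(int) signature. */
void remove_if_char(char *s, int (*drop)(int))
{
    char *write = s;                       /* next position to fill */

    for (const char *read = s; *read != '\0'; ++read) {
        /* Cast first: the classifiers require an unsigned char value. */
        if (!drop((unsigned char)*read))
            *write++ = *read;
    }
    *write = '\0';
}
```

Calling remove_if_char(text, ispunct) and then remove_if_char(text, isspace) turns "Hi, there; friend!" into "Hitherefriend".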
Nov 24 '08 #11
@boxfish
Thanks! I didn't know that ASCII values were different! When you say platform, do you mean the OS or the compiler?

If the OS, my programs worked on both Linux and Windows; if the compiler, my program compiles with both VS and g++.
Nov 24 '08 #12
JosAH
11,448 Expert 8TB
Waidaminnit: ASCII is ASCII is ASCII. In ASCII a capital A has the code 65, everywhere. The C or C++ language (or any other language) doesn't force you to remember those silly codes, simply use 'A' and the language does the rest. This isn't the 1960s anymore.

Those languages can do that on non-ASCII machines as well, e.g. on an IBM EBCDIC machine the notation 'A' results in the correct EBCDIC code for a capital A. Dumb numbers can't do that.

kind regards,

Jos
Nov 24 '08 #13
donbock
2,426 Expert 2GB
@boxfish
Waidaminnit: ASCII is ASCII is ASCII.
I'm sure what boxfish meant was that the encoding of characters may be different on different platforms. Some might use ASCII, some might use EBCDIC, some might use other encoding systems. That's one reason why character constants are better than bare numbers.

Another reason is that bare numbers (so called magic numbers) should be avoided because they poorly document/explain what the code is trying to do.
Of course, this objection could be met by a collection of macro definitions such as this [tongue firmly in cheek]:
#define CAPITAL_A 65

However, if you're totally paranoid about portable code I don't think you can even take for granted that the range of character codes between 'a' and 'z' are exclusively lower case letters. Perhaps there are holes, perhaps they are out of order.

By far the best way to go is to use the functions in ctype.h (suggested by boxfish).
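The holes are real on EBCDIC, where the lowercase letters sit in three separate runs (a-i, j-r, s-z) with other codes in between; the C standard only promises that '0' through '9' are contiguous. A small sketch that measures the gap on whatever machine compiles it (the function name is made up):

```c
#include <ctype.h>

/* Counts the character codes between 'a' and 'z' that isalpha()
 * rejects. On an ASCII system in the "C" locale this is 0; on EBCDIC,
 * where the letters are not one contiguous run, it should be nonzero. */
int non_letters_between_a_and_z(void)
{
    int count = 0;

    for (int c = 'a'; c <= 'z'; ++c) {
        if (!isalpha(c))
            ++count;                  /* a code inside the range that is not a letter */
    }
    return count;
}
```

On an ASCII system this returns 0, which is exactly why range checks like j >= 'a' && j <= 'z' appear to work there.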
Nov 24 '08 #14
You need to use ctype.h to remove spaces, punctuation, and the like.

See the following palindrome checker:

/* Palindrome checker. The original exercise asked for a stack
   implemented with an array; this version uses two indices instead.
Note: punctuation, capitalization, and spaces are ignored. For example: Poor Dan is in a droop.

Date: February 21, 2014 */

#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdbool.h>

bool is_palindrome(const char* s)
{
    int i = 0;
    int j = (int)strlen(s) - 1;
    while(i < j)
    {
        if(!isalpha((unsigned char)s[i]))
        {
            ++i;    /* skip a non-letter on the left */
        }
        else if(!isalpha((unsigned char)s[j]))
        {
            --j;    /* skip a non-letter on the right */
        }
        else if(tolower((unsigned char)s[i]) != tolower((unsigned char)s[j]))
        {
            return false;
        }
        else
        {
            ++i;    /* letters match: move both ends inward */
            --j;
        }
    }
    return true;
}


void printme(const char* s)
{
    printf("\"%s\"", s);
    if(is_palindrome(s))
    {
        printf(" IS a palindrome!\n");
    }
    else
    {
        printf(" IS NOT a palindrome!\n");
    }
}

int main(void)
{
    char s[] = "kajak";
    char s2[] = "Poor Dan is in a droop";
    char s3[] = "not a palindrome";

    printme(s);
    printme(s2);
    printme(s3);
    return 0;
}
Mar 1 '14 #15
