Locales, file parsing and isspace, use_facets etc

Hi All,

Is there any way to change what isspace thinks is a space character?

I am parsing some log files, and it would be nice to just read a field as whatever is between quotes or between []'s, i.e. CLF log files.

I know I can go char by char or use find_last_of etc., but I would like to know if it is possible with locales and facets.

Also, are there any suggestions for outputting in multiple languages? Can I use locales again?
I was just thinking of a class that has the language set at runtime and then outputs the correct text for that language.

//---------------------------------------------------------------------------

#include <sstream>
#include <string>
#include <locale>
#include <iostream>
#pragma hdrstop

//---------------------------------------------------------------------------

#pragma argsused
int main(int argc, char* argv[])
{
    std::stringstream strm("209.167.50.22 - - [25/Jan/2006:02:27:14 -0800] \"GET /Services/Development HTTP/1.1\" 301 352 \"-\" \"LinkWalker\"");
    std::string host;
    std::string ident;
    std::string authuser;
    std::string datetime;
    std::string http_request;
    std::string response_code;
    std::string xfer_size;
    std::string referer;
    std::string agent;

    strm >> host;
    strm >> ident;
    strm >> authuser;
    // set the isspace to a ]
    strm >> datetime;
    // set the isspace to a "
    strm >> http_request;
    // set it back to default
    strm >> response_code;
    strm >> xfer_size;
    // set to "
    strm >> referer;
    strm >> agent;

    std::cout << "host: " << host << std::endl;
    std::cout << "ident: " << ident << std::endl;
    std::cout << "authuser: " << authuser << std::endl;
    std::cout << "datetime: " << datetime << std::endl;
    std::cout << "http_request: " << http_request << std::endl;
    std::cout << "response_code: " << response_code << std::endl;
    std::cout << "xfer_size: " << xfer_size << std::endl;
    std::cout << "referer: " << referer << std::endl;
    std::cout << "agent: " << agent << std::endl;

    return 0;
}

--

Adrian

Think you know a language? Post to comp.lang... and find out!
Nov 14 '06 #1
2 Replies



Adrian wrote:
> Hi All,
>
> Is there any way to change what isspace thinks is a space character?

What? Did you mean using alternate separator token(s)? If so, see
below.
Use std::getline to break down the istringstream using a delimiter
token:

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
#include <iterator>

int main()
{
    std::string stest("this$string$uses$an$alt$sep$token");
    std::istringstream iss(stest);

    std::vector< std::string > vs;
    std::string buffer;
    // Each call reads up to (and discards) the next '$'.
    while( std::getline(iss, buffer, '$') )
    {
        vs.push_back(buffer);
    }
    // Print one token per line.
    std::copy( vs.begin(),
               vs.end(),
               std::ostream_iterator< std::string >(std::cout, "\n") );
}

/*
this
string
uses
an
alt
sep
token
*/

If that's not what you are looking for, then restate your question
clearly.
As far as locales are concerned, look up imbue.

Nov 14 '06 #2

Salt_Peter wrote:
> Adrian wrote:
>> Hi All,
>>
>> Is there any way to change what isspace thinks is a space character?
>
> What? Did you mean using alternate separator token(s)? If so, see
> below.

No, I meant: can you change the characters that isspace thinks are whitespace? To be honest, I thought the question was in plain English.
--

Adrian

Think you know a language? Post to comp.lang... and find out!
Nov 14 '06 #3
