Unicode pattern matching

Anyone? Can you refer me somewhere that might help?

Jun 2 '07 #2

change this:

    sub unicode_latin1_lax {
        my ($dataline) = @_;
        $dataline =~ s/\x{2018}/\~/g; # Should translate unicode to what I want
        # from_to ($dataline, "utf8", "iso-8859-1", 0);
    }

to:

    sub unicode_latin1_lax {
        my ($dataline) = @_;
        $dataline =~ s/\x{2018}/\~/g; # Should translate unicode to what I want
        # from_to ($dataline, "utf8", "iso-8859-1", 0);
        return($dataline);
    }

and retry your script. Because $dataline is declared with "my" inside the subroutine, it's not visible outside the subroutine, so the modified value has to be returned.
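
To illustrate the scoping point, here is a minimal sketch. The sub name replace_left_quote and the sample string are hypothetical stand-ins, not taken from the original script; it only shows that the caller sees the change through the return value.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the sub in this thread. It works on a private
# copy of its argument, so the caller only sees the change via the return value.
sub replace_left_quote {
    my ($dataline) = @_;            # lexical copy, not visible outside the sub
    $dataline =~ s/\x{2018}/~/g;    # U+2018 LEFT SINGLE QUOTATION MARK -> "~"
    return $dataline;
}

my $line = "\x{2018}quoted";
replace_left_quote($line);               # return value discarded: $line is unchanged
my $fixed = replace_left_quote($line);   # capture the result instead
```

The call site matters as much as the return statement: the result has to be assigned to something, otherwise the original string is left as it was.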

Jun 2 '07 #3

Thanks Kevin, amateur mistake, and I've made the correction. In the long run, however, that's not the problem. I've fixed that and re-run the script, and the Unicode characters still aren't being changed to something else. Any other thoughts?

Jun 4 '07 #4

What is this regexp supposed to do?

$dataline =~ s/\x{2018}/\~/g;

Jun 4 '07 #5

It's supposed to pattern match and find the hex value in the string. From the O'Reilly book...

*****************************
3rd Edition, Programming Perl, Page 164
*****************************
\x{LONGHEX}
\xHEX
A character number specified as one or two hex digits ([0-9a-fA-F]), as in \x1B. The one-digit form is usable only if the character following it is not a hex digit. If braces are used, you may use as many digits as you'd like, which may result in a Unicode character. For example, \x{262f} matches a Unicode YIN YANG.
*****************************

The value within the braces is the hex code that Perl itself reported. I told Perl to translate the Unicode to Latin-1; however, some characters do not translate, and when that happens Perl produces an error containing the hex value of the Unicode character that will not translate. I wanted to take that information and do a substitution of my own, which is how the statement you're asking about came to be. So I'm trying to take that Unicode hex value, which resides in a text file, and convert it to a Latin-1 character of my choice. My understanding from the O'Reilly excerpt above is that I could use \x{...} to find and substitute that particular Unicode character.

Here's the information Perl is giving me directly when attempting to translate one of the lines from the text file....

"\x{2018}" does not map to iso-8859-1 at C:/Perl_58/site/lib/Encode.pm line 183.

And here is my edit of the output...
ERROR (578): "\x{2018}" does not map to iso-8859-1, DEC. 8216 - EX. line 123

Here, 578 is how many occurrences there are in the text file, \x{2018} is the value Perl gives, 8216 is the decimal equivalent of the hex value, and 123 is a line in the text file that contains an example.
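
One possibility worth checking, offered as a sketch and not as a confirmed diagnosis: if $dataline still holds raw UTF-8 bytes when the substitution runs, U+2018 is present as the three bytes E2 80 98, and \x{2018} (which matches a single character) cannot match it. Decoding first, substituting, and then encoding avoids that. The sub name utf8_to_latin1_lax and the sample data below are hypothetical; decode, encode and FB_CROAK come from the Encode module already used in this thread.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes raw UTF-8 bytes and returns Latin-1 bytes.
# Assumption: the input has NOT been decoded yet (e.g. read without an
# :encoding layer), which this thread does not confirm.
sub utf8_to_latin1_lax {
    my ($bytes) = @_;
    # Decode bytes to characters first so \x{2018} can match one character.
    my $text = decode('UTF-8', $bytes);
    # Replace the character that has no Latin-1 mapping with one of our choice.
    $text =~ s/\x{2018}/~/g;
    # FB_CROAK makes encode die on any character that still has no mapping.
    return encode('iso-8859-1', $text, Encode::FB_CROAK());
}

# Sample input: UTF-8 bytes for U+2018 followed by "caf" and e-acute.
my $raw    = "\xE2\x80\x98caf\xC3\xA9";
my $latin1 = utf8_to_latin1_lax($raw);
```

Doing the substitution between the decode and encode steps means any remaining unmappable character still raises the same "does not map to iso-8859-1" error, so the error report can keep driving which replacements to add.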

Jun 5 '07 #6

sorry mate, I don't know.

Jun 5 '07 #7

Anyone know another forum that might help? I'm having little success in getting help with this issue. Anything you can do is appreciated.

Jun 20 '07 #8

try perlmonks:

www.perlmonks.com

Jun 20 '07 #9
