Attention, hyperlinkers: inference of active text

I'm looking for ideas, although their expression in executable
certainly doesn't offend me.

I do text manipulation. As it happens, I'm in a position to
"activate" the obvious URI in
Now is the time for all good men to read http://www.ams.org/
That's nice. End-users "get it", and are happy I render
"http://www.ams.org" as a hyperlink. Most of them eventually
notice the implications for punctuation, that is, that they're
happier when they write
Look at http://bamboo.org !
than
Look at http://bamboo.org!

The design breaks down more annoyingly by the time we get to
the "file" scheme, though. How do the rest of you handle this?
Do you begin to make end-users quote, as in
The secret is in "file:\My Download Folder\dont_look.txt".
? Is there some other obvious approach? I am confident that
requiring
It is on my drive as file:\Program%20Files\Perl\odysseus.exe
is NOT practical with my clients.
--

Cameron Laird <cl****@phaseit.net>
Business: http://www.Phaseit.net
Jul 18 '05 #1
> The design breaks down more annoyingly by the time we get to
> the "file" scheme, though. How do the rest of you handle this?
> Do you begin to make end-users quote, as in
> The secret is in "file:\My Download Folder\dont_look.txt".


Some thoughts:

1. The quoting certainly seems like a good idea, and one that is
applicable even if other approaches are also used. Plus,
it is consistent with how most shells handle this problem.

2. You can special case common filenames like "Program Files",
"Documents and Settings", "My Music", etc., (the precise list
would depend on your environment & usage).

3. You could conceivably look in the filesystem (or even on the web) to
check which names/URLs are valid... but I think this could be a bad
idea because the program's behavior becomes non-deterministic. It might
confuse users.
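
A minimal sketch of idea 2, assuming Python and a hand-maintained list of folder names. `KNOWN_NAMES`, `FILE_REF`, and `find_file_ref` are made-up names for illustration, and the list itself is only a guess at what a given environment would need:

```python
import re

# Hypothetical list of folder names that contain spaces; tune it per environment.
KNOWN_NAMES = ["Program Files", "Documents and Settings", "My Music"]

_known = "|".join(re.escape(name) for name in KNOWN_NAMES)
# One path segment: a known multi-word name, or a run without spaces or backslashes.
_segment = rf"(?:{_known}|[^\s\\]+)"
FILE_REF = re.compile(rf"file:(?:\\{_segment})+")


def find_file_ref(text):
    """Return the first unquoted file: reference in text, or None."""
    match = FILE_REF.search(text)
    return match.group(0) if match else None
```

A folder name that is not on the list still ends the match at the first space, so this only helps with the names you anticipated.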

-param

PS: I've never encountered this problem myself, so this could all be wrong.
Jul 18 '05 #2
cl****@lairds.com (Cameron Laird) writes:
> I'm looking for ideas, although their expression in executable
> certainly doesn't offend me.
>
> I do text manipulation. As it happens, I'm in a position to
> "activate" the obvious URI in
> Now is the time for all good men to read http://www.ams.org/
> That's nice. End-users "get it", and are happy I render
> "http://www.ams.org" as a hyperlink. Most of them eventually
> notice the implications for punctuation, that is, that they're
> happier when they write
> Look at http://bamboo.org !
> than
> Look at http://bamboo.org!
>
> The design breaks down more annoyingly by the time we get to
> the "file" scheme, though. How do the rest of you handle this?
> Do you begin to make end-users quote, as in
> The secret is in "file:\My Download Folder\dont_look.txt".
> ? Is there some other obvious approach? I am confident that
> requiring
> It is on my drive as file:\Program%20Files\Perl\odysseus.exe
> is NOT practical with my clients.


Can't you get them to write <URL:http://bamboo.org> (or, alternatively,
<http://bamboo.org>, which, although not backed up by an RFC, also ought to do
the job and is less to type and to remember)?

Apart from making escaping superfluous, this should also solve all your
punctuation and linebreak problems robustly. '<','>' can't occur in URIs so
matching '<http:|file:|www\..*?>.' or so (and then kicking out '\n\s.*') ought
to work, no?
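
One way the delimiter idea might look in Python; this is a sketch, not a tested production pattern. `BRACKETED` and `bracketed_uris` are hypothetical names, and accepting `https:` alongside `http:` is an added assumption:

```python
import re

# Matches <URL:...> or <...> whose content starts with http:, https:, file:, or www.
# '<' and '>' cannot occur in a URI, so the first '>' closes the reference.
BRACKETED = re.compile(r"<(?:URL:)?((?:https?:|file:|www\.)[^<>]*)>", re.IGNORECASE)


def bracketed_uris(text):
    """Return the delimited references, with line breaks and the whitespace around them removed."""
    return [re.sub(r"\s*\n\s*", "", m.group(1)) for m in BRACKETED.finditer(text)]
```

Because the delimiters mark both ends, trailing punctuation and embedded spaces stop being a guessing game.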

'as
Jul 18 '05 #3
Alexander Schmolck <a.********@gmx.net> schreef:
> Can't you get them to write <URL:http://bamboo.org> (or, alternatively,
> <http://bamboo.org>, which, although not backed up by an RFC, also ought
> to do the job and is less to type and to remember)?


Recent URI RFCs say <...> is more common than <URL:...>.

--
JanC

"Be strict when sending and tolerant when receiving."
RFC 1958 - Architectural Principles of the Internet - section 3.9
Jul 18 '05 #4
If I understand your question correctly, you're looking for a way to
guess what part of an English sentence is a URL. The problem you're
facing is trailing punctuation characters.

I.e., these are good:
Look at http://bamboo.org !
It is on my drive as file:\Program%20Files\Perl\odysseus.exe
And these are bad:
Look at http://bamboo.org!
The secret is in "file:\My Download Folder\dont_look.txt".

If you want to make life as easy as possible for your authors, you
need some good heuristics. You need to guess where the URL starts and
ends. My terminal emulator (SecureCRT) does a pretty good job of this.
Nat Friedman's dingus also did this trick a while ago - I can't find it
easily now, but I think the code might be part of rxvt or Gnome.

Your other option is to require folks to delimit URLs with something
like <http://bamboo.org>. This is pretty painless and common, but only
you can know whether your users will accept it.
Jul 18 '05 #5
In article <10*************@corp.supernews.com>,
cl****@lairds.com (Cameron Laird) wrote:
> End-users "get it", and are happy I render
> "http://www.ams.org" as a hyperlink. Most of them eventually
> notice the implications for punctuation, that is, that they're
> happier when they write

So who is constructing these sentences, you or the end-users?

> Look at http://bamboo.org !
> than
> Look at http://bamboo.org!

Any idea *why* they are happier with the first than the second?

> The design breaks down more annoyingly by the time we get to
> the "file" scheme, though.

What design, and in what way is it breaking down?

> I am confident that requiring
> It is on my drive as file:\Program%20Files\Perl\odysseus.exe
> is NOT practical with my clients.

Any idea why not? The lack of terminal punctuation?
Jul 18 '05 #6
Cameron Laird <cl****@lairds.com> wrote:
> The design breaks down more annoyingly by the time we get to
> the "file" scheme, though. How do the rest of you handle this?

The file scheme is no different to http regarding punctuation.
Personally, I trim characters that are valid in URIs but not likely to
be at the end, such as '.', from the end of URIs, so that constructs
like "See http://www.foo.com/index.html." still work. It's a hack but
the results seem reasonable.

> It is on my drive as file:\Program Files\Perl\odysseus.exe


URIs with spaces and backslashes are not valid at all, and will break
browsers. (Also the example is missing the drive letter.)

If inputting file names directly is a requirement I would suggest
having a different format for it that doesn't involve escaping-to-URI,
for example you could sniff for double-quoted strings starting with
'[drive letter]:\'.
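
A rough Python sketch of that sniffing step, assuming the author writes an ordinary Windows path in double quotes. `QUOTED_PATH`, `path_to_file_uri`, and `quoted_paths_as_uris` are hypothetical names, and the `C:` drive in the usage example is invented, since the original example has no drive letter:

```python
import re
from urllib.parse import quote

# A double-quoted string that starts with a drive letter, a colon, and a backslash.
QUOTED_PATH = re.compile(r'"([A-Za-z]:\\[^"\n]*)"')


def path_to_file_uri(path):
    """Turn a Windows path into a file URI, percent-encoding spaces and similar characters."""
    return "file:///" + quote(path.replace("\\", "/"), safe="/:")


def quoted_paths_as_uris(text):
    """Return a file URI for every quoted drive-letter path found in text."""
    return [path_to_file_uri(m.group(1)) for m in QUOTED_PATH.finditer(text)]
```

With this, `It is in "C:\Program Files\Perl\odysseus.exe".` would yield `file:///C:/Program%20Files/Perl/odysseus.exe`, so the escaping happens in the tool rather than in the author's head.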

--
Andrew Clover
mailto:an*@doxdesk.com
http://www.doxdesk.com/
Jul 18 '05 #7
I'm pretty sure that this isn't a valid url:
file:\I never\used anything\besides windows.txt
It's something, but it's not a URL.

For actual HTTP URLs, I would suggest that you have a step in the
highlighting that considers whether the last part of the URL seems to
contain plausible characters. Characters from this set are pretty
unlikely to end a URL: ".,!])}'\""
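
A small Python sketch of that trimming step, using exactly the character set above. `UNLIKELY_FINAL`, `URL`, and `find_urls` are names made up for the example, and the "run of non-whitespace after http://" pattern is a simplifying assumption:

```python
import re

# Characters that are legal in a URL but rarely its intended last character.
UNLIKELY_FINAL = ".,!])}'\""

# Deliberately greedy: take everything up to the next whitespace, then trim.
URL = re.compile(r"https?://\S+")


def find_urls(text):
    """Return URLs found in text with unlikely trailing characters stripped."""
    return [m.group(0).rstrip(UNLIKELY_FINAL) for m in URL.finditer(text)]
```

The trade-off is that a URL which really does end in one of these characters, such as a closing parenthesis, gets truncated.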

For these file: faux-URLs, you could again start by parsing the maximum
number of characters as the URL, then repeatedly check whether the
current fragment exists on disk. If it doesn't, chop off part of it
(probably at whitespace) and try again until you get something that
exists or your string is empty.
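
The shrink-and-retry loop might look like this in Python. It is only a sketch: `longest_existing_path` is a hypothetical helper, it expects the candidate with any `file:` prefix already removed, and the `exists` parameter is an assumption added so the check can be swapped out:

```python
import os


def longest_existing_path(candidate, exists=os.path.exists):
    """Shrink candidate at whitespace boundaries until it names something that exists.

    Returns the longest prefix for which exists() is true, or None.
    """
    candidate = candidate.rstrip()
    while candidate:
        if exists(candidate):
            return candidate
        # Drop the last whitespace-separated chunk and try again.
        parts = candidate.rsplit(None, 1)
        if len(parts) < 2:
            return None
        candidate = parts[0]
    return None
```

The answer depends on what is on the disk at rendering time, so the same text can link differently on different machines.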

If that doesn't work (for instance, you're not in a position to check
what exists on the user's disk) then you could try a rule where the
hyperlink portion extends from file: at least to the last \, and if the
part beyond that is of the form "word word word.ext" then it's included
too.
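
That fallback rule could be sketched in Python as follows. `FILENAME` and `guess_file_link` are hypothetical names, and reading "word word word.ext" as words separated by single spaces followed by a dot and an extension is my interpretation of the rule:

```python
import re

# "word word word.ext": words separated by single spaces, then a dot and an extension.
FILENAME = re.compile(r"\w+(?: \w+)*\.\w+")


def guess_file_link(text):
    """Guess the extent of a file: reference without touching the disk."""
    start = text.find("file:")
    if start == -1:
        return None
    last_sep = text.rfind("\\", start)
    if last_sep == -1:
        return None
    # The link runs at least from "file:" through the last backslash.
    link = text[start:last_sep + 1]
    # Include what follows only if it looks like a file name.
    match = FILENAME.match(text, last_sep + 1)
    if match:
        link += match.group(0)
    return link
```

This handles `file:\My Download Folder\dont_look.txt.` without swallowing the final period, but a backslash later in the same text would throw it off.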

Best of luck. This'll probably require a lot of experimentation.

Jeff

Jul 18 '05 #8
