
css to annotate image files?


dear css experts:

I have some TIFF images of text articles. I would love to be able to
use the "Find" feature in my browser to locate a particular word in the text,
and to have hrefs in the document be clickable. Acrobat can do
this for pdf documents, but I would love to stick to tiff or gif or
some other plain browser format. my OCR software gives me the exact
location of each word, so I have the raw information necessary.

now, css can render this as invisible text above the image (in an
absolute location). I presume (but have not yet tried) that a match would
still be shown to the browser user, because the browser would highlight the
found text even though it is invisible. I would love to have the ability to render a
semi-transparent block in css, so that I can highlight clickable area
in the underlying image, too, but I think this is not possible.

I guess my question is if anyone has written a program that automates
this "making bitmaps searchable" using CSS?

/iaw

Dec 23 '06 #1
5 Replies



iv****@gmail.com wrote:
I guess my question is if anyone has written a program that automates
this "making bitmaps searchable" using CSS?
I'd be wary of this for web use. Search engine spiders don't like
invisible content and regard it as spamdexing, penalising it as such.
If you do it, then make the stuff visible (although not prominent) to
non-JS users, then only "hide" it with a bit of JS at page load time.
You can certainly use CSS and standard :hover or clickable button
techniques to make it appear. Two caveats though: if this stuff is
worth seeing, then it's usually more annoying to have it on hover only.
Maybe I really do want to read the stuff. Secondly it ought to appear
when printed, when better accessibility is needed, or when I've
manually selected to show everything (and that should be persistent for
at least my session).

When I do image galleries I embed the metadata inside the image (JPG
EXIF) and then extract it into the HTML automatically. I also use a lot
of Dublin Core to structure it, but not the more obscure formats like
MPEG-7. I don't do much CSS on top of this.

Dec 27 '06 #2


hi andy: thanks for the advice. fortunately, search engine inclusion
is not important to me. I am just trying to add the ability to
"search=find" (OCR-equivalent) text in image-files of articles. The
user will see the true authentic representation of the text in the
image itself.

This seems to be harder to do than I thought. I can't even get a
simple "proof of concept" to work. sigh...

regards,

/iaw

Dec 28 '06 #3


iv****@gmail.com wrote:
I am just trying to add the ability to
"search=find" (OCR-equivalent) text in image-files of articles.
Seems like the image's alt text is what you need, or maybe even look at
using longdesc.

You'll have to process the HTML a bit though, either before publishing
(XSLT might be handy) or with some client-side JS.

Dec 29 '06 #4

iv****@gmail.com wrote:
[...] I am just trying to add the ability to
"search=find" (OCR-equivalent) text in image-files of articles. The
user will see the true authentic representation of the text in the
image itself.
A possible start would be:
* create a div of fixed width and height (in pixels)
* set the image as background of that div
* split the recognized text into words, or possibly into lines, create one
span per unit and position it inside the div
(using lines instead of words as unit allows the search to span
several words)
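
The three steps above could be sketched roughly like this in Python. This is only an illustration, not something from a tested setup: the function name build_overlay and the (text, left, top, width, height) tuple layout are assumptions about what the OCR output might look like, and the font size is simply set to the box height, which will not match the image exactly.

```python
from html import escape


def build_overlay(image_url, width, height, words):
    """Return an HTML fragment: a fixed-size div with the page image as
    background and one absolutely positioned span per OCR unit.

    `words` is an iterable of (text, left, top, box_width, box_height)
    tuples in pixels. This layout is a hypothetical OCR export format.
    """
    spans = []
    for text, left, top, box_w, box_h in words:
        style = (
            f"position:absolute;left:{left}px;top:{top}px;"
            f"width:{box_w}px;height:{box_h}px;"
            # Rough guess: make the text about as tall as the OCR box.
            f"font-size:{box_h}px;line-height:{box_h}px;"
            "overflow:hidden;white-space:nowrap"
        )
        spans.append(f'<span style="{style}">{escape(text)}</span>')

    container_style = (
        f"position:relative;width:{width}px;height:{height}px;"
        f"background:url('{escape(image_url)}') no-repeat"
    )
    return f'<div style="{container_style}">' + "".join(spans) + "</div>"
```

Calling build_overlay("page.png", 800, 1000, [("Hello", 10, 20, 50, 12)]) gives a div that can be pasted into a page. Passing whole lines instead of single words as units keeps multi-word searches possible.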

But then a simple proof of concept shows that a lot of information is
missing: do you have any way to obtain the font used in the image? How
can you make sure the end user has the same font installed, or allows
text to be as small as the text in the image?

I would go another route:
* create a div as above
* put the image in it
* write a simple FORM that lets the user enter a search pattern
* write the server side code that, given that search pattern,
finds all the matches and, for each match, computes a rectangle
framing the matching text, then creates a span inside the outer
div (simple border, no background to let the text shine through)
* when this is working, add some Ajax "magic" for interactivity.
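
The server-side part of this route might look something like the following sketch. It assumes the OCR words are already available as (text, left, top, width, height) tuples in pixels; the names match_boxes and highlight_spans, the case-insensitive substring matching, and the red border are all choices made up for the example.

```python
def match_boxes(pattern, words):
    """Return the (left, top, width, height) box of every OCR word that
    contains `pattern`, ignoring case. An empty pattern matches nothing."""
    needle = pattern.casefold()
    if not needle:
        return []
    return [
        (left, top, box_w, box_h)
        for text, left, top, box_w, box_h in words
        if needle in text.casefold()
    ]


def highlight_spans(pattern, words):
    """Build one empty, bordered span per match, to be placed inside the
    outer div. No background is set, so the image text stays readable."""
    return "".join(
        f'<span style="position:absolute;left:{left}px;top:{top}px;'
        f'width:{box_w}px;height:{box_h}px;border:1px solid red"></span>'
        for left, top, box_w, box_h in match_boxes(pattern, words)
    )
```

The form handler would call highlight_spans with the submitted pattern and write the result into the outer div. It needs position:relative on that div so the rectangles line up with the image.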

Good luck!
--
Daniel Déchelotte
http://yo.dan.free.fr/
Dec 29 '06 #5


hi daniel:

the best analogy may be the desire to display a NYTimes front page
image, and allow a user to search for text on this page.

your first suggestion is also what I was thinking of, but the remaining
issue is how I can make the foreground words invisible, except that I
need at least a colored box when selected.
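
One way this might be approached is with transparent text plus a colored selection background. The sketch below just generates the stylesheet text from Python. The selector ".ocr span", the function name overlay_css and the yellow tint are assumptions for illustration, and whether a browser's own Find highlight shows up over transparent text differs between browsers, so this would need testing.

```python
def overlay_css(selector=".ocr span", highlight="rgba(255, 230, 0, 0.4)"):
    """Return CSS rules that hide the overlay text but keep it selectable."""
    rules = [
        # The words stay in the page (searchable, selectable) but are not painted.
        f"{selector} {{ color: transparent; }}",
        # A semi-transparent box appears when the user selects the text.
        f"{selector}::selection {{ background: {highlight}; color: transparent; }}",
        # Links get a permanent tint so clickable areas of the image stand out.
        f"{selector} a {{ display: block; color: transparent; background: {highlight}; }}",
    ]
    return "\n".join(rules)
```

The returned string can be written into a style element next to the positioned spans. The rgba() background is also one way to get a semi-transparent block over the image in browsers that support it.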

I value speed and convenience [using ordinary browser mechanisms] over
perfect alignment accuracy or even multi-word searches. I can also
assume that my users have the basic microsoft web fonts. at least on a
word-by-word basis, I can make sure that I won't be off too much. In
fact, I would even be happy to display an arrow where the searched-for
word starts---the point is to allow the reader to find a word on a long
image page.

I have never used ajax. maybe I need to learn it to do something like
this, though.

Andy---any alt tags on images are not searchable, and even if they
were, they would not show the reader where a particular word starts.

regards,

/iaw
Dec 29 '06 #6
