Conversion of historical material to HTML

Hi all,

A number of questions:

1) I'm in the process of converting some historical type-written
newsletters to HTML and one of the things that came up while doing so
was the fact that some headings underline every other letter. I can
just go ahead and bold them instead, but that feels a bit 1984'ish, or
I can emulate this using <span> tags,
but this leads to horribly (too) long lines. Is there an easier
method?

2) Given the limitations of browsers, is there an easy way of getting
(as close as possible to) the original font sizes? I guess using
appropriately set up span.xxx classes is the only way to influence
line-spacing?

3) I'm currently using CSS2 & <div>'s to get the paging right. This
works OK, but is there an easier way?

4) Does anyone have/know where to find some simple examples of pages
(probably using tables) using different font-sizes for the various
cells - I need to use them to emulate the small-size output of thermal
printers.

5) OT(ish) My scanner came with Paperport Deluxe V8 and ReadIris was
on the PC. Both are OK(ish) for this type of material, but choke on
more complicated stuff (and I have heaps of it). Can anyone recommend
some really good OCR software?

6) Would UltraEdit be a good choice to edit HTML/CSS or are there more
suitable alternatives? FWIW, I'm now using an old DOS editor that I
can use almost blindfolded, but that doesn't have any fancy features
(other than being tiny and fast) and has a line-length limit of 255
characters (see 1 above)...

Reply here, but if you can cc: any reply to <pr***@onetel.net.uk> I
would appreciate it.

Thanks,

Robert
--
Robert AH Prins
pr***@onetel.net.uk
Jul 20 '05 #1
Robert AH Prins wrote:
A number of questions:


Not really answering any of the questions, but what are you trying to
achieve here? I think you're going to put in lots of work for an
unsatisfactory result. My advice would be:

Convert the letters to the plainest HTML you can, with no attempt at
presentation or replicating the original look. The goal here is to convert
the information.

Next, if appropriate, prepare high-resolution scans of the originals,
either as individual images or as multi-page PDFs. This is to convert the
visual appearance, if that is an important part of the history.

--
Mark.
http://tranchant.plus.com/
Jul 20 '05 #2
On 20 Apr 2004 07:59:20 -0700, Robert AH Prins <pr***@bigfoot.com> wrote:
Hi all,

A number of questions:

1) I'm in the process of converting some historical type-written
newsletters to HTML and one of the things that came up while doing so
was the fact that some headings underline every other letter. I can
just go ahead and bold them instead, but that feels a bit 1984'ish, or
I can emulate this using <span> tags,
but this leads to horribly (too) long lines. Is there an easier
method?
What's your goal? To make the web page look just like the document? That
can't really be done, though you can get close. Or to present the factual
information in the documents? Then mark up headings like headings.
2) Given the limitations of browsers, is there an easy way of getting
(as close as possible to) the original fontsizes. I guess using
appropriately set up span.xxx classes is the only way to influence
line-spacing?
Monitor resolution will be an issue here - you won't actually know the end
product, will you?
3) I'm currently using CSS2 & <div>'s to get the paging right. This
works OK, but is there an easier way?
The page layout? This is appropriate.
4) Does anyone have/know where to find some simple examples of pages
(probably using tables) using different font-sizes for the various
cells - I need to use them to emulate the small-size output of thermal
printers.


Even if you're adopting the look and feel of these old documents, remember
usability and accessibility. Don't make those fonts too small. You could
apply font-size to the specific element.
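
As a rough sketch of applying font-size to one specific element, the rule
below shrinks only table cells carrying a hypothetical class "thermal".
The class name, the fixed-pitch font and the 85% figure are assumptions
for illustration, not values taken from the newsletters.

```css
/* Hypothetical class for cells that imitate thermal-printer output. */
td.thermal {
  font-family: monospace; /* assumption: the printer output was fixed-pitch */
  font-size: 85%;         /* relative to the surrounding text, so the reader's own size setting still applies */
}
```

Because the size is a percentage, a reader who enlarges the base font
still gets proportionally larger cell text.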

Bottom line - if you are looking to present historical documents, worry
about the accessibility of the content first, and the nuance of the
presentation later.
Jul 20 '05 #3
"Mark Tranchant" <ma**@tranchant.plus.com> wrote in message
news:40**************@tranchant.plus.com...
Robert AH Prins wrote:
A number of questions:
Not really answering any of the questions, but what are you trying
to achieve here? I think you're going to put in lots of work for an
unsatisfactory result. My advice would be:
My stepdaughter is doing most of the work to earn some extra pocket
money, and there are no deadlines. It's a labour of love to keep this
material available for posterity.
Convert the letters to the plainest HTML you can, with no attempt at
presentation or replicating the original look. The goal here is to convert the information.
I agree, but having an index and some cross-linking is extremely useful.
Next, if appropriate, prepare high-resolution scans of the originals, either as individual images or as multi-page PDFs. This is to convert the visual appearance, if that is an important part of the history.


There are already PDFs available, but creating multi-megabyte files
(the largest is in excess of 3 MB) for a mere 6 pages of A4 (text
content about 24k) is utter madness, even more so because each one is
just a picture and cannot be searched.

Robert
--
Robert AH Prins
pr***@onetel.net.uk
Jul 20 '05 #4
pr***@bigfoot.com (Robert AH Prins) wrote:
1) I'm in the process of converting some historical type-written
newsletters to HTML and one of the things that came up while doing so
was the fact that some headings underline every other letter.
I would suggest avoiding any attempt to reproduce that in HTML. Any
underlining, even broken underline, would easily be misunderstood as
denoting a link in HTML. So it's better to use some other styling for a
heading, if needed. I would simply use a suitable element, like h2, and
add as much CSS as reasonable to make the appearance resemble the
original - e.g. in fonts, bolding, etc., but not issues like underlining.

The look & feel might matter, so some styling, even detailed, might be
nice. But don't overdo it.

But should you wish to imitate underlining, then CSS code like
h2 { border-bottom: dashed black thin; }
might be a reasonable compromise. It's not really underline but bottom
border, and that's one reason why it wouldn't be taken as link underline
so easily. Other options include setting suitable background image
containing just a short vertical line at the level of underlining and
some transparent stuff on the right, and repeating in x direction.
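
A minimal sketch of the repeating-background idea, under two assumptions:
the heading is set in a monospace font (so that 1ch matches one character
cell, as in type-written text), and the browser supports CSS gradients,
which stand in here for the small image file described above. Browsers
of 2004 did not support gradients, so treat this as a present-day variant
rather than the method suggested in the post.

```css
/* Assumes a monospace heading, so 1ch equals one character cell. */
h2 {
  display: inline-block;   /* shrink the box so the pattern ends with the text */
  font-family: monospace;
  text-decoration: none;   /* no real underline, so it is not mistaken for a link */
  padding-bottom: 2px;     /* room for the rule below the letters */
  /* Tile of 2ch: a rule under the first character cell, a gap under the second. */
  background-image: repeating-linear-gradient(
    to right,
    currentColor 0, currentColor 1ch,
    transparent 1ch, transparent 2ch
  );
  background-size: 100% 1px;
  background-position: left bottom;
  background-repeat: no-repeat;
}
```

The markup stays a plain h2 with no per-letter tags, which also avoids
the long-line problem.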
I can
just go ahead and bold them instead, but that feels a bit 1984'ish, or
I can emulate this using <span> tags,
but this leads to horribly (too) long lines.
It gets rather awkward. But you could put every other character between
<u> and </u>. I wouldn't be too puristic here.

You could also use increased font size instead of bolding, e.g.
h2 { font-weight: normal; font-size: 115%; }
2) Given the limitations of browsers, is there an easy way of getting
(as close as possible to) the original fontsizes.
Why would you do _that_? Don't fight against the strengths of the Web.
Just as the Web is a way to make the data technically accessible
worldwide over the network, not setting any font sizes (except relatively
e.g. for headings) is part of the way of making it humanly accessible to
people with different properties and preferences.
I guess using
appropriately set up span.xxx classes is the only way to influence
line-spacing?
You cannot influence line spacing in HTML - though you could create very
coarse simulations. Just use line-height in CSS.
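
For illustration, a sketch of setting line spacing with line-height. The
class names and the 1.2 and 80% values are assumptions, not figures from
the thread.

```css
/* Hypothetical classes for normal text and a smaller block. */
.body-text   { font-size: 100%; line-height: 1.2; }
.small-print { font-size: 80%;  line-height: 1.2; }
```

A unitless line-height is multiplied by each element's own font size, so
the smaller block keeps the same proportional spacing without a separate
calculation.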
3) I'm currently using CSS2 & <div>'s to get the paging right. This
works OK, but is there an easier way?


Paging, too, is an area where you should utilize the strengths of the Web
and not fight against them. Do you know what paper sizes people have, or
their printer settings?
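
One way to keep the original page boundaries only when printing, and
leave the screen layout alone, is a print-only rule. This is a sketch
that assumes each original sheet is wrapped in a div with the
hypothetical class "page"; paper size is left to the reader's printer.

```css
/* Hypothetical class: each original sheet is wrapped in <div class="page">. */
@media print {
  div.page             { page-break-before: always; } /* start every sheet on a new page */
  div.page:first-child { page-break-before: avoid; }  /* but no blank page before the first */
}
```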

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #5
"Jukka K. Korpela" <jk******@cs.tut.fi> wrote in message news:<Xn****************************@193.229.0.31> ...
pr***@bigfoot.com (Robert AH Prins) wrote:
1) I'm in the process of converting some historical type-written
newsletters to HTML and one of the things that came up while doing so
was the fact that some headings underline every other letter.
I would suggest avoiding any attempt to reproduce that in HTML. Any
underlining, even broken underline, would easily be misunderstood as
denoting a link in HTML.


That's something I didn't consider. I'll stick to bold.
So it's better to use some other styling for a
heading, if needed. I would simply use a suitable element, like h2, and
As I mentioned in my original post, these were type-written newsletters;
the only text decorations they use are the alternate underline and ALL
CAPS.
add as much CSS as reasonable to make the appearance resemble the
original - e.g. in fonts, bolding, etc., but not issues like underlining.

The look & feel might matter, so some styling, even detailed, might be
nice. But don't overdo it.
I have no plans to go all the way. My stepdaughter is scanning/correcting
the material, I'm just adding very basic HTML.
But should you wish to imitate underlining, then CSS code like
h2 { border-bottom: dashed black thin; }
Everything is enclosed in <pre> ... </pre> tags; I don't even use <p>'s.

<snip>
I wouldn't be too puristic here.


For these newsletters, which you can find at
<http://www.rskey.org/tippc.htm> under the heading 52-Notes, I will
stick to basic text; for others I will have to add a bit more fancy
stuff (two columns and graphics).
2) Given the limitations of browsers, is there an easy way of getting
(as close as possible to) the original fontsizes.


Why would you do _that_? Don't fight against the strengths of the Web.

Just as the Web is a way to make the data technically accessible
worldwide over the network, not setting any font sizes (except relatively
e.g. for headings) is part of the way of making it humanly accessible to
people with different properties and preferences.


OK, try again: These things use two font sizes, one for the body,
another slightly smaller for the colophon on page 1. I'd like them to
look like the originals (assuming a medium font in the browser).
I guess using
appropriately set up span.xxx classes is the only way to influence
line-spacing?


You cannot influence line spacing in HTML - though you could create very
coarse simulations. Just use line-height in CSS.


Sorry, that's what I meant, and I'm using only full, 80% and 50%, which
I guess should be OK in most modern browsers.
3) I'm currently using CSS2 & <div>'s to get the paging right. This
works OK, but is there an easier way?


Paging, too, is an area where you should utilize the strengths of the Web
and not fight against them. Do you know what paper sizes people have, or
their printer settings?


The originals were on US 8.5x11", with enough margins to fit on A4.
Using medium font size in IE, they print OK on either format.

Robert
--
Robert AH Prins
pr***@onetel.net.uk
Jul 20 '05 #6
