Re: Non-standard characters on Web

On Jul 16, 9:34 pm, little_creature <litttle.creature....@xxxxxxxxx>
wrote:
On Jul 16, 8:17 pm, korvent...@xxxxxxxxxxxxxxx (Corentin Cras-Méneur)
wrote:



Dan <dkenned...@xxxxxxxxx> wrote:
But I don't think that explains why the Word-generated HTML files
looked fine in Firefox when I opened them on my hard drive, but were
messed up when I opened them on the Web. Does it?

Well, if there is no encoding information in the header, the browser
will fall back on the default encoding for the file, and it might be
displayed correctly from your Mac.
When you upload it, though, this file encoding can get converted to a
default for the server (usually NOT UTF-8). That's why it's important to
have a proper header declaration in the document itself, since the file
encoding is often unreliable.
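
To make the mechanism concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the file was saved as Mac Roman and that the receiving side reads it as ISO-8859-1; neither encoding is confirmed anywhere in this thread, and the sample string is made up.

```python
# Illustration only: both encodings below are assumptions, not facts
# taken from the thread.
text = "café"

# Bytes as an editor on the Mac might have written them (assumed Mac Roman).
raw = text.encode("mac_roman")

# Decoded with the encoding the file was written in: the text survives.
same = raw.decode("mac_roman")

# Decoded with a different default (assumed ISO-8859-1): the accented
# character becomes something else, although the bytes never changed.
garbled = raw.decode("iso-8859-1")

print(repr(same))
print(repr(garbled))
```

The bytes are identical in both cases; only the guess about their encoding differs. A charset declaration inside the document removes the need for that guess.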

To test the hypothesis, you could re-download the HTML file from the
server and check how it displays from your Mac.
You could even open it in a text editor and check what the extended
characters look like. I would suspect that they are all corrupted and
won't display properly.
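
If a text editor makes the result hard to judge, the raw bytes of the re-downloaded file can be inspected instead. The following is only a sketch: `describe_bytes` is a hypothetical helper name, and it reports just two things, whether the data is valid UTF-8 and which bytes fall outside plain ASCII.

```python
def describe_bytes(data: bytes) -> dict:
    """Summarise the non-ASCII content of a file's raw bytes."""
    # Every distinct byte value above the ASCII range, in ascending order.
    non_ascii = sorted({b for b in data if b >= 0x80})
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "valid_utf8": valid_utf8,
        "non_ascii_bytes": [f"0x{b:02X}" for b in non_ascii],
    }


# Sample inputs made up for illustration; for a real check, read the
# downloaded file in binary mode and pass its contents instead.
print(describe_bytes("é".encode("utf-8")))
print(describe_bytes(b"M\x8eneur"))
```

Running it on both the local copy and the re-downloaded copy shows whether the upload changed the bytes or only the way they are interpreted.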

Corentin

--
--- Mac:MS MVP (Francophone) http://www.cortig.net/wordpress/---
http://www.mvps.org - http://mvp.support.microsoft.com
MVPs are not MS employees - Les MVP ne travaillent pas pour MS
Remove "NoSpam" to e-mail me - Retirez "NoSpam" pour m'écrire

Hiya,
I would recommend that you learn HTML. HTML itself is really very
easy. Word adds a lot of clutter when it generates HTML files. It can
result in files 10 times larger. The encoding explanation sounds
reasonable, particularly if the server is PC-based.
As a standard I use ISO 8859-2 encoding (as I create the files on a PC
in a Central European language). For every special character I need, I
use the character entity instead, such as:
non-breaking space &amp;nbsp;
&amp;lambda;
ndash &#150;
I have no problems reading my files on PC/Mac.

OK, as far as I can see in this web access, it was not translated, so
it should have been:
non-breaking space &nbsp; [ampersand followed by nbsp and semicolon,
with no spaces]
&lambda;
ndash &#150;
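
For anyone who would rather not type the entities by hand, the substitution can be automated. This is a rough sketch using Python's standard `html.entities` table; `to_entities` is a hypothetical helper name, not something mentioned in this thread. One caveat: 150 is the en dash's position in Windows-1252, while its Unicode code point is 8211, so the sketch emits `&ndash;` rather than a reference to 150.

```python
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace every non-ASCII character with an HTML character reference."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII passes through unchanged (no escaping of & or <).
            out.append(ch)
        elif cp in codepoint2name:
            # Use the named HTML 4 entity when one exists.
            out.append(f"&{codepoint2name[cp]};")
        else:
            # Otherwise fall back to a decimal numeric reference.
            out.append(f"&#{cp};")
    return "".join(out)


print(to_entities("a\u00a0b"))   # non-breaking space
print(to_entities("\u03bb"))     # Greek small letter lambda
print(to_entities("\u2013"))     # en dash
```

Because the output is pure ASCII, it reads the same regardless of which encoding the server or browser assumes.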
