Re: modify html code on the fly



Igor (or anyone else who might be interested),

I have another follow-up question that hopefully you can help me
with. I am finally getting back to working on this issue and am using
the sample MIME filter from Microsoft's site
(http://support.microsoft.com/kb/260840), which essentially takes an
XML file and converts it to HTML by adding <html> to the beginning and
</html> to the end and then replacing certain characters within the
XML file, such as '<' with '&lt;' and so on. This seemed like a perfect
example to use as a base for what I am trying to do, which is to use a
MIME filter to intercept HTML and make certain changes to it, such as
bleeping out certain words.
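
To make the goal concrete, here is a minimal sketch of the kind of word
replacement I have in mind, written as plain standard C++ so it can be
tried outside the filter. BleepWords is a hypothetical helper of my own,
not something from the KB 260840 sample, and it assumes the whole text
is available in a single buffer.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the KB 260840 sample): replaces every
// case-insensitive occurrence of each listed word with asterisks of the
// same length, so the overall length of the text is unchanged.
std::string BleepWords(std::string text, const std::vector<std::string>& words)
{
    // Lowercased copy used only for searching. It has the same length as
    // 'text', so a match position in 'lower' is valid in 'text' too.
    std::string lower = text;
    for (char& c : lower)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

    for (const std::string& word : words) {
        if (word.empty())
            continue;

        std::string needle = word;
        for (char& c : needle)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

        std::size_t pos = 0;
        while ((pos = lower.find(needle, pos)) != std::string::npos) {
            // Overwrite the match with the same number of '*' characters.
            text.replace(pos, needle.size(), needle.size(), '*');
            pos += needle.size();
        }
    }
    return text;
}
```

This matches plain substrings, so it would also hit words inside tags or
attribute values, and in a real filter a word could be split across two
chunks of data. Both cases would need extra handling.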

I thought I could very easily modify this example and demonstrate the
concept rather quickly. But it does not work: the browser never loads
the pages. The progress bar just slowly advances, but the page never
appears. Basically I did three things, which I thought would be
enough. First, I changed the registry code to filter HTML
('text/html'). This works, as I can now debug and catch it as it
enters my code. Next, I got rid of the calls to BeginXMLConversion and
EndXMLConversion, which add the <html> and </html> tags, since they are
unnecessary when the input is already HTML. The final part was to get
rid of the code that changes certain characters, such as '<' and '>',
as this is no longer needed for this purpose. I originally removed the
call to ParseXMLIntoHTML, but after that didn't work, I decided to go
much simpler. There's a line within that function that provides the
characters to replace:

TCHAR* strCharSet = _T("<>&\r");

So I just changed that to a character that should not appear on a site
(or at least not on most sites):

TCHAR* strCharSet = _T("~");

That's it; I did nothing else. The way I see it, pages should load
and display just as normal, since the filter shouldn't change the HTML
at all. Once I got this working, I could start working on detecting
certain words to replace. Well, this doesn't work. Making the few
changes that I just pointed out causes pages not to load at all. I
simply compiled the MIMEfilt example and made those few changes, and
nothing loads. Any thoughts on what I could be doing wrong?
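
To check my assumption that a character set of just "~" leaves the
markup untouched, the replacement step can be sketched on its own,
away from the browser. This is a stand-in written in standard C++:
ReplaceListedChars is a hypothetical name, the real ParseXMLIntoHTML
works on TCHAR buffers and may differ in detail, and I am only guessing
at how it treats characters such as '\r'.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the sample's replacement loop. It scans for
// any character listed in 'charSet', copies the text in between
// unchanged, and escapes the markup characters it knows about.
std::string ReplaceListedChars(const std::string& input, const std::string& charSet)
{
    std::string output;
    std::size_t start = 0;

    while (start < input.size()) {
        std::size_t hit = input.find_first_of(charSet, start);
        if (hit == std::string::npos) {
            // No more listed characters: copy the rest and stop.
            output.append(input, start, std::string::npos);
            break;
        }

        // Copy everything before the listed character as-is.
        output.append(input, start, hit - start);

        switch (input[hit]) {
        case '<': output += "&lt;";  break;
        case '>': output += "&gt;";  break;
        case '&': output += "&amp;"; break;
        default:
            // Assumption: any other listed character is copied unchanged.
            output += input[hit];
            break;
        }
        start = hit + 1;
    }
    return output;
}
```

With "~" as the set, ReplaceListedChars("<p>a & b</p>", "~") returns the
input unchanged, which is what I expected from the filter. If the real
code behaves the same way, the problem is presumably not in the
replacement step but somewhere in how the filter hands data back to the
browser, though that is only a guess on my part.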

Any help would be appreciated!

Thanks
