Re: Convert HTML to String



Well, kind of. Other than minor details: char is an obsolete data type that should
not be used in this context; you used "const char *" when you meant LPCTSTR; you are
assuming that every < or > is part of the HTML markup and not actual text (a bare < or >
in text is nominally incorrect, but most browsers will actually handle these cases well);
you don't deal with what might happen if a < or > is found inside a quoted string for an
HTML attribute value; you forgot about handling &-sequences; and you wrote it for 8-bit
characters and forgot about UTF-8 encodings (which many Web pages use). So it will work
correctly for only a small percentage of Web pages. Other than the fact that it misses a
whole bunch of critical conditions, it should work.
joe
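
A minimal sketch of how a character-level scanner might cover two of the conditions listed above: a < or > inside a quoted attribute value, and a handful of &-sequences. It is written against std::string so that it stands alone, and the name StripTags is hypothetical. It still assumes every unquoted < opens a tag, knows only four named entities, and ignores comments and script blocks, so it is an illustration rather than an HTML parser. UTF-8 text passes through unchanged because every character tested here is ASCII and the bytes of a multi-byte UTF-8 sequence are never in the ASCII range.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: removes tags and decodes a few named entities.
// Assumption: every '<' outside a tag starts a tag (not true of all pages).
std::string StripTags(const std::string& html) {
    static const struct { const char* name; char ch; } kEntities[] = {
        {"&amp;", '&'}, {"&lt;", '<'}, {"&gt;", '>'}, {"&quot;", '"'}
    };

    std::string out;
    bool inTag = false;
    char quote = 0;  // quote character currently open inside a tag, or 0

    for (std::size_t i = 0; i < html.size(); ++i) {
        const char c = html[i];

        if (inTag) {
            if (quote != 0) {
                if (c == quote) quote = 0;      // closing quote
            } else if (c == '"' || c == '\'') {
                quote = c;                      // opening quote
            } else if (c == '>') {
                inTag = false;                  // '>' ends the tag only outside quotes
            }
            continue;                           // nothing inside a tag is emitted
        }

        if (c == '<') {
            inTag = true;
            continue;
        }

        if (c == '&') {
            bool matched = false;
            for (const auto& e : kEntities) {
                const std::size_t n = std::char_traits<char>::length(e.name);
                if (html.compare(i, n, e.name) == 0) {
                    out += e.ch;
                    i += n - 1;                 // skip the rest of the entity
                    matched = true;
                    break;
                }
            }
            if (matched) continue;              // unknown '&' falls through as text
        }

        out += c;
    }
    return out;
}
```

For example, StripTags("<a title=\"1 > 0\">x</a>") yields "x", where a scanner that stops at the first > would also emit part of the attribute value.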

On Wed, 9 Jan 2008 15:39:08 +0100, "Eunet Uhser" <eunet_uhser@xxxxxxxxxxx> wrote:

Hello Group

What I have learned after trying many things in my life: if you have to
analyze a string carefully, it makes more sense to go down to the "lowest
level" instead of using many sophisticated functions.
So I'd do it just like this:

void function(const char *strg) {
    int iPos = 0;
    while (strg[iPos]) {
        if (strg[iPos] == '<') {
            ...
        } else if (strg[iPos] == '>') {
            ...
        } else {
        }
        iPos++;
    }
}

This is quite similar to what I would have done when I wrote such programs
in assembly language. In my opinion, these simple program structures still
show the best performance ;-)

Eunet

"billyard" <dmetcalf@xxxxxxxxxxxxxxx> wrote in message
news:47688467$0$16093$4c368faf@xxxxxxxxxxxxxxxxx
Does anybody know of a way to quickly convert HTML to a string? I'm
running into characters like ( =2E ) that I would like translated to the
ASCII value, in this case ( . ) a period.

Here's an example string I'd like to translate:
=0D=0A <img border=3D"0" width=3D"739" height=3D"71" nosend=3D"1"=

My method below seems kind of crude. Is there an easier way?

Here's what I've gotten so far:

findString = "=";

while ( startPos > 0 )
{
    startPos = strBody.Find( findString );
    if ( startPos > 0 )
    {
        // Get the next two characters
        CString nextTwo = strBody.Mid( startPos + 1, 2 );

        // Translate this into the ASCII equivalent?
        ????

        // Remove the characters translated
        strBody.Delete( startPos, 3 );

        // Insert the translated character back into the string
        strBody.Insert( startPos, _T(newChar) );

    }
}

Thanks for any help on this.
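
The =2E, =3D and =0D=0A sequences in the example are quoted-printable encoding (the MIME transfer encoding defined in RFC 2045) rather than anything specific to HTML: each =XX is one byte written as two hex digits, and a = at the very end of a line is a soft line break. What follows is a minimal sketch of a single-pass decoder, assuming the body is available as a std::string; the names DecodeQuotedPrintable and HexValue are hypothetical, and it is written in standard C++ instead of CString so that it stands alone. Building a new string in one pass avoids the repeated Find/Delete/Insert calls, and it also avoids rescanning a decoded = character as if it started a new sequence.

```cpp
#include <cstddef>
#include <string>

// Returns the value of one hex digit, or -1 if c is not a hex digit.
static int HexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical helper: decodes quoted-printable text in a single pass.
std::string DecodeQuotedPrintable(const std::string& in) {
    std::string out;
    out.reserve(in.size());

    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '=') {
            out += in[i];
            continue;
        }

        // "=XX": two hex digits encode one byte.
        if (i + 2 < in.size()) {
            const int hi = HexValue(in[i + 1]);
            const int lo = HexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;
                continue;
            }
        }

        // Soft line break: "=" followed by CRLF or a bare LF is dropped.
        if (i + 2 < in.size() && in[i + 1] == '\r' && in[i + 2] == '\n') {
            i += 2;
            continue;
        }
        if (i + 1 < in.size() && in[i + 1] == '\n') {
            i += 1;
            continue;
        }

        // Assumption: a lone "=" at the very end of the input is a soft
        // break whose newline was already stripped, so it is dropped too.
        if (i + 1 == in.size()) continue;

        // Anything else is malformed; keep the "=" as literal text.
        out += '=';
    }
    return out;
}
```

Applied to the fragment width=3D"739" this gives width="739". The result is a sequence of bytes, so if the message declares a charset such as UTF-8, those bytes still have to be converted before being shown as text.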

Joseph M. Newcomer [MVP]
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm