Re: Convert HTML to String
- From: Joseph M. Newcomer <newcomer@xxxxxxxxxxxx>
- Date: Wed, 09 Jan 2008 12:48:59 -0500
Well, kind of. Other than minor details like: char is an obsolete data type that should
not be used in this context; you used "const char *" when you meant LPCTSTR; you are
assuming that every < or > is part of HTML markup and not actual text (a bare < or > in
text is nominally incorrect, but most browsers will actually handle these cases well); you
don't deal with what might happen if a < or > is found inside a quoted string for an HTML
attribute value; you forgot about handling &-sequences; and you wrote it for 8-bit
characters and forgot about UTF-8 encodings (which many Web pages use). So it will work
correctly for a small percentage of Web pages. Other than the fact that it misses a whole
bunch of critical conditions, it should work.
joe
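One of the conditions listed above, handling &-sequences, can be sketched roughly as follows. This is only an illustration, not code from the thread: the names DecodeEntities and AppendUtf8 are hypothetical, it uses std::string so that it compiles outside MFC, it assumes the input and output are UTF-8, and it knows only five named entities plus numeric references. A real HTML parser covers far more.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: append Unicode code point cp to out as UTF-8.
static void AppendUtf8(std::string& out, unsigned long cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical sketch: decode a few named entities and numeric references.
// Anything it does not recognize is copied through unchanged.
std::string DecodeEntities(const std::string& in) {
    static const struct { const char* name; const char* text; } kNamed[] = {
        {"amp", "&"}, {"lt", "<"}, {"gt", ">"}, {"quot", "\""}, {"apos", "'"}
    };
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        std::size_t semi =
            (in[i] == '&') ? in.find(';', i + 1) : std::string::npos;
        // Not an entity candidate: no ';', empty body, or implausibly long.
        if (semi == std::string::npos || semi == i + 1 || semi - i > 10) {
            out += in[i++];
            continue;
        }
        const std::string body = in.substr(i + 1, semi - i - 1);
        bool decoded = false;
        if (body[0] == '#' && body.size() > 1) {
            // Numeric reference: &#NNN; (decimal) or &#xHHH; (hexadecimal).
            const bool hex = (body[1] == 'x' || body[1] == 'X');
            const char* start = body.c_str() + (hex ? 2 : 1);
            if (std::isxdigit(static_cast<unsigned char>(*start))) {
                char* end = nullptr;
                unsigned long cp = std::strtoul(start, &end, hex ? 16 : 10);
                const bool surrogate = (cp >= 0xD800 && cp <= 0xDFFF);
                if (*end == '\0' && cp > 0 && cp <= 0x10FFFF && !surrogate) {
                    AppendUtf8(out, cp);
                    decoded = true;
                }
            }
        } else {
            for (const auto& e : kNamed) {
                if (body == e.name) { out += e.text; decoded = true; break; }
            }
        }
        if (decoded) i = semi + 1;   // skip past the ';'
        else out += in[i++];         // keep the '&' as literal text
    }
    return out;
}
```

Calling DecodeEntities("a &lt; b") would yield "a < b". Text such as "AT&T;" is left alone because "T" is not in the small table.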
On Wed, 9 Jan 2008 15:39:08 +0100, "Eunet Uhser" <eunet_uhser@xxxxxxxxxxx> wrote:
Hello Group,
What I have seen, after having tried many things in my life: if you have to
analyze a string carefully, it makes more sense to go down to the "lowest
level" instead of using many sophisticated functions.
So I'd do it just like this:
void function(const char *strg) {
    int iPos = 0;
    while (strg[iPos]) {
        if (strg[iPos] == '<') {
            ...
        } else if (strg[iPos] == '>') {
            ...
        } else {
        }
        iPos++;
    }
}
This is quite similar to what I would have done when I wrote such programs
in assembly language. In my opinion, these simple program structures still
show the best performance ;-)
Eunet
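For illustration, the character-by-character loop above could be filled in along these lines. This is a sketch under assumptions, not the poster's actual code: the name StripTags is hypothetical, it returns a std::string instead of working on a CString, and it keeps track of quoted attribute values so that a > inside quotes does not end the tag. It still treats every < as the start of a tag, does not decode &-sequences, and knows nothing about comments or script blocks.

```cpp
#include <string>

// Hypothetical sketch: copy everything outside <...> into the result.
std::string StripTags(const char* strg) {
    std::string text;
    bool inTag = false;   // true while between '<' and the matching '>'
    char quote = '\0';    // the open quote character inside a tag, if any

    for (int iPos = 0; strg[iPos]; iPos++) {
        const char c = strg[iPos];
        if (!inTag) {
            if (c == '<') {
                inTag = true;          // assumption: every '<' opens a tag
            } else {
                text += c;             // ordinary text, keep it
            }
        } else if (quote) {
            if (c == quote) {
                quote = '\0';          // closing quote of an attribute value
            }
        } else if (c == '"' || c == '\'') {
            quote = c;                 // opening quote of an attribute value
        } else if (c == '>') {
            inTag = false;             // end of the tag
        }
    }
    return text;
}
```

With this sketch, StripTags("<p>Hello <b>world</b></p>") gives "Hello world", and a tag such as <img alt="a > b"> is skipped as a whole because the > inside the quotes is ignored.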
"billyard" <dmetcalf@xxxxxxxxxxxxxxx> wrote in message
news:47688467$0$16093$4c368faf@xxxxxxxxxxxxxxxxx
Does anybody know of a way to quickly convert HTML to a string? I'm
running into characters like ( =2E ) that I would like translated to the
ASCII value, in this case ( . ) a period.
Here's an example string I'd like to translate:
=0D=0A <img border=3D"0" width=3D"739" height=3D"71" nosend=3D"1"=
My method below seems kind of crude. Is there an easier way?
Here's what I've gotten so far:
findString = "=";
while ( startPos > 0 )
{
    startPos = strBody.Find( findString );
    if ( startPos > 0 )
    {
        // Get the next two characters
        CString nextTwo = strBody.Mid( startPos + 1, 2 );
        // Translate this into the ASCII equivalent?
        ????
        // Remove the characters translated
        strBody.Delete( startPos, 3 );
        // Insert the translated character back into the string
        strBody.Insert( startPos, _T(newChar) );
    }
}
Thanks for any help on this.
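The =2E, =3D and =0D=0A sequences in the example look like MIME quoted-printable encoding (an = followed by two hex digits) rather than anything specific to HTML. Under that assumption, a single pass that builds a new string avoids the repeated Find/Delete/Insert. The sketch below is only an illustration: the names DecodeQuotedPrintable and HexValue are hypothetical, it uses std::string so it compiles outside MFC, and it treats an = at the very end of the input as a soft line break whose newline was cut off.

```cpp
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
static int HexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical sketch: decode quoted-printable text in one pass.
std::string DecodeQuotedPrintable(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '=') {
            out += in[i];                      // ordinary character
            continue;
        }
        if (i + 1 == in.size()) {
            continue;                          // trailing '=': assumed soft break
        }
        if (in[i + 1] == '\n') {
            i += 1;                            // soft line break "=\n"
            continue;
        }
        if (i + 2 < in.size() && in[i + 1] == '\r' && in[i + 2] == '\n') {
            i += 2;                            // soft line break "=\r\n"
            continue;
        }
        if (i + 2 < in.size()) {
            const int hi = HexValue(in[i + 1]);
            const int lo = HexValue(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);   // "=XX" -> one byte
                i += 2;
                continue;
            }
        }
        out += '=';                            // malformed sequence, keep as-is
    }
    return out;
}
```

For example, DecodeQuotedPrintable("a=2Eb") gives "a.b". In an MFC program the result could be converted back to a CString afterwards; note that the decoded bytes may themselves be UTF-8 or another encoding, which this sketch does not interpret.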
Joseph M. Newcomer [MVP]
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm