Re: Convert HTML to String




billyard wrote:
Giovanni Dicanio wrote:
"billyard" <dmetcalf@xxxxxxxxxxxxxxx> wrote in message news:47688467$0$16093$4c368faf@xxxxxxxxxxxxxxxxx

// Get the next two characters
CString nextTwo = strBody.Mid( startPos + 1, 2 );

// Translate this into the ASCII equivalent?
????

So, is your problem to translate a string of the form "XX", where XX is a hex integer (e.g. "2D"), into the corresponding ASCII character?

If so, I would use code like this (it may have errors: I have not tested it on a compiler, I just typed it and read it over once):

The main function is this:

TCHAR AsciiHexToChar(LPCTSTR asciiHexCode)

It converts the given two-digit hex code string into the corresponding ASCII character, e.g.:

AsciiHexToChar( _T("2E") ) returns _T('.')

...if you find the _T() confusing, think of it like this:

AsciiHexToChar( "2E" ) returns '.'

So, in your loop, you could write:

TCHAR ch = AsciiHexToChar( nextTwo );

and ch stores the desired character.

<code>
// Converts a character storing a hex nibble
// ('0'-'9','a'-'f','A'-'F')
// into its integer value (0-15).
int HexNibbleToInt( TCHAR hexNibble )
{
if ( hexNibble >= _T('0') && hexNibble <= _T('9') )
return (hexNibble - _T('0'));
else if ( hexNibble >= _T('A') && hexNibble <= _T('F') )
return (hexNibble - _T('A')) + 10;
else if ( hexNibble >= _T('a') && hexNibble <= _T('f') )
return (hexNibble - _T('a')) + 10;
else
{
// *** ERROR - invalid character
ASSERT( FALSE );
return -1;
}
}


// Converts an ASCII two-character hex code to a single character,
// e.g. "2E" --> '.'
// On bad characters, returns '\0'
TCHAR AsciiHexToChar(LPCTSTR asciiHexCode)
{
int lowNibble;
int highNibble;
int asciiCode;

// Check pointer
ASSERT( asciiHexCode != NULL );

// Check that the string has two characters
ASSERT( asciiHexCode[0] != _T('\0') );
ASSERT( asciiHexCode[1] != _T('\0') );
ASSERT( asciiHexCode[2] == _T('\0') );


// High nibble
highNibble = HexNibbleToInt( asciiHexCode[0] ) * 16;
if ( highNibble < 0 )
return '\0'; // bad nibble

// Low nibble
lowNibble = HexNibbleToInt( asciiHexCode[1] );
if ( lowNibble < 0 )
return '\0'; // bad nibble

// Convert to character using ASCII
asciiCode = highNibble * 16 + lowNibble;

return (TCHAR)asciiCode;
}

</code>


Giovanni


Giovanni - Nicely done code that works great for me. Only one minor problem. In the code above, you multiply highNibble by 16 two times. Other than that, it works exactly as you said it would.

This line:

// Convert to character using ASCII
asciiCode = highNibble * 16 + lowNibble;

Should read:
asciiCode = highNibble + lowNibble;

btw - I'm looking forward to the day I could write code like that and
actually have it compile the first go around!
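
For reference, here is a minimal sketch of the conversion with the correction above applied (the high nibble is multiplied by 16 only once), together with a loop that uses it. Several things here are assumptions and not from the thread: plain char and std::string stand in for TCHAR and CString so the sketch compiles without MFC, AsciiHexToChar reports failure through a bool instead of returning '\0', DecodeEscapes is a hypothetical helper, and '%' is only a guess at the character that marks an escape in strBody, since the thread does not say which one it is.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns 0-15 for a hex digit ('0'-'9', 'a'-'f', 'A'-'F'),
// or -1 for any other character.
int HexNibbleToInt(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Converts a two-digit hex code to a character, e.g. "2E" --> '.'.
// Returns false (and leaves 'out' untouched) on bad input.
// Assumption: a bool result replaces the '\0' error value of the
// original, so that the valid code "00" is not mistaken for an error.
bool AsciiHexToChar(const std::string& hex, char& out)
{
    if (hex.size() != 2)
        return false;

    const int high = HexNibbleToInt(hex[0]);
    const int low  = HexNibbleToInt(hex[1]);
    if (high < 0 || low < 0)
        return false;

    // The high nibble is scaled by 16 exactly once.
    out = static_cast<char>(high * 16 + low);
    return true;
}

// Hypothetical helper: replaces every "<marker>XX" sequence in 'body'
// with the character whose code is hex XX. Sequences that are not
// followed by two valid hex digits are copied through unchanged.
std::string DecodeEscapes(const std::string& body, char marker = '%')
{
    std::string result;
    for (std::size_t i = 0; i < body.size(); ++i)
    {
        char decoded;
        if (body[i] == marker
            && i + 2 < body.size()
            && AsciiHexToChar(body.substr(i + 1, 2), decoded))
        {
            result += decoded;
            i += 2; // skip the two hex digits just consumed
        }
        else
        {
            result += body[i];
        }
    }
    return result;
}
```

With these definitions, DecodeEscapes("a%2Db%2e") yields "a-b.", and a trailing marker with no digits after it, as in "100%", is left as it is.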