Re: Custom Resource, XML problem



On Fri, 1 May 2009 10:28:11 -0700, Electronic75 <Electronic75@xxxxxxxxxxxxxxxxxxxxxxxxx>
wrote:

Hello, I watched a video from the "How To" series titled Custom Resources by
Mr. David Ching (thank you, Mr. Ching) and I tried to use it with an XML wrapper
class by Mr. Jerry Wang (thank you, Mr. Wang), which is available on the
CodeProject site.
The problem I have is that when I load an XML resource, it copies
some extra characters into the buffer that I have to remove manually before I can
use it in the class.
This is the code:

USES_CONVERSION;
CXml xXml;
LPCTSTR pcaResourceName;
LPSTR pcaResourceContent;
****
Why are you assuming that it is 8-bit characters?
****
DWORD dwResourceSize;

JWXml::CXmlNodePtr pxNode, pxProperty;
****
It is tasteless to put commas in declaration lists. It makes the code hard to read. The
rule should be one variable, one line.
****
JWXml::CXmlNodesPtr pxNodes, pxProperties;
//JWxml is namespace used by CXml
CString xName, xValue;

UINT i, uiChildCount, uiPropertyCount,k, uiID = 0;
****
Too many commas, unreadable code.
****
int uiValue;

pcaResourceName = MAKEINTRESOURCE(IDR_XML_1);
****
Why introduce a variable just to hold a constant? Why not just put the MAKEINTRESOURCE
directly in the FindResource call?
****
HRSRC hXML = FindResource(AfxGetResourceHandle(), pcaResourceName,
_T("XML"));
HGLOBAL hMem = LoadResource(AfxGetResourceHandle(),hXML);
pcaResourceContent = (LPSTR) LockResource(hMem);
TRACE(pcaResourceContent);
//The output of this trace gives three extra characters
****
Where? At the front? At the end? Did you bother to look up what those character codes
are? Unless you take the time to understand what is going on, you have derived no
meaningful data. If you had bothered to look this up, you would have seen that what you
have is
0xEF 0xBB 0xBF

which is then screamingly obvious as the UTF-8 Byte Order Mark, which means that you have
to treat the content as being encoded as UTF-8, and convert it to UTF-16LE before using
it.

(For example, see The Unicode Standard, Version 5.0, page 551)
****
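
A minimal sketch of that check, in portable C++ so it does not depend on MFC. The names BomKind, BomInfo and DetectBom are made up for this illustration and are not part of CXml or of the Windows API. It only looks for the UTF-8 and UTF-16 marks; UTF-32 marks are deliberately ignored here.

```cpp
#include <cassert>
#include <cstddef>

// Encodings this sketch can recognize from a leading byte order mark.
// (Hypothetical names, introduced only for this example.)
enum class BomKind { None, Utf8, Utf16LE, Utf16BE };

struct BomInfo {
    BomKind kind;        // which BOM was found, if any
    std::size_t length;  // number of bytes the BOM occupies (0 if none)
};

// Inspect the first bytes of a buffer and report which BOM, if any, is present.
inline BomInfo DetectBom(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return { BomKind::Utf8, 3 };     // EF BB BF
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return { BomKind::Utf16LE, 2 };  // FF FE
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return { BomKind::Utf16BE, 2 };  // FE FF
    return { BomKind::None, 0 };
}
```

With the resource in hand, the call would be something like DetectBom((const unsigned char*)LockResource(hMem), SizeofResource(...)). Note that the mark is 3 bytes only for UTF-8; the UTF-16 marks are 2 bytes, and a file may have none at all.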


dwResourceSize = SizeofResource(AfxGetResourceHandle(),hXML);

LPSTR pcaXml = new char[(dwResourceSize*2) + 1];
//I doubled the size of buffer because Cxml accepts LPCTSTR so I have to
//convert it
****
Why do you think "doubling" is the correct solution?
****
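
To make the sizing question concrete, here is a rough sketch that counts how many UTF-16 code units a piece of UTF-8 text needs. Utf16UnitsForUtf8 is a hypothetical helper written for this illustration, and it assumes the input is well-formed UTF-8. The count is never larger than the number of input bytes, so the space required comes from the size of the 16-bit element type, not from doubling a char count.

```cpp
#include <cassert>
#include <cstddef>

// Count the UTF-16 code units needed to hold the given UTF-8 bytes.
// Assumes well-formed UTF-8; this is a sizing sketch, not a validator.
// (Hypothetical helper, introduced only for this example.)
inline std::size_t Utf16UnitsForUtf8(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    std::size_t units = 0;
    for (std::size_t i = 0; i < size; ++i) {
        const unsigned char b = data[i];
        if ((b & 0xC0) == 0x80)
            continue;                  // continuation byte, already counted with its lead byte
        units += (b >= 0xF0) ? 2 : 1;  // 4-byte sequences become a surrogate pair
    }
    return units;
}
```

The destination would then be allocated as that many 16-bit elements plus one for the terminator.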
memcpy((void*)pcaXml , (void*)(pcaResourceContent + 3),dwResourceSize);
****
Note that you are now presuming that the BOM exists, that it is 3 bytes in length, and
that the data is inherently in UTF-8 encoding. You have to look at the bytes, make sure
you have a BOM, if you do, which one, and convert the text appropriately.
****
//When I copy at start point of pcaResourceContent the Cxml loading fails
//but when I start copying from pcaResourceContent+3 it goes well and CXml
//succeeds in loading it.
pcaXml[dwResourceSize*2] = '\0';

if(!xXml.LoadXml(A2W(pcaXml)))
****
A2W will give the wrong result for a UTF-8 encoding. Therefore, your code is not
guaranteed to work correctly.

You will have to determine if the BOM is UTF-16 (no conversion required), UTF-8 (UTF-8 to
UTF-16 conversion required), or missing (A2W required).
****
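
As an illustration of converting according to the BOM, here is a portable sketch. Utf8ToUtf16 and ResourceToUtf16 are hypothetical names for this example only. In a Windows build the UTF-8 step would more likely be done with MultiByteToWideChar using CP_UTF8; the hand-written decoder here just keeps the example self-contained. The no-BOM case is left to the caller, which would fall back to an ANSI conversion such as A2W.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode UTF-8 bytes into UTF-16 code units. Returns false on malformed input.
// (Hypothetical helper, introduced only for this example.)
inline bool Utf8ToUtf16(const unsigned char* data, std::size_t size, std::u16string& out)
{
    assert(data != nullptr || size == 0);
    out.clear();
    std::size_t i = 0;
    while (i < size) {
        const unsigned char b = data[i];
        char32_t cp;        // code point being assembled
        char32_t min;       // smallest value allowed for this sequence length
        std::size_t extra;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; min = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; min = 0x10000; }
        else return false;                         // stray continuation or invalid lead byte
        if (extra > size - i - 1) return false;    // sequence truncated at end of buffer
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = data[i + k];
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, out-of-range values, and encoded surrogates.
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;                         // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += extra + 1;
    }
    return true;
}

// Convert raw resource bytes to UTF-16 according to the BOM at the front.
// Returns false when no UTF-8 or UTF-16LE BOM is found, or the data is malformed.
inline bool ResourceToUtf16(const unsigned char* data, std::size_t size, std::u16string& out)
{
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return Utf8ToUtf16(data + 3, size - 3, out);  // UTF-8: skip BOM, then decode
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE) {
        if ((size - 2) % 2 != 0) return false;        // odd byte count cannot be UTF-16
        out.clear();
        for (std::size_t i = 2; i + 1 < size; i += 2) // UTF-16LE: reassemble code units
            out.push_back(static_cast<char16_t>(data[i] | (data[i + 1] << 8)));
        return true;
    }
    return false;  // no BOM handled here; caller decides (for example, ANSI via A2W)
}
```

The resulting string is length-aware, so no manual terminator arithmetic is needed before handing its contents to the XML loader.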
{
delete[] pcaXml;
FreeResource(hMem);
return;
}
....


I don't know what these three extra characters are. In resource view there
is nothing at the beginning of the resource. Does anybody know what these 3
characters are, and can they have different lengths (other than 3)?
****
It's the UTF-8 BOM. But see the previous comments: you have to convert in accordance with
the BOM you find.
joe
****

Thanks,
Joseph M. Newcomer [MVP]
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm