Re: Finding out if a given character is in UpperCase, LowerCase or Numeric



The 'A' versions work with Japanese, etc., too, Sinna. However, the "ANSI
character set" in use differs from one locale to another. The main
advantage of using Unicode data and support is that the interpretation of
the data is unambiguous, and not subject to misinterpretation when moving it
from one place to another.
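
To make the ambiguity concrete, here is a minimal sketch in Python (not VB, purely for illustration) that reads the same byte value under three Windows code pages. The choice of code pages and the helper name decode_under are assumptions made for this example, not something taken from the thread.

```python
# One byte value, three Windows "ANSI" code pages, three different characters.
RAW = bytes([193])  # the byte value behind Chr$(193)


def decode_under(code_page: str) -> str:
    """Interpret RAW using the named code page (hypothetical helper)."""
    return RAW.decode(code_page)


western = decode_under("cp1252")   # Western European code page
cyrillic = decode_under("cp1251")  # Cyrillic code page
greek = decode_under("cp1253")     # Greek code page

# Print code points rather than glyphs so the output is console-independent.
for name, ch in (("cp1252", western), ("cp1251", cyrillic), ("cp1253", greek)):
    print(name, f"U+{ord(ch):04X}")
```

The byte only has a meaning once the code page is known; a Unicode code point carries its meaning with it.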

Tony Proctor

"Sinna" <news4sinna_NOSPAM@xxxxxxxxxx> wrote in message
news:%23RlWUZN3IHA.5024@xxxxxxxxxxxxxxxxxxxxxxx
Tony Proctor wrote:
I shouldn't worry, Sinna. I'm sure the OP wasn't really interested in
worldwide distribution :-)

I always try to bring the subject up whenever I can, though, just to
remind people. The majority of applications written in the Western world
are not globally aware, and it doesn't make any difference whether the
programming language uses Unicode, or offers locale-aware support. I've
been involved in international projects for 20 years and developers
always make the same mistakes or assumptions. :-(

It's not just a matter of character sets, or alphabets (as in this case),
or decimal-point/thousands-separator characters, or date representations,
or boolean representations, etc. There are fundamental differences at the
local language level, such as how sentences are put together (which affects
parameter placement, for instance), or the concept of plurals, or the
concept of masculine/feminine, etc. Even English-language software written
for UK and US distribution should consider such differences, although it
rarely does.

Ah well, back to work, eh?

Tony Proctor

"Sinna" <news4sinna_NOSPAM@xxxxxxxxxx> wrote in message
news:e3fqu0A3IHA.2348@xxxxxxxxxxxxxxxxxxxxxxx
Tony Proctor wrote:
Being pedantic, most of the replies to your question are not totally
accurate. If all you're interested in is A-Z and a-z then they'll do.
However, be aware that other languages based on the Latin alphabet may
have diacritical marks (aka accents) and so simply looking at A-Z will
not work.

For instance, consider A with an acute accent. Chr$(193) is the
uppercase and Chr$(225) is the lowercase.
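
The difference can be sketched in Python (not VB, purely for illustration). The sketch assumes that Chr$(193) and Chr$(225) are read under the Western code page, where they coincide with Unicode code points 193 and 225; the function names classify_ascii_only and classify_unicode are hypothetical.

```python
def classify_ascii_only(ch: str) -> str:
    """Range checks on A-Z, a-z and 0-9 only."""
    if "A" <= ch <= "Z":
        return "upper"
    if "a" <= ch <= "z":
        return "lower"
    if "0" <= ch <= "9":
        return "numeric"
    return "other"


def classify_unicode(ch: str) -> str:
    """Classification based on Unicode character properties."""
    if ch.isupper():
        return "upper"
    if ch.islower():
        return "lower"
    if ch.isdecimal():
        return "numeric"
    return "other"


for code in (65, 97, 53, 193, 225):
    ch = chr(code)
    print(code, classify_ascii_only(ch), classify_unicode(ch))
```

The two functions agree on A, a and 5, but the range check reports both accented letters as "other".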
That's why I stated in my reply that I was only covering ANSI/ASCII. Now
that I've read your reply, I should have said: 7-bit ASCII.

Sinna


Well, most examples I find on the Internet don't cover Unicode at all,
especially when calling APIs.
Before I had to introduce Japanese language support in my application, I
didn't either, but since then, I always try to use the W-variant of the
API (if available). I don't have to deal with the A-variant anymore as the
'oldest' OS my application supports is Win2k.
The fact is that calling the W-variants is quite a lot harder to implement
(and therefore harder for a newbie to read).

Sinna


