Re: Characters allowed in short filenames
- From: "Norman Diamond" <ndiamond@xxxxxxxxxxxxxxxx>
- Date: Tue, 15 Apr 2008 14:15:58 +0900
Unfortunately my requirements at the moment don't involve what the OS is doing (aside from what its system locale's OEM code page might be from time to time). My requirements involve storing allowable filenames. Long filenames seem to be pretty simple ... mostly[*]. Short filenames are yielding all these questions.
Please, do you know what the real rules are for what characters are allowed on disk in short filenames?
[* The Posix subsystem clouds up the question of what's allowed in long filenames. | is allowed but " isn't, even though Posix itself allows both. Oops this comes from experiments not from documentation.]
"David Craig" <drivers@xxxxxxxxxx> wrote in message news:%2356XeVrnIHA.1164@xxxxxxxxxxxxxxxxxxxxxxx
For whichever version of Windows was in effect when fatgen103.doc was created, it is possibly true. The sources to fastfat are in the WDK. There are also some things going on under the win32 subsystem, but if you use NtCreateFile() you can bypass those to see if there is any difference.
"Norman Diamond" <ndiamond@xxxxxxxxxxxxxxxx> wrote in message news:e1iF$PrnIHA.2352@xxxxxxxxxxxxxxxxxxxxxxx
My partial understanding is that short filenames are stored using the OEM
code page of the system default locale at the time that the file (or
directory) is created.
For complicated code pages this is pretty simple, for example code page 932
is both ANSI and OEM, so each ANSI codepoint maps onto the exact same OEM
codepoint.
For simple code pages this isn't so simple. For example for several Western
European languages the default ANSI code page is 1252 but the default OEM
code page isn't 1252. I thought I read that the default OEM code page for
US Windows would be 437, but experiments indicate otherwise.
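The ANSI/OEM mismatch described above can be seen with any accented letter. This is only a sketch using Python's bundled codecs, which may not match what a particular Windows version does; the sample characters are assumptions chosen for illustration, not taken from the post.

```python
# Compare how one character is stored under an ANSI code page and under
# two candidate OEM code pages. Sample characters are illustrative only.
SAMPLE = "é"


def encoded_bytes(ch: str, codepage: str) -> bytes:
    """Return the bytes that represent ch in the given code page."""
    return ch.encode(codepage)


ansi_1252 = encoded_bytes(SAMPLE, "cp1252")
oem_437 = encoded_bytes(SAMPLE, "cp437")
oem_850 = encoded_bytes(SAMPLE, "cp850")

# The ANSI byte differs from the OEM byte, so an ANSI -> OEM conversion
# has to remap the value instead of copying it.
print("cp1252:", ansi_1252.hex())
print("cp437: ", oem_437.hex())
print("cp850: ", oem_850.hex())

# Code page 932 serves as both ANSI and OEM, so only one encoding exists.
print("cp932: ", encoded_bytes("ア", "cp932").hex())
```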
As far as I can tell, code page 437 doesn't contain a ß character. So if
the current default OEM code page is 437 and I create a new file then the
short filename cannot contain a ß.
Code page 850 contains a ß character. So if the current default OEM code
page is 850 then we are halfway towards allowing a short filename to contain
a ß. We shouldn't get more than halfway because lowercase letters aren't
allowed in short filenames, but let's proceed.
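Whether a given code page contains a character can be checked mechanically. The sketch below takes the character in question to be ß (U+00DF), which the SSST.TXT uppercase form later in the post suggests, and uses Python's bundled codec tables. The helper name oem_bytes is hypothetical. In those tables, at least, code pages 437 and 850 both map ß to the same byte, so a test with this character may not be able to tell the two code pages apart; real Windows conversion tables could differ.

```python
# Check which code pages can represent a character, and with which byte.
# Assumption: the character under discussion is U+00DF (sharp s).
SHARP_S = "\u00df"


def oem_bytes(ch: str, codepage: str):
    """Return the encoded bytes, or None if the code page lacks ch."""
    try:
        return ch.encode(codepage)
    except UnicodeEncodeError:
        return None


for cp in ("cp437", "cp850", "cp1252", "cp932"):
    result = oem_bytes(SHARP_S, cp)
    shown = result.hex() if result is not None else "not representable"
    print(f"{cp}: {shown}")
```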
I installed US Windows 98 in a virtual PC. I left all its language settings
as defaults; I didn't even install the options for limited amounts of
multilingualism. In a command prompt window I tried the MODE CON command,
but it gave an error instead of telling what code page it was using.
I did install the character map utility, and copied a ß character into the
command prompt. US Windows 98 let me create file SßT.TXT. Well this is OK
so far, since long filenames are stored in Unicode.
Oops. The DIR command said that the short filename is also SßT.TXT. So
does this mean that US Windows defaults its OEM code page to 850 instead of
437?
The next problem is that fatgen103.doc says that short filenames are always
converted to uppercase. So how could a short filename be SßT.TXT instead of
SSST.TXT? No problem for the long filename to be SßT.TXT, but how could the
short filename contain a lowercase letter?
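One possible explanation, offered as a guess and not as documented behavior, is that the uppercasing is done one character at a time with a one-to-one table. Under that assumption a character whose uppercase form is longer than one character (ß becomes SS under full Unicode case mapping) would be left unchanged. The helper simple_upper below is hypothetical and only imitates such a table; it is not the algorithm used by any Windows component.

```python
# Contrast full Unicode uppercasing with a one-to-one, per-character
# uppercasing. The second is an ASSUMED model of a table-driven upcase.
def simple_upper(name: str) -> str:
    """Uppercase each character only if its uppercase form is one character."""
    out = []
    for ch in name:
        up = ch.upper()
        # Keep the original character when uppercasing would expand it.
        out.append(up if len(up) == 1 else ch)
    return "".join(out)


name = "s\u00dft.txt"  # s, sharp s, t

print("full mapping:      ", name.upper())
print("one-to-one mapping:", simple_upper(name))
```

If the one-to-one model is right, the short name keeps the ß even though every other letter is uppercased, which matches what DIR showed.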
Other letters are going to be more troublesome, and I guess
ntfsgen103.doc[*] is going to say even less than fatgen103.doc says, but if
anyone knows the real rules, could someone please say?
[* I assume there's no such document, which is the reason it's not even
going to say how to determine what characters are allowed in short names in
NTFS.]