Re: Is Unicode character a vowel?




"MrAsm" <mrasm@xxxxxxx> wrote in message news:55sl33dvk41chh0gprho9vg4erghce7iul@xxxxxxxxxx

I can just say that, in the Italian language, the concept of vowel is well
defined: there are just 5 vowels: 'a', 'e', 'i', 'o', 'u'. So the word
"pizza" contains 2 vowels ('i' and 'a'), the word "spaghetti" has 3
vowels ('a', 'e', 'i'), and so on.

So, basically, the look-up algorithm is fine for the Italian language.

It's something like this (just using standard ANSI chars)
...

Italian would also need accented characters of course, e.g. "più" if I remember right (and if the newsgroup accepts the accented u).
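
As a rough illustration of that point (this is not the code elided above, just a minimal sketch), one way to make a five-letter look-up accept accented forms is to strip combining marks before the test. The names `base_letter`, `is_italian_vowel` and `count_vowels` are made up for this example, and it assumes that treating "ù" as a plain "u" is acceptable for the purpose at hand.

```python
import unicodedata

# Assumption: the five letters named in the quoted post are the whole table.
ITALIAN_VOWELS = frozenset("aeiou")


def base_letter(ch: str) -> str:
    """Return ch lowercased with any combining marks (accents) removed."""
    decomposed = unicodedata.normalize("NFD", ch)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.lower()


def is_italian_vowel(ch: str) -> bool:
    """Look-up test: is the unaccented base of ch one of a, e, i, o, u?"""
    return base_letter(ch) in ITALIAN_VOWELS


def count_vowels(word: str) -> int:
    """Count vowel letters, composing first so 'u' + grave is one character."""
    composed = unicodedata.normalize("NFC", word)
    return sum(1 for ch in composed if is_italian_vowel(ch))


print(count_vowels("pizza"))      # 2
print(count_vowels("spaghetti"))  # 3
print(count_vowels("pi\u00f9"))   # 2  ("più": 'i' and 'ù')
```

This only answers "is the letter one of the five, accents aside"; it says nothing about how the letter is pronounced in a given word.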

But, as others said, in languages like English (and I also suppose in
languages from the Far East), the concept of vowels may be more
ambiguous.

Obviously in languages where a glyph represents a whole syllable, you cannot classify symbols into vowels and consonants - so that's most of the Unicode code points out for a start :-)

Quite generally the concept of a "vowel" is related to sounds rather than letters. And languages which do not represent sounds phonetically give you almost no chance to say that "a letter is a vowel". (Very few languages are as phonetic as Italian is, and most have a more complex array of vowel sounds.) Even those that do often use groups of letters to represent a single vowel sound, like "igh" (e.g. in "right") in English and "on" (e.g. in "bon") in French, and most allow letters which usually represent vowels to be used as consonants. (What about the first "i" in the Italian word "ieri"?)
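
To make the letter-versus-sound problem concrete, here is a toy sketch that splits a word into spelling units, trying multi-letter groups before single letters. The function `split_vowel_groups` and the tiny group lists passed to it are invented for illustration; a real orthography would need far larger, context-dependent rules, and this is not claimed to be correct for English or French in general.

```python
def split_vowel_groups(word, groups, single_vowels="aeiou"):
    """Split word into (piece, is_vowel_spelling) pairs.

    Multi-letter groups are tried longest first; anything left over is
    classified one letter at a time with a plain look-up.
    """
    ordered = sorted((g for g in groups if g), key=len, reverse=True)
    lower = word.lower()
    pieces = []
    i = 0
    while i < len(lower):
        for group in ordered:
            if lower.startswith(group, i):
                pieces.append((group, True))
                i += len(group)
                break
        else:
            # No group matched here: fall back to the single-letter test.
            ch = lower[i]
            pieces.append((ch, ch in single_vowels))
            i += 1
    return pieces


print(split_vowel_groups("right", ["igh"]))
# [('r', False), ('igh', True), ('t', False)]
print(split_vowel_groups("bon", ["on"]))
# [('b', False), ('on', True)]
print(split_vowel_groups("ieri", []))
# [('i', True), ('e', True), ('r', False), ('i', True)]
```

The last line shows the limit of any table like this: the first "i" in "ieri" is still reported as a vowel, because nothing in the spelling alone says otherwise.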

The question needs to be asked - why does one wish to distinguish a vowel from other letters? There may be a way of doing it which satisfies a particular purpose, or there may not be. But it will depend on the purpose.

Dave
--
David Webber
Author of 'Mozart the Music Processor'
http://www.mozart.co.uk
For discussion/support see
http://www.mozart.co.uk/mzusers/mailinglist.htm

P.S. It's a popular joke that Welsh, with words like gwr (=man), dwr (=water), gwn (=gown), ffwl (=fool), (all, IIRC, with a ^ on the w) has no vowels. But of course it has - it's just that it has different spelling conventions and w and y are both usually vowels. [w is pronounced more or less as the English "oo" either as in pool or as in book according to context.]
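
If a look-up table is used at all, the Welsh examples suggest it has to be chosen per language. A minimal sketch, assuming hypothetical tables keyed by language code (`VOWELS_BY_LANGUAGE` and `count_vowel_letters` are names made up here) and assuming that w and y can simply be counted as vowels in Welsh, which is only "usually" true:

```python
import unicodedata

# Hypothetical per-language tables; real ones would need more care.
VOWELS_BY_LANGUAGE = {
    "it": frozenset("aeiou"),
    "cy": frozenset("aeiouwy"),  # Welsh: w and y are usually vowels
}


def count_vowel_letters(word: str, lang: str) -> int:
    """Count letters whose unaccented base is in the table for lang."""
    vowels = VOWELS_BY_LANGUAGE[lang]
    count = 0
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch):
            continue  # skip the circumflex itself, keep the base letter
        if ch in vowels:
            count += 1
    return count


word = "g\u0175r"  # "gwr" with a circumflex on the w
print(count_vowel_letters(word, "cy"))  # 1
print(count_vowel_letters(word, "it"))  # 0, the "no vowels" joke
```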





