GetTextExtentExPoint slow for characters greater than codepoint 127



GetTextExtentExPointW seems to be significantly slower on strings that
contain codepoints above 127. For a 30,000-character string in which
44 percent of the characters (13,251) come from the list of codepoints
below, GetTextExtentExPointW is 37% slower than when it is called with a
30,000-character string composed only of characters below codepoint 127.

Is this a known problem and is there any work-around?

I am building an editor that supports proportional fonts, and I am trying to
use GetTextExtentExPointW to determine which characters within a string fit
within my window. I noticed the problem because horizontal scrolling is
visibly much slower when the data being edited contains characters above 127.
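
One possible work-around, sketched here as an assumption and not as a confirmed fix, is to measure each character once, cache its width, and sum cached widths to find how many characters fit. The names width_cache, measure_fn, count_fitting and demo_measure are hypothetical. demo_measure is a stand-in with made-up widths so the sketch runs anywhere; a real build would presumably wrap a GDI call such as GetCharWidth32W for the selected font. Summing per-character widths ignores kerning, combining marks, surrogate pairs and complex-script shaping, so the result may differ from what GetTextExtentExPointW reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WIDTH_UNKNOWN (-1)

/* Hypothetical callback: returns the advance width of one UTF-16 code unit. */
typedef int (*measure_fn)(uint16_t ch, void *ctx);

typedef struct {
    int        width[65536];  /* cached width per code unit, or WIDTH_UNKNOWN */
    measure_fn measure;       /* called only on a cache miss */
    void      *ctx;           /* passed through to measure (e.g. an HDC) */
} width_cache;

/* Stand-in measurer with invented widths: 7 units up to codepoint 127,
   10 units above it. If ctx is non-NULL it is treated as an int call counter. */
int demo_measure(uint16_t ch, void *ctx)
{
    if (ctx != NULL)
        ++*(int *)ctx;
    return ch <= 127 ? 7 : 10;
}

void width_cache_init(width_cache *c, measure_fn measure, void *ctx)
{
    assert(c != NULL && measure != NULL);
    for (size_t i = 0; i < 65536; ++i)
        c->width[i] = WIDTH_UNKNOWN;
    c->measure = measure;
    c->ctx = ctx;
}

/* Returns the cached width, measuring the character on first use. */
int width_cache_get(width_cache *c, uint16_t ch)
{
    if (c->width[ch] == WIDTH_UNKNOWN)
        c->width[ch] = c->measure(ch, c->ctx);
    return c->width[ch];
}

/* Returns how many leading code units of text fit within max_extent.
   If used_extent is non-NULL it receives the summed width of those units. */
size_t count_fitting(width_cache *c, const uint16_t *text, size_t len,
                     int max_extent, int *used_extent)
{
    int total = 0;
    size_t n = 0;
    while (n < len) {
        int w = width_cache_get(c, text[n]);
        if (total + w > max_extent)
            break;
        total += w;
        ++n;
    }
    if (used_extent != NULL)
        *used_extent = total;
    return n;
}
```

The cache would need to be reset whenever the font or its size changes. The structure is about 256 KB, so it is better allocated statically or on the heap than on the stack.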

Thank you.

David Liebtag
IBM APL Products and Services

The codepoints that I've noticed cause performance degradation:

U00C7 U00E9 U00E2 U00E4 U00E0 U00E5 U00E7 U00EA U00EB
U00E8 U00EF U00EE U00EC U00C4 U00C5 U00F4 U00F6 U00F2
U00FB U00F9 U00D6 U00DC U00A3 U20A7 U00E1 U00ED U00F3
U00FA U00F1 U00D1 U00BF U00A1 U00CC U00DF U00FC U237A
U005E U2340 U233F U2308 U25CB U22A5 U2207 U236B U2206
U22C4 U00F7 U2339 U2193 U00A8 U2282 U22A4 U220A U234E
U230A U2355 U2352 U234B U2265 U00AF U2336 U2229 U2373
U2218 U235D U2190 U2264 U235F U22A2 U2261 U007C U2372
U2371 U2260 U2375 U2228 U2283 U235E U2395 U233B U2342
U0027 U2192 U233D U2374 U22A3 U005C U2296 U0021 U2337
U2191 U00D7 U2349 U2235 U005F U2359 U2377 U2378 U222A
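
For anyone who wants to reproduce the comparison, here is a minimal sketch of building a test buffer with the same composition: 30,000 UTF-16 code units, 13,251 of them above 127. The helper names build_test_string and count_above_127 are hypothetical, and the filler character 'a' is an arbitrary choice. A few entries in the list above (U005E, U007C, U0027, U005C, U0021, U005F) are actually below 127, so the sketch is meant to be fed only list entries above 127.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fills out[0..len) with 'a' and places exactly special_count code units
   from specials[] (used in rotation) at evenly spread positions. */
void build_test_string(uint16_t *out, size_t len, size_t special_count,
                       const uint16_t *specials, size_t nspecials)
{
    assert(out != NULL && specials != NULL && nspecials > 0);
    assert(special_count <= len);

    size_t placed = 0;
    size_t acc = 0;  /* error accumulator, always below len between steps */
    for (size_t i = 0; i < len; ++i) {
        acc += special_count;
        if (acc >= len) {
            acc -= len;
            out[i] = specials[placed % nspecials];
            ++placed;
        } else {
            out[i] = (uint16_t)'a';
        }
    }
}

/* Counts the code units in text[0..len) whose value is above 127. */
size_t count_above_127(const uint16_t *text, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; ++i) {
        if (text[i] > 127)
            ++n;
    }
    return n;
}
```

With specials set to, say, 0x00C7, 0x00E9, 0x237A, 0x2308 and 0x2395 from the list, build_test_string(buf, 30000, 13251, specials, 5) gives a buffer for which count_above_127 returns 13251. On Windows, where wchar_t is 16 bits, such a buffer could then be passed to GetTextExtentExPointW for timing.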

