Replace special characters



Hi,
I am trying to write a function that receives strings containing any
Unicode characters and replaces all special characters (!@#$%^&*()><?...)
with "-", while leaving letters as they are.
I've tried a regular expression listing all the special characters in the
ASCII table, but it still doesn't cover everything.
I cannot use a "whitelist" (letters that are allowed) instead of a
"blacklist" (special characters NOT allowed), because I don't know the
letters of all the languages I'd like to support (English, Latin, Chinese,
Arabic...).
Any ideas what I can do?
Thanks,
Gabi.

This is my "blacklist" regular expression code, but it does not
always work...

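Dim oreg_exp
Set oreg_exp = New RegExp
'oreg_exp.Pattern = "[^a-z0-9]"
' Blacklist of ASCII special characters; the pattern string must sit on one line.
oreg_exp.Pattern = "([{}\(\)\^$&._%#!@=<>:;,~`'\’ \*\?\/\+\|\[\\\\]|\]|\-)"
oreg_exp.IgnoreCase = True
' Global = True replaces every match, not only the first one.
oreg_exp.Global = True
title = oreg_exp.Replace(title, "-")
Set oreg_exp = Nothing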
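
One possible direction, offered as an untested sketch rather than a definitive answer: as far as I know the VBScript RegExp object has no "any Unicode letter" class, but its patterns do accept \uXXXX escapes, so the blacklist can be extended beyond ASCII with whole punctuation blocks. The function name ReplaceSpecialChars and the choice of ranges below are assumptions for illustration. The ranges are approximate: they will miss symbols that live in other Unicode blocks, and the CJK range also contains a few letter-like marks.

```vbscript
' Hypothetical helper: replaces characters in the listed punctuation and
' symbol ranges with "-" and leaves everything else untouched.
Function ReplaceSpecialChars(ByVal text)
    Dim re
    Set re = New RegExp
    ' The pieces below are concatenated into a single character class:
    '   1. ASCII controls, space, punctuation and symbols
    '   2. Latin-1 punctuation and symbols (skipping the letters in that block)
    '   3. Arabic comma, semicolon and question mark
    '   4. General Punctuation and Currency Symbols blocks
    '   5. CJK Symbols and Punctuation block
    '   6. Fullwidth and halfwidth punctuation
    re.Pattern = "[" & _
        "\u0000-\u002F\u003A-\u0040\u005B-\u0060" & _
        "\u007B-\u00A9\u00AB-\u00B4\u00B6-\u00B9\u00BB-\u00BF\u00D7\u00F7" & _
        "\u060C\u061B\u061F" & _
        "\u2000-\u206F\u20A0-\u20CF" & _
        "\u3000-\u303F" & _
        "\uFF01-\uFF0F\uFF1A-\uFF20\uFF3B-\uFF40\uFF5B-\uFF65" & _
        "]"
    ' Replace every match, not only the first one.
    re.Global = True
    ReplaceSpecialChars = re.Replace(text, "-")
    Set re = Nothing
End Function

title = ReplaceSpecialChars(title)
```

Because only the listed ranges are matched, digits and letters of any script pass through unchanged, which avoids having to enumerate the alphabets of every supported language.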