Re: Base36

From: Justin Rogers (Justin_at_games4dotnet.com)
Date: 11/08/04


Date: Sun, 7 Nov 2004 19:54:11 -0800

I'll go ahead and add some additional types. That is a fairly easy process. I
need to put some thought into the byte array. Base64 encoding uses a special
padding character to overcome some of the issues you are noting below.

Version 1.1 is posted at the space with all of the base types added.
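
To make the byte[] question quoted below concrete, here is a minimal sketch of the "take 8 bytes at a time" idea. It is only an illustration, not the code posted in the library: the names `ChunkedBase36`, `Encode`, and `GroupWidth` are assumptions made up for this example. Each chunk of up to 8 bytes is packed into a ulong and written as a fixed-width group of 13 base36 digits, because 36^12 is smaller than 2^64 while 36^13 is larger.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of the posted library.
public static class ChunkedBase36
{
    const string Tokens = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // 36^13 > 2^64, so 13 digits can hold any 8-byte chunk.
    const int GroupWidth = 13;

    public static string Encode(byte[] data)
    {
        StringBuilder sb = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += 8)
        {
            // Pack up to 8 bytes, big-endian, into one unsigned 64-bit value.
            int count = Math.Min(8, data.Length - offset);
            ulong chunk = 0;
            for (int i = 0; i < count; i++)
                chunk = (chunk << 8) | data[offset + i];

            // Write the chunk as a fixed-width group, least significant digit last.
            char[] group = new char[GroupWidth];
            for (int i = GroupWidth - 1; i >= 0; i--)
            {
                group[i] = Tokens[(int)(chunk % 36)];
                chunk /= 36;
            }
            sb.Append(group);
        }
        return sb.ToString();
    }
}
```

This spends 13 characters on every 8 bytes instead of the 16 that two characters per byte would need. It also shows the open issue: a short final chunk such as { 0x01 } encodes to the same group as { 0x00, 0x01 }, so a decoder cannot recover the original length unless it is stored separately or marked in some way, which is the job the padding character does in Base64.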

-- 
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers
"William Stacey [MVP]" <staceywREMOVE@mvps.org> wrote in message 
news:OTYmxEUxEHA.3336@TK2MSFTNGP11.phx.gbl...
> Thanks Justin.  Any chance you could add long and short support, and maybe
> arbitrary byte[]s?  TIA
> For byte[]s I was thinking of just converting each byte to base36, but that
> needs at least 2 chars for any byte value of decimal 36 or more.  So maybe
> take 4 or 8 bytes at a time, convert them to an int or long, and convert
> that to the base to make better use of the resulting chars - not sure.
> Any thoughts?
> -- 
> William Stacey, MVP
> http://mvp.support.microsoft.com
>
> "Justin Rogers" <Justin@games4dotnet.com> wrote in message
> news:Om9Pw9RxEHA.1260@TK2MSFTNGP12.phx.gbl...
>> Code-Only: Arbitrary alphabet encoding (aka BaseN encoding) for base2 through
>> base36.
>>
>> The notes are extensive as to the direction the library may or may not go
>> depending on what problems people are trying to solve. What I've realized is
>> that there are a number of additional and interesting problems associated
>> with alphabet encoding, such as permutations, cyclic rotations, error
>> correction, and the like that may be interesting to build into the libraries.
>> An example of an error correction alphabet would be the base32 encoding which
>> removes characters that may be confused for other characters when read by a
>> human.
>>
>>
>> -- 
>> Justin Rogers
>> DigiTec Web Consultants, LLC.
>> Blog: http://weblogs.asp.net/justin_rogers
>>
>> "Roy Fine" <rlfine@twt.obfuscate.net> wrote in message
>> news:unkF9yRxEHA.2192@TK2MSFTNGP14.phx.gbl...
>> > William
>> >
>> > The other base was base 26 and used *just* the uppercase alphabetic
>> > characters (A..Z).  The only change would be to specify the token set of
>> > the number set and the weights of each position.  For the Base26 case, it
>> > was this:
>> >
>> > /* ***************** */
>> > public class BASE32{
>> > static string tokens = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
>> > static long [] powers = {
>> >   1L,
>> >   26L,
>> >   26L*26L,
>> >   26L*26L*26L,
>> >   26L*26L*26L*26L,
>> >   26L*26L*26L*26L*26L,
>> >   26L*26L*26L*26L*26L*26L,
>> >   26L*26L*26L*26L*26L*26L*26L,
>> >   26L*26L*26L*26L*26L*26L*26L*26L,
>> >   26L*26L*26L*26L*26L*26L*26L*26L*26L,
>> >   26L*26L*26L*26L*26L*26L*26L*26L*26L*26L,
>> >   26L*26L*26L*26L*26L*26L*26L*26L*26L*26L*26L};
>> > ...
>> > ...
>> > }
>> > /* ***************** */
>> >
>> > conversion in each direction is based on the tokens and powers arrays.
>> > the first entry in the tokens array always corresponds to the empty or zero
>> > value, etc.
>> >
>> > happy to help
>> > roy
>> >
>> >
>> > "William Stacey [MVP]" <staceywREMOVE@mvps.org> wrote in message
>> > news:u$MPFPQxEHA.3908@TK2MSFTNGP12.phx.gbl...
>> >> Hey thanks a lot Roy.  Care to post the other base as well?  Either way,
>> >> thanks again!!
>> >>
>> >> -- 
>> >> William Stacey, MVP
>> >> http://mvp.support.microsoft.com
>> >>
>> >> "Roy Fine" <rlfine@twt.obfuscate.net> wrote in message
>> >> news:#wPFSCQxEHA.3976@TK2MSFTNGP09.phx.gbl...
>> >> > William,
>> >> >
>> >> > this is something that i did some time ago - actually for a different
>> >> > base, but it was easy enough to change to handle base36.
>> >> >
>> >> > you did not specify the symbol set for your number base - i will assume
>> >> > 0,1,2,3... X,Y,Z.  if yours is different, change the tokens string
>> >> > accordingly.
>> >> >
>> >> > for performance reasons, the weights of the digits are computed at
>> >> > compile time.
>> >> >
>> >> > note - there is absolutely no error checking, and it handles only
>> >> > positive values, and assumes that all character codes are upper case.
>> >> >
>> >> > regards
>> >> > roy fine
>> >> >
>> >> >
>> >> > namespace CONVERSION{
>> >> > // handles positive only values up to 4,738,381,338,321,616,896 - 1;
>> >> > public class BASE32{
>> >> > static string tokens = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
>> >> > // powers[i] holds 36^i; the products are constant expressions
>> >> > static long [] powers = {
>> >> >   1L,
>> >> >   36L,
>> >> >   36L*36L,
>> >> >   36L*36L*36L,
>> >> >   36L*36L*36L*36L,
>> >> >   36L*36L*36L*36L*36L,
>> >> >   36L*36L*36L*36L*36L*36L,
>> >> >   36L*36L*36L*36L*36L*36L*36L,
>> >> >   36L*36L*36L*36L*36L*36L*36L*36L,
>> >> >   36L*36L*36L*36L*36L*36L*36L*36L*36L,
>> >> >   36L*36L*36L*36L*36L*36L*36L*36L*36L*36L,
>> >> >   36L*36L*36L*36L*36L*36L*36L*36L*36L*36L*36L};
>> >> >
>> >> > public static string ToString(long lval){
>> >> >  int maxStrLen = powers.Length;
>> >> >  long curval = lval;
>> >> >  char [] tb = new char[maxStrLen];
>> >> >  int outpos = 0;
>> >> >  // emit one digit per position, most significant first
>> >> >  for(int i=0; i<maxStrLen; i++){
>> >> >    long pval = powers[maxStrLen - i - 1];
>> >> >    int pos = (int)(curval / pval);
>> >> >    tb[outpos++] = tokens[pos];
>> >> >    curval = curval % pval;
>> >> >    }
>> >> >  // strip leading zeros, but keep a single "0" when the value is zero
>> >> >  string s = new string(tb,0,outpos).TrimStart('0');
>> >> >  return s.Length==0 ? "0" : s;
>> >> > }
>> >> >
>> >> > public static long ToLong(string t){
>> >> >  long ival = 0;
>> >> >  char [] tb = t.ToCharArray();
>> >> >  // walk the string from its last character, weighting each digit by 36^i
>> >> >  for(int i=0; i<tb.Length; i++){
>> >> >    ival += powers[i]*tokens.IndexOf(tb[tb.Length-i-1]);
>> >> >    }
>> >> >  return ival;
>> >> >  }
>> >> > }
>> >> > }
>> >> >
>> >> > "William Stacey [MVP]" <staceywREMOVE@mvps.org> wrote in message
>> >> > news:uc3%23MkOxEHA.1988@TK2MSFTNGP12.phx.gbl...
>> >> > > Anyone have a c# Base10ToBase36 and Base36ToBase10 conversion routines?
>> >> > > TIA
>> >> > >
>> >> > > -- 
>> >> > > William Stacey, MVP
>> >> > > http://mvp.support.microsoft.com
>> >> > >
>> >> >
>> >> >
>> >>
>> >
>> >
>>
>>
> 
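
For readers following the thread, the tokens-and-weights approach discussed above can also be written with repeated division, so that one routine serves any alphabet from 2 symbols up. The sketch below is an illustration under that assumption, not code from either poster or from the library: the class name `BaseN` and the `Encode`/`Decode` signatures are made up for this example. Like the posted code it handles non-negative values only, but it rejects unknown symbols and detects overflow.

```csharp
using System;

// Hypothetical helper for illustration only; not the posted library's API.
public static class BaseN
{
    // Converts a non-negative value to a string over the given alphabet.
    // alphabet[0] plays the role of the zero digit.
    public static string Encode(long value, string alphabet)
    {
        if (alphabet == null || alphabet.Length < 2)
            throw new ArgumentException("alphabet needs at least two symbols");
        if (value < 0)
            throw new ArgumentOutOfRangeException("value", "negative values are not handled");
        if (value == 0)
            return alphabet[0].ToString();

        int radix = alphabet.Length;
        char[] buffer = new char[63];   // long.MaxValue has 63 digits in base 2
        int pos = buffer.Length;
        while (value > 0)
        {
            // Peel off the least significant digit and fill the buffer backwards.
            buffer[--pos] = alphabet[(int)(value % radix)];
            value /= radix;
        }
        return new string(buffer, pos, buffer.Length - pos);
    }

    // Converts a string over the given alphabet back to a long.
    public static long Decode(string text, string alphabet)
    {
        long result = 0;
        foreach (char c in text)
        {
            int digit = alphabet.IndexOf(c);
            if (digit < 0)
                throw new FormatException("unexpected symbol '" + c + "'");
            // checked: throw OverflowException instead of wrapping silently.
            result = checked(result * alphabet.Length + digit);
        }
        return result;
    }
}
```

With the 36-symbol alphabet "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ", `BaseN.Encode(36, ...)` gives "10" and `BaseN.Decode("ZZ", ...)` gives 1295. Passing "ABCDEFGHIJKLMNOPQRSTUVWXYZ" instead gives the base 26 variant, and no powers table has to be rebuilt when the alphabet changes.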

