445,918 Members | 2,279 Online
Need help? Post your question and get tips & solutions from a community of 445,918 IT Pros & Developers. It's quick & easy.

# Integer Base Function

 P: n/a For your enjoyment, a function that expresses any integer with absolute value less-than-or-equal-to nine quintillion in any base from 2 to 36. (For larger bases, you could expand the "digit" character string with, perhaps, foreign letters with diacritical marks, if you don't mind iso-8895-1 instead of ASCII. And if you don't like the non-std type "long long" you can always change it to "long"; but then it could only handle numbers up to about 2 billion, instead of 9 quintillion. /////////////////////////////////////////////////////////////////////////// // // // Base // // Represent an integer in any base from 2 to 36. // // // /////////////////////////////////////////////////////////////////////////// std::string Base ( int base, // must be >= 2 and <= 36 int precision, // must be >= 1 and <= 63 long long number, // must be >= -9E18 and <= +9E18 bool leading_zeros // does user want leading zeros? ) { const long long MAX = 9223372036854775807LL; const long long MIN = -9223372036854775808LL; double largest = pow(base, precision) - 1; if ( base < 2 || base 36 // If base is out-of-range || precision < 1 || precision 63 // or precision is out-of-range || number < MIN || number MAX // or number is out-of-range || largest MAX // or base/precision combo is out-of-range || largest < number // or base/precision combo can't express number ) { return std::string("ERROR"); // then return "ERROR". } std::ostringstream repre; if (number < 0) { number *= (-1); repre << '-'; } long long place = 1LL; for (int i = 1; i <= precision - 1; ++i) { place *= base; } long long value = 0LL; std::string digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"; bool first_non_zero = false; for ( ; place 0; place /= base) { value = number / place; if (value 0) first_non_zero = true; if (leading_zeros || first_non_zero) repre << digits[value]; number -= value * place; } return repre.str(); } -- Cheers, Robbie Hatley Tustin, CA, USA lonewolfintj at pacbell dot net (put "[usenet]" in subject to bypass spam filter) http://home.pacbell.net/earnur/ Jul 2 '06 #1
19 Replies

 P: n/a Robbie Hatley posted: const long long MAX = 9223372036854775807LL; Remove the "LL" tagged on the end, it serves no purpose. In fact, it will only hinder you if you decide to change the integer type you're using. For info on integer literals, go to page 20 of the following document: http://www.open-std.org/jtc1/sc22/wg...2005/n1905.pdf MAX and MIN are widely used as macro names for other things -- expect conflict. numeric_limits::max() would be preferable (unless you need a compile-time constant). I haven't checked the Standard, but I'd presume that there's a LONGLONG_MAX macro. const long long MIN = -9223372036854775808LL; Rather than: -9223372036854775808LL I think you need: -9223372036854775807 -1 (A program is ill-formed if it contains an integer literal which is out of range. If you write: -9223372036854775808 It's intepreted as: -(9223372036854775808) And, as you can see, that positive figure is out of range. Therefore you need to write: -9223372036854775807 -1 (But no worries, it's a compile-time constant.) double largest = pow(base, precision) - 1; if ( base < 2 || base 36 // If base is out-of-range || precision < 1 || precision 63 // or precision is || out-of-range number < MIN || number MAX // or number || is out-of-range largest MAX // or base/precision || combo is out-of-range largest < number // or || base/precision combo can't express number ) { return std::string("ERROR"); // then return "ERROR". } I'd consider that inefficient, and would use asserts instead. std::ostringstream repre; if (number < 0) { number *= (-1); You have a bug there. The number system used by the machine may have asymmetrical ranges for positive and negative integers. There's likely to be a negative value which hasn't got a corresponding positive value. repre << '-'; } long long place = 1LL; for (int i = 1; i <= precision - 1; ++i) { place *= base; } long long value = 0LL; std::string digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"; bool first_non_zero = false; for ( ; place 0; place /= base) { value = number / place; if (value 0) first_non_zero = true; if (leading_zeros || first_non_zero) repre << digits[value]; number -= value * place; } return repre.str(); } That's inefficient; there's no need for a "ostringstream" object. -- Frederick Gotham Jul 2 '06 #2

 P: n/a Frederick Gotham wrote: Robbie Hatley posted: > const long long MAX = 9223372036854775807LL; Remove the "LL" tagged on the end, it serves no purpose. In fact, it will only hinder you if you decide to change the integer type you're using. For info on integer literals, go to page 20 of the following document: http://www.open-std.org/jtc1/sc22/wg...2005/n1905.pdf MAX and MIN are widely used as macro names for other things -- expect conflict. numeric_limits::max() would be preferable (unless you need a compile-time constant). I haven't checked the Standard, but I'd presume that there's a LONGLONG_MAX macro. I don't believe that long long is in the Standard yet (ISO/IEC 14882:2003). I think it's included in the draft C++0x for C99 compatibility. Jul 2 '06 #3

 P: n/a On Sun, 02 Jul 2006 17:54:31 GMT, red floyd ::max() would be preferable (unless you need a compile-time constant). I haven't checked the Standard, but I'd presume that there's a LONGLONG_MAX macro. I don't believe that long long is in the Standard yet (ISO/IEC 14882:2003). I think it's included in the draft C++0x for C99 compatibility. ....and for absolute necessity. It is very handicapping for a language not to have an integer type that can contain the size of any file in the file system, for example. Or, on a busy day, the number of shares traded on the New York and NASDAQ stock exchanges together will overflow a signed 32-bit value. I think, but can't be bothered to look up, the fact that the NASDAQ volume alone has exceeded 0x7fffffff shares on a few days. -- Jack Klein Home: http://JK-Technology.Com FAQs for comp.lang.c http://c-faq.com/ comp.lang.c++ http://www.parashift.com/c++-faq-lite/ alt.comp.lang.learn.c-c++ http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html Jul 2 '06 #4

 P: n/a Jack Klein wrote: On Sun, 02 Jul 2006 17:54:31 GMT, red floyd >Frederick Gotham wrote: I don't believe that long long is in the Standard yet (ISO/IEC14882:2003). I think it's included in the draft C++0x for C99compatibility. ...and for absolute necessity. It is very handicapping for a language not to have an integer type that can contain the size of any file in the file system, for example. C++ already has such a type. Its name can be int, or more likely, long. The problem is compilers that don't take advantage of the flexibility of the C and C++ integral types, for various reasons, both good and bad. Jul 3 '06 #5

 P: n/a "Frederick Gotham" ::max() would be preferable Very good point. Esp. if I were to templatize the function. Then I could do: template std::string Base(int base, int precision, T number) { T Max = numeric_limits::max() - 5; // see below T Min = numeric_limits::min() + 5; // see below // etc. } If you write: -9223372036854775808 It's intepreted as: -(9223372036854775808) Hmmm... yes, I see in section 2.13.1 that you're right. Therefore you need to write: -9223372036854775807 -1 I'll either use -9223372036854775803 or numerical_limit::min(). if (any of several error conditions are true) { return std::string("ERROR"); // then return "ERROR". } I'd consider that inefficient, and would use asserts instead. assert() calls abort(), which crashes the calling application and dumps big ugly core all over the user's screen. Not necessary just because a display function such as this fails. Better is to just return an error message which the caller can display, react-to, or ignore. Even throwing an exception would be overkill in this case, I think, because that would force callers to catch() them; else, again, abort() is called. (Though, now that I think of it, "ERROR" is a very bad choice, because in bases over 26, it's a valid number!) std::ostringstream repre; if (number < 0) { number *= (-1); You have a bug there. The number system used by the machine may have asymmetrical ranges for positive and negative integers. There's likely to be a negative value which hasn't got a corresponding positive value. Usually, 2's compliment is used for signed numbers, in which case TYPE_MIN = -(TYPE_MAX + 1), so all positive integers are invertible. (TYPE_MIN, on the other hand, is always the one integer that's NOT invertible in 2's compliment.) However, I see upon looking at section 3.9.1 parargraph 2 that the std. doesn't really specify the number system for signed integers. So maybe this will do the trick, unless the system is VERY asymmetrical: T Max = numeric_limits::max() - 5; T Min = numeric_limits::min() + 5; if (condition) repre << digits[value]; That's inefficient; there's no need for a "ostringstream" object. Since the stuff I'm writing to the stringstream are just characters, I suppose I could make repre a std::string and do: if (condition) repre += digits[value]; Yes, that would be more efficient. Thanks for the tips. -- Cheers, Robbie Hatley Tustin, CA, USA lonewolfintj at pacbell dot net (put "[usenet]" in subject to bypass spam filter) http://home.pacbell.net/earnur/ Jul 3 '06 #6

 P: n/a "Pete Becker" I don't believe that long long is in the Standard yet (ISO/IEC 14882:2003). I think it's included in the draft C++0x for C99 compatibility. and for absolute necessity. It is very handicapping for a language not to have an integer type that can contain the size of any file in the file system, for example. C++ already has such a type. Its name can be int, or more likely, long. The problem is compilers that don't take advantage of the flexibility of the C and C++ integral types, for various reasons, both good and bad. My friend Ron, a firmware developer, abhors the C / C++ "int", "short", "long" types. He never uses them directly. He says, "What's 'long'? Exactly how 'long' is it, in bits? 32, you say? Who's #\$^*@! opinion is it that that constitutes 'long'? AARRGG!!!" Instead, he uses int8, uint8, int16, uint16, etc, all typedefed to "char", "short", etc. for the particular platform & compiler he's working with. I think the C and C++ standards should both define the following types as absolutely mandatory basic integer types (actual keywords, not just typedefs): int8 uint8 int16 uint16 int32 uint32 int64 uint64 Firmware folks would love that, since they often need to control EXACTLY how many bits they're using. (When your CPU only has 512 bytes of RAM, every bit counts.) And their code would become more portable, because they'd know that a uint16 on one system will always be exactly the same as a uint16 on another. (Which isn't true of short, long, etc.) -- Cheers, Robbie Hatley Tustin, CA, USA lonewolfintj at pacbell dot net (put "[usenet]" in subject to bypass spam filter) http://home.pacbell.net/earnur/ Jul 3 '06 #7

 P: n/a Robbie Hatley posted: >In fact, it will only hinder you if you decide to changethe integer type you're using. Like, to "long long long"? :-) If, some day, we get 128bit integers available. It would be nice to be able to change effortlessly to any other integer type, be it unsigned int, unsigned short, unsigned long, unsigned char... Good point. "Max" and "Min" would be better. I use all lowercase for object names, but you have your coding style and I have mine. >I'd consider that inefficient, and would use asserts instead. assert() calls abort(), which crashes the calling application and dumps big ugly core all over the user's screen. Not necessary just because a display function such as this fails. Better is to just return an error message which the caller can display, react-to, or ignore. Even throwing an exception would be overkill in this case, I think, because that would force callers to catch() them; else, again, abort() is called. The asserts would be used for debugging only. The idea I had in mind is that you make sure the figures are correct BEFORE calling the function. I like to keep my code as efficient as possible, and so I keep all the input-checking and error-checking code in one place. >The number system used by the machine may have asymmetrical ranges forpositive and negative integers. There's likely to be a negative valuewhich hasn't got a corresponding positive value. Usually, 2's compliment is used for signed numbers, in which case TYPE_MIN = -(TYPE_MAX + 1), so all positive integers are invertible. (TYPE_MIN, on the other hand, is always the one integer that's NOT invertible in 2's compliment.) Yes, but not all negative numbers can be negated (which is exactly what your code does). Anyway, the canonical way to negate a number is: x = -x; rather than: x *= -1; However, I see upon looking at section 3.9.1 parargraph 2 that the std. doesn't really specify the number system for signed integers. It does somewhere in there. The system must either be: Two's complement One's complement Sign-magnitude >That's inefficient; there's no need for a "ostringstream" object. Since the stuff I'm writing to the stringstream are just characters, I suppose I could make repre a std::string and do: if (condition) repre += digits[value]; Yes, that would be more efficient. I myself would use a raw char buffer... but then again I'm a "C++ programmer with a heavy bias towards C". Thanks for the tips. You're welcome. -- Frederick Gotham Jul 3 '06 #8

 P: n/a Robbie Hatley posted: int8 uint8 int16 uint16 int32 uint32 int64 uint64 Rather than specifying how many value representation bits an integer type has, these typedef's would have to specify the minimum amount they have, because some systems have 9-Bit bytes and 36-Bit int's. Also, an integer may contain padding bits (except for the char family). -- Frederick Gotham Jul 3 '06 #9

 P: n/a "Frederick Gotham" wrote: Robbie Hatley posted: int8 uint8 int16 uint16 int32 uint32 int64 uint64 Rather than specifying how many value representation bits an integer type has, these typedef's would have to specify the minimum amount they have, because some systems have 9-Bit bytes and 36-Bit int's. Also, an integer may contain padding bits (except for the char family). Actually, that's a bit of a misquote. What I said was: I think the C and C++ standards should both define the following types as ABSOLUTELY MANDATORY BASIC INTEGER TYPES (ACTUAL KEYWORDS, NOT JUST TYPEDEFS): int8 uint8 int16 uint16 int32 uint32 int64 uint64 Perhaps you're right, though, in that some system archetectures may make impliementing these difficult or impossible. Perhaps they should be optional typedefs, as you say, instead of mandatory basic types, as had been my idea. Note that the current C standard (C99) has already has optional exact-integer-length typedefs int8_t, uint8_t, int16_t, etc.. Also, the C std. has these NON-optional (ie, mandatory) types, listed in C99 section 7.18.1.1 paragraph 3: The following types are required: int_least8_t int_least16_t int_least32_t int_least64_t uint_least8_t uint_least16_t uint_least32_t uint_least64_t These are the ones that function like the minimum-length typedefs you described. The C++ standard doesn't have these, though. And surprisingly, I see no sign of these in the working draft, either. A shame. I think C++ should include all of the above, both the optional exact-length "intN_t" typedefs (int16_t or int27_t or whatever), and the eight mandatory "minimum size" typedefs. -- Cheers, Robbie Hatley Tustin, CA, USA lonewolfintj at pacbell dot net (put "[usenet]" in subject to bypass spam filter) http://home.pacbell.net/earnur/ Jul 4 '06 #10

 P: n/a In article , bo***********@no.spam says... [ stuff in C99's stdint.h ] The C++ standard doesn't have these, though. And surprisingly, I see no sign of these in the working draft, either. A shame. I think C++ should include all of the above, both the optional exact-length "intN_t" typedefs (int16_t or int27_t or whatever), and the eight mandatory "minimum size" typedefs. These are in TR1, so even if the editing hasn't been done to put them into the new C++ standard, it seems likely that it's more a matter of administrivia than any likelihood they'll be excluded. -- Later, Jerry. The universe is a figment of its own imagination. Jul 4 '06 #11

 P: n/a "Frederick Gotham" wrote: I myself would use a raw char buffer I think I'll go for just a char array for my "digits" string, because it's a small container of 36 chars which never change: // overkill: std::string digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"; // more efficient: const char digits[37] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"; But for the return value, I prefer to return a std::string object by value. That way, the calling function gets its own independent copy of the results. but then again I'm a "C++ programmer with a heavy bias towards C". I, too, find myself biased towards some C-style C++ programming paradigms (such as, I often tend to think procedurally when perhaps I should be thinking in terms of OOP). This is, perhaps, due to my past background in other languages (APL, Fortran, Basic). But there are certain features of C++ which I love so much that I rarely use C anymore, except for firmware or tiny utility apps: namespaces std containers std::string std algorithms std iterators parameterized constructors templates functors I had a dream a while back in which the C++ standards committee had decided to remove all those from the language. When I woke up and realized it was just a nightmare, I was very relieved! For comparison, consider two programmers' implimentations of a list of strings: My friend Ron, in C: (snip 500 lines of tricky code) Me, in C++: std::list

 P: n/a Robbie Hatley wrote: > Also, the C std. has these NON-optional (ie, mandatory) types, listed in C99 section 7.18.1.1 paragraph 3: The following types are required: int_least8_t int_least16_t int_least32_t int_least64_t uint_least8_t uint_least16_t uint_least32_t uint_least64_t These are the ones that function like the minimum-length typedefs you described. The C++ standard doesn't have these, though. And surprisingly, I see no sign of these in the working draft, either. Page 404, N2009.pdf. Jul 5 '06 #13

 P: n/a Robbie Hatley wrote: But there are certain features of C++ which I love so much that I rarely use C anymore, except for firmware or tiny utility apps: namespaces std containers std::string std algorithms std iterators parameterized constructors templates functors so if you like templates so much, and since your code is all inlined anyway, why not make the function a template on its integral type? You can use numeric_limits to find out some of the information you need about what type you have. Jul 5 '06 #14

 P: n/a I'd said: The C++ standard doesn't have [exact-size and minimum-size integers], though. And surprisingly, I see no sign of these in the working draft, either. And Pete Becker replied: Page 404, N2009.pdf. Thanks. I was looking at a copy of the working draft from last October (N1905). Obviously lots has changed since then. I just grabbed a copy of N2009.pdf from open-stds.org for my own reference. I see you're the author of that document. At least you're up front about the fact that some of the things in it are "incomplet and incorrekt and contain bad formatting". :-) Looks pretty clean to me, in spite of that. Thanks for all the hard work improving the language. -- Cheers, Robbie Hatley Tustin, CA, USA lonewolfintj at pacbell dot net (put "[usenet]" in subject to bypass spam filter) http://home.pacbell.net/earnur/ Jul 6 '06 #15

 P: n/a "Earl Purple" wrote: Robbie Hatley wrote: But there are certain features of C++ which I love so much that I rarely use C anymore, except for firmware or tiny utility apps: namespaces std containers std::string std algorithms std iterators parameterized constructors templates functors so if you like templates so much, and since your code is all inlined anyway, why not make the function a template on its integral type? I did. Probably around the same time as you were writing your post. I also incorporated many of Frederick Gotham's nifty tips. You can use numeric_limits to find out some of the information you need about what type you have. Yep, that's what I did. Here's my updated version of my Base<>() function: // Put this in a header file: namespace YourNamespaceName { /////////////////////////////////////////////////////////////////////////// // // // Base // // Represent an integer in any base from 2 to 36. // // // /////////////////////////////////////////////////////////////////////////// template std::string Base ( int base, // must be >= 2 and <= 36 int precision, // must be >= 1 and <= 63 T number, // must be >= min+5 and <= max-5 for type bool leading_zeros = false // does user want leading zeros? ) { T const max = std::numeric_limits::max() - 5; T const min = std::numeric_limits::min() + 5; double largest = pow(base, precision) - 1; if ( base < 2 || base 36 // If base is out-of-range || precision < 1 || precision 63 // or precision is out-of-range || number < min || number max // or number is out-of-range || largest max // or base/precision combo is out-of-range || largest < number // or base/precision combo can't express number ) { return std::string("***ERROR***"); // then return "***ERROR***". } std::string repre = std::string(""); if (number < 0) { number = -number; repre += '-'; } T place = 1; for (int i = 1; i <= precision - 1; ++i) { place *= base; } T value = 0; const char digits[37] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"; bool first_non_zero = false; for ( ; place 0; place /= base) { value = number / place; if (value 0) first_non_zero = true; if (leading_zeros || first_non_zero) repre += digits[value]; number -= value * place; } return repre; } // end Base() } // end namespace YourNamespaceName -- Cheers, Robbie Hatley Tustin, CA, USA lonewolfintj at pacbell dot net (put "[usenet]" in subject to bypass spam filter) http://home.pacbell.net/earnur/ Jul 6 '06 #16

 P: n/a Robbie Hatley posted: I did. Probably around the same time as you were writing your post. I also incorporated many of Frederick Gotham's nifty tips. I've written such an alogrithm before, and made it ultra-efficient. (I don't think many people here would like my code.) I'd post the code, but it's embedded deep within some horrible template code. Basically, I started off with the max value for an integer type, e.g.: 4294967295 And then I calculated the maximum power of the radix which is smaller than the max value, i.e.: 1000000000 (assuming a radix of 10) I would then take a value from the user, e.g.: 2533916891 I would divide the user's value by the max power; this would yield the first digit. Then I would mod the user value with the radix, and then divide the max power by the radix. Kind of like: 2533916891 1000000000 Yields 2 533916891 100000000 Yields 5 33916891 10000000 Yields 3 3916891 1000000 Yields 3 916891 100000 Yields 9 16891 10000 Yields 1 6891 1000 Yields 6 891 100 Yields 8 91 10 Yields 9 1 1 Yields 1 I went way way WAY overboard on the whole efficiency side of things though, calculating as many stuff with template metaprogramming as I can. -- Frederick Gotham Jul 6 '06 #17

 P: n/a "Frederick Gotham" wrote: 2533916891 1000000000 Yields 2 533916891 100000000 Yields 5 33916891 10000000 Yields 3 3916891 1000000 Yields 3 916891 100000 Yields 9 16891 10000 Yields 1 6891 1000 Yields 6 891 100 Yields 8 91 10 Yields 9 1 1 Yields 1 If you look at my code, it's doing exactly the same thing. Except that I use the word "base" instead of "radix", "place" to mean base^(place-value-index), and "value" to mean digit * place. See the "place /= base" in the for loop? It's doing the exact thing you present in your chart above. For example, user enters base=27, precision=30, number=3847566. My error-checking makes sure precision is high enough that base^precision number, which is true in this case. Start with place = base ^ (precision - 1). number/place == 0, place /= base; number/place == 0, place /= base; number/place == 0, place /= base; ... many more times ... when place gets from 27^30 down to 27^4, I finally get a non-zero result: number is now 3847566 and place is now 27^4 == 531441. digit = number/place = 3847566/531441 = 7, so print "7" and do value = digit * place = 7 * 531441 = 3720087 number -= value; place /= base; number is now 127479 and place is now 27^3 == 19683. digit = number/place = 127479/19683 = 6, so print "6" and do value = digit * place = 6 * 19683 = 118098 number -= value; place /= base; number is now 9381 and place is now 27^2 = 729. digit = number/place = 9381/729 = 12 ("C"), so print "C" and do value = digit * place = 12 * 729 = 8748 number -= value; place /= base; number is now 633 and place is now 27^1 = 27. digit = number/place = 633/27 = 23 ("N"), so print "N" and do value = digit * place = 23 * 27 = 621 number -= value; place /= base; number is now 12 and place is now 27^0 = 1. digit = number * place = 12 * 1 = 12 ("C"), so print "C" and do value = digit * place = 12 * 1 = 12 number -= value; place /= base; number is now 0 and place is now 0. for-loop test fails. return "76CNC"; The only major way I can see to make the algorithm more efficient is to calculate the "precision" to be exactly what it needs to be, intead of letting the user specify a number which may be way too high, such as the user entering a precision of "40" when the result is actually only going to have 5 27ary digits. Perhaps something like: if (!leading_zeros) { precision = static_cast(floor((log(number)/log(base)))); } -- Cheers, Robbie Hatley Tustin, CA, USA lonewolfintj at pacbell dot net (put "[usenet]" in subject to bypass spam filter) http://home.pacbell.net/earnur/ Jul 6 '06 #18

 P: n/a Robbie Hatley posted: The only major way I can see to make the algorithm more efficient is to calculate the "precision" to be exactly what it needs to be, intead of letting the user specify a number which may be way too high, such as the user entering a precision of "40" when the result is actually only going to have 5 27ary digits. Perhaps something like: if (!leading_zeros) { precision = static_cast(floor((log(number)/log(base)))); } To find out the amount of leading zeros, you could firstly find out the maximum amount of digits for a type for a give radix. So if we assume a radix of 10, and we have the following max value: 4294967295 Then our max digits is: 10 Then we take an input from the user, and calculate the max amount of digits. Subtract the two numbers and we have the amount of leading zeros. Here's a template if you're interested: template< class T, T max, T radix, bool max_is_less_than_radix = (max < radix) > struct MaxDigits { static T const val = 1 + MaxDigits::val; }; template< class T, T max, T radix> struct MaxDigits int main() { std::cout << MaxDigits::val; } -- Frederick Gotham Jul 6 '06 #19