
Human readable number formatting

When reporting file sizes to the user, it's nice to print '16.1 MB' rather than '16123270 B'. This is the behaviour the command 'df -h' implements. There's no Python function that I could find to perform this formatting, so I've taken a stab at it:

    import math

    def human_readable(n, suffix='B', places=2):
        '''Return a human friendly approximation of n, using SI prefixes'''
        prefixes = ['','k','M','G','T']
        base, step, limit = 10, 3, 100

        if n == 0:
            magnitude = 0 #cannot take log(0)
        else:
            magnitude = math.log(n, base)

        order = int(round(magnitude)) // step
        return '%.1f %s%s' % (float(n)/base**(order*step), \
                              prefixes[order], suffix)

Example usage:

    print [human_readable(x) for x in [0, 1, 23.5, 100, 1000/3, 500, 1000000, 12.345e9]]
    ['0.0 B', '1.0 B', '23.5 B', '100.0 B', '0.3 kB', '0.5 kB', '1.0 MB', '12.3 GB']

I'd hoped to generalise this to base 2 (e.g. human_readable(1024, base=2) == '1 KiB') and to enforce 3 digits at most (i.e. human_readable(100) == '0.1 KB' instead of '100 B'). However, I can't get the right results adapting the above code.

Here's where I'd like to ask for your help. Am I chasing the right target in basing my function on log()? Does this function already exist in some Python module? Any hints, or would anyone care to finish it off/enhance it?

With thanks

Alex

Sep 27 '05 #1
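On the log() question, here is one possible Python 3 sketch; it is not code from the thread. In the function above, rounding the logarithm is what turns 333 into '0.3 kB', since round(log10(333)) is 3. Taking the floor of the logarithm in the target factor (1000 or 1024) avoids that, but a floating-point log() can land just above or below an exact power, so the sketch re-checks the result by plain comparison. The `binary` flag and the two prefix lists are assumptions of this sketch, not part of the original function.

```python
import math

SI_PREFIXES = ['', 'k', 'M', 'G', 'T']        # powers of 1000
IEC_PREFIXES = ['', 'Ki', 'Mi', 'Gi', 'Ti']   # powers of 1024

def human_readable(n, suffix='B', binary=False):
    """Format n with an SI prefix, or an IEC prefix if binary is true."""
    factor = 1024 if binary else 1000
    prefixes = IEC_PREFIXES if binary else SI_PREFIXES
    size = abs(n)
    if size < factor:
        order = 0   # also covers 0, where log() is undefined
    else:
        order = int(math.floor(math.log(size, factor)))
        # log() is inexact near exact powers of the factor,
        # so correct a possible off-by-one by comparison.
        if size >= factor ** (order + 1):
            order += 1
        elif size < factor ** order:
            order -= 1
        # Values beyond the last prefix stay on the last prefix.
        order = min(order, len(prefixes) - 1)
    return '%.1f %s%s' % (n / factor ** order, prefixes[order], suffix)

print([human_readable(x) for x in [0, 1, 23.5, 100, 333, 500, 1000000, 12.345e9]])
print(human_readable(1024, binary=True))
```

With this version 333 stays at '333.0 B' and human_readable(1024, binary=True) gives '1.0 KiB'. Values just below a boundary can still display as '1000.0' after rounding to one decimal place; the sketch does not try to handle that.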
9 Replies

Compared to your program, I think the key to mine is to divide by "limit" before taking the log. In this way, things below the "limit" go to the next lower integer.

I think that instead of having 'step' and 'base', there should be a single value which would be 1000 or 1024.

    import math

    def MakeFormat(prefixes, step, limit, base):
        def Format(n, suffix='B', places=2):
            if abs(n) < limit:
                if n == int(n):
                    return "%s %s" % (n, suffix)
                else:
                    return "%.1f %s" % (n, suffix)
            magnitude = math.log(abs(n) / limit, base) / step
            magnitude = min(int(magnitude)+1, len(prefixes)-1)

            return '%.1f %s%s' % (
                float(n) / base ** (magnitude * step),
                prefixes[magnitude], suffix)
        return Format

    DecimalFormat = MakeFormat(
        prefixes = ['', 'k', 'M', 'G', 'T'],
        step = 3, limit = 100, base = 10)
    BinaryFormat = MakeFormat(
        prefixes = ['', 'ki', 'Mi', 'Gi', 'Ti'],
        step = 10, limit = 100, base = 2)

    values = [0, 1, 23.5, 100, 1000/3, 500, 1000000, 12.345e9]
    print [DecimalFormat(v) for v in values]
    print [BinaryFormat(v) for v in values]

Sep 28 '05 #2
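The suggestion to replace 'step' and 'base' with a single value could look roughly like the Python 3 sketch below. It keeps the idea of dividing by the limit before taking the log, so that anything from `limit` upwards moves to the next prefix. The names `make_formatter`, `factor`, `decimal_format` and `binary_format` are made up for this sketch, and it uses the IEC spelling 'Ki' where the post has 'ki'. Unlike the post, it always prints one decimal place, even for whole numbers below the limit.

```python
import math

def make_formatter(prefixes, factor, limit):
    """Return a formatter that moves to the next prefix at `limit`."""
    def fmt(n, suffix='B'):
        size = abs(n)
        if size < limit:
            order = 0
        else:
            # Dividing by limit first shifts every threshold down:
            # limit <= size < limit * factor maps to order 1, and so on.
            order = int(math.floor(math.log(size / limit, factor))) + 1
            # Correct a possible off-by-one from floating-point log().
            if size >= limit * factor ** order:
                order += 1
            elif size < limit * factor ** (order - 1):
                order -= 1
            # Values beyond the last prefix stay on the last prefix.
            order = min(order, len(prefixes) - 1)
        return '%.1f %s%s' % (n / factor ** order, prefixes[order], suffix)
    return fmt

decimal_format = make_formatter(['', 'k', 'M', 'G', 'T'], factor=1000, limit=100)
binary_format = make_formatter(['', 'Ki', 'Mi', 'Gi', 'Ti'], factor=1024, limit=100)

values = [0, 1, 23.5, 99, 100, 500, 1000000, 12.345e9]
print([decimal_format(v) for v in values])
print([binary_format(v) for v in values])
```

With limit=100, decimal_format(99) stays at '99.0 B' while decimal_format(100) becomes '0.1 kB', which is the "3 digits at most" behaviour asked for in the first post.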

Alex Willmer wrote:
> When reporting file sizes to the user, it's nice to print '16.1 MB',
> rather than '16123270 B'. This is the behaviour the command 'df -h'
> implements. There's no python function that I could find to perform this
> formatting, so I've taken a stab at it:

BOTEC at http://www.alcyone.com/software/botec/ contains a class called SI which does this formatting (and supports all SI prefixes).

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
  Dead men have no victory.
  -- Euripides

Sep 28 '05 #4

"Alex Willmer" wrote in message news:11**********************@localhost.localdomain...
> When reporting file sizes to the user, it's nice to print '16.1 MB',
> rather than '16123270 B'. This is the behaviour the command 'df -h'
> implements. There's no python function that I could find to perform this
> formatting, so I've taken a stab at it:
> [...]

This'll probably do what you want with some minor modifications.

    def fmt3(num):
        for x in ['','Kb','Mb','Gb','Tb']:
            if num<1024:
                return "%3.1f%s" % (num, x)
            # divide by a float so integer input is not truncated
            num /=1024.0

    print [fmt3(x) for x in [0, 1, 23.5, 100, 1000/3, 500, 1000000, 12.345e9]]
    ['0.0', '1.0', '23.5', '100.0', '333.0', '500.0', '976.6Kb', '11.5Gb']

HTH.

Sep 28 '05 #5
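For comparison, here is the same loop-and-divide idea as a Python 3 sketch, where `/=` is always true division. Two things differ from fmt3 above and are assumptions of this sketch: the unit names are passed in as a tuple (`units`, defaulting to IEC-style names), and a value too large for the list is reported in the last unit instead of falling off the end of the loop and returning None. The name `fmt_size` is made up for the example.

```python
def fmt_size(num, units=('B', 'KiB', 'MiB', 'GiB', 'TiB'), factor=1024.0):
    """Divide by factor until the value fits, then attach the unit."""
    for unit in units[:-1]:
        if abs(num) < factor:
            return '%3.1f %s' % (num, unit)
        num /= factor
    # Anything still left over is reported in the largest unit.
    return '%3.1f %s' % (num, units[-1])

print([fmt_size(x) for x in [0, 1, 23.5, 100, 333, 500, 1000000, 12.345e9]])
```

No logarithm is involved, so there is no rounding of the exponent to worry about; the cost is one division per prefix step.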

Here is another function for human formatting:

    def sistr(value, prec=None, K=1024.0, k=1000.0, sign='', blank=' '):
        '''Convert value to a signed string with an SI prefix.

           The 'prec' value specifies the number of fractional digits to
           be included.  Use 'prec=0' to omit any fraction.  If 'prec' is
           not specified or None, the precision is adjusted to make the
           returned string 6 characters (without the sign).

           The 'sign' character is used for positive values.  Negative
           values are always prefixed with '-'.

           Uppercase 'K' is the scale factor for values above 1.0 and
           lowercase 'k' scales values below 1.0.

           The 'blank' character is used as the SI prefix for values
           between k and K, i.e. value without an SI prefix.  Set 'blank'
           to None, False or '' if no alignment is required.

           name symbol   10**   symbol name
           =================================
           deca   da    +  1 -     d   deci
           hecto  h     +  2 -     c   centi
           - - - - - - - - - - - - - - - - -
           Kilo   K     +  3 -     m   milli
           Mega   M     +  6 -    /u   micro
           Giga   G     +  9 -     n   nano
           Tera   T     + 12 -     p   pico
           Peta   P     + 15 -     f   femto
           Exa    E     + 18 -     a   atto
           Zetta  Z     + 21 -     z   zepto
           Yotta  Y     + 24 -     y   yocto
           ---------------------------------
           Xona   X     + 27 -     x   xonto
           Weka   W     + 30 -     w   wekto
           Vunda  V     + 33 -     v   vunkto
           Uda    U     + 36 -     u*  unto
           Treda  TD*   + 39 -    td   trekto
           Sorta  S     + 42 -     s   sotro
           Rinta  R     + 45 -     r   rimto
           Quexa  Q     + 48 -     q   quekto
           Pepta  PP    + 51 -    pk   pekro
           Ocha   O     + 54 -     o   otro
           Nena   N     + 57 -    nk   nekto
           MInga  MI    + 60 -    mk   mikto
           Luma   L     + 63 -     l   lunto

           The prefixes below the line are non-sanctioned SI and are
           only used until the symbols marked * to avoid ambiguity.

           The symbols above the dotted line are not used and '/u' is
           returned as 'u'.

           See http://en.wikipedia.org/wiki/Binary_prefix or
           http://www.bipm.org/en/si/prefixes.html and maybe
           http://jimvb.home.mindspring.com/unitsystem.htm
        '''
        s, v, p = sign, float(value), None
        if v < 0.0:
            s, v = '-', -v
        if v < K:
            if v >= 1.0:
                # None or False means "no alignment", so use an empty prefix
                p = blank or ''
            elif k > 10.0:
                for f in iter('munpfazyxwv'):  # no unto, ...
                    v *= k  # scale up
                    if v >= 1.0:
                        p = f
                        break
        elif K > 10.0:
            for f in iter('KMGTPEZYXWVU'):  # no Treda, ...
                v /= K  # scale down
                if v < K:
                    p = f
                    break
        # format value
        if p is None:  # too large, small or invalid K, k
            return "%.0e*" % value
        elif prec is None:
            if v < 100.0:
                if v < 10.0:
                    prec = 3
                else:
                    prec = 2
            else:
                if v < 1000.0:
                    prec = 1
                else:
                    prec = 0
        elif prec < 0:
            prec = 0 # rounds
        return "%s%0.*f%s" % (s, prec, v, p)

    if __name__ == '__main__':
        x = 17
        while x < 1.0e18:
            print sistr(x), x
            x *= 17
        x = 0.12
        while x > 1.0e-18:
            print sistr(x), x
            x *= 0.12

/Jean Brouwers

Sep 28 '05 #6
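What sets sistr apart from the earlier replies is that it also scales values below 1.0 upwards. Pulled out on its own, that branch might look like the Python 3 sketch below. The name `si_small` is hypothetical, the prefix string is cut down to the sanctioned SI letters (with 'u' for micro, as in sistr), and the precision is fixed at one decimal place instead of being adjusted automatically.

```python
def si_small(value, prefixes='munpfazy', k=1000.0):
    """Format a value with 0 < |value| < 1 using a sub-unit SI prefix."""
    sign = '-' if value < 0 else ''
    v = abs(float(value))
    if v >= 1.0 or v == 0.0:
        # Outside the range this sketch handles: no prefix at all.
        return '%s%.1f' % (sign, v)
    for p in prefixes:
        v *= k  # scale up by one prefix step
        if v >= 1.0:
            return '%s%.1f%s' % (sign, v, p)
    # Smaller than the last prefix: fall back to exponent notation,
    # marked with '*' as sistr does.
    return '%.0e*' % value

print(si_small(0.12), si_small(0.0144), si_small(1.5e-6))
```

It mirrors the divide-down loop used for large values: multiply by 1000 per step and stop at the first prefix where the scaled value reaches 1.0.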

"MrJean1" writes:
> Ocha   O     + 54 -     o   otro
> Nena   N     + 57 -    nk   nekto
> MInga  MI    + 60 -    mk   mikto
> Luma   L     + 63 -     l   lunto

Please tell me you're making this up.

Sep 28 '05 #7

Paul Rubin wrote:
> "MrJean1" writes:
>> Ocha   O     + 54 -     o   otro
>> Nena   N     + 57 -    nk   nekto
>> MInga  MI    + 60 -    mk   mikto
>> Luma   L     + 63 -     l   lunto
>
> Please tell me you're making this up.

No, but someone else is.

http://jimvb.home.mindspring.com/unitsystem.htm

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter

Sep 28 '05 #8

No, I didn't. See the references at the bottom.

/Jean Brouwers

Sep 28 '05 #9

MrJean1 wrote:
> No, I didn't. See the references at the bottom.
>
> /Jean Brouwers

So when I say "I'm sorta busy" it means I'm REALLY busy.

Sep 28 '05 #10
