Human readable number formatting

When reporting file sizes to the user, it's nice to print '16.1 MB',
rather than '16123270 B'. This is the behaviour the command 'df -h'
implements. There's no Python function that I could find to perform this
formatting, so I've taken a stab at it:

import math

def human_readable(n, suffix='B', places=2):
    '''Return a human friendly approximation of n, using SI prefixes'''
    prefixes = ['','k','M','G','T']
    base, step, limit = 10, 3, 100

    if n == 0:
        magnitude = 0 #cannot take log(0)
    else:
        magnitude = math.log(n, base)

    order = int(round(magnitude)) // step
    return '%.1f %s%s' % (float(n)/base**(order*step), \
                          prefixes[order], suffix)

Example usage:

print [human_readable(x) for x in [0, 1, 23.5, 100, 1000/3, 500,
                                   1000000, 12.345e9]]
['0.0 B', '1.0 B', '23.5 B', '100.0 B', '0.3 kB', '0.5 kB', '1.0 MB',
 '12.3 GB']

I'd hoped to generalise this to base 2 (eg human_readable(1024, base=2)
== '1 KiB') and to enforce 3 digits at most (ie human_readable(100) ==
'0.1 KB' instead of '100 B'). However I can't get the right results
adapting the above code.

Here's where I'd like to ask for your help.
Am I chasing the right target, in basing my function on log()?
Does this function already exist in some python module?
Any hints, or would anyone care to finish it off/enhance it?

With thanks

Alex
Sep 27 '05 #1
Compared to your program, I think the key to mine is to divide by "limit"
before taking the log. In this way, things below the "limit" go to the next lower integer.

I think that instead of having 'step' and 'base', there should be a single
value which would be 1000 or 1024.

import math

def MakeFormat(prefixes, step, limit, base):
    def Format(n, suffix='B', places=2):
        # Values below the limit are printed without a prefix.
        if abs(n) < limit:
            if n == int(n):
                return "%s %s" % (n, suffix)
            else:
                return "%.1f %s" % (n, suffix)
        # Dividing by the limit before taking the log pushes anything
        # at or above the limit into the next prefix.
        magnitude = math.log(abs(n) / limit, base) / step
        magnitude = min(int(magnitude)+1, len(prefixes)-1)

        return '%.1f %s%s' % (
            float(n) / base ** (magnitude * step),
            prefixes[magnitude], suffix)
    return Format

DecimalFormat = MakeFormat(
    prefixes = ['', 'k', 'M', 'G', 'T'],
    step = 3,
    limit = 100,
    base = 10)
# The IEC binary prefixes are capitalised: Ki, Mi, Gi, Ti.
BinaryFormat = MakeFormat(
    prefixes = ['', 'Ki', 'Mi', 'Gi', 'Ti'],
    step = 10,
    limit = 100,
    base = 2)

values = [0, 1, 23.5, 100, 1000/3, 500, 1000000, 12.345e9]
print [DecimalFormat(v) for v in values]
print [BinaryFormat(v) for v in values]
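
The suggestion above, a single value of 1000 or 1024 in place of separate 'step' and 'base', might look something like the sketch below. It is written for Python 3, and the name format_size_log, its parameters and the prefix lists are illustrative assumptions, not code posted in this thread. It takes the floor of the logarithm instead of rounding it, so 500 stays in bytes rather than becoming '0.5 kB'.

```python
import math

def format_size_log(n, multiple=1000, suffix='B'):
    """Format n with a prefix chosen from floor(log(n, multiple)).

    `multiple` is the single scale value: 1000 for SI prefixes or
    1024 for binary ones. Name and signature are hypothetical.
    """
    if multiple == 1024:
        prefixes = ['', 'Ki', 'Mi', 'Gi', 'Ti']
    else:
        prefixes = ['', 'k', 'M', 'G', 'T']

    size = abs(float(n))
    if size < 1:
        order = 0  # log() is undefined at 0 and negative below 1
    else:
        # floor, not round: 500 must stay in the unprefixed bucket
        order = int(math.floor(math.log(size, multiple)))
        order = min(order, len(prefixes) - 1)
        # math.log can land a hair below an exact power, so bump the
        # order if the scaled value still reaches `multiple`
        if order < len(prefixes) - 1 and size / multiple ** order >= multiple:
            order += 1

    return '%.1f %s%s' % (float(n) / multiple ** order,
                          prefixes[order], suffix)

print([format_size_log(v) for v in [0, 500, 16123270, 12.345e9]])
print(format_size_log(1024, multiple=1024))
```

Values that round up to the next power when printed (999.96, say) are not special-cased here and would still come out as '1000.0 B'.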

Sep 28 '05 #2
Alex Willmer <al**@moreati.org.uk> writes:
Here's where I'd like to ask for your help.
Am I chasing the right target, in basing my function on log()?


I wouldn't have done it that way, but that's not worth very much. Can
you use the log() variation to change from proper scientific units
to the CS powers-of-two variation?

If not, I would do it this way:

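def human_readable(n, suffix = 'B', places = 2):
    prefixes = ['', 'K', 'M', 'G', 'T', 'P', 'E']

    top = 10 ** places
    index = 0
    n = float(n)
    # Each prefix is a factor of 1000, so divide by 1000 per step,
    # and stop at the last prefix rather than running off the list.
    while abs(n) > top and index < len(prefixes) - 1:
        n /= 1000
        index += 1
    return '%.1f %s%s' % (n, prefixes[index], suffix)

Does this function already exist in some python module?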


humanize_number is a cross-platform C library function, about 150
lines of code. It uses the loop I gave above. It might be worthwhile
to swipe the code (it's BSD-licensed), wrap it, and submit a PR to add
it to the standard library - just so you get properly tested code.
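
As a rough answer to the powers-of-two question, the loop approach described above could be parameterised on the step size. The sketch below is Python 3 and is not a port of humanize_number; the name format_size_loop and its 'binary' and 'places' parameters are assumptions made for illustration. With places=2 anything of 100 or more is scaled down, which gives the '0.1 kB' for 100 that the original post asked for.

```python
def format_size_loop(n, binary=False, places=2, suffix='B'):
    """Scale n down one prefix at a time until it is below 10**places.

    Hypothetical helper: binary=True divides by 1024 and uses IEC
    prefixes, otherwise it divides by 1000 and uses SI prefixes.
    """
    if binary:
        step, prefixes = 1024.0, ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei']
    else:
        step, prefixes = 1000.0, ['', 'k', 'M', 'G', 'T', 'P', 'E']

    limit = 10 ** places  # places=2 -> values of 100 and above are scaled
    value = float(n)
    index = 0
    # stop at the last prefix so huge values cannot overrun the list
    while abs(value) >= limit and index < len(prefixes) - 1:
        value /= step
        index += 1
    return '%.1f %s%s' % (value, prefixes[index], suffix)

print([format_size_loop(v) for v in [0, 99, 100, 16123270]])
print(format_size_loop(1024, binary=True))
```

Whether the cut-off should be 100 or 1000 is a matter of taste; passing places=3 keeps values up to 999 unscaled.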

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Sep 28 '05 #3
Alex Willmer wrote:
When reporting file sizes to the user, it's nice to print '16.1 MB',
rather than '16123270 B'. This is the behaviour the command 'df -h'
implements. There's no python function that I could find to perform this
formatting , so I've taken a stab at it:


BOTEC at

http://www.alcyone.com/software/botec/

contains a class called SI which does this formatting (and supports all
SI prefixes).

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
Dead men have no victory.
-- Euripides
Sep 28 '05 #4
This'll probably do what you want with some minor modifications.

def fmt3(num):
    for x in ['','Kb','Mb','Gb','Tb']:
        if num<1024:
            return "%3.1f%s" % (num, x)
        # float divisor, so integer inputs are not truncated
        num /= 1024.0

print [fmt3(x) for x in [0, 1, 23.5, 100, 1000/3, 500, 1000000, 12.345e9]]

['0.0', '1.0', '23.5', '100.0', '333.0', '500.0', '976.6Kb', '11.5Gb']

HTH.
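
For what the "minor modifications" might be: fmt3 as posted prints plain byte counts with no unit and returns None once it runs out of suffixes. One possible variation is sketched below in Python 3; the name fmt3_bytes and the choice of IEC unit names are assumptions, not part of the post above.

```python
def fmt3_bytes(num):
    """fmt3 with a unit on every result and a fallback for huge values.

    Hypothetical variant: the unit names are a choice made here.
    """
    units = ['B', 'KiB', 'MiB', 'GiB', 'TiB']
    num = float(num)
    for unit in units[:-1]:
        if abs(num) < 1024:
            return '%3.1f %s' % (num, unit)
        num /= 1024
    # anything still left is reported in the largest unit
    return '%3.1f %s' % (num, units[-1])

print([fmt3_bytes(x) for x in [0, 500, 1000000, 12.345e9, 2 ** 60]])
```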


Sep 28 '05 #5
Here is another function for human formatting:

<pre>

def sistr(value, prec=None, K=1024.0, k=1000.0, sign='', blank=' '):
    '''
    Convert value to a signed string with an SI prefix.

    The 'prec' value specifies the number of fractional
    digits to be included. Use 'prec=0' to omit any
    fraction. If 'prec' is not specified or None, the
    precision is adjusted to make the returned string 6
    characters (without the sign).

    The 'sign' character is used for positive values.
    Negative values are always prefixed with '-'.

    Uppercase 'K' is the scale factor for values above
    1.0 and lowercase 'k' scales values below 1.0.

    The 'blank' character is used as the SI prefix for
    values between k and K, i.e. value without an SI
    prefix. Set 'blank' to None, False or '' if no
    alignment is required.

    name   symbol  10**   symbol name
    ===================================
    deca   da     +  1 -  d      deci
    hecto  h      +  2 -  c      centi
    - - - - - - - - - - - - - - - - - -
    Kilo   K      +  3 -  m      milli
    Mega   M      +  6 -  /u     micro
    Giga   G      +  9 -  n      nano
    Tera   T      + 12 -  p      pico
    Peta   P      + 15 -  f      femto
    Exa    E      + 18 -  a      atto
    Zetta  Z      + 21 -  z      zepto
    Yotta  Y      + 24 -  y      yocto
    -----------------------------------
    Xona   X      + 27 -  x      xonto
    Weka   W      + 30 -  w      wekto
    Vunda  V      + 33 -  v      vunkto
    Uda    U      + 36 -  u*     unto
    Treda  TD*    + 39 -  td     trekto
    Sorta  S      + 42 -  s      sotro
    Rinta  R      + 45 -  r      rimto
    Quexa  Q      + 48 -  q      quekto
    Pepta  PP     + 51 -  pk     pekro
    Ocha   O      + 54 -  o      otro
    Nena   N      + 57 -  nk     nekto
    MInga  MI     + 60 -  mk     mikto
    Luma   L      + 63 -  l      lunto

    The prefixes below the line are non-sanctioned SI
    and are only used until the symbols marked * to
    avoid ambiguity. The symbols above the dotted
    line are not used and '/u' is returned as 'u'.

    See http://en.wikipedia.org/wiki/Binary_prefix or
    http://www.bipm.org/en/si/prefixes.html and maybe
    http://jimvb.home.mindspring.com/unitsystem.htm
    '''
    s, v, p = sign, float(value), None
    if v < 0.0:
        s, v = '-', -v
    if v < K:
        if v >= 1.0:
            # None or False must not fall into the "p is None" error
            # branch below, so treat them as an empty prefix
            p = blank or ''
        elif k > 10.0:
            for f in iter('munpfazyxwv'):  # no unto, ...
                v *= k  # scale up
                if v >= 1.0:
                    p = f
                    break
    elif K > 10.0:
        for f in iter('KMGTPEZYXWVU'):  # no Treda, ...
            v /= K  # scale down
            if v < K:
                p = f
                break
    # format value
    if p is None:  # too large, small or invalid K, k
        return "%.0e*" % value
    elif prec is None:
        if v < 100.0:
            if v < 10.0:
                prec = 3
            else:
                prec = 2
        else:
            if v < 1000.0:
                prec = 1
            else:
                prec = 0
    elif prec < 0:
        prec = 0  # rounds
    return "%s%0.*f%s" % (s, prec, v, p)

if __name__ == '__main__':
    x = 17
    while x < 1.0e18:
        print sistr(x), x
        x *= 17
    x = 0.12
    while x > 1.0e-18:
        print sistr(x), x
        x *= 0.12

</pre>

/Jean Brouwers

Sep 28 '05 #6
"MrJean1" <Mr*****@gmail.com> writes:
Ocha O + 54 - o otro
Nena N + 57 - nk nekto
MInga MI + 60 - mk mikto
Luma L + 63 - l lunto


Please tell me you're making this up.
Sep 28 '05 #7
Paul Rubin wrote:
"MrJean1" <Mr*****@gmail.com> writes:
Ocha O + 54 - o otro
Nena N + 57 - nk nekto
MInga MI + 60 - mk mikto
Luma L + 63 - l lunto


Please tell me you're making this up.


No, but someone else is.

http://jimvb.home.mindspring.com/unitsystem.htm

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Sep 28 '05 #8
No, I didn't. See the references at the bottom.

/Jean Brouwers

Sep 28 '05 #9

MrJean1 wrote:
No, I didn't. See the references at the bottom.

/Jean Brouwers


So when I say "I'm sorta busy" it means I'm REALLY busy.

Sep 28 '05 #10
