Human readable number formatting

Alex Willmer

When reporting file sizes to the user, it's nice to print '16.1 MB',
rather than '16123270 B'. This is the behaviour the command 'df -h'
implements. There's no Python function that I could find to perform this
formatting, so I've taken a stab at it:

import math

def human_readable(n, suffix='B', places=2):
    '''Return a human friendly approximation of n, using SI prefixes'''
    prefixes = ['','k','M','G','T']
    base, step, limit = 10, 3, 100

    if n == 0:
        magnitude = 0 #cannot take log(0)
    else:
        magnitude = math.log(n, base)

    order = int(round(magnitude)) // step
    return '%.1f %s%s' % (float(n)/base**(order*step), \
                          prefixes[order], suffix)

Example usage

print [human_readable(x) for x in [0, 1, 23.5, 100, 1000/3, 500,
                                   1000000, 12.345e9]]
['0.0 B', '1.0 B', '23.5 B', '100.0 B', '0.3 kB', '0.5 kB', '1.0 MB',
 '12.3 GB']

I'd hoped to generalise this to base 2 (e.g. human_readable(1024, base=2)
== '1 KiB') and to enforce at most 3 digits (i.e. human_readable(100) ==
'0.1 kB' instead of '100 B'). However, I can't get the right results
adapting the above code.
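
One likely reason the adaptation is awkward: rounding the logarithm moves the switch between prefixes to about 10**2.5 (roughly 316), which is neither the 1000 boundary nor the limit of 100 wanted above. A small sketch of that arithmetic, written for Python 3; the helper names order_by_round and order_by_floor are made up for illustration and are not part of the posted code:

```python
import math

def order_by_round(n, step=3):
    """Prefix index as the posted function computes it: round, then floor-divide."""
    return int(round(math.log10(n))) // step

def order_by_floor(n, step=3):
    """Prefix index using the floor of the log, so the switch happens at 10**3."""
    return int(math.floor(math.log10(n))) // step

# Print the prefix index each variant picks around the boundaries.
for n in (100, 316, 317, 999, 1000):
    print(n, order_by_round(n), order_by_floor(n))
```

With rounding, 316 still gets no prefix while 317 already gets 'k'; with floor, the change happens exactly at 1000. Neither variant switches at 100, so the limit has to enter the calculation some other way.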

Here's where I'd like to ask for your help.
Am I chasing the right target, in basing my function on log()?
Does this function already exist in some python module?
Any hints, or would anyone care to finish it off/enhance it?

With thanks

Alex

Sep 27 '05 #1


jepler

Compared to your program, I think the key to mine is to divide by "limit"
before taking the log. In this way, things below the "limit" go to the next lower integer.

I think that instead of having 'step' and 'base', there should be a single
value which would be 1000 or 1024.

import math

def MakeFormat(prefixes, step, limit, base):
    def Format(n, suffix='B', places=2):
        if abs(n) < limit:
            if n == int(n):
                return "%s %s" % (n, suffix)
            else:
                return "%.1f %s" % (n, suffix)
        magnitude = math.log(abs(n) / limit, base) / step
        magnitude = min(int(magnitude)+1, len(prefixes)-1)

        return '%.1f %s%s' % (
            float(n) / base ** (magnitude * step),
            prefixes[magnitude], suffix)
    return Format

DecimalFormat = MakeFormat(
    prefixes = ['', 'k', 'M', 'G', 'T'],
    step = 3,
    limit = 100,
    base = 10)
BinaryFormat = MakeFormat(
    prefixes = ['', 'ki', 'Mi', 'Gi', 'Ti'],
    step = 10,
    limit = 100,
    base = 2)

values = [0, 1, 23.5, 100, 1000/3, 500, 1000000, 12.345e9]
print [DecimalFormat(v) for v in values]
print [BinaryFormat(v) for v in values]
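
The suggestion above of replacing 'step' and 'base' with a single value (1000 or 1024) could look roughly like the following. This is only a sketch for Python 3, not a tested replacement: make_format, decimal_format and binary_format are hypothetical names, and the limit of 100 is carried over from the code above.

```python
import math

def make_format(factor, prefixes, limit=100):
    """Build a formatter that scales by one factor (1000 or 1024)."""
    def fmt(n, suffix='B'):
        n = float(n)
        if abs(n) < limit:
            return '%.1f %s' % (n, suffix)
        # Dividing by limit before taking the log moves each switch point
        # from factor**k down to limit * factor**(k - 1).
        order = int(math.floor(math.log(abs(n) / limit, factor))) + 1
        # Clamp so very large values reuse the last prefix.
        order = min(order, len(prefixes) - 1)
        return '%.1f %s%s' % (n / factor ** order, prefixes[order], suffix)
    return fmt

decimal_format = make_format(1000, ['', 'k', 'M', 'G', 'T'])
binary_format = make_format(1024, ['', 'Ki', 'Mi', 'Gi', 'Ti'])

print([decimal_format(v) for v in (0, 23.5, 100, 500, 1000000)])
print([binary_format(v) for v in (0, 23.5, 100, 1024)])
```

Because the factor is a single number, the same closure serves both the decimal and the binary case; only the prefix list changes.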


Sep 28 '05 #2

Mike Meyer

Alex Willmer <al**@moreati.org.uk> writes:

Here's where I'd like to ask for your help.
Am I chasing the right target, in basing my function on log()?

I wouldn't have done it that way, but that's not worth very much. Can
you use the log() variation to change from proper scientific units
to the CS powers-of-two variation?

If not, I would do it this way:

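def human_readable(n, suffix = 'B', places = 2):
    prefixes = ['', 'K', 'M', 'G', 'T', 'P', 'E']

    top = 10 ** places
    index = 0
    n = float(n)
    # Each prefix is a factor of 1000, so divide by 1000 per step;
    # stop at the last prefix rather than running off the list.
    while abs(n) > top and index < len(prefixes) - 1:
        n /= 1000
        index += 1
    return '%.1f %s%s' % (n, prefixes[index], suffix)

Does this function already exist in some python module?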

humanize_number is a cross-platform C library function, about 150
lines of code. It uses the loop I gave above. It might be worthwhile
to swipe the code (it's BSD-licensed), wrap it, and submit a PR to add
it to the standard library - just so you get properly tested code.
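
As for the powers-of-two question, the loop approach extends fairly naturally: only the divisor and the prefix list change. A rough Python 3 sketch follows; human_readable_loop and its 'binary' flag are invented for illustration and do not mirror humanize_number's actual interface.

```python
def human_readable_loop(n, suffix='B', binary=False):
    """Scale n by repeated division until it drops below the divisor."""
    if binary:
        divisor, prefixes = 1024.0, ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei']
    else:
        divisor, prefixes = 1000.0, ['', 'k', 'M', 'G', 'T', 'P', 'E']
    n = float(n)
    index = 0
    # Stop at the last prefix so very large values cannot run off the list.
    while abs(n) >= divisor and index < len(prefixes) - 1:
        n /= divisor
        index += 1
    return '%.1f %s%s' % (n, prefixes[index], suffix)

print(human_readable_loop(16123270))
print(human_readable_loop(1536, binary=True))
```

No logarithm is needed here, which sidesteps the log(0) special case entirely.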

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

Sep 28 '05 #3

Erik Max Francis

Alex Willmer wrote:

When reporting file sizes to the user, it's nice to print '16.1 MB',
rather than '16123270 B'. This is the behaviour the command 'df -h'
implements. There's no Python function that I could find to perform this
formatting, so I've taken a stab at it:

BOTEC at

http://www.alcyone.com/software/botec/

contains a class called SI which does this formatting (and supports all
SI prefixes).

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
Dead men have no victory.
-- Euripides

Sep 28 '05 #4

J Correia

"Alex Willmer" <al**@moreati.org.uk> wrote in message

Here's where I'd like to ask for your help.
Am I chasing the right target, in basing my function on log()?
Does this function already exist in some python module?
Any hints, or would anyone care to finish it off/enhance it?

With thanks

Alex

This'll probably do what you want with some minor modifications.

def fmt3(num):
    for x in ['','Kb','Mb','Gb','Tb']:
        if num<1024:
            return "%3.1f%s" % (num, x)
        num /= 1024.0  # float division, so the fraction survives for ints

print [fmt3(x) for x in [0, 1, 23.5, 100, 1000/3, 500, 1000000, 12.345e9]]

['0.0', '1.0', '23.5', '100.0', '333.0', '500.0', '976.6Kb', '11.5Gb']

HTH.

Sep 28 '05 #5

MrJean1

Here is another function for human formatting:

<pre>

def sistr(value, prec=None, K=1024.0, k=1000.0, sign='', blank=' '):
    '''
    Convert value to a signed string with an SI prefix.

    The 'prec' value specifies the number of fractional
    digits to be included. Use 'prec=0' to omit any
    fraction. If 'prec' is not specified or None, the
    precision is adjusted to make the returned string 6
    characters (without the sign).

    The 'sign' character is used for positive values.
    Negative values are always prefixed with '-'.

    Uppercase 'K' is the scale factor for values above
    1.0 and lowercase 'k' scales values below 1.0.

    The 'blank' character is used as the SI prefix for
    values between k and K, i.e. value without an SI
    prefix. Set 'blank' to None, False or '' if no
    alignment is required.

    name    symbol   10**   symbol  name
    ====================================
    deca    da      +  1 -  d       deci
    hecto   h       +  2 -  c       centi
    - - - - - - - - - - - - - - - - - -
    Kilo    K       +  3 -  m       milli
    Mega    M       +  6 -  /u      micro
    Giga    G       +  9 -  n       nano
    Tera    T       + 12 -  p       pico
    Peta    P       + 15 -  f       femto
    Exa     E       + 18 -  a       atto
    Zetta   Z       + 21 -  z       zepto
    Yotta   Y       + 24 -  y       yocto
    ------------------------------------
    Xona    X       + 27 -  x       xonto
    Weka    W       + 30 -  w       wekto
    Vunda   V       + 33 -  v       vunkto
    Uda     U       + 36 -  u*      unto
    Treda   TD*     + 39 -  td      trekto
    Sorta   S       + 42 -  s       sotro
    Rinta   R       + 45 -  r       rimto
    Quexa   Q       + 48 -  q       quekto
    Pepta   PP      + 51 -  pk      pekro
    Ocha    O       + 54 -  o       otro
    Nena    N       + 57 -  nk      nekto
    MInga   MI      + 60 -  mk      mikto
    Luma    L       + 63 -  l       lunto

    The prefixes below the line are non-sanctioned SI
    and are only used until the symbols marked * to
    avoid ambiguity. The symbols above the dotted
    line are not used and '/u' is returned as 'u'.

    See http://en.wikipedia.org/wiki/Binary_prefix or
    http://www.bipm.org/en/si/prefixes.html and maybe
    http://jimvb.home.mindspring.com/unitsystem.htm
    '''
    s, v, p = sign, float(value), None
    if v < 0.0:
        s, v = '-', -v
    if v < K:
        if v >= 1.0:
            # None or False must become '', otherwise the value would be
            # mistaken for "out of range" below
            p = blank or ''
        elif k > 10.0:
            for f in iter('munpfazyxwv'):  # no unto, ...
                v *= k  # scale up
                if v >= 1.0:
                    p = f
                    break
    elif K > 10.0:
        for f in iter('KMGTPEZYXWVU'):  # no Treda, ...
            v /= K  # scale down
            if v < K:
                p = f
                break
    # format value
    if p is None:  # too large, small or invalid K, k
        return "%.0e*" % value
    elif prec is None:
        if v < 100.0:
            if v < 10.0:
                prec = 3
            else:
                prec = 2
        else:
            if v < 1000.0:
                prec = 1
            else:
                prec = 0
    elif prec < 0:
        prec = 0  # rounds
    return "%s%0.*f%s" % (s, prec, v, p)

if __name__ == '__main__':
    x = 17
    while x < 1.0e18:
        print sistr(x), x
        x *= 17
    x = 0.12
    while x > 1.0e-18:
        print sistr(x), x
        x *= 0.12

</pre>

/Jean Brouwers
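
The automatic precision rule described in the docstring can be looked at in isolation. The sketch below is written for Python 3; auto_precision is a hypothetical helper that mirrors the nested if/else in sistr and is not part of the posted function.

```python
def auto_precision(v):
    """Fractional digits chosen when 'prec' is None, for a scaled value v >= 0."""
    if v < 10.0:
        return 3
    if v < 100.0:
        return 2
    if v < 1000.0:
        return 1
    return 0

# Format a few scaled values with a one-character prefix and show the width.
for v in (1.5, 17.0, 289.0, 1023.0):
    text = '%0.*f%s' % (auto_precision(v), v, 'K')
    print(text, len(text))
```

Values below 1000 come out six characters wide including the prefix. With K=1024 a scaled value between 1000 and 1023 gets no fraction and so appears to be one character shorter.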

Sep 28 '05 #6

Paul Rubin

"MrJean1" <Mr*****@gmail.com> writes:

Ocha O + 54 - o otro
Nena N + 57 - nk nekto
MInga MI + 60 - mk mikto
Luma L + 63 - l lunto

Please tell me you're making this up.

Sep 28 '05 #7

Robert Kern

Paul Rubin wrote:

"MrJean1" <Mr*****@gmail.com> writes:
Ocha O + 54 - o otro
Nena N + 57 - nk nekto
MInga MI + 60 - mk mikto
Luma L + 63 - l lunto

Please tell me you're making this up.

No, but someone else is.

http://jimvb.home.mindspring.com/unitsystem.htm

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Sep 28 '05 #8

MrJean1

No, I didn't. See the references at the bottom.

/Jean Brouwers

Sep 28 '05 #9

mensanator

MrJean1 wrote:

No, I didn't. See the references at the bottom.

/Jean Brouwers

So when I say "I'm sorta busy" it means I'm REALLY busy.

Sep 28 '05 #10
