Archives and magic bytes

andrea

Hi everybody,
this is my first post, but I've already read many of your interesting
posts... (sorry for my bad English)

Anyway, for my little project I need a module that, given an archive (zip,
bz2, tar ...), gives me back its decompressed contents.

I looked at the modules in the library reference and used some of them,
but the problem is that they all behave differently, and only one of
them, bz2, has a simple function for decompressing data:

decompress(...)
    decompress(data) -> decompressed data

    Decompress data in one shot. If you want to decompress data sequentially,
    use an instance of BZ2Decompressor instead.

I can't even get that one working. What does "data" mean? A file?

Maybe I could implement the compression algorithm myself (or copy it
from the modules) and write my own compress/decompress functions.
What do you think?
Do you know if something similar already exists?

Another thing: I work on Linux (Gentoo) and I would like to use the
"file" command to retrieve information about the type of a file instead
of relying on extensions. Do you think this can be done?

Thanks, and sorry if I've asked some stupid questions; I'm still a Python
novice (but it's a great language).

Andrea

Jul 18 '05 #1


Maxim Krikun

> Another thing, I work on linux (gentoo) and I would like to use the
> "file" command to retrieve information about type of file instead of
> using extensions, do you think this can be done?

This is trivial:

>>> import os
>>> os.popen("file /etc/passwd").read()
'/etc/passwd: ASCII text\n'
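
For readers on a current Python, the same idea can be sketched with the subprocess module instead of os.popen. This is only an illustration: the helper name describe_file is hypothetical, and it assumes Python 3.7 or later and a "file" binary on the PATH (it returns None when there is none). Passing the arguments as a list avoids shell quoting problems with odd file names.

```python
import shutil
import subprocess


def describe_file(path):
    """Return the description printed by file(1), or None if it is unavailable."""
    # Assumption: the "file" utility is installed; give up quietly otherwise.
    if shutil.which("file") is None:
        return None
    # "-b" (brief) asks file(1) not to prepend the file name to its output.
    result = subprocess.run(
        ["file", "-b", path],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    print(describe_file("/etc/passwd"))
```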

Jul 18 '05 #2

Chris Rebert (cybercobra)

Have you tried the tarfile or zipfile modules? You might need to upgrade
your Python if you don't have them. They look pretty easy and should
make this a snap.
You can grab the output from the *nix "file" command using the new
subprocess module.
Good Luck
- Chris
=======
PYTHON POWERs all!
All your code are belong to Python!
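
A minimal sketch of how tarfile, zipfile and bz2 could be put behind one function, assuming Python 3 rather than the Python 2.3 discussed in this thread. The function name unpack and the output naming rule are hypothetical. It also shows what "data" means for bz2.decompress: the compressed bytes already read into memory, not a file name or file object.

```python
import bz2
import os
import tarfile
import zipfile


def unpack(archive_path, dest_dir):
    """Unpack archive_path into dest_dir and return the kind that was detected."""
    # Tar is checked first; tarfile.open also handles compressed tarballs
    # such as .tar.gz and .tar.bz2 transparently.
    if tarfile.is_tarfile(archive_path):
        with tarfile.open(archive_path) as tf:
            tf.extractall(dest_dir)
        return "tar"
    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(dest_dir)
        return "zip"
    # A plain .bz2 file holds a single compressed stream, not an archive.
    with open(archive_path, "rb") as fh:
        data = fh.read()
    if data[:3] == b"BZh":  # magic bytes at the start of a bzip2 stream
        name = os.path.basename(archive_path)
        if name.endswith(".bz2"):
            name = name[:-4]
        else:
            name += ".out"
        with open(os.path.join(dest_dir, name), "wb") as out:
            # bz2.decompress takes the compressed bytes and returns bytes.
            out.write(bz2.decompress(data))
        return "bz2"
    raise ValueError("unrecognised archive: " + archive_path)
```

The detection here looks at the file contents rather than the extension, so a wrongly named archive is still handled. Reading a whole .bz2 file into memory is only reasonable for small files.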

Jul 18 '05 #3

andrea

Chris Rebert (cybercobra) wrote:

> Have you tried the tarfile or zipfile modules? You might need to upgrade
> your python if you don't have them. They look pretty easy and should
> make this a snap.
> You can grab the output from the *nix "file" command using the new
> subprocess module.
> Good Luck
> - Chris
> =======
> PYTHON POWERs all!
> All your code are belong to Python!

I've got them (I'm still using Python 2.3 because I use Gentoo), but they
are not as easy to use as they seem...
I'll try again, thanks!

Jul 18 '05 #4

Jim

This is something I've recently thought about; perhaps you wouldn't
mind some points?

1) I've been running 'file' via os.popen, and I've had trouble with it
incorrectly spotting file types (Fedora Core 1). I can name a specific
example where it thinks a plain text README file is HTML (even though
the configuration file for 'file' at least looks right). That makes me
suspicious of its ability to spot more obscure types.

(No, I haven't tried to get the latest 'file'; the days are long but
they are filled with negative time and in the end I don't always get
done what I should.)

2) Watch out for someone giving you, say, a bogus /bin/ls in a .zip
file. You may want to look into chroot (which I believe requires you
to run as root), or at least examine the output of "unzip -l".

3) You might also have to worry about the possibility that unpacking a
bundle will fill up your disk's partition. At least for a while you
hold both the bundle and the unpacked bundle.

4) Using os.popen to unpack the bundle has a lot of advantages,
including that during debugging you can test the stuff from the command
line and feel that you completely understand which steps are working (I
think I use popen2, IIRC, and capture stderr for error messages).

Perhaps this is mostly a reflection on me as a programmer :-} but I
found the job surprisingly tricky.
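
Points 2 and 3 above can be sketched in Python for the zip case, as a rough pre-flight check along the lines of reading "unzip -l" before unpacking. The function name check_zip is hypothetical, Python 3 is assumed, and dest_dir is assumed to exist already. The sizes come from the archive's own headers, so a hostile archive could misreport them; treat the result as a first filter, not a guarantee.

```python
import os
import shutil
import zipfile


def check_zip(archive_path, dest_dir):
    """Return a list of problems spotted in the listing; empty means none found."""
    problems = []
    total_size = 0
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            # Resolve where this member would land if its name were taken literally.
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                problems.append("escapes destination: " + info.filename)
            # Uncompressed size as declared in the archive headers.
            total_size += info.file_size
    free_bytes = shutil.disk_usage(dest_root).free
    if total_size > free_bytes:
        problems.append(
            "needs %d bytes but only %d are free" % (total_size, free_bytes)
        )
    return problems
```

A caller would unpack only when the returned list is empty, and could log the entries otherwise.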

Jim

Jul 18 '05 #5

andrea crotti

> Perhaps this is mostly a reflection on me as a programmer :-} but I
> found the job surprisingly tricky.

No, I think you're right...
It's not very important for me to identify exactly what kind of file it
is; it would just be an extra feature in my little program (an organizer
that puts files in the right directories using file extensions,
categories and regular expressions).

A really good file identifier is this one (the author is a friend of mine):
http://mark0.net/soft-trid-e.html

It doesn't work with Mono yet, but maybe one day I'll try to use it
with Python...

Thanks everybody

Jul 18 '05 #6
