Re: extract from zip files..

Hello, I am quite new to Python and have been experimenting with the zipfile module, since I deal with a lot of zip files every day and want to automate my work. My question is: how do I retain the date and time stamp of the files that I have unzipped from the zip file? I already know how to unzip, but a new timestamp is written (the time and day of unzipping). It is important that I retain the original one from the zipfile. Can someone help me out? Thanks in advance. :) I've been trying to solve this myself, but the solution keeps evading me.

import os
import zipfile

def unzip(zip_path, dst, name):
    # Extract every file in zip_path into the folder dst + name
    archive = zipfile.ZipFile(zip_path, 'r')
    dst = dst + name
    os.makedirs(dst)
    for filename in archive.namelist():
        data = archive.read(filename)    # read each member only once
        print('Unzipping file: %s with %d bytes..' % (filename, len(data)))
        out = open(os.path.join(dst, filename), 'wb')
        out.write(data)
        out.close()
    archive.close()    # close is a method, so it needs the parentheses

By the way, I'm using Windows XP with Python 2.4.4.
Feb 23 '07 #1
ghostdog74
511 Expert 256MB
How about giving the tarfile module a try? Tarring before zipping can keep timestamps.
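A minimal sketch of the tar suggestion, assuming Python 3 and the standard tarfile module (the thread itself targets Python 2.4, so treat this as illustrative only). Unlike plain ZipFile extraction, tarfile applies each member's stored modification time when it extracts. The file names and the helper tar_roundtrip_mtime are made up for the demonstration.

```python
import os
import tarfile
import tempfile
import time


def tar_roundtrip_mtime():
    """Archive one file with tarfile, extract it, return (original, restored) mtimes."""
    work = tempfile.mkdtemp()
    src = os.path.join(work, "report.txt")  # hypothetical file name
    with open(src, "w") as fh:
        fh.write("hello")

    # Give the file an old modification time so the effect is visible.
    original = time.mktime((2007, 2, 23, 12, 0, 0, 0, 0, -1))
    os.utime(src, (original, original))

    archive = os.path.join(work, "backup.tar")
    with tarfile.open(archive, "w") as tar:
        tar.add(src, arcname="report.txt")

    out_dir = os.path.join(work, "out")
    os.makedirs(out_dir)
    with tarfile.open(archive, "r") as tar:
        tar.extractall(out_dir)  # tarfile restores the stored mtime itself

    restored = os.path.getmtime(os.path.join(out_dir, "report.txt"))
    return original, restored
```

Calling tar_roundtrip_mtime() should give two timestamps that match to within a second.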
Feb 23 '07 #2
Motoma
3,237 Expert 2GB
Welcome to theScripts.
I believe you can set these timestamps using the os.utime() function.
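The utime idea can be sketched roughly as follows. This is Python 3, not the Python 2.4 used in the thread, and extract_keep_mtime and demo_extract are hypothetical names. It relies on ZipInfo.date_time being a six-item (year, month, day, hour, minute, second) tuple, which time.mktime can convert once three padding fields are appended.

```python
import os
import tempfile
import time
import zipfile


def extract_keep_mtime(zip_path, dest_dir):
    """Extract zip_path into dest_dir, then reset each file's mtime from the archive."""
    restored = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            target = archive.extract(info, dest_dir)
            if info.is_dir():
                continue
            # date_time is (year, month, day, hour, minute, second) in local time;
            # the trailing -1 lets mktime decide whether DST applies.
            mtime = time.mktime(info.date_time + (0, 0, -1))
            os.utime(target, (mtime, mtime))
            restored.append(target)
    return restored


def demo_extract():
    """Build a throwaway archive with a known date and extract it."""
    work = tempfile.mkdtemp()
    zip_path = os.path.join(work, "test.zip")
    stamp = (2007, 2, 23, 10, 30, 0)
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr(zipfile.ZipInfo("docs/note.txt", date_time=stamp), "hello")
    (target,) = extract_keep_mtime(zip_path, os.path.join(work, "out"))
    return stamp, target
```

Note that zip entries store times with two-second resolution and no time zone, so the restored value is only as precise as the archive allows.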
Feb 23 '07 #3
Thanks guys, I'll try tar as suggested. Right now I'm using the zipfile module when I write the zip file, and the timestamp is well preserved. I'm confused about the extract part though, because I want to retain the timestamp there too. I have tried utime, but I'm having a hard time turning the values I got from getinfo into an mtime. I guess I'll keep at it a bit more till I get it, but if I don't, I'll be back. Thanks a bunch again. :)

Correct me if I'm wrong: I can use any time value for atime when I use utime, right?
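On the atime question: os.utime takes an (atime, mtime) pair and the two values are independent, so any valid timestamp works for atime. The current time, or the same value as mtime, are common choices. A small Python 3 sketch (set_times and demo_set_times are hypothetical helper names):

```python
import os
import tempfile
import time


def set_times(path, mtime, atime=None):
    """Set path's modification time; atime defaults to the current time."""
    if atime is None:
        atime = time.time()
    os.utime(path, (atime, mtime))
    return os.stat(path).st_mtime


def demo_set_times():
    """Create an empty temp file and give it a 2007 modification time."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    wanted = time.mktime((2007, 2, 23, 10, 30, 0, 0, 0, -1))
    return wanted, set_times(path, wanted)
```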

EDIT: By the way, it's my first time in this forum and I am impressed. I hope to find a home here. Thanks, Motoma. :)
Feb 23 '07 #4
bartonc
6,596 Expert 4TB
thanks guys.. by the way,its my 1st time in this forum and i am impressed..i hope to find
a home here.. :)
Definitely! Keep posting,
Barton
Feb 23 '07 #5
Woah, I finally got it! Thanks for the tips, guys. I just used the zipfile module without tarring, and I can unzip files with folders nested within folders to any depth. Python is so amazing! I'm shy about posting my code here just yet since I'm a newbie, but if anyone wants it, I'd be happy to. Thanks again. :)
Feb 24 '07 #6
bartonc
6,596 Expert 4TB
The whole community benefits from your posts. (Just imagine the next person in the position that you were in before you posted.) So yes, definitely post, for everybody's sake. Thanks.
Feb 24 '07 #7
Here is my script; any comments are appreciated. I personally think it's too long, but it does what I need it to do. I can't retain folder timestamps though, and I think it's not possible, since Windows and WinZip (the ones I've tried) also write new timestamps to folders when unzipping from the zipfile. If you guys know how to do it, please let me know. :) Any constructive comments and suggestions for this script will be appreciated; it will add to my programming skills. Thanks in advance, guys.

PS: I'm trying something new right now: how do I zip all files and folders (recursively) inside a folder into a zip file? I'll search around the forums some more, but if anyone can help me out, you are heaven-sent. Thanks!

import os
import time
import zipfile

## Zipfile to be unzipped
zips = 'test.zip'
## Destination directory
dst = ''
## Specific folder in destination
name = 'test'

archive = zipfile.ZipFile(zips, 'r')
dst = os.path.join(dst, name)
for info in archive.infolist():
    target = os.path.normpath(os.path.join(dst, info.filename))
    is_folder = info.filename.endswith('/')
    ## Folder entries are created as-is; files need their parent folder
    if is_folder:
        folder = target
    else:
        folder = os.path.dirname(target)
    if not os.path.exists(folder):
        os.makedirs(folder)
    if is_folder:
        continue
    data = archive.read(info.filename)    ## read each member only once
    print('Unzipping file: %s with %d bytes..' % (info.filename, len(data)))
    out = open(target, 'wb')
    out.write(data)
    out.close()
    ## date_time is (year, month, day, hour, minute, second); the trailing
    ## -1 lets mktime work out daylight saving time for that date
    modifiedtime = time.mktime(info.date_time + (0, 0, -1))
    accesstime = time.time()
    os.utime(target, (accesstime, modifiedtime))
archive.close()    ## close is a method, so it needs the parentheses
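On the folder timestamps: one approach that may work is to set them in a second pass, after every file has been written, since creating a file inside a folder updates that folder's modification time. The sketch below is Python 3 and assumes the archive stores explicit directory entries (names ending in '/'), which not every zip tool writes. restore_dir_mtimes and demo_dirs are hypothetical names, and the behaviour on Windows XP with Python 2.4 has not been checked here.

```python
import os
import tempfile
import time
import zipfile


def restore_dir_mtimes(zip_path, dest_dir):
    """Apply the archive's directory timestamps to already-extracted folders."""
    touched = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if not info.filename.endswith("/"):
                continue  # only directory entries are of interest here
            parts = info.filename.rstrip("/").split("/")
            path = os.path.join(dest_dir, *parts)
            if os.path.isdir(path):
                mtime = time.mktime(info.date_time + (0, 0, -1))
                os.utime(path, (mtime, mtime))
                touched.append(path)
    return touched


def demo_dirs():
    """Build a small archive with a dated folder entry and extract it."""
    work = tempfile.mkdtemp()
    zip_path = os.path.join(work, "test.zip")
    stamp = (2007, 2, 20, 8, 0, 0)
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr(zipfile.ZipInfo("sub/", date_time=stamp), "")
        archive.writestr("sub/a.txt", "data")
    out = os.path.join(work, "out")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out)           # write all files first
    restore_dir_mtimes(zip_path, out)     # then fix the folder times
    return stamp, os.path.join(out, "sub")
```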
Feb 24 '07 #8
bvdet
2,851 Expert Mod 2GB
This function returns a single newline-joined string of all the files found under a given directory. In this case it skips files that do not have a 'py' extension:

import os

# Joined string method

def dir_list2(dir_name, *args):
    # Return the paths of all files under dir_name, one per line.
    # If extensions are given in args (without the dot), keep only those.
    fileList = []
    for name in os.listdir(dir_name):
        dirfile = os.path.join(dir_name, name)
        if os.path.isfile(dirfile):
            if len(args) == 0 or os.path.splitext(dirfile)[1][1:] in args:
                fileList.append('%s\n' % dirfile)
        elif os.path.isdir(dirfile):
            print("Accessing directory: %s" % dirfile)
            # The recursive call returns an already-joined string
            fileList.append(dir_list2(dirfile, *args))
    return "".join(fileList)

if __name__ == '__main__':
    dir_name = os.path.join('C:\\', 'SDS2_7.0', 'macro')
    print(dir_list2(dir_name, 'py'))

Building one string and printing it with a single call is usually faster than looping over a file list and printing each name.
Feb 24 '07 #9
bvdet
2,851 Expert Mod 2GB
crashonyou,

Here is another means of compiling a list of file names using os.walk:
import os

dir_name = os.path.join('X:/', 'dir1', 'dir2')

# os.walk yields (root, subdirectory names, file names) for every folder
for root, dirs, files in os.walk(dir_name):

    print("Root directory: %s" % root)

    if len(dirs) > 0:
        print("Subdirectories under %s:" % root)
        print("".join(['%s\n' % d for d in dirs]))
    else:
        print("There are no subdirectories under directory %s" % root)

    if len(files) > 0:
        print("Files in directory %s:" % root)
        print("".join(['%s\n' % os.path.join(root, f) for f in files]))
    else:
        print("There are no files in directory %s" % root)

You can use a variation of the above to build a list of file names to pass to:

import zipfile, os

def makeArchive(fileList, archive):
    # Write every file in fileList to a new zip; return True on success
    try:
        # ZipFile will accept a file name or file object
        a = zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED)
        try:
            for f in fileList:
                print("archiving file %s" % f)
                a.write(f)    # Use (f, os.path.basename(f)) if not saving full path
        finally:
            a.close()
        return True
    except (IOError, OSError):
        return False

if __name__ == '__main__':

    arcfile_name = 'H:/TEMP/temsys/zipped_files.zip'
    fileList = []    # placeholder: replace with your own list of file paths

    if makeArchive(fileList, arcfile_name):
        print("%s was created.\n" % arcfile_name)
        # check the new archive file
        f = zipfile.ZipFile(arcfile_name, 'r')
        for info in f.infolist():
            print("%s %s %s %s" % (info.filename, info.date_time,
                                   info.file_size, info.compress_size))
        f.close()
    else:
        print("There was an error")

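For the recursive-zip question, the os.walk listing and makeArchive can be combined into one step. This is a rough Python 3 sketch rather than the code from the post; zip_tree and demo_zip_tree are hypothetical names, and the arcname argument of ZipFile.write is used to store paths relative to the source folder.

```python
import os
import tempfile
import zipfile


def zip_tree(src_dir, zip_path):
    """Zip every file under src_dir, storing paths relative to src_dir."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # arcname keeps the folder layout but drops the leading src_dir
                archive.write(full, os.path.relpath(full, src_dir))
        return sorted(archive.namelist())


def demo_zip_tree():
    """Create a two-file tree in a temp folder and zip it."""
    work = tempfile.mkdtemp()
    src = os.path.join(work, "project")
    os.makedirs(os.path.join(src, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(src, rel), "w") as fh:
            fh.write(rel)
    return zip_tree(src, os.path.join(work, "project.zip"))
```

Keep zip_path outside src_dir, otherwise the walk may pick up the archive while it is being written. Empty folders are not stored by this sketch. ZipFile.write records each file's modification time in its entry, which is why timestamps look preserved on the writing side.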
HTH :),
BV
Feb 24 '07 #10
Woah, thanks for that one. I found it really useful. I'll be modifying it to suit my needs. :)
Feb 28 '07 #11
heiro
56
Sugs,

Good thing you posted this. Now I won't have a hard time searching... heheheh
Oct 27 '07 #12