Bytes IT Community

speed up data extraction from large arrays

P: 2

I have three arrays holding variable values from a netCDF file. xy contains the index vectors for the lon and lat coordinates. I am extracting the coordinates from the two arrays (lon, lat) according to the vectors in the xy array. So far I get the correct results, but the extraction process (the for-loop) is extremely slow!

Here's my code:

    from netCDF4 import Dataset
    import scipy as scp
    import numpy as np

    # open netCDF file and create a dataset
    rootfile = ""
    rootgrp = Dataset(rootfile, "r")

    # read variable values into arrays
    xy = rootgrp.variables["zipxy"]
    lon = rootgrp.variables["lon"]
    lat = rootgrp.variables["lat"]

    # create a new coordinates array
    coordinates = np.array([[],[]])

    # loop through the xy array and write the corresponding lon & lat coordinates
    # into the coordinates array
    # NOTE: the vectors in xy begin with [1,1] BUT the index of the values
    # in lon & lat begins with [0,0] -> therefore: y-1, x-1
    for x, y in xy:
        coordinates = np.append(coordinates,[[lon[y-1,x-1]],[lat[y-1,x-1]]],1)

The dimensions are:
xaxis = 1320
yaxis = 1482
zip2 = 1080236

And the shape of the variables:
zipxy('zip2', 'two')
lon('yaxis', 'xaxis')
lat('yaxis', 'xaxis')

There must be a way to speed up the coordinate extraction (maybe I need a different method of searching through the arrays...). So far, I haven't found a solution.

Thanks for any help!
Mar 23 '11 #1
3 Replies

P: 332
You should read "Python Patterns - An Optimization Anecdote". It should give you some ideas.
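
One way to apply the general idea of that essay (moving the loop out of Python and into compiled code) might be NumPy's integer-array indexing. The sketch below is only an illustration under assumptions: it uses small synthetic arrays in place of the netCDF variables, and the helper name `extract_coordinates` is made up here. With the real file, the variables would presumably be read into memory first, e.g. with `xy[:]`, `lon[:]` and `lat[:]`.

```python
import numpy as np

def extract_coordinates(xy, lon, lat):
    """Return a 2 x N array: row 0 holds lon values, row 1 holds lat values.

    xy  : (N, 2) integer array of 1-based (x, y) index pairs
    lon : (ny, nx) array
    lat : (ny, nx) array
    (hypothetical helper, not part of the original post)
    """
    xy = np.asarray(xy)
    # Convert the 1-based indices from xy to 0-based array indices.
    cols = xy[:, 0] - 1
    rows = xy[:, 1] - 1
    # Integer-array indexing selects all N elements in a single call,
    # so no Python-level loop and no repeated np.append is needed.
    return np.vstack((lon[rows, cols], lat[rows, cols]))

# Synthetic stand-ins for the netCDF variables (assumption: the real
# ones would be loaded into memory first, e.g. lon[:] and lat[:]).
ny, nx = 4, 5
lon = np.arange(ny * nx, dtype=float).reshape(ny, nx)
lat = lon + 100.0
xy = np.array([[1, 1], [5, 4], [2, 3]])

coordinates = extract_coordinates(xy, lon, lat)
```

The result has the same layout as the `coordinates` array built in the question (lon in the first row, lat in the second), so the rest of a script should not need to change.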
Mar 23 '11 #2

Expert 100+
P: 622
A list/array is slow. Consider using a set or dictionary, as they are hashed. If you just want all of the coordinates, a set made up of tuples such as (x, y) or ([lat_x, lat_y], [lon_x, lon_y]) will work fine. The lookups are probably what takes the time here, so time the read/create-arrays step and the for loop separately so you know where the problem is.

    # for x, y in xy:
    #     coordinates = np.append(coordinates,[[lon[y-1,x-1]],[lat[y-1,x-1]]],1)
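
Timing the two steps separately could look roughly like this. It is a sketch under assumptions rather than a measurement of the real data: `make_arrays` generates small synthetic arrays in place of the netCDF reads, and the helper names `timed`, `make_arrays` and `append_loop` are hypothetical.

```python
import time
import numpy as np

def timed(label, func):
    """Run func(), print the elapsed wall-clock time, return (result, seconds)."""
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.4f} s")
    return result, elapsed

def make_arrays(ny=50, nx=40, n=2000):
    """Stand-in for reading the netCDF variables (synthetic data only)."""
    rng = np.random.default_rng(0)
    lon = rng.random((ny, nx))
    lat = rng.random((ny, nx))
    # 1-based (x, y) index pairs, as described in the question
    xy = np.column_stack((rng.integers(1, nx + 1, n),
                          rng.integers(1, ny + 1, n)))
    return xy, lon, lat

def append_loop(xy, lon, lat):
    """The extraction loop from the question, wrapped in a function."""
    coordinates = np.array([[], []])
    for x, y in xy:
        coordinates = np.append(
            coordinates, [[lon[y - 1, x - 1]], [lat[y - 1, x - 1]]], 1)
    return coordinates

# Time each step on its own to see which one dominates.
(xy, lon, lat), t_read = timed("create arrays", make_arrays)
coordinates, t_loop = timed("append loop", lambda: append_loop(xy, lon, lat))
```

With the real file, `make_arrays` would be swapped for the `Dataset` open and the variable reads. Note that `np.append` returns a new array on every call, so the loop's cost is likely to grow faster than linearly with the number of entries in xy.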
Mar 23 '11 #3

P: 2
Thanks a lot for both answers!
I will try a few approaches and post the results.
Mar 24 '11 #4
