
# speed up data extraction from large arrays

 P: 2

hi, I have three arrays with variable values from a netCDF file. xy contains the vectors for the lon and lat coordinates. I am extracting the coordinates out of the two arrays (lon, lat) according to the vectors in the xy array. So far, I get the correct results, but the speed of the extraction process (for-loop) is extremely poor!! Here's my code:

```python
from netCDF4 import Dataset
import scipy as scp
import numpy as np

# open netCDF file and create a dataset
rootfile = "myfile.nc"
rootgrp = Dataset(rootfile, "r")

# read variable values into arrays
xy = rootgrp.variables["zipxy"]
lon = rootgrp.variables["lon"]
lat = rootgrp.variables["lat"]

# create a new coordinates array
coordinates = np.array([[],[]])

# loop through the xy array and write the corresponding lon & lat coordinates
# into the coordinates array
# NOTE: the vectors in xy begin with [1,1] BUT the index of the values
# in lon & lat begins with [0,0] -> therefore: y-1, x-1
for x, y in xy:
    coordinates = np.append(coordinates,[[lon[y-1,x-1]],[lat[y-1,x-1]]],1)
```

The dimensions are: xaxis = 1320, yaxis = 1482, zip2 = 1080236.

And the shape of the variables: zipxy('zip2', 'two'), lon('yaxis', 'xaxis'), lat('yaxis', 'xaxis').

There must be a way to speed up the coordinate extraction (maybe I need another searching method through the arrays...). Until now, I couldn't find any solution. Thanks for help!

Mar 23 '11 #1

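
One possible direction for the loop described above, as a minimal sketch rather than a tested fix for this file: `np.append` copies the whole array on every call, so the loop does more and more work as `coordinates` grows. NumPy integer-array indexing can fetch all values in a single call. The small synthetic `lon`, `lat` and `xy` arrays and the names `rows` and `cols` are assumptions for illustration. With the real file, the variables would presumably need to be read into memory first (for example `lon = rootgrp.variables["lon"][:]`), since netCDF4 variables do not index exactly like NumPy arrays.

```python
import numpy as np

# Synthetic stand-ins for the netCDF variables (shapes are illustrative only).
ny, nx = 4, 5
lon = np.arange(ny * nx, dtype=float).reshape(ny, nx)
lat = lon + 100.0

# 1-based (x, y) pairs, mimicking the layout of the post's zipxy variable.
xy = np.array([[1, 1], [5, 4], [3, 2]])

# Convert to 0-based indices, as the y-1, x-1 in the original loop does.
cols = xy[:, 0] - 1
rows = xy[:, 1] - 1

# Fetch every lon/lat value in one indexing call; the result has shape (2, N),
# the same layout the np.append loop builds up column by column.
coordinates = np.vstack((lon[rows, cols], lat[rows, cols]))
```

The result keeps the original orientation: row 0 holds the lon values, row 1 the lat values.
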
3 Replies

 100+ P: 332 You should have a read of Python Patterns - An Optimization Anecdote. That should inspire you. Mar 23 '11 #2

 Expert 100+ P: 622

A list/array is slow. Consider using a set or dictionary as they are hashed. If you just want all of the coordinates, a set made up of tuples=(x, y) or ([lat_x, lat_y], [lon_x, lon_y]) will work fine. Your time consumer is probably the lookups here, so time the read/create arrays step and the for loop separately so you know where the problem is.

```python
# for x, y in xy:
#     coordinates = np.append(coordinates,[[lon[y-1,x-1]],[lat[y-1,x-1]]],1)
```

Mar 23 '11 #3
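
The advice to time the phases separately could be sketched roughly as follows. This is only an illustration on random data, not a measurement of the original file: the helper `timed`, the functions `make_arrays`, `lookups_only` and `append_loop`, and the array sizes are all hypothetical. It times array creation, the bare lookups, and the full `np.append` loop, so the cost of the lookups can be compared with the cost of appending.

```python
import time
import numpy as np

def timed(func):
    """Run func once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start

def make_arrays(ny=50, nx=40, n=2000):
    # Random stand-ins for lon, lat and the 1-based (x, y) index pairs.
    rng = np.random.default_rng(0)
    lon = rng.random((ny, nx))
    lat = rng.random((ny, nx))
    xy = np.column_stack((rng.integers(1, nx + 1, n),
                          rng.integers(1, ny + 1, n)))
    return lon, lat, xy

(lon, lat, xy), t_create = timed(make_arrays)

def lookups_only():
    # Same element lookups as the original loop, without any appending.
    total = 0.0
    for x, y in xy:
        total += lon[y - 1, x - 1] + lat[y - 1, x - 1]
    return total

def append_loop():
    # The loop from the question: lookups plus np.append on every pass.
    coords = np.array([[], []])
    for x, y in xy:
        coords = np.append(coords, [[lon[y - 1, x - 1]], [lat[y - 1, x - 1]]], 1)
    return coords

_, t_lookup = timed(lookups_only)
coords, t_append = timed(append_loop)

print(f"create arrays: {t_create:.4f} s")
print(f"lookups only:  {t_lookup:.4f} s")
print(f"append loop:   {t_append:.4f} s")
```

If the append loop takes much longer than the lookups alone, the repeated array copying is the more likely bottleneck than the lookups themselves.
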

 P: 2 Thanks a lot for both answers! I will try some possibilities and post the results. Mar 24 '11 #4