
How can I read a data file into a 2xN matrix using NumPy arrays?

Newbie
 
Join Date: Sep 2009
Location: Wilmington, NC
Posts: 2
#1: Sep 15 '09
I'm trying to store or arrange three sets of two-dimensional data into three 2xN matrices that are stored as NumPy arrays.

import os                                   # for file handling functions
import numpy as np                          # for array/matrix processing
import matplotlib.pyplot as plt             # for general plotting
from mpl_toolkits.mplot3d import axes3d     # for 3d plotting

wordList = []
numFiles = 0
numDataSets = numVectors = sizeVector = 0

print "\n\nBeginning data processing."
# traverse filePath folder
filePath = "Train"
for file in os.listdir(filePath):
    numDataSets += 1                # one data set per file
    print "Loading the file: " + file + "."
    tempFile = open(filePath + "/" + file, 'rU')
    for line in tempFile:
        numVectors += 1             # one vector per line
        for word in line.split():
            sizeVector += 1         # components in vector
            wordList.append(word)

# Turn the running totals into per-vector and per-file counts.
sizeVector = sizeVector / numVectors
numVectors = numVectors / numDataSets

# print "\nHere's all the data:\n", wordList, "\n\n"
print "numDataSets: ", numDataSets
print "numVectors: ", numVectors
print "sizeVector: ", sizeVector
x = eval(wordList[5])

# Structure our arrays to hold our data.
# We'll use the three sample class data sets from the text.
# sizeVector rows and numVectors columns for column vector data.
# In other words, each feature vector will be a column vector in its set's array.
w1 = np.zeros((sizeVector, numVectors))
w2 = np.zeros((sizeVector, numVectors))
w3 = np.zeros((sizeVector, numVectors))

# Load matrix array with class w1 numeric values,
# replacing elements indexed as in matrices (row, col), starting from 0.
# Pull elements from wordList based on dataSet, sizeVector, and
# which vector we're on during the column, row iteration.

skip = numVectors * sizeVector      # number of words per data set

dataSet = 0
for j in range(sizeVector):
    for i in range(numVectors):
        pass    # (do something - nothing I've tried is working)
print "\nClass w1 is as follows: \n", w1

dataSet = 1
for j in range(sizeVector):
    for i in range(numVectors):
        pass    # (do something - nothing I've tried is working)
print "\nClass w2 is as follows: \n", w2

dataSet = 2
for j in range(sizeVector):
    for i in range(numVectors):
        pass    # (do something - nothing I've tried is working)
print "\nClass w3 is as follows: \n", w3
I need advice on how to get my dataSet(s) set up in this 2xN format.
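
For what it's worth, if each file holds one whitespace-separated vector per line, np.loadtxt can parse a whole file in one call, and .T turns the N x 2 result into the 2 x N layout. The following is only a minimal sketch in Python 3 syntax: the sample text and the helper name load_columns are made up for illustration, and io.StringIO stands in for a real file from the Train folder.

```python
import io
import numpy as np

def load_columns(source):
    """Read whitespace-separated rows and return them as column vectors.

    `source` may be a filename or an open file-like object.
    (Hypothetical helper, not part of the original post.)
    """
    data = np.loadtxt(source, ndmin=2)   # shape (numVectors, sizeVector)
    return data.T                        # shape (sizeVector, numVectors)

# Stand-in for one data file: three 2-component vectors, one per line.
sample = io.StringIO("1.0 2.0\n3.0 4.0\n5.0 6.0\n")
w1 = load_columns(sample)
print(w1.shape)
print(w1)
```

Calling the helper once per file would give one array per class, without building wordList at all.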
Member
 
Join Date: Nov 2008
Posts: 49
#2: Sep 24 '09

Hi

This is a bit difficult to engage with because of the level of detail. You might also get better responses if you describe more generally what you're trying to achieve and why.

But, if I understand it, you have a list called wordList with (3 x numVectors x sizeVector) elements, and you want to turn it into 3 separate NumPy arrays, each with 2 columns and (numVectors x sizeVector)/2 rows. Or perhaps into arrays with sizeVector rows and numVectors columns, which is how w1, w2 and w3 are allocated in your code.
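
The regrouping described above can also be done without explicit loops. This is a sketch in Python 3 syntax, assuming wordList was filled file by file and line by line as in the original post, and that every file holds the same number of vectors. The values below are invented for illustration.

```python
import numpy as np

# Invented stand-in for wordList: 3 data sets, 4 vectors each, 2 components each.
numDataSets, numVectors, sizeVector = 3, 4, 2
wordList = [str(float(k)) for k in range(numDataSets * numVectors * sizeVector)]

# Convert the strings to floats and group them as (data set, vector, component).
values = np.array([float(word) for word in wordList])
data = values.reshape(numDataSets, numVectors, sizeVector)

# Transpose each data set so that every feature vector becomes a column.
w1, w2, w3 = (block.T for block in data)

print(w1.shape)   # (2, 4)
print(w1)
```

Note that os.listdir does not promise any particular file order, so which file ends up as w1 depends on the order in which the files were read.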

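So the key lines might be something like the following, with i running over the vectors, j over the components, and skip = numVectors*sizeVector as in your code. Each line of a file puts sizeVector consecutive words into wordList, so vector i starts at i*sizeVector:
w1[j,i]=float(wordList[i*sizeVector+j])
w2[j,i]=float(wordList[i*sizeVector+j+skip])
w3[j,i]=float(wordList[i*sizeVector+j+2*skip])

or maybe w1[j][i]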

I haven't had a chance to check it yet. Is something like this what you've tried?
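
To make the index arithmetic concrete, here is a sketch of a nested-loop fill in Python 3 syntax. It assumes each line of a file contributes sizeVector consecutive entries to wordList, so component j of vector i in data set d sits at index d*skip + i*sizeVector + j. The helper name fill_columns and the sample data are hypothetical.

```python
import numpy as np

def fill_columns(wordList, dataSet, numVectors, sizeVector):
    """Return a (sizeVector, numVectors) array for one data set.

    (Hypothetical helper; assumes wordList is in file-then-line reading order.)
    """
    skip = numVectors * sizeVector          # number of words per data set
    w = np.zeros((sizeVector, numVectors))
    for j in range(sizeVector):             # row: component within a vector
        for i in range(numVectors):         # column: which vector
            w[j, i] = float(wordList[dataSet * skip + i * sizeVector + j])
    return w

# Invented data: 3 data sets of 4 two-component vectors, stored as strings.
wordList = [str(k) for k in range(24)]
w1 = fill_columns(wordList, 0, 4, 2)
w2 = fill_columns(wordList, 1, 4, 2)
w3 = fill_columns(wordList, 2, 4, 2)
print(w2)
```

The explicit float() call matters because wordList holds strings straight from line.split().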