
Possible to have different datatypes among columns of an array?

Hi Everyone,

I create an extendable EArray with N rows and 4 columns. Some columns need the float64 data type; the others fit in int32. Is it possible to use different data types for different columns? Right now I use a single type (float64, below) for all of them, but that takes a huge amount of disk space (files >10 GB).

For example, how can I make sure the elements of columns 1-2 are int32 and the elements of columns 3-4 are float64?

    a = f1.create_earray(f1.root, "dataset_1", atom=tables.Float64Atom(), shape=(0, 4))
Aug 19 '20 #1
2 Replies

When reading the data with pandas "read_csv()", pass a dictionary that maps column names to types as the "dtype" argument.
    import pandas as pd
    df = pd.read_csv('Nx4.csv', dtype={'col1': 'int64', 'col2': 'int64', 'col3': 'float64', 'col4': 'float64'})
Or change the data type of the columns in pandas with "astype()" and write the result to a file.
    # Every key must be an existing column name, otherwise astype() raises a KeyError
    df_ = df.astype({'col1': 'int8', 'col2': 'int8', 'col3': 'float64', 'col4': 'float64'})
    df_.to_csv('output.csv')
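To get a feel for what per-column types save, here is a rough sketch that compares the in-memory size of an all-float64 frame with a mixed one. It uses int32 for the first two columns, as in the question; the row count and the zero-filled data are made up for illustration. Keep in mind that a CSV file stores text, so the column types mainly matter in memory and in binary formats such as HDF5.

```python
import numpy as np
import pandas as pd

n_rows = 1000  # illustrative size only

# Four float64 columns, like the single-type array in the question.
df = pd.DataFrame(np.zeros((n_rows, 4)), columns=["col1", "col2", "col3", "col4"])

# Convert the first two columns to int32 and leave the rest as float64.
mixed = df.astype({"col1": "int32", "col2": "int32"})

bytes_before = int(df.memory_usage(index=False).sum())    # 4 columns * 8 bytes * n_rows
bytes_after = int(mixed.memory_usage(index=False).sum())  # (4 + 4 + 8 + 8) bytes * n_rows

print(bytes_before, bytes_after)  # 32000 24000
```

That is 24 bytes per row instead of 32, so roughly a quarter less before any compression.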
If the data is too large, you can compress it and write it out.
    df_.to_csv('output.csv.gz', compression='gzip')
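If you would rather stay in PyTables, as in the original question, one possible route is a Table instead of an EArray, since a Table can have a different type per column. What follows is only a minimal sketch, assuming a NumPy structured dtype is used as the table description; the column names col1 to col4, the file name mixed.h5, and the sample values are made up for illustration.

```python
import os
import tempfile

import numpy as np
import tables

# Hypothetical column names: int32 for columns 1-2, float64 for columns 3-4.
row_dtype = np.dtype([
    ("col1", np.int32),
    ("col2", np.int32),
    ("col3", np.float64),
    ("col4", np.float64),
])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "mixed.h5")  # illustrative file name

    with tables.open_file(path, mode="w") as f1:
        # A Table takes a per-column description; an EArray takes a single atom.
        table = f1.create_table(f1.root, "dataset_1", description=row_dtype)

        # Rows can be appended chunk by chunk, so the table grows like an EArray.
        chunk = np.zeros(3, dtype=row_dtype)
        chunk["col1"] = [1, 2, 3]
        chunk["col3"] = [0.5, 1.5, 2.5]
        table.append(chunk)
        table.flush()

        n_rows = table.nrows
        data = table.read()  # NumPy structured array holding all rows

print(n_rows, row_dtype.itemsize)
```

Each row then takes row_dtype.itemsize bytes instead of the 32 bytes of four float64 values. Compression can be layered on top by passing something like filters=tables.Filters(complevel=5, complib="zlib") to create_table(); how much it helps depends on the data.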
Aug 20 '20 #2

@RockRoll, can you please elaborate on what you want to achieve?
Aug 25 '20 #3
