Bytes | Software Development & Data Engineering Community

Pandas: Merging Sorted Dataframes

Hi,

I have a large array (Nx4, >10 GB) that I need to sort by column 2.

I am reading the data in chunks and sorting each chunk with Pandas, but I am unable to combine the sorted chunks into a final Nx4 array that is sorted on column 2. Here is what I have tried so far:

    import pandas as pd

    chunks = pd.read_csv(ifile[0], chunksize=50000, skiprows=0,
                         names=['col-1', 'col-2', 'col-3', 'col-4'])

    for df in chunks:
        df = df.sort_values(by='col-2', kind='mergesort')  # each chunk is sorted on its own
        print(df)
Aug 12 '20 #1
SioSio
To read the file in chunks and then sort the combined result, proceed as follows.

    import pandas as pd

    # Read the file chunk by chunk and concatenate the chunks into one DataFrame.
    # DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
    # Note: the full dataset ends up in memory at this point.
    chunks = pd.read_csv(ifile[0], chunksize=50000,
                         names=['col-1', 'col-2', 'col-3', 'col-4'])
    df = pd.concat(chunks, ignore_index=True)

    # Sort the combined DataFrame on col-2.
    df_s = df.sort_values(by='col-2', kind='mergesort')
    print(df_s)
Aug 18 '20 #2
madankarmukta
Follow the standalone syntax for sort_values.

Thanks
Aug 26 '20 #3
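
If the combined data does not fit in memory, one option not covered in the replies above is an external merge sort: sort each chunk, write it to a temporary file, then merge the sorted files row by row. The sketch below is only an illustration under stated assumptions. The function name external_sort, the temporary file layout, and the output path are hypothetical, and it assumes a headerless four-column CSV whose col-2 is numeric with no missing values.

```python
import csv
import heapq
import os
import tempfile

import pandas as pd

COLUMNS = ['col-1', 'col-2', 'col-3', 'col-4']


def external_sort(in_path, out_path, sort_col='col-2', chunksize=50000):
    """Sort a headerless CSV by sort_col without loading it all at once."""
    tmp_dir = tempfile.mkdtemp()
    run_paths = []

    # Pass 1: sort each chunk in memory and write it out as a sorted run.
    reader = pd.read_csv(in_path, chunksize=chunksize, names=COLUMNS)
    for i, chunk in enumerate(reader):
        run_path = os.path.join(tmp_dir, f'run_{i}.csv')
        chunk.sort_values(by=sort_col, kind='mergesort').to_csv(
            run_path, index=False, header=False)
        run_paths.append(run_path)

    # Pass 2: k-way merge of the runs; only one row per run is held in memory.
    key_idx = COLUMNS.index(sort_col)
    files = [open(p, newline='') for p in run_paths]
    try:
        readers = [csv.reader(f) for f in files]
        with open(out_path, 'w', newline='') as out:
            writer = csv.writer(out)
            # Assumption: the sort column is numeric; change the key otherwise.
            merged = heapq.merge(*readers, key=lambda row: float(row[key_idx]))
            writer.writerows(merged)
    finally:
        # Close and delete the temporary runs whether or not the merge succeeded.
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
        os.rmdir(tmp_dir)
```

A call such as external_sort(ifile[0], 'sorted.csv') would write the sorted rows to sorted.csv (a hypothetical output name). Every run stays open during the merge, so for a very large input a bigger chunksize may be needed to keep the number of runs below the operating system's open-file limit.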
