472,780 Members | 1,265 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,780 software developers and data experts.

Best multi-thread sort for use with C++11 ?

SwissProgrammer
220 128KB
If I have 2 cores, or 200 cores, etc. that I can use for my program and I have a huge amount of data to be sorted that is not being used via Access or any other common database, and if I want to sort it fast, I was thinking of maybe using threads for partial sorts, or something like that. Maybe sort 10% of the data with each of 10 different cpu cores, then step over in the data 5% of the total of the data and sort from there in 10% sections, maybe. Do this until no changes are detected. If I have so much data that to do each of these individual 10% sorts takes a few minutes in each thread using its own cpu core, then maybe there is a better way. I am looking for usable opinions on this.

Specifically, what is the best type of sort to use for this? Someone else's pre-made software is not what I am looking for with this.

In C++ up to and including C++11.

Thank you.
Nov 17 '20 #1

✓ answered by Banfa

This paper appears to specifically address use of parallelism (threads) in sorting algorithms and I think the conclusions could be summerised as
  1. Make sure you use the right sorting algorithm because parallelism makes little difference to some of them
  2. Don't both if you have < 2^12 samples
  3. While a few threads is better than 1 thread; lots of threads is not that much better than a few.

The paper considers Rank Sort, Merge Sort, Quicksort, Bitonic Sort, Bucket Sort and Radix Sort so there may be other algorithms out there to try (but don't bother with Bozo Sort) but this should be a good place to start.

If you don't want to read the paper then try Bitonic Sort.

2 2295
Banfa
9,065 Expert Mod 8TB
This paper appears to specifically address use of parallelism (threads) in sorting algorithms and I think the conclusions could be summerised as
  1. Make sure you use the right sorting algorithm because parallelism makes little difference to some of them
  2. Don't both if you have < 2^12 samples
  3. While a few threads is better than 1 thread; lots of threads is not that much better than a few.

The paper considers Rank Sort, Merge Sort, Quicksort, Bitonic Sort, Bucket Sort and Radix Sort so there may be other algorithms out there to try (but don't bother with Bozo Sort) but this should be a good place to start.

If you don't want to read the paper then try Bitonic Sort.
Nov 17 '20 #2
SwissProgrammer
220 128KB
Banfa,

For over 20,000,000 short data sets (example: Fred Flintstone 4321 Rock Creak)
It looks like a multi-core (multi-physical-thread) Radix Sort
should work.

For over 20,000,000 longer data sets (example: Communication Avenue: Phone/Land Line, Time Stamp: 123456789, Location Origination: USA/Maryland, VISA #1234567890123456, Order #0987654321)
It looks like a multi-core (multi-physical-thread) combination of starting with Radix Sort first (maybe 20% of data length), then Bitonic Sort to complete the process.
Thank you for that information.

For a very fast changing data load (example: a "massively multiplayer" game as defined by the originator of that term back in the early 1990's as being a minimum of 1,000 concurrent users all in the same game and all playing against each other at the same time, not sectioned off. Full interaction one to 1,000.) Especially if the data sets are long (more than 150 characters each): I think that the sort which I described in my question could be the best.
Example: sort 10% of the data with each of 10 different physical cpu cores, then step over in the data 5% of the total of the data and sort from there in 10% sections, maybe. The actual sorts could be Radix for short data and Bitonic for longer data or a combination as previously described. Do this non-stop as long as the data is very fast changing.
An addition could be to compartmentalize with multiple servers as needed.




etcetera:

In case someone is interested in my observations of the paper reported by Banfa, see the following:

"The paper concludes that an implementation of radix sort is the fastest for GPUs, by a great margin." [pdf page 4].
But, they note that this can vary based upon randomness and presortedness of the data at the time of the sort. [pdf page 5+] That makes sense.

In that paper it says,
"As the data sets grows larger, the speedup increases pending on the number of threads" [pdf page 11].
That was expected.
"Results . Multithreading does improve performance, with two threads in average providing a speedup of up to 2x and four threads up to 3x. However, the potential speedup is bound to the available physical threads of the CPU and dependent of balancing the workload." [pdf page 3].
The 2x was expected. The 3x: I reject as inconclusive due to insufficient data size. I am guessing that the load was not sufficient on the four threads to achieve results approaching 4x. But, the 2x for two threads gave me an answer.
"On a single-core machine, there is little improvement due to multithreading, since it only has one core and can only run one thread at the time." [pdf page 10]
I did not notice where they specifically stated that a multi-threaded single core was to be considered only via the actual core count. They seem to completely discount the "multi-thread" ability. It seems to be a premise. They use the term "physical threads", thus I think that it was a conditional that they are using. It is exactly a part of what I based my question upon. Thank you Banfa for finding this and reporting it.

I have not gotten far in reading this report, and now I see a potential error in their testing logic. I wanted to know about multi-core threading. They reported on what seemed to be multi-core threading, but if I read the report correctly, it is not [specifically not] multi-core threading as I was looking for. For multi-core threading, I would have used cores reported plus one. The plus one would have been for the operating system and the program being executed, without the threads being testing. Then the threads being tested would have been on further separate cores. For a two core test: use a minimum of a three core system. Operate on thread one. Test on thread two and thread three. With the constraints of keeping all other use of those threads, two and three, from being used for other purposes.

I now have finished the report and I think that I agree with them and you (both) conditionally about the sort type to use based upon the randomness of the data and the average individual data set size.

Thank you Banfa.
Nov 18 '20 #3

Sign in to post your reply or Sign up for a free account.

Similar topics

0
by: r_obert | last post by:
Hello, I'm trying to create a worker thread for my VC++ program, and was wondering whether I should be linking with the Multithread /MT or Multithread DLL /MD option? I'm not quite sure, in...
0
by: Stefan Turalski \(stic\) | last post by:
Hi, What I do is looking for holes in my solution to problem which is: 1. taking something from database by this sth ID, by some application 2. process it to this multithread enviroment of some...
4
by: zbcong | last post by:
Hello: I write a multithread c# socket server,it is a winform application,there is a richtextbox control and button,when the button is click,the server begin to listen the socket port,waiting for a...
2
by: zhebincong | last post by:
Hello: I write a multithread c# socket server,it is a winform application,there is a richtextbox control and button,when the button is click,the server begin to listen the socket port,waiting...
0
by: fred | last post by:
I need some help in trying to understand how to make myCollection (inherited from CollectionBase) multithread safe. Taking my implementation of the Add Sub and a readonly property Item. Public...
5
by: mrkbrndck | last post by:
Please see the code below as I am trying to use multithreading for copying files to a new location in a way that improves performance of the client windows application. The problem occurs when 2...
6
by: jmartin | last post by:
Hi, I have made a multithread version of a program (load a file into database), and with two processors I get the double of time in the multithread than in the process (unithread) version. I...
2
by: tikcireviva | last post by:
Hi Guys, I've done a mulithread queue implementation on stl<queue>, my developement environment is on VC6 as well as FC3. Let's talks about the win32 side. The suspected memory leak is find...
4
by: RB | last post by:
Hi there, I'm pretty new to ASP.NET, and have a "best practice" sort-of question. I'm programming for .NET 2.0, using VB.Net in Visual Studio 2002. Basically, I've got a web-page which is...
5
by: halooti | last post by:
Hello all I created a button with excel and assigned a macro to it which I recorded with excel Macro Record.. The button sorts column C ascending, this is the code Sub Macro1() ...
3
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 2 August 2023 starting at 18:00 UK time (6PM UTC+1) and finishing at about 19:15 (7.15PM) The start time is equivalent to 19:00 (7PM) in Central...
0
by: erikbower65 | last post by:
Using CodiumAI's pr-agent is simple and powerful. Follow these steps: 1. Install CodiumAI CLI: Ensure Node.js is installed, then run 'npm install -g codiumai' in the terminal. 2. Connect to...
0
linyimin
by: linyimin | last post by:
Spring Startup Analyzer generates an interactive Spring application startup report that lets you understand what contributes to the application startup time and helps to optimize it. Support for...
0
by: erikbower65 | last post by:
Here's a concise step-by-step guide for manually installing IntelliJ IDEA: 1. Download: Visit the official JetBrains website and download the IntelliJ IDEA Community or Ultimate edition based on...
0
by: Taofi | last post by:
I try to insert a new record but the error message says the number of query names and destination fields are not the same This are my field names ID, Budgeted, Actual, Status and Differences ...
14
DJRhino1175
by: DJRhino1175 | last post by:
When I run this code I get an error, its Run-time error# 424 Object required...This is my first attempt at doing something like this. I test the entire code and it worked until I added this - If...
0
by: lllomh | last post by:
How does React native implement an English player?
0
by: Mushico | last post by:
How to calculate date of retirement from date of birth
2
by: DJRhino | last post by:
Was curious if anyone else was having this same issue or not.... I was just Up/Down graded to windows 11 and now my access combo boxes are not acting right. With win 10 I could start typing...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.