New Altivec-optimized valarray implementation

I have written a new implementation of the std::valarray library that is
optimized to use Altivec (Apple's "Velocity Engine", part of the
PowerPC G4's in most Macintoshes and the announced IBM PPC 970). The
implementation is mostly standard conforming and is complete.

As soon as I get my shingle up on the web (1 or 2 days' time), I'll
post the library and its accompanying docs, which I call "MacSTL".
Would like comments, tests, discussions of it...

Preliminary benchmarks on my Power Mac G4 make it 550% faster on
inlined arithmetic, 1360% faster on inlined transcendentals and 290%
faster on summation than the gcc 3.1 std::valarray. Even the
non-Altivec-optimized inline arithmetic is 50% faster than gcc, due to
ruthless elimination of extraneous loads and stores by using STL-style
algorithms.
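
To illustrate the kind of load/store elimination meant here, the following is a minimal sketch only, not MacSTL's code (the library has not been posted yet). The function names `naive` and `fused` are hypothetical. It contrasts evaluating `a * b + c` through a named temporary with a single fused pass over the elements.

```cpp
#include <cstddef>
#include <valarray>

// Two-pass evaluation: the intermediate product is written out to a
// temporary array and then read back in for the addition.
std::valarray<float> naive(const std::valarray<float>& a,
                           const std::valarray<float>& b,
                           const std::valarray<float>& c) {
    std::valarray<float> t = a * b;  // pass 1: store every product
    std::valarray<float> r = t + c;  // pass 2: reload every product
    return r;
}

// One-pass evaluation: each input element is loaded once and each
// result element is stored once; no intermediate array exists.
// Assumes a, b and c all have the same size.
std::valarray<float> fused(const std::valarray<float>& a,
                           const std::valarray<float>& b,
                           const std::valarray<float>& c) {
    std::valarray<float> r(a.size());
    for (std::size_t i = 0; i != a.size(); ++i) {
        r[i] = a[i] * b[i] + c[i];
    }
    return r;
}
```

Both functions compute the same values; the difference is only in how much memory traffic the intermediate result generates. An implementation that arranges for whole expressions to compile down to the second form avoids the extra stores and loads.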

I sense that std::valarray is somewhat of a dead horse out there, but
I believe I can show there's still some life left in that concept. As
for Altivec, it's still up and coming!

P.S. The library also has several STL-influenced concepts for the Mac
(or BSD/PowerPC): std::vectors with Mach copy-on-write semantics,
clean COM wrappers using just std containers, zlib wrapped in std
iostreams...
Jul 19 '05 #1

"Chandra Shekhar Kumar" <ch***********@oracle.com> wrote in message news:3E***************@oracle.com...
> i think this is the wrong news-group for yr msg...plzz post it at
> comp.std.c++

Please don't.
Jul 19 '05 #2
Glen Low wrote:
> I have written a new implementation of the std::valarray library
> that is optimized to use AltiVec (Apple's "Velocity Engine",

AltiVec(tm) is Motorola's trademark
for the first PowerPC SIMD extension.

http://www.simdtech.org/altivec

> part of the PowerPC G4's in most Macintoshes
> and the announced IBM PPC 970).
> The implementation is mostly standard conforming and is complete.
>
> As soon as I get my shingle up on the web (1 or 2 day's time),
> I'll post the library and its accompanying docs,
> which I call "MacSTL".
> Would like comments, tests, discussions of it...
>
> Preliminary benchmarks on my Power Mac G4 make it 550% faster
> on inlined arithmetic, 1360% faster on inlined transcendentals
> and 290% faster on summation than the gcc 3.1 std::valarray.
> Even the non-Altivec-optimized inline arithmetic
> is 50% faster than gcc, due to ruthless elimination
> of extraneous loads and stores by using STL-style algorithms.
>
> I sense that std::valarray is somewhat of a dead horse out there but
> I believe I can show there's still some life left in that concept.
> As for Altivec, it's still up and coming!
>
> P.S. The library also has several STL-influenced concepts for the Mac
> (or BSD/PowerPC): std::vectors with Mach copy-on-write semantics,
> clean COM wrappers using just std containers, zlib wrapped in std
> iostreams...


You might want to post this to the Object Oriented Numerics mailing list

http://www.oonumerics.org/mailman/li....cgi/oon-list/

I believe that Kent Budge still subscribes to this list
and he may appreciate vindication.

You might also visit the
High Performance Embedded Computing Software Initiative (HPEC-SI)

http://www.hpec-si.org/

They are working on a C++ binding for
the Vector Signal Image Processing Library (VSIPL)

http://www.vsipl.org/

There are several implementations of the VSIPL now
for AltiVec on the PowerPC and you really should be
benchmarking your implementation of std::valarray against them.

Also, do you plan an implementation for the Power Mac G5?

http://www.apple.com/powermac/

Jul 19 '05 #3
> You might want to post this to the Object Oriented Numerics mailing list
>
> http://www.oonumerics.org/mailman/li....cgi/oon-list/
>
> I believe that Kent Budge still subscribes to this list
> and he may appreciate vindication.

I will do that once I get back to my own Mac. Yes, I have seen Kent's
rationale at the site and I'll be linking to it from my own website,
once it's up.

> There are several implementations of the VSIPL now
> for AltiVec on the PowerPC and you really should be
> benchmarking your implementation of std::valarray against them.

I was hoping for a wider audience and thus targeted std::valarray.

> Also, do you plan an implementation for The Power Mac G5
>
> http://www.apple.com/powermac/


Definitely! I tried to get out my library before the WWDC
announcement, in the hopes that the rumors were true and get some
additional publicity. Alas several sleepless nights and time off my
day job and I only finished it on Monday or so, sans docs which I'm
working on now.

I've taken a look at the developer.apple.com site new G5 docs and it
looks good for what I am doing, especially since Altivec code is
bandwidth sensitive. For example aligning loops for maximal
performance would dovetail nicely with my inline implementation. I'll
have to download the gcc 3.3 compiler that they've provided and see
how it goes.
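
For readers unfamiliar with loop alignment, here is a rough sketch of the idea, not the library's code. AltiVec vector loads and stores work on 16-byte aligned addresses, so a loop over an arbitrary float array is typically split into a scalar head, aligned 4-float blocks, and a scalar tail. The function name `scale_aligned` is hypothetical, and the inner scalar loop merely stands in for one vector load, multiply and store so that the example compiles on any platform. It assumes a 4-byte `float`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: multiply n floats in place by k, visiting the
// array as unaligned head + 16-byte aligned blocks + leftover tail.
void scale_aligned(float* p, std::size_t n, float k) {
    std::size_t i = 0;

    // Head: scalar steps until p + i sits on a 16-byte boundary.
    while (i < n && reinterpret_cast<std::uintptr_t>(p + i) % 16 != 0) {
        p[i] *= k;
        ++i;
    }

    // Body: whole blocks of 4 floats, each starting on a 16-byte
    // boundary. A real AltiVec version would use one vector load,
    // one vector multiply and one vector store per block.
    for (; i + 4 <= n; i += 4) {
        for (std::size_t j = 0; j < 4; ++j) {
            p[i + j] *= k;
        }
    }

    // Tail: fewer than 4 elements remain.
    for (; i < n; ++i) {
        p[i] *= k;
    }
}
```

The result is the same as a plain scalar loop; the split only ensures that the bulk of the work happens on aligned blocks.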

Cheers,
Glen
Jul 19 '05 #4
