alignment is lost with openmp

I really like the new OpenMP implementation in VC, but I ran into
something which may be a bug. Please see the following little example:

void funct()
{
    __declspec(align(16)) BYTE buff[16*16];
    #pragma omp parallel private(buff) num_threads(2)
    {
        #pragma omp for
        for(int y = 0; y < height; y += 16)
        {
            // here a few instructions involving buff and sse2 intrinsics
            // which require 16 byte alignment
        }
    }
}

If I print the base address of buff inside the loop I can see that it has
lost its alignment and of course it crashes a little later there.
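
For anyone who wants to check this without reading the last hex digit by eye, here is a minimal sketch of such a probe. It is not the code from the bug report: is_aligned and count_misaligned are hypothetical names, and standard alignas(16) stands in for __declspec(align(16)) so that it builds on other compilers too (VC8 itself would need the __declspec form). If OpenMP is not enabled, the pragmas are ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when p is a multiple of `alignment` (which must be a power of two).
inline bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical probe: counts the loop iterations in which the thread-private
// copy of buff is NOT 16-byte aligned. The per-thread counts are summed by
// the reduction clause when the parallel region ends.
inline int count_misaligned(int iterations) {
    alignas(16) unsigned char buff[16 * 16];  // stand-in for __declspec(align(16))
    int misaligned = 0;
    #pragma omp parallel private(buff) reduction(+ : misaligned)
    {
        #pragma omp for
        for (int i = 0; i < iterations; ++i) {
            buff[0] = static_cast<unsigned char>(i);  // touch the private copy
            if (!is_aligned(buff, 16)) {
                ++misaligned;
            }
        }
    }
    return misaligned;
}
```

A build that shows the problem should make count_misaligned(16) return a non-zero count at least some of the time; 0 means every private copy that was checked sat on a 16-byte boundary.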
Nov 17 '05 #1

Please post a bug report with a complete repro case to
http://lab.msdn.microsoft.com/productfeedback/

-cd
Nov 17 '05 #2
Gabest wrote:
> Done.
>
> http://lab.msdn.microsoft.com/produc...a-32821fc49691

You might have taken the time to post a complete repro (there's no standard
include file named stdafx.h) or to actually fill out all the fields...

In any case, I'm unable to reproduce the problem with your sample. How
'bout a few more repro steps, such as the exact command-line arguments to
the compiler?

Here's what I get:

C:\Pub\Dev\cppbugs>cl -MD -arch:SSE2 -openmp ompalign0513.cpp
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50215.44 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

ompalign0513.cpp
Microsoft (R) Incremental Linker Version 8.00.50215.44
Copyright (C) Microsoft Corporation. All rights reserved.

/out:ompalign0513.exe
ompalign0513.obj

C:\Pub\Dev\cppbugs>ompalign0513
thread_num=0, i=0, buff=0012FB90
thread_num=2, i=8, buff=00AEFE30
thread_num=3, i=12, buff=00BEFE30
thread_num=3, i=13, buff=00BEFE30
thread_num=0, i=1, buff=0012FB90
thread_num=0, i=2, buff=0012FB90
thread_num=0, i=3, buff=0012FB90
thread_num=3, i=14, buff=00BEFE30
thread_num=2, i=9, buff=00AEFE30
thread_num=2, i=10, buff=00AEFE30
thread_num=3, i=15, buff=00BEFE30
thread_num=1, i=4, buff=009EFE30
thread_num=1, i=5, buff=009EFE30
thread_num=2, i=11, buff=00AEFE30
thread_num=1, i=6, buff=009EFE30
thread_num=1, i=7, buff=009EFE30

-cd
Nov 17 '05 #4
> You might have taken the time to post a complete repro (there's no
> standard include file named stdafx.h) or to actually fill out all the
> fields...

That was the auto-generated precompiled header file. Pretty "standard" in
Visual C++; every new project gets that automagically. This one was a console
application with the default settings; I have only changed OpenMP support to
yes.

> In any case, I'm unable to reproduce the problem with your sample. How
> 'bout a few more repro steps, such as the exact command-line arguments to
> the compiler?

As I said, the default settings were used, except the /openmp switch of
course. But if you insist, here is the command line of the debug build:

/Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /Gm
/EHsc /RTC1 /MTd /GS- /openmp /Yu"stdafx.h" /Fp"Debug\omptest.pch"
/Fo"Debug\\" /Fd"Debug\vc80.pdb" /W3 /nologo /c /Wp64 /ZI /TP
/errorReport:prompt

/OUT:"Debug\omptest.exe" /INCREMENTAL /NOLOGO /MANIFEST:NO /DEBUG
/PDB:"i:\Progs\omptest\omptest\Debug\omptest.pdb" /SUBSYSTEM:CONSOLE
/MACHINE:X86 /ERRORREPORT:PROMPT kernel32.lib user32.lib gdi32.lib
winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib
uuid.lib odbc32.lib odbccp32.lib

> Here's what I get:
> ...


Try to change the code around a bit or run it several times, eventually you
will see non 0 ending pointers.
Nov 17 '05 #5
Gabest wrote:
>> You might have taken the time to post a complete repro (there's no
>> standard include file named stdafx.h) or to actually fill out all the
>> fields...
>
> That was the auto generated precompiled header file. Pretty
> "standard" in visual c, every new project gets that automagically.
> This one was a console application with the default settings, I have
> only changed openmp support to yes.


I normally do repros as stand-alone CPP files compiled from the command-line
with the minimal options necessary to elicit the bug. Old habit :)
>> In any case, I'm unable to reproduce the problem with your sample. How
>> 'bout a few more repro steps, such as the exact command-line
>> arguments to the compiler?
>
> As I said, the default settings were used, except the /openmp switch
> of course. But if you insist, here is the command line of the debug
> build:
>
> /Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE"
> /Gm /EHsc /RTC1 /MTd /GS- /openmp /Yu"stdafx.h" /Fp"Debug\omptest.pch"
> /Fo"Debug\\" /Fd"Debug\vc80.pdb" /W3 /nologo /c /Wp64 /ZI /TP
> /errorReport:prompt
>
> /OUT:"Debug\omptest.exe" /INCREMENTAL /NOLOGO /MANIFEST:NO /DEBUG
> /PDB:"i:\Progs\omptest\omptest\Debug\omptest.pdb" /SUBSYSTEM:CONSOLE
> /MACHINE:X86 /ERRORREPORT:PROMPT kernel32.lib user32.lib gdi32.lib
> winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib
> oleaut32.lib uuid.lib odbc32.lib odbccp32.lib


Ah Ha!

It's /ZI that does it. I'll add a note to your bug report to that effect.
Thanks for the details.

-cd
Nov 17 '05 #6
> Ah Ha!
>
> It's /ZI that does it. I'll add a note to your bug report to that effect.
> Thanks for the details.

Well, I don't think that switch does it. I just tried the release build as
well with /Zi, then completely disabled the generation of debug info too; the
alignment was still wrong sometimes.

This is how it builds now for the release configuration, but I don't think
any of the switches can affect this problem.

/O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /FD
/EHsc /MT /openmp /Yu"stdafx.h" /Fp"Release\omptest.pch" /Fo"Release\\"
/Fd"Release\vc80.pdb" /W3 /nologo /c /Wp64 /TP /errorReport:prompt

/OUT:"Release\omptest.exe" /INCREMENTAL:NO /NOLOGO /MANIFEST
/MANIFESTFILE:"Release\omptest.exe.intermediate.manifest" /DEBUG
/PDB:"i:\progs\omptest\omptest\release\omptest.pdb" /SUBSYSTEM:CONSOLE
/OPT:REF /OPT:ICF /MACHINE:X86 /ERRORREPORT:PROMPT kernel32.lib user32.lib
gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib
oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
Nov 17 '05 #7
Gabest wrote:
>> Ah Ha!
>>
>> It's /ZI that does it. I'll add a note to your bug report to that
>> effect. Thanks for the details.
>
> Well, I don't think that switch does it. Just tried the release build
> as well with /Zi, then completely disabled the generation of debug
> info too, the alignment was still wrong sometimes.
>
> This is how it builds now for the release configuration, but I don't
> think any of the switches can affect this problem.


You're right, it's not as simple as just -ZI. -Zi also seems to be
sufficient, as well as -O2. In any case, having at least one concrete repro
will allow someone to triage the bug. Watch it on product feedback center -
you should see something within a few days.

-cd
Nov 17 '05 #8
> You're right, it's not as simple as just -ZI. -Zi also seems to be
> sufficient, as well as -O2. In any case, having at least one concrete
> repro will allow someone to triage the bug. Watch it on product feedback
> center - you should see something within a few days.


I just stepped through the assembly and found something strange after I
tailored the code to the following (I also turned off the security check so
it would not get in the way). My comments are inline.

__declspec(align(16)) BYTE buff[1234];

#pragma omp parallel private(buff) num_threads(2)
{
    #pragma omp for
    for(int i = 0; i < 2; i++)
    {
        _mm_store_si128((__m128i*)&buff[16], _mm_setzero_si128());
    }
}

When the debugger reached the multi-threaded code, I saw this:

#pragma omp for
for(int i = 0; i < 2; i++)
00401000 push ebp
00401001 mov ebp,esp
00401003 and esp,0FFFFFFF0h
00401006 sub esp,4E0h

buff was just aligned to 16 bytes, fine!

0040100C lea eax,[esp]
0040100F push eax
00401010 lea ecx,[esp+8]
00401014 push ecx
00401015 push 1
00401017 push 1
00401019 push 1
0040101B push 0
0040101D call _vcomp_for_static_simple_init (40710Ah)

It pushed 6 arguments on stack (esp -= 0x18) and called
_vcomp_for_static_simple_init.

00401022 mov ecx,dword ptr [esp+1Ch]
00401026 mov eax,dword ptr [esp+18h]
0040102A add esp,18h

Looks like it is cleaning up the stack, esp += 0x18

0040102D cmp ecx,eax
0040102F jg wmain$omp$1+4Bh (40104Bh)
00401031 sub eax,ecx
00401033 pxor xmm0,xmm0
00401037 add eax,1
0040103A lea ebx,[ebx]
00401040 sub eax,1
#include <windows.h>
#include <omp.h>
#include <xmmintrin.h>
#include <emmintrin.h>
int _tmain(int argc, _TCHAR* argv[])
{
__declspec(align(16)) BYTE buff[1234];
buff[0] = 0;
#pragma omp parallel private(buff) num_threads(2)
{
#pragma omp for
for(int i = 0; i < 2; i++)
{
_mm_store_si128((__m128i*)&buff[16], _mm_setzero_si128());
00401043 movdqa xmmword ptr [esp+18h],xmm0

Storing at "esp+18h" ???? esp is aligned to 16 bytes right now
(esp=0x0012fa00), what's that +18h doing there? Also, this 18h looks
familiar, could it be a coincidence?

00401049 jne wmain$omp$1+40h (401040h)
{
#pragma omp for
for(int i = 0; i < 2; i++)
0040104B call _vcomp_for_static_end (407104h)
00401050 call _vcomp_barrier (4070FEh)
}
00401055 mov esp,ebp
00401057 pop ebp
00401058 ret
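
The arithmetic behind the esp+18h question can be written down as a small sketch. It only uses the numbers quoted in this post (esp = 0x0012fa00, the 18h displacement, six 4-byte pushes); offset_in_16 and check_listing_arithmetic are hypothetical names, and the sketch does not try to explain why the compiler picked that displacement. movdqa requires a 16-byte-aligned memory operand, so a non-zero offset here is enough for the store to fault.

```cpp
#include <cassert>
#include <cstdint>

// Offset of an address inside its 16-byte block; 0 means 16-byte aligned.
constexpr std::uint32_t offset_in_16(std::uint32_t address) {
    return address & 0xFu;
}

// Values taken from the post; nothing here is measured independently.
constexpr std::uint32_t kEsp = 0x0012FA00u;     // esp when movdqa executes
constexpr std::uint32_t kDisplacement = 0x18u;  // the +18h in the operand
constexpr std::uint32_t kPushedBytes = 6u * 4u; // six 32-bit pushes

inline void check_listing_arithmetic() {
    assert(offset_in_16(kEsp) == 0u);                  // esp is 16-byte aligned
    assert(kPushedBytes == kDisplacement);             // 6 * 4 == 18h
    assert(kEsp + kDisplacement == 0x0012FA18u);       // address of the store
    assert(offset_in_16(kEsp + kDisplacement) == 8u);  // 8 bytes past a boundary
}
```

So the store targets 0x0012fa18, which sits 8 bytes past a 16-byte boundary, and the displacement has the same size as the argument block pushed for _vcomp_for_static_simple_init. Whether those two facts are related is exactly the open question.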
Nov 17 '05 #9
