Bytes IT Community

failover cluster using 64 bit windows 2k3 on single HBAs and DS400 with dual controller

We are trying to set up a system-to-system failover cluster using two
nodes (x346), each with a single HBA running to a separate
controller on the DS400.

For full redundancy, IBM recommends dual paths from each node, but we
don't need that. The current setup has two completely separate paths:
the HBA on node 1 goes to controller A on the DS400, and the HBA on
node 2 goes to controller B. If I take a controller offline, failover
works fine: it jumps to the other controller and moves all resources to
its node. But if I shut down a node, the cluster loses all attached
storage, and the DS400 does not know to switch ownership to the other
controller.

Is there a way to use MSCS without dual paths from each node?
In other words, if either a node or a controller fails on a single
path, we want the other path to become active.

Our main goal is to use SQL Server 2005 clustering on the cluster.
Everything checks out perfectly if I use only one controller on the
DS400 for both nodes, but this brings us back to another single point
of failure.

I saw that QLogic has MPIO drivers on their website for the DS400, but
they seem to be for 32-bit systems, and the install errors out
with:

C:\Drivers\mpio\1.0.8.4 (w32)>install.exe -i
Pre-Installing the Multi-Path Adapter Filter...
Success
Installing the Multi-Path Bus Driver...
Failure. Error code (0xe0000235)

configuration:
2 X IBM x346 w/ single QLogic 2340 HBAs running win2k3 64bit Enterprise
DS400 w/ dual controllers
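
The error code in the installer output is worth decoding. Windows
SetupAPI errors are built as APPLICATION_ERROR_MASK |
ERROR_SEVERITY_ERROR | code, and in setupapi.h the value 0x235 is
ERROR_IN_WOW64, which a 32-bit installer gets when it attempts a driver
install on 64-bit Windows. That would be consistent with the suspicion
that this package is 32-bit only. A minimal Python sketch of the
decoding follows; the function name and the one-entry lookup table are
illustrative and not part of any QLogic or Microsoft tool.

```python
# Bit masks used by SetupAPI to build its error codes (from winerror.h/setupapi.h).
APPLICATION_ERROR_MASK = 0x20000000
ERROR_SEVERITY_ERROR = 0xC0000000
SETUPAPI_PREFIX = APPLICATION_ERROR_MASK | ERROR_SEVERITY_ERROR  # 0xE0000000

# Illustrative table: only the one code relevant to this thread is listed.
KNOWN_SETUPAPI_CODES = {0x235: "ERROR_IN_WOW64"}


def decode_setupapi_error(code: int):
    """Split an error value into (is_setupapi, low_word, symbolic_name)."""
    is_setupapi = (code & 0xE0000000) == SETUPAPI_PREFIX
    low_word = code & 0xFFFF
    if not is_setupapi:
        return False, low_word, "not a SetupAPI code"
    return True, low_word, KNOWN_SETUPAPI_CODES.get(low_word, "unknown")


if __name__ == "__main__":
    ok, low, name = decode_setupapi_error(0xE0000235)
    print(ok, hex(low), name)
```

If the decoding applies here, the fix would be a native 64-bit driver
package rather than a different install switch.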

Apr 12 '06 #1
5 Replies


<us******@gmail.com> wrote in message
news:11**********************@e56g2000cwe.googlegr oups.com...
> We are trying to set up a system-to-system failover cluster using two
> nodes (x346), each with a single HBA running to a separate
> controller on the DS400.
>
> For full redundancy, IBM recommends dual paths from each node, but we
> don't need that. The current setup has two completely separate paths:
> the HBA on node 1 goes to controller A on the DS400, and the HBA on
> node 2 goes to controller B. If I take a controller offline, failover
> works fine: it jumps to the other controller and moves all resources to
> its node. But if I shut down a node, the cluster loses all attached
> storage, and the DS400 does not know to switch ownership to the other
> controller.
>
> Is there a way to use MSCS without dual paths from each node?
> In other words, if either a node or a controller fails on a single
> path, we want the other path to become active.


It sounds like you're trying to persuade the DS400 to control your failover
action. You're making a LUN available to one node, and when a failure occurs
you're expecting the DS400 to switch ownership of that LUN to the other node
so it can proceed. That's not what you want. You want both nodes to see and
share the LUN(s) on the DS400 at all times. MSCS will then figure out
between the two nodes which one will access the LUN.

Rob
Apr 12 '06 #2

Logically, it would make sense for MSCS to be responsible for
everything.
However, both nodes are able to see the storage but can only read
the drives when their respective controller is the active one.

Both initiators have access to all the LUNs on the storage, and both
HBAs have access to all LUNs.

> It sounds like you're trying to persuade the DS400 to control your failover
> action. You're making a LUN available to one node, and when a failure occurs
> you're expecting the DS400 to switch ownership of that LUN to the other node
> so it can proceed. That's not what you want. You want both nodes to see and
> share the LUN(s) on the DS400 at all times. Mscs will then figure out
> between the two nodes which one will access the LUN.
>
> Rob


Apr 12 '06 #3

<us******@gmail.com> wrote in message
news:11*********************@i40g2000cwc.googlegro ups.com...
> Logically, it would make sense for MSCS to be responsible for
> everything.
> However, both nodes are able to see the storage but can only read
> the drives when their respective controller is the active one.
>
> Both initiators have access to all the LUNs on the storage, and both
> HBAs have access to all LUNs.


The DS400 wasn't certified for MSCS when it was initially introduced. If you
have a model from before mid-2005 then you may need to update firmware or
contact IBM about the exact features required to make it work with MSCS. The
latest firmware is available from Adaptec's website:
http://www.adaptec.com/ibm/downloads...ems_index.html

Rob
Apr 12 '06 #4


OK,

We have now extended the configuration to provide multiple paths to
both nodes from both controllers.

Each node now has two HBAs with connections to both controllers.
It seems as though everything is working as expected, with failover
occurring system to system if a node fails and controller to
controller if a controller fails.
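
One way to see why the dual-path layout behaves differently from the
single-path layout in the original post is a toy reachability model.
It is only a sketch under stated assumptions: the array is treated as
active/passive per LUN, the LUN initially belongs to controller A, and
ownership moves to the other controller only when the owning
controller itself fails (which matches what was observed in this
thread, not a documented DS400 rule). The names usable_nodes, SINGLE
and DUAL are hypothetical.

```python
# Paths are (node, controller) pairs, one per HBA-to-controller cable.
SINGLE = {("node1", "A"), ("node2", "B")}
DUAL = SINGLE | {("node1", "B"), ("node2", "A")}


def usable_nodes(paths, alive_nodes, alive_controllers, owner="A"):
    """Return the nodes that can still do I/O to the LUN.

    Assumption: the array reassigns the LUN only if its owning
    controller is down; a node failure alone does not move ownership.
    """
    if owner not in alive_controllers:
        survivors = sorted(alive_controllers)
        if not survivors:
            return set()
        owner = survivors[0]  # ownership fails over to a surviving controller
    # A node can use the LUN only through a path to the owning controller.
    return {n for (n, c) in paths if n in alive_nodes and c == owner}


if __name__ == "__main__":
    both_nodes = {"node1", "node2"}
    both_ctrls = {"A", "B"}
    # Node 1 shut down, both controllers healthy.
    print("single, node1 down:", usable_nodes(SINGLE, {"node2"}, both_ctrls))
    print("dual,   node1 down:", usable_nodes(DUAL, {"node2"}, both_ctrls))
    # Controller A taken offline, both nodes healthy.
    print("single, ctrl A down:", usable_nodes(SINGLE, both_nodes, {"B"}))
    print("dual,   ctrl A down:", usable_nodes(DUAL, both_nodes, {"B"}))
```

Under these assumptions the single-path layout leaves no usable node
when node 1 goes down, because the LUN stays on controller A and node 2
has no path to it, while the dual-path layout keeps node 2 working.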

When I do a failover from system to system, it works flawlessly.
When I do a failover from controller to controller, however, the active
node seems to kick in fine once the resources are back up and
available, but it shows an error in the taskbar and event log saying:

Windows - Delayed Write Failed: Windows was unable to save all the data
for the file M:\ The data has been lost. This error may be caused by a
failure of your computer hardware or network connection. Please try to
save this file elsewhere.

Since this cluster is being used for SQL Server 2005, losing data is
not something we can accept. The controllers have 256 MB of
battery-backed memory on them. Given that, are the controllers taking
care of this and Windows is just not aware of it, or do we actually
have an issue where we might lose data?

Apr 18 '06 #6
