I can't say for sure why the db refused to activate, but presumably
something happened prior to that to put it in a state where it could
not. I'm not familiar with what that may have been, as the "activate
db" command is supposed to be supported on an HADR standby.
- if the standby is not active, it should cause activation
- if the standby is already active, it will return a warning saying as
much
Note that activation of a standby does not mean normal user connections
can be made. Rather, it means that the db server is running and, if it
can connect to its partner, performing as an HADR standby.
If you examine the diagnostic log for the database (db2diag.log in your
db2dump directory), you may be able to find something about the sequence
of events leading to this and the other error messages you received.
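
As a starting point for that search, here is a minimal Python sketch that
pulls the records mentioning HADR out of a diagnostic log. It assumes the
log is plain text with records separated by blank lines. The function name
find_records and the default keyword are illustrative choices of mine, not
part of any DB2 tool.

```python
import re
import sys


def find_records(log_text, keyword="HADR"):
    """Return the blank-line-separated records that mention keyword.

    Assumption: records in the log are separated by one or more blank
    lines. Matching is case-insensitive.
    """
    records = re.split(r"\n\s*\n", log_text.strip())
    needle = keyword.lower()
    return [rec for rec in records if needle in rec.lower()]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_records.py /path/to/db2diag.log
    with open(sys.argv[1], errors="replace") as f:
        for rec in find_records(f.read()):
            print(rec)
            print()
```

Running it against the logs from both the primary and the standby and
lining the results up by timestamp should make the sequence of events
easier to follow.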
> Then I stopped HADR on the standby and did an activate, and it seemed
> to not have a problem with that. But when I started HADR as standby
> again, it stated it was successful but still would not connect to the
> primary.
There may have been some other steps in there. When a standby is
stopped, the db goes into an inactive rollforward-pending mode. In
this mode, the "activate db" command should return an error like this:
    SQL1117N  A connection to or activation of database "MYDB" cannot be
    made because of ROLL-FORWARD PENDING.  SQLSTATE=57019
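
To summarize the three cases, here is a toy Python model of what an
"activate db" attempt should do in each standby state as I described them
above. It only restates this post; it is not DB2 code, and the names
StandbyState and activate_db are made up for the illustration.

```python
from enum import Enum


class StandbyState(Enum):
    ACTIVE = "active HADR standby"
    INACTIVE = "inactive HADR standby"
    STOPPED = "HADR stopped, rollforward pending"


def activate_db(state):
    """Return (new_state, outcome) for an 'activate db' attempt.

    Toy model of the behavior described in this post, not DB2 itself.
    """
    if state is StandbyState.INACTIVE:
        # An inactive standby should simply be activated.
        return StandbyState.ACTIVE, "activated"
    if state is StandbyState.ACTIVE:
        # An already-active standby only produces a warning.
        return StandbyState.ACTIVE, "warning: already active"
    # A stopped standby stays inactive and the command fails.
    return StandbyState.STOPPED, "error: SQL1117N (rollforward pending)"
```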
In any case, once you got the ex-standby started as a non-HADR
database, it most likely would be impossible for it to reconnect
successfully as a standby again except by reinitialization from scratch
(new db restore). Bringing the ex-standby out of rollforward-pending
requires the rollforward to be completed there. That generally puts
that database on what we refer to as a "new log chain". In other
words, from the perspective of the db's history, as reflected in the db
log, that db has diverged and is on a different path from the primary
now.
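
The idea of a diverged log chain can be pictured with a small conceptual
Python sketch. It treats each database's log history as a list of made-up
record labels and says a database can continue as a standby only if its
history is a prefix of the primary's. This is an illustration of the
concept only; the actual checks HADR performs are not shown here, and the
function and record names are hypothetical.

```python
def can_resume_as_standby(standby_log, primary_log):
    """True if the standby's history is a prefix of the primary's.

    Conceptual model only: each log is a list of record labels.
    """
    return primary_log[:len(standby_log)] == standby_log


# History both databases had in common while HADR was running.
shared = ["rec1", "rec2", "rec3"]

# The primary kept going on the original log chain.
primary = shared + ["rec4", "rec5"]

# A standby that is merely behind still shares the primary's history.
standby_behind = list(shared)

# Completing the rollforward adds history the primary never had.
ex_standby = shared + ["rollforward-complete"]

print(can_resume_as_standby(standby_behind, primary))
print(can_resume_as_standby(ex_standby, primary))
```

In this model the standby that is only behind can catch up, while the
ex-standby cannot, which is why a fresh restore is needed.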
> I couldn't do a db2pd or a db2 get snapshot to see what state the
> standby thought it was in. When I issued a db2 get snapshot, it stated
> I could not perform that action unless the db was activated.
If you looked in the db2diag.log files for both primary and standby you
would likely find messages indicating that the standby had attempted to
connect with the primary but failed in the handshake validations. When
the connection is rejected, the standby deactivates. (There is no sense
in it trying again and again, as this is not a transient error.)
Without the db being active, there's no shared memory for db2pd to
attach to, nor can the get snapshot be performed.
> Basically, I was in a state where I couldn't get an active standby and
> I couldn't do anything as an active standard. I finally had to do the
> rollforward so I was able to get it to an active standard mode.
My guess is a previously issued rollforward on the standby helped it
get into that state.
If you are concerned that HADR is behaving incorrectly, please open a
case with IBM service and provide your understanding of what happened
along with the db2diag.log files from both primary and standby covering
the entirety of the relevant time period.
Regards,
- Steve P.
--
Steve Pearson, IBM DB2 for Linux, UNIX, and Windows, IBM Software Group
DB2 "Portland" Development Team, IBM Beaverton Lab, Beaverton, OR, USA