Re: NLB Cluster - Ping fails or long time to reply from outside local subnet



Using Network Monitor, I see the pings being received and the replies being
sent only WHEN the ping on my computer reports success. At all other times I
see nothing related to my continuous ping.

This must point to some other router/switch on the network not forwarding
packets when required.

"David Morgan" <microsoft_newsgroups@xxxxxxxxxxxxxxxxxxxxxxx> wrote in
message news:%23X6d98n7HHA.5424@xxxxxxxxxxxxxxxxxxxxxxx
Well I have configured the cluster on NIC3, a standalone DLink card, and
still have the same problem.

Only this adapter has a default gateway defined.

I have confirmed that the VLAN (and switch) are Layer 2 only.

I have disabled one of the Broadcom NICs (NIC1, which had an IP from the
same subnet). No difference.

So the problem persists: from outside the local network, pings either fail
or respond with very high latency.

Installing a hub wouldn't make any difference, as I am trying to get this
working with only one cluster host. If I had more, I could see why a
hub might help.

Why are the packets being created by the NLB driver not being routed
properly? In fact, I've just noticed a pattern in the reply times. It
looks like you get a long reply, then a short one. Very strange.

C:\>ping -t x.x.16.125

Pinging x.x.16.125 with 32 bytes of data:

Request timed out.
Reply from x.x.16.125: bytes=32 time=1009ms TTL=118
Reply from x.x.16.125: bytes=32 time=28ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1509ms TTL=118
Reply from x.x.16.125: bytes=32 time=60ms TTL=118
Request timed out.
Reply from x.x.16.125: bytes=32 time=1509ms TTL=118
Reply from x.x.16.125: bytes=32 time=37ms TTL=118
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1008ms TTL=118
Reply from x.x.16.125: bytes=32 time=29ms TTL=118
Reply from x.x.16.125: bytes=32 time=999ms TTL=118
Reply from x.x.16.125: bytes=32 time=53ms TTL=118
Reply from x.x.16.125: bytes=32 time=997ms TTL=118
Reply from x.x.16.125: bytes=32 time=40ms TTL=118
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1009ms TTL=118
Reply from x.x.16.125: bytes=32 time=31ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1008ms TTL=118
Reply from x.x.16.125: bytes=32 time=33ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1008ms TTL=118
Reply from x.x.16.125: bytes=32 time=32ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1008ms TTL=118
Reply from x.x.16.125: bytes=32 time=57ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1508ms TTL=118
Reply from x.x.16.125: bytes=32 time=30ms TTL=118
Reply from x.x.16.125: bytes=32 time=998ms TTL=118
Reply from x.x.16.125: bytes=32 time=37ms TTL=118
Reply from x.x.16.125: bytes=32 time=996ms TTL=118
Reply from x.x.16.125: bytes=32 time=38ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1508ms TTL=118
Reply from x.x.16.125: bytes=32 time=37ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1007ms TTL=118
Reply from x.x.16.125: bytes=32 time=49ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1007ms TTL=118
Reply from x.x.16.125: bytes=32 time=30ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1007ms TTL=118
Reply from x.x.16.125: bytes=32 time=29ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1007ms TTL=118
Reply from x.x.16.125: bytes=32 time=34ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1006ms TTL=118
Reply from x.x.16.125: bytes=32 time=32ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1005ms TTL=118
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from x.x.16.125: bytes=32 time=1505ms TTL=118
Reply from x.x.16.125: bytes=32 time=28ms TTL=118
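
The "long reply, then a short one" pattern in the output above can be
checked mechanically rather than by eye. The following is a minimal Python
sketch, not part of the original post: the function name `summarize_ping`
and the 500 ms threshold separating "slow" from "fast" replies are
assumptions chosen for illustration. It expects the text format that
Windows `ping -t` prints, as shown above.

```python
import re

# Matches the round-trip time in a Windows ping reply line, e.g. "time=1009ms".
REPLY_TIME = re.compile(r"time[=<](\d+)ms")


def summarize_ping(lines, slow_ms=500):
    """Summarize Windows `ping -t` output.

    slow_ms is an assumed threshold: replies at or above it count as "slow".
    Returns a dict with the probe count, the loss, and how often a slow
    reply is immediately followed by a fast one.
    """
    events = []  # None for a timeout, an int (milliseconds) for a reply
    for line in lines:
        line = line.strip()
        if line.startswith("Request timed out"):
            events.append(None)
        elif line.startswith("Reply from"):
            match = REPLY_TIME.search(line)
            if match:
                events.append(int(match.group(1)))

    sent = len(events)
    lost = sum(1 for e in events if e is None)
    # Count adjacent probes where a slow reply is followed by a fast reply.
    pairs = sum(
        1
        for a, b in zip(events, events[1:])
        if a is not None and b is not None and a >= slow_ms and b < slow_ms
    )
    return {
        "sent": sent,
        "lost": lost,
        "loss_pct": round(100.0 * lost / sent, 1) if sent else 0.0,
        "slow_fast_pairs": pairs,
    }


if __name__ == "__main__":
    sample = [
        "Request timed out.",
        "Reply from x.x.16.125: bytes=32 time=1009ms TTL=118",
        "Reply from x.x.16.125: bytes=32 time=28ms TTL=118",
        "Request timed out.",
    ]
    print(summarize_ping(sample))
```

Saving the full ping output to a file and passing its lines to
`summarize_ping` gives the overall loss percentage and shows how
consistently a slow reply is followed by a fast one.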

Again, I should re-iterate that pings respond perfectly when performed
from another host on the same subnet/x.x.16.0 network.

Many thanks.

David



"Chuck [MSFT]" <ctimon@xxxxxxxxxxxxxxxxxxxx> wrote in message
news:eRnIbWJ7HHA.980@xxxxxxxxxxxxxxxxxxxxxxx
Something to keep in mind....if the switch is Layer 3, the ports the NLB
nodes plug into must function at layer 2 or it won't work properly.

I'm still interested in the behavior with only one NIC in the picture.

With NLB, it is pretty common practice to break the problem down into its
simplest components and see if it works, if it works then build up from
there.

--
Chuck Timon, Jr.
Microsoft Corporation
Windows Server 2008 Readiness Team
This posting is provided "AS IS" with no warranties, and confers no
rights.

"David Morgan" <microsoft_newsgroups@xxxxxxxxxxxxxxxxxxxxxxx> wrote in
message news:ejPPAjI7HHA.5772@xxxxxxxxxxxxxxxxxxxxxxx
Yes, NIC1 and NIC2 (in each machine) are dual port Broadcom cards. They
are definitely not being teamed, and I have verified that the locally
administered address is being set correctly on the cluster adapter. ARP
reports everything as expected.
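
For anyone repeating this check: by default, NLB derives the cluster MAC
address from the cluster IP, using the prefix 02-BF in unicast mode and
03-BF in multicast mode, followed by the four IP octets in hex. The Python
sketch below computes the address one would expect to see in the ARP
table. The function name `nlb_cluster_mac` is hypothetical, and the
address 10.0.16.125 is a made-up stand-in, because the real cluster IP in
this thread is partly redacted. It also assumes the network address has
not been overridden manually in the NLB configuration.

```python
import ipaddress

# Default NLB MAC prefixes (assumes the address was not overridden manually).
NLB_PREFIX = {
    "unicast": (0x02, 0xBF),
    "multicast": (0x03, 0xBF),
}


def nlb_cluster_mac(cluster_ip, mode="unicast"):
    """Return the default NLB cluster MAC for an IPv4 cluster address."""
    octets = ipaddress.IPv4Address(cluster_ip).packed  # four raw bytes
    prefix = NLB_PREFIX[mode]
    return "-".join(f"{byte:02X}" for byte in (*prefix, *octets))


if __name__ == "__main__":
    # 10.0.16.125 is a hypothetical example, not the address from the thread.
    print(nlb_cluster_mac("10.0.16.125"))
    print(nlb_cluster_mac("10.0.16.125", mode="multicast"))
```

Comparing this value with the `arp -a` entry for the cluster IP, taken
from a host on the local subnet and, if possible, on the router, shows
whether both see the same MAC for the cluster address.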

I will set-up the cluster using the DLink card, NIC3 which was added
afterwards.

Regardless, none of this would suggest why it works locally but not from
outside the LAN. It is as if the router is intermittently not accepting
the packets to/from the LAA address or something.


"Chuck [MSFT]" <ctimon@xxxxxxxxxxxxxxxxxxxx> wrote in message
news:ewVUlnC7HHA.1900@xxxxxxxxxxxxxxxxxxxxxxx
Let's simplify this.....disable all but one NIC in each member of the
NLB cluster, set up NLB on it, plug the NICs into a hub, remember those?
:{> then uplink the hub to a switch port and test.

If you have NLB in your environment....keep a spare dumb hub
around....never know when you might need it.

--
Chuck Timon, Jr.
Microsoft Corporation
Windows Server 2008 Readiness Team
This posting is provided "AS IS" with no warranties, and confers no
rights.

"Bookham Measures" <bookham_measures_no_spam@xxxxxxxxx> wrote in
message news:OlGc7c96HHA.5404@xxxxxxxxxxxxxxxxxxxxxxx
Hello

We have set up an NLB cluster with two servers, ultimately for an IIS
application.

Everything functions perfectly when the clients are on the same subnet,
as defined by their IP address.

Hosts outside the subnet, i.e. those that come via a router, cannot get
reliable ping responses. They get a mixture of "request timed out" and
delayed replies of between 54ms and 1510ms. The ping summary always
reports more than 60% packet loss. This applies to the cluster IP
address and to the dedicated IP on the cluster NIC in each server. The
timeouts and replies occur at different times for each address when
running simultaneous pings to the three IP addresses.

All configuration has been performed via the Network Load Balancing
Manager.

As we have plenty of adapters, the cluster has been configured in
Unicast mode. We have set up a VLAN on our switch and plugged the
dedicated cluster NICs into those ports. The configuration is as
follows.


Server 1
---------
NIC1: x.x.16.121 DG x.x.16.1 in VLAN
NIC2: x.x.16.123 DG x.x.16.1
NIC3: 192.168.1.21 DG not set.

Server 2
---------
NIC1: x.x.16.122 DG x.x.16.1 in VLAN
NIC2: x.x.16.124 DG x.x.16.1
NIC3: 192.168.1.22 DG not set.

Cluster
-------
Server 1 NIC1 Priority 1
Server 2 NIC1 Priority 2
Cluster IP x.x.16.125
Equal

NIC3 on each server is numbered so as to communicate with the database
cluster. I have tried removing the default gateway from NIC1/2 to see
if I can get different results, but I cannot. Interestingly, when one of
the servers is offline the problem persists. Remember, this problem
only occurs from outside the x.x.16.0 subnet. Hosts on the same
subnet (not in the VLAN) have no problems communicating with the
cluster or NIC1 IP addresses.

When the cluster is deleted via the Manager, the IPs on NIC1 in each
machine start responding to pings normally with good times.

NIC2 in both servers responds to pings from anywhere satisfactorily,
the whole time (they are not in the VLAN used by the NIC1s).

NIC1 and NIC2 in each machine are a "Broadcom NetXtreme Gigabit
Ethernet" dual port adapter and have the latest driver from the IBM
website. Both servers are IBM System-X 3850 M2s running Windows 2003
Server R2 SP2 (32-bit), with quad Xeon 2.5GHz and 3 GB RAM.

The switch is a Foundry EdgeIron 48G.

Where can I go next to troubleshoot this problem? The fact that the
IPs respond normally from everywhere when there is no cluster
configured, must mean that there is something wrong at the NLB driver
level.

Many thanks in advance.

David










