Internal Network Redundancy

The internal network of a business can be made almost completely redundant. Every production server and every network device can be duplicated. In theory, every network device, including the closet switches and the cable runs to the desktop, should be redundant. The failure of a single port on a server with a single NIC can stop, and has stopped, entire teams from working when that server hosts an application they depend on. Failures in the server room or network core have a higher business impact than failures in access switches. However, having 50 people unable to use their computers because an access switch went down could still mean the business turns customers away because there wasn’t enough redundancy.

Redundancy isn’t very hard to understand. In a nutshell, you have two of everything: you install a second (therefore redundant) device with a configuration almost identical to the first. It is considered best practice to cross-link all the devices so that if any device fails, only that one device is affected and redundancy saves the day. An outage then requires multiple concurrent failures that take down both devices of a redundant pair somewhere in the network, which makes the network even more reliable.

Now, complete redundancy is by definition impossible, because if the entire network were completely redundant you’d have two separate networks, not one. There must be a single (therefore non-redundant) component that steers traffic away from failed components so they don’t affect the rest of the network (trunking or a routing protocol, for example). When that component misses a failure, the result is sometimes referred to as “half brained”. Think of twin (load-balanced) routers connected to a remote router over separate cabling. Say the first router fails, but the failure is not detected by the redundant one. The remote router doesn’t know redundancy has been affected, so it continues to send traffic through the failed router; the site sees 50% packet loss and connections time out. This is half brained. The same applies to a failover-based pair, except that either traffic fails completely or the failure goes unnoticed until the other device fails too. Any redundant devices that are not load balanced should be manually failed over at scheduled intervals to ensure both are functional.
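The half brained scenario can be sketched in a few lines of Python. This is only a toy model, not how any real router behaves: the router names, the strict round-robin balancing and the packet_loss function are all assumptions made up for illustration. It compares a sender that keeps using both routers with one that knows which router is down.

```python
def packet_loss(routers, up, detect_failures, packets=1000):
    """Fraction of packets lost when a sender round-robins across routers.

    routers: list of router names (hypothetical).
    up: dict mapping router name -> True if the router is working.
    detect_failures: True if the sender knows which routers are down.
    """
    if detect_failures:
        # The sender only balances across routers it knows are working.
        candidates = [r for r in routers if up[r]]
    else:
        # Half brained: the sender still believes every router is fine.
        candidates = list(routers)

    if not candidates:
        return 1.0  # nothing left to send through

    lost = 0
    for i in range(packets):
        router = candidates[i % len(candidates)]  # simple round robin
        if not up[router]:
            lost += 1
    return lost / packets


routers = ["router_a", "router_b"]
up = {"router_a": False, "router_b": True}  # router_a has failed

blind = packet_loss(routers, up, detect_failures=False)
aware = packet_loss(routers, up, detect_failures=True)
print(f"failure not detected: {blind:.0%} loss")
print(f"failure detected:     {aware:.0%} loss")
```

In this model the undetected failure loses every other packet, while the sender that detects the failure loses none. The second router was healthy in both cases; the only difference is whether the failure was noticed.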

Redundancy should be considered for the cube farm as well as the server room and network core. Running a second cable to each wall jack for PC redundancy is expensive, and cable runs have a low failure rate once they have been installed and validated. However, if a closet switch fails, it takes time just to move all of the patch cables over to another switch, not to mention racking and configuring the replacement. Access closets are probably the most common place for people to skimp on redundancy because of the lower risk. Redundancy in user access should still be managed, and an educated, conscious decision should be made on how to address it.
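To make that decision an educated one, it can help to put rough numbers on it. The sketch below uses the standard formula for two devices in parallel and assumes that failures are independent and that failover is automatic, which may not hold for a given closet. The 99% availability figure is purely hypothetical, not a measurement of any real switch.

```python
HOURS_PER_YEAR = 24 * 365


def pair_availability(single):
    """Availability of a redundant pair, assuming independent failures.

    The pair is only down when both devices are down at the same time.
    """
    return 1 - (1 - single) ** 2


def annual_downtime_hours(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


single = 0.99  # hypothetical availability of one access switch
pair = pair_availability(single)

print(f"single switch:  {annual_downtime_hours(single):.1f} hours down per year")
print(f"redundant pair: {annual_downtime_hours(pair):.1f} hours down per year")
```

With these assumed numbers the single switch is down for roughly 87.6 hours a year and the pair for under an hour. Plugging in your own availability figures and the cost of 50 idle staff gives a starting point for deciding whether a second closet switch is worth it.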

All communications devices (and all servers) should have redundant power supplies connected to independent power sources. For maximum redundancy, this means separate smart UPS units on separate commercial power feeds into the building, each fed from a separate power substation. A generator is also a very wise investment, as it provides a source of power that is very unlikely to be affected by adverse weather, such as an ice storm, that may take out all external power sources despite the redundancy in the power network.

As you can see, the network has its share of failure points. However, redundancy can greatly reduce the risk that any given failure will adversely affect the business. It can seem hard to justify the cost of the duplication, but without redundancy the business impact will be worse when something fails. Both carrier connections and the internal network should be carefully planned to avoid anything that isn’t redundant. If the phones or network have a failure, will your business have the redundant connections to continue working, or will customers go elsewhere?