Friday, July 27, 2007

Cisco Ripples - DCA and RRM - Help is on the way

Since I first published " The Ripple Effect" back in February I have heard from many folks who have validated the effect but to my chagrin, I have had no solution to offer. Well thankfully there are smarter people than me out there and solutions have started to appear.

I was alerted to the fact that Medical Connectivity consulting recently put Cisco in their sights and quoted my blog with regard to Dynamic Channel Assignment and RRM causing issues. The Web, being the great time waster that it is, lead me on a journey. As I read the article I clicked here and there and next thing I knew I was looking at a forum at Cisco that was talking about this exact phenomena.

One of the forum posters had some great suggestions to eliminate this problem in the future. Bruce Johnson at Partners Healthcare offered this solution,
"We saw the majority of DCA events were triggered by Interference from Rogue APs. After we disabled Foreign AP Avoidance the number of channel changes dropped by an entire order of magnitude (1000s to 100s). We disabled Cisco AP Load Avoidance and this reduced the number of DCAs within an order of magnitude (100s less).

DTPC will power-up APs to max levels to provide a 3-neighbor -65 RSSI coverage "grid" and 7921s will power up to follow suit (up to their max Tx Power). Other clients with higher Tx power may send the APs to max power causing a mismatch with IP phones.

You can decrease the tx-power-threshold so the "grid" won't be as hot (default is -65, change to -71 or -74):

config advanced 802.11a tx-power-control-thresh <-50 to -80>
config advanced 802.11b tx-power-control-thresh <-50 to -80>

and reduce the coverage hole detection threshold (reduce Min SNR level in RRM Thresholds) to suppress the power-up activity."
Bruce seemed on track with this fix. the problem is that it isn't a fix. It shuts off the RRM and DCA so that the WLAN would remain stable. So where is the benefit of a controller based system?

He does note that a fix is forthcoming from Cisco, "They are revamping the behavior of RRM in the WLC 4.1 Maintenance release." Which is later confirmed by a Cisco employee, Saurabh Bhasin a TME,
"With the 4.1 Maintenance Release(MR) due out on shorly, many improvements based on such feedback have been brought into RRM's algorithms ? improvements aimed at allowing administrators to fine-tune their RRM-run WLANs where desired. These enhancements will allow for greater control over both the channel and power output selection algorithms, so administrators may assist RRM in being either more or less aggressive in such decisions, depending on application and network needs. Additionally, enhancements have been made to the management and reporting of all RRM information and configuration alterations to allow for better tracking of RF environmental fluctuations and to assist in keeping track of RRM activity. Further technical detail on the inner workings of these enhancements will be available very soon in an update to the above-mentioned RRM Whitepaper."
The paper he references is found here and explains a lot of what we are all seeing. (here is the PDF version)

(NOTE: Since publishing this post, Cisco has moved the link. Here is a more recent version. Please double-check with Cisco that you have their latest information)

So here is to hope that WLC 4.1 Maint. Rels. fixes it. As an aside, Bruce Johnson is skeptical,
"Its all well and good to make things work for Intel and the CCX/CCKM compliant crew, but if you have any of the other brands of WLAN NICs (like those made by medical device manufacturers, who won't subscribe to fast roaming features until they're adopted by the IEEE) you are best keeping RRM disabled until it delivers on its promise as stated in the following 802.11TGv Objectives draft:

Service and Function Objectives

Solutions shall define mechanisms to provide the service listed below.

[Req2000] TGv shall support Dynamic Channel Selection, to allow STAs to avoid interference. Solution shall be able to change the operating channel (and/or band) for the entire BSS during live system operation and be done seamlessly with no intermittent loss of connectivity from the perspective of an associated STA. Solution shall not define algorithm for channel selection."

1 comment:

  1. Hello,
    Here is the most recent version of the document I could find about RRM:

    You will find some new features (like "DCA sensitivity level") and some pieces of advice.

    I hope it will help those (like me!) who encounter some problems with RRM.