Environment:
Microsoft Windows Server 2008 Standard SP2
Microsoft Office Communications Server 2007 R2
Microsoft Windows 7
Microsoft Office Communicator 2007 R2
Ok, so I had this weird problem with OCS Normalization that was driving me nuts.
Originally, when we built out our OCS environment, we didn’t have a great understanding of the telephony (Enterprise Voice) best practice configuration information. And to be quite honest, there isn’t a lot of well documented information out on the web. So, we got it working at the time – but later found out we had a lot of room for improvement.
For the past two weeks, I have been working extensively on cleanup for OCS. One of the areas I have been focused on is Normalization.
Currently, our OCS environment consists of two user Enterprise Pools (not counting the Director pool). One pool services our corporate headquarters, and the second is the “main” pool, which services the rest of the company – which consists of about 24 locations or so.
As mentioned earlier, one thing we didn’t have configured to best practice standards was the Normalization patterns and user account Line URIs. Our user account Line URIs were formatted Tel:1XXXYYYZZZZ instead of the E.164 best practice format Tel:+1XXXYYYZZZZ. The Normalization Rules were configured to handle the number without the +, so they were not in best practice either. On top of that, we were controlling quite a bit of the formatting in OCS instead of leveraging the gateway to handle it.
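To make the difference between the two Line URI formats concrete, here is a minimal Python sketch of the conversion. This is only an illustration, not something OCS runs: the function name `to_e164_line_uri` is made up, the number 5015550100 is a placeholder, and the sketch assumes every legacy URI is "Tel:" followed by exactly 11 digits starting with 1.

```python
import re

# Legacy shape described above: "Tel:" followed by 11 digits starting with 1.
# Assumption: no extensions or other suffixes appear in the URI.
LEGACY_LINE_URI = re.compile(r"^tel:(1\d{10})$", re.IGNORECASE)


def to_e164_line_uri(line_uri: str) -> str:
    """Insert the + after the 'Tel:' prefix when the URI is in the legacy shape."""
    candidate = line_uri.strip()
    match = LEGACY_LINE_URI.match(candidate)
    if match is None:
        # Already E.164, or some shape this sketch does not understand:
        # hand it back untouched rather than guess.
        return line_uri
    prefix = candidate[:4]  # keeps the original capitalisation of "Tel:"
    return prefix + "+" + match.group(1)


if __name__ == "__main__":
    print(to_e164_line_uri("Tel:15015550100"))
    print(to_e164_line_uri("Tel:+15015550100"))
```

A URI that already carries the + is returned unchanged, so running the conversion twice over the same account does no harm.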
To say the least, even though we weren’t in best practice on our formatting, OCS Enterprise Voice was working (for the most part). However, there were weird little nuances that would come up here and there (e.g. Someone would try ‘click to dial’ on another user in Communicator to their PSTN phone number, but the call would fail – then they would try from their Outlook contact of the person and click to dial would work; even though the numbers were in the same format).
So, we decided to go back and get all of the Normalization fixed up and put into best practice. We spent a few late nights trying to update the Gateways to handle the adding and removing of the +, and updating the Normalization rules and the Line URIs to include the +, but we kept running into the same issue. Outbound dialing to the PSTN would work, but inbound from the PSTN would not work – very frustrating. After a few nights of putting in a bunch of changes, getting to about 4:00AM, and having to revert them all, we decided to try going about this a different way. Instead, we added a new Normalization rule that would only affect my test account, so that we could troubleshoot the issue during the day instead of at 4 in the morning. This new rule was set in the best practice format, and my test account Line URI was also in best practice. This is where things get interesting.
This is the rule that I created to normalize numbers going to my test account in e.164 format (you can see at the bottom, the test translation is normalizing the number correctly):
You can see here that the test account is set up properly:
Doing a test from the Voice Route Helper shows that the call was routing to the correct user account:
From the MOC client, when the number was typed in, it was normalizing the number as expected:
From that point, everything looked good. I couldn’t imagine why inbound calls were not going to this account like they were supposed to. So I fired up a Snooper log from the front end to see if it showed anything weird. For options, I selected the TranslationApplication, SIPStack, and S4 – including all flags. Here is where I found the problem. In this log, it shows that the TranslationService is using the “Little Rock 501 Area Code” Normalization rule (which only adds a 1 to the front of the number), instead of using my new “LIT3 Test Normalization Rule” (which would add the + and the 1). Since it wasn’t using the right rule and adding the + on the translated number, it couldn’t find the SIP user to send the call to. And yes, all the OCS servers were rebooted multiple times after hours, but the behavior stayed the same:
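The effect of the wrong rule being picked can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the two regex patterns are stand-ins (the real rule definitions are not reproduced in this post), the number 5015550100 and the SIP address are placeholders, and the "first matching rule wins" loop is just a simple model, not a description of how the OCS translation engine works internally.

```python
import re

# Hypothetical stand-ins for the two rules named above: (name, pattern, translation).
AREA_CODE_RULE = ("Little Rock 501 Area Code", r"^(501\d{7})$", r"1\1")
TEST_RULE = ("LIT3 Test Normalization Rule", r"^(5015550100)$", r"+1\1")

# Placeholder directory: the test account's Line URI is in E.164 format.
DIRECTORY = {"tel:+15015550100": "sip:testuser@example.com"}


def normalize(dialed, rules):
    """Apply the first rule whose pattern matches; return (rule name, result)."""
    for name, pattern, translation in rules:
        if re.match(pattern, dialed):
            return name, re.sub(pattern, translation, dialed)
    return None, dialed


def find_user(normalized):
    """Look up the SIP user whose Line URI equals the normalized number."""
    return DIRECTORY.get("tel:" + normalized)


if __name__ == "__main__":
    # What the client-side tests suggested: the test rule is applied.
    rule, number = normalize("5015550100", [TEST_RULE, AREA_CODE_RULE])
    print(rule, number, find_user(number))

    # What the server log showed: only the area code rule is applied.
    rule, number = normalize("5015550100", [AREA_CODE_RULE])
    print(rule, number, find_user(number))
```

With the test rule the number becomes +15015550100 and matches the Line URI. With the area code rule it becomes 15015550100, which matches no Line URI, so there is no user to deliver the call to.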
So this was really confusing. All testing showed that the number should have been normalized correctly. After speaking to some other OCS guys, who reached out to some Microsoft contacts, we found that this is actually a bug; the server uses a different engine than the clients, route helper, or test translation – and this engine was not behaving as it should. Knowing that we had uncovered a bug at least kept me from being too frustrated. In the long run, it wasn’t a big deal, since we were changing how we did normalization anyway (controlling the adding and removing of the +1 at the gateway level, so everything coming into OCS, inside OCS, and leaving OCS was in E.164 format).
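As a rough picture of what "adding and removing the +1 at the gateway" means, here is a small Python sketch. A real gateway does this through its own number manipulation configuration, not Python, and the details are assumptions: the function names are made up, the PSTN side is assumed to deliver 10-digit national numbers, and the carrier is assumed to want 10 digits back on outbound calls.

```python
import re


def gateway_inbound(pstn_digits: str) -> str:
    """PSTN -> OCS: add +1 so the number reaches OCS in E.164 format."""
    digits = re.sub(r"\D", "", pstn_digits)  # drop any punctuation
    if len(digits) != 10:
        raise ValueError("this sketch only handles 10-digit national numbers")
    return "+1" + digits


def gateway_outbound(e164_number: str) -> str:
    """OCS -> PSTN: remove the +1 before handing the call to the carrier."""
    if not re.fullmatch(r"\+1\d{10}", e164_number):
        raise ValueError("this sketch only handles +1 numbers")
    return e164_number[2:]


if __name__ == "__main__":
    print(gateway_inbound("501-555-0100"))
    print(gateway_outbound("+15015550100"))
```

The point of the design is that OCS itself only ever sees one format, so the Line URIs and Normalization rules never have to account for numbers without the +.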
In the end, we weren’t doing Normalization in best practice standards (we are now) and we uncovered the bug (even though we weren’t in best practice, the server should have still been reading the correct Normalization Rule). Hopefully this will help someone else confirm if they run into this issue, that they are seeing a bug.
