7/25/2011 7:24 AM
As you all know, DISA and the Army have been working to improve the performance of DoD Enterprise Email (EE) for the last two months. Four major items adversely affecting EE performance were identified and resolved. The operational pause for EE is paying off, and I am confident we will soon meet the criteria to lift it and resume migrations.
MG Jennifer Napper, CG, 9th SC(A)/NETCOM, sent me a progress report that I'd like to share. It is pasted below.
Michael E. Krieger
Major Items for Resolving DoD EE Performance
- Following pre-migration guidelines (getting systems patched)
- Fixing DISA firewall architecture at DoD EE Pods
- Fixing out-of-sequence packet issue at DoD EE Pods and Army TLA firewalls
- Fixing circuit issues at Aberdeen Proving Ground (APG)
- At the start of the operational pause, most installations had better than 80% compliance with pre-migration guidelines. Today, installations are reporting 90-95% compliance.
- All Army installations had many issues with dropped connections and Outlook freezing. These were greatly reduced after DISA changed its firewall architecture.
- Once the out-of-sequence packet issue was resolved on DISA and Army firewalls, the user experience with Outlook at the Pentagon (ITA) and most of Ft Belvoir was uniformly good. Once a network IPS was properly configured in the Nolan Bldg on Ft Belvoir, all of Belvoir began receiving good performance on Outlook.
- Once DISA repaired a circuit leaving APG, APG's experience with Outlook was uniformly good.
- As installations achieved better than 90% compliance on pre-migration guidelines (required patches and configurations), tickets on Outlook performance dropped to easily manageable levels.
During this assessment, the joint team observed large numbers of packet retransmission requests degrading overall performance. Analysis determined that the Cisco Firewall Service Modules at the DoD EE Pod and at Army locations were reordering packets to such an extent that the system requested large numbers of retransmissions, greatly reducing network throughput. Cisco recommended configuration changes for the firewall service modules, which the Army implemented on 7 Jun 2011. This greatly reduced the number of out-of-sequence packets and the associated retransmission requests.
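As a rough illustration of the out-of-sequence measurement described above, the following Python sketch counts packets that arrive after a packet with a higher sequence number. The function names and the sample arrival order are hypothetical and are not taken from the team's tooling or data. Real captures would also need to handle TCP sequence wraparound and retransmitted segments, which this sketch ignores.

```python
def count_out_of_sequence(arrivals):
    """Count packets that arrive after a packet with a higher sequence number.

    `arrivals` is a list of sequence numbers in the order they were received.
    This is a simplified, assumed metric for illustration only.
    """
    highest_seen = None
    reordered = 0
    for seq in arrivals:
        if highest_seen is not None and seq < highest_seen:
            # A later-numbered packet already arrived, so this one is late.
            reordered += 1
        else:
            highest_seen = seq
    return reordered


def reorder_ratio(arrivals):
    """Fraction of received packets that arrived out of sequence."""
    if not arrivals:
        return 0.0
    return count_out_of_sequence(arrivals) / len(arrivals)


# Hypothetical arrival order, not measured data: packets 3 and 6 arrive late.
sample = [1, 2, 4, 3, 5, 7, 6, 8]
late = count_out_of_sequence(sample)
ratio = reorder_ratio(sample)
```

Comparing this ratio before and after a configuration change would be one simple way to confirm that a device in the path has stopped reordering traffic.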
DISA has conducted internal design reviews and made adjustments to the DoD EE system to improve performance for the Army. The first major improvements were observed on Monday 13 Jun 2011 and were attributed to changes DISA made in the firewall and routing topology and to fixes for out-of-sequence packets at the DoD EE Pod in OKC. We observed a 30-50% improvement in speed for functions such as synchronizing the Offline Address Book (OAB).
Once these changes were made, the team continued to assess performance, primarily at the ITA and Fort Belvoir locations. ITA had a TippingPoint Network Intrusion Prevention System in the path that was adding to the out-of-sequence packet issue. They discussed possible solutions with the vendor and then implemented a solution that eliminated the remaining out-of-sequence packets at ITA.
Efforts at Fort Belvoir continued, and the team identified two major problems at this site. First, the Nolan building had an inline Intrusion Prevention System that was misconfigured, restricting available bandwidth by more than 50%. This was immediately resolved. Second, the Installation had inline Intrusion Prevention Systems located below its perimeter defense system, at a point in the network where these devices could not see both the transmit data and the receive data for user sessions. This resulted in a significant slowdown of the overall Installation network. Again, once discovered, this was immediately rectified. Once all changes were made at this site, it easily exceeded the standard performance metrics for Outlook connecting to DoD EE.
Efforts at APG were ongoing at the same time as the Belvoir and ITA efforts. APG was having challenges with their automated patching system based on System Center Configuration Manager (SCCM). NETCOM assisted the site and this system came back online and began aggressive distribution of required patches to DoD EE consumer computers. This effort has now resulted in a 95% compliance rate for required patches.
On 10 July, testing of APG circuit CCSD 77DJ discovered that the Grooming Switch Card (GSC) within a CONUS ODXC node was generating significant errors. This card was replaced, and it appears to have been the major contributor to the 3-5% packet loss we had been measuring at APG since we started testing at this site. One measure of the success of this change is the volume of DoD EE trouble tickets generated at APG: over 500 DoD EE problem tickets were submitted in the week prior to the change, and 6 were submitted in the week after.
After a week of assessing Outlook client performance at APG, we find that users are receiving very good performance from DoD EE. The very few exceptions are most often due to the need to complete the last 5% of patch distributions. There are no systemic issues with Outlook, and overall the site is in very good condition with respect to DoD EE Outlook service.
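For a sense of scale, the ticket figures above can be turned into a percentage with a short Python sketch. The function name is hypothetical. Because the report says "over 500" tickets, using 500 as the starting count gives only a lower bound on the reduction.

```python
def percent_reduction(before, after):
    """Percentage drop from `before` to `after`, rounded to one decimal place."""
    if before <= 0:
        raise ValueError("before must be positive")
    return round((before - after) / before * 100, 1)


# 500 is a lower bound ("over 500" tickets); 6 is the reported count after the fix.
apg_ticket_reduction = percent_reduction(500, 6)
```

With these inputs the reduction is at least about 98.8%.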
Performance at all three sites has improved to the point they easily exceed standard testing metrics and are able to download the OAB within 3 minutes (metric is 4 minutes). Users can open or send signed messages in less than 2 seconds (metric is 2 seconds).