High Performance Computing Service

Upgrade Schedule

Last update: Fri Jun 29 16:29:43 BST 2012

The HPCS cluster is undergoing a major upgrade during March-June 2012. The current plan is outlined on this page, and any amendments will be reflected here.

When  Event  Comments
  
Thursday 22nd March 10:00-midnight  Full system maintenance.  No logins or job processing. smp01 to be retired. Will be followed by benchmarks on the Westmeres only.
  
Monday 26th March 08:00  Machine room works begin.  At risk period begins.
  
Wednesday 28th March 12:00  Woodcrest service reduces to units F and H only.  Woodcrest units A-E & G decommissioned. This may require all Woodcrests to be emptied.
  
Tuesday 3rd April 12:00  GPU cluster relocation starts.  tesla17-32 withdrawn from service.
  
Wednesday 4th April 09:00  GPU cluster relocation continues.  tesla01-16 withdrawn from service.
  
Saturday 7th April  GPU cluster relocation complete.  Tesla nodes restored to service.
  
Monday 9th April - Friday 27th April  Ongoing hardware and software preparation.  Core system, networks and new storage configured.
  
Thursday 3rd May  Full system maintenance. Remaining Woodcrest units F and H and bindloe login nodes decommissioned.  New Westmere login nodes (login-west*) introduced. Final hardware installations.
  
Monday 7th May - Friday 11th May  All physical cabling completed.  
  
Monday 14th May - Friday 25th May  Cluster provisioning, fault finding and stabilization.  
  
Saturday 26th May - Friday 1st June  Linpack testing.  During this phase the number of available Westmere nodes may need to be scaled back to accommodate the additional heat and power load. Final result: 183.379 TFlops, position 93 on the June 2012 Top500 list.
  
Friday 8th June (approx 2 weeks)  Beta testing starts.  Please contact support if you are interested in participating.
  
Tuesday 19th June  Data transfer to new storage commenced.  
  
Monday 25th June  Full maintenance 10:00.  Lustre to be upgraded to 2.1. Final data transfers to new storage system.
  
Tuesday 26th June  Maintenance continues. Darwin3 enters production.  New Sandy Bridge login nodes to become generally available. Westmere compute nodes to be renamed and all compute nodes provisioned with SL6.2. Darwin3 full production commences.