4th July 2014


Roshan

Opportunities

Citrix Training

Ineda

  • Roshan provided ARM Viridis server

BARC

  • Testing completed
  • RFP soon
  • Roshan, please get the files, do a run on the Phi and document it on the wiki

FhGFS

  • Please pad out the benchmarking section
  • Potentially more tests on the A15 system

Lustre

  • To be tested at a later date

Ansys

  • Please continue to work on this

Hortonworks hadoop

  • Please continue to work on this
  • Hortonworks system / 1 head node plus 4 nodes
  • Dave and Mike to work together on this

Gromacs 5

  • Roshan started
  • Rest of the team to work on this

Mike

Allinea demo from Mike next Friday.

  • Agreed
  • Allinea will work from any node – the licence server is running on the head node.

CloudX / OpenStack

  • Demo in a few weeks
  • The MPstor systems are coming back up now – Dave, can you have a look at the email I sent yesterday?
  • The CloudX demo seemed to go well. There are a few deals in the pipeline; we will get more details once an NDA is signed. At least one deal is ready to go as soon as the account is set up.

Exablaze

  • Put the two cards in 1 server

GPU Nodes

  • The GPU nodes are rebuilt and back online (gpu1 – 7)

Cern Benchmarks

  • The X10 system is installing with SL – hopefully it won’t have any issues

Ansys

  • The Ansys partnership has been set up – I’ll get the information over to Sam asap.

PHIs

  • I’ll set the PHI system up on the cluster next week

Nvidia / Gromacs

  • Looking at GROMACS 5 today; need to work out how to automatically log off users. Meeting on Monday. An account has been set up for Nvidia.

GEO benchmarks – in progress

Boxes

  • Boxes are being sent to the warehouse, in case you’re ever looking for one in the lab and can’t find it

Ceph

  • We have the closest systems we had in discrep to the Ceph solution on our website – you / Leo were asking about it, so I pulled them out in case you need them. They are already in the rack. I’ll look into this at some point, but it can wait until we have a bit more time

Network Issues

  • Our network bugs have been fixed as far as we can tell, but if you have any issues on the pxes network, let me know.

RMA GPU Bug Reports

  • RMA now have a GPU system, which I set up, to run the bug reports on. Jon is going to look into giving them a port on the pxes network, in which case I’ll make a profile on Cobbler to install CUDA for them with the latest version of CentOS so they don’t have to ask me again.

Order

  • The Viridis order is just waiting to have one node reinstalled from the internet, as it’s going into read-only mode by default.
  • There is an order for hortonworks to be installed on

Support Issues

  • We’re looking into an issue with a RHEL install we did – it looks like Sam didn’t give us the kickstart we should have had. So they have reinstalled from scratch and are now having issues booting, although we don’t know why yet. Having now seen the kickstart they wanted us to use, we can’t see any issues.

Infiniband

  • I’m going to get a new IB licence for another switch from stock to get the cluster’s IB back online – the switch EPCC were using at the show has been taken by Boston Germany

AMD GPUs

  • We’re waiting to find a system for the S10000s

Other Issues

  • We have overheat lights on IML1 / CLI1 in the Lustre systems. Their IPMIs are down so I can’t see why. Am I OK to reboot them?

ARM

  • We have the A15 in house again
  • Photos are in Mike’s home directory on the cluster