HPC RSE Meeting Notes: 21 April 2020

Attending organisations

  • GW4, University of Bristol: Isambard
  • University of Cambridge: CSD3
  • UCL: MMM Hub
  • EPCC, University of Edinburgh: ARCHER, ARCHER2, Cirrus, DiRAC
  • University of Leeds: N8-CIR
  • QUB: NI-HPC
  • University of Sheffield: JADE
  • University of Reading
  • DAMTP, University of Cambridge
  • Cardiff University
  • University of Hull
  • University of Durham
  • University of Edinburgh
  • Aberystwyth University
  • Swansea University
  • University of Leicester
  • University of York
  • University of Nottingham
  • University of Bath
  • University of Birmingham

Actions

  • (AndyT) Propose to SocRSE trustees a SocRSE online event this year on online training
  • (AndyT) Suggest that ExCALIBUR run a webinar to bring community up to date with the project

National HPC RSE Updates

### ARCHER, ARCHER2, Cirrus, DiRAC, EPCC

  • Oracle bare metal cloud benchmarking
    • Skylake nodes interconnected by 100 Gbps RoCE
    • Benchmarked DiRAC benchmark applications up to 32 nodes
    • Single node performance same as native
    • Multi-node performance similar to single-rail IB cluster
    • Full story at upcoming webinar on 29 April, see: https://www.archer2.ac.uk/training/courses/200429-oracle-cloud-benchmarking/
  • DiRAC benchmarks
    • Application benchmarks from the community - QCD, astrophysics/cosmology
    • Baseline across current DiRAC systems
    • Release in open source benchmarking model (as for ARCHER benchmarks)
  • ARCHER2: awaiting delivery of hardware
    • Website now available: https://www.archer2.ac.uk
    • 1st eCSE call planned to open soon: https://www.archer2.ac.uk/ecse/
  • Cirrus: new hardware delivered but not yet configured

### JADE

  • Waiting on delivery for JADE2
  • Working on training materials for deep learning

### MMM Hub

  • Expecting new Tier-2 hardware in middle of May

### Isambard

  • Continued high rate of sign ups
  • Updated Arm compilers
  • Isambard 2 due by the end of the year
  • Have early access to A64FX in Japan

### NI-HPC

  • Finalising acceptance tests based on remote access - should be complete this week
  • Ulster have appointed their RSE, who will start on 1 June
  • QUB will advertise as soon as possible

### N8 HPC

  • Hardware delivered to Durham
  • RSEs distributed across N8 - one from each institution. Some already in place.

### Cambridge

  • Awaiting delivery of new Tier-2 hardware
  • Open Exascale Lab at Cambridge: RSE team from Dell to target new Intel architecture. Want to hire new RSEs soon. Aligned to ExCALIBUR project (see below)

Other topics

### Supplier access to site

  • Location dependent
    • UCL have key staff who volunteered to come in to their datacentre
    • EPCC have a working rota to minimise staff on site; suppliers are coming in to fix issues with systems
    • Durham: University makes distinction between fixing existing kit (allowed) and building new stuff (not allowed)

### ExCALIBUR and Exascale

  • ExCALIBUR: https://www.metoffice.gov.uk/research/approach/collaboration/spf/excalibur
  • Aiming to redesign high priority simulation codes and algorithms to fully harness the power of future supercomputers, keeping UK research and development at the forefront of high-performance simulation science.
  • Led by the Met Office and EPSRC (on behalf of UKRI Research Councils) along with UKAEA, NERC, STFC and MRC.
  • Phase 1 call from EPSRC to fund Design and Development Working Groups to bring together a wide range of domain experts, mathematicians, computational scientists, and Research Software Engineers (RSEs).
    • Call outcome: https://gow.epsrc.ukri.org/NGBOViewPanel.aspx?PanelId=1-811FGY
  • Also a number of hardware testbeds being installed at various sites across the UK to look at different hardware aspects

Upcoming training, events and meetings

  • ARCHER2 Training (all online):
    • Introduction to the ARCHER2 hardware and software, Wednesday 22nd April 2020 11:00-12:00 BST, https://www.archer2.ac.uk/training/courses/200422-archer2-hardware/
    • Porting and performance of DiRAC HPC benchmarks on Oracle bare metal cloud, Wednesday 29th April 2020 11:00-12:00 BST, https://www.archer2.ac.uk/training/courses/200429-oracle-cloud-benchmarking/ (webinar in collaboration with DiRAC)
    • ARCHER2 Spectrum of Support, Online, Wednesday 13th May 2020 15:00-16:00 BST, https://www.archer2.ac.uk/training/courses/200513-archer2-support/
    • The determination of cluster structures combining infrared spectroscopy and density functional theory calculations, Online, Wednesday 20th May 2020 15:00-16:00 BST, https://www.archer2.ac.uk/training/courses/200520-cluster-structures/
    • LAMMPS, 4 - 18 May 2020, https://www.archer2.ac.uk/training/courses/200504-lammps/
    • Message-passing programming with MPI, 14 - 22 May 2020, https://www.archer2.ac.uk/training/courses/200514-mpi/
    • Reproducible computational environments using containers, 13 - 14 July 2020, https://www.archer2.ac.uk/training/courses/200713-containers/
  • The IWOCL / SYCLcon 2020 program on all things OpenCL and SYCL, starting Mon 27th April, https://www.iwocl.org/iwocl-2020/conference-program/
  • Andrew Jones’ “HPC huddles” (MS Teams meeting with 10-20 international HPC people in a free-form discussion): sign up by replying to his tweet announcements, https://twitter.com/hpcnotes

Date of next meeting

1400 UK Time, TBC May 2020