Team Kent Ridge

National University of Singapore

Diagram

[Figure: NUS Hardware Diagram]

Hardware

Our cluster comprises ARM64-based compute nodes (Orange Pi 5 Max) with the following key hardware specifications:

| Category | Specification |
|---|---|
| Master Chip | Rockchip RK3588 (8nm LP process) |
| CPU | 8-core 64-bit; 4× Cortex-A76 @ 2.4GHz + 4× Cortex-A55 @ 1.8GHz with independent NEON coprocessor |
| GPU | Integrated ARM Mali-G610; OpenGL ES 1.1/2.0/3.2, OpenCL 2.2, Vulkan 1.2 |
| NPU | 6 TOPS; supports INT4/INT8/INT16/FP16 hybrid computing |
| PMU | RK806-1 |
| RAM | 16 GB LPDDR5 |
| Storage | MicroSD card slot; M.2 M-Key (PCIe 3.0 ×4, NVMe SSD) |
| USB | 2× USB 3.0, 2× USB 2.0 |
| Ethernet | 1× 2.5 GbE LAN via PCIe (RTL8125BG) |
| Power Input | USB Type-C, 5V @ 5A |
| PCB Dimensions | 89 mm × 57 mm × 1.6 mm |
| Weight | 62 g |
| Power Consumption | 12.5W (expected) |

All SBCs will communicate over the NICGIGA 24-port switch and connect to each other via SSH.

Two Raspberry Pi 3Bs will be used (not included in the budget) for:

  1. a DHCP Server + DNS
  2. a Gateway / Firewall (required to access our university network; incoming connections will be disabled during the competition)

Power monitoring

The system will be connected to a Server Technology Sentry CW-8H2A413 Switched PDU for accurate power monitoring. The PDU will send power usage statistics to a Grafana dashboard, and a view of this will be shared live with the committee. The setup (including PDU) will also be livestreamed via video. We are happy to provide further verification of the power draw values.
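The text does not say how the PDU readings reach Grafana. As a minimal sketch, assuming the readings are exposed to Prometheus (listed in our software stack) in its text exposition format, the formatting step could look like the following. The metric name `cluster_power_watts`, the `outlet` label, and the outlet names are hypothetical, and how the readings are fetched from the PDU is deliberately left out.

```python
def format_power_metric(readings_watts, metric="cluster_power_watts"):
    """Render power readings in the Prometheus text exposition format.

    readings_watts maps a (hypothetical) PDU outlet name to its power in watts.
    """
    lines = [
        f"# HELP {metric} Instantaneous power draw reported by the PDU, in watts.",
        f"# TYPE {metric} gauge",
    ]
    # One sample per outlet, with the outlet name carried as a label.
    for outlet, watts in sorted(readings_watts.items()):
        lines.append(f'{metric}{{outlet="{outlet}"}} {watts:.1f}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative values only, not measurements.
    print(format_power_metric({"A1": 210.0, "A2": 15.0}), end="")
```

Serving this text over HTTP lets Prometheus scrape it on an interval, and Grafana can then plot the resulting gauge over time.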

Hardware Table

| Category | Item | Description | Quantity | Calculated Power Draw (W) | Cost per unit (USD) | Total Price (USD) |
|---|---|---|---|---|---|---|
| Compute | Orange Pi 5 Max 16GB | Single-Board Computer (SBC) | 17 | 210 | 249.00 | 4233.00 |
| Storage | Kioxia Exceria MicroSDHC U1 32GB | Operating System Drive | 17 | NA | 11.77 | 200.10 |
| Storage | Intel® Optane™ Memory M10 Series (16GB) | Metadata / Log Drive | 4 | 4 | 5.00 | 20.00 |
| Storage | Crucial T500 PCIe Gen4x4 NVMe M.2 SSD (1TB) | NFS Drive (Fast) | 2 | 3 | 167.31 | 334.62 |
| Storage | Crucial P310 PCIe Gen4x4 NVMe M.2 SSD (1TB) | NFS Drive (Cheap) | 1 | 1.5 | 164.00 | 164.00 |
| Network | NICGIGA 24-Port 2.5G Ethernet Switch (S25-2400) | Network Switch | 1 | 15 | 249.99 | 249.99 |
| Power | DC5.5 x 2.1mm Female Jack to USB Type-C | Power Adapter | 17 | NA | 3.08 | 52.50 |
| Power | DC Power Fuse Distribution Strip | Power Distribution | 2 | NA | 48.30 | 96.60 |
| Power | Meanwell LRS-300-5V w UK Plug | Power source | 1 | NA | 36.34 | 36.34 |
| Power | 14AWG 100ft Electrical Wire | Power Cable | 2 | NA | 15.00 | 30.00 |
| Network | Cat5E + Cat6 Cables (Various Lengths) | Network Cables | 17 | NA | 1.56 | 26.56 |
| Cooling | ARCTIC P14 Pro PST | Fans | 3 | 12 | 9.00 | 27.00 |
| Cooling | 14x14x6mm and 22x22x10mm Heatsinks | Heatsinks | 17 | NA | 0.56 | 9.61 |
| Casing | DIY 10" Server Rack (2020 Extrusion) | Server Rack | 1 | NA | 204.02 | 204.02 |
| Power | DC5.5 x 2.1mm Male Jack | Power Adapter | 17 | NA | 0.28 | 4.83 |
| Totals | | | | 242.5 | | 5689.17 |
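The cost total can be re-derived from the per-item totals in the hardware table. The sketch below copies the "Total Price (USD)" figures from the table and sums them with `Decimal` to avoid floating-point rounding. The dictionary layout and the function name `budget_total` are illustrative choices, not part of our tooling.

```python
from decimal import Decimal

# "Total Price (USD)" column of the hardware table, keyed by item name.
LINE_TOTALS_USD = {
    "Orange Pi 5 Max 16GB": "4233.00",
    "Kioxia Exceria MicroSDHC U1 32GB": "200.10",
    "Intel Optane Memory M10 Series (16GB)": "20.00",
    "Crucial T500 NVMe M.2 SSD (1TB)": "334.62",
    "Crucial P310 NVMe M.2 SSD (1TB)": "164.00",
    "NICGIGA 24-Port 2.5G Ethernet Switch": "249.99",
    "DC5.5 x 2.1mm Female Jack to USB Type-C": "52.50",
    "DC Power Fuse Distribution Strip": "96.60",
    "Meanwell LRS-300-5V w UK Plug": "36.34",
    "14AWG 100ft Electrical Wire": "30.00",
    "Cat5E + Cat6 Cables": "26.56",
    "ARCTIC P14 Pro PST": "27.00",
    "Heatsinks": "9.61",
    "DIY 10in Server Rack": "204.02",
    "DC5.5 x 2.1mm Male Jack": "4.83",
}


def budget_total(line_totals):
    """Sum the per-item totals exactly, in USD."""
    return sum((Decimal(v) for v in line_totals.values()), Decimal("0"))


if __name__ == "__main__":
    print(budget_total(LINE_TOTALS_USD))  # prints 5689.17
```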

Software

  • Operating System
    • Armbian Linux: Best support for Orange Pi
  • Cluster Orchestration and Management
    • Ansible: Orchestration tool, eliminating the need to manually set up individual SBCs
    • SLURM: Job scheduler and workload manager. Chosen for team familiarity, as it is used in our university’s cluster.
    • Grafana: Dashboard for system monitoring. Chosen for team familiarity.
    • Prometheus: Used to collect and store real-time metrics such as power usage and temperature.
  • Storage
    • BeeGFS: Parallel file system for higher IO throughput.
  • MPI
    • OpenMPI: For multi-node runs.
  • Linear Algebra Libraries
    • OpenBLAS: Linear algebra operations.

Strategy

Benchmarks / Applications

  1. High Performance Linpack (HPL)
    • HPL is compiled and linked against OpenBLAS.
    • We will experiment with parameters such as the problem size N and block size NB, and tune them for our system.
    • The goal of this tuning is to maximize the achieved TFLOPS.
  2. D-LLAMA
    • Compiled from source following the official repository documentation.
    • We will implement weight quantization and explore offloading computation to the NPU.
  3. MDTest
    • Compiled from source with OpenMPI.
    • Test different storage backends, such as BeeGFS.
    • Optimise through MPI rank placement and network tuning.
  4. IQ-TREE
    • Compiled from source.
    • Tune the number of MPI ranks and OpenMP threads per node.
  5. Mystery Application
    • Assign team members based on the mystery application and members’ individual interests.
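For the HPL tuning in item 1, a common starting point is to pick N so that the N × N double-precision matrix fills a fixed fraction of total memory, round N down to a multiple of NB, and choose a process grid P × Q that is as close to square as possible with P ≤ Q. The sketch below encodes that rule of thumb. The 80% memory fraction, the node count of 16, and NB = 192 are illustrative assumptions rather than values we have settled on, and 16 GB is treated as 16 GiB for simplicity.

```python
import math


def hpl_problem_size(nodes, mem_per_node_gib, nb, mem_fraction=0.8):
    """Largest multiple of nb whose N x N float64 matrix fits in the memory budget."""
    total_bytes = nodes * mem_per_node_gib * 1024**3
    usable_elements = int(total_bytes * mem_fraction) // 8  # 8 bytes per double
    n = math.isqrt(usable_elements)
    return (n // nb) * nb


def process_grid(ranks):
    """Most nearly square factorisation P x Q of ranks, with P <= Q."""
    p = math.isqrt(ranks)
    while ranks % p != 0:
        p -= 1
    return p, ranks // p


if __name__ == "__main__":
    # Hypothetical configuration: 16 compute nodes, one MPI rank per node.
    n = hpl_problem_size(nodes=16, mem_per_node_gib=16, nb=192)
    p, q = process_grid(16)
    print(f"N={n} NB=192 P={p} Q={q}")
```

These values are only a first guess to seed HPL.dat. The best N, NB, and grid shape still have to be found by measurement on the actual cluster.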

Team Details

| Name | Interests! | Responsibilities |
|---|---|---|
| Muhammad Asyraf Bin Abdul Rahim | Motorsport / tinkering / gaming | Hardware and system design & management, Ansible system admin |
| Lau Zhe Wen | Photography | Hardware procurement, optimizing IQ-TREE and SLURM |
| Tan Yong Xiang | Likes climbing things | Hardware procurement, power system design, optimizing IQ-TREE |
| Mande Neil Ashvinikumar | Networking (the ethernet kind) enthusiast | Network setup, optimising MDTest and BeeGFS |
| Gabra Shubhan | Playing chess, playing cricket, travelling | Hardware procurement, power system design, optimizing D-LLAMA |
| Chan Dong Jun | Stargazing | Ansible system admin, optimizing D-LLAMA and general software |
| Joel Chong | Low latency systems, C++ enthusiast (very enthusiastic) | Optimizing HPL, kernel, network and general software |
| Koh Tze Rui | Also likes climbing things | Hardware procurement, optimizing HPL |