Processing needs

Lessons learned during previous hackathons

Lukas Kluft, Florian Ziemen

Hardware

  • a couple of dedicated analysis nodes in a SLURM partition (~32 at DKRZ, 10 above normal)
  • reservations have proven less useful than simply expanding the interactive partition
  • limit of 2 nodes per participant (larger jobs are usually user error)

Services

  • interactive access to the computing resources
  • JupyterHub (or similar) is usually an appreciated entry point

Python environments

  • it’s hard (if not impossible) to provide a single environment for all users
  • compromise: provide a lean Python environment with “common” scientific packages
  • users can install their exotic dependencies on top of that

Our recommendation

  • use micromamba to manage a basic Python environment
  • we should maintain a single environment.yaml to provide common packages
  • users may install more “exotic” packages on their own
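
One way to confirm that the shared environment actually provides the agreed packages is a small check run inside it. The following is a minimal sketch, not a prescribed tool: the function name `missing_packages` and the package list are hypothetical and should be adapted to whatever the environment.yaml really contains.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found.

    Only top-level import names are supported (e.g. "numpy", not "numpy.linalg").
    Note that the import name can differ from the package name used by the
    package manager.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list of "common" packages; adjust to match environment.yaml.
    common = ["numpy", "xarray", "matplotlib"]
    missing = missing_packages(common)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all common packages are importable")
```

Participants could run the same check with their own list after installing extra dependencies on top of the base environment.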

Data

  • state-of-the-art compression (zstd, lz4) can reduce both disk usage and access time
  • test access to data using the hackathon environment
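
To compare codecs on a sample of the actual data before the event, a small round-trip benchmark can help. The sketch below uses only the standard-library codecs zlib and lzma as stand-ins, because zstd and lz4 typically require third-party packages that are not assumed to be installed here. The helper `benchmark` and the synthetic payload are illustrative assumptions; any pair of compress and decompress callables that operate on bytes can be passed in instead.

```python
import lzma
import time
import zlib


def benchmark(name, compress, decompress, payload):
    """Compress and decompress `payload` once; return ratio and timings."""
    t0 = time.perf_counter()
    packed = compress(payload)
    t1 = time.perf_counter()
    restored = decompress(packed)
    t2 = time.perf_counter()
    if restored != payload:
        raise ValueError(f"{name}: round trip changed the data")
    return {
        "codec": name,
        "ratio": len(payload) / len(packed),  # > 1 means the data got smaller
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


# Synthetic, highly repetitive payload (256 KiB); replace with real sample data.
payload = bytes(range(256)) * 1024

# Stand-in codecs from the standard library, not zstd or lz4.
results = [
    benchmark("zlib", zlib.compress, zlib.decompress, payload),
    benchmark("lzma", lzma.compress, lzma.decompress, payload),
]

for r in results:
    print(
        f"{r['codec']}: ratio {r['ratio']:.1f}, "
        f"compress {r['compress_s']:.4f} s, decompress {r['decompress_s']:.4f} s"
    )
```

Timings from a single run on synthetic data are only indicative; running the comparison on representative files from within the hackathon environment gives a more realistic picture.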