Processing needs
Lessons learned during previous hackathons
Lukas Kluft, Florian Ziemen
Hardware
a couple of dedicated analysis nodes in a SLURM partition (~32 at DKRZ, 10 above normal)
reservations have proven less useful than simply expanding the interactive partition
limit of 2 nodes per participant (larger jobs are usually user error)
Services
interactive access to the computing resources
JupyterHub (or similar) is usually an appreciated entry point
Python environments
it’s hard (if not impossible) to provide a single environment for all users
Compromise:
provide a lean Python environment with “common” scientific packages
users can install their exotic dependencies on top of that
Our recommendation
use micromamba to manage a basic Python environment
we should maintain a single environment.yaml to provide common packages
users may install more “exotic” packages on their own
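One way to check that the shared environment actually provides the common packages is a small script that participants or organizers run inside it. This is a minimal sketch, not part of the original slides: the package list `COMMON_PACKAGES` and the helper `missing_packages` are hypothetical, and the real list would follow whatever ends up in environment.yaml.

```python
import importlib.util

# Hypothetical list of "common" packages (an assumption for illustration);
# in practice this would mirror the shared environment.yaml.
COMMON_PACKAGES = ["numpy", "xarray", "matplotlib"]


def missing_packages(names):
    """Return the top-level package names that cannot be found
    in the currently active Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(COMMON_PACKAGES)
    if missing:
        print("missing from this environment:", ", ".join(missing))
    else:
        print("all common packages are available")
```

The check only looks up top-level import names without importing them, so it stays fast even for heavy packages. Anything reported as missing is a candidate either for the shared environment or for a user's own installation on top of it.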
Data
state-of-the-art compression can help to reduce both disk usage and access time (zstd, lz4)
test access to data using the hackathon environment
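The trade-off between disk usage and access time can be measured before the hackathon with a short script run in the hackathon environment. The following is a rough sketch under stated assumptions: `zlib` and `lzma` from the standard library serve as stand-ins that are always available, while zstd and lz4 are only included if the third-party `zstandard` and `lz4` packages happen to be installed. The function names and the synthetic sample are illustrative and not taken from the slides; real model output will compress differently.

```python
import lzma
import time
import zlib


def available_codecs():
    """Map codec name -> (compress, decompress) callables.

    zlib and lzma are standard-library stand-ins; zstd and lz4 are added
    only if the optional third-party packages can be imported.
    """
    codecs = {
        "zlib": (zlib.compress, zlib.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    try:
        import zstandard
        codecs["zstd"] = (
            zstandard.ZstdCompressor(level=3).compress,
            zstandard.ZstdDecompressor().decompress,
        )
    except ImportError:
        pass
    try:
        import lz4.frame
        codecs["lz4"] = (lz4.frame.compress, lz4.frame.decompress)
    except ImportError:
        pass
    return codecs


def compare(data, codecs):
    """Return compression ratio and decompression time for each codec."""
    report = {}
    for name, (compress, decompress) in codecs.items():
        packed = compress(data)
        start = time.perf_counter()
        restored = decompress(packed)
        elapsed = time.perf_counter() - start
        if restored != data:
            raise ValueError(f"{name} did not round-trip the data")
        report[name] = {"ratio": len(data) / len(packed), "seconds": elapsed}
    return report


if __name__ == "__main__":
    # Synthetic, highly repetitive sample (an assumption for illustration).
    sample = bytes(range(256)) * 4096
    for name, stats in compare(sample, available_codecs()).items():
        print(f"{name}: ratio {stats['ratio']:.1f}, "
              f"decompress {stats['seconds'] * 1e3:.2f} ms")
```

Decompression time is measured rather than compression time because participants mostly read data during a hackathon. Running the comparison on a representative slice of the actual dataset, from a compute node, gives a more meaningful picture than the synthetic sample used here.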