Careers
at the University Hospital & Faculty of Medicine
AI HPC Cluster Administrator (f/m/d)
The "Hertie Institute for AI in Brain Health" (Hertie AI) is a research institute of the Faculty of Medicine, funded by the Gemeinnützige Hertie Stiftung, with the aim of detecting diseases of the nervous system earlier and treating them better with the help of artificial intelligence. Hertie AI is currently in a dynamic build-up phase. It cooperates with the strong and innovative AI ecosystem in Tübingen (e.g. Cyber Valley, Cluster of Excellence “Machine Learning in Science”, Tübingen AI Center) and benefits greatly from infrastructure shared with these initiatives, such as the Machine Learning Cloud (ML Cloud), but it has special compute requirements because of its goal of analyzing brain data and simulating neural circuits. The ML Cloud is a state-of-the-art compute infrastructure with powerful CPU and GPU capacity for AI and petabyte-scale storage, used by more than 400 researchers and engineers.
About the role:
We are seeking a skilled and proactive Cluster System Administrator to join our team, responsible for managing and optimizing our high-performance computing environment specifically designed for AI workloads. In this role, you will work closely with a team of HPC experts, AI researchers, and IT specialists to ensure that our systems operate at peak performance, supporting AI and ML teams with reliable, scalable computing resources.
What you'll do:
- Cluster Management: Oversee and manage daily operations of the compute infrastructure, including configuration, deployment, and optimization of nodes and networks to maximize performance for AI workloads.
- System Monitoring and Maintenance: Monitor system performance, storage, and network utilization to ensure the clusters operate efficiently. Address hardware and software issues as they arise.
- User Support: Provide technical assistance to AI researchers, data scientists, and developers on efficient use of cluster resources.
- Documentation and Reporting: Create and maintain comprehensive documentation on system configuration, maintenance tasks, and troubleshooting procedures. Generate regular reports on system performance, uptime, and resource usage for management.
What you will bring (position requirements):
- Education and Experience: Specialist knowledge and professional experience in information technology, applied computer science or computer engineering equivalent to the level of a Master's degree
- Technical Skills: Proficiency in HPC cluster management tools (e.g., SLURM, PBS, or Torque) and in Linux system administration
- Scripting and Automation: Strong scripting skills in Python, Bash, or other languages to automate tasks, optimize processes, and improve system reliability
- Networking and Storage: Solid understanding of high-speed networking, parallel file systems, and large-scale storage solutions (e.g., Lustre, Ceph)
- Problem-Solving: Excellent troubleshooting abilities and a proactive approach to resolving system issues before they impact users
- Interest in artificial intelligence and motivation to collaborate with scientists and professionals in the field of AI research
- English proficiency
Relevant experience in some of the following technologies:
- Experience with automation tools for configuration management (e.g. Ansible, Puppet, Chef) and revision control systems (e.g. Git)
- Experience with containers (Docker / Singularity / Podman / Kubernetes)
What we offer:
- Collaboration in the multifaceted environment of a modern university hospital, which, in addition to patient care, also focuses on medical research and teaching
- Future-proof workplace and location, attractive remuneration including a company pension scheme (VBL), and working hours that are as flexible as possible
- Subsidized job ticket for public transport and attractive discounts on employee offer platforms
- Structured onboarding phase and the clinic's own academy for developing professional, social, and methodological skills
- Preventive health care through a wide range of sports activities
Contact:
We offer remuneration in accordance with TV-L (collective wage agreement for the Public Service of the German Federal States). In line with its internationalization agenda, the University of Tübingen welcomes applications from outside Germany. The University of Tübingen is committed to equal opportunity, diversity, and inclusion and wishes to increase the share of women and under-represented groups employed in research. Women are expressly encouraged to apply. Candidates with disabilities will be given preference if equally qualified. In principle, the position can be shared. Employment is based on the relevant provisions of university law. Please observe the applicable vaccination regulations. Unfortunately, interview travel costs cannot be reimbursed.
To apply, please send a cover letter and your CV in English, together with all relevant certificates, as a single PDF file by 01.01.2025. For more information or questions about technical aspects of the position, please contact Dr. Kristina Kapanova at kristina.kapanova@uni-tuebingen.de.
Dr. Kristina Kapanova
hertieai@medizin.uni-tuebingen.de
Closing date for applications:
02.02.2025
Applications should include a CV and cover letter and quote the index number 5579.
For more information, please visit:
www.medizin.uni-tuebingen.de/karriere