Commit a76c8928, authored 9 months ago by Martin Schroschk

Rework recabling notes

parent 79f70692
2 merge requests: !1093 Automated merge from preview to main, !1092 Rework recabling notes
Showing 2 changed files with 6 additions and 72 deletions:

* doc.zih.tu-dresden.de/docs/index.md (+2 −2)
* doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md (+4 −70)
doc.zih.tu-dresden.de/docs/index.md (+2 −2)
@@ -31,10 +31,10 @@ Please also find out the other ways you could contribute in our
## News

* **2024-05-08** [Maintenance at `Alpha Centauri` - User action required!](jobs_and_resources/alpha_centauri.md#recabling-maintenance)
* **2024-07-08 -- 2024-07-12** All HPC systems and services will be unavailable due to maintenance
  of the cooling infrastructure
* **2024-02-05** [New Support Form: HPC Café Questions & Answers](support/support.md#open-qa-sessions)
  ([to the Event](https://tu-dresden.de/zih/qa-sessions-nhr-at-tud))
* **2024-02-05** [New JupyterHub now available](access/jupyterhub.md)
## Training and Courses
doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md (+4 −70)
@@ -5,53 +5,6 @@ The multi-GPU cluster `Alpha Centauri` has been installed for AI-related computa
The hardware specification is documented on the page
[HPC Resources](hardware_overview.md#alpha-centauri).
## Recabling Maintenance
!!! warning "User Action Required"

    Please read the following information carefully and follow the provided instructions.
We are in the [process of making `Alpha Centauri` a stand-alone cluster](#becoming-a-stand-alone-cluster).
The next planned step is the integration of the cluster into the InfiniBand infrastructure of the
new cluster [`Barnard`](barnard.md).
!!! hint "Maintenance Work"

    On **June 4+5**, we will shut down and migrate `Alpha Centauri` to the Barnard InfiniBand
    infrastructure.

As a consequence,

* BeeGFS will no longer be available,
* all `Barnard` filesystems (`/home`, `/software`, `/data/horse`, `/data/walrus`) can be used
  normally.

For your convenience, we have already started migrating your data from `/beegfs` to
`/data/horse/beegfs`. Starting with the downtime, we will synchronize these data again.
!!! hint "User Action Required"

    The less we have to synchronize, the faster the overall process will be. So, please clean up
    as much as possible, as soon as possible.
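For orientation, a minimal sketch of how to spot what is still occupying space, assuming your data
lives in a BeeGFS workspace; the path and workspace name `marie-myproject` are placeholders:

```console
marie@login$ # list the largest items in the workspace, biggest first (placeholder path)
marie@login$ du -sh /beegfs/ws/marie-myproject/* | sort -rh | head
```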
Important for your work is:

* Do not add terabytes of data to `/beegfs` if you cannot "consume" it before June 4.
* After the final successful data transfer to `/data/horse/beegfs`, you then have to move the data
  to normal workspaces on `/data/horse` (see the sketch after this list).
* Be prepared to adapt your workflows to the new paths.
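A minimal sketch of that move, assuming the usual workspace tools; the workspace name `myproject`,
the duration, and the exact layout under `/data/horse/beegfs` are placeholders (check
`ws_allocate --help` for the available flags):

```console
marie@login$ # allocate a regular workspace on the horse filesystem (duration in days)
marie@login$ ws_allocate -F horse myproject 90
/data/horse/ws/marie-myproject
marie@login$ # move the migrated copy out of the transfer area into the new workspace
marie@login$ mv /data/horse/beegfs/marie-myproject /data/horse/ws/marie-myproject/
```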
What happens afterward:

* complete deletion of all user data in `/beegfs`
* complete recabling of the storage nodes (BeeGFS hardware)
* software and firmware updates
* set-up of a new WEKA filesystem for high I/O demands on the same hardware
In case of any questions regarding this maintenance or required actions, please do not hesitate to
contact the [HPC support team](../support/support.md).
## Becoming a Stand-Alone Cluster
The former HPC system Taurus is partly switched off and partly split up into separate clusters
@@ -64,29 +17,10 @@ stand-alone cluster with
### Filesystems
Your new `/home` directory (from `Barnard`) is also your `/home` on `Alpha Centauri`.
If you have not
[migrated your `/home` from Taurus to your **new** `/home` on Barnard](barnard.md#data-management-and-data-transfer),
please do so as soon as possible!
!!! warning "Current limitations w.r.t. filesystems"

    For now, `Alpha Centauri` will not be integrated in the InfiniBand fabric of Barnard. With this
    comes a severe restriction: **the only work filesystems for Alpha Centauri** will be the
    `/beegfs` filesystems. (`/scratch` and `/lustre/ssd` are no longer usable.)

    Please prepare your stage-in/stage-out workflows using our
    [datamovers](../data_transfer/datamover.md) to enable work with larger datasets that might be
    stored on Barnard's new capacity filesystem `/data/walrus`. The datamover commands are not yet
    running on `Alpha Centauri`; thus, you need to use them from Barnard! A sketch follows below
    this note.

    The new Lustre filesystems, namely `horse` and `walrus`, will be mounted as soon as `Alpha` is
    recabled (planned for May 2024).
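A minimal sketch of such a stage-in, run from Barnard and assuming the `dtcp` wrapper described on
the datamover page; the workspace paths are placeholders:

```console
marie@barnard$ # stage in a dataset from the capacity filesystem to the BeeGFS work filesystem
marie@barnard$ dtcp -r /data/walrus/ws/marie-input /beegfs/ws/marie-myproject/
marie@barnard$ # after the computation: stage results back out the same way
marie@barnard$ dtcp -r /beegfs/ws/marie-myproject/results /data/walrus/ws/marie-results/
```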
!!! warning "Current limitations w.r.t. workspace management"

    Workspace management commands do not work for `beegfs` yet. (Use them from Taurus!)
Your new `/home` directory (from `Barnard`) is also your `/home` on `Alpha Centauri`. Since 5 July
2024, `Alpha Centauri` has been fully integrated into the InfiniBand infrastructure of `Barnard`.
With that, all [filesystems](hardware_overview.md#storage-systems) (`/home`, `/software`,
`/data/horse`, `/data/walrus`) are available.
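You can quickly confirm this on a login node; the prompt is illustrative:

```console
marie@login$ # confirm that all four Barnard filesystems are mounted
marie@login$ df -h /home /software /data/horse /data/walrus
```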
## Usage