Invest in Discovery
Why Invest in the Discovery Condominium Model?
Investors have on-demand access to the resources they purchase and compete only with members of their own group for them. This is made possible by individualized queues linked to the purchased resources; only individuals approved by the purchaser have access to a queue. Participation in Discovery is at the investor’s discretion, and those who wish to remove their equipment from the system are free to do so. NMSU and ICT will provide system administrators for the equipment, a secure, climate-controlled location, and nightly storage for home directories. Investors also benefit by being able to burst out into the rest of the Discovery system, gaining easy access to additional resources as needed.
When investor-purchased resources aren’t in use, they are made available to the community as a whole. This type of arrangement is often referred to as a backfill queue. However, this general access can be revoked immediately when investors wish to use their resources. Any community job interrupted this way is automatically re-queued to ease the burden on users. Investors will also be given priority on this backfill queue.
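The preempt-and-requeue behaviour described above can be illustrated with a toy model. This Python sketch is not Discovery’s scheduler or its configuration; the class and method names (`InvestorNode`, `submit_backfill`, `submit_investor`, `finish`) are hypothetical and exist only to show the idea for a single node. It does not model investor priority within the backfill queue.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class InvestorNode:
    """Toy model (hypothetical) of one investor-owned node with backfill."""
    running: Optional[str] = None            # job currently on the node
    running_is_investor: bool = False
    investor_queue: Deque[str] = field(default_factory=deque)
    backfill_queue: Deque[str] = field(default_factory=deque)

    def submit_backfill(self, job: str) -> None:
        """Community jobs wait in the backfill queue until the node is idle."""
        self.backfill_queue.append(job)
        self._schedule()

    def submit_investor(self, job: str) -> None:
        """Investor jobs revoke community access immediately."""
        self.investor_queue.append(job)
        if self.running is not None and not self.running_is_investor:
            # Preempt the community job and requeue it at the front.
            self.backfill_queue.appendleft(self.running)
            self.running = None
        self._schedule()

    def finish(self) -> None:
        """Mark the running job as complete and start the next one."""
        self.running = None
        self._schedule()

    def _schedule(self) -> None:
        """Start the next job if the node is idle; investor jobs go first."""
        if self.running is not None:
            return
        if self.investor_queue:
            self.running = self.investor_queue.popleft()
            self.running_is_investor = True
        elif self.backfill_queue:
            self.running = self.backfill_queue.popleft()
            self.running_is_investor = False


node = InvestorNode()
node.submit_backfill("community-1")   # idle node, so the community job starts
node.submit_investor("owner-1")       # community-1 is preempted and requeued
node.finish()                         # owner-1 completes; community-1 restarts
```

In this sketch a preempted community job restarts from the beginning once the investor’s work is done, which mirrors the automatic re-queueing described above.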
Service Options
Investors can purchase 1 general compute node without prior approval. Due to the cost of the purchase (over $20k), more than 1 node needs approval, as the purchase must be competitively quoted by 3 parties. Writing node purchases into a grant doesn’t require any prior discussion with the HPC team. However, once the nodes are ready to be purchased, please speak with the HPC team, who can help you get the best price.
Users of the system may install and use software as needed. If assistance is needed, please expect an average lead time of 2–4 weeks. If assistance is needed for licensed software, please be prepared to review the license with us to assess the viability of installing it on an HPC system and for general use. Note: Some software requires a license per node, so to remain compliant, we must read the license.
Node costs are based on the equipment cost and include a 5-year warranty. After the 5 years have concluded, the purchased resource becomes part of the general-compute queue and will be maintained until it fails or is deemed unreliable by the HPC team. At that point, ICT may remove and dispose of the resource at its discretion.
If the investor wishes to remove their equipment from the system, they’re able to do so. Removing equipment causes several changes:
- The equipment will be taken from the restricted access room, although the shared access space can still be used for storing and running the machine
- The InfiniBand interconnect will remain behind, which will slow the communication between nodes, if several nodes were purchased
- A rack may be needed for performance
- A trained and knowledgeable administrator needs to be found for the machine (this is a requirement of having a device on the network)
How to Order?
Please review the Service Catalog below to decide what best suits your needs. Contact the Discovery admins at hpc-team@nmsu.edu when you are ready to purchase. Someone will contact you to verify the order, and then the nodes will be purchased. Please allow 2 months once the purchase has been approved by purchasing. The nodes will be installed and made available as quickly as possible; the majority of this time is for the equipment provider to manufacture and ship them.
If you need a special node, please contact the Discovery admins at hpc-team@nmsu.edu.
Availability of the System
Discovery is meant to be available for campus use around the clock, every day of the year. Maintenance windows lasting 2 weeks happen 2–3 times a year for important and major updates. These will be scheduled during low-use times, and users will be given at least 2 weeks’ notice. Occasionally an urgent issue arises that requires the system to be taken offline immediately. The administrators will communicate this and work to restore the system with as little inconvenience as possible to HPC users. Every effort will be made to pause jobs and resume them when the system is restarted.
Service Catalog
Standard Compute Nodes
The following nodes have been competitively priced. However, as they’re quotes, they’re estimates and may change. Pricing is heavily dependent on market conditions and the size of the purchase.
If you are planning to purchase or include nodes in a grant proposal, please contact Discovery admins at hpc-team@nmsu.edu for up-to-date pricing or if you need a different configuration.
| Processors | Cores Per Node | Memory | Local Disk | Networking | Budgetary Estimate/Warranty |
| --- | --- | --- | --- | --- | --- |
| 2x Intel Xeon Gold 6226R, 2.9–3.9 GHz, 16C/32T | 32 (64T) | 192 GB RDIMM, 2666MT/s, Dual Rank | 480GB SSD SATA | InfiniBand HDR (100Gbps) | $10,190 / 5 yrs |
| 2x Intel Xeon Gold 6226R, 2.9–3.9 GHz, 16C/32T | 32 (64T) | 384 GB RDIMM, 2666MT/s, Dual Rank | 480GB SSD SATA | InfiniBand HDR (100Gbps) | $11,490 / 5 yrs |
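For grant budgeting, the arithmetic behind an order of standard nodes can be sketched in a few lines of Python. The prices are the budgetary estimates from the table above and may change; the function name `estimate_order`, the configuration keys `"192GB"` and `"384GB"`, and the `needs_approval` flag are illustrative assumptions, and the flag simply encodes the "more than 1 node needs approval" rule from Service Options.

```python
from typing import Dict

# Budgetary estimates (USD) from the Standard Compute Nodes table.
# These are quotes, not fixed prices; confirm with the HPC team.
STANDARD_NODE_PRICES: Dict[str, int] = {
    "192GB": 10_190,
    "384GB": 11_490,
}
WARRANTY_YEARS = 5  # each estimate includes a 5-year warranty


def estimate_order(order: Dict[str, int]) -> Dict[str, object]:
    """Summarize an order given as {configuration: quantity}."""
    node_count = sum(order.values())
    total = sum(STANDARD_NODE_PRICES[cfg] * qty for cfg, qty in order.items())
    return {
        "nodes": node_count,
        "total_usd": total,
        # Cost spread evenly over the warranty period.
        "usd_per_year": total / WARRANTY_YEARS,
        # Assumption: any order of more than one node needs prior approval.
        "needs_approval": node_count > 1,
    }


summary = estimate_order({"192GB": 1, "384GB": 2})
print(summary["total_usd"])       # 33170
print(summary["needs_approval"])  # True
```

Because pricing depends on market conditions and the size of the purchase, a result like this is only a starting point for the conversation with the HPC team.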
Special Compute Nodes
If you are interested in purchasing a non-standard compute resource, please contact the Discovery administrators at hpc-team@nmsu.edu. The table below gives budgetary estimates, but you aren’t limited to these configurations; if you have a special need, please contact the team. The quoted prices are inclusive: they cover the cost of the equipment, 5 years of a manufacturer maintenance agreement, and 5 years of administration by ICT.
| Type | Processors | Accelerator(s) | Cores Per Node | Memory | Local Disk | Networking | Budgetary Estimate/Warranty |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GPU | 2x Intel Xeon Gold 6226R, 2.9–3.9 GHz, 16C/32T | 4x NVIDIA Tesla T4 16GB | 32 (64T) | 384 GB RDIMM, 2666MT/s, Dual Rank | 480GB SSD SATA | InfiniBand HDR (100Gbps) | $25,975 / 5 yrs |
| GPU | 2x Intel Xeon Gold 6226R, 2.9–3.9 GHz, 16C/32T | 2x NVIDIA Tesla V100s 32GB | 32 (64T) | 384 GB RDIMM, 2666MT/s, Dual Rank | 480GB SSD SATA | InfiniBand HDR (100Gbps) | $39,790 / 5 yrs |
| Large Memory | 4x Intel Xeon Gold 5120 2.2G, 14C/28T | N/A | 48 (96T) | 3TB LRDIMM, 2666MT/s, Quad Rank | 2TB 7.2K RPM NLSAS 12Gbps | InfiniBand HDR (100Gbps) | $82,830 / 5 yrs |
Storage
All users are provided with 100GB of backed-up home directory space and 1TB of non-backed-up scratch directory space. Users working together on a project can request 500GB of Project Space. This storage space is shared between users working on the project.