# **SPOKE 1 FUTURE HPC & BIG DATA**

# Hardware (HWS) and Software (SWI) living labs: **Specification of procurements for** laboratory full operation



Ministero dell'Università e della Ricerca

Finanziato dall'Unione europea NextGenerationEU

Italia**domani** 

# **EXECUTIVE SUMMARY**

The deliverable describes the state of the realization of the two national living labs planned for Spoke 1: the Hardware and System lab (HWS) in Bologna and the Software and Integration lab (SWI) in Torino. The two living labs have been designed to sustain research and technology transfer, acting as places to design, build and validate prototypes with industrial partners, starting from industrial needs and using academic knowledge, and to train the next generation of industrial researchers in HPC and Big Data.

| <u>Summary</u>                            |
|-------------------------------------------|
| Active tasks                              |
| Partners                                  |
| Dissemination and Exploitation Activities |
| Living labs HWS (Hardware & Systems)      |
| HWS Lab equipment provisioning activities |
| Chip design and prototyping provisioning  |
| System design and prototyping             |
| <u>Results</u>                            |
| Living lab SWI (Software & Integration)   |
| SWI Lab equipment design goals            |
| SWI Lab equipment provisioning activities |
| SWI Results                               |
| EuroHPC JU project coordination meeting   |
| HWS & SWI coordination                    |

#### Summary

For the two laboratories planned in Spoke1 (the Hardware and System lab, HWS-Lab, in Bologna, and the Software and Integration lab, SWI-Lab, in Torino), the spaces have been identified and, where needed, rented (as in the case of UNITO); adaptation and furnishing works are planned and in the early phases of procurement. Furthermore, the two labs have already started working on prototypes and collaborative activities, leveraging temporary spaces made available by the hosting institutions. These activities have already produced a joint demonstration and prototype, as well as a submitted publication, as highlighted in the dedicated section of this document.

Overall, the activities of Spoke1 are proceeding according to plan, with no noticeable delays. On the contrary, early scientific results (in the form of submitted publications) demonstrate that Spoke1 is ahead in ramping up scientific production and active collaborations.

#### Active tasks

Currently, tasks targeting milestone 5 are active (ending at Month 8 – April 2023).

The task aims at producing one deliverable for both living labs (HWS & SWI):

• HWS & SWI Labs: Specification of procurements for full laboratory operation (R);

#### Partners

The following partners are involved in the two tasks:

UNIBO, UNITO

#### **Dissemination and Exploitation Activities**

The two partners, the leader and co-leader of Spoke1, held almost weekly virtual meetings to coordinate spoke management and the set-up of the spoke labs. They also met in person in Dallas (USA) at the Supercomputing conference (Nov 2022) and in Bologna at the ISCS kick-off meeting (Nov 2022). In particular, the role of the living labs within the scientific organization of Spoke1 has been discussed and defined. The two labs acted as brokers between all industrial and academic partners affiliated with Spoke1 in setting up the Proof-of-Concept project to be implemented with innovation funds.

#### Living labs HWS (Hardware & Systems)

Infrastructure and provisioning activities

Within the reporting period, UNIBO started the HW Lab building infrastructure activities, identifying the spaces for the lab and starting the administrative steps and interactions with architects and electricians, toward the commencement of the renovation of the identified spaces. In the following, a detailed description of the activities is provided.

In November 2022, architect Chiara Semprini Cesari (ATES Technical Area for Building and Sustainability) was designated as the RUP (Responsible for Public Procurement) for the above-mentioned project, which is fully funded by PNRR (National Recovery and Resilience Plan) funds.

By the end of last December, the definitive identification of the rooms located on the raised ground floor of the lateral building was completed for housing the computer laboratories, server, computing center, and offices for Prof. Benini's team.

In January-February 2023, the functional organization of the spaces was shared with Prof. Benini based on the layout of the designated areas and the desired specifications communicated during meetings and email exchanges. Simultaneously, two external professionals were identified to be entrusted with the design and operational management of interventions regarding mechanical, electrical, fire safety, and data transmission systems.

In March-April 2023, a formal inquiry was submitted to the competent Ministry of Infrastructure regarding the critical aspects of applying the DNSH (Do No Significant Harm) principle when defining the project requirements (as per the Operational Guide 2022). The response received stated that our inquiry was too specific.

On April 4, 2023, the results of the study conducted internally within the department on the application of the DNSH principle to the specific field of construction, and on compliance with all prescribed regulations, were shared with the head of the office.

Consequently, discussions with the professionals resumed for the signing of contracts for the design of electrical, mechanical, fire safety, and data transmission systems.

By June 2023, the contracts will be signed, and formal design activities will commence, provided that the transfer of resources from the PNRR department to our budget is promptly completed.

It is expected that the lab will include 4 open spaces for Ph.D. students and postdocs, 4 offices for senior and permanent staff, 2 meeting rooms, a laboratory of electronics, a server room and a small kitchen, as described in Fig. 1.



Fig. 1. Floorplan of the HW lab.

## HWS Lab equipment provisioning activities

The HW lab expects the provisioning of research equipment and infrastructure required to start the research activities, in both chip design and system design fields.

#### Chip design and prototyping provisioning

- 1 server supporting the design of integrated circuits. The purpose of this server is to host EDA tools and related licenses for the design, simulation, functional verification, and physical implementation and verification of integrated circuits. The server will also host the design kits of the CMOS technologies used for power, performance, and area estimations of the IP blocks developed within the lab's research activities, as well as for the fabrication of silicon IC prototypes and demonstrators. This server might be made available to the partners of the spoke upon request. Modes and regulations for remotely accessing the server will be defined once the infrastructure is acquired.
- FPGA emulation platforms (2 Xilinx VCU 118 and 2 Xilinx VCU 128). The purpose of the FPGAs is to provide rapid prototyping infrastructure, as well as software development vehicles, for all the systems developed within the IC research activities of the lab. Such infrastructure allows software development for the architectures developed within the project to start significantly earlier, because prototyping a system on chip on FPGA is much faster than silicon manufacturing (typically a few weeks vs. several months). Two different kinds of FPGA are provisioned: a smaller one (Xilinx VCU 118), suitable for prototyping relatively small systems in the domain of automotive and satellite platforms targeting high-end embedded applications, and a larger one (Xilinx VCU 128), suitable for prototyping architectures in the HPC domain (i.e., similar to GP-GPUs). The FPGAs might be made available to the partners of the spoke upon request. Modes and regulations for remotely accessing the FPGAs will be defined once the infrastructure is acquired.

#### System design and prototyping

- One low power density rack (~5 kW) to host RISC-V based compute nodes, namely the Monte Cimone Rack (MC-Rack). The purpose of this rack is to host a COTS-based RISC-V system platform enhanced with RISC-V accelerators. It consists of the 8 already procured E4 RV007 Server Blades, based on a dual SiFive Freedom U740 SoC. These will be extended, based on market availability, with: 8 x AlveoU50, AlveoU55C, Xilinx MK180, and Xilinx VCU128 for HW emulation of RISC-V accelerators, commercially available RISC-V accelerators (e.g., InspireSemi, Tenstorrent, AxelleraAI, Esperanto) and upcoming HPC RISC-V server boards.
- One high power density rack (~20 kW) to host high performance computing equipment, namely the HPC Rack (HPC-Rack). The purpose of this rack is to host HPC-grade computing blades; the primary interest is in ARM hosts and flagship NVIDIA accelerator cards (the type may vary according to market availability). Overall, the rack will be composed of a storage server, 2x ARM + GPU servers, 1x heterogeneous server, 1x InfiniBand switch, and 1x network switch.
- SW and HW equipment for benchmarking, including smart PDUs identified as the Eaton ePDU metered output, for the HPC and MC Racks, as well as commercial license for SPEC benchmarks. Commercial licenses have been selected to allow high TRL benchmarking results.

#### Results

The HWS lab activities followed these main directions:

- Infrastructure provisioning and planning. During the period, the CN HPC HW Lab space has been identified in Via Carlo Pepoli 3/2; several site inspections have been performed with the architects and electricians to assess the suitability of the spaces, especially concerning safety regulations. The space allocation (offices, open spaces, electronics laboratory, data center) and a first version of the data center capacity plan have been provided. A research paper on federated learning has been written together with the UNITO SWI lab, comparing the RISC-V Monte Cimone system with ARM and Intel systems.
- RISC-V server platform based on COTS. The first RISC-V HPC cluster worldwide, namely Monte Cimone, has been installed in the ECS Lab and configured as a compute cluster. A first international course, Lab of big data architecture (Prof. Andrea Bartolini), has been designed. Students of the class learned about HPC and performance profiling on the Monte Cimone cluster, and how to benchmark the cluster's compute nodes and build empirical roofline models of them.
- Monte Cimone evolutions:
  - A discussion with an NVIDIA representative has been held on providing NVIDIA GPU acceleration to the Monte Cimone compute nodes. Cards have been identified for carrying out the activities. Initial smoke tests have been performed on the Linux operating system driver, showing that porting work is needed to activate the driver. The CUDA runtime has been identified as a roadblock for this activity.
  - FPGA acceleration cards have been identified.
- Data center automation: an initial meeting has been conducted with CINECA to set up the Examon framework porting to the Leonardo supercomputer. A follow-up meeting has been set.
- Remote access to the Monte Cimone system has been granted with best-effort support. A user guide is available at https://gitlab.com/ecs-lab/monte-cimone-doc. Physical and remote access to the HWS lab will be defined in the future based on the progress of the laboratory infrastructure installation.
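The roofline exercise mentioned above for the Monte Cimone course can be illustrated with a minimal sketch. It assumes the standard roofline formulation, in which attainable performance is the minimum of the compute roof and the memory roof (bandwidth times arithmetic intensity). The peak and bandwidth values are hypothetical placeholders chosen for readability, not measured Monte Cimone figures, and the function names are introduced here for illustration only.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound in GFLOP/s for a kernel with the given arithmetic
    intensity (FLOP per byte moved to or from memory)."""
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity (FLOP/byte) at which the two roofs meet."""
    return peak_gflops / bandwidth_gbs


# Hypothetical node parameters, NOT measured Monte Cimone values.
PEAK_GFLOPS = 4.0    # compute roof, e.g. obtained from a compute benchmark
BANDWIDTH_GBS = 2.0  # memory roof, e.g. obtained from a bandwidth benchmark

if __name__ == "__main__":
    ridge = ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS)
    for ai in (0.25, 1.0, 2.0, 8.0):
        bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, ai)
        regime = "memory-bound" if ai < ridge else "compute-bound"
        print(f"AI={ai:5.2f} FLOP/B -> {bound:4.1f} GFLOP/s ({regime})")
```

In an empirical roofline, both roofs would come from benchmarks run on the node itself rather than from datasheet values; kernels whose arithmetic intensity lies below the ridge point are limited by memory bandwidth.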

## Living lab SWI (Software & Integration)

In the reporting period, the SWI lab started the set-up of the physical lab located at the Computer Science Department of the University of Torino, which officially started its activities on 5th June 2023 by hosting the EuroHPC cross-projects coordination and integration meeting, attended by all the EuroHPC projects of the 2019 call (40 attendees) <u>https://alpha.di.unito.it/eurohpc-meeting-2023/</u>.

On 1st October, office premises (350 m²) located in the same building as the Computer Science Department of the University of Torino (Corso Svizzera 185, Torino) were rented (Fig. 2).

From September to December 2022, the primary focus was on turning the laboratory into an inviting co-working space for researchers from the universities and companies involved in the spoke, for people doing training, and for students working on their theses. Marco Aldinucci, the person responsible for the lab, collaborated with architect Lavinia Tagliabue (from the University of Torino) on the design and planning of the office spaces according to a modular design concept, so that all the furniture can be moved in 2026 to the new building of the Computer Science department of the University of Torino (currently under restoration). The following requirements had to be fulfilled: working space for 12 people, meeting rooms, 2 private offices, and a breakroom (Fig. 3).



Fig. 2 – The floor plan of the space for the SWI laboratory

Timeline:

- September 2022: Renting of spaces started.
- September 2022: Conceptual design ready (allocation of spaces and furniture), made with in-kind University of Torino resources (Arch. Lavinia Tagliabue, University of Torino).
- November 2022: Executive project ready (assigned to per.ind. Alessandro Destefanis).
- February 2023: Restoration works assigned to Elettra Muris sas di Muris Alberto & C, headquartered in via Saluzzo n. 20, 10064 Pinerolo (TO), Cod. Fisc. / P. I.V.A. 04311620019.
- April 2023: Restoration works started.
- May 2023: Restoration works completed, furniture installation (assigned to VIOLAUFFICIO di arch. M. Viola).
- June 2023: ICT material installation (Internet connection, switches, WiFi, teleconferencing systems, displays, etc.).

The laboratory became fully operational on 5th June 2023.





Fig. 3 – The floor plan of the space organized according to the SWI lab requirements

The final organization (on 5th June 2023) of the space provides (Fig. 4a-4b):

- 16 workstations for fellows and research assistants;
- 2 insulated offices not assigned to any specific person, to be used in turns for tasks that require privacy or concentration;
- 2 formal meeting spaces with video projection, accommodating 10 people;
- 1 small meeting space accommodating 4 people;
- 1 informal space, accommodating 4/6 people (also including a gender-balanced table football);
- 2 technical rooms;

- 1 service room;
- 1 coffee space;
- 1 printer and network services room.



Fig. 4a – The SWI lab at 5 June 2023



Fig. 4b – The SWI lab at 5 June 2023 (including air pollution monitoring data)

#### SWI Lab equipment design goals

The SWI lab design aims to be a national lighthouse to serve as a contamination lab between academia and industry in the area of software for cloud-HPC. Its design revolves around several objectives supporting this goal:

- 1. Modular. To be moved and reproduced in other institutions and buildings.
- 2. Green. All the devices, both the data center and the offices, are designed to match the world's best energy efficiency benchmarks and are software-defined so that all systems consuming energy can be programmed to respect a user-defined model (from lighting to data center).
- 3. **Wellbeing**. The working space includes areas for collaborative work, informal discussions and relaxation, supporting and boosting co-creation opportunities. Lighting power, use of daylight, light color temperature, air temperature, relative humidity and indoor air quality are directly controlled to maintain a healthy space that preserves and enhances the cognitive performance of the researchers. The same building hosts a 24/7 gym (not managed by the university).

- 4. **Attractive**. The office is designed to facilitate cooperation but also guarantees space to work in isolation. The furniture is chosen according to very high-quality design standards and inclusive principles supporting the sustainable vision of the HPC lab.
- 5. Sustainable. The SWI lab benefits from the University of Torino co-funding for offices (power, cleaning, networking, etc.) and ICT systems (HPC4AI data center). The SWI lab has a clear sustainability plan based on public funding (EU and national) and industrial investment. The lab aims at working on projects that address industrial needs (proposed by industries) using methods and IP developed at the university to 1) trigger a technology transfer path for academic research and 2) train next-generation industrial researchers who work directly on industrial problems with academic tutorship and training.
- 6. State-of-the-art systems. The SWI lab invests in state-of-the-art systems and prototypes to provide academic partners and industries with a real testing and validation environment for next-generation software running on current and next-generation systems. The systems are designed to support system software and applications from TRL<5 (components validated in a lab) to TRL 5-6 (technology and system demonstrated in a relevant environment). The SWI systems will be accessible by all FutureHPC spoke partners.

#### SWI Lab equipment provisioning activities

The SWI lab has standard office equipment for each workplace (laptop, monitor, tablet). The SWI lab benefits from the University of Torino's existing HPC4AI (https://hpc4ai.unito.it) Research Infrastructure (as an in-kind contribution). HPC4AI implements a Tier-3 250 kW green data center (PUE < 1.1; see the online monitoring at https://frontend.hpc4ai.unito.it:8080/BMS/) with 16 racks (22 kW/rack) and a cloud-HPC system described in Fig. 5. The SWI lab is equipped with two Dyson air cleaning systems (PM1-10, viruses including Covid-19, formaldehyde, etc.). HPC4AI and the SWI lab (Fig. 4b) are monitored live through 6 independent monitoring stations for all common air pollutants listed by the World Health Organization (PM 1/2.5/4/10, CO2, tVOC, O3, NH3, CO, NO2, CH2O), with historical time series publicly available at https://centraline.di.unito.it/



Fig. 5 – HPC4AI logical organization
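To make the PUE figure quoted for HPC4AI concrete, the following minimal sketch applies the usual definition of Power Usage Effectiveness (total facility power divided by IT equipment power) to the numbers given in the text. It assumes, purely for illustration, that the 250 kW figure refers to the IT load; the text does not state this, and the function names are hypothetical.

```python
def pue(total_facility_kw, it_kw):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_facility_kw / it_kw


def overhead_kw(it_kw, pue_value):
    """Non-IT power (cooling, power distribution, ...) implied by a PUE value."""
    return it_kw * (pue_value - 1.0)


# Figures taken from the text.
RACKS = 16
KW_PER_RACK = 22.0
DATA_CENTER_KW = 250.0  # ASSUMPTION: interpreted here as the IT load
PUE_BOUND = 1.1         # the text states PUE < 1.1

# Aggregate per-rack capacity, for comparison with the data center figure.
rack_capacity_kw = RACKS * KW_PER_RACK

if __name__ == "__main__":
    print(f"Aggregate rack capacity: {rack_capacity_kw:.0f} kW")
    print(f"Overhead at the PUE bound: below "
          f"{overhead_kw(DATA_CENTER_KW, PUE_BOUND):.0f} kW")
```

Under this assumption, a PUE below 1.1 means that cooling and power distribution together draw less than one tenth of the IT load.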

Beyond laptops, the acquisition of ICT systems is ongoing. The systems acquired so far are:

- RISC-V Esperanto accelerator (Fig. 6)
  - Engineering sample of RISC-V accelerator
- Different FPGA boards (Digilent Genesys 2, Arty A7) (Fig. 7)
  - The Digilent Genesys 2 board is an advanced, high-performance, ready-to-use digital circuit development platform based on the Kintex-7 FPGA. Given its high-capacity, high-speed FPGA, fast external memories, and high-speed digital video ports, we plan to use the board to experiment with accelerators and softcores, possibly based on non-floating point arithmetic.
  - The Arty A7 is a ready-to-use development platform for the Artix-7 low-power FPGA, specifically designed for use as a MicroBlaze Soft Processing System. The target here is to develop accelerators for Model Predictive Control (MPC), which is extremely popular in embedded systems.
- One experimental Intel+NVIDIA platform (2 CPU sockets + 4 GPUs), equipped with the first commercial prototype of two-phase cooling (Fig. 8), developed within the TEXTAROSSA EuroHPC project (Two-phase cooling experimental system 4xH100). A tender is in progress, and the deadline for the 2023 declaration of interest was the 6th June (see https://unito.ubuy.cineca.it/PortaleAppalti/it/ppgare\_avvisi\_lista.wp?actionPath=/ExtStr2/do/FrontEnd/Avvisi/view.action&currentFrame=7&codice=A00135&csrf=BA3X16LP7X4N44VE0HQ3M0ZMU5XP42WI).

Fig. 6 – Esperanto RISC-V accelerator board

Fig. 7: FPGA boards (Digilent Genesys 2, Arty A7)



Fig 8: Two-phase cooling server prototype (UNITO, E4, InQuattro)

#### **SWI** Results

Several activities involving the SWI Lab have already been carried out or planned.

- The organization of a cycle of periodic seminars from industrial partners to present industrial needs at UNITO: ENI (Bortot Feb 2023).
- The organization, together with the CINI HPC-KTT lab, of a series of HPC summer schools for Ph.D. students and early-stage researchers, to be held annually in a different city. The first two editions are planned for Pavia (Jun 2023) and Trento (Jun 2024). The school in Pavia (<u>https://hpc-summer-school.unipv.it/</u>) has just taken place on 12th-16th June (Fig. 9).
- Preliminary engagements to host the unfolding of industrial Proof-of-Concept projects (industrial research) with
  - Leonardo Company: cloud-HPC workflow management systems;
  - IntesaSanPaolo: Proof of concept software: Federated Learning for Finance;
  - iFAB: Proof of concept service: Federated Learning as a Service;
  - Sogei: DevOps cloud-HPC;
  - ENI: Cross-Platform Full Waveform Inversion;
  - Unipol: services provisioning and cloud-HPC workflow management system for the multi-spoke Hammon project.
- Partnership with local initiatives (sharing of office space and facilities)
  - CTE-NEXT <u>https://ctenext.it/</u> the house of emerging technologies (smart roads, urban air mobility, innovative urban services, 4.0 industry) that also acquired 3 professional teleconference systems for SWI lab.
  - ToMove (under negotiation) The "ToMove" project will create a Living Lab spread across the territory of the City of Turin focused on innovative solutions of cooperative, connected, and autonomous mobility, grafting and expanding the purposes and facilities of the ongoing initiatives.



Fig. 9 – Pictures from the Summer School in Pavia

#### EuroHPC JU project coordination meeting

As already stated, from 5th to 7th June the SWI premises hosted the ADMIRE project General Assembly and the EuroHPC cross-projects coordination and integration meeting, attended by all the EuroHPC projects of the 2019 call (40 attendees) <u>https://alpha.di.unito.it/eurohpc-meeting-2023/</u>. (Fig. 10a-10b)



Fig. 10a – EuroHPC coordination meeting moments



Fig. 10b – EuroHPC coordination meeting moments

## HWS & SWI coordination

SWI and HWS started a scientific collaboration on Machine Learning and Federated Learning for emerging architectures, including RISC-V and RISC-V accelerators. The collaboration stimulated the production of the first public port of the PyTorch and OpenFL frameworks to RISC-V (made at SWI) and the experimentation on the Monte Cimone RISC-V cluster (made at HWS). From the activity of the two labs, we derived several papers, some currently under peer review and others already accepted at major conferences, which are listed in the flagship technical deliverables. The joint activity of the two labs has been presented (poster and/or talk) at a number of venues, including:

- 1. HiPEAC 2023, Jan 2023, Toulouse, France
- 2. EuroHPC summit week 2023, Mar 2023, Gothenburg, Sweden
- 3. ACM Computing Frontiers 2023, May 2023, Bologna, Italy
- 4. EuroHPC project coordination meeting 2023, Jun 2023, Torino, Italy
- 5. RISC-V Summit Europe 2023, Jun 2023, Barcelona, Spain
- 6. 1st CINI HPC-KTT National Summer School, Jun 2023, Pavia, Italy

No deviations are currently envisaged.