Azure HPC Recipe Document for Ansys Fluent

1 Introduction

Running a complex CFD simulation requires a significant amount of time and up-to-date hardware with fast computational (CPU and GPU) capabilities. Microsoft Azure provides the infrastructure required to run these high-end workloads. Azure Virtual Machines are equipped with the latest CPUs and GPUs on the market. One such configuration is the HB120rs_v3 virtual machine.

The HBv3 virtual machine (Standard_HB120rs_v3) features up to 120 AMD EPYC 7003-series (Milan) CPU cores and 448 GB of RAM.

| Size | vCPU | Memory (GiB) | Memory bandwidth (GB/s) | Base CPU frequency (GHz) | All-cores frequency (GHz, peak) | Single-core frequency (GHz, peak) | RDMA performance (Gb/s) | Max data disks |
|---|---|---|---|---|---|---|---|---|
| Standard_HB120rs_v3 | 120 | 448 | 350 | 2.45 | 3.1 | 3.675 | 200 | 32 |
| Standard_HB120-96rs_v3 | 96 | 448 | 350 | 2.45 | 3.1 | 3.675 | 200 | 32 |
| Standard_HB120-64rs_v3 | 64 | 448 | 350 | 2.45 | 3.1 | 3.675 | 200 | 32 |
| Standard_HB120-32rs_v3 | 32 | 448 | 350 | 2.45 | 3.1 | 3.675 | 200 | 32 |
| Standard_HB120-16rs_v3 | 16 | 448 | 350 | 2.45 | 3.1 | 3.675 | 200 | 32 |

HBv3 VMs with different numbers of vCPUs were deployed to find the optimal configuration for Ansys Fluent 2021 R2 on a single node. Based on the single-node performance, the VM configuration for the multi-node runs was selected.

The following sections show the performance of Ansys Fluent 2021 R2 on Azure HBv3 virtual machines in single-node and multi-node cluster configurations.

2 Ansys Fluent 2021 R2 Performance on Azure Virtual Machines

2.1 Ansys Fluent Overview

Ansys Fluent enables users to solve complex CFD engineering problems and make better, faster design decisions. Ansys Fluent gives you more time to innovate and optimize product performance. With Ansys Fluent, you can create advanced physics models and analyze a variety of fluid phenomena, all in a customizable and intuitive space.

2.2 Ansys Fluent 2021 R2 Performance Results on Single Node

Performance tests were run on the test models below. For each model, the total wall-clock time per 100 iterations (in seconds) and the speedup relative to the 16-core run are reported.

1) Aircraft_wing_14m

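| Job Name | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| Aircraft_wing_14m | 16 | 860.67 | 1.00 |
| Aircraft_wing_14m | 32 | 569.03 | 1.51 |
| Aircraft_wing_14m | 64 | 442.69 | 1.94 |
| Aircraft_wing_14m | 96 | 433.45 | 1.99 |
| Aircraft_wing_14m | 120 | 429.54 | 2.00 |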
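The speedup column can be reproduced from the wall-clock times. The sketch below does this for Aircraft_wing_14m and also derives a parallel efficiency (speedup divided by the ideal speedup). Efficiency is not reported in this document; it is a standard derived metric added here for illustration, and the function names are hypothetical.

```python
# Wall-clock seconds per 100 iterations for Aircraft_wing_14m (single node),
# copied from the table above.
wall_times = {16: 860.67, 32: 569.03, 64: 442.69, 96: 433.45, 120: 429.54}

BASELINE_CORES = 16  # the 16-core run is the reference (speedup 1.00)


def speedup(times, cores, baseline=BASELINE_CORES):
    """Speedup relative to the baseline run: T(baseline) / T(cores)."""
    return times[baseline] / times[cores]


def parallel_efficiency(times, cores, baseline=BASELINE_CORES):
    """Speedup divided by the ideal speedup (cores / baseline)."""
    return speedup(times, cores, baseline) / (cores / baseline)


for cores in sorted(wall_times):
    s = speedup(wall_times, cores)
    e = parallel_efficiency(wall_times, cores)
    print(f"{cores:>4} cores  speedup {s:.2f}  efficiency {e:.0%}")
```

The same two functions can be applied to the other six models by swapping in their wall-clock times.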

2) Pump_2m

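| Job Name | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| Pump_2m | 16 | 213.83 | 1.00 |
| Pump_2m | 32 | 146.38 | 1.46 |
| Pump_2m | 64 | 118.26 | 1.81 |
| Pump_2m | 96 | 112.53 | 1.90 |
| Pump_2m | 120 | 115.47 | 1.85 |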


3) Landing_gear_15m

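| Job Name | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| Landing_gear_15m | 16 | 871.37 | 1.00 |
| Landing_gear_15m | 32 | 580.31 | 1.50 |
| Landing_gear_15m | 64 | 501.02 | 1.74 |
| Landing_gear_15m | 96 | 484.46 | 1.80 |
| Landing_gear_15m | 120 | 489.96 | 1.78 |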

4) Oil_Rig_7m

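| Job Name | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| Oil_Rig_7m | 16 | 377.11 | 1.00 |
| Oil_Rig_7m | 32 | 224.16 | 1.68 |
| Oil_Rig_7m | 64 | 152.42 | 2.47 |
| Oil_Rig_7m | 96 | 140.81 | 2.68 |
| Oil_Rig_7m | 120 | 132.34 | 2.85 |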


5) Sedan_4m

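| Job Name | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| Sedan_4m | 16 | 154.02 | 1.00 |
| Sedan_4m | 32 | 99.88 | 1.54 |
| Sedan_4m | 64 | 79.40 | 1.94 |
| Sedan_4m | 96 | 74.88 | 2.06 |
| Sedan_4m | 120 | 75.62 | 2.04 |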

6) Combustor_12m

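| Job Name | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| Combustor_12m | 16 | 3238 | 1.00 |
| Combustor_12m | 32 | 2085 | 1.55 |
| Combustor_12m | 64 | 1513 | 2.14 |
| Combustor_12m | 96 | 1360 | 2.38 |
| Combustor_12m | 120 | 1236 | 2.62 |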


7) Exhaust_System_33m

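| Job Name | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| Exhaust_System_33m | 16 | 2685 | 1.00 |
| Exhaust_System_33m | 32 | 1628 | 1.65 |
| Exhaust_System_33m | 64 | 1334 | 2.01 |
| Exhaust_System_33m | 96 | 1205 | 2.23 |
| Exhaust_System_33m | 120 | 1112 | 2.42 |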

2.3 Ansys Fluent 2021 R2 Performance Results on Multi-Nodes (Cluster)

The single-node results show that Ansys Fluent scales well up to the 64-core and 96-core VM configurations of the HBv3 series. The difference in scale-up between 64 and 96 cores is roughly 5% to 10%. Once license cost is taken into account, the 64-core configuration is the better choice for the end user in terms of both performance and cost. We therefore selected Standard_HB120-64rs_v3 (64 cores per node) for the multi-node simulations and ran the benchmarks on an HBv3 cluster.
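The 64-core versus 96-core comparison can be checked per model from the single-node wall-clock times in section 2.2. The sketch below is illustrative only; `gain_percent` is a hypothetical helper, and "gain" is assumed here to mean the extra throughput of the 96-core run relative to the 64-core run. Computed this way, the per-model values run from about 2% (Aircraft_wing_14m) to about 11% (Combustor_12m), bracketing the range quoted above.

```python
# Single-node wall-clock seconds per 100 iterations at 64 and 96 cores,
# copied from the tables in section 2.2.
times_64_96 = {
    "Aircraft_wing_14m": (442.69, 433.45),
    "Pump_2m": (118.26, 112.53),
    "Landing_gear_15m": (501.02, 484.46),
    "Oil_Rig_7m": (152.42, 140.81),
    "Sedan_4m": (79.40, 74.88),
    "Combustor_12m": (1513.0, 1360.0),
    "Exhaust_System_33m": (1334.0, 1205.0),
}


def gain_percent(t_64, t_96):
    """Extra throughput of the 96-core run relative to the 64-core run, in percent."""
    return (t_64 / t_96 - 1.0) * 100.0


gains = {model: gain_percent(*t) for model, t in times_64_96.items()}
for model, gain in gains.items():
    print(f"{model:<20} {gain:5.1f}%")
```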

The performance results for the Fluent models are given below.

1) Aircraft_wing_14m

| Nodes | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| 1 | 64 | 442.69 | 1.00 |
| 2 | 128 | 226.06 | 1.96 |
| 3 | 192 | 149.31 | 2.96 |
| 4 | 256 | 109.23 | 4.05 |


2) Pump_2m

| Nodes | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| 1 | 64 | 118.26 | 1.00 |
| 2 | 128 | 55.42 | 2.13 |
| 3 | 192 | 35.53 | 3.33 |
| 4 | 256 | 24.26 | 4.88 |

3) Landing_gear_15m

| Nodes | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| 1 | 64 | 501.02 | 1.00 |
| 2 | 128 | 247.17 | 2.03 |
| 3 | 192 | 160.02 | 3.13 |
| 4 | 256 | 117.78 | 4.25 |


4) Oil_Rig_7m

| Nodes | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| 1 | 64 | 152.42 | 1.00 |
| 2 | 128 | 75.48 | 2.02 |
| 3 | 192 | 52.76 | 2.89 |
| 4 | 256 | 41.38 | 3.68 |


5) Sedan_4m

| Nodes | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| 1 | 64 | 79.40 | 1.00 |
| 2 | 128 | 39.66 | 2.00 |
| 3 | 192 | 23.90 | 3.32 |
| 4 | 256 | 20.15 | 3.94 |

6) Combustor_12m

| Nodes | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| 1 | 64 | 1512.56 | 1.00 |
| 2 | 128 | 828.63 | 1.83 |
| 3 | 192 | 531.82 | 2.84 |
| 4 | 256 | 359.86 | 4.20 |


7) Exhaust_System_33m

| Nodes | Cores | Wall-time per 100 iters (s) | Speedup |
|---|---|---|---|
| 1 | 64 | 1333.72 | 1.00 |
| 2 | 128 | 629.02 | 2.12 |
| 3 | 192 | 399.66 | 3.34 |
| 4 | 256 | 304.05 | 4.39 |
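One way to read the multi-node tables is to divide each speedup by the node count: a value of 1.0 means ideal linear scaling, and a value above 1.0 means super-linear scaling. The sketch below applies this to all seven models. The `scaling_efficiency` helper and the labels are illustrative assumptions, not part of the original benchmark procedure.

```python
# Wall-clock seconds per 100 iterations on 1 to 4 nodes (64 cores per node),
# copied from the tables in section 2.3.
multi_node_times = {
    "Aircraft_wing_14m": [442.69, 226.06, 149.31, 109.23],
    "Pump_2m": [118.26, 55.42, 35.53, 24.26],
    "Landing_gear_15m": [501.02, 247.17, 160.02, 117.78],
    "Oil_Rig_7m": [152.42, 75.48, 52.76, 41.38],
    "Sedan_4m": [79.40, 39.66, 23.90, 20.15],
    "Combustor_12m": [1512.56, 828.63, 531.82, 359.86],
    "Exhaust_System_33m": [1333.72, 629.02, 399.66, 304.05],
}


def scaling_efficiency(times):
    """Return speedup / node count for each run; 1.0 means ideal linear scaling."""
    base = times[0]
    return [base / t / nodes for nodes, t in enumerate(times, start=1)]


for model, times in multi_node_times.items():
    eff = scaling_efficiency(times)
    label = "super-linear" if eff[-1] > 1.0 else "at or below linear"
    print(f"{model:<20} 4-node efficiency {eff[-1]:.2f} ({label})")
```

On these numbers, five of the seven models come out slightly above 1.0 at four nodes, while Oil_Rig_7m and Sedan_4m, two of the smaller cases, fall slightly below.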

3 Azure Cost

The cost figures below are indicative only. Application installation time is not considered; only the wall-clock time per 100 iterations for running each model in Ansys Fluent on HBv3 virtual machines is counted, and the reported hours are the sum over the seven test models. License cost is not included.

The hourly rates reported are subject to change. For current rates, refer to the Azure pricing calculator: https://azure.microsoft.com/en-in/pricing/calculator/

Azure cost calculation for single-node configuration

| VM Name | # CPUs | Azure VM hourly cost ($) | Wall-clock time (hours) | Azure consumption ($) |
|---|---|---|---|---|
| HB120-16rs_v3 | 16 | 4.68 | 2.33 | 10.92 |
| HB120-32rs_v3 | 32 | 4.68 | 1.48 | 6.93 |
| HB120-64rs_v3 | 64 | 4.68 | 1.15 | 5.38 |
| HB120-96rs_v3 | 96 | 4.68 | 1.06 | 4.95 |
| HB120rs_v3 | 120 | 4.68 | 1.00 | 4.67 |
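The single-node figures can be reproduced by summing the seven per-model wall-clock times from section 2.2, converting to hours, and multiplying by the hourly rate. The sketch below is a minimal illustration: `run_cost` is a hypothetical helper, and the $4.68 rate is the indicative value used in this document, not a current price.

```python
HOURLY_RATE_USD = 4.68  # indicative HBv3 rate used in this document; subject to change

# Wall-clock seconds per 100 iterations for the seven models in section 2.2,
# keyed by core count (order: aircraft wing, pump, landing gear, oil rig,
# sedan, combustor, exhaust system).
single_node_seconds = {
    16: [860.67, 213.83, 871.37, 377.11, 154.02, 3238, 2685],
    32: [569.03, 146.38, 580.31, 224.16, 99.88, 2085, 1628],
    64: [442.69, 118.26, 501.02, 152.42, 79.40, 1513, 1334],
    96: [433.45, 112.53, 484.46, 140.81, 74.88, 1360, 1205],
    120: [429.54, 115.47, 489.96, 132.34, 75.62, 1236, 1112],
}


def run_cost(seconds_per_model, hourly_rate=HOURLY_RATE_USD):
    """Return (total hours, cost in USD) for one VM running all models in turn."""
    hours = sum(seconds_per_model) / 3600.0
    return hours, hours * hourly_rate


for cores, seconds in single_node_seconds.items():
    hours, cost = run_cost(seconds)
    print(f"{cores:>4} cores  {hours:.2f} h  ${cost:.2f}")
```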

Azure cost calculation for multi-node configuration

| VM Name | # Nodes | # Cores | Azure VM hourly cost ($) | Wall-clock time (hours) | Azure consumption ($) |
|---|---|---|---|---|---|
| HB120-64rs_v3 | 1 | 64 | 4.68 x 1 | 1.15 | 5.38 |
| HB120-64rs_v3 | 2 | 128 | 4.68 x 2 | 0.58 | 5.46 |
| HB120-64rs_v3 | 3 | 192 | 4.68 x 3 | 0.38 | 5.28 |
| HB120-64rs_v3 | 4 | 256 | 4.68 x 4 | 0.27 | 5.08 |
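For a cluster, the hourly rate is multiplied by the node count before being applied to the wall-clock time. The sketch below reproduces the consumption column under that assumption. The totals are the sums of the seven per-model times from section 2.3 for each node count, and `cluster_cost` is a hypothetical helper.

```python
HOURLY_RATE_USD = 4.68  # indicative per-node rate used in this document
CORES_PER_NODE = 64     # 64-core HBv3 configuration selected for the cluster runs

# Total wall-clock seconds per 100 iterations, summed over the seven models
# in section 2.3, keyed by node count.
cluster_seconds = {1: 4140.07, 2: 2101.44, 3: 1353.00, 4: 976.71}


def cluster_cost(nodes, total_seconds, hourly_rate=HOURLY_RATE_USD):
    """Cost in USD when every node is billed for the full wall-clock time."""
    return nodes * hourly_rate * total_seconds / 3600.0


for nodes, seconds in cluster_seconds.items():
    cost = cluster_cost(nodes, seconds)
    print(f"{nodes} node(s), {nodes * CORES_PER_NODE:>3} cores: "
          f"{seconds / 3600.0:.2f} h, ${cost:.2f}")
```

Because wall-clock time falls roughly in proportion to the node count, the total stays close to $5 from one to four nodes while the time to solution drops by about a factor of four.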

4 Summary

  1. Ansys Fluent 2021 R2 was successfully deployed and tested on HBv3 (AMD EPYC 7003-series) Azure Virtual Machines.
  2. Single-node Ansys Fluent simulations scale well up to 64 and 96 cores; beyond that, the speedup saturates as more cores are added.
  3. For multi-node runs, Ansys Fluent scales approximately linearly with the number of nodes, as seen in the results above.

5 Running Ansys Fluent on Azure Virtual Machines

Users can use Ansys Cloud or, alternatively, reach out to any of the following contacts for further support.

  1. Contact through Ansys: cloud-sales@ansys.com
  2. Contact through Microsoft: Microsoft Global Black Belt team
  3. Contact through Capgemini: AzureHPC-Certification@capgemini.com