
Tuning Red Hat Enterprise Linux

for Databases
Sanjay Rao
Principal Performance Engineer, Red Hat

Objectives of this session


- Share tuning tips
- RHEL 7 scaling
- Aspects of tuning
- Tuning parameters
- Results of the tuning
  - Bare metal
  - KVM virtualization
- Tools

RED HAT ENTERPRISE LINUX: MORE THAN A DECADE OF INNOVATION

- Red Hat Advanced Server 2.1 (2002): bringing Linux and open source to the enterprise
- Red Hat Enterprise Linux 3 (2003): multi-architecture support, more choices with a family of offerings
- Red Hat Enterprise Linux 4 (2005): delivering RAS, storage, military-grade security
- Red Hat Enterprise Linux 5 (2007): virtualization, stateless Linux; any application, anywhere, anytime
- Red Hat Enterprise Linux 6 (2010): Linux becomes mainstream for physical, virtual, and cloud
- Red Hat Enterprise Linux 7 (2014): the foundation for the open hybrid cloud
RHEL Kernel Architecture Support


Architectures
- 64-bit only (with 32-bit user-space compatibility support)
  - x86_64
  - PPC64
  - s390x
Kernel packages: kernel, headers, debug

Theoretical limits on x86_64
- Logical CPU maximum: 5120 logical CPUs
- Memory maximum: 64 TB
Memory Mgt/Scheduler/Locking Enhancement


Memory Management
- Memory allocator: switch to the SLUB allocator for efficient memory allocation, reduced fragmentation, and, most importantly, better scalability.
- Memory allocation: fair zone allocator policy (in conjunction with kswapd) to better even out memory allocation and page reclaim across different zones.
- Thrash detection for the file cache allows the memory manager to better serve applications that access large files, such as data streaming and big data sets held in the file cache.
- Fine-grained page table locking for huge pages (back-ported): better performance when many threads access the virtual memory of a process simultaneously.
- Ranged TLB flush support on x86 (back-ported) to improve munmap() syscall performance.
- Sched/NUMA: the NUMA-balancing feature moves tasks (threads or processes) closer to the memory they are accessing, and also moves application data closer to the memory of the NUMA node whose tasks are referencing it.

Virtual Memory/Scheduler/Locking Enhancement


Scheduler/Locking Mechanism/Ticks
- I/O scheduler automatic switch: the deadline scheduler is used for enterprise storage devices
- Autogroup disabled to save context switches and improve performance
- Big kernel lock replaced with fine-grained subsystem locks (2.6.39)
- Micro-optimized smart wake-affinity
- Dynamic ticking (dyntick): KVM/HPC long-running processes, telecom, financial (any applications that cannot tolerate jitter; fewer interrupts)
- SysV IPC / semaphore scalability improvements
- And more...

What To Tune
I/O
Memory
CPU
Network

What is tuned?

A tuning framework that dynamically modifies system parameters that affect performance

Pre-existing list of profiles for different sub-systems / application classes

Existing profiles can be modified (not recommended)

Custom profiles can be created

Installed by default in RHEL 7


Desktop/Workstation: balanced
Server/HPC: throughput-performance

Can be rolled back
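A minimal sketch of switching and rolling back profiles with tuned-adm (profile names are from the list shown later in this deck):

tuned-adm active                           # show the currently active profile
tuned-adm profile throughput-performance   # switch to another profile
tuned-adm off                              # roll back: disable all tuned tunings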

Tuned: Updates for RHEL7


Re-written for maintainability and extensibility
Configuration is now consolidated into a single tuned.conf file per profile:
/usr/lib/tuned/<profile-name>/tuned.conf
Detailed configuration
Optional hook/callout capability
Adds the concept of inheritance (just like httpd.conf)
Profiles updated for RHEL7 features and characteristics

Tuned: Throughput Profiles - RHEL7


Tunable                          | Units        | balanced     | throughput-performance | network-throughput
Inherits from                    |              |              |                        | throughput-performance
sched_min_granularity_ns         | nanoseconds  | auto-scaling | 10000000               |
sched_wakeup_granularity_ns      | nanoseconds  | 3000000      | 15000000               |
dirty_ratio                      | percent      | 20           | 40                     |
dirty_background_ratio           | percent      | 10           | 10                     |
swappiness                       | weight 1-100 | 60           | 10                     |
I/O Scheduler (Elevator)         |              | deadline     |                        |
Filesystem Barriers              | Boolean      | Enabled      |                        |
CPU Governor                     |              | ondemand     | performance            |
Disk Read-ahead                  | KB           | 128          | 4096                   |
Disable THP                      | Boolean      | Enabled      |                        |
Energy Perf Bias                 |              | normal       | performance            |
kernel.sched_migration_cost_ns   | nanoseconds  | 500000       |                        |
min_perf_pct (intel_pstate only) | percent      | auto-scaling | 100                    |
tcp_rmem                         | Bytes        | auto-scaling |                        | Max=16777216
tcp_wmem                         | Bytes        | auto-scaling |                        | Max=16777216
udp_mem                          | Pages        | auto-scaling |                        | Max=16777216

(Blank cells inherit the value from the profile to their left.)

Tuned: Latency Profiles - RHEL7


Tunable                          | Units        | balanced     | latency-performance | network-latency
Inherits from                    |              |              |                     | latency-performance
sched_min_granularity_ns         | nanoseconds  | auto-scaling | 10000000            |
sched_wakeup_granularity_ns      | nanoseconds  | 3000000      | 10000000            |
dirty_ratio                      | percent      | 20           | 10                  |
dirty_background_ratio           | percent      | 10           | 3                   |
swappiness                       | weight 1-100 | 60           | 10                  |
I/O Scheduler (Elevator)         |              | deadline     |                     |
Filesystem Barriers              | Boolean      | Enabled      |                     |
CPU Governor                     |              | ondemand     | performance         |
Disable THP                      | Boolean      | N/A          | No                  | Yes
CPU C-States                     |              | N/A          | Locked @ 1          |
Energy Perf Bias                 |              | normal       | performance         |
kernel.sched_migration_cost_ns   | nanoseconds  | N/A          | 5000000             |
min_perf_pct (intel_pstate only) | percent      |              | 100                 |
net.core.busy_read               | microseconds |              |                     | 50
net.core.busy_poll               | microseconds |              |                     | 50
net.ipv4.tcp_fastopen            | Boolean      |              |                     | Enabled
kernel.numa_balancing            | Boolean      |              |                     | Disabled

(Blank cells inherit the value from the profile to their left.)

Tuned: Virtualization Profiles - RHEL7


Tunable                          | Units        | throughput-performance | virtual-host           | virtual-guest
Inherits from                    |              |                        | throughput-performance | throughput-performance
sched_min_granularity_ns         | nanoseconds  | 10000000               |                        |
sched_wakeup_granularity_ns      | nanoseconds  | 15000000               |                        |
dirty_ratio                      | percent      | 40                     |                        | 30
dirty_background_ratio           | percent      | 10                     |                        |
swappiness                       | weight 1-100 | 10                     |                        | 30
I/O Scheduler (Elevator)         |              |                        |                        |
Filesystem Barriers              | Boolean      |                        |                        |
CPU Governor                     |              | performance            |                        |
Disk Read-ahead                  | Bytes        | 4096                   |                        |
Energy Perf Bias                 |              | performance            |                        |
kernel.sched_migration_cost_ns   | nanoseconds  |                        | 5000000                |
min_perf_pct (intel_pstate only) | percent      | 100                    |                        |

(Blank cells inherit the value from the profile to their left.)

tuned profile list


# tuned-adm list
Available profiles:
- balanced
- desktop
- latency-performance
- network-latency
- network-throughput
- powersave
- sap
- throughput-performance
- virtual-guest
- virtual-host
Current active profile: throughput-performance

Tuned: Database workload

Tuned runs using PCI SSDs (kernel 3.10.0-113).
[Chart: transactions per minute at user counts of 10, 40, and 80 for the balanced, throughput-performance (TP), latency-performance (LP), and powersave profiles]

I/O Tuning: Hardware

Know your storage
- SAS or SATA? (Performance comes at a premium)
- Fibre Channel or Ethernet (check the PCI slot: PCI / PCIe x4, x8)
- Bandwidth limits (I/O characteristics for the desired I/O types)
Multiple HBAs
- Device-mapper multipath
- Provides multipathing capabilities and LUN persistence
- Check your storage vendor's recommendations (up to 20% performance gains with the correct settings)

How to profile your I/O subsystem
- Low-level I/O tools: dd, iozone, dt, etc.
- I/O representative of the database implementation

I/O Tuning: Understanding I/O Elevators

Deadline
- Two queues per device, one for reads and one for writes
- I/Os are dispatched based on time spent in the queue
- Used for multi-process applications and systems running enterprise storage
CFQ
- Per-process queue
- Each process queue gets a fixed time slice (based on process priority)
- Default setting; suited to slow storage (SATA) and the root file system
Noop
- FIFO
- Simple I/O merging
- Lowest CPU cost
- For low-latency storage and applications (solid state devices)

I/O Tuning: Configuring I/O Elevators

Boot-time
- Grub command line: elevator=deadline/cfq/noop
Dynamically, per device
- echo deadline > /sys/block/sda/queue/scheduler
tuned
- tuned-adm profile throughput-performance
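To confirm which elevator a device is actually using (a quick check, with sda as an example device; the scheduler in brackets is the active one):

cat /sys/block/sda/queue/scheduler
noop [deadline] cfq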

I/O Tuning: File Systems

Direct I/O
- Predictable performance
- Avoid double caching (data held once in the DB cache rather than in both the file cache and the DB cache)
- Reduce CPU overhead
Asynchronous I/O
- Eliminate synchronous I/O stalls
- Critical for I/O-intensive applications
Configure read ahead (for sequential read operations)
- Database (parameters to configure read ahead)
- Block devices (blockdev --getra / --setra; see the example below)
- Configure device read ahead for large data loads
[Diagram: flat files on file systems pass through the memory file cache, while direct I/O (DIO) goes straight to the DB cache, avoiding double caching]
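A minimal sketch of checking and raising device read ahead with blockdev (the device name and value are examples; the value is in 512-byte sectors):

blockdev --getra /dev/sdb          # show the current read-ahead value
blockdev --setra 1024 /dev/sdb     # raise read ahead to 1024 sectors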

I/O Tuning: Effect of read ahead during data load

Completion time for loading 30G of data, DB2 v9.7 (fp4): completion time is 42% better with device read ahead set to 1024 instead of 256.
[Chart: completion time (hh:mm:ss) with read ahead of 256 vs. 1024]

I/O Tuning: Database Layout

Separate files by I/O type (data, logs, undo, temp)
- OLTP: data files / undo / logs
  - All transactions generate logs and undo information
- DSS: data files / temp files
  - Merges / joins / indexes generally use temp segments
Use low-latency / high-bandwidth devices for hot spots
- Use database statistics

Linux tool used for identifying I/O hotspots:
iostat -dmxz <interval>
- This shows I/O only for the disks that are in use

I/O Tuning: OLTP - Logs

OLTP workload, logs on FC vs. Fusion-io, single instance: ~20% performance improvement by moving the logs to faster SSD drives.
[Chart: transactions per minute at 10, 40, and 80 users, logs on Fusion-io vs. logs on FC]

I/O Tuning: Storage (OLTP database)

OLTP workload, Fibre Channel vs. Fusion-io, 4 database instances.
[Chart: transactions per minute, FC vs. Fusion-io]

dm-cache
- Caching in the device-mapper stack to improve performance
- Frequently accessed data is cached on a faster target
- Tech preview

dm-cache testing with an OLTP workload, using an SSD device to back a SATA drive.
[Chart: transactions per minute at 10, 20, 40, 60, 80, and 100 user sets for SATA alone and dm-cache backed with Fusion-io (two runs)]

Memory Tuning
NUMA
Huge Pages
Manage Virtual Memory pages
Flushing of dirty pages
Swapping behavior

Understanding NUMA (Non Uniform Memory Access)


[Diagram: four-socket system. Each socket (S1-S4) has four cores (c1-c4) and a locally attached memory bank (M1-M4), with access paths between sockets and between sockets and their memory.]

Legend: S = socket, C = core, M = memory bank attached to each socket, D = database

[Diagram: without NUMA optimization, database instances D1-D4 are spread across sockets S1-S4; with NUMA optimization, each database instance is aligned to a single socket and its local memory.]

Multi Socket Multi core architecture


NUMA required for scaling
Autonuma code provides performance improvement
Additional performance gains by enforcing NUMA placement

Memory Tuning: Finding NUMA layout

[root@perf30 ~]# numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60
node 0 size: 32649 MB
node 0 free: 30868 MB
node 1 cpus: 1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61
node 1 size: 32768 MB
node 1 free: 29483 MB
node 2 cpus: 2 6 10 14 18 22 26 30 34 38 42 46 50 54 58 62
node 2 size: 32768 MB
node 2 free: 31082 MB
node 3 cpus: 3 7 11 15 19 23 27 31 35 39 43 47 51 55 59 63
node 3 size: 32768 MB
node 3 free: 31255 MB
node distances:
node   0   1   2   3
  0:  10  21  21  21
  1:  21  10  21  21
  2:  21  21  10  21
  3:  21  21  21  10

Memory Tuning: NUMA

Enforce NUMA placement
- numactl: CPU and memory pinning
- taskset: CPU pinning
- cgroups: cpusets (cpu and memory cgroups)
- libvirt: CPU pinning for KVM guests
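Hedged examples of enforcing placement (the node number, CPU list, and db_start command are placeholders, not taken from the deck):

numactl --cpunodebind=1 --membind=1 db_start   # run the instance on node 1's CPUs and memory
taskset -c 8-15 db_start                       # pin to CPUs 8-15 (CPU only, no memory policy)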

Memory Tuning: Effect of NUMA Tuning

The performance improvement from NUMA pinning is larger when there is more cross-NUMA activity.
[Chart: transactions per minute at 10, 40, and 80 users for 2 instances with huge pages, 2 instances with huge pages + NUMA pinning, 4 instances with huge pages, and 4 instances with huge pages + NUMA pinning]

Memory Tuning: NUMA - numad


What is numad?
User-level daemon to automatically improve out of the box NUMA system performance
Added to Fedora 17
Added to RHEL 6.3 as tech preview
Not enabled by default
What does numad do?
Monitors available system resources on a per-node basis and assigns significant
consumer processes to aligned resources for optimum NUMA performance.
Rebalances when necessary
Provides pre-placement advice for the best initial process placement and resource
affinity.

Memory Tuning: Effect of numad

RHEL 7, 4 VMs, 80-user run, running a database OLTP workload.
[Chart: transactions per minute for 4 VMs unpinned, 4 VMs with NUMA pinning, and 4 VMs with numad]

Memory Tuning: Huge Pages

2 MB pages vs. the standard 4 KB Linux page
- The virtual-to-physical page map is 512 times smaller
- The TLB can map more physical pages, resulting in fewer misses
- Traditional huge pages are always pinned
- Most databases support huge pages
- 1 GB pages are supported on newer hardware
- Transparent huge pages in RHEL 6 (cannot be used for database shared memory, only for process-private memory)
How to configure huge pages (16G)
- echo 8192 > /proc/sys/vm/nr_hugepages
- vi /etc/sysctl.conf (vm.nr_hugepages=8192)
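After setting vm.nr_hugepages, one way to verify the pool (a sketch; sysctl -p reloads /etc/sysctl.conf):

sysctl -p
grep Huge /proc/meminfo    # HugePages_Total should report 8192 with Hugepagesize: 2048 kB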

Memory Tuning: huge pages

OLTP workload: ~9% improvement in performance with huge pages.
[Chart: transactions per minute at 10, 20, 40, 60, and 80 user sets, regular pages vs. huge pages]

Tuning Memory: Flushing Caches

Drop unused cache
- Frees unused file cache memory
- If the database uses the file cache, you may notice a slowdown
Free pagecache:
echo 1 > /proc/sys/vm/drop_caches
Free slabcache:
echo 2 > /proc/sys/vm/drop_caches
Free pagecache and slabcache:
echo 3 > /proc/sys/vm/drop_caches

Tuning Memory: swappiness

- Not needed as much in RHEL 7
- Controls how aggressively the system reclaims mapped memory:
  - Default: 60
  - Decreasing: more aggressive reclaiming of unmapped pagecache memory, thereby delaying swapping
  - Increasing: more aggressive swapping of mapped memory
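For reference, a small sketch of checking and lowering swappiness (the value 10 matches the throughput-performance profile shown earlier):

cat /proc/sys/vm/swappiness
sysctl -w vm.swappiness=10                       # runtime change
echo "vm.swappiness = 10" >> /etc/sysctl.conf    # persist across reboots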

dirty_ratio and dirty_background_ratio

Pagecache writeback thresholds (as dirty pages grow from 0% to 100% of RAM):
- dirty_background_ratio (default 10% of RAM dirty): flushd is woken up and writes dirty buffers in the background
- dirty_ratio (default 20% of RAM dirty): processes performing write() start synchronous writes alongside flushd

If there is a lot of pagecache pressure, start background flushing sooner and delay the synchronous writes by:
- Lowering dirty_background_ratio
- Increasing dirty_ratio
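A sketch of shifting writeback toward earlier background flushing and later synchronous writes (the values are illustrative, not recommendations from this deck):

sysctl -w vm.dirty_background_ratio=5   # wake flushd sooner
sysctl -w vm.dirty_ratio=40             # let processes dirty more before synchronous writes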

Anything Hyper has to be good ... right?

- Using hyperthreads improves performance with database workloads, but the mileage will vary depending on how the database workload scales.
- Having more CPUs sharing the same physical cache can also help performance in some cases.
- Some workloads lend themselves to scaling efficiently and will do very well with hyperthreads, but if a workload's scaling is not linear with physical CPUs, it probably won't be a good candidate for scaling with hyperthreads.

Scaling test: Hyperthreading vs. no Hyperthreading

OLTP workload (96G SGA with different CPU counts).
[Chart: transactions per minute at 10, 40, and 80 user sets for 1, 2, 3, and 4 NUMA nodes, each with and without hyperthreading]

A single instance was scaled across NUMA nodes, one node at a time. The 1-node test shows the best gain in performance. As more NUMA nodes come into play, the performance difference is harder to predict because of memory placement and CPU cache sharing between the physical threads and hyperthreads of the CPUs.
Average % gain: 1 node 30% / 2 nodes 20% / 3 nodes 14% / 4 nodes 5.5%

Multiple database instances with and without Hyperthreading

Instances aligned with NUMA nodes, each with 16 processors and 32G memory (24G SGA): ~35% improvement with hyperthreading.
[Chart: transactions per minute at 10, 20, 40, 60, and 80 user sets, 4 instances with HT vs. 4 instances without HT]

Each of the 4 instances was aligned to an individual NUMA node. This test shows the best gain in performance because other factors that influence performance, such as NUMA placement and I/O, are not a factor.

Network Tuning: Databases

Network performance
- Separate networks for different functions (private network for database traffic)
- If on the same network, use arp_filter to prevent ARP flux:
  echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
Hardware
- 10GigE
  - Supports RDMA with the RHEL 6 high-performance networking package (RoCE)
- InfiniBand (consider the cost factor)
- Packet size (jumbo frames; see the example below)
[Diagram: hosts H1 and H2 connected over separate private and public networks]

Linux tool used for monitoring the network: sar -n DEV <interval>
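A minimal sketch of enabling jumbo frames on an interface (the interface name is an example; the switch and the storage target must also be configured for MTU 9000):

ip link set dev eth5 mtu 9000
ip link show eth5 | grep mtu     # verify the new MTU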

Network tuning: Jumbo Frames with iSCSI storage

OLTP workload.
[Chart: transactions per minute at 10, 40, and 100 users with MTU 1500 vs. MTU 9000]

DSS workload (derived metric based on query completion time).
[Chart: 9000 MTU vs. 1500 MTU]

Database Performance

Application tuning
Design
Reduce locking / waiting
Database tools (optimize regularly)
Resiliency is not a friend of performance

Please attend:
Oracle Database 12c on Red Hat Enterprise Linux: Best Practices
Thursday, April 17, 11 a.m. to 12 p.m., Room 102

Cgroup
- Resource management: memory, CPUs, I/O, network
  - For performance
  - For application consolidation
- Dynamic resource allocation
- Application isolation
- I/O cgroups (see the sketch below)
  - At the device level, control the percentage of I/O for each cgroup if the device is shared
  - At the device level, put a cap on throughput
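A hedged sketch of capping I/O with the blkio controller (RHEL 7 cgroup v1; the cgroup name, the 253:2 device numbers, and the 10 MB/s limit are examples only):

mkdir /sys/fs/cgroup/blkio/dbgroup
echo "253:2 10485760" > /sys/fs/cgroup/blkio/dbgroup/blkio.throttle.read_bps_device
echo "253:2 10485760" > /sys/fs/cgroup/blkio/dbgroup/blkio.throttle.write_bps_device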

Cgroup Resource Management

Cgroups are used to manage resources; by controlling resources, the performance of each instance can be controlled.
[Chart: per-instance throughput for instances 1-4 with no resource control vs. cgroup resource control]

Cgroup NUMA pinning

- Cgroups can be used for NUMA pinning and work with huge pages
- Aligning NUMA by using cgroups shows a 30% gain
- Huge pages work transparently with cgroups, giving another 13% gain
[Chart: transactions per minute for instances 1-4: no NUMA, cgroup NUMA, and cgroup NUMA + huge pages]
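A hedged sketch of the cpuset approach to pinning an instance to one NUMA node (cgroup v1 paths; the CPU list, node number, and PID are placeholders):

mkdir /sys/fs/cgroup/cpuset/inst1
echo 0-7 > /sys/fs/cgroup/cpuset/inst1/cpuset.cpus   # cores of node 0
echo 0 > /sys/fs/cgroup/cpuset/inst1/cpuset.mems     # memory of node 0
echo <db_pid> > /sys/fs/cgroup/cpuset/inst1/tasks    # move the database process into the cgroup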

Cgroup: Dynamic resource control

Dynamic CPU change with cgroups.
[Chart: transactions per minute over time for instance 1 and instance 2, first with 16 CPUs assigned to instance 1 and 64 to instance 2, then with 64 CPUs assigned to instance 1 and 16 to instance 2]

Cgroup Application Isolation

Memory resource management: even though one application runs out of resources and starts swapping, the other applications are not affected.
[Charts: Oracle OLTP workload, transactions per minute for instances 1-4 over time and system-level swap in/out, regular vs. throttled]

Database on RHEV

Quick Overview: KVM Architecture

Guests run as processes in user space on the host
- A virtual CPU is implemented as a Linux thread
- The Linux scheduler is responsible for scheduling a virtual CPU, just like any normal thread

Guests inherit features from the kernel


NUMA
Huge Pages
Support for new hardware

Virtualization Tuning: Caching

[Figures 1 and 2: VM1 and VM2 on a host; with cache=none the guests bypass the host file cache, with cache=writethrough guest I/O passes through the host file cache]

Cache = none (Figure 1)
- I/O from the guest is not cached on the host
Cache = writethrough (Figure 2)
- I/O from the guest is cached and written through on the host
- Works well on large systems (lots of memory and CPU)
- Potential scaling problems with multiple guests (host CPU is used to maintain the cache)
- Can lead to swapping on the host
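The cache mode is set per disk in the guest definition; a sketch of a libvirt disk stanza with cache=none (the source device path is an example):

<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/mapper/db_lun1'/>
  <target dev='vda' bus='virtio'/>
</disk>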

Virt Tuning: Effect of I/O Cache Settings

OLTP testing in 4 VMs, cache=writethrough (WT) vs. cache=none.
[Chart: transactions per minute at 10, 20, 40, 60, and 80 user sets for cache=none, cache=WT, and cache=WT run 2]

Virt Tuning: Using NUMA

4 virtual machines running an OLTP workload.
[Chart: transactions per minute, normalized to 100 for the no-NUMA case, comparing no NUMA, manual pinning, and numad; the gains shown are 7.05% and 7.54%]

Virt Tuning: Tuning Transparent Huge Pages

4 VM testing: comparison between THP and static huge pages on the host.
[Chart: transactions per minute at 10, 20, 40, 60, 80, and 100 user sets for THP with scan=10000, THP with scan=100, and huge pages]

Virt Tuning: Kernel Samepage Merging (KSM)

KSM and THP scanning
- KSM breaks down THP, so much of the performance advantage of THP is lost
- Some of it is recovered by lowering the scan interval
- The CPU overhead of KSM leaves less CPU time for the application
[Chart: transactions per minute at 40, 80, and 100 user sets for THP scan=10000 with KSM on (mem opt server 150), THP scan=100 with KSM on (mem opt server 150), THP scan=100 with KSM off (mem opt server 150), and THP scan=100 with KSM off and no memory optimization]

Use KSM for:
- Running Windows virtual machines on RHEV
- Oversubscribing memory on the host

RHEV 3.3 Migration

Migrating a 108G VM running OLTP at ~500K trans/min.
[Chart: transactions per minute over time for a regular run (no migration), migration with migration_max_bandwidth set to 32, and migration with it set to 0]

Configure migration_max_bandwidth = <value> in /etc/vdsm/vdsm.conf (see the sketch below)
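A hedged sketch of the vdsm.conf entry (the [vars] section name and the value of 32 MiB/s are assumptions for illustration):

# /etc/vdsm/vdsm.conf
[vars]
migration_max_bandwidth = 32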

Virtualization Tuning: Network

VirtIO
- VirtIO drivers for the network
vhost_net (low latency, close to line speed)
- Bypasses the qemu layer
PCI pass-through
- Bypasses the host and passes the PCI device to the guest
- Can be passed to only one guest
SR-IOV (Single Root I/O Virtualization)
- Pass-through to the guest
- Can be shared among multiple guests
- Limited hardware support

Tools

Performance Monitoring Tools


Monitoring tools
top, vmstat, ps, iostat, netstat, sar, perf
Kernel tools
/proc, sysctl, AltSysRq
Networking
ethtool, ifconfig
Profiling
oprofile, strace, ltrace, systemtap, perf

Performance Monitoring Tool perf


Performance analysis tool
perf top (dynamic)
perf record / report (save and replay)
perf stat <command> (analyze a particular workload)

Performance Monitoring Tool: perf top

[Screenshot: perf top during a multi-instance OLTP run without huge pages]

Performance Monitoring Tool: perf record / report

[Screenshot: perf report from a multi-instance OLTP run with huge pages]

Performance Monitoring Tool: perf stat

perf stat <command>
- Monitors any workload and collects a variety of statistics
- Can monitor specific events for any workload with the -e flag (perf list gives the list of events)

perf stat with regular 4k pages

Performance counter stats for <database workload>:

    7344954.315998  task-clock               #     6.877 CPUs utilized
        64,577,684  context-switches         #     0.009 M/sec
        23,074,271  cpu-migrations           #     0.003 M/sec
         1,621,164  page-faults              #     0.221 K/sec
16,251,715,158,810  cycles                   #     2.213 GHz                     [83.35%]
12,106,886,605,229  stalled-cycles-frontend  #    74.50% frontend cycles idle    [83.33%]
 8,559,530,346,324  stalled-cycles-backend   #    52.67% backend cycles idle     [66.66%]
 5,909,302,532,078  instructions             #      0.36 insns per cycle
                                             #      2.05 stalled cycles per insn [83.33%]
 1,585,314,389,085  branches                 #   215.837 M/sec                   [83.31%]
    43,276,126,707  branch-misses            #     2.73% of all branches         [83.35%]

    1068.000304798 seconds time elapsed

Performance Monitoring Tool: perf stat

perf stat with 2M huge pages

Performance counter stats for <database workload>:

    9262233.382377  task-clock               #     8.782 CPUs utilized
        66,611,453  context-switches         #     0.007 M/sec
        25,096,578  cpu-migrations           #     0.003 M/sec
         1,304,385  page-faults              #     0.141 K/sec
20,545,623,374,830  cycles                   #     2.218 GHz                     [83.34%]
15,032,568,156,397  stalled-cycles-frontend  #    73.17% frontend cycles idle    [83.33%]
10,633,625,096,508  stalled-cycles-backend   #    51.76% backend cycles idle     [66.66%]
 7,533,778,949,221  instructions             #      0.37 insns per cycle
                                             #      2.00 stalled cycles per insn [83.33%]
 2,109,143,617,292  branches                 #   227.714 M/sec                   [83.33%]
    45,626,829,201  branch-misses            #     2.16% of all branches         [83.33%]

    1054.730657871 seconds time elapsed

Performance Monitoring Tool: sar

Output of sar -n DEV 3 for a DSS workload running on iSCSI storage using different MTUs.

1500 MTU
01:40:08 PM  IFACE    rxpck/s   txpck/s      rxkB/s   txkB/s  rxcmp/s  txcmp/s  rxmcst/s
01:40:11 PM  eth0        0.34      0.34        0.02     0.02     0.00     0.00      0.00
01:40:11 PM  eth5   135016.78  19107.72   199178.19  1338.53     0.00     0.00      0.34
01:40:14 PM  eth0        0.66      0.00        0.05     0.00     0.00     0.00      0.66
01:40:14 PM  eth5   133676.74  18911.30   197199.84  1310.25     0.00     0.00      0.66
01:40:17 PM  eth0        0.67      0.00        0.05     0.00     0.00     0.00      0.67
01:40:17 PM  eth5   134555.85  19045.15   198502.27  1334.19     0.00     0.00      0.33
01:40:20 PM  eth0        1.00      0.00        0.07     0.00     0.00     0.00      0.67
01:40:20 PM  eth5   134116.33  18972.33   197849.55  1325.03     0.00     0.00      1.00

9000 MTU
06:58:43 PM  IFACE    rxpck/s   txpck/s      rxkB/s   txkB/s  rxcmp/s  txcmp/s  rxmcst/s
06:58:46 PM  eth0        0.91      0.00        0.07     0.00     0.00     0.00      0.00
06:58:46 PM  eth5   104816.36  48617.27   900444.38  3431.15     0.00     0.00      0.91
06:58:49 PM  eth0        0.00      0.00        0.00     0.00     0.00     0.00      0.00
06:58:49 PM  eth5   118269.80  54965.84  1016151.64  3867.91     0.00     0.00      0.50
06:58:52 PM  eth0        0.00      0.00        0.00     0.00     0.00     0.00      0.00
06:58:52 PM  eth5   118470.73  54382.44  1017676.21  3818.35     0.00     0.00      0.98
06:58:55 PM  eth0        0.94      0.00        0.06     0.00     0.00     0.00      0.00
06:58:55 PM  eth5   115853.05  53515.49   995087.67  3766.28     0.00     0.00      0.47

Performance Monitoring Tool: vmstat

Output of vmstat <interval> for an OLTP workload, first on PCI SSD storage and then on Fibre Channel storage.
Columns: r b swpd free buff cache si so bi bo in cs us sy id wa st
(swap stats: swpd si so; memory stats: free buff cache; I/O stats: bi bo; CPU stats: us sy id wa st)

[Sample output: OLTP workload, PCI SSD storage]
[Sample output: OLTP workload, Fibre Channel storage]

Performance Monitoring Tool: iostat

Output of iostat -dmxz 3

Device:  rrqm/s  wrqm/s      r/s      w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00   25.20     0.00     2.40   0.00    0.11     90.67      0.04  17.50  11.50   2.76
dm-0       0.00    0.00     0.00     1.00   0.00    0.00      8.00      0.05  47.00  15.20   1.52
dm-2       0.00    0.00     0.00    26.20   0.00    0.10      8.00      0.43  16.43   0.47   1.24
fioa       0.00   41.80  1057.60  3747.60  28.74  114.75     61.16      1.72   0.36   0.16  76.88
Device:  rrqm/s  wrqm/s      r/s      w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00    2.99     0.00     3.19   0.00    0.02     15.00      0.01   4.50   4.44   1.42
dm-2       0.00    0.00     0.00     5.99   0.00    0.02      8.00      0.01   2.43   2.37   1.42
fioa       0.00   32.93   950.70  3771.46  25.33  127.18     66.14      1.77   0.38   0.16  76.57
Device:  rrqm/s  wrqm/s      r/s      w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00   22.80     0.00     1.60   0.00    0.09    121.00      0.01   6.25   6.12   0.98
dm-2       0.00    0.00     0.00    24.20   0.00    0.09      8.00      0.11   4.69   0.40   0.98
fioa       0.00   40.00   915.00  3868.60  24.10  118.31     60.97      1.63   0.34   0.16  75.34
Device:  rrqm/s  wrqm/s      r/s      w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00   54.20     0.00     1.60   0.00    0.22    278.00      0.01   6.00   5.00   0.80
dm-2       0.00    0.00     0.00    55.60   0.00    0.22      8.00      0.24   4.26   0.14   0.80
fioa       0.00   39.80   862.00  3800.60  21.93  131.67     67.47      1.72   0.37   0.16  75.96
Device:  rrqm/s  wrqm/s      r/s      w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00    2.40     0.00     0.80   0.00    0.01     30.00      0.01   6.75   6.75   0.54
dm-2       0.00    0.00     0.00     3.00   0.00    0.01      8.00      0.01   1.80   1.80   0.54
fioa       0.00   36.00   811.20  3720.80  20.74  116.78     62.15      1.56   0.34   0.16  72.72

Wrap Up: Bare Metal


I/O
Choose the right elevator
Eliminate hot spots
Direct I/O or Asynchronous I/O
Virtualization Caching
Memory
NUMA
Huge Pages
Swapping
Managing Caches
RHEL has many tools to help with debugging / tuning

Wrap Up: Bare Metal (cont.)


CPU
Check cpuspeed settings
Network
Separate networks
arp_filter
Packet size

Thank you
