Tuning RHEL for Databases
Sanjay Rao
Principal Performance Engineer, Red Hat
Red Hat Enterprise Linux: 2002 - 2014
2002 - Red Hat Advanced Server 2.1: bringing Linux and open source to the enterprise
2003 - Red Hat Enterprise Linux 3: multi-architecture support, more choices with a family of offerings
2005 - Red Hat Enterprise Linux 4: delivering RAS, storage, military-grade security
2007 - Red Hat Enterprise Linux 5: virtualization, stateless Linux; any application, anywhere, anytime
2010 - Red Hat Enterprise Linux 6: Linux becomes mainstream for physical, virtual, and cloud
2014 - Red Hat Enterprise Linux 7: the foundation for the open hybrid cloud
What To Tune
I/O
Memory
CPU
Network
What is tuned?
tuned profile settings: balanced vs. throughput-performance (network-throughput inherits from throughput-performance and raises the network buffer maximums)

Setting                          Units           balanced        throughput-performance
sched_min_granularity_ns         nanoseconds     auto-scaling    10000000
sched_wakeup_granularity_ns      nanoseconds     3000000         15000000
dirty_ratio                      percent         20              40
dirty_background_ratio           percent         10              10
swappiness                       weight 1-100    60              10
I/O scheduler (elevator)                         deadline        deadline
Filesystem barriers              Boolean         Enabled         Enabled
CPU governor                                     ondemand        performance
Disk read-ahead                  KB              128             4096
Energy perf bias                                 normal          performance
kernel.sched_migration_cost_ns   nanoseconds     500000          5000000
min_perf_pct                     percent         N/A             100
Network rmem/wmem/udp_mem max    Bytes / Pages   auto-scaling    auto-scaling (network-throughput: Max=16777216)
tuned profile settings: balanced vs. latency-performance (network-latency inherits from latency-performance and adds the network settings at the bottom)

Setting                          Units           balanced        latency-performance
sched_min_granularity_ns         nanoseconds     auto-scaling    10000000
sched_wakeup_granularity_ns      nanoseconds     3000000         10000000
dirty_ratio                      percent         20              10
dirty_background_ratio           percent         10              3
swappiness                       weight 1-100    60              10
I/O scheduler (elevator)                         deadline        deadline
Filesystem barriers              Boolean         Enabled         Enabled
CPU governor                                     ondemand        performance
CPU C-states (cpu_dma_latency)                   N/A             Locked @ 1
Disk read-ahead                  KB              N/A             N/A
Energy perf bias                                 normal          performance
kernel.sched_migration_cost_ns   nanoseconds     N/A             5000000
min_perf_pct                     percent         N/A             100
net.core.busy_read / busy_poll   microseconds    N/A             50 / 50 (network-latency)
TCP fastopen                     Boolean         N/A             Enabled (network-latency)
Kernel NUMA balancing            Boolean         N/A             Disabled (network-latency)
tuned profile settings: throughput-performance (virtual-host and virtual-guest inherit from throughput-performance)

Setting                          Units           throughput-performance    Inherits From / Notes
sched_min_granularity_ns         nanoseconds     10000000
sched_wakeup_granularity_ns      nanoseconds     15000000
dirty_ratio                      percent         40                        virtual-guest: 30
dirty_background_ratio           percent         10
swappiness                       weight 1-100    10                        virtual-guest: 30
I/O Scheduler (Elevator)                         deadline
Filesystem Barriers              Boolean         Enabled
CPU Governor                                     performance
Disk Read-ahead                  KB              4096
Energy Perf Bias                                 performance
kernel.sched_migration_cost_ns   nanoseconds     5000000
min_perf_pct                     percent         100
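Profiles are inspected and switched with tuned-adm; a minimal sketch (profile names as shipped with RHEL 7):

tuned-adm active                           # show the currently applied profile
tuned-adm list                             # list the available profiles
tuned-adm profile throughput-performance   # apply a profile (persists across reboots)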
[Chart: OLTP transactions/min vs. user count (10, 40, 80) on kernel 3.10.0-113, comparing tuned profiles: balanced, throughput-performance (TP), latency-performance (LP), powersave]
Multiple HBAs
Device-mapper multipath
Provides multipathing capabilities and LUN persistence
Check your storage vendor's recommendations (up to 20% performance gains with correct settings)
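A minimal sketch of enabling multipathing on RHEL; the vendor-specific device settings go in /etc/multipath.conf:

mpathconf --enable --with_multipathd y    # create /etc/multipath.conf and start multipathd
multipath -ll                             # list multipath devices and their paths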
Deadline
Two queues per device, one for reads and one for writes
I/Os dispatched based on time spent in queue
Used for multi-process applications and systems running enterprise storage
CFQ
Per-process queues
Each process queue gets a fixed time slice (based on process priority)
Default setting - slow storage (SATA), root file system
Noop
FIFO
Simple I/O merging
Lowest CPU cost
Low-latency storage and applications (solid state devices)
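The scheduler can be checked and changed per block device; a sketch, with sdb standing in for a database LUN:

cat /sys/block/sdb/queue/scheduler               # the active scheduler is shown in brackets
echo deadline > /sys/block/sdb/queue/scheduler   # switch this device to deadline for the current boot
# To make it the default, add elevator=deadline to the kernel command line or let a tuned profile set it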
Direct I/O
Predictable performance
Avoid double caching: data is not held in both the DB cache and the file cache in memory
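A quick way to see the difference outside the database, assuming a scratch file under /data with enough free space (illustrative only):

dd if=/dev/zero of=/data/dio_test bs=1M count=1024                # buffered write through the page cache
dd if=/dev/zero of=/data/dio_test bs=1M count=1024 oflag=direct   # O_DIRECT write, bypassing the page cache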
Asynchronous I/O
Eliminates synchronous I/O stalls for databases that use flat files on file systems
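fio can exercise the same I/O modes the database uses; a sketch with an arbitrary file name, size, and block size:

fio --name=dbsim --filename=/data/fio_test --size=4G --bs=8k \
    --rw=randwrite --ioengine=libaio --iodepth=32 --direct=1 \
    --runtime=60 --time_based      # asynchronous (libaio) direct I/O at queue depth 32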
[Chart: job completion time (hh:mm:ss) comparing settings of 256 and 1024; this shows I/O for only the disks that are in use]
[Charts: OLTP transactions/min at 10U/40U/80U with database logs on Fusion-io vs. Fibre Channel, and transactions/min for 4 database instances on FC vs. Fusion-io]
dm-cache
Caching in the device-mapper stack to improve performance
Frequently accessed data is cached on a faster target
Technology Preview
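dm-cache is most easily managed through LVM (lvmcache); a minimal sketch, assuming a volume group vg_db containing the slow data LV lv_data and a fast PCI SSD /dev/fioa (all names are placeholders):

lvcreate -L 100G -n lv_cache vg_db /dev/fioa       # cache data LV on the fast device
lvcreate -L 1G -n lv_cache_meta vg_db /dev/fioa    # cache metadata LV
lvconvert --type cache-pool --poolmetadata vg_db/lv_cache_meta vg_db/lv_cache
lvconvert --type cache --cachepool vg_db/lv_cache vg_db/lv_data   # attach the cache pool to the slow LV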
[Chart: OLTP transactions/min across 10U-100U user sets: SATA alone vs. dm-cache backed with Fusion-io (two runs)]
Memory Tuning
NUMA
Huge Pages
Manage Virtual Memory pages
Flushing of dirty pages
Swapping behavior
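Flushing and swapping behavior are controlled by vm.* sysctls; a sketch using the throughput-performance values from the tables above:

sysctl -w vm.dirty_ratio=40               # force synchronous flushing when dirty pages reach 40% of memory
sysctl -w vm.dirty_background_ratio=10    # start background flushing at 10%
sysctl -w vm.swappiness=10                # make the kernel less eager to swap
# persist the settings in a file under /etc/sysctl.d/, or let a tuned profile manage them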
[Diagram: four-socket NUMA system - S = socket, C = core, M = memory bank attached to each socket, D = database instance; access paths run between sockets, and between sockets and memory. With no NUMA optimization, each database (D1-D4) is spread across sockets S1-S4; with NUMA optimization, each database is confined to one socket and its local memory bank.]
CPU pinning tools (see the sketch below)
Taskset - CPU pinning
cgroups - cpusets, cpu and memory cgroups
Libvirt - CPU pinning for KVM guests
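A minimal sketch of each approach, assuming NUMA node 0 is the target; db_start, PID 1234, and guest1 are placeholders:

numactl --cpunodebind=0 --membind=0 db_start    # bind CPUs and memory to NUMA node 0
taskset -pc 0-7 1234                            # pin an already-running PID to CPUs 0-7
cgcreate -g cpuset:/db1                         # cpuset cgroup pinned to node 0
cgset -r cpuset.mems=0 -r cpuset.cpus=0-7 db1
virsh vcpupin guest1 0 2                        # pin a KVM guest's vCPU 0 to host CPU 2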
[Charts: transactions/min at 10U/40U/80U for 2 and 4 database instances with huge pages, with and without NUMA pinning; 4 KVM guests unpinned vs. NUMA-pinned vs. numad-managed (~9% improvement in performance); and regular 4K pages vs. huge pages across 10U-80U user set counts]
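Static huge pages are reserved through vm.nr_hugepages and then used by the database's shared memory; a sketch sized at 8 GB of 2 MB pages (adjust to the buffer pool):

sysctl -w vm.nr_hugepages=4096    # reserve 4096 x 2 MB huge pages, best done at or shortly after boot
grep -i huge /proc/meminfo        # verify HugePages_Total / HugePages_Free
# persist in /etc/sysctl.d/ and enable the database's huge/large page option so the SGA or buffer pool uses them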
Unused memory vs. cache: free pagecache, free slabcache
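Those caches are dropped through vm.drop_caches; a sketch, useful for testing and measurement rather than routine production use:

sync; echo 1 > /proc/sys/vm/drop_caches   # free the pagecache
echo 2 > /proc/sys/vm/drop_caches         # free reclaimable slab objects (dentries, inodes)
echo 3 > /proc/sys/vm/drop_caches         # free both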
Objectives of this session
Anything Hyper has to be good ... right?
Using hyperthreads improves performance with database workloads, but the mileage will vary depending on how the database workload scales.
Having more CPUs sharing the same physical cache can also help performance in some cases.
Some workloads lend themselves to scaling efficiently and will do very well with hyperthreads, but if a workload's scaling factor is not linear with physical CPUs, it probably won't be a good candidate for scaling with hyperthreads.
[Chart: transactions/min at 10U/40U/80U user sets for a single instance spanning 1, 2, 3, and 4 NUMA nodes, with and without hyperthreading]
Single instance scaled across NUMA nodes, one node at a time. The 1-node test shows the best gain in performance. As more NUMA nodes come into play, the performance difference is harder to predict because of memory placement and the CPU cache sharing among physical threads and hyperthreads of the CPUs.
Average % gain: 1 node 30% / 2 nodes 20% / 3 nodes 14% / 4 nodes 5.5%
[Chart: transactions/min across 10U-80U user sets for 4 instances with vs. without hyperthreading; ~35% improvement with HT]
Each of the 4 instances was aligned to an individual NUMA node. This test shows the best gain in performance because other factors that influence performance, such as NUMA placement and I/O, are not in play.
Network Performance
Separate networks for different functions (private network for database traffic)
[Charts: OLTP workload at 10U/40U/100U and DSS workloads, comparing MTU 1500 vs. MTU 9000 (jumbo frames)]
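Jumbo frames are enabled by raising the MTU on the database-facing interface; every host and switch on that private network must match. A sketch, with em2 as a placeholder interface name:

ip link set dev em2 mtu 9000                                    # apply for the current boot
echo "MTU=9000" >> /etc/sysconfig/network-scripts/ifcfg-em2     # persist across reboots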
Database Performance
Application tuning
Design
Reduce locking / waiting
Database tools (optimize regularly)
Resiliency is not a friend of performance
Please attend
Oracle Database 12c on Red Hat Enterprise Linux: Best practices
Thursday, April 17 11 a.m. to 12 p.m. in Room 102
Cgroups
Resource management - memory, CPUs, I/O, network
For performance
For application consolidation
Dynamic resource allocation
Application isolation
I/O cgroups
At the device level, control the percentage of I/O for each cgroup when the device is shared
At the device level, put a cap on throughput
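On RHEL 7 (cgroup v1) both controls live in the blkio controller; a sketch using the libcgroup tools, with the group name db1, the device numbers 8:16, and db_start as placeholders:

cgcreate -g blkio:/db1                                        # blkio group for one database instance
cgset -r blkio.weight=200 db1                                 # proportional share of a contended device (100-1000)
cgset -r blkio.throttle.read_bps_device="8:16 10485760" db1   # hard cap: 10 MB/s reads on device 8:16
cgexec -g blkio:/db1 db_start                                 # run the instance inside the group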
[Charts: transactions/min for instances 1-4 with no resource control, and for 4 instances without NUMA control vs. with cgroup NUMA pinning]
[Chart: swap-in/swap-out pages over time (up to ~35 K) for 4 instances, regular vs. throttled with a memory cgroup]
Even though one application runs out of resources and starts swapping, the other applications are not affected.
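One way to get that isolation is a memory cgroup limit on the over-committed instance; a sketch, with db2, the 8 GB cap, and db_start as placeholders:

cgcreate -g memory:/db2
cgset -r memory.limit_in_bytes=8G db2    # this instance swaps once it exceeds 8 GB; the others keep their memory
cgexec -g memory:/db2 db_start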
Database on RHEV
[Diagram: two guests (VM1, VM2) on a host - with caching enabled, guest I/O passes through the host file cache; with cache=none it bypasses it]
[Chart: transactions/min across 10U-80U user set counts with cache=none vs. cache=WT (writethrough), two runs]
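The guest disk cache mode is a per-disk libvirt setting; cache=none bypasses the host page cache and is the usual choice for database storage. A sketch, with the guest name and device path as placeholders:

virsh attach-disk guest1 /dev/vg_db/lv_vm1_data vdb \
      --subdriver raw --cache none --live --persistent
# equivalent to <driver name='qemu' type='raw' cache='none'/> in the guest's domain XML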
[Chart: transactions/min for KVM guests with no NUMA pinning vs. manual pinning vs. NUMAD]
[Chart: transactions/min across 10U-100U user sets: THP with scan=10000, THP with scan=100, and static hugepages]
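Transparent huge pages are controlled under /sys/kernel/mm/transparent_hugepage; assuming the scan values above refer to khugepaged's scan interval, a sketch:

cat /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs    # how often khugepaged scans (ms)
echo 100 > /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs
echo never > /sys/kernel/mm/transparent_hugepage/enabled                   # or disable THP and use static huge pages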
[Chart: transactions/minute over time at 40U/80U/100U user sets: regular run vs. runs with migration bandwidth set to 32 and to 0]
Tools
perf stat with regular 4k pages
Performance counter stats for <database workload>:

      7344954.315998 task-clock                #     6.877 CPUs utilized
          64,577,684 context-switches          #     0.009 M/sec
          23,074,271 cpu-migrations             #     0.003 M/sec
           1,621,164 page-faults                #     0.221 K/sec
  16,251,715,158,810 cycles                     #     2.213 GHz                     [83.35%]
  12,106,886,605,229 stalled-cycles-frontend    #    74.50% frontend cycles idle    [83.33%]
   8,559,530,346,324 stalled-cycles-backend     #    52.67% backend cycles idle     [66.66%]
   5,909,302,532,078 instructions               #     0.36  insns per cycle
                                                #     2.05  stalled cycles per insn [83.33%]
   1,585,314,389,085 branches                   #   215.837 M/sec                   [83.31%]
      43,276,126,707 branch-misses              #     2.73% of all branches         [83.35%]

      1068.000304798 seconds time elapsed
[Charts: OLTP workload on PCI SSD storage and on Fibre Channel storage]
Memory stats
I/O stats
CPU stats
Swap stats
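A minimal set of commands covering those categories (sar, iostat, and mpstat come from the sysstat package):

vmstat 5            # memory, swap in/out, run queue, CPU
sar -W 5            # swapping activity (pages swapped in/out per second)
iostat -dmxz 5      # per-device I/O rates, utilization, await
mpstat -P ALL 5     # per-CPU utilization
free -m             # memory and cache snapshot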
Thank you