Gshare and Pshare Branch Predictors

The document provides a comprehensive analysis of Gshare and Pshare branch predictors, highlighting their mechanisms, advantages, and disadvantages. Gshare utilizes global history for branch prediction while Pshare focuses on local history, making them suitable for different scenarios in modern processors. Both predictors are employed in various CPU architectures, with ongoing research aimed at enhancing their performance through hybrid models and advanced techniques.

Uploaded by

maneabhishek5355

Gshare and Pshare Branch Predictors: A Comprehensive Analysis

Branch prediction is crucial for high-performance superscalar and out-of-order processors.
Two widely used index-sharing predictors are Gshare and Pshare, both of which attempt
to improve accuracy while managing aliasing in Pattern History Tables (PHTs).

1. Gshare Branch Predictor

Gshare (Global Sharing Predictor) was introduced by Scott McFarling (1993) and is a
global-history-based branch predictor. Instead of indexing with global history alone
(GAg predictor) or with a simple concatenation of global history and PC bits (gselect),
Gshare XORs the PC with the global history register to create a more evenly distributed
PHT index.

How Gshare Works

1. Branch History Register (BHR) stores recent branch outcomes (e.g., last 12
branches).
2. The lower bits of the program counter (PC) are XORed with the BHR.
3. The result is used to index the PHT, which contains 2-bit saturating counters for
branch predictions.
4. The PHT counter updates based on actual branch behavior, adjusting over time.

📌 Key Idea: XORing spreads out PHT index entries more evenly, reducing aliasing
conflicts.
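
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a hardware-accurate model: the class name `Gshare`, the helper `count_mispredictions`, the table size, the initial counter value, and the assumption of 4-byte instructions (hence `pc >> 2`) are all choices made for this example.

```python
class Gshare:
    """Sketch of a gshare predictor; sizes and names are illustrative."""

    def __init__(self, history_bits=12):
        self.mask = (1 << history_bits) - 1
        self.bhr = 0                          # global Branch History Register
        self.pht = [1] * (1 << history_bits)  # 2-bit counters, start weakly not-taken

    def index(self, pc):
        # Step 2: XOR the low PC bits with the BHR (assumes 4-byte instructions).
        return ((pc >> 2) ^ self.bhr) & self.mask

    def predict(self, pc):
        # Step 3: counter values 2 and 3 predict taken, 0 and 1 predict not taken.
        return self.pht[self.index(pc)] >= 2

    def update(self, pc, taken):
        # Step 4: saturate the counter toward the real outcome,
        # then shift the outcome into the BHR (step 1).
        i = self.index(pc)
        self.pht[i] = min(3, self.pht[i] + 1) if taken else max(0, self.pht[i] - 1)
        self.bhr = ((self.bhr << 1) | int(taken)) & self.mask


def count_mispredictions(predictor, trace):
    """Run (pc, taken) pairs through the predictor and count wrong guesses."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# An always-taken branch at a made-up address, predicted with a 4-bit history.
trace = [(0x400, True)] * 50
misses = count_mispredictions(Gshare(history_bits=4), trace)
```

With a 4-bit history, the always-taken branch mispredicts once for each new history value seen during warm-up (0000, 0001, 0011, 0111, 1111) and is predicted correctly afterwards.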

📌 Pros of Gshare

✔ Better utilization of PHT entries: The XOR operation reduces destructive aliasing
compared to GAg.
✔ Effective for correlated branches: Captures global branch behavior efficiently.
✔ Simple implementation: Only modifies the index function, making it hardware-
friendly.

❌ Cons of Gshare

✖ Not ideal for local branch patterns: Since it uses global history, it may fail for
branches that are locally dependent.
✖ Extra logic on the index path: The XOR adds a gate delay before the PHT access. The
cost is small, but it sits on the critical prediction path in deep pipelines.
✖ Aliasing still exists: While less than GAg, some branches still overwrite each other's
history.
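
A small sketch shows why aliasing survives the XOR. The 4-bit index width and the specific bit patterns are made up for illustration, and `gshare_index` is a hypothetical helper, not part of any library.

```python
from collections import Counter


def gshare_index(pc_bits, history, index_bits=4):
    """XOR-based PHT index, truncated to index_bits."""
    return (pc_bits ^ history) & ((1 << index_bits) - 1)


# Two unrelated (PC, history) pairs that land on the same counter.
idx_a = gshare_index(0b1010, 0b0110)
idx_b = gshare_index(0b0101, 0b1001)

# Count how many of the 16 x 16 possible (PC, history) pairs share each entry.
sharing = Counter(gshare_index(pc, h) for pc in range(16) for h in range(16))
```

Both pairs map to entry 1100, and in this toy configuration every one of the 16 entries is shared by 16 different (PC, history) pairs. Real workloads exercise only a fraction of those pairs, which is why the XOR reduces aliasing without eliminating it.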

2. Pshare Branch Predictor

Pshare (Per-Address Sharing Predictor) is a local-history-based version of Gshare. Instead of
a single global history register, Pshare keeps a separate history register per branch address
in a Branch History Table (BHT).

How Pshare Works

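1. Each static branch address selects a history register in the BHT. In practice the low
PC bits index the table, so a finite BHT can map two branches to the same register.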
2. The BHT stores the recent outcomes of a given branch.
3. The stored history is XORed with the PC to index the PHT.
4. The PHT counter predicts the branch outcome and updates accordingly.

📌 Key Idea: Instead of relying on a single global history, Pshare tracks each branch
individually.
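
The same steps can be sketched for Pshare. As before, this is an illustrative model under stated assumptions: the class name `Pshare`, the BHT size, the initial counter value, and the `pc >> 2` alignment are choices made for this example, and the BHT is indexed by low PC bits rather than being truly unique per branch.

```python
class Pshare:
    """Sketch of a pshare predictor; sizes and names are illustrative."""

    def __init__(self, history_bits=10, bht_entries=1024):
        self.mask = (1 << history_bits) - 1
        self.bht_entries = bht_entries
        self.bht = [0] * bht_entries          # one local history register per slot
        self.pht = [1] * (1 << history_bits)  # 2-bit counters, start weakly not-taken

    def _slot(self, pc):
        # Steps 1-2: the branch address selects its own history register.
        return (pc >> 2) % self.bht_entries

    def index(self, pc):
        # Step 3: XOR the stored local history with the PC bits.
        return ((pc >> 2) ^ self.bht[self._slot(pc)]) & self.mask

    def predict(self, pc):
        return self.pht[self.index(pc)] >= 2

    def update(self, pc, taken):
        # Step 4: train the counter, then shift the outcome into this branch's history only.
        i = self.index(pc)
        self.pht[i] = min(3, self.pht[i] + 1) if taken else max(0, self.pht[i] - 1)
        s = self._slot(pc)
        self.bht[s] = ((self.bht[s] << 1) | int(taken)) & self.mask


# A loop branch at a made-up address: taken three times, then not taken, repeated.
pattern = [True, True, True, False] * 50
p = Pshare(history_bits=4)
wrong = []
for taken in pattern:
    wrong.append(p.predict(0x400) != taken)
    p.update(0x400, taken)
```

After a short warm-up the four local history values that occur in the loop each point at a counter trained in the right direction, so the loop exit is predicted correctly as well as the loop body.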

📌 Pros of Pshare

✔ Less interference between different branches: Since each branch stores its own history,
unrelated branches do not pollute it, though they can still collide in the shared PHT.
✔ Better for loops and repetitive patterns: Works well when a branch is dependent on its
own previous executions.
✔ More accurate in localized patterns: Avoids mixing up unrelated branch histories.

❌ Cons of Pshare

✖ Higher storage cost: Each branch needs its own BHT entry, consuming more memory
than Gshare.
✖ Poor at capturing global correlations: If a branch depends on other branches'
outcomes, Pshare fails to predict accurately.
✖ More complex hardware implementation: The BHT must be read before the PHT index can
be formed, so the two sequential lookups increase prediction latency.

3. Gshare vs. Pshare: Key Differences

Feature               Gshare                                  Pshare
Branch History Type   Global                                  Per-Address (Local)
Indexing Method       BHR ⊕ PC (XOR)                          BHT ⊕ PC (XOR)
Aliasing Reduction    Moderate                                High (avoids global aliasing)
Hardware Complexity   Simple                                  Higher (requires BHT)
Best For              Correlated branches (global patterns)   Localized branch behavior (loops)
Weakness              Fails for local patterns                Fails for correlated branches
Storage Cost          Low                                     High
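
Because the two weaknesses are complementary, hybrid designs let a small table decide, per branch, which component to trust. The sketch below follows the general idea of McFarling's combining predictor (a table of 2-bit counters indexed by branch address, trained only when the components disagree). The class name `Chooser`, the table size, and the initial bias toward the global component are assumptions for illustration.

```python
class Chooser:
    """Sketch of a tournament chooser between a global and a local prediction."""

    def __init__(self, entries=1024):
        self.entries = entries
        self.table = [2] * entries  # 2-3 trust global, 0-1 trust local; start weakly global

    def _slot(self, pc):
        return (pc >> 2) % self.entries

    def select(self, pc, global_pred, local_pred):
        # Return the prediction of whichever component this branch currently favors.
        return global_pred if self.table[self._slot(pc)] >= 2 else local_pred

    def update(self, pc, global_pred, local_pred, taken):
        # Train only when the components disagree; move toward the one that was right.
        if global_pred == local_pred:
            return
        s = self._slot(pc)
        if global_pred == taken:
            self.table[s] = min(3, self.table[s] + 1)
        else:
            self.table[s] = max(0, self.table[s] - 1)


chooser = Chooser()
pc = 0x400  # made-up branch address
first = chooser.select(pc, global_pred=False, local_pred=True)
# The local component was right and the global one wrong: shift trust to local.
chooser.update(pc, global_pred=False, local_pred=True, taken=True)
second = chooser.select(pc, global_pred=False, local_pred=True)
```

In a full hybrid, `global_pred` and `local_pred` would come from a Gshare-style and a Pshare-style component, and both components would be trained on every branch regardless of which one was selected.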

4. Real-Life Use Cases

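Gshare-style and Pshare-style components appear in several commercial processors, usually
as parts of a hybrid predictor. Vendors rarely publish full predictor details, so the examples
below rely on public descriptions and should be read as approximate.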

🔹 Gshare Use Cases

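1. Intel Processors (Pentium Pro, Pentium 4, Core series)
o Intel has not published the Pentium 4 (NetBurst Architecture) predictor in
detail, so calling it a modified Gshare is an inference rather than a documented fact.
o Core-series predictors are likewise undocumented; they are generally believed
to be hybrid designs with a global-history component.
o The Pentium Pro is usually described as a two-level predictor with per-branch
(local) history, which places it closer to the Pshare side.
2. AMD Athlon & Ryzen
o AMD's Athlon-family cores are generally described as using global-history
prediction in the Gshare tradition. For Ryzen, AMD has described the Zen
predictors as hashed-perceptron designs, with a TAGE component added in Zen 2,
so they go beyond plain Gshare.
3. IBM POWER4 and Alpha 21264
o POWER4's global predictor XORs a global history vector with the branch
address to index a large table, which is Gshare-style indexing.
o The Alpha 21264 uses a tournament predictor whose global component is indexed
by global history alone (closer to GAg than to Gshare), paired with a per-branch
local-history component.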

🔹 Pshare Use Cases

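1. IBM POWER4 (2002)
o POWER4 pairs its global predictor with a table indexed directly by the branch
address, plus a selector that picks between them. The Alpha 21264's local
component, which keeps a history register per branch, is the closer match to
Pshare-style history tracking for loops.
2. AI & Gaming Processors
o No public documentation confirms Pshare-style predictors in AI accelerators or
in NVIDIA and AMD GPUs. GPUs generally handle control flow through SIMT
divergence mechanisms rather than dynamic branch prediction, so this use case
should be treated as unverified.
3. Embedded Processors (ARM)
o ARM-based mobile processors use compact dynamic predictors for power-
efficient branch prediction. Published descriptions, such as the Cortex-A8's
global history buffer, point to global-history designs, so Pshare variants are
not confirmed.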

5. Latest Research Papers (Post-2000)

Here are some recent research papers discussing Gshare and Pshare advancements:

1. Jiménez & Lin (2002) - "Neural Methods for Dynamic Branch Prediction"
o This paper compared Gshare with perceptron-based predictors and found
that perceptron predictors outperform Gshare on linearly separable
branches.
2. Tendler et al. (2002) - "POWER4 System Microarchitecture"
o Describes how IBM POWER4 combines a global-history predictor and a
per-address predictor, with a selector choosing between them.
3. Seznec & Michaud (2006) - "A Case for (Partially) TAgged GEometric History
Length Branch Prediction"
o Introduces TAGE (TAgged GEometric) predictors, which improve upon Gshare
by using multiple tagged tables with geometrically increasing history lengths.
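
To make "varying history lengths" concrete: the TAGE paper sets the history length of tagged table i to a geometric series, L(i) = int(alpha^(i-1) * L(1) + 0.5). The sketch below only computes that series. The function name and the example values of `l1`, `alpha`, and `num_tables` are illustrative, not a recommended configuration.

```python
def geometric_history_lengths(l1, alpha, num_tables):
    """History length for each tagged table: L(i) = int(alpha**(i-1) * L(1) + 0.5)."""
    return [int(alpha ** (i - 1) * l1 + 0.5) for i in range(1, num_tables + 1)]


# Illustrative values: shortest history 5, ratio 2, four tagged tables.
lengths = geometric_history_lengths(l1=5, alpha=2.0, num_tables=4)
```

Short histories warm up quickly and cover most branches, while the longest tables capture correlations that a single fixed-length Gshare history would miss.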

6. Summary & Takeaways

🔹 Gshare is a global-history-based predictor that XORs PC and history to reduce
aliasing.
🔹 Pshare is a per-address-history-based predictor that tracks each branch separately,
improving local prediction accuracy.
🔹 Gshare is better for globally correlated branches, while Pshare excels in loops and
repetitive execution patterns.
🔹 Many modern CPUs use hybrid predictors that combine a global-history component
(Gshare-style) with a local or per-address component (Pshare-style).
🔹 Recent research (post-2000) focuses on hybrid models that integrate neural networks,
TAGE, and machine learning to further enhance prediction accuracy.

In the future, branch prediction will likely evolve to include AI-driven and statistical
models, further reducing misprediction penalties in deep pipelines. 🚀
