Module 2 Presentation - Approximations and Errors

This document discusses numerical errors and approximations. It defines accuracy as how close a value is to the true value, while precision refers to the agreement between individual measurements. There are three main types of errors: input errors from initial data, round-off errors from limited computer precision, and truncation errors from approximations in numerical methods. Significant figures rules are provided to quantify the level of accuracy in measurements. Floating point representation and rounding methods like truncation and symmetric rounding are explained.

MODULE 2:

APPROXIMATIONS
AND ERRORS
Learning Objectives:
At the end of the lesson, you as a future Engineer,
are expected to:
• Differentiate and calculate errors.
• Understand the difference between accuracy
and precision
• Learn how to represent numbers in a floating
point form
• Convert between different number systems
• Differentiate non-linear transcendental and
polynomial functions.
• Review the Taylor series
Topics:
A. Numbers and their Accuracy
1. Floating Point Representation
B. Significant Figures
C. Accuracy and Precision
D. Error Definitions
1. Measuring Errors
2. Sources of Errors
3. Propagation Errors
E. Binary Representation of Numbers
F. Taylor Series Revisited
INTRODUCTION
Numerical methods provide approximate solutions to problems,
and every approximation introduces some error.

Such errors are characteristic of most of the techniques
described in this subject. Students and practicing engineers
should limit errors in their work, since errors are
penalized, not rewarded.

In professional practice, errors can be very costly and
sometimes catastrophic: if a structure or device fails,
lives can be lost.
NUMBERS AND THEIR ACCURACY
The numbers that we use every day are represented
using the decimal system, which means that the digits
represent multiples of powers of 10. For example, the
number 326 actually represents a polynomial in
powers of 10:
$326 = 3 \times 10^{2} + 2 \times 10^{1} + 6 \times 10^{0}$
$326 = 3 \times 100 + 2 \times 10 + 6 \times 1$
$326 = 300 + 20 + 6$
$326 = 326$
FLOATING POINT REPRESENTATION
A typical computer uses the so-called floating
point representation for real numbers. This
consists of a certain number of binary digits or bits
divided into three parts namely:
• a single digit for the sign (which may be positive
or negative),
• a fixed number of digits representing the
fractional part, called the mantissa, and
• another fixed number of digits representing the
exponent or characteristic.
Common Number Systems
• Decimal – Base 10 (0-9)
• Binary – Base 2 (0,1)
• Octal – Base 8 (0-7)
• Hexadecimal – Base 16 (0-F)
Figure 1. The manner in which a floating point
number is stored in a word.
The general form of the floating point representation
for real numbers is:

$\pm 0.a_1 a_2 \ldots a_k \times b^{m}$
Where:
- k is the maximum number of digits allowed in the
mantissa,
- b is the base of the representation and
- m is the exponent.
Note: The digit $a_1$ in the mantissa is not allowed to be zero,
except for the case when the number being represented is 0.
Example 1:

The decimal number 325.687 can be represented in decimal
floating-point form as $+0.325687 \times 10^{3}$.
Rounding Off Numbers in Floating-Point
Form
There are two ways of rounding off these
numbers which are employed by the
computer.
A. Truncation or Chopping
B. Symmetric Rounding Off
TRUNCATION OR CHOPPING
In this method, if the mantissa
contains m digits (here m is the
mantissa length, written k in the
general form), then the first m
significant digits of the number are
retained, while the remaining digits
are chopped off.
Example:
Round off the number 152.6893156 to five
significant digits and express the result in floating
point form using truncation or chopping.
Solution:
The floating point representation of 152.6893156
is $+0.1526893156 \times 10^{3}$, hence its mantissa is
0.1526893156.
By truncation, we retain only the first m = 5
significant digits of the mantissa. Thus, we obtain
$0.15268 \times 10^{3}$
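The chopping step can be sketched in Python. This is only a minimal illustration, assuming base 10 and a number supplied as a decimal string; the helper names `normalize` and `chop` are hypothetical and not part of the module.

```python
from decimal import Decimal


def normalize(x: str):
    """Split a decimal string into (sign, mantissa digits, exponent)
    so that x = sign 0.d1d2... * 10**exponent with d1 != 0 (x nonzero)."""
    sign, digits, exp = Decimal(x).as_tuple()
    mantissa = "".join(str(d) for d in digits)
    return ("-" if sign else "+"), mantissa, len(mantissa) + exp


def chop(x: str, m: int) -> str:
    """Keep the first m mantissa digits and drop the rest (truncation)."""
    sign, mantissa, exponent = normalize(x)
    kept = mantissa[:m].ljust(m, "0")  # pad if fewer than m digits exist
    return f"{sign}0.{kept} x 10^{exponent}"


print(chop("152.6893156", 5))  # +0.15268 x 10^3
```

Working on the digit string rather than on a binary float avoids introducing a second round-off while demonstrating the first.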
SYMMETRIC ROUNDING OFF
It is carried out by first adding the quantity
$(b/2) \times b^{-m-1}$
to the mantissa before truncation is done.
This method is similar to the rounding off that
we do when we perform manual calculations in
base ten, wherein we add 1 to the last digit that
we retain if the first digit to be dropped is at
least 5 (in this case, we have b = 10, so that
b/2 = 5).
Example:
Round off the number 152.6893156 to five significant
digits and express the result in floating point form
using symmetric rounding off.
Solution:
The floating point representation is $+0.1526893156 \times 10^{3}$.
By using symmetric rounding off, we first compute:
$\frac{b}{2} \times b^{-m-1} = \frac{10}{2} \times 10^{-6} = 0.000005$
Add this quantity to the mantissa,
0.1526893156 + 0.000005 = 0.1526943156,
and then retain only the first m = 5 significant digits,
yielding $0.15269 \times 10^{3}$
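The add-then-chop recipe for symmetric rounding can be sketched as follows. This is a hedged illustration for base 10 only; the function name `symmetric_round` is hypothetical, and the carry handling (a mantissa that reaches 1 after the addition) is an assumption the slides do not discuss.

```python
from decimal import Decimal, ROUND_DOWN


def symmetric_round(x: str, m: int) -> str:
    """Add (b/2) * b**(-m-1) to the mantissa, then chop to m digits (b = 10)."""
    d = Decimal(x)
    sign, digits, exp = d.as_tuple()
    exponent = len(digits) + exp               # x = 0.digits * 10**exponent
    mantissa = abs(d).scaleb(-exponent)        # 0.d1d2d3...
    mantissa += Decimal(5).scaleb(-m - 1)      # (10/2) * 10**(-m-1)
    if mantissa >= 1:                          # carry, e.g. 0.999996 -> 1.000001
        mantissa = mantissa.scaleb(-1)
        exponent += 1
    # Chop (truncate) to m decimal places
    mantissa = mantissa.quantize(Decimal(1).scaleb(-m), rounding=ROUND_DOWN)
    return f"{'-' if sign else '+'}{mantissa} x 10^{exponent}"


print(symmetric_round("152.6893156", 5))  # +0.15269 x 10^3
```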
SEATWORK
Express each of the following numbers in floating
point form with the given length m of the
mantissa, and then perform (1) symmetric
rounding off and (2) truncation or chopping. The
indicated subscript represents the base; a number
without a subscript is in base 10.
a. 34516.8312, m = 6
b. $4330110214211_5$, m = 7
c. $.0000443421062131_7$, m = 8
d. $12214.10023_5$, m = 5
Answer:
a. 34516.8312, m = 6
Floating Point
$+0.345168312 \times 10^{5}$
Truncation
$0.345168 \times 10^{5}$
Symmetric
$(10/2) \times 10^{-6-1} = 5 \times 10^{-7} = 0.0000005$
$0.345168312 + 0.0000005 = 0.345168812$
$0.345168 \times 10^{5}$
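Answer:
b. $4330110214211_5$, m = 7
Floating Point
$+0.4330110214211 \times 5^{13}$
Truncation
$0.4330110 \times 5^{13}$
Symmetric
$(5/2) \times 5^{-7-1} = 2.5 \times 5^{-8}$, which is $0.00000002222\ldots$ in base 5
The dropped digits 214211 are less than half of the last
retained place ($0.214211_5 < 0.2222\ldots_5$), so adding this
quantity produces no carry into the 7th digit.
$= 0.4330110 \times 5^{13}$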
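Answer:
d. $12214.10023_5$, m = 5
Floating Point
$+0.1221410023 \times 5^{5}$
Truncation
$+0.12214 \times 5^{5}$
Symmetric
$(5/2) \times 5^{-5-1} = 2.5 \times 5^{-6}$, which is $0.000002222\ldots$ in base 5
The dropped digits 10023 are less than half of the last
retained place ($0.10023_5 < 0.2222\ldots_5$), so adding this
quantity produces no carry into the 5th digit.
$= +0.12214 \times 5^{5}$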
Significant Figures
The significant digits of a number are those
that can be used with confidence.

They correspond to the number of certain
digits plus one estimated digit.

Although it is usually a straightforward
procedure to ascertain the significant figures
of a number, some cases can lead to
confusion.
The concept of significant figures has two
important implications for our study of
numerical methods:
1. We must develop criteria to specify how
confident we are in our approximate result.
One way to do this is in terms of significant
figures.
2. Because computers retain only a finite
number of significant figures, some numbers
can never be represented exactly.
Significant Figures Rules
There are certain rules which need to be
followed to determine the significant figures of a
measurement.
The basic rules are:
• All non-zero digits are significant.
• Zeroes between non-zero digits are
significant.
• Trailing zeroes are significant only when they
appear in the decimal portion.
The following rules govern the determination of significant figures:
• Digits which are non-zero are significant.
For example, in 6575 cm there are four significant figures and in
0.543 there are three significant figures.
• Zeroes that precede the first non-zero digit are not significant;
they only indicate the location of the decimal point. For example,
0.005 has one significant figure and 0.00232 has three.
• A zero between two non-zero digits is also a significant figure.
For example, 4.5006 has five significant figures.
• Zeroes at the end of a number, to the right of the decimal point,
are also significant.
For example, 0.500 has three significant figures.
• Counted quantities, for example 5 bananas or 10 oranges, have
infinitely many significant figures because they are exact numbers.
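Answer (seatwork item c):
c. $.0000443421062131_7$, m = 8
Floating Point
$+0.443421062131 \times 7^{-4}$
Truncation
$+0.44342106 \times 7^{-4}$
Symmetric
$(7/2) \times 7^{-8-1} = 3.5 \times 7^{-9}$, which is $0.000000003333\ldots$ in base 7
The dropped digits 2131 are less than half of the last
retained place ($0.2131_7 < 0.3333\ldots_7$), so adding this
quantity produces no carry into the 8th digit.
$= +0.44342106 \times 7^{-4}$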
Significant Figures Examples
The numbers in boldface are the significant
figures.
1) 4308 – 4 significant figures
2) 40.05 – 4 significant figures
3) 470,000 – 2 significant figures
4) 4.00 – 3 significant figures
5) 0.00500 – 3 significant figures
ACCURACY and PRECISION
Accuracy refers to how closely a
computed or measured value agrees with
the true value.

Precision refers to how closely individual
computed or measured values agree with
each other.

Figure 2. An example from marksmanship illustrating the concepts of accuracy
and precision. (a) Inaccurate and imprecise; (b) accurate and imprecise; (c)
inaccurate and precise; (d) accurate and precise.
ERROR DEFINITIONS
Numerical errors arise from the use of
approximations to represent exact
mathematical operations and quantities.
Errors that arise during computation may be
classified into three general types:
1. Input error,
2. Round-off error and
3. Truncation error
Input errors
- are errors inherent in the source data.
- They occur often in
• experimental data,
• the encoding stage,
• the formulation of mathematical formulas.
Round-off errors
- occur because of the limited capacity of the
computer to store real numbers.
- Instead of an infinite decimal expansion, real
numbers are represented using a fixed and
finite number of digits. Thus, computations
are done on approximate rather than exact
values, so that the computed values also
contain errors.
Truncation errors
- are errors that are due to the specific
algorithm or formula that is used in a
particular numerical method.
MEASURING ERRORS
To be able to deal with the issue of
errors, we need to:
• identify where the error is coming
from,
• quantify the error, and lastly
• minimize the error as per our needs.
TRUE ERROR
- denoted by $E_t$, is the difference between the
true value (also called the exact value) and the
approximate value.
True Error = True value – Approximate value
True fractional relative error = True error / True value
The relative error can also be multiplied by 100 percent to express it
as
$\varepsilon_t = \frac{\text{True error}}{\text{True value}} \times 100\%$
where $\varepsilon_t$ designates the true percent relative error.
EXAMPLE 1: Calculation of Errors
Suppose that you have the task of
measuring the lengths of a bridge and a
rivet and come up with 9999 and 9 cm,
respectively. If the true values are 10,000
and 10 cm, respectively, compute
(a) the true error and
(b) the true percent relative error for
each case.
SOLUTION:
(a) The error for measuring the bridge is
$E_t = 10{,}000 - 9999 = 1$ cm
and for the rivet it is
$E_t = 10 - 9 = 1$ cm
(b) The percent relative error for the bridge is
$\varepsilon_t = \frac{1}{10{,}000} \times 100\% = 0.01\%$
and for the rivet it is
$\varepsilon_t = \frac{1}{10} \times 100\% = 10\%$
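The two definitions can be checked with a few lines of Python. This is a small sketch; the function names are chosen here for illustration only.

```python
def true_error(true_value: float, approximation: float) -> float:
    """E_t = true value - approximate value."""
    return true_value - approximation


def true_percent_relative_error(true_value: float, approximation: float) -> float:
    """epsilon_t = (true error / true value) * 100%."""
    return true_error(true_value, approximation) / true_value * 100


# Bridge and rivet measurements from the example, in cm
for name, true_value, measured in [("bridge", 10_000, 9_999), ("rivet", 10, 9)]:
    e_t = true_error(true_value, measured)
    eps_t = true_percent_relative_error(true_value, measured)
    print(f"{name}: E_t = {e_t} cm, eps_t = {eps_t:.2f}%")
```

Both measurements have the same true error of 1 cm, but the relative error shows that the rivet measurement is far worse in proportion to its size.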
EXAMPLE 2:
The derivative of a function f(x) at a
particular value of x can be approximately
calculated by
$f'(x) \approx \frac{f(x+h) - f(x)}{h}$
For $f(x) = 7e^{0.5x}$ and h = 0.3, find
a) the approximate value of f’(2)
b) the true value of f’(2)
c) the true error for part (a)
SOLUTION:
a.) For x = 2 and h = 0.3,
$f'(2) \approx \frac{f(2.3) - f(2)}{0.3} = \frac{7e^{1.15} - 7e^{1}}{0.3} = \frac{22.1074 - 19.0280}{0.3} = 10.265$
b.) The exact value of f’(2) can be calculated by
using our knowledge of differential calculus.
$f'(x) = 7 \times 0.5 \times e^{0.5x} = 3.5e^{0.5x}$
So the true value of f’(2) is
$f'(2) = 3.5e^{1} = 9.5140$
c.) True error is calculated as
$E_t$ = True value – Approximate value
$= 9.5140 - 10.265 \approx -0.751$
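The numbers in this example can be reproduced with a short script. It is a sketch under one assumption: that the approximation formula is the forward difference $(f(x+h) - f(x))/h$, which is what reproduces the value 10.265 quoted in the solution. The name `forward_difference` is illustrative.

```python
import math


def f(x: float) -> float:
    """f(x) = 7 e^(0.5 x), the function used in the example."""
    return 7 * math.exp(0.5 * x)


def forward_difference(func, x: float, h: float) -> float:
    """Approximate func'(x) by (func(x + h) - func(x)) / h (assumed formula)."""
    return (func(x + h) - func(x)) / h


approx = forward_difference(f, 2.0, 0.3)   # about 10.265
true_value = 3.5 * math.exp(0.5 * 2.0)     # f'(x) = 3.5 e^(0.5 x), about 9.5140
true_err = true_value - approx             # about -0.7506

print(f"approximate f'(2) = {approx:.4f}")
print(f"true f'(2)        = {true_value:.4f}")
print(f"true error E_t    = {true_err:.4f}")
```

The negative sign of the true error indicates that the forward difference overestimates the derivative for this function.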
APPROXIMATE ERROR
When we are solving a problem numerically,
we will only have access to approximate values.
We need to know how to quantify error for
such cases.
Approximate error is denoted by Ea and is
defined as the difference between the present
approximation and previous approximation.
Approximate Error = Present Approximation – Previous Approximation
EXAMPLE
The derivative of a function f(x) at a particular
value of x can be approximately calculated by
$f'(x) \approx \frac{f(x+h) - f(x)}{h}$
For $f(x) = 7e^{0.5x}$ and at x = 2, find the following:
a) f’(2) using h=0.3
b) f’(2) using h=0.15
c) approximate error for the value of f’(2) for part (b)
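SOLUTION:
a) The approximate expression for the
derivative of a function is
$f'(x) \approx \frac{f(x+h) - f(x)}{h}$
For x = 2 and h = 0.3,
$f'(2) \approx \frac{7e^{1.15} - 7e^{1}}{0.3} = 10.265$
b) Repeat the procedure of part (a) with h = 0.15.
For x = 2 and h = 0.15,
$f'(2) \approx \frac{f(2.15) - f(2)}{0.15} = \frac{7e^{1.075} - 7e^{1}}{0.15} = 9.8799$
c) So the approximate error, Ea, is
Ea = Present Approximation – Previous Approximation
= 9.8799 – 10.265
= -0.3851
(-0.38474 if the unrounded approximations are used)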
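RELATIVE APPROXIMATE ERROR
The relative approximate error, $\varepsilon_a$, is the approximate error
divided by the present approximation, expressed in percent:
$\varepsilon_a = \left| \frac{\text{Present Approximation} - \text{Previous Approximation}}{\text{Present Approximation}} \right| \times 100\%$
$= \left| \frac{9.8799 - 10.265}{9.8799} \right| \times 100\%$
$\approx 3.8978\%$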
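The approximate and relative approximate errors can be computed without knowing the true value, as this sketch shows. It assumes the forward-difference formula and $f(x) = 7e^{0.5x}$ from the example; because it carries unrounded values, it gives about 3.894% rather than the 3.8978% obtained from the rounded figures 9.8799 and 10.265.

```python
import math


def f(x: float) -> float:
    return 7 * math.exp(0.5 * x)


def forward_difference(func, x: float, h: float) -> float:
    # Assumed approximation formula: (func(x + h) - func(x)) / h
    return (func(x + h) - func(x)) / h


previous = forward_difference(f, 2.0, 0.3)    # about 10.265
present = forward_difference(f, 2.0, 0.15)    # about 9.8799

approx_error = present - previous                            # E_a
rel_approx_error = abs(approx_error / present) * 100         # epsilon_a in percent

print(f"E_a   = {approx_error:.5f}")
print(f"eps_a = {rel_approx_error:.4f}%")
```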
Error Definitions

True error: $E_t$ = True value – Approximation (+/-)

True percent relative error: $\varepsilon_t = \frac{\text{True value} - \text{Approximation}}{\text{True value}} \times 100\%$

Approximate Error
• For numerical methods, the true value will be known only when we deal with
functions that can be solved analytically.
• In real world applications, we usually do not know the answer a priori.
Approximate relative error: $\varepsilon_a = \frac{\text{Approximate error}}{\text{Approximation}} \times 100\%$
Iterative approaches (e.g. Newton’s method):
$\varepsilon_a = \frac{(\text{Current Approx.}) - (\text{Previous Approx.})}{\text{Current Approx.}} \times 100\%$
Computations are repeated until the stopping criterion is satisfied:

$|\varepsilon_a| < \varepsilon_s$

where $\varepsilon_s$ is a pre-specified % tolerance based on your
knowledge of the solution. (Use the absolute value.)

If $\varepsilon_s$ is chosen as
$\varepsilon_s = (0.5 \times 10^{2-n})\%$
then the result is correct to at least n significant figures (Scarborough 1966).
ROUND OFF ERROR
A computer can only represent a number
approximately. For example, a number like
1/3 may be represented as 0.333333 on a
PC. Then the round off error in this case is
$1/3 - 0.333333 = 0.000000333\ldots$

Other numbers likewise cannot be
represented exactly and need to be approximated in
computer calculations.
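The size of the round-off error for a value stored as 0.333333 can be computed exactly with rational arithmetic. This is a small sketch that takes the exact value to be 1/3 and uses Python's standard `fractions` module so that the demonstration itself adds no round-off.

```python
from fractions import Fraction

exact = Fraction(1, 3)
stored = Fraction("0.333333")      # six-digit representation

round_off_error = exact - stored   # exactly 1/3000000
print(round_off_error)             # 1/3000000
print(float(round_off_error))      # about 3.33e-07
```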
TRUNCATION ERROR
• Truncation error is defined as
the error caused by truncating
a mathematical procedure.
EXAMPLE: Maclaurin series expansion
$e^{x} = 1 + x + \frac{x^{2}}{2!} + \frac{x^{3}}{3!} + \cdots + \frac{x^{n}}{n!}$
Calculate $e^{0.5}$ (= 1.648721…) up to 3 significant figures. During the calculation
process, compute the true and approximate percent relative errors at each step.

Error tolerance: $\varepsilon_s = (0.5 \times 10^{2-3})\% = 0.05\%$

Count  Result        εt (%) True  εa (%) Approx.  Terms
1      1             39.3         -               1
2      1.5           9.02         33.3            1 + 0.5
3      1.625         1.44         7.69            1 + 0.5 + 0.5^2/2!
4      1.6458333     0.175        1.27            1 + 0.5 + 0.5^2/2! + 0.5^3/3!
5      1.6484375     0.0172       0.158           1 + 0.5 + 0.5^2/2! + 0.5^3/3! + 0.5^4/4!
6      1.648697917   0.00142      0.0158          1 + 0.5 + 0.5^2/2! + 0.5^3/3! + 0.5^4/4! + 0.5^5/5!

After six terms, εa = 0.0158% is below εs = 0.05%, so the computation stops.
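The table can be regenerated by a loop that adds one Maclaurin term at a time and stops once the approximate percent relative error falls below the tolerance $(0.5 \times 10^{2-n})\%$. This is a sketch; the function name `maclaurin_exp` and the `max_terms` safety limit are illustrative choices.

```python
import math


def maclaurin_exp(x: float, n_sig: int, max_terms: int = 100):
    """Sum the Maclaurin series of e^x until eps_a < eps_s.

    Returns a list of rows (count, result, eps_t, eps_a), errors in percent.
    """
    eps_s = 0.5 * 10 ** (2 - n_sig)          # tolerance in percent
    true_value = math.exp(x)
    total, term, rows = 0.0, 1.0, []
    for count in range(1, max_terms + 1):
        previous = total
        total += term
        eps_t = abs((true_value - total) / true_value) * 100
        eps_a = abs((total - previous) / total) * 100 if count > 1 else None
        rows.append((count, total, eps_t, eps_a))
        if eps_a is not None and eps_a < eps_s:
            break
        term *= x / count                    # next term: x^count / count!
    return rows


for count, result, eps_t, eps_a in maclaurin_exp(0.5, 3):
    shown = "-" if eps_a is None else f"{eps_a:.3g}"
    print(f"{count}  {result:.9f}  {eps_t:.3g}  {shown}")
```

Note that the stopping test uses only εa, which is available in practice; εt is printed here only because the true value happens to be known.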
BINARY
REPRESENTATION
• Binary is a base-2 number system that uses two mutually
exclusive states to represent information.
• A binary number is made up of elements
called bits where each bit can be in one of the two
possible states. Generally, we represent them with the
numerals 1 and 0.
• We also talk about them being true and false. Electrically,
the two states might be represented by high and low
voltages or some form of switch turned on or off.
In a binary system, the base is made of only two
digits, 0 and 1, so it is a base-2 system. A number
like $(1011.0011)_2$ represents the decimal number
$1 \times 2^{3} + 0 \times 2^{2} + 1 \times 2^{1} + 1 \times 2^{0} + 0 \times 2^{-1} + 0 \times 2^{-2} + 1 \times 2^{-3} + 1 \times 2^{-4} = 11.1875$
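The positional expansion of a binary number such as 1011.0011 can be evaluated digit by digit. This is a minimal sketch; the function name `binary_to_decimal` is illustrative.

```python
def binary_to_decimal(bits: str) -> float:
    """Evaluate a binary string such as '1011.0011' as a decimal value."""
    whole, _, fraction = bits.partition(".")
    value = 0.0
    # Integer part: rightmost bit has weight 2^0, then 2^1, 2^2, ...
    for power, bit in enumerate(reversed(whole)):
        value += int(bit) * 2 ** power
    # Fractional part: first bit after the point has weight 2^-1, then 2^-2, ...
    for power, bit in enumerate(fraction, start=1):
        value += int(bit) * 2 ** -power
    return value


print(binary_to_decimal("1011.0011"))  # 11.1875
```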
We build binary numbers the same way we build numbers in our
traditional base 10 system. However, instead of a one's column, a
10's column, a 100's column (and so on) we have a one's column, a
two's columns, a four's column, an eight's column, and so on, as
illustrated below.
For example, to represent the number 203 in base 10, we know we
place a 3 in the 1's column, a 0 in the 10's column and a 2 in
the 100's column. This is expressed with exponents in the table
below.

Or, in other words, $2 \times 10^{2} + 3 \times 10^{0} = 200 + 3 = 203$.
To represent the same thing in binary, we would have the following
table.

That equates to $2^{7} + 2^{6} + 2^{3} + 2^{1} + 2^{0} = 128 + 64 + 8 + 2 + 1 = 203$.
The easiest method to convert
between bases is repeated
division. To convert, repeatedly
divide the quotient by the base,
until the quotient is zero, making
note of the remainders at each
step. Then, write the remainders
in reverse, starting at the bottom
and appending to the right each
time. An example should illustrate;
since we are converting to binary
we use a base of 2.
Reading from the bottom and appending
to the right each time gives 11001011.
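Repeated division translates directly into a loop. The sketch below assumes a non-negative integer and a base of at most 10 (so every remainder is a single decimal digit); the function name `to_base` is illustrative.

```python
def to_base(n: int, base: int = 2) -> str:
    """Convert a non-negative integer to the given base (2..10)
    by repeated division, collecting the remainders."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, base)   # quotient feeds the next step
        remainders.append(str(remainder))
    # The last remainder is the leftmost digit, so read in reverse
    return "".join(reversed(remainders))


print(to_base(203))  # 11001011
```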
Descending Powers of Two Subtraction
1. Start by making a chart.

List the powers of two in a "base 2 table" from right to left. Start
at $2^{0}$, evaluating it as "1". Increment the exponent by one for
each power. Continue the list until you reach the largest power of
two that does not exceed the decimal number you are starting with.
For this example, let's convert the decimal number $156_{10}$ to binary.
2. Look for the greatest power of 2.

Choose the biggest number that will fit into the number you are
converting. 128 is the greatest power of two that will fit into 156,
so write a 1 beneath this box in your chart for the leftmost binary
digit. Then, subtract 128 from your initial number. You now have
28.
3. Move to the next lower power of two.

Using your new number (28), move down the chart marking how
many times each power of 2 can fit into your dividend. 64 does
not go into 28, so write a 0 beneath that box for the next binary
digit to the right. Continue until you reach a number that can go
into 28.
4. Subtract each successive number that can fit, and mark it
with a 1.

16 can fit into 28, so you will write a 1 beneath its box and
subtract 16 from 28. You now have 12. 8 does go into 12, so
write a 1 beneath 8's box and subtract it from 12. You now
have 4.
5. Continue until you reach the end of your chart.

Remember to mark a 1 beneath each number that does go into
your new number, and a 0 beneath those that don't.
6. Write out the binary answer.
The number will be exactly the same from left to right as the 1's and
0's beneath your chart. You should have 10011100. This is the binary
equivalent of the decimal number 156. Or, written with base
subscripts: $156_{10} = 10011100_{2}$.
Repetition of this method will result in memorization of the powers
of two, which will allow you to skip Step 1.
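The six steps of the descending-powers method can be sketched as a short function. It assumes a non-negative integer input; the function name `to_binary_by_subtraction` is illustrative.

```python
def to_binary_by_subtraction(n: int) -> str:
    """Convert a non-negative integer to binary by subtracting
    descending powers of two."""
    if n == 0:
        return "0"
    # Steps 1-2: find the greatest power of two that fits into n
    power = 1
    while power * 2 <= n:
        power *= 2
    # Steps 3-5: walk down the chart, marking 1 where the power fits
    bits = []
    while power >= 1:
        if power <= n:
            bits.append("1")
            n -= power
        else:
            bits.append("0")
        power //= 2
    # Step 6: read the marks from left to right
    return "".join(bits)


print(to_binary_by_subtraction(156))  # 10011100
```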
Number Representation
[Figure: how the number 86409 is represented in base 10 and the number 173 in base 2.]
[Figure: The representation of -173 on a 16-bit computer
using the signed magnitude method.]
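The signed magnitude representation of -173 on a 16-bit computer can be written out with a few lines of code. This sketch assumes the usual convention that the leftmost bit holds the sign (0 for positive, 1 for negative) and the remaining 15 bits hold the magnitude; the function name is illustrative.

```python
def signed_magnitude(value: int, bits: int = 16) -> str:
    """Return the signed magnitude bit pattern of an integer."""
    magnitude = abs(value)
    if magnitude >= 2 ** (bits - 1):
        raise OverflowError(f"{value} does not fit in {bits} bits")
    sign_bit = "1" if value < 0 else "0"
    # Pad the magnitude with leading zeros to fill the remaining bits
    return sign_bit + format(magnitude, f"0{bits - 1}b")


print(signed_magnitude(-173))  # 1000000010101101
```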
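Common Arithmetic Operations

ADDITION:
$0.1557 \times 10^{1} + 0.4381 \times 10^{-1}$

• The exponents differ by 2 [1 − (−1) = 2], so the mantissa of the number
with the smaller exponent is shifted two places to the right:
• $0.4381 \times 10^{-1} \to 0.004381 \times 10^{1}$
• Thus, $0.1557 \times 10^{1} + 0.004381 \times 10^{1} = 0.160081 \times 10^{1}$,
which is chopped to $0.1600 \times 10^{1}$.
• Notice how the last two digits of the second number that were shifted to the right have essentially been lost from the
computation.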
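The effect of chopping on this addition can be imitated with Python's `decimal` module. This is a sketch that assumes a 4-digit mantissa with chopping, modelled here by a context with `prec=4` and `ROUND_DOWN`; it does not model the exponent limits of a real machine.

```python
from decimal import Context, Decimal, ROUND_DOWN

# Hypothetical machine: 4 significant digits, results chopped (not rounded)
chop4 = Context(prec=4, rounding=ROUND_DOWN)

a = Decimal("0.1557E1")    # 0.1557 x 10^1
b = Decimal("0.4381E-1")   # 0.4381 x 10^-1

exact = a + b               # 1.60081 with the default 28-digit context
chopped = chop4.add(a, b)   # 1.600 after chopping to 4 digits

print(exact, chopped)
```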
• SUBTRACTION:
▫ Subtraction is performed like addition, except that the sign of the
subtrahend is reversed.
▫ Example: subtract 26.86 from 36.41:
$0.3641 \times 10^{2} - 0.2686 \times 10^{2} = 0.0955 \times 10^{2}$
▫ The result is not normalized, so shift the decimal point one place to
the right to give $0.9550 \times 10^{1} = 9.550$
The loss of significance during
the subtraction of nearly
equal numbers is among the
greatest sources of round-off
error in numerical methods.
• MULTIPLICATION:
– The mantissas are multiplied and the exponents are added.
– If a leading zero is introduced, the result is
normalized and then chopped.
• DIVISION:
– Division is performed in a similar manner, but the
mantissas are divided and the exponents are
subtracted. Then the results are normalized and
chopped.
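Adding a Large and a Small
Number.
• Suppose we add a small number, 0.0010, to a large
number, 4000, using a hypothetical computer with
a 4-digit mantissa and a 1-digit exponent. We
modify the smaller number so that its exponent
matches the larger:
$0.4000 \times 10^{4} + 0.0000001 \times 10^{4} = 0.4000001 \times 10^{4}$
which is chopped to $0.4000 \times 10^{4}$, so the small number
is lost entirely.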
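The loss of the small number can be demonstrated with the same kind of 4-digit chopping context. This is a sketch under the assumption of a 4-digit mantissa with chopping; the repeated addition is an added illustration, not part of the slides, and shows that the loss does not average out.

```python
from decimal import Context, Decimal, ROUND_DOWN

# Hypothetical machine: 4 significant digits, results chopped
chop4 = Context(prec=4, rounding=ROUND_DOWN)

large = Decimal("4000")
small = Decimal("0.0010")

once = chop4.add(large, small)   # 4000: the small number vanishes

# Adding the small number 1000 times still leaves 4000,
# although the exact result would be 4001
total = large
for _ in range(1000):
    total = chop4.add(total, small)

print(once, total)
```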
SEATWORK 3
Convert the following:
1. $1235_{10}$ = __________ (base 2)
2. $110011101_{2}$ = __________ (base 10)
3. $10101011111_{2}$ = __________ (base 10)
4. $3420_{10}$ = __________ (base 2)
5. $101011111100_{2}$ = __________ (base 10)
NORMALIZE YOUR ANSWER
1. $1.0678 \times 10^{0} + 0.0986 \times 10^{-2}$
2. $0.5612 \times 10^{2} + 0.5959 \times 10^{-2}$
3. Subtract 0.5612 + 1.5959
4. Subtract $1.0008 \times 10^{-2} + 0.0341 \times 10^{1}$
5. Multiply 1.5612 and 0.0219
6. Multiply $0.5612 \times 10^{2} + 0.5959 \times 10^{-2}$
7. Add $0.5612 \times 10^{4} + 0.005959 \times 10^{-2}$
Use Maclaurin Series to solve for the
Et and Ea.
Find the value of $e^{0.75}$ using the first 7
terms of the Maclaurin series
expansion.
Thank you for
listening!!!
Prepared by:
Engr. Michael L.de Vera
