Function to bound

The function we compute for atomic and reduction use case is

Where N is know and M in in $[1,M_{max}]$ and u is the machine epsilon

Relative upper bound error

The upper bound of the Relative error is given in "The Accuracy of Floating Point Summation" (2.6) from Nicholas J. Higham https://doi.org/10.1137/0914050. Applied to f(N,M) it give:

$\begin{align*} | RE(f(N,M)) | & \le \frac{(N \cdot M-1) \cdot u \cdot f(N,M)}{N} \ & \le \frac{(N \cdot M-1) \cdot u \cdot N}{N} \ & \le N \cdot M \cdot u \end{align*}$

Maximun Relative Error

$\begin{align*} \max_{M \in [1,M_{max}]} R_E(N,M) = & \max_{M \in [1,M_{max}]} N \cdot M \cdot u \ = & N \cdot M_{max} \cdot u \end{align*}$

For FP32, maximun problem size to get less than 1%

$\begin{align*} 1% & \ge NM \cdot 2^{-24} \ NM & \le \frac{1%}{2^{-24}} \ & \le 167772 \le 55 \cdot 55 \cdot 55 \end{align*}$

Empirical analysis

import numpy as np
import sys

def sum_dec32(N,M):
	inc = np.float32(1) / np.float32(M)
	s = np.float32(0)
	for i in range(N*M):
		s += inc

	return abs(N-s)/N

N, M_max = map(int,sys.argv[1:])
m_error,m = max((sum_dec32(N,M),M) for M in range(1,N_max+1))
print (m, f"{m_error:.3%}".format(m_error))

This give for N = 55*55 and M = 55 a error of 0.187%.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FP_error.md

FP_error.md

Function to bound

Relative upper bound error

Maximun Relative Error

For FP32, maximun problem size to get less than 1%

Empirical analysis

Files

FP_error.md

Latest commit

History

FP_error.md

File metadata and controls

Function to bound

Relative upper bound error

Maximun Relative Error

For FP32, maximun problem size to get less than 1%

Empirical analysis