Comparison of Short-Term Rainfall Prediction Models For Real-Time Flood Forecasting
Comparison of Short-Term Rainfall Prediction Models For Real-Time Flood Forecasting
www.elsevier.com/locate/jhydrol
Abstract
This study compares the accuracy of the short-term rainfall forecasts obtained with time-series analysis techniques, using
past rainfall depths as the only input information. The techniques proposed here are linear stochastic auto-regressive moving-
average (ARMA) models, artificial neural networks (ANN) and the non-parametric nearest-neighbours method. The rainfall
forecasts obtained using the considered methods are then routed through a lumped, conceptual, rainfall–runoff model, thus
implementing a coupled rainfall–runoff forecasting procedure for a case study on the Apennines mountains, Italy. The study
analyses and compares the relative advantages and limitations of each time-series analysis technique, used for issuing rainfall
forecasts for lead-times varying from 1 to 6 h. The results also indicate how the considered time-series analysis techniques, and
especially those based on the use of ANN, provide a significant improvement in the flood forecasting accuracy in comparison to
the use of simple rainfall prediction approaches of heuristic type, which are often applied in hydrological practice. 䉷 2000
Elsevier Science B.V. All rights reserved.
Keywords: Rainfall forecasting; Flood warning; Stochastic processes; Artificial neural networks; Non-parametric predictors
of rain intensities yet (Krzysztofowicz, 1995). The 3–5 describe the examined methodologies and some
radar detection is, in addition, particularly difficult relevant aspects regarding the application to short-
in mountainous regions, because of ground occulta- term rainfall forecasting. We have gone into more
tion and altitude effects. details about artificial neural networks and nearest-
The QPF may be obtained also by means of time- neighbour methods because their application in
series analysis techniques (Brath et al., 1998; hydrology is relatively new and less well established
Burlando et al., 1993). One should be aware that the with respect to ARMA models. Section 6 presents and
predictive ability of these methods is limited owing to compares the obtained results in terms of predicted
the low persistence in time that usually characterises rainfall depths. Section 7 describes the hydrologic
rainfall time-series, even if sampled at a fine temporal rainfall–runoff transformation model and introduces
scale. However, the moderate computer time and data heuristic rainfall forecasting approaches used as
availability required by such techniques make their benchmarks for the comparison of the time-series
application very attractive in the context of real-time rainfall forecasting methods. The performances of
flood forecasting. Thus, the benefits potentially the coupled rainfall–runoff forecasting systems corre-
achievable from their application with real precipita- sponding to all the rainfall predictive schemes, both
tion data, in terms of efficiency of the flood forecast, time-series methods and heuristic approaches, are
are worth analysing. then compared, evaluating the efficiency of the
The following time-series analysis techniques have rainfall predictions in terms of the transformed river
been applied: (1) Linear stochastic auto-regressive discharges. Section 8 offers conclusions and a
moving-average models (ARMA), which express the perspective on future developments.
future rainfall as a linear function of past data. The
approach is thus linear, model-driven and parametric,
i.e. it first requires the identification of the type of 2. Case study
relationship among the variables (model identifica-
tion) and then the estimation of model parameters. 2.1. Study area and data set description
(2) Artificial neural network architectures (ANN),
belonging to the non-linear, data-driven approaches: The case study considered herein is referred to the
the resulting model depends on the available data to Sieve River basin, a first-order tributary of the Arno
be “learned”, without any a priori hypothesis about River, located in the Apennines Mountains in Central
the kind of relationship, which is allowed to be Italy. Since the Sieve joins the right bank of the Arno
complex and non-linear. (3) K-nearest-neighbour just a few kilometres upstream of the city of Florence,
method (K-NN), a non-parametric regression metho- the prediction of the corresponding flood hydrographs
dology, not implying any structured interaction, but plays a key role in determining the flood risk for the
exploiting the closeness (“neighbourhood”) between city. The main stream length is 58 km, the areal exten-
the most recent observations and K “similar” sets of sion of the watershed is around 830 km 2 and the time
observations chosen in an adequately large training of concentration is approximately 10 h. The closure
sample. section is Fornacina, where hourly discharge observa-
The aim of our analysis is a comparison of the above tions were collected between 1 January 1992 and 31
methods, not only in terms of obtained rainfall depths, December 1996.
but also from an hydrological point of view, considering For the same observation period, hourly tempera-
an integrated rainfall and runoff forecasting system tures (for estimating the potential evapo-transpiration)
operated on a real-world case study. The issued rainfall were measured at four stations and hourly rainfall
forecasts, therefore, will be routed through a lumped depths at 12 raingauges located inside the study
conceptual rainfall–runoff transformation model and basin, thus allowing the computation of the average
the performances of the flow forecast will be analysed areal precipitation over the watershed. The rainfall is
and compared. spatially averaged (with the Thiessen polygons
Section 2 describes the study area and the data sets, method) so that it is consistent with the use in the
along with the use of the data in the analysis. Sections lumped rainfall–runoff transformation model. The
134 E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
averaging also improves the efficiency of the forecast, allowing the calibration phase to “learn” an accurate
compared to the efficiency of making forecasts for representation of the problem domain. These first two
each one of the gauges separately before averaging sets of events will be referred to as split-sample pair
the results over the basin. In fact, the averaging A. In order to verify the variability of the forecasts
produces a smoothing of non-stationarities and fluc- results if different split-samples are used, three further
tuations recorded at each gauge, while preserving the pairs (B–D) of calibration and validation sets were
general pattern of the storm event over the watershed selected, by slightly changing the definition of storm
(Burlando et al., 1993). event described in Section 2.1, so as to include also
Given the predominance of the presence of null or events not considered before. In pair B, the validation
very low values in the rainfall series and our interest set remains the same, so that a fair comparison with
in flood forecasting, we limited our analysis, both in the results described above is allowed. For the
the calibration and validation phases, to the rainfall calibration set, instead, the events (again twice the
values belonging to storm events, so as to identify the number of the validation events) containing the high-
temporal pattern characterising the storms, whose est rainfall values (above 3 mm/h) in the remaining
persistence properties are different from those of dry data were chosen, lowering at the same time the
or low rainfall sequences. A storm event was conven- threshold imposed on the change in the corresponding
tionally defined as an interval containing at least one discharge from 20 to 10 m 3/s. In pairs C and D, new
hourly rainfall depth exceeding 1 mm (“wet” observa- events were chosen for both the calibration and
tion), preceded by at least 5 and followed by at least validation sets, so as to force the presence of the
20 h of rainfall lower than 1 mm, provided that in the most extreme rainfall events in the calibration set
same time interval the corresponding observed hourly still more (even if at the expense of the corresponding
discharge increases by at least 20 m 3/h. In this way, increase in discharges), with the aim of improving the
the wet observations are followed by a time interval “learning” of rainfall peaks. The new events in these
not shorter than the concentration time of the basin, so last two experiments were chosen requiring a peak of
that the observed river discharges include the total at least 4 and 5 mm/h, respectively. In addition, the
contribution of the surface runoff generated by each number of observations following the wet observa-
storm event. In addition, the requirement on the tions was reduced in the last case from 20 to 5; in
change in discharge allows the exclusion of the events this way, there is a further reduction of the presence
that do not produce hydrologic response (dry periods). of low rainfall values in the sets. We here anticipate
In the observation period a total of 84 storm events that the forecast performances on the split-samples
were identified, including a total number of 4580 B–D in the analysis were found to be very close to
hourly rainfall observations, for an average storm those obtained when using the A split-sample pairs.
length of 55 h. For this reason and for brevity and clarity of presenta-
tion, only the results referring to case A will be
2.2. Description of the calibration approaches described in what follows.
In the adaptive calibration no database of past
Two alternative approaches were followed for esti- significant observed events is supposed available,
mating the parameters of the models: split-sample but only the most recently observed values, immedi-
calibration and adaptive calibration. ately preceding the forecast instant. Thus, the set of
In the split-sample calibration, the storm events data is extremely poor for generalisation purposes,
were divided into two sets: a calibration (or training) but, on the other hand, the limited amount of data to
set and a validation set, where the former contains be processed allows the re-calibration of the model
twice the number of events as the latter. The sets on-line, at each time step, as soon as new observations
were chosen so as to have approximately the same become available. This adaptability enables the model
proportion of low duration–high intensity and high parameters to adjust to the properties of an ongoing
duration–low intensity rainfall events. It follows event, by capturing the characteristics of the current
that the training set should be representative of the meteorological situation.
characteristics of different kinds of event, thus For both calibration approaches, in correspondence
E. Toth et al. / Journal of Hydrology 239 (2000) 132–147 135
of each hourly time step belonging to the validation vary with the season. Nonetheless, the limited number
set, a rainfall forecast was issued for the subsequent of rainfall events in the observation period prevented
1–6 h, using the most recent observations as input. us, in the split-sample calibration, from grouping the
The resulting forecasted rainfall values were events in monthly periods, as it is usually done in
processed as inputs to the rainfall–runoff transforma- hydrology to circumvent non-stationarity. In the
tion model, thus providing the discharge forecast. adaptive calibration application, non-stationarity is
accounted for by allowing the model parameters to
vary with time since the calibration is performed
3. ARMA models solely on the progress of the current event. We
preferred not to perform any preliminary transforma-
Most of the time-series techniques traditionally tion of the data in order to make them as close to
used for modelling water resources series fall within Gaussian as possible. In fact, Gaussian data are not
the framework of the ARMA class of linear stochastic required for the forecast application of ARMA
processes. They are usually denoted as ARMA(p,q) models, since they provide the best linear prediction
models, where p and q are the auto-regressive and even in the non-Gaussian case (Brockwell and Davis,
moving-average orders, respectively (Box and 1987).
Jenkins, 1976; Brockwell and Davis, 1987; Bras The selection of the model orders, p and q, was
and Rodriguez-Iturbe, 1994). They describe each driven by some results available in literature.
observation of the time series as a weighted sum of Obeysekera et al. (1987) determined an equivalence
p previous data, and the current as well as q previous between the correlation structure of an ARMA(1,1)
values of a white noise process: model and some point process models, like the
xt f1
xt⫺1 ⫺ mx ⫹ f2
xt⫺2 ⫺ mx Poisson rectangular pulses and the Neyman–Scott
white noise models (see Rodriguez-Iturbe et al.,
⫹… ⫹ fp
xt⫺p ⫺ mx ⫹ ht ⫹ u1 ht⫺1 1984). On the other hand, the Neyman–Scott rectan-
gular pulses model, which has proved to represent the
⫹u2 ht⫺2 ⫹ … ⫹ uq ht⫺q ⫹ mx ; (1) stochastic structure of rainfall better (Rodriguez-
Iturbe et al., 1987), has a correlation structure
where xt is the investigated time series; ht , a white equivalent to that of an ARMA(2,2) process.
noise, i.e. a non-correlated, zero-mean random In the adaptive calibration, the parameters are
variable that is also not correlated with the past values estimated in correspondence with each forecast
of xt ; f1 ; …; fp and u1 ; …; up , the auto-regressive and instant, on the basis of the last values measured in
moving-average parameters, respectively; and mx ; the real-time. The number of past observations to be
mean of the time series. used for each calibration was chosen on the basis of
Parameter estimation for ARMA models can be the results of a previous study (Brath et al. 1998). The
performed in several ways. We applied here an estimation of the parameters was performed there
approximation in the spectral domain of the Gaussian with a number w of observations xt immediately
maximum likelihood function, which was first preceding each forecast instant, with w varying from
proposed by Whittle (1953) for short-memory models. 7 to 100, aiming at identifying the value of w that
provides the best forecasting performances. The
3.1. ARMA model application results showed that for increasing w, the efficiency
of the forecast improved moderately for short lead-
The application of low-order ARMA processes to times (1–3 h), but a longer set of past data (more than
model short-term precipitation values is considered 3 days of previous hourly observations) provided a
here, following the modelling framework proposed much better performance for lead-times longer than
by Brath et al. (1988) and Burlando et al. (1993). 4 h. Thus, we set the moving window of past rainfall
The application of ARMA models requires the data observations to be used in each adaptive calibration
to be stationary and this is often not the case for hourly equal to the 100 last measured hourly observations
rainfall observations, whose statistical properties may (that is, w 100:
136 E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
4. Artificial neural networks The use of ANN for rainfall forecasting has not been
fully explored, yet. A pioneer work is the study by
As mentioned above, the rainfall generation French et al. (1992), who applied a neural network to
mechanism is extremely difficult to identify and forecast 1 h ahead, two-dimensional rainfall fields on a
model: there is still much that is not understood regular grid. The forecast was based on synthetically
regarding the small-scale behaviour of precipitation generated rainfall grid values corresponding to the
and the underlying physical laws. It is not easy to previous hourly period. Kuligowski and Barros (1998)
recognise all the existing complex, and typically generated a QPF of point precipitation accumulated
non-linear, relationships between the various aspects over the following 6-h period using as inputs the ante-
of the dynamical process (French et al., 1992; cedent rainfall depths measured in adjacent gauges and
Kuligowski and Barros, 1998). the radiosonde-based wind direction. Our interest is
To accommodate the inability of an ARMA-type mainly focused on the hydrological use of rainfall fore-
model to account for these effects, non-linear casts: the implemented scheme provides a spatially
statistical models have been proposed, such as the averaged rainfall forecast for all the lead-times from 1
threshold model and the bilinear model, but their to 6 h, directly usable as input to the rainfall–runoff
complexity often makes them unsuitable for lumped model, and the only available information
operational applications (Chakraborty et al., 1992). used are past rainfall observations.
In addition, such models still belong, like the Neural networks distribute computations to
ARMA type, to the model-driven approaches, requir- relatively simple processing units called neurons,
ing an identification of the kind of relation between grouped in layers and densely interconnected. Three
the variables (model selection) and an estimation of different layer types can be distinguished: input layer,
the selected model parameters. From these considera- connecting the input information (and not carrying out
tions stems the interest in alternative non-linear fore- any computation), one or more hidden layers, acting
casters, belonging to the data-driven approaches, as intermediate computational layers between input
where no a priori relationship between known para- and output, and an output layer, producing the final
meters and observed values has to be hypothesised output. In correspondence to each computational node
and no knowledge of the underlying process is J, each entering value
IJi is multiplied by a connec-
needed. Both artificial neural network architectures tion weight
wij : Such products are then all summed
and nearest-neighbour methods present these with a neuron-specific parameter, called bias
bj ;
appealing characteristics. used to scale the sum of products into a useful
Artificial neural networks have been widely studied range. The computational node finally applies a non-
and applied to a variety of problems, including hydro- linear activation function ( f ), often a sigmoid, to the
meteorological simulation and forecasting. Several above sum producing the node output
OJ:
studies have been dedicated to the prediction of Neural networks are trained (calibrated) with a set
river flows (at a time scale ranging from one year to of observed input and output (called target to be
one day) with no exogenous inputs, that is with only distinguished from the network final output) data
the use of past flow observations (e.g. Karunanithi et pairs, the training data set, which is processed repeat-
al., 1994). However, the large majority of the ANN edly, changing the values of the parameters until they
hydrologic applications predict future flows based on converge to values such that each input vector
the knowledge of previous rainfall depths (and other produces output values as close as possible to the
meteorological variables) along with past observed desired target vectors. Several variants of the popular
flows. The appeal of the use of ANN as black-box and extensively tested BackPropagation (BP) training
rainfall–runoff models lies mainly in their ability to algorithm were tested in the present work. The
reproduce the non-linear nature of the rainfall–runoff Levenberg–Marquardt algorithm, a quasi-Newton
transformation, and encouraging results have been method that proved to be the quickest and less easily
obtained in literature on both real and synthetic hydro- trapped in local minima among all the tested training
logic data (e.g. Hsu et al., 1995; Minns and Hall, techniques, was finally chosen.
1996; Shamseldin, 1997; Zealand et al., 1999). The comparison of the performance of networks
E. Toth et al. / Journal of Hydrology 239 (2000) 132–147 137
w11h b1h
Pt
w1o
b2h bo
Pt-1
Pt+1* Lead-time = 1 step
Pt-2 b3h
Pt-3
w11h b1h
Pt+1*
w1o
b2h bo
Pt
Pt+2* Lead-time = 2 steps
Pt-1 b3h
Pt-2
a
bo1
b
Fig. 1. (a) Feed-forward recursive network. Pt is the precipitation process; whij ; woi are the connection weights towards the hidden and output
layers, respectively, and bj are the node biases. (b) Feed-forward direct multi-step network. Pt is the precipitation process; whij ; woij are the
connection weights towards the hidden and output layers, respectively, and bj are the node biases.
allowing different feed-forward and feedback connec- were also tested: they provided comparable results,
tions between the nodes led to the choice of a classic but longer time and more random initialisations
multi-layer feed-forward network, where the informa- were needed for training.
tion flows only in one direction, from the input As far as the number of hidden layers is concerned,
through the hidden up to the output layer. Networks there is no theory yet to tell how many hidden layers
allowing other types of connections between the are needed. It has been proved that only one layer of
layers (recurrent and cascade-forward networks) hidden units “suffices to approximate any function
138 E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
with finitely many discontinuities to arbitrary preci- that the optimal network architecture and properties
sion”, provided the activation functions of the hidden are highly problem-dependent and no definitive estab-
units are non-linear (the “Universal Approximation lished methodology exists to deal with the neural
Theorem”, see Hornik et al., 1989). The hidden network modelling problem. As described above,
nodes also allow taking into account the presence of preliminary forecast analyses were performed on a
non-stationarities in the data, such as trends and few storm events, for choosing, on the analysed time
seasonal variations (Maier and Dandy, 1996). series, different architectures and properties of the
Obviously, the introduction of additional hidden networks.
layers allows the fit of a larger variety of target func- From all the above considerations it was decided to
tions. On the other hand, the use of more than one extensively test ANN architectures consisting of a
hidden layer substantially increases the number of multi-layer feed-forward network with only one hidden
parameters to be estimated and the training time, layer. The network is trained with the Levenberg–
and it exacerbates the problem of local minima, Marquardt algorithm and the multi-step approach is
increasing the need for several random initialisations. the direct multi-step method. The optimal complexity
Moreover, the addition of hidden layers often fails to of the model, that is the number of input and hidden
provide noticeable improvement in the out-of-training nodes, will be determined, as it is usually done, by a
forecasting application (see, for example, Zealand et trial-and-error approach.
al., 1999), as confirmed also in our case study
applications.
The use of an ANN for forecasting time series 5. K-NN method
implies that the input nodes are connected to a number
of past observed values supposed sufficient to identify The K-nearest-neighbour method has its origins as a
the process at future time steps. For forecasting non-parametric statistical pattern recognition proce-
several time steps ahead (multi-step ahead predic- dure, aiming at distinguishing between different
tion), two methods have been considered. The first patterns according to chosen criteria. Among the
is the recursive multi-step method: the network has various non-parametric techniques, in the sense that
only one output node, forecasting a single step ahead, no theoretical or analytical relation is known or
and the network is applied recursively, using the assumed between the inputs and the outputs, it is the
previous predictions as inputs for the subsequent most intuitive, but nevertheless possesses powerful
forecasts (see Fig. 1a). This forecasting technique is statistical properties. Yakowitz (1987) and Karlsson
similar to the one used by the ARMA-type models. and Yakowitz (1987a,b) did considerable work in
The second method (direct multi-step) exploits the extending the K-NN method to time-series and fore-
capability of a neural network to provide a multiple casting problems, obtaining satisfactory results and
output, when several nodes are included in the output constructing a robust theoretical base for the K-NN
layer, and each output node represents one time step to method. The intuitiveness of the approach and the
be forecasted (see Fig. 1b). In our preliminary tests, powerful theoretical basis have made the method
the recursive multi-step method provided very good attractive to forecasters, especially in the hydrologic
results for lead-time equal to one time step, but there field, where the method found successful applications
was a drastic deterioration when the lead-time (Karlsson and Yakowitz, 1987a,b; Galeati, 1990;
increased, as could be expected, because in the recur- Kember and Flower, 1993; Todini, 1999).
sive methodology the forecast errors are propagated The prediction of a time series is based on a local
into subsequent forecasts. Given the importance of approximation, making use of only the nearby obser-
good performance in correspondence with the longer vations. For each forecast instant t, let x d
t
lead-times, the direct multi-step method was chosen.
xt ; …; xt⫺d⫹1 be a feature vector of past records. A
feature vector is a vector that summarises the whole
4.1. ANN application past history in a smaller-dimension vector of observa-
tions supposed to contain most of the information
The most crucial disadvantage of ANN models is relevant to the forecast. The method assumes that
E. Toth et al. / Journal of Hydrology 239 (2000) 132–147 139
the probability distribution of the random variable subsequent to these K historical nearest neighbours.
conditioned on the entire past
xt⫹1 =xt ; xt⫺1 ; …; is It may be noticed that the K-NN approach does not
the same as that of the random variable conditioned require the selection of a class of models and the
on only the d past observations
xt⫹1 =xd
t: estimation of the model parameters, so that the
It was proved that, even if x d
t does not satisfy the identification of a specific form of the input/output
above “history summarisation” properties, the K-NN relationship is not needed.
forecaster will be asymptotically optimal among all
the forecasters defined on the feature vector x d
t: 5.1. K-NN method application
That is, under fairly general circumstances, conver-
The nature of the nearest-neighbour method makes an
gence to the optimal forecaster is assured as the
adaptive calibration approach completely meaningless,
historical data set increases (Karlsson and Yakowitz,
because the approach is based on the presence of an
1987b). Let us indicate the expectation of the next
extended database. In fact, in addition to being data-
value as x^t⫹1 , conditioned on the current feature
driven, it is a method that does not detect any input/
vector x d
t; that is,
output mapping function, not even a posteriori (whereas
x^t⫹1 Ext⫹1 兩xd
t:
2 the ANNs do), and it has, therefore, no extrapolation
ability when presented with an unfamiliar input vector.
To estimate x^t⫹1 ; the K-NN method imposes a metric, As a consequence, only the split-sample application was
denoted by 储·储, on the feature vector x d
t to find the performed.
set of K past nearest neighbours of x d
t; i.e. the K d- A trial-and-error procedure was implemented for
dimensional vectors of past observations: x d
tj ; J finding the number of nearest neighbours, K, and the
1; …; K; which minimise 储xd
t ⫺ x d
tj 储: dimension of the feature vector d (corresponding to
The most intuitive and widely used metric to the number of past rainfall data considered represen-
identify neighbours is the Euclidean norm, which, tative for the forecast), providing the best performing
for a d-dimensional vector Z d
z1 ; z2 ; …; zd ; is
forecasts. As Karlsson and Yakowitz (1987a) empha-
!1=2 sise, K and d seem, and indeed are, parameters but the
X
d
储Z 储
d 2
zi :
3 method itself is nevertheless non-parametric. In fact,
i1 K and d do not imply a model for x^t⫹L : the purpose of
their search “is pragmatic; it is to make the Nearest-
The forecast is then obtained by averaging the Neighbour forecaster as accurate as possible for a
temporal evolution of the nearest neighbours, given database”.
assumed to be similar to the evolution of the current
situation, that is,
6. Analysis and comparison of rainfall forecasting
1 XK
results
x^t⫹1 x :
4
K j1 tj ⫹1
The performances of the considered time-series
The generalisation to higher lead-times L is straight- methods (ARMA models, ANN and K-NN method)
forward: are first of all analysed and compared assessing the
respective ability to forecast the spatially averaged
1 XK
rainfall depths belonging to the validation set storms.
x^t⫹L x :
5
K j1 tj ⫹L The issued rainfall forecasts will be successively
routed through the conceptual rainfall–runoff trans-
Thus, in our case, the K-NN algorithm looks through formation model and the performances of the result-
all consecutive d-dimensional vectors in the entire ing river flow forecasts will be analysed and compared
historical rainfall depths database and locates K of in the next session.
these d-ples, which are closest to the vector of the d The performances of the various forecasting techni-
most recent rainfalls. The prediction of the next rain- ques are investigated through a trial-and-error process.
fall is then taken to be the average of the rainfall The rainfall forecasting results are extremely difficult to
140 E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
Fig. 2. Mean correlation coefficients of the one to six steps ahead rainfall forecasts issued: (a) by ANN architectures with varying number of
input nodes, and (b) by nearest-neighbour implementations with varying number of neighbours.
analyse, and a variety of quantitative measures (root respective standard deviations. It ranges from ⫺1.0 to
mean square error, mean absolute error, coefficient of 1.0, with higher values indicating better agreement.
persistence, efficiency coefficient, correlation coeffi-
cient, index of agreement) were considered for synthe- 6.1. Description of the split-sample and adaptive
sising the effectiveness of the performances over all the applications
lead-times, so to make a comparison possible. Different
methods provided the best results for different forecast 6.1.1. ARMA models: trial-and-error processes
performance measures and for different lead-times, so In the split-sample calibration, low-order ARMA
that a performance classification could not be assessed models were implemented, both purely auto-regressive
unequivocally. and with a moving-average component, provided that
Among the considered measures, the correlation the sum of the auto-regressive and moving-average
coefficient (CC) was chosen, following other rainfall orders, p ⫹ q; is equal to or less than 6. All the tested
forecasting studies (French et al., 1992; Kuligowski ARMA models provided almost analogous perfor-
and Barros, 1998), as the most representative for mances (mean CC 0.291–0.293).
assessing rainfall forecast performance. The correla- In the adaptive calibration, the tested ARMA
tion coefficient is given by the covariance of the fore- models had auto-regressive orders of 1 and 2 and
casts and observations divided by the product of the moving-average orders less or equal to 3. The trend
E. Toth et al. / Journal of Hydrology 239 (2000) 132–147 141
Table 1
Correlation coefficients of rainfall predictions
1 2 3 4 5 6
ANN split-sample NI 18, NH 2 0.689 0.511 0.407 0.358 0.331 0.327 0.437
ARMA split-sample Almost all equivalent 0.686 0.430 0.276 0.173 0.114 0.077 0.293
Nearest Neighbours K 70, d 2 0.709 0.493 0.336 0.239 0.174 0.110 0.344
ANN adaptive NI 3, NH 3 0.527 0.307 0.196 0.171 0.169 0.162 0.255
ARMA adaptive p 1, q 0 0.744 0.472 0.283 0.134 0.060 0.003 0.281
of the obtained performances, for all the ARMA from 2 to 12. The improvement of the performance
models, is characterised by relatively good perfor- with an increasing number of nearest neighbours is
mance for lead-time of 1 h, followed by a collapse less noticeable for more than 20 neighbours and
in correspondence of longer time horizons. The over- there is no marginal improvement in the overall
all best results are provided by ARMA models with performance when increasing K beyond 70 (see Fig.
parsimonious configurations, indicating ARMA(1,0) 2b). Small values (from 2 to 4) of the feature dimen-
as the best performing model (mean CC 0.281). sion d gave the most satisfactory results for each given
number of neighbour vectors K.
6.1.2. ANN: trial-and-error processes
In the split-sample application, architectures with a 6.2. Overall comparison of rainfall forecasts
number of input nodes NI ranging from 2 to 24 were
tested. For each input layer dimension, the number of A direct comparison of all the implemented
hidden nodes (NH) was progressively increased from methodologies is presented here, so as to highlight
2 to 8 nodes. A deterioration of the forecasting per- the relative strengths and limitations. Table 1 shows
formance on the validation set, indicating over-fitting, the correlation coefficients of the rainfall predictions
was always shown for moderate dimensions of the provided by each time-series analysis technique (with
hidden layer, the best results corresponding to NH the modelling configurations identified in the trial-
between 2 and 6. The performance of ANN architec- and-error processes), both for different lead-times
tures, considering all the lead-times, improved as the and for the mean over all the lead-times. The table
number of input nodes (NI) increased, with modest ranks ANNs above the nearest-neighbour method and
additional gain for more than 15–18 nodes (see this latter above the ARMA models, on our precipita-
Fig. 2a). tion data. Such results seem to indicate the appropri-
The networks tested in the adaptive calibration of ateness of non-linear approaches when modelling
were extremely parsimonious (both NI and NH rainfall time-series.
ranging from 2 to 5), because the limited number of With regard to the ARMA models, the adaptive
training samples (here chosen corresponding to 100 calibration application provides very good performance
past observations for a fair comparison with the for very short lead-times, while the split-sample
ARMA adaptive approach) would make a complex approach achieves better results for lead-times longer
network easily subject to over-fitting. The optimal net- than 3 h. This is consistent with the results obtained by
work complexity for adaptively calibrated neural Burlando et al. (1993), who found a superiority of the
networks seems to correspond to numbers of input adaptive calibration method tested for predictions 1 and
and hidden nodes, NI and NH, equal to 3. 2 h ahead.
The ANN adaptive calibration application proves to
6.1.3. Nearest-neighbours: trial-and-error process be unreliable for short lead-times and especially
A trial-and-error procedure was implemented for a inadequate for reproducing low rainfall, even if it is
number of nearest neighbours, K, ranging from 5 to satisfactorily stable for lead-times longer than 3 h. For
100 and a dimension of the feature vector, d, ranging a better result on short lead-times, we should probably
142 E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
Table 2
Mean correlation coefficients of predictions for different rainfall ranges
Low rainfall (⬍0.1 mm) Medium rainfall (0.1–1 mm) High rainfall (⬎1 mm)
have resorted to the recursive multi-step method, but obtained when predicting extreme rainfall. However,
at the expense of the performance for several steps such a choice would provide poor performance in the
ahead. The ANN corresponding to the split-sample prediction of events that, although not extreme, are
calibration gives the overall best results for lead- relevant in terms of vulnerability of flood-prone areas.
times longer than 2 h, even if this kind of structure Considering all the lead-times and all the rainfall
slightly penalises the 1-h ahead forecast. categories, the ANN with split-sample calibration
Table 2 presents the correlation coefficients seems the most adequate among all the considered
obtained for different rainfall intensity ranges, to get approaches, at least in reference to our case study.
an insight into the type of rainfall for which each In addition, it should be underlined that almost all
method performs better. Both the nearest-neighbour the computational effort spent for the implementation
and ARMA split-sample models provide a good fit of of the ANN split-sample application is concentrated
low-intensity precipitation, but the ANN with split- in the training phase, while the issue of the forecasts
sample calibration allows the best results for medium with the trained network is practically instantaneous,
and high rainfall values. thus making this approach very appealing in a real-
The forecasts issued when the calibration is of the time forecasting framework.
split-sample type present, overall, smaller deviations
from the observed values. Nevertheless, Table 2 high-
lights the difficulties experienced by all the methods 7. Rainfall–runoff transformation: analysis and
in reproducing high-intensity rainfall. The analysed comparison of discharge forecasts
techniques do not seem to properly reproduce high
precipitation values. This is probably due to the fact 7.1. Hydrologic model description
that they are influenced by the majority of low-rainfall
observations in the calibration sets. Analysing the The deterministic model used for simulating the rain-
issued forecasts, it may be noticed that the adaptive fall–runoff transformation is a conceptual continuous
calibrations often issue unreliable forecasts in simulation model called ADM (Franchini, 1996),
correspondence with abrupt changes in the rainfall which is based on the concept of probability distributed
intensities; on the other hand, the forecasts obtained soil moisture storage capacity. The catchment is
with the split-sample approaches tend, when the lead- assumed to be composed of an infinite number of
time increases, towards the average of the calibration elementary areas (each one with a different soil moisture
data, often underestimating the values that follow the content and a different soil moisture capacity) and the
high-intensity occurrences. It may be inferred that this proportion of elementary areas that are saturated is
is the same reason that causes the presence of an described by a distribution function: the total surface
apparent upper limit in the split-sample forecasts. runoff is the spatial integral of the infinitesimal contri-
If a different goodness-of-fit criterion had been bution deriving from the saturated elementary areas.
chosen in the trial-and-error processes, for instance The model is divided into two main blocks: the first
maximising the fit on the highest rainfall values represents the water balance at soil level, while the
alone, better results probably would have been second represents the transfer of runoff production at
E. Toth et al. / Journal of Hydrology 239 (2000) 132–147 143
the basin outlet. The soil, in turn, is divided into two casting theory, is the persistent method, which equals
zones: the upper zone produces surface and subsur- the future rainfall intensity, over all the investigated
face runoff, having as inputs precipitation and poten- lead-times, to the last measured value,
tial evapo-transpiration, while the lower zone (whose
x^t⫹L xt ; ᭙L:
6
input from the first one is the percolation flow)
produces base runoff. The transfer of these compo- The last investigated heuristic approach, somehow
nents to the outlet section takes place in two distinct similar to the persistent method, consists in extrapo-
stages: the first represents the flow along the hill- lating future values setting the intensity for each given
slopes towards the channel network, while the second, lead-time L equal to the mean intensity measured over
the flow along the channel network towards the basin the last L observations, that is,
outlet. The 11 parameters of the ADM model were X
L
accurately calibrated with the Shuffled Complex xt⫺i⫹1
Evolution global optimisation algorithm (Duan et x^t⫹L i1
:
7
al., 1992). The conceptual model has been separately L
parameterised off-line (fixed calibration) so that its This last predictive scheme will be denoted as the
parameters do not change during the forecasting modified persistent method.
period. Joint optimisation of the rainfall–runoff
model coupled with the rainfall forecasters would, 7.3. Analysis and comparison of flow forecasting
of course, be possible, but this might cause, owing performances
to compensation effects, undesirable biases in the
calibration of the parameters of the rainfall–runoff The performance of the discharge forecasts attain-
model (and possibly a departure from their physical able using the QPF provided by the different rainfall
meaning). predictive models was evaluated by computing the
corresponding coefficient of efficiency, which is
widely recognised as one of the most suitable good-
7.2. Standards of reference: heuristic rainfall
ness-of-fit measures for runoff. For the analysis of
predictive approaches
discharge performance results, the discharge series
To evaluate the performances of the analysed time- chosen as a reference was not the series of observed
series forecasting methods when used for providing discharges, but the hourly discharges simulated by the
the inputs to the rainfall–runoff transformation model, conceptual model when using as inputs the observed
the resulting discharges will be compared with those precipitation (“true” discharges). This scenario was
obtained with some predictive benchmarks, consisting considered in order to be able to evaluate the improve-
of rainfall forecasting approaches of a purely heuristic ment obtainable by the rainfall forecasting module
nature. Three predictive procedures have been consid- alone, independently of the effects of the simulation
ered among the possible alternatives that a hydrologi- errors induced by possible residual inadequacies of
cal practitioner may envisage in case sophisticated the hydrologic model.
modelling tools are not available and in case the The coefficient of efficiency for each lead-time L is
only information at his/her disposal are the most given by:
P
recent rainfall observations.
Q ⫺ Q^ t⫹L 2
Probably, the most widespread approach when EL 1 ⫺ P t⫹L 2 ; L 1; …; 6
8
Qt⫹L ⫺ Q
using a rainfall–runoff transformation model in a
real-time framework is to assume that the future rain- where Q^ t is the discharge at time t forecasted using as
fall will be null (null rainfall). It is an optimistic input the predicted rainfall values; Qt ; the value of the
hypothesis, assuming that the forecast is issued at corresponding “true” discharge (known rainfall); and
the end of the event, whereas, especially in watersheds the mean of the Qt series. The summations are
Q;
with short response time, a forecast is needed earlier extended to all the issued forecasts, that is, to all the
in the storm progress. forecast instants t belonging to all the validation
A second term of comparison, widely used in fore- events.
144 E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
1.00
0.99
EFFICIENCY COEFFICIENT
0.98
0.97
0.96
0.95
0.94
0.93
0.92
0.91
0.90
1 2 3 4 5 6
LEAD-TIME (HOURS)
ARMA split-sample ARMA adaptive
ANN split-sample ANN adaptive
Nearest Neighbours Null rainfall
Heuristic persistent Heuristic persistent modified
Fig. 3. Efficiency coefficients of the river flows corresponding to the different rainfall forecasting procedures: ARMA models with split-sample
and adaptive calibration; ANN with split-sample and adaptive calibrations; and nearest-neighbour method. The heuristic approaches: null
rainfall, persistent rainfall method and persistent modified method.
Fig. 3 shows the performances, in terms of inadequate data sets, as the records of low-intensity
efficiency coefficient, of the coupled rainfall–runoff rainfall immediately preceding the arrival of the
forecasting schemes obtained with all the considered highest precipitation intensities certainly are. As was
rainfall forecasting procedures. It may be observed expected, the null rainfall hypothesis proves to be
that the hydrological processes taking place in the unrealistic, since it may strongly underestimate the
rainfall–runoff transformation tend to level out all rainfall volumes, whereas the persistent methods
the rainfall forecasts corresponding to very short (both in the traditional formulation and modified)
lead-times and it dampens out most of the variability provide an improvement with respect to the null
between the different methods highlighted in the rainfall approach.
analysis of the rainfall depths described in the The ARMA model with adaptive calibration shows
previous section. As a consequence, the good a relevant deterioration with increasing lead-time, as
performance of the ARMA adaptive calibration could be expected, given the poor performance of the
approach for 1-h lead-time becomes unnoticeable, rainfall forecasts obtained using this approach in
whereas the relevant deterioration with increasing correspondence to longer lead-times. The discharges
lead-time confirms how the method is less simulated with the ARMA split-sample rainfall
appropriate than the three split-sample calibration predictions appear slightly closer to the reference
methods. discharges in comparison with the results obtained
Moreover, the response time of the watershed shifts with the nearest-neighbours scheme, whilst the perfor-
the poor results provided by the adaptive calibration mance of the rainfall forecasts appeared superior for
of ANN from the low to the large lead-times. The the nearest-neighbour method. It may be hypothesised
unsatisfactory results of the ANN with adaptive that this is due to the non-linear and threshold effects
calibration is probably due to the strong limitation characterising the rainfall–runoff transformation
of the forecasting ability of ANN when trained on modelling.
E. Toth et al. / Journal of Hydrology 239 (2000) 132–147 145
0
Hourly precipitation (mm)
0.5
1
1.5
2
2.5
0 10 20 30 40 50 60
ANN
200 K-NN
ARMA
Discharge (m /s)
150
3
100
50
0
0 10 20 30 40 50 60
Time (hours from 5.00 of 22/02/95)
Fig. 4. Example of rainfall and runoff observations and forecasts during the event of 22 February 1995: observed precipitation (past, black bars;
future, white bars), observed discharge (past, solid line; future, dotted line) and 1 to 6 h ahead predictions issued by ANN, nearest-neighbour
and ARMA methods with split-sample calibration.
the predictors based on ANN architectures, within the anthropogenic effects on hydrological processes”,
analysed range of parameters and model structures. contract no. 9908032871_001.
The dampening effect induced by the rainfall–runoff
transformation processes tends to level the per-
formances of the different methods, so that, when References
considering the predicted runoff, the satisfactory fit
that some of the methods allow for very short-term Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis: Forecasting
rainfall forecasts is of little worth. and Control. Holden Day, San Francisco, CA.
Overall, the study indicates that the considered Bras, R.L., Rodriguez-Iturbe, I., 1994. Random Functions and
time-series analysis techniques provide an improve- Hydrology. Dover, New York.
Brath, A., 1999. On the role of numerical weather prediction models
ment in the flood forecasting accuracy with respect to in real-time flood forecasting. In: Proceedings of the
the use of intuitive, heuristic rainfall prediction International Workshop on River Basin Modeling: Management
approaches, even if the rainfall forecasting perfor- and Flood Mitigation, September 25–26, 1997, Monselice
mance measures indicate only a weak to moderate (Italy), pp. 249–259.
relationship between forecasted and observed values. Brath, A., Burlando, P., Rosso, R., 1988. Sensitivity analysis of real-
time flood forecasting to on-line rainfall predictions. In: Siccardi,
This is due to the fact that past rainfall observations F., Bras, R.L. (Eds.), Selected Papers from the Workshop on
alone are not sufficient to predict future precipitation Natural Disasters in European-Mediterranean Countries, Perugia,
accurately, not even for short time periods. Italy, pp. 469–488.
The results shows that the use of time-series Brath, A., Montanari, A., Toth, E., 1998. Su alcune tecniche stocas-
analysis techniques for precipitation forecasting may tiche per il miglioramento delle prestazioni del preannuncio di
piena (in Italian, with English Abstr.). Proc. XXVI Convegno di
allow an extension of the lead-time up to which a idraulica e costruzioni idrauliche, Catania, Italy, vol. II, pp. 159–
reliable flood forecast may be issued, providing a 172.
quick prediction based on past values solely and Brockwell, P.J., Davis, R.A., 1987. Time Series: Theory and
directly in the format required by the rainfall–runoff Methods. Springer, New York.
transformation model. On the other hand, strong Burlando, P., Rosso, R., Cadavid, L.G., Salas, J.D., 1993. Forecasting
of short-term rainfall using ARMA models. J. Hydrol. 144, 193–
limitations to a time-series analysis approach are 211.
due to the lack of information needed for a reliable Chakraborty, K., Mehootra, K., Mohan, C.K., Ranka, S., 1992.
prediction. More substantial improvement should Forecasting the behavior of multivariate series using neural
certainly be pursued through numerical weather networks. Neural Netw. 5, 961–970.
prediction models, once they are able to provide Duan, Q., Sorooshian, S., Gupta, V.K., 1992. Effective and efficient
global optimization for conceptual rainfall–runoff models.
timely rainfall forecasts at a temporal and spatial Water Resour. Res. 28 (4), 1015–1031.
scale compatible with the requirements of flood fore- Franchini, M., 1996. Use of a genetic algorithm combined with a
casting in small and medium-sized basins. local search method for automatic calibration of conceptual
rainfall runoff models. Hydrol. Sci. J. 41 (1), 21–39.
French, M.N., Krajewski, W.F., Cuykendall, R.R., 1992. Rainfall
forecasting in space and time using a neural network. J. Hydrol.
Acknowledgements 137, 1–31.
Galeati, G., 1990. A comparison of parametric and non-parametric
We are grateful to the three anonymous reviewers methods for runoff forecasting. Hydrol. Sci. J. 35 (1), 79–94.
whose suggestions allowed us to improve the presen- Hornik, K., Stinchcombe, M., White, H., 1989. Multilayer feedfor-
tation of our results. The work presented here has been ward networks are universal approximators. Neural Netw. 2,
359–366.
partially supported by the European Community Hsu, K., Gupta, H.V., Sorooshian, S., 1995. Artificial neural
through the grant ENV4-CT97-0529 (FRAMEWORK network modeling of the rainfall–runoff process. Water Resour.
project), by the National Research Council of Italy — Res. 31 (10), 2517–2530.
Group for the Prevention of Hydrogeological Karlsson, M., Yakowitz, S., 1987a. Nearest-neighbor methods for
Disasters, contract no. 98.00587.PF42 and by the nonparametric rainfall–runoff forecasting. Water Resour. Res.
23 (7), 1300–1308.
Ministry of University and Research in Science and Karlsson, M., Yakowitz, S., 1987b. Rainfall–runoff forecasting
Technology (MURST) of Italy through its 40% methods, old and new. Stochastic Hydrol. Hydraul. 1, 303–318.
national grant to the program on “Climatic and Karunanithi, N., Grenney, W.J., Whitley, D., Bovee, K., 1994.
E. Toth et al. / Journal of Hydrology 239 (2000) 132–147 147
Neural networks for river flow prediction. J. Comput. Civil considerations in the modeling of temporal rainfall. Water
Engng 8 (2), 201–220. Resour. Res. 20 (11), 1611–1619.
Kember, G., Flower, A.C., 1993. Forecasting river flow using Rodriguez-Iturbe, I., Febres de Power, B., Valdes, J.B., 1987.
nonlinear dynamics. Stochastic Hydrol. Hydraul. 7, 205–212. Rectangular pulses point process models for rainfall: analysis
Krzysztofowicz, R., 1995. Recent advances associated with flood of empirical data. J. Geophys. Res. 92 (D8), 9645–9656.
forecast and warning systems. US National Report to IUGG, Shamseldin, A.Y., 1997. Application of a neural network technique
1991–1994, Rev. Geophys. 33 (Part 2, Suppl. S), 1139–1147. to rainfall–runoff modelling. J. Hydrol. 199, 272–294.
Kuligowski, R.J., Barros, A.P., 1998. Experiments in short-term Todini, E., 2000. Real-time flood forecasting: operational experience
precipitation forecasting using artificial neural networks. Mon. and recent advances. In: Marsalek, J. et al. (Eds.), Flood issues in
Weather Rev. 126, 470–482. contemporary water management, Kluwer Academic Publisher,
Maier, H.R., Dandy, G.C., 1996. The use of artificial neural The Netherlands, pp. 261–270.
networks for the prediction of water quality parameters. Water Whittle, P., 1953. Estimation and information in stationary time
Resour. Res. 32 (4), 1013–1022. series. Ark. Mat. 2, 423–434.
Minns, A.W., Hall, M.J., 1996. Artificial neural networks as Yakowitz, S., 1987. Nearest neighbor method for time series
rainfall–runoff models. Hydrol. Sci. J. 41 (3), 399–417. analysis. J. Time Ser. Anal. 8 (2), 235–247.
Obeysekera, J.T.B., Tabios III, G.Q., Salas, J.D., 1987. On parameter Zealand, C.M., Burn, D.H., Simonovic, S.P., 1999. Short term
estimation of temporal rainfall models. Water Resour. Res. 23 streamflow forecasting using artificial neural networks. J.
(10), 1837–1850. Hydrol. 214, 32–48.
Rodriguez-Iturbe, I., Gupta, V.K., Waymire, E.C., 1984. Scale