Comparing WRF Physics Options
Introduction

It is fairly common to use three main measures of forecast accuracy: bias, mean absolute error (MAE), and root mean squared error (RMSE). For a reference on calculating these, you can try this link: A Protocol for Standardizing the Performance Evaluation of Short-Term Wind Power Prediction Models.

Please note that "bias" is defined as the mean error, and error is defined as the measured value minus the predicted value. This means that a negative bias indicates that the predictions were generally too high, which is counterintuitive. To avoid this confusion, I use the term "mean error" rather than "bias."

I also use a technique here that I have not seen mentioned elsewhere, although it seems likely that someone has used it before. I coined the terms "Mean Absolute Bias Corrected Error (MABCE)" and "Root Mean Squared Bias Corrected Error (RMSBCE)." These statistics indicate that before the MAE or RMSE is calculated, any bias that has been identified is removed from the individual forecasted values; the errors are "bias corrected." The reasoning is that when forecasted values are used for predicting wind or generated electrical power, any bias that has been previously identified would be removed. For practical purposes, the details of what comes out of the model are less important than the quality of the forecast that would actually be passed on to the utility companies. That passed-on forecast would have any known biases removed, so the statistics that compare forecasts should also have any known biases removed.

PLEASE NOTE: For most of the data shown, I do something that should raise a red flag. I calculate the bias from the same set of data from which I later remove the bias. This is simply because I have not accumulated measurements and forecasts over a long enough period of time to calculate the bias separately. I would if I could, and I will as time and data progress. And perhaps my shortcut is irrelevant.
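As a rough illustration of the statistics described above, here is a minimal Python sketch. It assumes that error is the measured value minus the predicted value, as defined above, and that "bias corrected" means subtracting the mean error from each individual error (equivalent to adding it to each forecast). The function name `forecast_error_stats` and the optional `known_mean_error` argument are hypothetical names chosen for illustration; they do not come from any library or from the scripts used for this page. When `known_mean_error` is omitted, the bias is taken from the same data, which is the shortcut flagged in the note above.

```python
import math


def forecast_error_stats(measured, predicted, known_mean_error=None):
    """Return mean error, MAE, RMSE, MABCE and RMSBCE for paired values.

    Error is measured minus predicted, so a negative mean error means
    the forecasts were generally too high.
    known_mean_error (hypothetical argument): a bias estimated from a
    separate data set. If None, the bias is computed from this data.
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    errors = [m - p for m, p in zip(measured, predicted)]
    n = len(errors)
    mean_error = sum(errors) / n

    # Bias to remove: an independent estimate if given, else this sample's.
    bias = mean_error if known_mean_error is None else known_mean_error
    corrected = [e - bias for e in errors]

    return {
        "mean_error": mean_error,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mabce": sum(abs(e) for e in corrected) / n,
        "rmsbce": math.sqrt(sum(e * e for e in corrected) / n),
    }


# Made-up wind speeds (m/s), for illustration only.
stats = forecast_error_stats([5.0, 7.0, 6.0, 8.0], [6.0, 9.0, 6.0, 9.0])
print(stats)
```

For these made-up values the mean error is -1.0 (forecasts too high), the MAE is 1.0, and the MABCE is 0.5. When the bias is computed from the same data, the RMSBCE can never exceed the RMSE, which is one reason an independently estimated bias would be a fairer test.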
The bias-corrected values are not much different from the uncorrected values.

Testing Boundary Layer and Surface Layer Schemes

The main WRF physics options that were tested were the planetary boundary layer schemes and the surface layer schemes. These are controlled by the WRF namelist values bl_pbl_physics and sf_sfclay_physics, respectively.

Some of the work was done using WRF version 3.0.1.1, which was released in 2008. The work done on that release indicated that the MYJ boundary layer scheme (bl_pbl_physics = 2) and the ETA surface layer scheme (sf_sfclay_physics = 2) were the best for the initialization data with which I was testing. I used archived RUC data, which is on fixed pressure levels spaced every 25 mb.

Most of these tests used 198 forecasts, which seemed reasonable, though not all could be matched up to the sodar data that is being used as the measured values. When graphed, the errors varied enough from one forecast hour to another to indicate that more forecasts and measurements should be used. For testing the new options in WRF 3.1, the number of forecasts was doubled to 396. This has also been done for some of the WRF version 3.0.1.1 options so that like comparisons could be put on the same graph. For 396 forecasts, approximately 370 could be matched up to sodar readings per forecast hour. Occasionally, the sodar either does not record wind speeds or the measurements are of poor quality and I discard them.
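The matching step described above can be sketched roughly as follows. This is a minimal illustration, not the code used for these tests: the function name `mae_by_forecast_hour` and the tuple layout (forecast hour, predicted speed, measured speed) are assumptions, and a missing or discarded sodar reading is represented here by `None`.

```python
from collections import defaultdict


def mae_by_forecast_hour(pairs):
    """Group errors by forecast hour, skipping unusable sodar readings.

    pairs: iterable of (forecast_hour, predicted, measured) tuples, where
    measured is None when the sodar recorded nothing or the reading was
    discarded as poor quality (an assumed convention for this sketch).
    Returns {forecast_hour: (matched_count, mae)}.
    """
    errors = defaultdict(list)
    for hour, predicted, measured in pairs:
        if measured is None:
            continue  # no usable measurement to match this forecast
        errors[hour].append(measured - predicted)

    return {
        hour: (len(errs), sum(abs(e) for e in errs) / len(errs))
        for hour, errs in sorted(errors.items())
    }


# Made-up values, for illustration only.
sample = [
    (1, 6.0, 5.0),
    (1, 7.0, None),   # sodar reading missing or discarded
    (1, 8.0, 10.0),
    (2, 5.0, 5.5),
]
print(mae_by_forecast_hour(sample))
```

Reporting the matched count alongside each statistic makes it easy to see how many of the forecasts actually contributed to a given forecast hour.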
For completeness, here are the MABCE and RMSE graphs for the same data. The pattern is basically the same.

![]()

I hope to add a few more lines for some other options from WRF version 3.0 that are being rerun. The previous tests were all done in version 3.0.1.1, and the results may be different if bug fixes changed them. Also, I have accumulated more sodar data, and the change in season may affect some results.

As mentioned, these comparisons were done with RUC data for initialization and lateral boundary conditions. Similar tests have been started with NAM data. NAM data comes out less frequently but seems to be of better quality. As part of an initial check of WRF results, please see this page: WRF vs Initialization Data.