Project: Parallel Genetic Algorithms: An Exploration of Weather Prediction Through Clustered Computing
Student Researchers: Emily Gibson, Jessie Burger
Advisor: Deborah Knox
Institution: The College of New Jersey





PURPOSE
We predicted weather temperatures using genetic algorithms as a means to explore how to design, implement, and apply genetic algorithms to real world problems. Weather systems are extremely complex and poorly understood, which makes weather prediction an excellent subject for genetic algorithms.

PROCEDURE
A preexisting cluster of six identical Dell Optiplex machines, each with a Pentium III 1 GHz processor and 256 MB of RAM, was used for the research project. Each node runs Red Hat Linux 7.2, while the NFS server runs Red Hat Linux 6.2. The cluster of desktop workstations runs the MPICH distribution of MPI. The MPI library allows the creation of C, C++, and FORTRAN programs using standard parallel programming functions such as send, receive, barrier, and gather. In addition, we used the freely available parallel genetic algorithm library PGAPack in conjunction with C and MPI to program genetic algorithms. Similar to MPICH, PGAPack allows C and FORTRAN programmers to begin writing genetic algorithms using helpful functions such as select, crossover, mutate, evaluate, and fitness. Lastly, our temperature data came from the United States Historical Climatology Network Daily Temperature, Precipitation, and Snow Data for 1871-1997.

The genetic algorithm used a binary chromosome in which each allele corresponded to a different rule. Rules were organized by type, and the value of the allele indicated whether or not the rule was to be applied. Three simple operations were identified which could be used to predict the weather based solely on temperature: calculating the average, calculating the weighted average, and performing trend analysis. These three rule types were applied in two different ways. The first rules in the chromosome applied average, weighted average, and trend analysis rules to the current year's data, while the remaining rules applied these functions to data spanning the past twenty years for average and weighted average rules and the past 5 years for trend analysis rules. Overall, there were 519 rules in the chromosome. The evaluation function iterated through the chromosome and applied the corresponding rule only if the allele value was equal to one. Each rule produced a prediction, which was then averaged with the predictions produced by all of the other activated rules to find the final prediction.
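The evaluation step described above can be sketched as follows. This is a minimal illustration in Python, not the project's C/PGAPack code: the three rule functions, their names, the linear weighting, and the sample temperatures are all assumptions made for the example, and only three rules are shown instead of 519.

```python
from statistics import mean

def average_rule(temps):
    """Predict the plain mean of the temperature window."""
    return mean(temps)

def weighted_average_rule(temps):
    """Predict a mean that weights later readings more heavily.
    Linear weights 1, 2, 3, ... are an assumption for illustration."""
    weights = range(1, len(temps) + 1)
    return sum(w * t for w, t in zip(weights, temps)) / sum(weights)

def trend_rule(temps):
    """Extend the average change per reading by one more step."""
    if len(temps) < 2:
        return temps[-1]
    step = (temps[-1] - temps[0]) / (len(temps) - 1)
    return temps[-1] + step

def evaluate(chromosome, rules, temps):
    """Apply each rule whose allele is 1 and average the predictions."""
    predictions = [
        rule(temps)
        for allele, rule in zip(chromosome, rules)
        if allele == 1
    ]
    if not predictions:
        # No rule is switched on; the report does not say how this
        # case was handled, so the sketch returns None.
        return None
    return mean(predictions)

# Hypothetical data: four consecutive temperature readings.
rules = [average_rule, weighted_average_rule, trend_rule]
temps = [50.0, 52.0, 54.0, 56.0]

# Chromosome 1-0-1 activates the average and trend rules only.
prediction = evaluate([1, 0, 1], rules, temps)
print(prediction)
```

Because every activated rule contributes equally to the final average, rules that produce nearly identical predictions add length to the chromosome without adding much information.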

RESULTS
Utilizing the PGAPack and MPI libraries, we implemented a small-scale parallel genetic algorithm and developed a simple temperature prediction scheme based on an abridged set of data. Our two large-scale trial runs focused on understanding the effects of different crossover and mutation rates and on determining which particular sequence of rules provided the best prediction. One run focused on a single day and varied five factors, while the other run focused on multiple days and varied two factors.

The first run varied five factors: population size, maximum number of iterations, crossover rate, mutation rate, and number of chromosomes to be replaced. The population size had 5 possible values: 100, 250, 500, 1000, or 5000. The maximum number of iterations had 5 possible values: 500, 1000, 2500, 5000, or 10,000. The crossover rate had 6 possible values: 0.0, 0.2, 0.4, 0.6, 0.8, or 1.0. The mutation rate had 6 possible values: 0.0, 0.2, 0.4, 0.6, 0.8, or 1.0. The number of chromosomes to be replaced had 5 possible values: .011, .014, .02, .033, or .100 of the population size. One run was conducted for each possible combination of factors, for a total of 4500 runs (5 x 5 x 6 x 6 x 5). While our analysis techniques were not sophisticated enough to show the interactions between all five variables easily, a rough analysis reveals little difference in accuracy between the combinations of variables.
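The run count can be checked by enumerating the full factorial grid. The sketch below is in Python and uses variable names chosen for illustration. It reads the crossover and mutation rates as 0.0 through 1.0 in steps of 0.2, which is an assumption about the intended values.

```python
from itertools import product

# Factor levels for the first run.
population_sizes = [100, 250, 500, 1000, 5000]
max_iterations = [500, 1000, 2500, 5000, 10000]
crossover_rates = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]   # assumed reading of the rates
mutation_rates = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]    # assumed reading of the rates
replacement_fractions = [0.011, 0.014, 0.02, 0.033, 0.100]

# One configuration per combination of the five factors.
configurations = list(product(
    population_sizes,
    max_iterations,
    crossover_rates,
    mutation_rates,
    replacement_fractions,
))

print(len(configurations))  # 5 * 5 * 6 * 6 * 5 = 4500
```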

The second run utilized the best performing variable combinations from the first run and tested these combinations over a wider range of days. The population size, the maximum number of iterations, and the number of chromosomes to replace were held constant to simplify analysis. Therefore, only crossover and mutation values were changed between each prediction. For low rates, mutation and crossover equally affect the accuracy of the prediction. However, when rates of mutation and crossover increase above .6, the mutation replacement method yields a closer prediction. On the whole, our results illustrated that neither mutation nor crossover rates above .6 produced very accurate predictions.

The accuracy of predictions from the first run, where all five variables were in use for predicting a single day's temperature, varied between 6.5% and 23.4% error. The accuracy of predictions from the second run, where only two variables were in use to predict the temperature on a variety of days, ranged from 0.0% to 130.0% error.
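The report does not state how percent error was computed. A common definition, assumed here purely for illustration, divides the absolute difference between predicted and actual temperature by the actual temperature. Under that assumption, errors above 100% are possible whenever the prediction misses by more than the actual value itself, which happens easily on cold days. The temperatures below are hypothetical, not project data.

```python
def percent_error(predicted, actual):
    """Relative error in percent (an assumed definition, not confirmed
    by the report)."""
    if actual == 0:
        raise ValueError("percent error is undefined when the actual value is 0")
    return abs(predicted - actual) / abs(actual) * 100.0

# Hypothetical warm day: a 13-degree miss is a modest relative error.
warm_day = percent_error(predicted=63.0, actual=50.0)

# Hypothetical cold day: the same 13-degree miss exceeds 100%.
cold_day = percent_error(predicted=23.0, actual=10.0)

print(warm_day, cold_day)
```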

CONCLUSIONS
We believe that the low accuracy of our genetic algorithm, regardless of the configuration of variables, is the result of the high number of similar rules used in the algorithm. For a binary genetic algorithm to be successful there must be little to no overlap in the tasks being encoded. Unfortunately, many of our approaches overlapped significantly. Future work includes simplification of the chromosome to more accurately determine the benefits of crossover versus mutation, as well as eliminating a selection of rules to shorten the chromosomes and decrease rule overlap.

Through our Genetic Algorithm CREW Project we have learned a novel programming approach to scientific modeling. Because genetic algorithms are based on a trial-and-error process, we have also enhanced our problem-solving abilities. Unfortunately, our algorithm led to an indeterminate improvement of temperature prediction. However, future work is planned to exploit our two most important results: first, that chromosome design is more influential than crossover and mutation rates on algorithm performance, and second, that chromosome lengths should be kept small to maximize results. With these two findings in mind, we hope to design a more robust and accurate prediction algorithm.

PUBLICATIONS
J. Burger, E. Gibson. "Parallel Genetic Algorithms: An Exploration of Weather Prediction through Clustered Computing." The Journal of Computing in Small Colleges. Wooster, Ma. Vol. 18 No. 5 May 2003, pp. 272-3.

WEB PAGE
http://www.tcnj.edu/~crew

POSTER PRESENTATIONS
J. Burger, E. Gibson. "Parallel Genetic Algorithms: An Exploration of Weather Prediction through Clustered Computing." Consortium for Computing Sciences in Colleges Northeastern Conference 2003, Providence, RI, April 2003.

J. Burger, E. Gibson. "Parallel Genetic Algorithms: An Exploration of Weather Prediction through Clustered Computing." Posters Under the Dome 2003, Trenton, NJ. March 20, 2003 presentation to NJ state legislators at the NJ State Capitol. Met with Senator Thomas Kean, Jr. and discussed research at the state colleges.

