Universidade Federal de Viçosa, Viçosa, MG, Brazil
Program Genes: A Software in the Area of Genetics and Experimental Statistics
Departamento de Biologia Geral, Viçosa, MG, 36570-00

Introduction
To obtain superior genetic material, selected individuals must simultaneously
combine a series of favorable attributes that give them comparatively higher
yield and meet consumers’ demands. One way to increase the chances of success
of a breeding program is therefore to conduct reliable experiments that provide
a large volume of experimental data. With adequate processing of these data,
genetic parameters can be estimated and biological phenomena can be
interpreted. In the analysis and interpretation of results, it is very
important for researchers to have software systems and computing resources
available.
The development of software in the area of Genetics and Breeding is crucial
because such resources are scarce in the scientific community. The availability
of these tools will meet growing demand from users in research institutions
that handle large volumes of data and need adequate processing for the accurate
estimation of statistical and biological parameters.
In Genetics specifically, the intensive breeding of many species and the
complexity of the most important characters have required increasingly accurate
selection criteria. In all breeding stages, breeders must use information
expressed as parameters of biometric models, which are usually available in the
output of most software systems for this scientific area.
Hence, the GENES software was developed specifically to meet the needs of the
areas of Genetics and Experimental Statistics.
Description
The Genes software system is compatible with IBM PCs and should be used with
the Windows operating system.
Some settings are indispensable:
- a screen resolution of 1024 x 768 (large fonts, 120 dpi) and
- a point as the decimal symbol.
It comprises 201 executable projects and 131 text documents in RTF format,
occupying around 250 MB, and is available in English and Portuguese.
Provision of Data for Processing
The procedures generally follow a common sequence of data analysis. Users
supply the name of the file that contains the data to be processed, give
information about the parameters (number of variables, treatments, blocks,
etc.), optionally supply the names of the variables, and print or save the
results.
Data are provided in a file laid out like a spreadsheet, in which each column
represents a characteristic to be analyzed and each row an experimental
observation. In some procedures, the first columns are reserved for the
classification variables (effect descriptors), such as treatments, blocks,
years, locations, etc.
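The layout described above can be illustrated with a short Python sketch. It is not part of Genes and makes several assumptions: that the file is whitespace-separated text, that the first two columns hold treatment and block codes, and that the trait names (`yield`, `height`) and the function `read_trial_data` are hypothetical. Values are parsed with a point as the decimal symbol.

```python
from typing import Dict, List, Tuple

Record = Tuple[Tuple[str, ...], Dict[str, float]]


def read_trial_data(text: str, n_class_cols: int, trait_names: List[str]) -> List[Record]:
    """Parse rows laid out as: classification columns first, then trait values."""
    records: List[Record] = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        labels = tuple(fields[:n_class_cols])
        # float() expects a point as the decimal symbol, e.g. "5.2"
        values = [float(v) for v in fields[n_class_cols:]]
        if len(values) != len(trait_names):
            raise ValueError(
                f"expected {len(trait_names)} trait values, got {len(values)}"
            )
        records.append((labels, dict(zip(trait_names, values))))
    return records


# Hypothetical data: treatment, block, then two traits per observation.
SAMPLE = """\
1 1 5.2 101.0
1 2 4.8 98.5
2 1 6.1 110.2
2 2 5.9 108.7
"""

records = read_trial_data(SAMPLE, n_class_cols=2, trait_names=["yield", "height"])
print(records[0])
```

Each record keeps the classification labels separate from the trait values, which mirrors the split between descriptor columns and characteristic columns in the data file.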
Modules
The Genes software system provides the analysis modules described below, which
cover several procedures of biometric analysis.
1. Biometrics

Genotype x Environment Interaction: stratification analysis, dissimilarity and correlations between environments.
Stability and Adaptability: analysis by methods based on ANOVA (traditional, Plaisted and Peterson, 1959, Wricke, 1965 and Annicchiarico, 1992), regression (Eberhart and Russell, 1966, Finlay and Wilkinson, 1963 and Tai, 1971), bisegmented regression (Verma, Chahal and Murty, 1978, Silva and Barreto, 1985 and Cruz, Torres and Vencovsky, 1989), nonparametric analysis (Huehn, 1990, visual analysis and Lin and Binns, 1988), factor analysis, and principal components or centroids.
Gains from Selection – Indices: calculation of gains from selection between families (univariate and indices), considering direct and indirect selection, the classic index of Smith, 1936 and Hazel, 1943, the index based on the sum of ranks of Mulamba and Mock, 1978, the base index of Williams, 1962, the multiplicative index of Subandi et al., 1973, the weight-free and parameter-free index of Elston, 1963, the index based on the desired gains of Pesek and Baker, 1969, and the genotype-ideotype distance index. Calculation of gains from selection between families by univariate methods or by the following restricted indices: classic of Smith, 1936 and Hazel, 1943, of Kempthorne and Nordskog, 1959, of Tallis, 1962, of James, 1968, of Cunningham et al., 1970, and based on the desired gains of Pesek and Baker, 1969. Calculation of gains from selection between families considering indices under collinearity, and of gains from selection between and within families, in balanced and unbalanced experiments, by mass selection and by stratified selection between and within. Visual selection analysis, selection in several environments, and prediction of gains from selection within families without information from the plants within the plot.
Diallel Analysis: analysis of balanced diallels (methodologies of Griffing, 1956, Gardner and Eberhart, 1966, Hayman, 1954 and Cockerham and Weir, 1977, test between hybrids and reciprocals, prediction of composites and hybrids, and family indices), joint diallel analysis (of balanced diallels of Griffing, 1956, of Gardner and Eberhart, 1966, and of partial and circulant diallels), and partial diallels (by the methodologies of Geraldi and Miranda Filho, 1988, of Miranda Filho and Geraldi, 1984, of Kempthorne, 1966, of Viana et al., 1999 and 2000, and prediction of three-way and double-cross hybrids). Analysis of circulant, partial circulant, and unbalanced diallels.
Segregating and Non-segregating Generations: joint scaling test (P1, P2, F1, F2, with optional inclusion of the backcrosses RC1 and RC2), analysis of experiments with segregating lines and parents in alternating rows, and analysis of individuals in generation Ft and their derived lines Ft+1.
Repeatability: analysis of original or classified data.
Combined Selection: analysis of family experiments with balanced and unbalanced data. Analysis of the genetic designs proposed by Comstock and Robinson (1948), including designs involving several sets.
Genetic and Environmental Progress
Core Collection
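As an illustration of the regression-based stability methods listed in this module, the sketch below computes the regression coefficient of each genotype on an environmental index, in the spirit of Eberhart and Russell (1966). It is not the Genes implementation: the function name, the data, and the restriction to the slope alone (no deviation variance or significance tests) are assumptions made for brevity.

```python
from typing import Dict, List


def eberhart_russell_slopes(means: Dict[str, List[float]]) -> Dict[str, float]:
    """Slope b_i of each genotype's mean on the environmental index."""
    genotypes = list(means)
    n_env = len(means[genotypes[0]])
    grand_mean = sum(sum(row) for row in means.values()) / (len(genotypes) * n_env)
    # Environmental index: environment mean minus grand mean (sums to zero).
    index = [
        sum(means[g][j] for g in genotypes) / len(genotypes) - grand_mean
        for j in range(n_env)
    ]
    ss_index = sum(i * i for i in index)
    if ss_index == 0:
        raise ValueError("environments do not differ; slopes are undefined")
    return {
        g: sum(y * i for y, i in zip(means[g], index)) / ss_index
        for g in genotypes
    }


# Hypothetical genotype means in three environments.
means = {
    "G1": [2.0, 4.0, 6.0],
    "G2": [3.0, 4.0, 5.0],
    "G3": [1.0, 4.0, 7.0],
}
slopes = eberhart_russell_slopes(means)
for genotype, b in slopes.items():
    print(genotype, b)
```

By construction the slopes average to 1 across genotypes. A slope near 1 is usually read as average responsiveness to environmental improvement, below 1 as lower responsiveness, and above 1 as higher responsiveness.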
2. Multivariate Analysis

Principal Components
Canonical Variables
Canonical Correlations
Discriminant Analysis (by the method proposed by
Factor Analysis
Measures of Dissimilarity: based on continuous quantitative, multicategorical, or binary phenotypic variables. Analysis of molecular data from dominant or codominant markers.
Cluster Analysis: Tocher optimization method, hierarchical methods, graphic dispersion, and 2D and 3D projection. Identification of the most and least similar accessions.
Importance of Characters: by principal components, or by the generalized Mahalanobis distance and canonical variable analysis.
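The generalized Mahalanobis distance mentioned in this module can be sketched for the simplest case of two characters. The function below is an illustrative assumption, not the Genes routine: it takes two mean vectors and a 2 x 2 covariance matrix (in practice often a residual covariance matrix) and evaluates D² = d' S⁻¹ d using the closed-form inverse of a 2 x 2 matrix.

```python
from typing import Sequence


def mahalanobis_d2(
    mean_a: Sequence[float],
    mean_b: Sequence[float],
    cov: Sequence[Sequence[float]],
) -> float:
    """Squared Mahalanobis distance between two mean vectors (two traits only)."""
    d1 = mean_a[0] - mean_b[0]
    d2 = mean_a[1] - mean_b[1]
    s11, s12 = cov[0]
    s21, s22 = cov[1]
    det = s11 * s22 - s12 * s21
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # d' S^-1 d with S^-1 = (1/det) * [[s22, -s12], [-s21, s11]]
    return (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det


# Hypothetical means of two accessions and a hypothetical covariance matrix.
d2_value = mahalanobis_d2((12.0, 8.0), (10.0, 5.0), [[4.0, 2.0], [2.0, 9.0]])
print(d2_value)
```

With an identity covariance matrix the result reduces to the squared Euclidean distance, which is a quick sanity check for any implementation.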
3. Simulation

Simulation of Experiments
Simulation of Samples (p populations and v variables)
Optimal Number of Families
Optimal Number of Plants (Random or Predefined Sampling)
Optimal Number of Repetitions or Optimal Sample Size
4. Genetic Diversity

Diversity between Accessions: based on continuous, multi-category, and binary phenotypic variables, and analysis of data from dominant and codominant (multi-allelic) markers.
Diversity between Populations: calculation of Nei’s genetic identity (1972) and the following distances:
Euclidean, of
Diversity within Populations: calculation of the inbreeding coefficient and heterozygosity, the Shannon-Wiener index, and heterozygosity from binary data.
Diversity between and within Populations: descriptive analysis, calculation of Nei’s diversity (1973), Wright’s fixation index (two alleles or multiple alleles), from the heterozygosity of Weir (1996). Analysis of the contingency table, ANOVA of allelic frequencies (F, f and ), AMOVA of Excoffier et al. (1992), and analysis of binary data.
Discriminant Analysis: discriminant
analysis of
Relationship Coefficient
Cluster Analysis: Tocher optimization and hierarchical methods, graphic dispersion, 2D and 3D projection, and analysis of the most and least similar accessions.
Matrices of Dissimilarity: calculation of the correlation between, and the sum of, elements of dissimilarity matrices.
Importance of Characters: considering
phenotypic quantitative characters or molecular information, by means of the
Manova
Optimization: analysis of the optimal number of binary or multi-allelic markers for the study of genetic variance.
Simulation: simulation of populations, crosses, and samples of populations, under the effect of divergent selection or genetic drift.
Hardy-Weinberg Equilibrium: analysis of populations based on information from codominant diallelic or multi-allelic markers.
Gametic Phase Disequilibrium
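For the Hardy-Weinberg item in this module, a minimal sketch of the usual chi-square goodness-of-fit test for one codominant diallelic marker is given below. The function name and genotype counts are hypothetical, and the sketch does not claim to reproduce how Genes performs the analysis.

```python
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Return (frequency of allele A, chi-square statistic) for one diallelic locus."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one individual is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # allele frequency from genotype counts
    q = 1.0 - p
    observed = [n_aa, n_ab, n_bb]
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    # Classes with zero expectation contribute nothing (they occur when p is 0 or 1).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2


# Hypothetical genotype counts AA, AB, BB in a sample of 100 individuals.
p_a, chi2 = hwe_chi_square(30, 40, 30)
print(p_a, chi2)
```

With two alleles the statistic has one degree of freedom, so it is commonly compared with the 5% critical value of about 3.84.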
5. Experimental Statistics

Descriptive Statistics
Normality Test
Analysis of Variance: analysis of completely randomized designs and schemes, of experiments with regular and non-regular treatments, in randomized blocks, factorial arrangements, and split plots. Analysis of origin/progeny/plant, simple and triple lattices, and hierarchical models.
Regressions: simple linear, non-linear, multiple, and polynomial regression, response surface, and 3D graphics analysis.
Correlations: calculation of genetic correlations, partial and canonical correlations, and Pearson and Spearman correlations. Path analysis (involving 1 or 2 chains) and path analysis under collinearity.
Comparison of Means: tests of Tukey, Duncan, Scheffé, and Scott and Knott, Tukey test with a variable number of repetitions, Dunnett, t test, Tocher, and chi-square tests to evaluate hypotheses, heterogeneity, and factorial linkage.
Stand Correction Methods
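The analysis of a completely randomized design, the first case listed under the analysis of variance in this module, can be sketched as follows. This is an illustrative example under stated assumptions (one factor, hypothetical data, a hypothetical function name) and only computes the sums of squares and the F ratio, without p-values.

```python
from typing import Dict, List


def crd_anova(groups: Dict[str, List[float]]) -> Dict[str, float]:
    """One-way ANOVA for a completely randomized design."""
    all_values = [v for values in groups.values() for v in values]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total
    ss_treat = 0.0
    ss_error = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_treat += len(values) * (mean - grand_mean) ** 2
        ss_error += sum((v - mean) ** 2 for v in values)
    df_treat = len(groups) - 1
    df_error = n_total - len(groups)
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "ss_treat": ss_treat,
        "ss_error": ss_error,
        "df_treat": df_treat,
        "df_error": df_error,
        "f": ms_treat / ms_error,
    }


# Hypothetical observations for three treatments with three repetitions each.
table = crd_anova({
    "A": [10.0, 12.0, 14.0],
    "B": [20.0, 22.0, 24.0],
    "C": [30.0, 32.0, 34.0],
})
print(table)
```

The groups may have different numbers of repetitions, since each treatment mean is weighted by its own sample size.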
6. Matrices

Diagnosis of Multicollinearity
Matrix Algebra
Solution of the System (equation image missing)
Solution of the System (equation image missing)
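The equations of the two systems are not shown in this text, so the following sketch simply assumes a square linear system of the form A x = b and solves it by Gaussian elimination with partial pivoting. The function name and the example values are hypothetical, and nothing here is taken from the Genes source.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest absolute value in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


# Hypothetical 2 x 2 system: 2x + y = 3 and x + 3y = 5.
solution = solve_linear_system([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
print(solution)
```

A nearly singular matrix raises an error here, which is also the situation that a multicollinearity diagnosis is meant to flag before a system is solved.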
References concerning the software system
CRUZ, C. D. Programa Genes - Análise multivariada e simulação. 1. ed. Viçosa, MG: Editora UFV, 2006. v. 1. 175 p.
CRUZ, C. D. Programa Genes - Biometria. 1. ed. Viçosa, MG: Editora UFV, 2006. v. 1. 382 p.
CRUZ, C. D. Programa Genes - Diversidade Genética. 1. ed. Viçosa, MG: Editora UFV, 2008. v. 1. 278 p.
CRUZ, C. D. Programa Genes - Estatística Experimental e Matrizes. 1. ed. Viçosa: Editora UFV, 2006. v. 1. 285 p.