openModeller id: MAXENT
Current version: 1.0
Developer(s): Elisangela S. da C. Rodrigues, Renato De Giovanni, Daniel Bolgheroni
Accepts Categorical Maps: no
Requires absence points: yes
Author(s): Steven J. Phillips, Miroslav Dudík, Robert E. Schapire
The principle of maximum entropy is a method for analyzing available qualitative information in order to determine a unique epistemic probability distribution. It states that the least biased distribution that encodes certain given information is the one that maximizes the information entropy (content retrieved from Wikipedia on the 19th of May, 2008: http://en.wikipedia.org/wiki/Maximum_entropy).
This implementation in openModeller follows the same approach as Maxent (Phillips et al. 2004). It was compared with Maxent 3.3.3e through a standard experiment using all possible combinations of parameters. The comparison produced models with the same number of iterations, at least a 90% rate of matching best features considering all iterations, distribution maps with a correlation (r) greater than 0.999, and no difference in the final loss.
Previous implementations of this algorithm (before version 1.0), however, generated quite different results. The first versions were based on an existing third-party Maximum Entropy library, which produced low-quality models compared with all other algorithms. After that, the algorithm was rewritten a couple of times by Elisangela Rodrigues as part of her doctorate. Finally, the EUBrazil-OpenBio project funded the remaining work to make this algorithm compatible with Maxent.
Please note that not all functionality available in Maxent is available here. In particular, collecting bias and categorical maps are not supported, nor are many specific parameters for advanced users. However, you should be able to get compatible results for all other available parameters.
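As a small illustration of the maximum entropy principle described above (not of the openModeller implementation itself), the following Python sketch finds the maximum entropy distribution over the faces of a die subject to a single constraint on the mean. It relies on the standard result that such a distribution has the exponential form p_i proportional to exp(lambda * x_i), and it finds lambda by bisection. The function names and the die example are assumptions introduced here for illustration only.

```python
import math


def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Maximum entropy distribution over `values` with a fixed mean.

    Illustrative helper (not part of openModeller). The solution has the
    form p_i ~ exp(lam * v_i); lam is found by bisection because the mean
    of that distribution increases monotonically with lam.
    """
    def gibbs(lam):
        # Subtract the largest exponent for numerical stability.
        shift = max(lam * v for v in values)
        weights = [math.exp(lam * v - shift) for v in values]
        total = sum(weights)
        return [w / total for w in weights]

    def mean(p):
        return sum(pi * v for pi, v in zip(p, values))

    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if mean(gibbs(mid)) < target_mean:
            lo = mid
        else:
            hi = mid
    return gibbs((lo + hi) / 2.0)


def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


die = [1, 2, 3, 4, 5, 6]
unconstrained = maxent_distribution(die, 3.5)  # mean of a fair die
skewed = maxent_distribution(die, 4.5)         # mean forced upwards
```

When the required mean equals that of a fair die, the least biased answer is the uniform distribution. Forcing a higher mean tilts the probabilities towards the larger faces and lowers the entropy.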
1) Jaynes, E.T. (1957) Information theory and statistical mechanics. Physical Review, 106(4), 620-630.
2) Berger, A.L., Pietra, S.A.D. and Pietra, V.J.D. (1996) A maximum entropy approach to natural language processing. Computational Linguistics, 22, 39-71.
3) Darroch, J.N. and Ratcliff, D. (1972) Generalized iterative scaling for log-linear models. The Annals of Mathematical Statistics, 43, 1470-1480.
4) Malouf, R. (2003) A comparison of algorithms for maximum entropy parameter estimation. Proceedings of the Sixth Conference on Natural Language Learning.
5) Phillips, S.J., Dudík, M. and Schapire, R.E. (2004) A maximum entropy approach to species distribution modeling. Proceedings of the Twenty-First International Conference on Machine Learning, 655-662.
Number of background points
openModeller id: NumberOfBackgroundPoints
Number of background points to be generated.
Data type: integer Domain: [0.0, 10000.0] Typical value: 10000
Use absence points as background
openModeller id: UseAbsencesAsBackground
When absence points are provided, this parameter can be used to instruct the algorithm to use them as background points. This prevents the algorithm from generating background points randomly and also facilitates comparisons between different algorithms.
Data type: integer Domain: [0.0, 1.0] Typical value: 0
Include input points in the background
openModeller id: IncludePresencePointsInBackground
Include input points in the background: 0=No, 1=Yes.
Data type: integer Domain: [0.0, 1.0] Typical value: 1
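The three background-related parameters above can be read together as in the following Python sketch. It is only an interpretation under stated assumptions: the function `build_background`, the `candidate_cells` list standing in for the environmental layers, the `seed` argument, and the rule that presence points already in the background are not added twice are all hypothetical, and the source does not say how openModeller combines these options internally.

```python
import random


def build_background(candidate_cells, presences, absences,
                     number_of_background_points=10000,
                     use_absences_as_background=0,
                     include_presence_points_in_background=1,
                     seed=None):
    """Assemble a list of background points (hypothetical helper)."""
    rng = random.Random(seed)
    if use_absences_as_background and absences:
        # Assumption: supplied absences replace random generation entirely.
        background = list(absences)
    else:
        # Assumption: background points are drawn without replacement.
        k = min(number_of_background_points, len(candidate_cells))
        background = rng.sample(candidate_cells, k)
    if include_presence_points_in_background:
        # Assumption: points already in the background are not duplicated.
        background.extend(p for p in presences if p not in background)
    return background


cells = [(x, y) for x in range(10) for y in range(10)]
presences = [(0, 0), (1, 1)]
absences = [(5, 5), (6, 6)]

random_background = build_background(cells, presences, absences,
                                     number_of_background_points=20, seed=1)
absence_background = build_background(cells, presences, absences,
                                      use_absences_as_background=1)
```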
Number of iterations
openModeller id: NumberOfIterations
Number of iterations.
Data type: integer Domain: [1.0, oo] Typical value: 500
Terminate tolerance
openModeller id: TerminateTolerance
Tolerance for detecting model convergence.
Data type: real Domain: [0.0, oo] Typical value: 0.00001
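The number of iterations and the terminate tolerance are typically used together as two stopping conditions. The Python sketch below shows one common way to combine them, assuming that the iteration count acts as an upper limit and that convergence means the loss improved by less than the tolerance in one iteration. The exact criterion used by openModeller is not stated here, and `run_until_converged` and the toy `halve` step are hypothetical names.

```python
def run_until_converged(step, initial_loss,
                        number_of_iterations=500,
                        terminate_tolerance=0.00001):
    """Repeat `step` until the loss stops improving or the limit is hit.

    `step` takes the current loss and returns the loss after one more
    training iteration. Returns the final loss and the iterations used.
    """
    loss = initial_loss
    for iteration in range(1, number_of_iterations + 1):
        new_loss = step(loss)
        # Assumed criterion: improvement smaller than the tolerance.
        if loss - new_loss < terminate_tolerance:
            return new_loss, iteration
        loss = new_loss
    return loss, number_of_iterations


def halve(loss):
    """Toy training step: each iteration halves the loss."""
    return loss * 0.5


final_loss, iterations_used = run_until_converged(halve, 1.0)
capped_loss, capped_iterations = run_until_converged(
    halve, 1.0, number_of_iterations=5)
```

With the toy step, the run stops well before the iteration limit because the improvement per iteration quickly drops below the tolerance. With a limit of 5, the limit is reached first.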
Output format
openModeller id: OutputFormat
Output format: 1 = Raw, 2 = Logistic.
Data type: integer Domain: [1.0, 2.0] Typical value: 2
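To make the difference between the two output formats concrete, the following Python sketch converts raw values to logistic values using the transformation commonly described for Maxent: logistic = c * raw / (1 + c * raw), where c is the exponential of the entropy of the raw distribution. This is an assumption about the formula, since the source does not spell it out, and `raw_to_logistic` is a hypothetical helper rather than an openModeller function.

```python
import math


def raw_to_logistic(raw):
    """Convert raw Maxent-style values to logistic values (sketch)."""
    total = sum(raw)
    # Raw output is treated as a probability distribution summing to 1.
    probs = [r / total for r in raw]
    # Entropy of the raw distribution (natural logarithm).
    ent = -sum(p * math.log(p) for p in probs if p > 0)
    c = math.exp(ent)
    return [c * p / (1.0 + c * p) for p in probs]


flat = raw_to_logistic([1.0, 1.0, 1.0, 1.0])
graded = raw_to_logistic([1.0, 2.0, 3.0])
```

Raw values depend on the number of cells and are usually very small, whereas logistic values lie between 0 and 1 and keep the same ranking of cells. Under this formula, a perfectly flat raw distribution maps every cell to 0.5.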
Quadratic features
openModeller id: QuadraticFeatures
Enable quadratic features (0=no, 1=yes)
Data type: integer Domain: [0.0, 1.0] Typical value: 1
Product features
openModeller id: ProductFeatures
Enable product features (0=no, 1=yes)
Data type: integer Domain: [0.0, 1.0] Typical value: 1
Hinge features
openModeller id: HingeFeatures
Enable hinge features (0=no, 1=yes)
Data type: integer Domain: [0.0, 1.0] Typical value: 1
Threshold features
openModeller id: ThresholdFeatures
Enable threshold features (0=no, 1=yes)
Data type: integer Domain: [0.0, 1.0] Typical value: 1
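The feature types controlled by the switches above are transformations of the environmental variables. The Python sketch below uses the definitions usually given for Maxent, assuming each variable has been rescaled to a known range. The function names, the `knot` and `x_max` arguments, and the example values are illustrative assumptions; the source does not describe how openModeller chooses knots internally.

```python
def quadratic(x):
    """Quadratic feature: the square of one variable."""
    return x * x


def product(x, y):
    """Product feature: the product of two different variables."""
    return x * y


def threshold(x, knot):
    """Threshold feature: a step that switches on above the knot."""
    return 1.0 if x > knot else 0.0


def hinge(x, knot, x_max):
    """Forward hinge feature: zero up to the knot, then linear up to 1.

    Assumes x_max > knot, where x_max is the largest value of the variable.
    """
    if x <= knot:
        return 0.0
    return (x - knot) / (x_max - knot)


# Example with variables rescaled to [0, 1] (hypothetical values).
temperature, precipitation = 0.75, 0.4
features = {
    "quadratic": quadratic(temperature),
    "product": product(temperature, precipitation),
    "threshold": threshold(temperature, 0.5),
    "hinge": hinge(temperature, 0.5, 1.0),
}
```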
Auto features
openModeller id: AutoFeatures
Enable auto features (0=no, 1=yes)
Data type: integer Domain: [0.0, 1.0] Typical value: 1
Product/threshold threshold
openModeller id: MinSamplesForProductThreshold
Number of samples at which product and threshold features start being used (only when auto features is enabled).
Data type: integer Domain: [1.0, oo] Typical value: 80
Quadratic threshold
openModeller id: MinSamplesForQuadratic
Number of samples at which quadratic features start being used (only when auto features is enabled).
Data type: integer Domain: [1.0, oo] Typical value: 10
Hinge threshold
openModeller id: MinSamplesForHinge
Number of samples at which hinge features start being used (only when auto features is enabled).
Data type: integer Domain: [1.0, oo] Typical value: 15
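The interaction between auto features and the three sample thresholds above can be sketched in Python as follows. The sketch makes several assumptions that the source does not confirm: that linear features are always used, that "start being used" means the sample count is greater than or equal to the threshold, and that a feature type switched off explicitly stays off even when auto features is enabled. The function name `select_feature_types` is hypothetical.

```python
def select_feature_types(n_samples, auto_features=1,
                         quadratic=1, product=1, hinge=1, threshold=1,
                         min_samples_for_product_threshold=80,
                         min_samples_for_quadratic=10,
                         min_samples_for_hinge=15):
    """Return the feature types used for a given number of samples."""
    requested = {"quadratic": quadratic, "hinge": hinge,
                 "product": product, "threshold": threshold}
    minimum = {"quadratic": min_samples_for_quadratic,
               "hinge": min_samples_for_hinge,
               "product": min_samples_for_product_threshold,
               "threshold": min_samples_for_product_threshold}
    selected = ["linear"]  # assumption: linear features are always used
    for name in ("quadratic", "hinge", "product", "threshold"):
        if not requested[name]:
            continue  # assumption: an explicit 0 always disables the type
        if auto_features and n_samples < minimum[name]:
            continue  # too few samples for this feature type
        selected.append(name)
    return selected


few = select_feature_types(5)
some = select_feature_types(20)
many = select_feature_types(80)
```

Under these assumptions and the typical values listed above, small datasets get only simple features, and the more flexible product and threshold features are added only once 80 or more samples are available.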
The following image shows a sample model in the environmental space (temperature x precipitation) generated with the standard dataset used for tests (Thalurania furcata boliviana localities dataset):
fig. 1: Maxent model with default parameters.