MARS 2.0

Licensing Details


MARS requires that all training data reside in RAM, so the larger the data set to be analyzed, the larger the RAM needed to analyze it. The exact amount of RAM required will vary from problem to problem. The table below is intended as a guide for the maximum number of candidate predictor variables that can be specified in a MARS analysis for the given sample size and amount of RAM workspace:


Number of Predictor Columns You Can Use for Different Training Sample Sizes and MARS Versions*

Sample Size   64 MB compile [2m]**   128 MB compile [4.8m]   256 MB compile [9.6m]   512 MB compile [22.8m]***
10,000        200                    480                     960                     2280
25,000        80                     190                     380                     910
50,000        40                     95                      190                     455
100,000       20                     45                      95                      225
200,000       5                      20                      45                      110



* Assumes MARS is run with default settings and that: the training data contain no missing values or categorical variables; maximum interactions is set to 1; and maximum basis functions is set to the number of specified predictors. Note that each variable containing missing values counts as two predictors.

** Maximum number of data values (in millions) that fit in the workspace, based on the above assumptions.

*** Custom compiles up to 32 GB are available on UNIX platforms. The maximum number of candidate predictor variables that can be specified, regardless of available RAM, is 8,192.
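
The footnote rules can be applied mechanically when checking whether a planned analysis is likely to fit a given compile. The Python sketch below is illustrative only. The function names are hypothetical, and the capacity check assumes that the bracketed workspace figure (in millions of values) divided by the sample size approximates the table entries. That assumption matches most rows but is not a formula stated in this document, so the table itself remains the guide.

```python
MAX_PREDICTORS = 8192  # hard limit from the *** footnote, regardless of RAM


def effective_predictors(n_complete, n_with_missing):
    """Count predictors as the * footnote describes:
    each variable containing missing values counts as two."""
    return n_complete + 2 * n_with_missing


def approx_predictor_limit(capacity_millions, sample_size):
    """ASSUMPTION (not a vendor formula): the limit is roughly the
    workspace capacity in values divided by the sample size,
    capped at the 8,192-predictor maximum."""
    capacity_values = round(capacity_millions * 1_000_000)
    return min(capacity_values // sample_size, MAX_PREDICTORS)


def fits(n_complete, n_with_missing, capacity_millions, sample_size):
    """True if the effective predictor count is within the approximate limit."""
    needed = effective_predictors(n_complete, n_with_missing)
    return needed <= approx_predictor_limit(capacity_millions, sample_size)
```

For example, 30 complete variables plus 10 variables with missing values count as 50 predictors, which exceeds the 40 listed for a 50,000-row sample under the 64 MB compile.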



Rule of Thumb for Calculating Required RAM

As a rule of thumb, you can also estimate the RAM needed for your data set by multiplying the data set size by a factor of 3 to 4. For example, if your data set is 10 megabytes, MARS may require 30 to 40 megabytes of RAM for the analysis.
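
The rule of thumb amounts to a one-line calculation. The following Python sketch is a hypothetical helper, not part of MARS, that returns the low and high ends of the estimate for a data set size given in megabytes.

```python
def estimate_ram_mb(dataset_mb, low_factor=3.0, high_factor=4.0):
    """Return (low, high) RAM estimates in MB using the 3x to 4x rule of thumb."""
    if dataset_mb < 0:
        raise ValueError("data set size must be non-negative")
    return dataset_mb * low_factor, dataset_mb * high_factor


low, high = estimate_ram_mb(10)
print(f"Estimated RAM: {low:.0f} to {high:.0f} MB")  # Estimated RAM: 30 to 40 MB
```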


Increasing the Number of Variables MARS Can Handle

If you have a very large list of potential predictors, CART can be used first to extract the most important variables. MARS can then focus on the top variables from the CART model, enabling you to fit larger problem sizes into smaller workspaces and resulting in faster analyses and more accurate and robust models.
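
One way to picture this two-stage workflow is as a ranking-and-truncation step between the two tools. The Python sketch below is a hypothetical illustration, not part of CART or MARS. It assumes you have exported variable importance scores from a CART run into a dictionary (the variable names and scores shown are invented for illustration) and keeps only as many top-ranked variables as your workspace allows.

```python
def select_top_variables(importances, max_predictors):
    """Return up to max_predictors variable names, highest importance first.

    `importances` maps variable name -> score (a hypothetical CART export).
    """
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:max_predictors]]


# Invented scores for illustration only.
cart_importances = {"AGE": 100.0, "INCOME": 72.5, "REGION": 3.1, "TENURE": 41.0}
print(select_top_variables(cart_importances, 2))  # ['AGE', 'INCOME']
```

The resulting list would then be specified as the candidate predictors in the MARS analysis, with `max_predictors` chosen from the table above for your sample size and compile.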
