Applications of artificial intelligence and data mining techniques in soil modeling

Geomechanics and Engineering, Vol. 1, No. 1 (2009) 53-74

A. A. Javadi† and M. Rezania

Computational Geomechanics Group, School of Engineering, Computing and Mathematics,

University of Exeter, Exeter, EX4 4QF, Devon, UK

(Received October 21, 2008, Accepted January 15, 2009)

Abstract. In recent years, several computer-aided pattern recognition and data mining techniques have been developed for modeling of soil behavior. The main idea behind a pattern recognition system is that it learns adaptively from experience and is able to provide predictions for new cases. Artificial neural networks are the most widely used pattern recognition methods that have been utilized to model soil behavior. Recently, the authors have pioneered the application of genetic programming (GP) and evolutionary polynomial regression (EPR) techniques for modeling of soils and a number of other geotechnical applications. The paper reviews applications of pattern recognition and data mining systems in geotechnical engineering with particular reference to constitutive modeling of soils. It covers applications of artificial neural network, genetic programming and evolutionary programming approaches for soil modeling. It is suggested that these systems could be developed as efficient tools for modeling of soils and analysis of geotechnical engineering problems, especially for cases where the behavior is too complex and conventional models are unable to effectively describe various aspects of the behavior. It is also recognized that these techniques are complementary to conventional soil models rather than a substitute for them.

Keywords: artificial intelligence; data mining; neural network; genetic programming; evolutionary computation; soil modeling; geotechnical engineering.

1. Introduction

In the past few decades the finite element method (FEM) has been used successfully to predict the response of systems across a whole range of industries including geotechnical engineering. In this numerical analysis the behavior of the actual material is approximated with that of an idealized material that deforms in accordance with some constitutive relationships. Therefore the choice of an appropriate constitutive model which adequately describes the behavior of the material plays a significant role in the accuracy and reliability of the numerical predictions.

During the past few decades several constitutive models have been developed for different geomaterials based on mechanics (e.g., Desai et al. 1986, Duncan and Chang 1970, Einstein and Hirschfeld 1973, Kawamoto et al. 1988, Lade and Duncan 1975, Roscoe and Schofield 1963). Most of these models involve determination of material parameters, many of which have no physical meaning (Shin and Pande 2000). The engineering properties of geomaterials exhibit varied and uncertain behavior due to the complex and imprecise physical processes associated with the formation of these materials. Recognition of nonlinear behavior of soils and rocks becomes increasingly important in design, stability analysis, prediction and control of failure for geotechnical engineering projects. In spite of considerable complexities of constitutive theories, due to the erratic nature of soils and rocks, none of the existing constitutive models can completely describe the real behavior of various types of these materials under various stress paths and loading conditions.

†Ph.D., Corresponding author, E-mail: a.a.javadi@ex.ac.uk

In conventional constitutive material modeling, an appropriate mathematical model is initially selected and the parameters of the model (material parameters) are then identified from appropriate physical tests on representative samples to capture the material behavior. When these constitutive models are used in numerical analysis (e.g., FEA), the accuracy with which the selected material model represents the various aspects of the actual material behavior and also the accuracy of the identified material parameters affect the accuracy of the numerical predictions.

In recent years, some researchers have attempted to build nonlinear constitutive models based on computer-aided pattern recognition methods. Artificial neural network (ANN) has been the most widely used pattern recognition technique to model the constitutive material behavior. This paper presents a review and evaluation of different AI and data mining techniques that have been proposed for modeling of soils. In particular, it covers the applications of artificial neural networks, genetic programming and evolutionary polynomial regression techniques in soil modeling and some geotechnical engineering problems.

2. Data driven techniques

In recent years, with pervasive developments in computational software and hardware, several computer-aided pattern recognition and data mining techniques have emerged and been developed. The main idea behind a pattern recognition system is that it learns adaptively from experience and extracts various discriminants, each appropriate for its purpose. Although there are other general purpose data-driven techniques, artificial neural network (ANN) and genetic programming (GP) are the most widely used pattern recognition methods that have been utilized to model complex engineering problems and capture nonlinear interactions between various parameters in a system. In this approach model construction is usually divided into three stages: (i) function identification, (ii) parameter estimation and (iii) validation. For model construction, a physical system with an output y, dependent on a set of inputs X and parameters θ, can be mathematically formulated as

$$y = F(X, \theta) \qquad (1)$$

where F is a function in an m-dimensional space and m is the number of inputs. Data-driven techniques tend to reconstruct F from input-output data. GP generates a population of expressions for F, coded in tree structures of variable size, and performs a global search for the best-fit expression for F. The goal of ANN, on the other hand, is to map F rather than to find a feasible structure for it.

3. Artificial neural network

Artificial neural networks (ANNs) are computational models broadly inspired by the organization of the human brain. The most important features of a neural network are its abilities to learn and to be error tolerant. In other words, an artificial neural network is able to acquire, represent, and compute a mapping from a multivariate space of information to another given a set of data representing that mapping (Garrett 1994). ANN models are adaptive and capable of generalization. They can handle imperfect or incomplete data, and can capture nonlinear and complex interactions among variables of a system. Because of these features, the artificial neural network is emerging as a powerful tool for modeling.

At the most abstract level of description, a neural network can be thought of as a black box, where data is fed in from one side and processed by the neural network, which then produces an output according to the supplied input (Caudill 1991). Although a neural network can usually process any kind of data (e.g., qualitative or quantitative information), the data fed into the neural network should be pre-processed (e.g., filtered, transformed) to enable faster training and better performance. In fact, the selection, pre-processing, and coding of information is one of the main issues to deal with when working with neural networks.

A neural network generally consists of an input layer, one or more hidden layers and an output layer of neurons. The neurons are the processing units within the neural network and are usually arranged in layers. Each layer is composed of several processing units. The processing units are fully connected to processing units of the succeeding layer. The information is propagated through the neural network layer by layer. Connections are the paths between neurons along which all the information flows within a neural network. A neuron collects information from all preceding neurons relative to the flow of the information and propagates its output to the neurons in the following layer. The output of each preceding neuron is modulated by a corresponding weight and a bias. This output is then modified by a transfer function and becomes the final output of the neuron (Dayhoff 1990). This signal is then propagated to the neurons of the next layer. The most frequently used and efficient learning procedure for multi-layer neural networks is the back-propagation learning algorithm based on the generalized delta rule (Rumelhart et al. 1994). The back-propagation learning rule can be used to adjust the weights and biases of a network in order to minimize the sum-squared error of the network. This is done by continually changing the values of the network weights and biases in the direction of steepest descent with respect to error. Derivatives of the error vector are calculated for the network's output layer and then back-propagated through the network until derivatives of error are available for each hidden layer.
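The layer-by-layer propagation described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the logistic transfer function, the function names and the weight values are assumptions chosen only to make the weighted sum, bias and transfer function steps concrete.

```python
import math

def sigmoid(x):
    """Logistic transfer function (one common choice; an assumption here)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases, transfer=sigmoid):
    """Output of one fully connected layer.

    weights[j][k] is the weight on the connection from input k to neuron j.
    """
    outputs = []
    for w_row, b in zip(weights, biases):
        net = sum(w * x for w, x in zip(w_row, inputs)) + b  # weighted sum plus bias
        outputs.append(transfer(net))                         # apply transfer function
    return outputs

def network_forward(inputs, layers):
    """Propagate a signal layer by layer; `layers` is a list of (weights, biases)."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal

# Illustrative values only: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden = ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0])
output = ([[1.0, -1.0]], [0.0])
y = network_forward([1.0, 1.0], [hidden, output])
```

In a trained network the weights and biases would come from the learning procedure rather than being set by hand.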

Training refers to the process that repeatedly applies input vectors to the network, calculates errors with respect to the target vectors and then finds new weights and biases with the learning rule. It repeats this cycle until the sum-squared error falls beneath an error goal, or a maximum number of epochs is reached. Training a feed-forward network with the back-propagation learning rule is most frequently used in function approximation and pattern recognition. A more detailed description of ANNs is beyond the scope of this paper. Texts describing aspects and features of ANN models and architectures in greater detail can be found in the literature (e.g., Lippmann 1987, Flood and Kartam 1994).
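The training cycle described above can be sketched as follows. This is a hedged illustration rather than an implementation from the literature cited: it assumes a single hidden layer with logistic units, a linear output neuron, batch updates, and made-up training cases; the function name `train` and all parameter values are hypothetical.

```python
import math
import random

def train(patterns, n_hidden=3, rate=0.02, error_goal=1e-3, max_epochs=5000, seed=0):
    """Batch steepest descent on the sum-squared error of a 1-hidden-layer network.

    Returns the sum-squared error recorded at the start of every epoch.
    """
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b_h = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = 0.0
    history = []
    for _ in range(max_epochs):
        g_wh = [[0.0] * n_in for _ in range(n_hidden)]
        g_bh = [0.0] * n_hidden
        g_wo = [0.0] * n_hidden
        g_bo = 0.0
        sse = 0.0
        for x, target in patterns:
            # Forward pass: logistic hidden layer, linear output neuron.
            h = [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j])))
                 for j in range(n_hidden)]
            y = sum(w_o[j] * h[j] for j in range(n_hidden)) + b_o
            err = y - target
            sse += err * err
            # Backward pass: error derivative at the output layer,
            # then propagated back to the hidden layer.
            d_out = 2.0 * err
            g_bo += d_out
            for j in range(n_hidden):
                g_wo[j] += d_out * h[j]
                d_hid = d_out * w_o[j] * h[j] * (1.0 - h[j])
                g_bh[j] += d_hid
                for k in range(n_in):
                    g_wh[j][k] += d_hid * x[k]
        history.append(sse)
        if sse < error_goal:  # stop once the error goal is reached
            break
        # Move every weight and bias in the direction of steepest descent.
        b_o -= rate * g_bo
        for j in range(n_hidden):
            w_o[j] -= rate * g_wo[j]
            b_h[j] -= rate * g_bh[j]
            for k in range(n_in):
                w_h[j][k] -= rate * g_wh[j][k]
    return history

# Illustrative training cases: five points of y = x^2 on [0, 1].
cases = [([x / 4.0], (x / 4.0) ** 2) for x in range(5)]
errors = train(cases)
```

The loop ends either when the sum-squared error drops below `error_goal` or after `max_epochs` epochs, mirroring the two stopping conditions mentioned in the text.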

3.1 Application of artificial neural network for soil modeling

Modeling of soil behavior plays an important role in dealing with issues related to soil mechanics and foundation engineering. The application of ANN offers an alternative means for the modeling of soil behavior. A neural network based constitutive model (NNCM) is fundamentally different from a conventional constitutive model (Zhu et al. 1998a). One of its distinctive features is that it is based on experimental data rather than on assumptions made in developing the constitutive model. Furthermore, NNCM requires no material parameters to be identified. These features make the NNCM an objective model that can truly represent the natural connections among variables, rather than a subjective model which assumes the variables obey a set of predefined relations.

The NNCM learns from experimental data and forms its neural connections through the learning process. Because of its unique learning, training and prediction characteristics, ANN has great potential in soil engineering applications, particularly for situations where good experimental data are available and where conventional constitutive modeling may be difficult and time consuming.

A significant number of NNCMs have been developed for modeling of geomaterials. The application of neural network (NN) for constitutive modeling was first proposed by Ghaboussi et al. (1990, 1991) for concrete. Later on Ellis et al. (1992) and Ghaboussi et al. (1994) applied the concept of neural network-based constitutive modeling to model the behavior of geomaterials. These works indicated that NNCMs can effectively capture nonlinear material behavior.

The common procedure of using ANN for constitutive modeling involves training a neural network using laboratory (or in-situ) data to learn the material behavior. The trained network is then used to predict the behavior of the material under new loading conditions. The advantages of training ANN directly from experimental (or in-situ) data are obvious. If the training data contain enough relevant information, the trained network should be able to generalize the material behavior to new loading conditions.

Among various types of neural networks, the multi-layer feed-forward back-propagation network is known to be the most suitable architecture to describe nonlinear relationships and, so far, has been the main type of neural network used to describe material constitutive behavior (Hashash et al. 2004). The role of the ANN is to attribute a given set of output vectors to a given set of input vectors. When applied to the constitutive description, the physical nature of these input-output data is determined by the measured quantities like stresses, strains, pore pressures, temperatures, etc. A typical NNCM is shown schematically in Fig. 1.

Fig. 1 A simple NNCM

In the simple example shown in Fig. 1, one input layer, two hidden layers and one output layer are considered for the network. Three principal strain components (ε1, ε2 and ε3) for an assumed medium are input, and a forward pass through the network, involving simple computations, results in the prediction of three corresponding principal stresses (σ1, σ2 and σ3) in the output layer. Every neuron in each layer is connected to every neuron in the next layer and each such connection has associated with it a “connection weight”. The knowledge stored in the developed network is represented by the set of connection weights. The neural network is trained by appropriately modifying its connection weights through the set of “training cases” until the predicted output variables agree satisfactorily with the desired variables. The term “back-propagation” (Rumelhart et al. 1986) refers to the algorithm by which the observed error in the predicted output variables is used to modify the connection weights.

Encouraged by the attractive features of neural networks, and after exploration of the potential of ANN for constitutive modeling during the early 1990s, a number of NNCMs for different materials were developed. Millar and Clarici (1994) showed the capability of ANN for modeling the behavior of rocks in rock mechanics applications. They used laboratory test results of axial stress-axial strain measurements for training and testing of ANN. Four different ANN models, in terms of the number of hidden neurons, were developed and it was shown that ANNs are able to predict the stress-strain relationship with good accuracy. In this work, a multilayer perceptron architecture with back-propagation training algorithm was used. The input-output set for training the model was:

Inputs: σ3, ε1, ε3 and sign[dσ1/dε1]
Output: σ1

where σ1 is the major principal stress, σ3 is the confining pressure, ε1 is the major principal strain, ε3 is the minor principal strain and sign[dσ1/dε1] is the sign of the gradient of the stress-strain curve. The fact that the latter input parameter was required for the ANN training implied that some indication of the history of stress state was necessary as input in the training process.
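To make the role of the sign[dσ1/dε1] input concrete, the sketch below assembles training patterns of the form listed above from a single stress-strain record. It is an illustration only: the backward finite difference used to estimate the gradient sign, the helper names and the numerical readings are assumptions, not details reported by Millar and Clarici (1994).

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def build_patterns(sigma3, eps1, eps3, sigma1):
    """Build (inputs, output) pairs with inputs [σ3, ε1, ε3, sign(dσ1/dε1)] and output σ1.

    The gradient sign is estimated with a backward difference (an assumption),
    so the first reading has no pattern of its own.
    """
    patterns = []
    for i in range(1, len(eps1)):
        d_sigma1 = sigma1[i] - sigma1[i - 1]
        d_eps1 = eps1[i] - eps1[i - 1]
        gradient_sign = sign(d_sigma1) * sign(d_eps1)
        patterns.append(([sigma3, eps1[i], eps3[i], gradient_sign], sigma1[i]))
    return patterns

# Made-up readings from a single test (not data from the paper).
eps1 = [0.0, 0.5, 1.0, 1.5, 2.0]
eps3 = [0.0, -0.1, -0.3, -0.6, -0.9]
sigma1 = [5.0, 40.0, 60.0, 50.0, 45.0]
patterns = build_patterns(5.0, eps1, eps3, sigma1)
```

With these readings the gradient sign switches from +1 to -1 after the peak, which is the piece of history information that lets the network separate pre-peak and post-peak states sharing similar strain values.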

Ellis et al. (1995) modeled the stress-strain relation of sands using ANN and showed good agreement between laboratory data and modeling results. A series of undrained triaxial tests on mortar sand was used to develop the models. Two different types of architecture were used to evaluate the ability of ANN for modeling sand behavior. They were the conventional neural network without feedback and the sequential NN with feedback.

Fig. 2 Architecture of a typical sequential NN

In a sequential network (Fig. 2), at the initial phase of the training a pattern is input to the plan units. The feed-forward process occurs as in the standard backpropagation algorithm, producing the first output pattern. This output is then copied back to the current state units for the next feed-forward process. The sequential NN has the potential to incorporate the path dependency of mechanical behavior into the model. In order to accommodate this aspect, the input-output parameters for the model should be variables of time. Based on the results it was found that the sequential NN worked better than the conventional backpropagation NN. Thus the authors proposed a sequential network with three layers which had 10 neurons in the intermediate layer and its input-output parameters were:

Inputs (current state i): σ1, σ3, ε1, u, OCR, Dr and Cu
Outputs (next state i+1): σ1 and u

where u is the pore water pressure, OCR is the overconsolidation ratio which reflects the previous stress history, Dr is the initial relative density and Cu is the coefficient of uniformity which characterizes the grain size distribution of sand. A constant value of 0.0405% was used for the axial strain increment, Δε1. Based on the reported results, the NN predictions, in particular the values of pore water pressure, were not very accurate. Later, it was argued that a prescribed strain rate (0.0405% per minute) has to be defined in order to make predictions with this model (Najjar and Basheer 1996). This issue limits the developed network as applicable only to a specific case with a strain rate of 0.0405% per minute.
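The feedback mechanism of such a sequential model, in which each prediction is copied back as the current state for the next step, can be sketched as follows. This is an assumption-laden illustration: `toy_step_model` is a hypothetical placeholder for a trained network and has no physical meaning, and the function names, argument order and numerical values are invented for the example.

```python
def rollout(step_model, sigma1, u, constants, d_eps1, n_steps):
    """Apply a one-step model repeatedly, feeding each output back as the next current state.

    Returns a list of (axial strain, σ1, u) tuples, starting with the initial state.
    """
    eps1 = 0.0
    path = [(eps1, sigma1, u)]
    for _ in range(n_steps):
        # Prediction of the next state from the current state.
        sigma1, u = step_model(sigma1, eps1, u, constants)
        eps1 += d_eps1  # fixed axial strain increment per step
        path.append((eps1, sigma1, u))
    return path

def toy_step_model(sigma1, eps1, u, constants):
    """Hypothetical stand-in for a trained network; NOT a soil model."""
    return sigma1 + 2.0, u + 1.0

# Illustrative constants only (σ3 in kPa, OCR, relative density, uniformity coefficient).
constants = {"sigma3": 100.0, "OCR": 1.0, "Dr": 0.5, "Cu": 1.5}
path = rollout(toy_step_model, 100.0, 0.0, constants, d_eps1=0.0405, n_steps=3)
```

Because the strain increment is fixed inside the rollout, a model trained this way can only be stepped with the increment it was trained on, which is the limitation noted above.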

Millar and Calderbank (1995) showed that a single multilayer feedforward neural network is able to predict the deformability behavior of rock. Data used to train the neural network model were derived from the results of a series of simulations of triaxial tests using the commercial explicit finite difference software FLAC. The authors made some modifications to their ANN training approach in order to resolve the deficiencies associated with the earlier work (Millar and Clarici 1994) and make their model suitable for immediate use as a stand-alone constitutive relationship in a numerical modeling code. For this purpose the authors used the same ANN architecture as in their earlier work, but they revised the way the input-output parameters were introduced to the ANN in the training procedure. The input-output parameter sets used for the training of their revised ANN based model were:

Inputs (current state i): σ1, σ3, Δε1 and Δε3
Outputs (next state i+1): σ1 and σ3

where Δε1 and Δε3 are the increments of major and minor principal strains, respectively. The data, which were produced by triaxial test simulation in FLAC using the strain softening model available within this software, had to be scaled to the interval between −0.5 and 0.5 for the training process. Also, the value of the minor principal stress, σ3, was taken not to be identical within a single test. This was done through the superposition of a component of noise on the input values at each presentation of the data to the NN. The optimum NN structure obtained for the constitutive relationship was then used to develop a user defined constitutive model, called NN UDM, back in FLAC. Although the accuracy of the NN model over the training data was good, its prediction ability was poor and the actual behavior of the NN UDM was far from the desired behavior when it was used in place of the standard strain softening constitutive model within FLAC.

Amorosi et al. (1996) also adopted a neural network based representation for the constitutive behavior of geomaterials. The data obtained from undrained triaxial tests on a particular clay (Vallericca clay) were used to develop the NN model. The input-output parameter sets used in this work were:

Inputs (current state i): σ1, σ3, u, Δε1 and OCR
Outputs (next state i+1): σ1 and u

The constitutive behavior of Vallericca clay was shown to be adequately represented with the trained NN model. The developed model had a back propagation multilayered perceptron architecture with three layers. The input and output layers used 5 and 2 nodes respectively while the hidden layer contained four nodes.

Logar and Turk (1997) presented a constitutive model for soft soils using a neural network. The results from oedometer loading tests on a silty soil were used to train a feed forward neural network. Based on the source of the available data the following input-output set was used to develop the model:

Inputs: σ', Z, w0, wl and wp
Output: e

where σ' is effective stress, Z is the depth from which the sample was taken, w0 is natural water content, wl is liquid limit, wp is plastic limit and e is void ratio. The optimum NN structure was obtained with a single hidden layer consisting of 35 hidden neurons. The results for approximation of oedometer curves by NN were relatively accurate compared to the experimental measurements, with an average error of around 10% for the training phase. The trained neural network was used to determine the tangential oedometer modulus as

$$E_{oed} = -(1+e)\,\frac{\Delta\sigma'}{\Delta e} = -(1+e)\,\frac{{}^{i+1}\sigma' - {}^{i}\sigma'}{{}^{i+1}e - {}^{i}e} \qquad (2)$$

The above equation was then used, instead of the elastic parameter, in a finite element code to model the amount of settlement in an embankment. The results were reported to be comparable with those obtained using a cap model for deformation.
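As a small worked example of Eq. (2), the function below evaluates the tangential oedometer modulus from two neighbouring points on a void ratio versus effective stress curve. The function name, the choice of evaluating (1+e) at point i, and the numerical values are assumptions made for illustration; in the cited work the void ratios would come from the trained network.

```python
def oedometer_modulus(sigma_i, e_i, sigma_next, e_next):
    """Tangential oedometer modulus from two neighbouring curve points (Eq. 2).

    (1 + e) is evaluated at point i, which is an assumption of this sketch.
    """
    return -(1.0 + e_i) * (sigma_next - sigma_i) / (e_next - e_i)

# Illustrative values: effective stress in kPa, void ratio dimensionless.
E_oed = oedometer_modulus(sigma_i=100.0, e_i=0.80, sigma_next=200.0, e_next=0.76)
```

With these numbers, Δσ' = 100 kPa and Δe = −0.04, so E_oed = −(1.8)(100)/(−0.04) = 4500 kPa. The leading minus sign makes the modulus positive when void ratio decreases under increasing stress.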

Penumadu and Chameau (1997) presented a model for soil behavior within a unified environment based on NN. The same triaxial data as used by Ellis et al. (1995) were used for training and testing of the NN sand model. Also, stress-strain data obtained from a series of strain controlled undrained triaxial tests on clay were used for training and testing the NN clay model. The same type of NN as the one used in Ellis et al. (1995) (feedback sequential NN) was again used in this work. The NN architecture and results for Mortar sand were identical to those presented in Ellis et al. (1995); however, for clay a different NN architecture including one hidden layer with 10 nodes was selected. The input-output set for the clay model was:

Inputs (current state i): τ, ε̇, ε and Δε
Output (next state i+1): τ

where τ is shear stress and ε̇ is the rate of strain increment.

Zhu et al. (1998a) presented a recurrent neural network (RNN) model for simulating and predicting shear behavior of two different soils. A recurrent neural network is a network where the connections between the units form a directed cycle. Recurrent neural networks must be approached differently from feed forward neural networks, both when analyzing their behavior and training them. Hidden nodes in an RNN can transmit their outputs to both input layer and output layer simultaneously (Elman 1990). A typical architecture of an RNN with one hidden layer is shown in Fig. 3.
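The idea of hidden nodes feeding their outputs back towards the input side can be sketched with a single step of an Elman-type network. This is a generic illustration, not the architecture of Zhu et al. (1998a): the tanh transfer function, the linear output layer, the function name `rnn_step` and all weight values are assumptions.

```python
import math

def rnn_step(x, context, w_in, w_ctx, b_h, w_out, b_out):
    """One step of a simple Elman-type recurrent network.

    The hidden outputs are returned so the caller can pass them back in
    as `context` on the next step.
    """
    hidden = []
    for j in range(len(b_h)):
        net = b_h[j]
        net += sum(w * xi for w, xi in zip(w_in[j], x))        # contribution of current inputs
        net += sum(w * c for w, c in zip(w_ctx[j], context))   # contribution of fed-back hidden outputs
        hidden.append(math.tanh(net))
    y = [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w_out, b_out)]
    return y, hidden

# Illustrative weights: 1 input, 2 hidden nodes, 1 output.
w_in = [[0.5], [-0.3]]
w_ctx = [[0.2, 0.1], [0.0, 0.4]]
b_h = [0.0, 0.0]
w_out = [[1.0, 1.0]]
b_out = [0.0]

y1, h1 = rnn_step([1.0], [0.0, 0.0], w_in, w_ctx, b_h, w_out, b_out)
y2, h2 = rnn_step([1.0], h1, w_in, w_ctx, b_h, w_out, b_out)
```

Although the same input is presented twice, `y2` differs from `y1` because the second step also sees the hidden outputs of the first, which is how such a network can carry information about the loading history.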

Fig. 3 A typical recurrent neural network

Laboratory based experimental data were used for modeling, including a set of strain controlled undrained tests and a set of stress controlled drained tests performed on a residual Hawaiian volcanic soil. The choice of input-output variables was different due to the different sources of data. For the strain controlled test the goal was to measure the stress response of the specimen to a given strain value, therefore the selected input-output variables for RNN training were:

Inputs (current state i): q, σ3′, u, ε1, Δε1 and e
Outputs (next state i+1): q and u

where q = σ1 − σ3 is deviatoric stress.

In contrast, since in a stress controlled test the shear stress and stress increment were known in advance, the goal was to predict the strain response of the specimen due to a stress increase. Thus for such test results, the selected input-output parameters for the RNN were:

Inputs (current state i): σ1′, σ3′, Δσ1′, Δσ3′, u, ε1, εv and e
Outputs (next state i+1): ε1 and εv

In both models, an RNN structure with one hidden layer containing 20 nodes was found to generate the minimal sum squared error. Good agreement between the modeling results and the observed experimental data showed the efficiency of the RNN approach in modeling complex soil behavior. The authors suggested that such an RNN model could be applicable to other soils if appropriate input and output parameters are chosen.

Zhu et al. (1998b) published a similar work in which the same NNCM (in terms of network type, architecture and input-output set) was used to model soil behavior, using generally the same data as in Zhu et al. (1998a). However, in this work the authors proposed that a network structure with one hidden layer of 20 and 35 nodes is suitable for the modeling of the strain controlled undrained tests and stress controlled drained tests, respectively.

Ghaboussi et al. (1998) described a new indirect method, called autoprogressive training, for training neural network material models from structural tests to learn complex stress-strain behavior of materials. The global data measured from a structural load-deflection test were used to train the network. The main premise of the work was that structural tests usually generate a large number of spatial patterns of stresses and strains that can be used for training. The term “autoprogressive training” referred to a process in which the neural network is itself an integral part of the iterative algorithm that is used to create the stress-strain training cases from the global response data. This method differs from common applications of NN models in the sense that there is not a known set of data to train the network, but the material model is extracted iteratively from global measurements using nonlinear finite element analysis (Haj-Ali et al. 2001). The applications discussed in this paper show a procedure that can be used to create the stress-strain training data for the neural network material model, having knowledge of the global load vs. deflection response of the structure. In contrast to previous applications of neural networks in constitutive modeling, in this method there was not “a priori” a set of directly measured information that accurately represents the material behavior; instead, this information must be extracted from the recorded structural response. Based on the results of two simple examples presented in this paper, the predictions of the neural network trained in this way were consistent. However, the minimum number of measured structural responses, and their type and locations on the structure, that are required in order to uniquely determine a neural network material model is an important theoretical issue that remains to be addressed.

Sidarta and Ghaboussi (1998) modified the earlier Ghaboussi et al. (1998) work in order to develop a neural network based constitutive model for geomaterials using autoprogressive training. They used a non-uniform material test which had a non-uniform distribution of stresses and strains within the specimen. Then the measured boundary forces and displacements were applied in a finite element model of the test to generate the input and output data for training the neural network material model. Using the data generated in that way, the autoprogressive method was used to train the neural network material model. Three drained triaxial tests on Sacramento River sand were considered in this work. The tests were performed with end friction condition, and the relative densities of the samples ranged from loose to medium dense to dense. The measured axial forces and confining pressures were taken directly from the test data. The radial displacements of the outer surface of each sample were determined by assuming a parabolic distribution. These measured force and displacement boundary conditions were used in the autoprogressive method. The components of stress and strain, which were required to train the neural network material model, were constructed artificially in the finite element model of the test. In the model, the components of current strain and void ratio (together with stresses and strains of the previous history points where necessary) were used as input to predict the components of current stress as output. The results indicated that the material behavior becomes increasingly more complex (requiring more history point modules) with increasing soil density.

The trained neural networks were used in finite element analysis of the actual triaxial tests with end friction as well as finite element analyses of hypothetical tests with no end friction. The results of the analysis with end friction matched well with those of the actual experiment. However, the results of the forward analysis of the hypothetical tests with no end friction showed significant differences from the actual experimental results. The work presented in this paper introduced an improvement over conventionally trained neural network based constitutive models for geomaterials. The attraction of the non-uniform test used in this study is that a range of stress levels and a variety of stress paths may be represented in a single test, therefore the test results contain information on material behavior for different stress levels and stress paths. If that information could be extracted, then the results of a single non-uniform material test may be sufficient for training a neural network constitutive model and there is no need for a large number of conventional triaxial tests with different stress paths to produce the training data.

Ghaboussi and Sidarta (1998) and Sidarta and Ghaboussi (1998) presented a nested adaptive neural network (NANN) for constitutive modeling. The idea behind this approach is that the material data have an inherent structure and one type of such inherent internal structure in data is the nested structure. Basically, nested adaptive neural networks take advantage of the nested structure of the material test data, and reflect it in the architecture of the neural network. A nested neural network consists of several modules. The starting point of building a NANN is to develop a base module to represent the material behavior in the lowest function space in the data structure. This base module is a standard multi-layer feed-forward neural network. The base module is then augmented by attaching added modules to form a higher level NANN. The process is theoretically open ended and more and more modules can be added. The added modules themselves are also standard multi-layer feed-forward neural networks. The developed NANNs were applied to modeling of drained and undrained behavior of Sacramento River sand in triaxial compression tests. The objective was to model the material behavior in both drained and undrained conditions for a range of initial void ratios and initial confining pressures. First a base module was developed and then the history point modules were added. The results indicated that the effect of history points on the material behavior becomes increasingly more complex and difficult for the neural network to learn. As the number of history points increases, the number of inputs can increase significantly, which after even a few steps can make the network massively complex and result in much higher computational time and cost.

Another neural network based constitutive relationship was presented by Penumadu and Zhao (1999) to model stress-strain and volume change behavior of sand and gravel under drained triaxial compression test conditions. The NNCMs presented in this paper were developed based on a large database comprised of nearly 250 triaxial test results collected from the literature. Two neural network sand models (Sand-Low and Sand-High) were developed to model the test results on sand in the low confining pressure (less than 700 kPa) and high confining pressure (higher than 700 kPa) ranges. The division at 700 kPa was chosen arbitrarily by the authors. Also, a single model was developed for test results on gravel.

A sequential neural network structure (Fig. 3) was used and, like other NNCMs, the back-propagation algorithm was employed to train the neural networks. The final optimum network architecture had three layers with eleven neurons in the input layer, fifteen neurons in the hidden layer and two neurons in the output layer. The number of hidden units was determined using a trial and error procedure. The selected input-output variables for NN training were:

Inputs (current state i): σd, εv, σ3′, ε1, Δε1, e, ns, h, D50, Cu and Cc
Outputs (next state i+1): σd and εv

Seven of the eleven inputs were used to describe the hardness of the mineral ($h$), shape factor ($n_s$), equivalent particle size and the particle size distribution ($D_{50}$, $C_u$, $C_c$), void ratio ($e$) and effective confining pressure ($\sigma_3'$). The current state units of stress and strain were represented with three inputs using deviator stress ($\sigma_d^{i}$), axial strain ($\varepsilon_1^{i}$) and volumetric strain ($\varepsilon_v^{i}$). For given specimen conditions and current state units, the objective of the neural network was to predict the deviator stress ($\sigma_d^{i+1}$) and volumetric strain ($\varepsilon_v^{i+1}$) of the next state for an input axial strain increment ($\Delta\varepsilon_1^{i}$).

An interesting feature of the network training in this research was that a fixed set of axial strain increments was chosen consistently for all the test data. This means that the strain was increased by a constant increment (e.g., 0.1%). The original experimental data (deviator stress-axial strain and volumetric strain-axial strain) were not recorded at specific strain increments. The authors obtained the training patterns corresponding to the chosen strain increment by digitizing the data and using cubic spline interpolation (Press et al. 1992). It was observed that the neural network material models obtained in this research were able to represent the constitutive behavior of cohesionless soil with reasonable accuracy. This NNCM was later used in Penumadu et al. (2000) to simulate triaxial tests.
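The construction of training patterns at a fixed strain increment can be illustrated with a minimal sketch. It is not the procedure of Penumadu and Zhao (1999): linear interpolation is used here as a simple stand-in for their cubic spline interpolation, and the function names, the test record and the specimen values are all hypothetical.

```python
from bisect import bisect_right


def resample_linear(x, y, step):
    """Resample y(x) on a uniform grid x[0], x[0] + step, ... up to x[-1].

    Linear interpolation is a stand-in for the cubic spline interpolation
    reported in the paper. x must be strictly increasing.
    """
    n_steps = int((x[-1] - x[0]) / step + 1e-9)
    grid = [x[0] + k * step for k in range(n_steps + 1)]
    values = []
    for g in grid:
        # Index of the data interval that contains g (clamped to valid range).
        j = min(max(bisect_right(x, g) - 1, 0), len(x) - 2)
        t = (g - x[j]) / (x[j + 1] - x[j])
        values.append(y[j] + t * (y[j + 1] - y[j]))
    return grid, values


def build_patterns(eps_1, sigma_d, eps_v, specimen, step):
    """Pair each state i (11 inputs) with the state i+1 (2 targets)."""
    sigma_3, e, n_s, h, d50, c_u, c_c = specimen
    grid, sd = resample_linear(eps_1, sigma_d, step)
    _, ev = resample_linear(eps_1, eps_v, step)
    patterns = []
    for i in range(len(grid) - 1):
        inputs = [sd[i], ev[i], sigma_3, grid[i], step, e, n_s, h, d50, c_u, c_c]
        targets = [sd[i + 1], ev[i + 1]]
        patterns.append((inputs, targets))
    return patterns


# Hypothetical, irregularly spaced test record (strains in %, stress in kPa).
eps_1 = [0.0, 0.15, 0.4, 1.0]
sigma_d = [0.0, 30.0, 80.0, 200.0]
eps_v = [0.0, -0.015, -0.04, -0.1]
specimen = (100.0, 0.65, 0.8, 7.0, 0.3, 1.5, 1.0)  # made-up specimen values
patterns = build_patterns(eps_1, sigma_d, eps_v, specimen, step=0.1)
```

Each pattern holds the eleven inputs in the order listed in the text and the two targets for the next state, so records measured at irregular strains all yield patterns at the same strain increment.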

Habibagahi and Bamdad (2003) used a neural network to describe the mechanical behavior of unsaturated soils. A multilayer perceptron with a sequential architecture and feedback capability was chosen for this network. Triaxial test results on Lateritic gravel, reported by Toll (1998), were used as the database. The final network, which was obtained through a trial and error procedure, had three layers with 9 neurons in the input layer and three neurons in the output layer. The optimal number of nodes in the hidden layer was found to be five. The input-output parameter set for this NNCM for unsaturated soils was:

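Inputs: $q^{i}$, $\varepsilon_v^{i}$, $\Delta(U_a - U_w)^{i}$, $\varepsilon_1^{i}$, $(P - U_a)$, $(P - U_w)$, $(U_a - U_w)$, $S_r$, $\rho_d$ and $\theta$

Outputs: $q^{i+1}$, $\varepsilon_v^{i+1}$ and $\Delta(U_a - U_w)^{i+1}$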

In the input parameter set, four neurons, namely soil water content $\theta$, dry density $\rho_d$, degree of saturation $S_r$ and soil suction $(U_a - U_w)$, represent the initial condition of the specimen before shearing. The other six neurons, namely axial strain $\varepsilon_1$, change in suction $\Delta(U_a - U_w)$, mean effective stresses with respect to pore air and water pressures ($(P - U_a)$ and $(P - U_w)$), volumetric strain $\varepsilon_v$ and deviatoric stress $q$, are the input variables that must be updated incrementally during training based on the outputs received from the previous increment of training. It was shown that the trained network was able to model the mechanical behavior (stress-strain, volume change and change in suction) of unsaturated soils with reasonable accuracy. The authors also proposed that the model may be used to simulate triaxial tests (artificial tests) under similar conditions.
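The incremental feedback described above, where the outputs of one increment become inputs of the next, can be sketched as a simple loop. This is an illustration only: `toy_predict_next` is a hypothetical stand-in for a trained network (it is not a soil model), the dictionary keys are invented names, and the updating of the mean stresses is omitted for brevity.

```python
def simulate_test(predict_next, initial_state, specimen, d_eps_1, n_steps):
    """Roll a one-step model forward, feeding each prediction back as input."""
    state = dict(initial_state)
    history = [dict(state)]
    for _ in range(n_steps):
        outputs = predict_next(state, specimen)  # one-step prediction
        state.update(outputs)                    # outputs become the next inputs
        state["eps_1"] += d_eps_1                # advance the axial strain
        history.append(dict(state))
    return history


def toy_predict_next(state, specimen):
    """Hypothetical stand-in for a trained network (not a real soil model)."""
    return {
        "q": state["q"] + 0.2 * (specimen["q_ult"] - state["q"]),
        "eps_v": state["eps_v"] + 0.01,
        "d_suction": 0.9 * state["d_suction"],
    }


initial = {"q": 0.0, "eps_v": 0.0, "d_suction": 0.0, "eps_1": 0.0}
specimen = {"q_ult": 300.0}  # made-up value used only by the toy predictor
history = simulate_test(toy_predict_next, initial, specimen, d_eps_1=0.5, n_steps=5)
```

Because every step consumes the previous prediction, errors can accumulate along the simulated path, which is one reason such artificial tests are usually restricted to conditions similar to the training data.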

In addition to the works mentioned above, other researchers have also applied NNs to constitutive modeling of geomaterials using different datasets (e.g., Banimahd et al. 2005, Najjar et al. 1999, Wu et al. 2001). The results of these works also show the capability of NNs in stress-strain prediction of different soils.

3.2 Implementation of NNCM in finite element method

As described in the previous section, many researchers have attempted to model various aspects of the constitutive behavior of geomaterials with neural networks. Although these works differ in their details and terminology, most of their results indicate that NNs can represent material responses to different load paths with reasonable accuracy. It follows that, in theory, a conventional (analytical) constitutive model in a numerical analysis tool such as FEM can be replaced with a suitably trained NNCM. However, the focus of most of the investigations has been on the description of the constitutive behavior itself. As a result, little is known about the performance of NNCMs in engineering analyses. The main reason for this appears to be that there are considerable difficulties in incorporating a general NNCM in finite element codes (Shin and Pande 2002).

Shin and Pande (2000) presented a self learning FE code in which a NNCM was used instead of conventional constitutive models and showed that the application of a constitutive law in the form of a neural operator leads to some qualitative improvement in the application of FEM in engineering practice. They presented a procedure where data for training the neural network based constitutive model were acquired from planned monitoring of structural tests. Unlike conventional procedures, where material testing is generally performed to extract the stress-strain relationship and identify material parameters, in this work inverse analysis was carried out to identify material parameters from the monitored global structural response. In this way the software was expected to be self learning; however, for this purpose the results of the structural behavior needed to be available in advance. Depending on the mesh size of the problem under consideration, a large amount of data may be accumulated as the number of self learning cycles increases, which can result in severe computer storage and CPU time problems during training. To address this problem a limited number of monitoring points were selected in the structure and the data corresponding to these points were used to train the NNCM. Selection of the number and location of monitoring points is therefore of considerable importance in identifying a reliable NNCM. It was stated that such a trained NNCM needs to be treated with caution when modeling the behavior of other structures, as an NNCM may predict the correct response at a few points yet be completely inadequate to predict the response at others.

Shin and Pande (2001) showed that in their self learning finite element code the tangential constitutive matrix of the material can be computed, as it is possible to obtain partial derivatives of a neural network model that has been trained through total stress and strain data. The capabilities of the developed FE code were illustrated by analyzing a cylindrical rock specimen under uniaxial compression (with fixed ends). Shin and Pande (2002) proposed a strategy to generate additional data from general homogeneous material tests in order to train the NNCM. This was done by taking advantage of isotropy when it is applicable to the material under consideration. A boundary value problem of a circular cavity in a plane stress plate was modeled with the self-learning FE code using the NNCM trained with the enhanced dataset. The self-learning FE analyses showed comparable results with FE analyses using conventional constitutive models.
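The idea that a tangential constitutive matrix follows from the partial derivatives of a trained network can be illustrated with a small sketch. It assumes, purely for illustration, a single hidden layer with tanh transfer functions mapping a strain vector to a stress vector; the weights are arbitrary numbers, not a trained model, and the function names are hypothetical. For such a network the Jacobian is $W_2\,\mathrm{diag}(1 - h^2)\,W_1$, which the code checks against central finite differences.

```python
import math


def forward(x, W1, b1, W2, b2):
    """One-hidden-layer network: out = W2 * tanh(W1 * x + b1) + b2."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    out = [sum(w * h for w, h in zip(row, hidden)) + b
           for row, b in zip(W2, b2)]
    return out, hidden


def jacobian(x, W1, b1, W2, b2):
    """Analytic d(out_k)/d(x_j) by the chain rule (tanh' = 1 - tanh^2)."""
    _, hidden = forward(x, W1, b1, W2, b2)
    return [[sum(W2[k][m] * (1.0 - hidden[m] ** 2) * W1[m][j]
                 for m in range(len(hidden)))
             for j in range(len(x))]
            for k in range(len(W2))]


def finite_difference_jacobian(x, params, h=1e-6):
    """Central-difference approximation used only to check the analytic result."""
    n_out = len(forward(x, *params)[0])
    J = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        yp, _ = forward(xp, *params)
        ym, _ = forward(xm, *params)
        for k in range(n_out):
            J[k][j] = (yp[k] - ym[k]) / (2.0 * h)
    return J


# Arbitrary illustrative weights (2 strain inputs, 3 hidden units, 2 stress outputs).
params = (
    [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.25]],  # W1
    [0.0, 0.1, -0.1],                          # b1
    [[1.0, -0.5, 0.3], [0.2, 0.4, -0.6]],      # W2
    [0.0, 0.0],                                # b2
)
strain = [0.3, -0.1]
D_analytic = jacobian(strain, *params)
D_numeric = finite_difference_jacobian(strain, params)
```

In a finite element setting the analytic form would normally be preferred, since it avoids extra network evaluations at every integration point.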

Drakos et al. (2006) presented a NNCM and stated that the model is equivalent to the hardening soil model. Synthetic data for training the NNCM were generated using the Hardening Soil Model (HSM) available in the commercial software PLAXIS, with a set of arbitrary parameters, typical of sands, chosen for the HSM. The performance of the trained NNCM was then validated by using the model for numerical analysis of two simple foundation and excavation problems.

Lefik and Schrefler (2003) used a neural network for constitutive modeling of nonlinear material behavior and highlighted some of the difficulties in the constitutive description in incremental form. Hashash et al. (2004) described some of the issues related to the numerical implementation of NNCM in finite element analysis and derived a closed-form solution for the material stiffness matrix for the neural network constitutive model.

Javadi et al. (2002, 2003, 2004a, 2004b, 2005) carried out extensive research on application of neural networks in constitutive modeling of complex materials in general and soils in particular. They developed an intelligent finite element method (NeuroFE code) based on the incorporation of a back-propagation neural network in finite element analysis. The intelligent finite element model was applied to a wide range of boundary value problems including several geotechnical engineering applications and it was shown that ANNs can be very efficient in learning and generalizing the constitutive behavior of complex materials such as soils, rocks and others.

3.3 Other applications of ANN in geomechanics

ANNs have been applied to a wide range of geotechnical engineering problems such as pile bearing capacity (e.g., Abu-Kiefa 1998, Goh 1996), site characterization (e.g., Juang et al. 2001), soil behavior (e.g., Zhu et al. 1998), liquefaction potential (e.g., Juang and Chen 1999), slope stability (e.g., Lu and Rosenbaum 2003), underground openings (e.g., Benardos and Kaliampakos 2004, Javadi 2006) and many others. Toll (1996) presents a review of engineering applications of AI techniques.

3.4 Advantages and shortcomings of neural networks

A neural network based constitutive model has several advantages including:

(i) It provides a unified approach to constitutive modeling of all materials;

(ii) It does not require any arbitrary choice of the constitutive (mathematical) model. The incorporation of an ANN in the FE procedure avoids the need for complex yielding/plastic potential/failure functions, flow rules, etc. There is no need to check yielding, to compute the gradients of the plastic potential curve or to update the yield surface;

(iii) There are no material parameters to be identified;

(iv) As a neural network learns the material behavior directly from raw experimental data, the ANN based constitutive model is the shortest route from experimental research (data) to numerical modeling;

(v) The numerical parameters of the neural network-based constitutive models are easily and automatically defined and the NNCM can be incorporated in a FE code in a very natural manner. A trained network can be incorporated in a FE code/procedure in the same way as a conventional constitutive model, using either an incremental or a total stress-strain strategy. An intelligent FE method can be used for solving boundary value problems in the same way as a conventional FEM;

(vi) An additional advantage of NNCM is that as more data become available, the material model can be improved by re-training the ANN.

Although various researchers have shown that ANNs offer great advantages in the analysis of many geotechnical engineering problems, in general they suffer from a number of drawbacks. One of the main disadvantages of the ANN (and NNCM) is that the optimum structure of the network (such as number of inputs, hidden layers, transfer functions, etc.) must be identified a priori, which is usually done through a time consuming trial and error procedure. In this respect, some attempts have been made to address optimal design of ANN structure based on a multi-objective strategy to find a trade-off between model simplicity and accuracy (Giustolisi and Simeone 2006). Another major disadvantage of neural network based models is the large complexity of the network structure, as it represents the knowledge in terms of a weight matrix and biases which are not accessible to user understanding. In other words, NN models provide no insight into the way inputs affect the output and are therefore considered a black box class of model. The lack of interpretability of NN models has inhibited them from achieving their full potential in real world problems (Lu et al. 2001). In addition, as ANNs perform function approximation through large parameterization and the use of simple functional structures (transfer functions), parameter estimation and overfitting problems represent other major disadvantages of a model constructed by ANN (Giustolisi 2002).

4. Genetic programming

Genetic programming, which was introduced in the early 1990s by Koza (1992), is an evolutionary computing method that generates a transparent and structured representation of the data provided. Evolutionary algorithms (EAs) are search techniques based on computer implementation of some of the evolutionary mechanisms found in nature (such as selection, crossover and mutation) in order to solve a function identification problem. The function identification problem is to search for a function in a symbolic form that fits a set of experimental data.

Fig. 4 Typical GP tree representing function $(2/x_1 + x_2)^2$

Genetic algorithm (GA) and genetic programming (GP) are the major types of evolutionary algorithms. GP is a generalization and an extension of GA. GAs are generally used in parameter optimization to evolve the best values for a given set of model parameters, whereas GP gives the basic structure of the approximation model together with the values of its parameters. While a GA uses a string of numbers to represent the solution, GP combines a high level symbolic representation with the search efficiency of the GA to form the best possible model for the system.

Representation schemes in genetic programming are composed of nodes which are elements from a terminal set (constants, e.g., 2, and/or variables, e.g., $x_1$, $x_2$, etc.) and a functional set (mathematical operators that generate the model, e.g., ±, $x^y$, etc.). A typical genetic programming tree, representing the simple algebraic expression $(2/x_1 + x_2)^2$, is shown in Fig. 4.
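A tree of this kind can be written down and evaluated in a few lines. The sketch below is one possible encoding, assuming nested tuples for function nodes and plain values or variable names for terminals; the exact layout of the tree in Fig. 4 and the names `FUNCTIONS`, `TREE` and `evaluate` are assumptions made for illustration.

```python
import operator

# Functional set: binary operators available to the tree.
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "^": operator.pow,
}


def evaluate(node, variables):
    """Recursively evaluate a tree of (symbol, left, right) tuples."""
    if isinstance(node, tuple):
        symbol, left, right = node
        return FUNCTIONS[symbol](evaluate(left, variables),
                                 evaluate(right, variables))
    if isinstance(node, str):   # terminal: a variable name
        return variables[node]
    return node                 # terminal: a numeric constant


# (2/x1 + x2)^2 with terminals 2, x1, x2 and functions /, +, ^
TREE = ("^", ("+", ("/", 2, "x1"), "x2"), 2)

value = evaluate(TREE, {"x1": 4.0, "x2": 1.5})  # (0.5 + 1.5)^2 = 4.0
```

Crossover and mutation then amount to swapping or replacing sub-tuples of such trees.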

The result of the GP process is a set of random trees of different sizes and shapes, each exhibiting a different fitness with respect to the objective function. If the set of applied functions is sufficiently rich, tree structures are capable of representing hierarchical programs of any complexity.

The nature of genetic programming (GP) allows the user to gain additional information on how the system performs, i.e., it gives an insight into the relationship between input and output data. Once a population of computer programs has been randomly created, the process of evolving the population proceeds using the same principles as for GAs, with the minor difference that strings of functions and terminals are reproduced, crossed over and mutated rather than strings of binary codes. Evolutionary algorithms maintain a population of structures that evolve according to the rules of natural selection and some operators inspired by natural genetics, such as reproduction or crossover. Each individual in the population receives a measure of its fitness in the current environment. The fitness is calculated by the objective function, i.e., how good the individual is at competing with the rest of the population. At each generation a new population is created by selecting individuals according to their fitness and breeding them together using the genetic operators (crossover and mutation). The existing population is then replaced with the new population. The procedure continues until the termination criterion, which can be either the maximum number of generations or a particular allowable error, is satisfied. After the termination criterion is met, the single best program in the final population is designated as the result of the GP process.
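The generational loop described above can be sketched as follows. This is a toy illustration under several assumptions that are not taken from the paper: a function set of +, − and × only, tournament selection, one elite individual copied unchanged into each new population, a cap on tree size, a 20% mutation rate and a made-up target data set. All names are hypothetical.

```python
import math
import operator
import random

FUNCS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]


def random_tree(rng, depth):
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(list(FUNCS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    if isinstance(tree, tuple):
        return FUNCS[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))
    return x if tree == "x" else tree


def size(tree):
    return 1 + size(tree[1]) + size(tree[2]) if isinstance(tree, tuple) else 1


def fitness(tree, data):
    """Sum of squared errors (lower is better); non-finite errors count as inf."""
    err = sum((evaluate(tree, x) - y) ** 2 for x, y in data)
    return err if math.isfinite(err) else float("inf")


def random_subtree(rng, tree):
    while isinstance(tree, tuple) and rng.random() < 0.5:
        tree = tree[rng.choice((1, 2))]
    return tree


def replace_random(rng, tree, new):
    """Copy of tree with one randomly chosen node replaced by new."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return new
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, replace_random(rng, left, new), right)
    return (op, left, replace_random(rng, right, new))


def tournament(rng, population, data, k=3):
    return min(rng.sample(population, k), key=lambda t: fitness(t, data))


def evolve(data, pop_size=40, max_generations=20, tolerance=1e-9, max_nodes=40, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    best = min(population, key=lambda t: fitness(t, data))
    history = [fitness(best, data)]
    for _ in range(max_generations):
        if history[-1] <= tolerance:          # allowable-error termination
            break
        new_population = [best]               # elitism (an added assumption)
        while len(new_population) < pop_size:
            parent_a = tournament(rng, population, data)
            parent_b = tournament(rng, population, data)
            child = replace_random(rng, parent_a, random_subtree(rng, parent_b))  # crossover
            if rng.random() < 0.2:
                child = replace_random(rng, child, random_tree(rng, 2))           # mutation
            if size(child) > max_nodes:       # limit growth of expressions
                child = parent_a
            new_population.append(child)
        population = new_population           # replace the old population
        best = min(population, key=lambda t: fitness(t, data))
        history.append(fitness(best, data))
    return best, history


# Made-up target: y = x^2 + 2x + 1 sampled at five points.
DATA = [(x, x * x + 2 * x + 1) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
best, history = evolve(DATA)
```

With the elite individual retained, the best error recorded in `history` cannot increase from one generation to the next; without elitism the best program of the final population may be worse than one found earlier.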

4.2 Application of GP in geomechanics

Application of genetic programming in the field of civil engineering is quite new, and it has only just started to be used in the field of geotechnical engineering. Indeed, the pioneering works investigating the capability of genetic programming in geotechnics have been published recently by the authors (e.g., Javadi and Rezania 2006, Javadi et al. 2006, Rezania and Javadi 2007). Javadi et al. (2006) introduced GP as a new approach for determination of liquefaction induced lateral spreading. This is a complex geotechnical problem because of the large number of parameters involved (i.e., parameters describing the earthquake strength, geology of the site and the soil characteristics). In this work GP models were trained and validated using a database of SPT-based case histories. Separate models were presented to estimate lateral displacements for free face and for gently sloping ground conditions. It was shown that the GP models are able to learn, with a very high accuracy, the complex relationship between lateral spreading and its contributing factors in the form of a function. It was also shown that the attained function can be used to generalize the learning to predict liquefaction induced lateral spreading for new cases not used in the construction of the model. The results of the developed GP models were compared with those of a commonly used model and the advantages of the proposed GP model were highlighted. It was shown that the GP based models for lateral spreading determination offer an improvement over the most commonly used model for this problem, the multi linear regression (MLR) model (Youd et al. 2002).

Rezania and Javadi (2007) utilized genetic programming for prediction of settlement of shallow foundations on cohesionless soils. It was shown that the application of the traditional methods for prediction of settlement of shallow foundations could lead to very large errors. A new GP based model was developed and presented in that paper. Comparison of the results showed that the predictions by the proposed GP model provide significant improvements over the traditional methods and also outperform the ANN based models.

5. Evolutionary polynomial regression

Evolutionary polynomial regression (EPR) is a data-driven method based on evolutionary computing, which aims to search for polynomial structures representing a system. Genetic programming and neural networks are both very powerful non-linear modeling techniques, but they have their own drawbacks. GP searches for mathematical expressions for F in Eq. (1) using an evolutionary approach, but the parameter values (vector θ) are generated as non-adjustable constants, referred to as ephemeral random constants. Therefore the constants do not necessarily represent optimal values as in numerical regression methods, and good structures of F can be missed in the process (Giustolisi and Savic 2006). Furthermore, the number of terms in GP based expressions can become very large and the evolutionary search within GP can be quite slow. Some of the disadvantages of the ANN approach have been highlighted in section 3.4.

EPR is classified as a symbolic grey box technique which can construct clearly structured model expressions for a given set of data. To avoid the problem, associated with GP, of mathematical expressions growing rapidly in length with time, the evolutionary procedure in EPR searches for the exponents of a polynomial function with a fixed maximum number of terms, rather than performing a general evolutionary search as in conventional GP. Furthermore, during one execution it returns a number of expressions with increasing numbers of terms up to a limit set by the user, to allow the optimum number of terms to be selected.

In general, EPR is a two-stage technique for constructing symbolic models: (i) first, using a standard genetic algorithm (GA), it searches for the best form of the function structure, i.e., a combination of vectors of independent inputs, X (Eq. (1)); and (ii) second, it performs a least squares regression to find the adjustable parameters, θ, for each combination of inputs. In this way a global search algorithm is implemented for both the best set of input combinations and the related exponents simultaneously, according to the user-defined cost function.
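The second stage can be illustrated for one candidate structure. The sketch below assumes a model of the form $y = a_0 + \sum_j a_j \prod_k x_k^{ES_{jk}}$, a form commonly associated with EPR but not written out in this section, and solves the normal equations by Gaussian elimination. The exponents are fixed by hand as a stand-in for a candidate proposed by the GA; the data, exponents and function names are hypothetical.

```python
def build_terms(X, exponents):
    """Design matrix: column 0 is the bias, column j the j-th product term."""
    rows = []
    for x in X:
        row = [1.0]
        for exps in exponents:
            term = 1.0
            for xi, e in zip(x, exps):
                term *= xi ** e
            row.append(term)
        rows.append(row)
    return rows


def least_squares(A, y):
    """Solve (A^T A) theta = A^T y by Gauss-Jordan elimination with pivoting.

    Raises ZeroDivisionError if the terms are linearly dependent.
    """
    n = len(A[0])
    M = [[sum(r[i] * r[j] for r in A) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(A, y))]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def sum_squared_error(A, y, theta):
    return sum((sum(a * t for a, t in zip(row, theta)) - yi) ** 2
               for row, yi in zip(A, y))


# Made-up data generated from y = 0.5 + 2*x1*x2^2 - 3*x1^0.5
X = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 2.0)]
y = [0.5 + 2.0 * x1 * x2 ** 2 - 3.0 * x1 ** 0.5 for x1, x2 in X]

candidate = [(1, 2), (0.5, 0)]          # one exponent row per term
A = build_terms(X, candidate)
theta = least_squares(A, y)             # approximately [0.5, 2.0, -3.0]
cost = sum_squared_error(A, y, theta)
```

In a full EPR run a cost such as `cost` would be returned to the GA as the fitness of the candidate exponent set, so the evolutionary search only has to find the structure while the coefficients are always regression-optimal for it.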

The global search for the best form of function is performed by means of a standard GA over the values in the user defined vector of exponents. The GA operates based on Darwinian evolution, which begins with random creation of an initial population of solutions. Each parameter set in the population represents the individual’s chromosomes. Each individual is assigned a fitness based on how well it performs in its environment. Through crossover and mutation operations, with the probabilities Pc and Pm respectively, the next generation is created. Fit individuals are selected for mating, whereas weak individuals die off. The mated parents create a child (offspring) with a chromosome set which is a mix of the parents’ chromosomes. It is also possible that one parent chromosome undergoes a mutation operation to form the offspring. The EPR process continues over generations and stops when the termination criterion, which can be either the maximum number of generations, the maximum number of terms in the target mathematical expression or a particular allowable error, is satisfied. A description of the mathematical formulation and details of the EPR procedure is outside the scope of the current paper and can be found in, e.g., Giustolisi and Savic (2006).
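The roles of Pc and Pm can be shown with a minimal GA over chromosomes of exponents. This sketch is not the EPR implementation: the candidate exponent vector, binary tournament selection, the retained best individual and especially `toy_cost` are assumptions made for illustration. In EPR the cost would instead be the fitting error obtained from the least squares stage.

```python
import random

CANDIDATE_EXPONENTS = [-1, 0, 0.5, 1, 2]   # hypothetical user-defined exponents
TARGET = [1, 2, 0.5, 0]                    # hidden structure used only by toy_cost


def toy_cost(chromosome):
    """Stand-in cost: number of genes that differ from TARGET."""
    return sum(1 for g, t in zip(chromosome, TARGET) if g != t)


def crossover(rng, a, b, pc):
    """Single-point crossover applied with probability pc."""
    if rng.random() < pc and len(a) > 1:
        cut = rng.randrange(1, len(a))
        return a[:cut] + b[cut:]
    return a[:]


def mutate(rng, chromosome, pm):
    """Each gene is replaced by a random candidate with probability pm."""
    return [rng.choice(CANDIDATE_EXPONENTS) if rng.random() < pm else g
            for g in chromosome]


def run_ga(cost, n_genes, pop_size=30, max_generations=50, pc=0.8, pm=0.1, seed=1):
    rng = random.Random(seed)
    population = [[rng.choice(CANDIDATE_EXPONENTS) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = min(population, key=cost)
    history = [cost(best)]
    for _ in range(max_generations):
        if history[-1] == 0:                       # allowable error reached
            break
        children = [best[:]]                       # keep the best individual
        while len(children) < pop_size:
            parent_a = min(rng.sample(population, 2), key=cost)
            parent_b = min(rng.sample(population, 2), key=cost)
            children.append(mutate(rng, crossover(rng, parent_a, parent_b, pc), pm))
        population = children
        best = min(population, key=cost)
        history.append(cost(best))
    return best, history


best, history = run_ga(toy_cost, n_genes=len(TARGET))
```

Restricting the genes to a short list of candidate exponents keeps the search space finite, which is what allows a standard GA to be used for the structure search.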

5.1 Application of EPR in geomechanics

EPR is a recently developed methodology that was originally used for environmental modeling by its developers (Giustolisi and Savic 2006, Giustolisi et al. 2007, Doglioni et al. 2008). However, the capability and performance of the EPR approach in dealing with problems in other disciplines of civil engineering, including geotechnical, structural and earthquake engineering, were investigated by the authors of this paper (e.g., Javadi et al. 2007, Rezania and Javadi 2006, Rezania and Javadi 2008b).

Javadi and Rezania (2008a) introduced the EPR as a new approach for analysis of a number of geotechnical engineering problems. They investigated the feasibility of using this method for capturing nonlinear interaction between input and output variables in geotechnical systems. The efficiency of the EPR methodology was illustrated by application to a number of complex practical geotechnical engineering problems which are difficult to solve or interpret using conventional approaches. The merits and limitations of the proposed method were discussed.

Rezania et al. (2008a) highlighted some of the complexities involved in the analysis of many civil engineering phenomena and the shortcomings of traditional methods in describing such complexities. They presented EPR as a means of capturing nonlinear interactions between various parameters of civil engineering systems. They illustrated the capabilities of the EPR methodology by application to two complex civil engineering problems: evaluation of uplift capacity of suction caissons and shear strength of reinforced concrete deep beams. The results showed that the proposed EPR models provide significant improvement over the existing models. They also indicated that, for design purposes, the EPR models are easy to use and provide results that are more accurate than the existing methods. It was concluded that the new approach overcomes the shortcomings of the traditional and ANN-based methods in analysis of civil engineering systems.

Rezania and Javadi (2008a) presented a new EPR-based approach for prediction of settlement of shallow foundations. The EPR model was developed and verified using a large database of SPT (standard penetration test) based case histories involving measured settlements of shallow foundations. The results of the EPR model were compared with those of a number of commonly used traditional methods and an ANN based model. It was shown that the EPR model is able to learn, with a very high accuracy, the complex relationship between foundation settlement and its contributing factors in the form of a function and generalize the learning to predict settlement of foundations for new cases not used in the development of the model. They highlighted the advantages of the proposed EPR model over the conventional methods and the ANN based model.

Rezania et al. (2008b) used EPR for determination of liquefaction potential of sands. EPR models were developed and validated using a database of 170 liquefaction and non-liquefaction field case histories for sandy soils based on CPT (cone penetration test) results. Three models were presented to relate liquefaction potential to soil geometric and geotechnical parameters as well as earthquake characteristics. The results of the developed EPR models were compared with a conventional model and a number of neural network based models. It was shown that the proposed EPR model provides more accurate results than the conventional model and the accuracy of the EPR results is better than or at least comparable to that of the neural network based models proposed in the literature.

Javadi and Rezania (2008b) presented an innovative approach to constitutive modeling of materials in finite element analysis using EPR. The proposed approach provides a unified framework for modeling of complex materials, using an evolutionary polynomial regression-based constitutive model (EPRCM), integrated in finite element analysis. The advantages of EPRCM over conventional constitutive models and NNCMs were highlighted. The proposed algorithm provides a transparent relationship for the constitutive material model that can be easily incorporated in a finite element model. The application of the EPRCM for material modeling in finite element analysis was illustrated through a number of examples.

The main advantage of EPR over ANN appears to be that it provides the optimum structure for the material constitutive model representation as well as its parameters, directly from raw experimental (or field) data. The advantage compared with genetic programming is in producing compact and well-structured mathematical expressions.

6. Discussion

In conventional constitutive modeling of materials, an appropriate mathematical model is initially selected and the parameters of this model (material parameters) are then identified from appropriate physical tests on representative samples to capture the material behavior. When these constitutive models are used in numerical analysis, the accuracy with which the selected material model represents the various aspects of the actual material behavior affects the accuracy of the numerical predictions. In the past two decades, artificial neural networks have been introduced as an alternative approach to constitutive modeling of materials. These studies indicated that neural network-based constitutive models can be very efficient in learning and generalizing the constitutive behavior of complex materials. It has also been shown that the neural network based constitutive model (NNCM) can be incorporated in a finite element (or finite difference) code as a material model. Although it has been shown by various researchers that ANNs offer great advantages in constitutive modeling of materials, the application of NNCM in finite element analysis of engineering problems is still in its infancy and the majority of the applications so far have been limited to simple boundary value problems and relatively straightforward aspects of material behavior. The main shortcomings of the NNCMs, which have prevented them from achieving their full potential, are the black box nature of ANN and the fact that the optimum structure of the ANN must be identified a priori.

To address the shortcomings of the neural network-based approach, in recent years new approaches have been proposed for modeling of soils and other geomaterials using genetic programming (GP) and evolutionary polynomial regression (EPR). GP and EPR are evolutionary computing techniques that generate a transparent and structured representation of the system being studied. The main advantage of GP and EPR over ANN is that they provide the optimum structure for the material constitutive model representation as well as its parameters, directly from raw experimental (or field) data. The advantage of EPR compared with GP is in producing compact and well-structured mathematical expressions. GP and EPR provide a structured representation for the constitutive material model that can be readily incorporated in the finite element method. It is envisaged that the establishment of a unified framework for modeling of materials with complex behavior using ANN, GP or EPR will be valuable across various disciplines of engineering. However, in the authors’ opinion, the development of these new modeling techniques should be done in parallel with developments in conventional constitutive modeling rather than as a replacement for them.

7. Conclusions

A considerable number of neural network-based models have been developed for constitutive modeling of soils. Many of these models have been developed as simple prototypes to show the applicability of these techniques in modeling of specific soils. Only a few models have been integrated in numerical (e.g., FE) models of engineering systems. The majority of these systems use a neural network as a unified approach to constitutive modeling of complex materials. The results of these works have collectively shown the potential of NNCM for modeling of soil behavior. More recently, a number of other data mining systems have been proposed that appear to be able to address some of the shortcomings of the neural network based models. While the establishment of the new unified frameworks for modeling of materials will be valuable across the board in various disciplines of engineering, the authors believe that the development of these new techniques should be done in parallel with developments in conventional constitutive modeling rather than as a replacement for them. Development of numerical models that include a range of conventional constitutive models besides the new AI and data mining-based models will increase the range of options for modeling of complex materials. In this way, for materials whose behavior is understood and sufficiently described by one of the conventional constitutive models, an appropriate model can be selected by the user. However, for cases where the behavior is too complicated to be described by a conventional model but a sufficient amount of experimental data is available, the new modeling tools offer great advantages in numerical analysis of engineering systems. In any case, it should be noted that all these models should be used by engineers and the importance of engineering judgment in interpretation of numerical results should not be underestimated.

References

Abu-Kiefa, M.A. (1998), “General regression neural networks for driven piles in cohesionless soils”, J. Geotech. Geoenviron. Eng., ASCE, 124(12), 1177-1185.

Amorosi, A., Millar, D.L. and Rampello, S. (1996), “On the Use of Artificial Neural Networks as Generic Descriptors of Geomaterial Mechanical Behavior”, Proceeding of the International Symposium on Prediction and Performance in Rock Mechanics and Rock Engineering, EUROCK '96, Torino, Italy, 161-168.

Banimahd, M., Yasrobi, S.S. and Woodward, P.K. (2005), “Artificial neural network for stress-strain behavior of sandy soils: Knowledge based verification”, Comput. Geotech., 32(5), 377-386.

Benardos, A.G. and Kaliampakos, D.C. (2004), “Modeling TBM performance with artificial neural networks”, J. Tunneling Underground Space Technology, 19(6), 597-605.

Caudill, M. (1991), “Neural network training tips and techniques”, AI Expert, 6(1), 56-61.

Dayhoff, J.E. (1990), Neural Network Architectures: An Introduction, Van Nostrand Reinhold, New York.

Desai, C.S. and Siriwardane, H.J. (1984), Constitutive Laws for Engineering Materials with Emphasis on Geological Materials, Prentice-Hall, Englewood Cliffs, NJ.

Desai, C.S., Somasundaram, S. and Frantziskonis, G. (1986), “A Hierarchical Approach for Constitutive Modeling of Geologic Materials”, Int. J. Numer. Anal. Met. Geomech., 10(3), 225-257.

Doglioni, A. (2004), “A novel hybrid evolutionary technique for environmental hydraulic modeling”, Ph.D. dissertation, Technical University of Bari, Italy.

Doglioni, A., Giustolisi, O., Savic, D.A. and Webb, B.W. (2008), “An investigation on stream temperature analysis based on evolutionary computing”, Hydrol. Process, 22(3), 315-326.

Drakos, S., Shin, H.S. and Pande, G.N. (2006), “Finite elements with artificial intelligence”, Proceeding of the 2nd International Congress on Computational Mechanics and Simulation, Guwahati, India, Paper No. 297.

Duncan, J.M. and Chang, C.Y. (1970), “Non-linear analysis of stress and strain in soils”, J. Soil Mech. Found. Division, ASCE, 96(SM5), 1629-1653.

Einstein, H.H. and Hirschfeld, R.C. (1973), “Model studies on mechanics of jointed rock”, J. Soil Mech. Found. Division, ASCE, 99(SM3), 229-248.

Ellis, G., Yao, C. and Zhao, R. (1992), “Neural network modeling of mechanical behavior of sand”, Proceedings of the 9th ASCE Conference on Engineering Mechanics, Texas, 421-424.

Ellis, G.W., Yao, C., Zhao, R. and Penumadu, D. (1995), “Stress-Strain Modeling of Sands using Artificial Neural Networks”, J. Geotech. Eng. Division, ASCE, 121(5), 429-435.

Elman, J.L. (1990), “Finding structure in time”, Cognitive Science, 14, 179-211.

Flood, I. and Kartam, N. (1994), “Neural networks in civil engineering II principles and understanding”, J. Comput. Civil. Eng., ASCE, 8(2), 149-162.

Garrett, J.H. (1994), “Where and why artificial neural networks are applicable in civil engineering”, J. Comput. Civil. Eng., ASCE, 8(2), 129-130.

Ghaboussi, J. and Wilson, E.L. (1973), “Flow of compressible fluid in porous elastic media”, Int. J. Numer. Met. Eng., 5, 419-442.

Ghaboussi, J., Garrett, J.H. and Wu, X. (1990), “Material modeling with neural networks”, Proceedings of the International Conference on Numerical Methods in Engineering: Theory and Applications, Swansea, UK, 701-717.

Ghaboussi, J., Garrett, J.H. and Wu, X. (1991), “Knowledge-based modeling of material behavior with neural networks”, J. Eng. Mech. Division, 117(1), 132-153.

Ghaboussi, J., Lade, P.V. and Sidarta, D.E. (1994), “Neural network based modeling in geomechanics”, Proceedings of the 8th International Conference on Computer Methods and Advances in Geomechanics, Morgantown, WV, 153-1.

Ghaboussi, J., Pecknold, D.A., Zhang, M. and Haj-Ali, R.M. (1998), “Autoprogressive training of neural network constitutive models”, Int. J. Numer. Met. Eng., 42(1), 105-126.

Ghaboussi, J. and Sidarta, D.E. (1998), “New nested adaptive neural networks (NANN) for constitutive modeling”, Comput. Geotech., 22(1), 29-52.

Giustolisi, O. (2002), “Some techniques to avoid overfitting of artificial neural networks”, Proceeding of the 5th International Conference of Hydroinformatics, Cardiff, UK, 1465-1477.

Giustolisi, O. and Savic, D.A. (2006), “A symbolic data-driven technique based on evolutionary polynomial regression”, J. Hydroinform., 8(3), 207-222.

Giustolisi, O., Doglioni, A., Savic, D.A. and Webb, B.W. (2007), “A multi-model approach to analysis of environmental phenomena”, Environ. Model. Softw., 22(5), 674-682.

Giustolisi, O. and Simeone, V. (2006), “Optimal design of artificial neural networks by a multi-objective strategy: groundwater level predictions”, Hydrol. Sci. J., 51(3), 502-523.

Goh, A.T.C. (1996), “Pile driving records reanalyzed using neural networks”, J. Geotech. Geoenviron. Eng., ASCE, 122(6), 492-495.

Habibagahi, G. and Bamdad, A. (2003), “A neural network framework for mechanical behavior of unsaturated soils”, Can. Geotech. J., 40(3), 684-693.

Haj-Ali, R.M., Pecknold, D.A., Ghaboussi, J. and Voyiadjis, G.Z. (2001), “Simulated micromechanical models using artificial neural networks”, J. Eng. Mech., 127(7), 730-738.

Hashash, Y.M.A., Jung, S. and Ghaboussi, J. (2004), “Numerical implementation of a neural network based material model in finite element analysis”, Int. J. Numer. Met. Eng., 59, 9-1005.

Javadi, A.A. (2006), “Prediction of air losses in compressed air tunneling using neural network”, J. Tunneling Underground Space Technology, 21, 9-20.

Javadi, A.A. and Rezania, M. (2006), “A new genetic programming-based evolutionary approach for constitutive modeling of soils”, Proceeding of the 7th World Congress on Computational Mechanics, Los Angeles, California, USA.

Javadi, A.A. and Rezania, M. (2008a), “A new approach to data-driven modeling in geotechnical engineering”, Int. J. Geomech. Geoeng., (Under Review).

Javadi, A.A. and Rezania, M. (2008b), “A New Approach to Constitutive Modeling of Soils in Finite Element Analysis using Evolutionary Computation”, Proceedings of the Intelligent Computing in Engineering - ICE08 Conference, 2-4 July, Plymouth, UK.

Javadi, A.A., Rezania, M. and Nezhad, M.M. (2007), “A new approach to data-driven modeling in civil engineering”, Proceeding of the 1st International Conference on Digital Communications and Computer Applications, Amman, Jordan, 19-22 March, 2007, 17-24.

Javadi, A.A., Rezania, M. and Nezhad, M.M. (2006), “Evaluation of liquefaction induced lateral displacements using genetic programming”, Comput. Geotech., 33(4-5), 222-233.

Javadi, A.A., Tan, T.P. and Elkassas, A.S.I. (2004a), “An intelligent finite element method”, Proceeding of the 11th International EG-ICE Workshop, Weimar, Germany, 16-25.

Javadi, A.A., Tan, T.P. and Elkassas, A.S.I. (2005), “Intelligent finite element method”, Proceeding of the 3rd MIT Conference on Computational Fluid and Solid Mechanics, Cambridge, Massachusetts, USA.

Javadi, A.A., Tan, T.P., Elkassas, A.S.I. and Zhang, M. (2004b), “Intelligent finite element method: Development of the algorithm”, Proceeding of the 6th World Congress on Computational Mechanics-WCCM VI, Beijing, China.

Javadi, A.A., Tan, T.P. and Zhang, M. (2003), “Neural network for constitutive modeling in finite element analysis”, Comput. Assist. Mech. Eng. Sci., 10, 375-381.

Javadi, A.A., Zhang, M. and Tan, T.P. (2002), “Neural network for constitutive modeling of material in finite element analysis”, Proceedings of the 3rd International Workshop/Euroconference on Trefftz Method, Exeter, UK, 61-62.

Juang, C.H. and Chen, C.J. (1999), “CPT-based liquefaction evaluation using artificial neural networks”, Comput. Aid. Civil Infrastruct. Eng., 14(3), 221-229.

Juang, C.H., Jiang, T. and Christopher, R.A. (2001), “Three-dimensional site characterization: neural network approach”, Geotechnique, 51(9), 799-809.

Jung, S. and Ghaboussi, J. (2006), “Neural network constitutive model for rate-dependent materials”, Comput. Struct., 84(15-16), 955-963.

Lu, M., AbouRizk, S.M. and Hermann, U.H. (2001), “Sensitivity analysis of neural networks in spool fabrication productivity studies”, J. Comput. Civil Eng., ASCE, 15(4), 299-308.

Lu, P. and Rosenbaum, M.S. (2003), “Artificial neural networks and grey systems for the prediction of slope stability”, Nat. Hazards, 30(3), 383-395.

Kawamoto, T., Ichikawa, Y. and Kyoya, T. (1988), “Deformation and fracturing behavior of discontinuous rock mass and damage mechanics theory”, Int. J. Numer. Anal. Met. Geomech., 12(1), 1-30.

Koza, J.R. (1992), Genetic Programming: on the Programming of Computers by Natural Selection, Massachusetts Institute of Technology Press, Cambridge, Massachusetts, USA.

Lade, P.V. and Duncan, J.M. (1975), “Elastoplastic stress-strain theory for cohesionless soil”, J. Geotech. Eng. Division, ASCE, 101(GT10), 1037-1053.

Lefik, M. and Schrefler, B.A. (2003), “Artificial neural network as an incremental non-linear constitutive model for finite element code”, Comput. Method. Appl. Mech. Eng., 192, 3265-3283.

Lippmann, R.P. (1987), “An introduction to computing with neural nets”, IEEE Acoust. Speech Signal Process Mag., 4(2), 4-22.

Logar, J. and Turk, G. (1997), “Neural network as a constitutive model of soil”, Zeitschrift für Angewandte Mathematik und Mechanik, 77(S1), 195-196.

Millar, D.L. and Calderbank, P.A. (1995), “On the investigation of multilayer feedforward neural network model of rock deformability behavior”, Proceeding of The 8th International Congress on Rock Mechanics, Tokyo, Japan, 933-938.

Millar, D.L. and Clarici, E. (1994), “Investigation of back-propagation artificial neural networks in modeling the stress-strain behavior of sandstone rock”, Proceedings of IEEE International Conference on Neural Networks, Piscataway, NJ, 3326-3331.

Najjar, Y.M. and Basheer, I.A. (1996), “Discussion of stress-strain modeling of sands using artificial neural networks”, J. Geotech. Eng. Division, ASCE, 122(11), 949-950.

Najjar, Y.M., Basheer, I.A. and Ali, H.E. (1999), “On the use of neuronets for simulating the stress-strain behavior of soils”, Proceeding of the 7th International Symposium on Numerical Models in Geomechanics-NUMOG VII, Graz, Austria, 657-662.

Owen, D.R.J. and Hinton, E. (1980), Finite Elements in Plasticity: Theory and Practice, Pineridge Press, Swansea.

Penumadu, D. and Chameau, J.L. (1997), “Geomaterial modeling using neural networks”, Artificial Neural Networks for Civil Engineering: Fundamentals and Applications, N. Kartam, I. Flood, and Garrett, eds., ASCE, 160-184.

Penumadu, D. and Zhao, R. (1999), “Triaxial compression behavior of sand and gravel using artificial neural networks”, Comput. Geotech., 24(3), 207-230.

Penumadu, D., Zhao, R. and Frost, D. (2000), “Virtual geotechnical laboratory experiments using a simulator”, Int. J. Numer. Anal. Met. Geomech., 24(5), 439-451.

Press, W.H., Teukolsky, S.A., Vetterling, W.T. and Flannery, B.P. (1992), Numerical recipes, Cambridge University Press, UK.

Rahman, M.S., Wang, J., Deng, W. and Carter, J.P. (2001), “A neural network model for the uplift capacity of suction caissons”, Comput. Geotech., 28(4), 269-287.

Rezania, M. and Javadi, A.A. (2006), “Application of evolutionary programming techniques in geotechnical engineering”, Proceeding of the 6th European Conference on Numerical Methods in Geotechnical Engineering, Schweiger H.F. (ed.), Graz, Austria, 677-682.

Rezania, M. and Javadi, A.A. (2007), “A new genetic programming model for predicting settlement of shallow foundations”, Can. Geotech. J., 44(12), 1462-1473.

Rezania, M. and Javadi, A.A. (2008a), “Predicting settlement of shallow foundations using evolutionary polynomial regression”, Comput. Aid. Civil Infrastruct. Eng., (Under Review).

Rezania, M. and Javadi, A.A. (2008b), “Settlement prediction of shallow foundations; a new approach”, Proceeding of the 2nd British Geotechnical Association International Conference on Foundations, Dundee, Scotland, Paper No. 539.

Rezania, M., Javadi, A.A. and Giustolisi, O. (2008a), “An evolutionary-based data mining technique for assessment of civil engineering systems”, J. Eng. Computat., 25(6), 500-517.

Rezania, M., Javadi, A.A. and Giustolisi, O. (2008b), “Evaluation of liquefaction potential based on CPT results using evolutionary polynomial regression”, Comput. Geotech., (Under Review).

Roscoe, K.H. and Schofield, A.N. (1963), “Mechanical behavior of an idealized ‘wet’ clay”, Proceeding of the 2nd European Conference on Soil Mechanics and Foundation Engineering, Wiesbaden, Germany, 47-54.

Rumelhart, D.E., Hinton, G.E. and Williams, R.J. (1986), “Learning representations by back-propagating errors”, Nature, 323, 533-536.

Rumelhart, D.E., Hinton, G.E. and Williams, R.J. (1994), Learning Internal Representation by Error Propagation in Parallel Distributed Processing, Massachusetts Institute of Technology Press, Cambridge, Massachusetts, USA.

Shin, H.S. (2001), “Neural network based constitutive models for finite element analysis”, Ph.D. dissertation, University of Wales Swansea, UK.

Shin, H.S. and Pande, G.N. (2000), “On self-learning finite element code based on monitored response of structures”, Comput. Geotech., 27, 161-178.

Shin, H.S. and Pande, G.N. (2001), “Intelligent finite elements”, Proceeding of Asian-Pacific Conference for Computational Mechanics-APCOM 01, Sydney, Australia, 1301-1310.

Shin, H.S. and Pande, G.N. (2002), “Enhancement of data for training neural network based constitutive models for geomaterials”, Proceeding of The 8th International Symposium on Numerical Models in Geomechanics-NUMOG VIII, Rome, Italy, 141-146.

Sidarta, D.E. and Ghaboussi, J. (1998), “Constitutive modeling of geomaterials from non-uniform material tests”, Comput. Geotech., 22(1), 53-71.

Toll, D.G. (1996), “Artificial Intelligence Applications in Geotechnical Engineering”, Electronic Journal of Geotechnical Engineering.

Wu, H.C., Zhang, X.B., Bao, T. and Al-Jibouri, S.H. (2001), “Modeling the stress-strain relation for granite using finite element-neural network hybrid algorithms”, Proceeding of the 10th International Conference on Computer Methods and Advances in Geomechanics, Tucson, Arizona, 241-245.

Youd, T.L., Hansen, C.M. and Bartlett, S.F. (2002), “Revised multilinear regression equations for prediction of lateral spread displacement”, J. Geotech. Geoenviron. Eng., ASCE, 128(12), 1007-1017.

Zhu, J.-H., Zaman, M.M. and Anderson, S.A. (1998a), “Modeling of soil behavior with a recurrent neural network”, Can. Geotech. J., 35(5), 858-872.

Zhu, J.-H., Zaman, M.M. and Anderson, S.A. (1998b), “Modeling of shearing behavior of a residual soil with recurrent neural network”, Int. J. Numer. Anal. Met. Geomech., 22(8), 671-687.
