TECHNIQUES
Lemma
In psycholinguistics, a lemma is an abstract form of a word that is used in speech production. In the most widely accepted psycholinguistic models, speech production has several stages, and the lemma is retrieved after the word has been selected mentally but before any information about its sounds has been accessed (and thus before the word can be pronounced). It therefore contains only information about meaning and about the relation of the word to others in the sentence.
Lemmatisation
Lemmatisation is the process of determining the lemma for a given word. Since the process involves determining the part of speech of a word in a sentence, it requires knowledge of the grammar of a language, and it can therefore be a great deal of work to implement a lemmatiser for a new language.
Lemmatisation is closely related to stemming. The difference is that a stemmer operates on a single word without knowledge of the context, and therefore cannot discriminate between words which have different meanings depending on part of speech. However, stemmers are typically easier to implement and run faster, and the reduced accuracy may not matter for some applications.
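The difference can be sketched with a toy example. The suffix list, the lexicon and the function names below are hypothetical and deliberately tiny; real stemmers and lemmatisers rely on much richer rules and dictionaries. The sketch only shows that the stemmer gives one answer per word, while the lemmatiser's answer depends on the part of speech.

```python
# Toy illustration only: not a real stemming or lemmatisation algorithm.

SUFFIXES = ("ing", "ed", "s")  # hypothetical, deliberately tiny suffix list


def toy_stem(word: str) -> str:
    """Strip the first matching suffix; no sentence context is consulted."""
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Hypothetical lexicon keyed by (surface form, part of speech).
LEMMA_LEXICON = {
    ("meeting", "NOUN"): "meeting",
    ("meeting", "VERB"): "meet",
    ("better", "ADJ"): "good",
}


def toy_lemmatise(word: str, pos: str) -> str:
    """Look up the lemma for a word and its part of speech; fall back to the word."""
    return LEMMA_LEXICON.get((word, pos), word)


stem = toy_stem("meeting")                      # same result whatever the context
noun_lemma = toy_lemmatise("meeting", "NOUN")   # "meeting" as in "a long meeting"
verb_lemma = toy_lemmatise("meeting", "VERB")   # "meeting" as in "we are meeting"
```

Here the stemmer reduces both uses of "meeting" to the same string, whereas the lemmatiser keeps the noun intact and maps the verb form to "meet". It can do so only because the part of speech is supplied, which in practice requires grammatical analysis of the sentence.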
Life Rhythm
Life Rhythm is the experience of the variation in the nature, intensity and duration of the many different activities we do in a certain period of time (day, week, year), be they private or work-related.
It is the emotional effect of the fragmentation of our day into an ever-increasing number of time units in which we carry out several diverse activities. Often this variation in activities shows a pattern, and the repetition of that pattern feels like a rhythm.
(07 dec 2008)
Linear Regression Model
The linear regression model analyzes the relationship between the response (or dependent) variable and a set of independent (or predictor) variables. This relationship is expressed as an equation that predicts the response variable as a linear function of the parameters. These parameters are adjusted so that a measure of fit is optimized. Much of the effort in model fitting is focused on minimizing the size of the residuals, as well as ensuring that they are randomly distributed with respect to the model predictions.
The goal of regression is to select the parameters of the model so as to minimize the sum of the squared residuals. This is referred to as ordinary least squares (OLS) estimation and, under the Gauss-Markov assumptions, results in best linear unbiased estimates (BLUE) of the parameters.
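As a minimal sketch of OLS estimation, the case of a single predictor has a simple closed-form solution. The function names and the sample data below are illustrative assumptions, not part of any particular statistics package.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals
    for the one-predictor model y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Measure of fit: the sum of squared differences between observed and predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Made-up sample data, roughly following y = 2x with some noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_ols(xs, ys)
best_fit = sum_squared_residuals(xs, ys, intercept, slope)
```

Any other choice of intercept and slope gives a sum of squared residuals at least as large as `best_fit` on the same data, which is what "least squares" means. With several predictors the same idea applies, but the solution is usually computed with matrix methods.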
Logistic Regression
In a classification setting, outcome probabilities can be assigned to observations with a logistic model. The model transforms the probability of the binary dependent variable into an unbounded continuous variable (the log-odds, or logit) and estimates a regular multivariate linear model on that scale (see Allison's Logistic Regression for more information on the theory of logistic regression).
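The transformation can be sketched as follows. The function names, coefficients and feature values are hypothetical and chosen only for illustration; estimating the coefficients from data (typically by maximum likelihood) is not shown.

```python
import math


def logit(p: float) -> float:
    """Map a probability in (0, 1) to the unbounded log-odds scale."""
    return math.log(p / (1.0 - p))


def inverse_logit(z: float) -> float:
    """Map a value on the log-odds scale back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, intercept, coefficients):
    """Evaluate the linear model on the log-odds scale, then convert to a probability."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return inverse_logit(z)


# Hypothetical coefficients: an observation whose linear predictor is 0
# sits at even odds, i.e. a predicted probability of 0.5.
p = predict_probability([1.0, 2.0], intercept=0.0, coefficients=[0.5, -0.25])
```

Because the linear part of the model lives on the log-odds scale, it can take any real value, while the predicted probability always stays strictly between 0 and 1.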
The Wald and likelihood-ratio tests are used to test the statistical significance of each coefficient b in the model (analogous to the t tests used in OLS regression; see above). A test assessing the goodness of fit of a classification model is the Hosmer and Lemeshow test.









