Friedman’s test with tied data

I completely sympathise with students sometimes when they say they hate statistics. I really do. A colleague emailed me the other day in response to a student query regarding difficulty in getting their hand-calculation for a Friedman test to match that produced by SPSS (shout-out to this eagle-eyed student!!). I smiled to myself, thinking “silly SPSS, giving the wrong results again!”. Then, in a flurry of smugness, I fired up R, entered the data my colleague had provided, and found to my utter surprise that R, too, was producing the “incorrect” result. I checked the hand calculations in Excel and they were all fine. So how come R and SPSS were giving different (apparently incorrect) results?

Here are the data (including the ranks): The equation used in the hand-calculations was where n is the number of observations per condition (12), k is the number of conditions (3), and Ri is the sum of the ith column (condition). It’s a gnarly-looking equation, and students gulp when it’s presented.

However, it turns out this equation is only correct if there are no ties in your data. My colleague informed me this was not mentioned in the book he had access to, and I have since checked 3 books that I have at home (haven’t checked the ones in my office, yet), and all of them give the above formula and make no mention of a correction-for-ties.

The interweb didn’t help much, either, in trying to work out what was going wrong, but I stumbled across a book chapter on Google books (link is here) which gives a corrected formula. To help others, it’s repeated below: Wikipedia to the rescue (!)

It turns out that Wikipedia has an excellent page for Friedman’s test which provides yet another (set of) formula(s). I’m not quite sure the original source that Wikipedia got this equation from, but it turns out that this is a general formula that works for tied data and non-tied data alike. So, it makes sense that R might be using this formula (it would be more economical than having to continually test whether it should use the first or the second equation reported above).

What have I learned?

• Students are very observant. Had this student not contacted my colleague (and he not subsequently contact me), this error would have crept into next year, too. Genuine thanks to them for coming forward and questioning my lecture slides.
• Statistics is hard. How can we expect students to “get” all of this when I didn’t even know there was a mistake here.
• Most textbooks discussing this test don’t use tied data, and so use the first equation above. This does not work with tied data. You can use the second equation, but best use the generalised equation on Wikipedia.
• Wikipedia is (sometimes, at least!) accurate!
• R is still ace.

For those interested, here is the R code for performing the Friedman equations from Wikipedia. This is just to show what it’s doing; you can just see the R help pages for ?friedman.test for a shortcut to computing the test.

rankData <-
matrix(c(2.5, 2.5, 1,
3, 1.5, 1.5,
1.5, 3, 1.5,
2, 3, 1,
1.5, 3, 1.5,
3, 1.5, 1.5,
2, 2, 2,
3, 1.5, 1.5,
2.5, 1, 2.5,
3, 1.5, 1.5,
2, 3, 1,
2.5, 2.5, 1),
nrow = 12,
byrow = TRUE,
dimnames = list(1:12, c("None", "Classical", "Dance")))

rbar_none <- (1/12) * sum(rankData[, 1])
rbar_classical <- (1/12) * sum(rankData[, 2])
rbar_dance <- (1/12) * sum(rankData[, 3])

rbar <- (1/(3*12)) * sum(rankData)

ssT <- 12* sum(((rbar_none - rbar)^2), ((rbar_classical - rbar) ^ 2), ((rbar_dance - rbar) ^ 2))
ssE <- (1/(12*2)) * sum(((rankData - rbar) ^ 2))

Q <- ssT/ssE

Response Time Modelling – How Many Trials?

Over the past year I have become increasingly interested in using models of simple decision making processes, as measured by two-choice response time tasks. A simple 2-choice task might ask participants to judge whether a presented number is odd or even, for example. One popular model of 2-choice RT is the Ratcliff Diffusion Model. A simplified example of this model is presented below. The idea behind the model is that, once presented, the stimulus elicits an evidence-accumulation process. That is, the cognitive system begins sampling the stimulus, and evidence begins to accumulate in a noisy manner towards one of two response boundaries. Once a boundary has been reached, that response is executed. One boundary reflects the correct response, and the other reflects the incorrect response. The average rate of evidence accumulation (which is a drift diffusion process) is reflected in a parameter drift rate. The amount of evidence that needs to accumulate before a response is made is governed by the distance between the two boundaries (boundary separation parameter). The height of these boundaries are symmetrical around a starting point of the drift diffusion process, reflected in a parameter bias. The bias is typically midway between the two boundaries, as (for example) 50% of trials will present odd stimuli, so the participant will have no reason to prefer one response over the other. The final parameter is called non-decision time, which reflects the time it takes to encode the stimulus and execute the motor response. The model is very complex mathematically and computationally, but implementing the model has been made easier over recent years due to the release of several software packages. Among these are E-J Wagenmakers and colleagues’ EZ diffusion model, Vandekerckhove & Tuerlinckx’s DMAT toolbox for MatLab, and Voss and Voss’ Fast-DM. For an excellent overview of the diffusion model, as well as available packages, see this paper by E-J Wagenmakers.

However, a new package (the “RWiener” package) has just been published by Wabersich and Vandekerckhove (see paper here) for R-Statistics. R is my favourite thing EVER (seriously), so I was thrilled to see an R package that can model response times.

How many trials?

The package is very straightforward; it can generate synthetic data with known parameters as well as find best-fitting parameters from data you might have, which is superb. One question I was unsure of, though, is how accurate this parameter estimation routine is for varying number of trials in an experiment. That is, how many trials do I need as a minimum in my experiment for this package to produce accurate parameter estimation? With human data, this is a very difficult (impossible?) question to answer, but is tractable when fitting the model to artificially-generated data; that is, because we know which parameters gave rise to the data (because we tell the computer which to use), we can tell whether the estimated parameters are accurate (compare estimated parameters to “known” parameters).

I set up a simulation where I addressed the “how many trials?” question. The structure is straightforward:

• Decide on some “known” parameters
• Generate N trials with the known parameter for X number of simulated “experiments”
• Fit the model to each simulated experiment
• Compare estimated parameters to known parameters
• Repeat for various values of N

For the simulations reported, I simulated experiments which had Ns of 25, 50, 100, 250, 500, and 1000 trials. This spans from the ridiculously small experiments (N=25) to the ridiculously large experiments (N=1000). For each N, I conducted 100 simulated experiments. Then, I produced boxplots for each parameter which allowed me to compare the estimated parameters to the known parameters (which is shown as a horizontal line on each plot). Ns which recover the known parameters well will have boxplots which cluster closely to the horizontal line for each parameter. Conclusion

The fitting routine is able to produce reasonably accurate parameters estimates with as little as 250 trials; beyond this, there is not much advantage to increasing trial numbers. Note that 100 trials is still OK, but more will produce better estimates. There is no systematic bias in these estimates with low trial numbers except for the non-decision parameter; estimates seemed to be too high with lower trial numbers, and this increased bias decreased as trial numbers are boosted.

The RWiener package is a welcome addition for fitting models to response time data, producing reasonably accurate parameter estimates with as little as ~250 trials.

R Code for above simulation

The code for the simulation is below. (Note, this takes quite a while to run, as my code is likely very un-elegant.)

rm(list = ls())

library(RWiener)

#how many trials in each sample condition?
nTrials <- c(25, 50, 100, 250, 500, 1000)
#how many simulations for each sample?
nSims <- 100

#known parameters
parms <- c(2.00, #boundary separation
0.30, #non-decision time
0.50, #initial bias
0.50) #drift rate

#what are the starting parmaeters for the parameter minimisation routine?
startParms <- c(1, 0.1, 0.1, 1) #boundary, non-decision, initial bias, & drift

#start simulations here
for(i in 1:length(nTrials)){ #for each number of trials (25, 50, etc...)

#get the current number of trials to generate
tempTrials <- nTrials[i]

#generate empty matrix to store results
tempMatrix <- matrix(0, nrow=nSims, ncol=length(parms))

for(j in 1:nSims){ #do a loop over total number of simulations required

#generate synthetic data
tempData <- rwiener(n=tempTrials, alpha=parms, tau=parms, beta=parms,
delta=parms)
#now fit the model to this synthetic data
findParms <- optim(startParms, wiener_deviance, dat=tempData, method="Nelder-Mead")

#store the best-fitting parameters
tempMatrix[j, ] <- findParms\$par
}

#store the matrix of results as a unique matrix
assign(paste("data", tempTrials, sep = ""), tempMatrix)
}

#do the plotting

par(mfrow=c(2,2)) #change layout to 2x2

#plot boundary
boxplot(data25[, 1], data50[, 1], data100[, 1], data250[, 1], data500[, 1], data1000[, 1],
names = nTrials, xlab = "Number of Trials", ylab = "Parameter estimate", main = "Boundary Separation")
abline(a=parms, b=0)

#plot non-decision time
boxplot(data25[, 2], data50[, 2], data100[, 2], data250[, 2], data500[, 2], data1000[, 2],
names = nTrials, xlab = "Number of Trials", ylab = "Parameter estimate", main = "Non-Decision Time")
abline(a=parms, b=0)

#plot initial bias
boxplot(data25[, 3], data50[, 3], data100[, 3], data250[, 3], data500[, 3], data1000[, 3],
names = nTrials, xlab = "Number of Trials", ylab = "Parameter estimate", main = "Initial Bias")
abline(a=parms, b=0)

#plot drift rate
boxplot(data25[, 4], data50[, 4], data100[, 4], data250[, 4], data500[, 4], data1000[, 4],
names = nTrials, xlab = "Number of Trials", ylab = "Parameter estimate", main = "Drift Rate")
abline(a=parms, b=0)

New blog – so what?

I was hesitant to start a personal blog about my academic interests. There are so many excellent psychology-related bloggers out there (Rolf Zwaan’s and Dorothy Bishop’s are among my faves); who am I to say anything of interest to potential readers when they could be reading those blogs? But then I thought, why not just write for yourself, Jim? And so here I am.

I already blog about research methods in psychology at http://www.researchutopia.wordpress.com. This is primarily aimed at undergraduate students who are sailing the turbulent seas of “god we hate statistics”. My rate of new posts is somewhat low, often hindered by the fact I find myself wanting to write about something that is either a) beyond the scope of that blog, or b)  too esoteric for a statistics blog aimed at undergraduates. Therefore, this blog will serve as my outlet for discussing things that don’t quite fit into my other one. Among these topics will be general interest items about academic papers I’ve found of interest, interesting questions I’m working on in my own research, trends in cognitive psychology, statistical issues, data analysis tidbits, bits of computer code (primarily R; for those unaware, I’m a bit of an R-addict), and so on.

I’m hoping that the blog ends up more coherent than it sounds like it will be at this stage. As all posts will be work-related (and when you’re an early-career academic, isn’t everything work-related?), the posts will reflect my path as I wander through academic life.

Every journey begins with a single step…