### Analysing the Stroop effect

In a Stroop task, participants are presented with a list of words, with each word displayed in a color of ink. The participant’s task is to say out loud the color of the ink in which the word is printed. The task has two conditions: a congruent words condition, and an incongruent words condition. In the congruent words condition, the words being displayed are color words whose names match the colors in which they are printed: for example RED, BLUE. In the incongruent words condition, the words displayed are color words whose names do not match the colors in which they are printed: for example PURPLE, ORANGE. In each case, we measure the time it takes to name the ink colors in equally-sized lists. Each participant will go through and record a time from each condition.

### What is the independent variable? What is the dependent variable?

The independent variable is the variable that is controlled in an experiment. In this problem it is whether the words are congruent or incongruent.

The dependent variable is the variable being measured. In this problem it is the time taken to name the ink colors of the congruent and incongruent word lists (called the reading time below).

### What is an appropriate set of hypotheses for this task? What kind of statistical test do you expect to perform?

The hypotheses concern whether there is a difference between the reading times of congruent and incongruent words. If there is a difference, we can infer that the Stroop effect exists.

The null hypothesis is that there is no difference between the reading time of congruent words and incongruent words. Mathematically,

Ho: u_congruent - u_incongruent = 0

The alternative hypothesis is that there is a difference between the reading time of congruent words and incongruent words, i.e. the Stroop effect does exist. Mathematically,

Ha: u_congruent - u_incongruent != 0

where Ho is the null hypothesis, Ha is the alternative hypothesis, u_congruent is the population mean of the reading time of congruent words, u_incongruent is the population mean of the reading time of incongruent words, '=' is the sign for equal to, and '!=' is the sign for not equal to.

If we get enough evidence to reject the null hypothesis, we will support our alternative hypothesis and infer that there is a difference between the reading time of congruent and incongruent words, i.e. the Stroop effect exists.

I'll perform a two-sided t-test for paired samples. I chose a t-test because the population standard deviation is unknown. The other reason is that our two sets of observations are not independent but paired: the same person is recorded twice (once for congruent words and once for incongruent words).
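For reference, a paired t-test reduces to a one-sample test on the per-participant differences. The following is the standard textbook formulation; the symbols $d_i$, $\bar{d}$ and $s_d$ are notation introduced here for illustration and do not appear in the original analysis.

$$
d_i = x_{i,\text{congruent}} - x_{i,\text{incongruent}}, \qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2}
$$

$$
t = \frac{\bar{d} - 0}{s_d / \sqrt{n}}, \qquad \text{degrees of freedom} = n - 1
$$

Under Ho the statistic follows a t-distribution with n - 1 degrees of freedom, and Ho is rejected at significance level alpha when |t| exceeds the two-sided critical value.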

```
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
```

```
df=pd.read_csv("stroopdata.csv")
```

```
df.head()
```

```
df.describe()
```

```
array=np.array(df)
```

The visualization below shows each participant's reading time in the two conditions as a stacked bar.

```
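N = len(array)       # number of participants
con = array[:,0]     # congruent times (first column)
incon = array[:,1]   # incongruent times (second column)
ind = np.arange(N)   # one bar position per participant
width = 0.5
p1 = plt.bar(ind,con,width)
p2 = plt.bar(ind,incon,width,bottom=con)  # stacked on top of the congruent bar
plt.xlabel('Participant')
plt.ylabel('Reading Time')
plt.title('Reading time for Congruent and Incongruent')
plt.legend((p1[0], p2[0]), ('Congruent', 'Incongruent'))
plt.show()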
```

We can clearly see from the visualization that the time taken to read incongruent words is higher than the time taken to read congruent words.
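The visual impression can also be checked numerically by counting how many participants were slower in the incongruent condition. The snippet below is a minimal sketch on a small hypothetical array (the values are made up for illustration and are not taken from stroopdata.csv); the column layout mirrors the `array` used above.

```python
import numpy as np

# Hypothetical reading times in seconds, NOT the values from stroopdata.csv.
# Column 0 is congruent, column 1 is incongruent, one row per participant.
times = np.array([
    [12.1, 19.3],
    [16.8, 18.2],
    [9.6, 21.0],
    [14.2, 13.9],
    [11.5, 17.4],
])

con = times[:, 0]
incon = times[:, 1]

# Boolean mask: True where the participant took longer on incongruent words
slower_incongruent = incon > con
n_slower = int(slower_incongruent.sum())

print(f"{n_slower} of {len(times)} participants were slower on incongruent words")
print("Mean difference (congruent - incongruent):", (con - incon).mean())
```

Replacing `times` with the real `array` gives the same summary for the actual data.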

Now we perform the statistical test: we compute the t-statistic and the critical value, and check whether we can reject the null hypothesis.

```
diff=array[:,0]-array[:,1] #Difference vector of two reading times
diff
```

```
u_diff=diff.mean() #Mean of the difference vector
u_diff
```

```
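#Standard deviation of the differences. Note: NumPy defaults to ddof=0 (population
#formula); diff.std(ddof=1) gives the sample standard deviation usually used in a t-test
std_diff=diff.std()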
std_diff
```

```
n=len(diff)
n
```

```
se_diff=std_diff/np.sqrt(n) #Standard Error
se_diff
```

```
t=(u_diff-0)/se_diff #Calculated t-statistic
t
```

```
dof=n-1 #Degrees of Freedom
dof
```

```
critical_t= stats.t.ppf(1-0.025, dof) #Two-sided critical value given alpha=0.05
critical_t
```
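The step-by-step calculation above can be collected into a single function and cross-checked against SciPy's built-in paired test. This is a sketch under stated assumptions: `paired_t_test` is a hypothetical helper name, the data are synthetic (generated with a fixed seed, not read from stroopdata.csv), and the helper uses the sample standard deviation (`ddof=1`), which is what `scipy.stats.ttest_rel` uses. Because the cells above call `diff.std()` with NumPy's default `ddof=0`, their t-statistic will differ slightly from what this function or `ttest_rel` returns on the same data.

```python
import numpy as np
from scipy import stats


def paired_t_test(first, second, alpha=0.05):
    """Two-sided paired t-test computed by hand (hypothetical helper)."""
    diff = np.asarray(first, dtype=float) - np.asarray(second, dtype=float)
    n = len(diff)
    mean_diff = diff.mean()
    std_diff = diff.std(ddof=1)            # sample standard deviation
    se_diff = std_diff / np.sqrt(n)        # standard error of the mean difference
    t_stat = (mean_diff - 0) / se_diff     # Ho: mean difference is 0
    dof = n - 1
    critical_t = stats.t.ppf(1 - alpha / 2, dof)  # two-sided critical value
    p_value = 2 * stats.t.sf(abs(t_stat), dof)    # two-sided p-value
    return t_stat, dof, critical_t, p_value


# Synthetic data for illustration only (assumed means and spreads)
rng = np.random.default_rng(0)
congruent = rng.normal(14, 3, size=24)
incongruent = congruent + rng.normal(8, 5, size=24)

t_stat, dof, critical_t, p_value = paired_t_test(congruent, incongruent)
reference = stats.ttest_rel(congruent, incongruent)

print("manual t:", t_stat, " scipy t:", reference.statistic)
print("manual p:", p_value, " scipy p:", reference.pvalue)
print("reject Ho:", abs(t_stat) > critical_t)
```

On the real data, `paired_t_test(array[:,0], array[:,1])` would play the same role as the individual cells above.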

- Mean of differences: u_diff = -7.964
- Standard deviation of differences: std_diff = 4.762
- Standard error: se_diff = 0.9721
- Degrees of freedom: dof = n-1 = 23
- Calculated t-statistic: t = -8.193
- Critical value for alpha = 0.05: +-2.0686
- p-value: nearly 0.000001

**Since the absolute value of our t-statistic (8.193) is greater than the critical value (2.0686), it falls in the critical region and we reject our null hypothesis. The p-value is also much less than 0.05, hence there is strong evidence in support of the alternative hypothesis.**
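The p-value is quoted above without the code that produces it. One way to obtain it, along with a 95% confidence interval for the mean difference, is sketched below. The inputs are the summary figures reported above (mean difference, standard error, degrees of freedom), typed in by hand rather than recomputed from the data, so the output is only as precise as those rounded values.

```python
from scipy import stats

# Summary values as reported in this post (rounded)
u_diff = -7.964    # mean of the differences (congruent - incongruent)
se_diff = 0.9721   # standard error of the mean difference
dof = 23           # degrees of freedom, n - 1

t = (u_diff - 0) / se_diff                 # t-statistic under Ho
p_value = 2 * stats.t.sf(abs(t), dof)      # two-sided p-value
critical_t = stats.t.ppf(1 - 0.025, dof)   # two-sided critical value, alpha=0.05

# 95% confidence interval for the mean difference
ci_low = u_diff - critical_t * se_diff
ci_high = u_diff + critical_t * se_diff

print("t-statistic:", t)
print("p-value:", p_value)
print("95% CI for mean difference:", (ci_low, ci_high))
```

An interval that excludes 0 leads to the same decision as comparing |t| with the critical value.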


### Tanishk Sachdeva

- Hypothesis Testing using Stroop Effect - August 3, 2019