Understanding Statistics

Why Statistics?

When you’re working with MindSonar, it can be important to have a basic understanding of statistics. Why? First of all, it’s good to know as much as possible about a system you’re working with. For instance, if you want to construct a really solid benchmark profile, it is useful to understand what correlation is. Or when you see a standard deviation in a team profile, it’s good to understand precisely what that means.

But also, answering questions about things like ‘validity’ or ‘Cronbach’s Alpha’ can be important when you are discussing a large project with a potential client – or with the experts they bring in. And it’s not all that difficult, really. Especially since I scoured YouTube and found you some well-designed, instructive movies.

On this page we’ll cover 4 statistical themes:

  1. Reliability and Validity
  2. Standard Deviation
  3. Correlation
  4. Cronbach’s Alpha

 

1. Reliability and Validity

The two statistical terms you will come across most often are ‘Reliability’ and ‘Validity’. For some reason those two are usually presented as a duo. Maybe because these are the two most basic concepts to evaluate a test with. Reliability asks the question: “Are the test scores consistent?” The principle is that if you are measuring the same thing repeatedly, you should get the same result each time. Validity asks a different question: “Does the test measure what it is supposed to measure?” With validity the basic idea is that you cannot measure temperature with a yardstick; you need a thermometer.

Reliability and validity can vary independently. All combinations are possible: reliable but not valid, valid but not reliable, neither reliable nor valid and finally the holy grail of test psychology: both reliable and valid. This graphic illustrates these four options:

 

In this first video, Donna Gregory gives you a very quick and easy overview of reliability and validity. The basics are really that simple!

 

In the next video Andrew Conway, Senior Lecturer at the University of South Carolina, explains reliability and validity in some more depth. He also describes how reliability and validity may be measured in different ways.

 

The Reliability and Validity of MindSonar

In the movie shown above, Dr. Conway mentions three ways of measuring reliability and four ways to determine validity:

Reliability

  1. Test / Re-test
  2. Parallel Tests
  3. Inter-item Estimates

Validity

  1. Content Validity
  2. Convergent Validity
  3. Divergent Validity
  4. Nomological Validity

How do these 7 concepts relate to MindSonar?

Test/Re-test Reliability
To determine the test/re-test reliability, the test is given to the same person twice, and – in the case of MindSonar – the test needs to be about the same context as well. The smaller the difference between the two scores, the higher the test/re-test reliability. So far, this kind of research has not been done on MindSonar.
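To make the idea concrete, here is a minimal sketch in Python of how two sittings could be compared. The scores are invented for illustration (no such data exists for MindSonar, as noted above), and using the Pearson correlation alongside the average difference is my assumption about how such a study would typically be summarised.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of five respondents on one Meta Program,
# measured twice for the same context.
first_sitting = [7.0, 4.5, 6.0, 3.0, 8.5]
second_sitting = [6.5, 5.0, 6.0, 3.5, 8.0]

# Average absolute difference between the two sittings (smaller is better).
mean_abs_diff = sum(
    abs(a - b) for a, b in zip(first_sitting, second_sitting)
) / len(first_sitting)

# Correlation between the two sittings (closer to 1 is better).
retest_r = pearson(first_sitting, second_sitting)

print(f"Average difference: {mean_abs_diff:.2f}")
print(f"Test/re-test correlation: {retest_r:.3f}")
```

With these made-up numbers the scores barely move between sittings, so the average difference is small and the correlation is close to 1.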

Parallel Test Reliability
To determine parallel test reliability, MindSonar would have to be compared to other tests that have already been proven to measure the same concept reliably. In the example given in the movie, the temperature measured by an infrared thermometer is correlated to the temperature measured by a normal thermometer. Unfortunately, in the case of MindSonar, there are no parallel tests that have been proven to measure the same concepts. Therefore, determining parallel test reliability is not possible for MindSonar.

Inter-item Estimates
Which leaves us with the third category of reliability measurement: inter-item estimates. The most commonly used inter-item estimate is called Cronbach’s Alpha. This is basically a measure of the internal consistency between the items. Further down on this page Cronbach’s Alpha is explained in detail.

From the perspective of reliability, MindSonar consists of 14 different tests; one for each Meta Program distinction and one for the Graves categories.

Drs. Lisette de Ruyter calculated Cronbach’s Alpha in 2007, based on the results of 1600 respondents. She identified a number of items (questions) that lowered Cronbach’s Alpha or contributed only minimally. These items were removed, with the exception of the ones that seemed to cover an important aspect of the Meta Program in question. Although these items did not increase Cronbach’s Alpha much, they did seem to contribute to the depth of the measurement. After this ‘cleaning operation’, Drs. De Ruyter found the values for Cronbach’s Alpha shown in the next table. In the social sciences, an Alpha of .70 is considered acceptable and an Alpha of .80 is considered quite reliable. As you can see in the table, the average values per Meta Program are all at least .70 and most are either close to or higher than .80. So in terms of inter-item estimates, we may conclude that MindSonar is a reliable test, or actually, that MindSonar is a collection of 13 reliable tests. No research has been done yet on the 14th test, measuring the Graves categories.

Meta Program                               Cronbach’s Alpha               Average Cronbach’s Alpha per Meta Program

Proactive / Reactive                       α = 0.767                      α = 0.767
Towards / Away from                        α = 0.843                      α = 0.843
Internal reference / External reference    α = 0.770                      α = 0.770
Options / Procedures                       α = 0.766                      α = 0.766
Matching / Mismatching                     α = 0.794                      α = 0.794
Internal locus / External locus            α = 0.706                      α = 0.706
Global / Specific                          α = 0.835                      α = 0.835
Maintenance / Development / Change         α = 0.736 / 0.608 / 0.757      α = 0.700
People / Activity / Information            α = 0.823 / 0.750 / 0.816      α = 0.796
Concept / Structure / Use                  α = 0.694 / 0.718 / 0.799      α = 0.737
Together / Proximity / Solo                α = 0.760 / 0.545 / 0.860      α = 0.722
Past / Present / Future                    α = 0.820 / 0.809 / 0.827      α = 0.819
Visual / Auditory / Kinesthetic            α = 0.736 / 0.722 / 0.768      α = 0.742
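For readers who like to check the arithmetic: the average column for the three-way Meta Program sets appears to be the plain mean of the three sub-scale alphas. The short Python sketch below recomputes those averages from the numbers in the table and lists which individual sub-scales fall under the .70 mark. The data structure and variable names are mine, purely for illustration.

```python
# Alphas copied from the table above, for the three-way Meta Program sets only.
# Each entry: sub-scale names, their alphas, and the reported average.
three_way_sets = [
    (("Maintenance", "Development", "Change"), (0.736, 0.608, 0.757), 0.700),
    (("People", "Activity", "Information"), (0.823, 0.750, 0.816), 0.796),
    (("Concept", "Structure", "Use"), (0.694, 0.718, 0.799), 0.737),
    (("Together", "Proximity", "Solo"), (0.760, 0.545, 0.860), 0.722),
    (("Past", "Present", "Future"), (0.820, 0.809, 0.827), 0.819),
    (("Visual", "Auditory", "Kinesthetic"), (0.736, 0.722, 0.768), 0.742),
]

ACCEPTABLE = 0.70  # the 'acceptable' level mentioned in the text

averages_match = True
below_acceptable = []
for names, alphas, reported in three_way_sets:
    recomputed = sum(alphas) / len(alphas)
    # Allow for rounding to three decimals in the published table.
    if abs(recomputed - reported) >= 0.0005:
        averages_match = False
    below_acceptable += [(n, a) for n, a in zip(names, alphas) if a < ACCEPTABLE]
    print(f"{' / '.join(names)}: mean alpha = {recomputed:.3f} (table: {reported:.3f})")

print("All averages match the table:", averages_match)
print("Sub-scales below .70:", below_acceptable)
```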


Content Validity
Content validity basically asks: What does common sense tell us about how well these questions cover the concept you want to measure? In 2004 Drs. Jean Nijskens did an in-depth study into the content validity of MindSonar. Ten respondents filled out the test and during this process Drs. Nijskens asked each respondent, after each question, why they answered the way they did. He then evaluated whether or not these considerations were representative of the Meta Program being measured.

To give an example, there was one question that showed a photograph of a woman in a white dress, with two alternative texts. This item was intended to measure Matching versus Mismatching. In the one text the woman was thinking “It is great that I’m wearing my nice white dress in this photograph” (Matching), and in the other text she was thinking “It is a pity that I am not wearing my white earrings in this photograph” (Mismatching). The question was: “Which woman thinks most like you think in the context of [here the context the respondent had defined was filled in]?”. One respondent chose the dress alternative (“It is great that I’m wearing my nice white dress in this photograph”). When Drs. Nijskens asked her why she had made this choice, she said “Because your main apparel (the dress) is always more important than the accessories (the earrings)!”. Of course, this consideration had little or nothing to do with the Meta Program distinction we wanted to measure (Matching versus Mismatching). If an item evoked too many unintended associations like this, the item was either modified or replaced. The new items and the modified items were then presented to respondents again, to ascertain that they were now triggering the desired associations.

Based on this research, some fundamental questions were raised about the way we had operationalised the Meta Programs (i.e. the questions with which we were measuring the Meta Programs). One complete category of items (association questions) was removed altogether, because it was found that too many respondents could not stay focused on the desired context when answering these questions.

Convergent Validity
Determining convergent validity would mean for MindSonar: finding other tests that have already been proven to measure Meta Programs reliably, and then calculating the correlation between those tests and MindSonar. Unfortunately, at this time there are no other tests for Meta Programs that have the required statistical qualities. It is therefore not yet possible to establish convergent validity in this sense.

There is, however, another sense in which some convergent validity has been shown for MindSonar. I am referring here to benchmark profiles. Some benchmark profiles have been shown to accurately predict some other criterion measure. Benchmark profiles have been shown to be able to predict with clinical significance:

  • the number of cars sold by car salespeople
  • the evaluation of candidate horse-riding jury members by a selection committee
  • the effectiveness of city law enforcement agents
  • the financial results obtained by debt collectors

Divergent Validity
No specific divergent validity has been demonstrated for MindSonar.

Nomological Validity
In the case of MindSonar, nomological validity means: does the test fit in with the general knowledge about Meta Programs? This type of validity was addressed by Drs. Jean Nijskens in 2005. All questions were shown to a group of 60 experts. The experts were NLP Master Practitioners who had been trained in recognizing Meta Programs. For each question they were asked which Meta Program distinction they thought we wanted to measure with that particular question. For each question the percentage of correct identification by the experts was calculated. If a question was identified correctly by less than 50% of the experts, it was considered to be an inadequate operationalisation of the underlying construct (the Meta Program). All these questions were replaced or modified. The new or modified questions were then presented to the experts again, until they achieved at least 50% adequate identification.
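The 50% rule described above is simple enough to sketch in a few lines of Python. The question labels and counts below are invented for illustration; only the panel size (60 experts) and the 50% threshold come from the text.

```python
EXPERTS = 60        # size of the expert panel, from the text
THRESHOLD = 0.50    # minimum share of correct identifications, from the text

# Hypothetical counts: how many experts named the intended Meta Program.
correct_counts = {
    "question_1": 48,
    "question_2": 27,
    "question_3": 30,
    "question_4": 12,
}

def needs_revision(correct, total=EXPERTS, threshold=THRESHOLD):
    """True when fewer than `threshold` of the experts identified the item correctly."""
    return correct / total < threshold

# Items that would go back for replacement or modification.
flagged = [q for q, c in correct_counts.items() if needs_revision(c)]

for question, count in correct_counts.items():
    share = count / EXPERTS
    status = "revise" if needs_revision(count) else "keep"
    print(f"{question}: {share:.0%} correct -> {status}")
```

Note that a question identified by exactly 50% of the experts passes in this sketch, because the text speaks of "less than 50%" as the criterion for rejection.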

 

2. Standard Deviation

Next, let’s have a look at Standard Deviation. This is basically the average distance to the mean (strictly speaking, the square root of the average squared distance). In other words: how far away are the scores from the average?

Using MindSonar, you will see this measure (SD) pop up automatically in team profiles, when you use the Excel sheet here on this page. In a team profile, the SD for a given Meta Program tells you how differently individual team members think when it comes to that Meta Program. Another way of saying this is: how consistent or flexible are they as a team when it comes to that Meta Program?

For example, say a team has an average score for ‘Options’ of 7, with an SD (Standard Deviation) of 0.2. The average difference is pretty small, so you know the team very consistently thinks in terms of Options. As always, this has its advantages and its pitfalls, depending on the context, but that’s another subject. But if this team had an average of 7 with an SD of 4, that’s a different story. At least in terms of consistency and flexibility. The 7 is the average, but the scores vary wildly, so the team is not very consistent with respect to this Meta Program and they are quite flexible. And again, this flexibility may be a resource or a limitation, depending on the context.
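Here is a small Python sketch of that comparison. Both teams and their scores are invented so that they share an average of 7. I am using the population formula (`statistics.pstdev`); whether the Excel sheet uses the population or the sample formula is something I have not verified, so treat that as an assumption.

```python
from statistics import mean, pstdev

# Hypothetical 'Options' scores for two teams with the same average.
consistent_team = [6.8, 7.2, 6.8, 7.2]
varied_team = [2, 10, 10, 2, 10, 8]

for name, scores in [("Consistent team", consistent_team), ("Varied team", varied_team)]:
    average = mean(scores)
    # Population SD: square root of the average squared distance to the mean.
    sd = pstdev(scores)
    print(f"{name}: average = {average:.1f}, SD = {sd:.2f}")
```

The first team comes out with an SD of about 0.2, the second with an SD of about 3.6, even though both average 7.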

Here is a great video by the Apstats Guy. I really like this video. He explains in a clear, dynamic and kinesthetic manner what that SD formula means.

 

3. Correlation

Our next statistical concept to understand is correlation. Basically this means: how strongly do two scores hang together? If the correlation between two measures is high, it means that when the one goes up or down, the other one does too. If the correlation is low, then when the one goes up, it doesn’t say much about what the other one will do.

I often use the example of umbrellas and rain. When you measure the amount of rain falling in, say, a given hour, and you measure the number of umbrellas you see in the street in that hour, you will find a strong correlation. The more rain falls, the more umbrellas you will see. And it works the other way around too: the more umbrellas you count, the more rain you will find. Of course this relationship, the correlation between rain and umbrellas, will not be a hundred percent. Because some people don’t mind getting wet and some people may use their umbrella to protect themselves from the sun.

Please note that a high correlation doesn’t automatically mean that one variable causes the other. The classical example here is the correlation between priests and alcoholics in a city. When the city grows, it will have both more priests and more alcoholics. So there will be a high correlation – if we measured different cities – between the number of priests and the number of alcoholics. But this does not mean, of course, that the priests cause the alcoholism.

In MindSonar you run into correlations when you look at the correlations between Meta Program scores. For instance, the correlation between Options on the one hand and Internal Locus of Control on the other hand is .40. Even though this correlation is weak, it is not insignificant. If we square the correlation, we see that .40 x .40 = .16, so on average 16 percent of the variation in the score on Internal Locus of Control is accounted for by the score on Options. And vice versa. This means that people who score high on Options tend to score high on Internal Locus of Control too. Options has a low correlation (-.02; very close to 0) with Activity. This means that when someone scores high on Options there’s no telling how they might score on Activity; there is almost no connection at all (.02 x .02 = .0004 to be precise) between these two scores.
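The squaring step is easy to check for yourself. This short Python sketch takes the two correlations mentioned above and converts them to the share of variation the two scores have in common; the function name is mine.

```python
def shared_variance(r):
    """Proportion of variation two scores share: the squared correlation."""
    return r ** 2

correlations = [
    ("Options vs Internal Locus of Control", 0.40),
    ("Options vs Activity", -0.02),
]

for label, r in correlations:
    share = shared_variance(r)
    print(f"{label}: r = {r:+.2f}, r squared = {share:.4f} ({share:.2%})")
```

Note that squaring removes the sign, so a correlation of -.40 would share just as much variation as one of +.40.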

And then there are negative correlations too, meaning that when someone scores high on the one measure, they tend to score low on the other one and vice versa. If we look at Options again, we see that it correlates -.45 with Away From. So someone who scores high on Options, will tend to score low on Away From. And the other way around too, when they score low on Options they tend to score high on Away From.

When is this important? For instance when you are using a benchmark profile. If you have a benchmark profile with high scores for Options and Away From, you know that it will be a little harder to find someone who fits that profile, compared to when you would be looking for someone with high scores on both Options and Internal Locus of Control.

Another place in MindSonar where correlations can be quite important is when you are constructing and evaluating a benchmark profile. If you have a benchmark profile and some number indicating how well someone is doing in the target context, you have two sets of numbers, so you can calculate the correlation. Say, for instance, you have a benchmark profile for car salespeople. You profiled a number of car salespeople. And you also have a list of the number of cars they sold in the last three years. You should then find a strong positive relationship (a strong correlation) between the benchmark profile and the number of cars sold. The more closely a car salesperson’s profile resembles the benchmark profile, the more cars they should have sold.
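As a rough illustration of that check, the Python sketch below correlates a "similarity to benchmark" number with cars sold. Everything here is hypothetical: the five salespeople, their sales figures, and especially the similarity score (a made-up value between 0 and 1; how resemblance to a benchmark profile is actually quantified is not described on this page).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data for five car salespeople.
similarity_to_benchmark = [0.55, 0.62, 0.70, 0.81, 0.90]  # invented 0..1 score
cars_sold_three_years = [40, 55, 52, 78, 95]

r = pearson(similarity_to_benchmark, cars_sold_three_years)
print(f"Correlation between benchmark fit and cars sold: {r:.2f}")
```

A value close to +1, as with these invented numbers, would support the benchmark; a value near 0 would suggest the benchmark profile says little about sales.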

Please note that all two-sided Meta Program scores (Proactive/Reactive, Towards/Away From, and so on) have a perfectly negative correlation (-1) with each other. This is due to how they are measured during the profiling. Respondents divide points over Meta Programs. Therefore, the more points they give to Proactive, for instance, exactly that many fewer points will be given to Reactive. The higher their Proactive score is, exactly that much lower their Reactive score will be. This also means that their correlations with other Meta Programs are mirrored: Options correlates +.45 with Internal Locus of Control, and its opposite, Procedures, mirrors that exactly: it correlates -.45 with Internal Locus of Control.
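You can see this mirroring in a small Python sketch. The scores are invented, and the fixed total of 10 points per pair is an assumption made for the example; the point is only that when two scores always add up to the same total, their correlation is -1 and their correlations with any third score are each other's opposite.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

TOTAL_POINTS = 10  # hypothetical number of points divided over the two poles

# Hypothetical scores for six respondents.
options = [7, 4, 8, 5, 6, 3]
procedures = [TOTAL_POINTS - score for score in options]  # the remaining points
internal_locus = [6, 5, 9, 4, 7, 2]

r_pair = pearson(options, procedures)
r_options = pearson(options, internal_locus)
r_procedures = pearson(procedures, internal_locus)

print(f"Options vs Procedures: {r_pair:+.2f}")
print(f"Options vs Internal Locus: {r_options:+.2f}")
print(f"Procedures vs Internal Locus: {r_procedures:+.2f}")
```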

 

Positive and Negative Correlations

To understand what the correlation numbers mean, here is a good overview:

 

Calculating Correlations

To understand how correlations are calculated in statistics, here is that crazy Apstats Guy again with another great video. Please note: at the end of the video he goes into calculating correlation on his TI (Texas Instruments) statistical calculator. If you don’t use a calculator like that, feel free to skip that part. This last part of the video does show however, how correlations can be calculated easily with this relatively simple technology.

 

Alternative Explanation of Correlations

And to get a slightly different, more academic overview of correlations, here is the ‘Math Doctor’ explaining both what correlation is (movie 1) and how to calculate it (movie 2). He is basically explaining the same things as the previous two videos, but in a different style and with different examples.

 

 

4. Cronbach’s Alpha

And last but not least, let’s have a look at Cronbach’s Alpha. This is a number that expresses the internal consistency of a test. Actually: the internal consistency of a series of test items (questions). Cronbach’s Alpha basically asks: Do these questions seem to measure the same underlying idea? Are they all tapping into the same concept? This is often considered to be an indication of how reliable a test is. Because you cannot measure something reliably without being consistent.
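For those who want to see the number being produced, here is a minimal Python sketch of the standard Cronbach's Alpha formula: alpha = k / (k - 1) x (1 - sum of the item variances / variance of the total scores), where k is the number of items. The responses are invented, and the simple numeric item scoring is an assumption for illustration; it is not a description of how MindSonar items are scored.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's Alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*item_scores)]
    # Variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical answers of five respondents to three items of one scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's Alpha: {alpha:.3f}")
```

In this made-up data the three items rise and fall together across respondents, which is why Alpha comes out high. Items that move independently of each other would pull it down.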

Why is it important to understand this? When you work with MindSonar you will sometimes run into experts like psychologists asking questions about validity and reliability or sometimes about the ‘statistical quality’ of MindSonar. Knowledge of Cronbach’s Alpha will help you answer these questions. Being able to do that can sometimes be crucial to your perceived authority; whether or not you will be hired to do the project might depend on your answers.

 

What Cronbach’s Alpha Is and How to Use It

Here is a general introduction to Cronbach’s Alpha by Julie Dickinson. What I like about this video is that she clearly explains the possible trade-off between Alpha and validity. Sometimes you could raise the Alpha of your test by removing certain questions. These questions do not, according to Cronbach’s formula, fit in very well with the other questions; in other words, they seem to measure a different concept. And yet removing them, as Julie explains, is not always a good idea. In our case, these questions might cover an important aspect of the Meta Program we want to measure. We were able to reduce the number of questions in MindSonar by removing several questions that actually lowered the Alpha. We also retained a number of questions that did not contribute much towards the Alpha, but that we felt measured an important aspect of a Meta Program.

When considering Cronbach’s alpha, it is important to remember that MindSonar is not one test, but a collection of 13 separate tests, one for each Meta Program set.

Here is a clear explanation by Agnes Ly of how she calculates Cronbach’s Alpha with statistical software (SPSS).
