Talk:Likert scale

From Wikipedia, the free encyclopedia

Number of Options[edit]

Working in human-computer interaction for web usability, I am often asked by business people, "Why do you use a 1-5 scale and not a 1-10 scale?" I explain that the choice is largely subjective. Seven or ten choices offer greater granularity, but can be too fine-grained for some surveys. It comes down to a Goldilocks problem: how hot do you want your porridge?

Understanding the subjectivity of this survey technique might help readers understand it in their context.

Jay jayamorgan at gmail

The number of categories that can be sustained is an empirical question. It's not so much a matter of 'how hot do you want your porridge' -- rather, how fine-grained can you make the information without it being artificially so? See polytomous Rasch model for further explanation of this point. For my part, though, I thank you for the comments, very useful. Choice is subjective, yes, but that doesn't mean preferences, attitudes, etc. can't be consistent. Stephenhumphry 05:17, 15 September 2005 (UTC)[reply]
Seven- or nine-point scales tend to give much more granularity, because a 5-point scale only allows for two levels of agreement or disagreement. More than 9 categories tends to create cognitive overload. Klonimus 09:31, 23 October 2005 (UTC)[reply]
Like your comments below, important points for this article. On the comment above, I'd like to see evidence for greater granularity with seven categories: i.e. evidence within the structure of the data when subjected to analyses that are sensitive to this. The problem is separating the semantic space so that the categories carry substantive differences, in terms of the attitude, affective disposition, or whatever is relevant to the item statement. I don't use Likert data much for various reasons, but in my experience, when applying Rasch models I have not found that more than 4 or 5 categories contribute additional information. There ends up being a lot of overlap between categories on the latent continuum, and the relevant thresholds do not discriminate. Nonetheless, this is not to say it is not possible to achieve such things with sufficient skill and thought. Stephenhumphry 23:45, 25 October 2005 (UTC)[reply]
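(For readers following the Rasch reference: the point about categories that fail to carry extra information can be made concrete. The sketch below computes category probabilities under a partial-credit-style Rasch formulation; the person measure and threshold values are invented for illustration, not taken from any real calibration.)

```python
import math

def category_probs(theta, thresholds):
    """Category probabilities for one polytomous item under a
    partial-credit-style Rasch model: P(X=x) is proportional to
    exp(sum of (theta - threshold_j) for j up to x)."""
    terms = [1.0]  # empty sum for the lowest category
    run = 0.0
    for t in thresholds:
        run += theta - t
        terms.append(math.exp(run))
    total = sum(terms)
    return [term / total for term in terms]

# A 5-category item; the thresholds here are hypothetical.
probs = category_probs(theta=0.5, thresholds=[-1.0, 0.0, 1.0, 2.0])
print([round(p, 3) for p in probs])
```

When adjacent thresholds sit very close together on the latent continuum, adjacent categories end up with heavily overlapping probability curves, which is the sense in which extra categories can fail to add information.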


Data analysis[edit]

I moved this out of the main body because it's not directly relevant as is. The scale and its items are analysed using "item analysis" methods such as Cronbach's alpha. Additionally, scales may be analysed by factor analysis, or by Guttman/Mokken criteria. Often, items are analysed by looking at their ability to discriminate between high and low scorers on the test.

The summated test scores can be worked with any way you like. Klonimus 09:31, 23 October 2005 (UTC)[reply]
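(As a concrete illustration of the item-analysis step mentioned above: Cronbach's alpha can be computed from the per-item variances and the variance of the summed scores. The sketch below uses only the standard library, with made-up responses.)

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of responses per item, aligned by respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var / variance(totals))

# Three 5-point Likert items answered by five hypothetical respondents
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 4, 2, 4, 3],
]
print(round(cronbach_alpha(items), 3))  # 0.864
```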



In practice, data obtained from Likert scales are often analysed using parametric methods such as Student's t-test or ANOVA. However, the data generated are ordinal, while the arithmetic operations involved in the calculation of means require interval-level measurement.

Data obtained from Likert scales can instead be analysed by nonparametric methods such as the Mann-Whitney-Wilcoxon test and the Kruskal-Wallis one-way analysis of variance.

MacLennan 04:14, 07 September 2007 (UTC)[reply]
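(For what it's worth, the Mann-Whitney U statistic mentioned above is simple to compute by hand: count, over every cross-pair, how often the first group's rating exceeds the second's, scoring ties as 0.5. The ratings below are invented; a real analysis would also need a p-value, e.g. from scipy.stats.mannwhitneyu.)

```python
def mann_whitney_u(a, b):
    """U statistic for sample a versus sample b; ties count 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

group_a = [4, 5, 3, 4, 5]  # hypothetical ratings, condition A
group_b = [2, 3, 2, 1, 3]  # hypothetical ratings, condition B
print(mann_whitney_u(group_a, group_b))  # 24.0 out of a maximum of 25
```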

It is somewhat debatable that Likert scaling is ordinal according to Stevens' four levels of measurement (nominal, ordinal, interval, or ratio). It seems to fall somewhere between ordinal and interval scaling. Although there is nothing in the Likert scaling method that would guarantee the "equal-appearing intervals" required for interval scales, in his original article Likert himself was quite surprised that his simple scaling method in several studies yielded results highly correlated (.97-.99) with the commonly accepted standard for interval scaling, the more complex approach due to Thurstone using ratings by subject-matter experts. Finally, contrary to common belief, parametric statistical analyses make no assumptions about interval-level measurement, and indeed procedures such as the t-test and ANOVA were developed prior to the introduction of Stevens' levels of measurement.

Have added the obvious easy statistical tools which were missing. Sentence was ‘When treated as ordinal data, Likert responses can be analyzed using non-parametric tests, such as the Mann-Whitney test, the Wilcoxon signed-rank test, and the Kruskal-Wallis test’, and now reads ’When treated as ordinal data, Likert responses can be collated into bar charts, central tendency summarised by the median or the mode (but not the mean), dispersion summarised by the range across quartiles (but not the standard deviation), or analyzed using non-parametric tests, e.g. Chi-square test, Mann-Whitney test, Wilcoxon signed-rank test, or Kruskal-Wallis test’. John Pons 22:47, 2 October 2007 (UTC)[reply]
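(The ordinal summaries listed in that new wording are easy to demonstrate. The responses below are invented, and the quartiles use simple sorted positions rather than an interpolating method.)

```python
from statistics import median, mode

responses = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # hypothetical 5-point responses
print(median(responses))  # 3
print(mode(responses))    # 3

# crude positional quartiles (no interpolation)
s = sorted(responses)
q1, q3 = s[len(s) // 4], s[(3 * len(s)) // 4]
print(q3 - q1)  # inter-quartile range: 4 - 2 = 2
```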

John, as you correctly say above, the mean and standard deviation are totally inappropriate with Likert data. But this is why it is also incorrect to use a parametric test such as a t-test or ANOVA. Both are in fact based on the assumption that the data are related to the underlying characteristic of interest at least at interval level (and, as you know, both use the mean and standard deviation in their calculations). It is easy to see why: if the relationship is ordinal or lower, equal differences in the metric can have unequal relationships back to the characteristic of interest. Thus you can reach a statistical conclusion by doing an ANOVA on Likert data that would not be seen in the underlying characteristic you were investigating; you would have made a bad decision. That goes beyond quantifiable Type I or II error into an unknown amount of systematic error. This is not to say that analyzing it that way wouldn't sometimes "work" (i.e. give you the correct conclusion), but there would be an unknown and unknowable level of risk in that decision unless you could guarantee at least interval-like data. (Binomial data, for example, can be used in ANOVA, since it is really testing differences in the mean of the binomial distribution (Pi), which is allowed. Since the shape changes with Pi, though, the sample size must be large to make the ANOVA robust to that. Similarly, as stated elsewhere, if you mostly have two extremes you are really dealing with binomial-like data. When analyzing Likert data, I might dichotomize it into "Agree" and "Do Not Agree" or use one of the nonparametric approaches. I would never use parametric techniques.) By the way, whether these tests were developed before or after Stevens' recognition of measurement levels is immaterial. Only the impact on the decision risks taken by using an incorrect approach is relevant. A good reference for this perspective is Allen, I. Elaine and Seaman, Christopher A., "Likert Scales and Data Analysis," Quality Progress, July 2007, pp. 64-65. Available online here. I would advocate adding a "criticisms" section so that the reader will know that using parametric approaches on Likert data has some serious caveats.--216.160.137.179 (talk) 19:50, 22 September 2011 (UTC)[reply]
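(The dichotomization mentioned in the parenthetical above might look like this. The cut-point, and whether the neutral category counts as "Do Not Agree", are design choices, and the responses are invented.)

```python
responses = [5, 4, 2, 3, 4, 1, 5, 2, 4]  # hypothetical 5-point responses

agree = sum(1 for r in responses if r >= 4)  # 4 = agree, 5 = strongly agree
not_agree = len(responses) - agree           # neutral lumped with disagree here
print(agree, not_agree)  # 5 4
```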

Text was unclear about the conditions under which summation may reasonably apply. The statements 'item responses may be summed to create a score for a group of items' and 'When responses to several Likert items are summed, they may be treated as interval data...' suggest that it's always possible to sum the scores, and that doing so makes them interval data! Have changed the wording to emphasise the conditionality. Now reads 'in some cases item responses may be summed to create a score for a group of items' and 'Responses to several Likert questions may be summed, providing that all questions use the same Likert scale and that the scale is a defendable approximation to an interval scale, in which case they may be treated as interval data measuring a latent variable.' And replaced the unspecified 'relevant assumptions' with 'these assumptions'.

John Pons 22:58, 2 October 2007 (UTC)[reply]
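(The conditional summation described above is mechanically trivial; the point of the edit is the conditions, not the arithmetic. A sketch with invented answers, all items on the same 1-5 scale:)

```python
# each row: one respondent's answers to four items on the same 1-5 scale
answers = [
    [4, 5, 3, 4],
    [2, 1, 2, 3],
]
scores = [sum(row) for row in answers]
print(scores)  # [16, 8]
```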

The Likert scale has been abused for so long that research in social sciences has created a culture of its misuse. You're only going to get nominal data from these tests, no matter how much you wish you could rank the responses. Preferences cannot be ranked in this way, nor can identities, and there's an irreducible subjectivity to the activity of ranking that's not really solved in summing scores. Social scientists need to hear this, especially: your analyses based on the assumption of more than nominal data, if not your data as well, are bogus. On a scale from one to six, Likert scales aren't measuring much of anything.107.184.249.239 (talk) 19:43, 28 March 2016 (UTC)[reply]

Item Selection[edit]

I think the point about item selection in a Likert scale is absolutely fundamental to how these scales should be constructed. Items in a Likert scale should be selected on the basis that they are likely to lead to extreme responses from people with attitudes at the extremes of what is being measured. So if I am measuring a political attitude such as libertarianism vs authoritarianism, I might try to select items such as "People should be allowed to behave however they like, even if it offends other people". You might expect an extreme libertarian to agree strongly with this and an extreme authoritarian to violently disagree. Most wishy-washy liberals would waver somewhere in the centre. The value of selecting these types of item is that they allow the representation of the whole of the spectrum of attitudes. DrJohnBrooke 11:20, 27 January 2006 (UTC)[reply]

I have a question. How come nobody ever talks about the "scale" part of the Likert scale. For example, if choices are arrayed horizontally, with strongly disagree at the left and strongly agree on the right, how does this affect responses compared to arranging them with agree on the left, or top to bottom? - roricka —Preceding unsigned comment added by Roricka (talkcontribs) 14:40, 23 August 2008 (UTC)[reply]
This problem is easily avoided with reverse-coded questions - for instance a fictional measure of introversion I just invented might have two questions: "I enjoy time spent alone more than time spent with other people" and "I enjoy time spent with other people more than time spent alone". If the person answering the questions is biased towards agreeing, or towards disagreeing, or answering on the left or the right side, the questions will produce contradictory answers (it would make no sense for someone to strongly agree with both, so they're probably just agreeing with everything). Robust instrument design has been part of psychological research since at least the 1970s. 152.91.9.219 (talk) 06:30, 12 November 2009 (UTC)[reply]
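(Reverse-coded items like the pair described above are typically re-aligned before scoring by flipping each response about the scale midpoint; this sketch assumes a 5-point scale.)

```python
def reverse_code(response, scale_max=5):
    """Map 1<->5 and 2<->4, leaving the midpoint 3 in place, so a
    reverse-worded item points the same direction as its twin."""
    return scale_max + 1 - response

# a consistent respondent who strongly prefers time alone
print(reverse_code(5))  # 1 on the reverse-worded item, once recoded
print(reverse_code(1))  # 5
```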
Thanks for the response, but you ignored my question, which was, if the desired response for all items were placed on the left for some respondents, and placed on the right for others, and placed at the top for another group, and placed on the bottom for another group, would the resulting scales exhibit differences? I do not agree this problem can be investigated using reverse-coded items. In our findings, reverse-coded items fall together in a component analysis. Removing them gives much better results. Also, you use the phrase "reverse-coded questions." Of course, Likert items are rarely questions. In fact, the use of statements vs. questions is a hallmark in this field. I think the article should discuss the biases that result or are avoided because Likert items are statements and NOT questions. User:roricka 11:50, 24 September 2013 (UTC)[reply]

On Likert Inventing the Scale[edit]

My measurement professor claimed that there is evidence that Likert did not "invent" this scale (several published articles used the scale previous to the referenced article by Likert). Unless somebody has researched this issue, the article should not say Likert invented this scale but rather that he is "said to have" or "thought to have" invented it or, better yet, that the scale is named after him. (rlj)

I have to agree. As far as I'm aware, Likert suggested that because few people ever choose the extremes on a five-point scale (making it in effect a three-point scale) then we should use seven-point scales instead, cunningly making them five-point scales so to speak. —Preceding unsigned comment added by 210.50.106.153 (talk) 12:59, 15 May 2010 (UTC)[reply]

Wikipedia bug in links[edit]

A couple of the links in this article (as of today) are highlighted in red, and the status decoding shows that the links are apparently set for editing of the targets. Inside this article, those links and the non-red links appear to be formatted in exactly the same way. Shanen (talk) 07:17, 5 February 2008 (UTC)[reply]

Pronunciation[edit]

We need a citation for the pronunciation of the word Likert. All the academics I know say "Like-ert". -- Joebeone (Talk) 18:47, 26 December 2006 (UTC)[reply]

I agree - this pronunciation is incorrect. It should be "like-urt" as stated on Likert bio page. cristo@princeton.edu; Jan 2 2007
It's not a matter of correct or incorrect, it's probably a matter of where you're from, as with the pronunciation of many names (and words). I don't have a peer-reviewed reference but see discussion here [[1]]. Holon 10:18, 3 January 2007 (UTC)[reply]
"It's not a matter of correct or incorrect..." Excuse me? This is the sort of logic that frequently imperils Wiki. When both the wife and the son of Rensis Likert pronounce it Lick-ert not even wilful ignorance such as yours is an excuse. Yankoz
Basic wiki tenet: assume good faith. Please. I pronounce it Lick-ert. However, it's not a matter of logical deduction or even inference, it's a matter of convention. I certainly prefer to follow the convention where possible, but I know Chinese people whose names I cannot say properly. Do you think all English speakers should pronounce such words as restaurant exactly as the French do, including the pronunciation of 'r'? I do, but I know how. Good luck with that! If any pronunciation is to be given, it should be Lick-ert, but otherwise it should just be left out. Anyone know the country of origin of the name? Holon 09:13, 15 January 2007 (UTC)[reply]
The correct pronunciation of the name is readily accessible to English speakers. Yankoz —The preceding unsigned comment was added by 66.57.20.50 (talk) 16:12, 29 January 2007 (UTC).[reply]
So what is the country of origin of the name? How do you know the English pronunciation Lick-ert is OK? Interested. Holon 00:47, 30 January 2007 (UTC)[reply]
I have been told by the psychology professors here at the University of Michigan, Likert did indeed pronounce his name "Lick-ert" and we have a mission to use it correctly wherever we go, in order to "spread the word" so to speak. 141.214.17.5 23:09, 10 May 2007 (UTC)[reply]
Ok, I took a shot at dealing with this issue with a (sourced) section on pronunciation. 141.116.236.23 17:40, 11 July 2007 (UTC)[reply]
"This is the sort of logic that frequently imperils Wiki. When both the wife and the son of Rensis Likert pronounce it Lick-ert not even wilful ignorance such as yours is an excuse." Hmmm. Pot referring to kettle. If Wiki has a peril, it may be the attitude that we're making the world, not describing it. Why are workers or researchers under any compunction to pronounce a term in ANY particular way? Spelling is different. But pronunciation evolves, as it should. If you are so smug about it, tell me how to pronounce joule, the unit of energy? It turns out Prof. Joule pronounced it as if it were spelled j-o-w-l. Now, Joule was a clever man, and a special term was named after him. But workers rejected pronouncing a unit as "jowl" -- they were somehow attracted to the sound of "jewel." Go figure. At any rate, if I think Likert should be pronounced "LIE-kert" and so does everyone I talk with, then a yokel who comes along and says Lick-urt is just going to be looked down on. Sorry. That's the real world. roricka —Preceding unsigned comment added by Roricka (talkcontribs) 14:32, 23 August 2008 (UTC)[reply]
There seems to be an anti-American slant in all this business of pronunciation (as always). No one bats an eye when the Germans call their stereo system a "HEE-fi," instead of a "HI-fi." Or go to their kitchen to use the "MEE-kro," and not the "Micro-wave" oven. These are American words for American inventions ("High Fidelity" and "Microwave"), but it's perfectly okay for people to alter their pronunciation to make them easier to say. But when Americans do the same, they get ridiculed and/or looked down upon. Granted, this is a man's name. But were I ever to greet his family, I'd use the proper pronunciation (or as best I could manage). —Preceding unsigned comment added by 24.5.209.26 (talk) 00:04, 11 January 2008 (UTC)[reply]


Non-sensical statement?[edit]

What is this supposed to mean, "These can be applied only when the components are more than 5?" --Belg4mit (talk) 21:38, 1 April 2008 (UTC)[reply]

In such areas of pseudo-science / 'soft' science precise meaning is merely incidental, in contrast to actual science.

Spelling[edit]

I've gone through and replaced all "Lickert" with "Likert," the correct spelling of the name. Jeebus.

two-dimensional?[edit]

What would a two-dimensional Likert-scale be? One in which the items only provided two possible choices? Kingturtle (talk) 19:50, 12 October 2009 (UTC)[reply]

Social desirability bias[edit]

The article says that controlling for social desirability bias is difficult. I've seen a fair number of instruments that had pretty robust control questions for social desirability, so I feel a claim like that needs a reference. 152.91.9.219 (talk) 06:39, 12 November 2009 (UTC)[reply]

Example image[edit]

The example image shows the Wikipedia article rating scales and describes them as a "Likert-type scale". The article defines a Likert item as one in which "respondents specify their level of agreement or disagreement on a symmetric agree-disagree scale for a series of statements". The example clearly does not fit this definition.

The image was uploaded on July 17, 2011. Does anyone agree that this image represents a Likert scale and is an appropriate example of the scale? — Preceding unsigned comment added by Turadg (talkcontribs) 19:02, 4 August 2011 (UTC)[reply]

Hi. I'm the uploader. I'm sorry—although Likert scales (or Likert-type scales) and rating scales are conceptually in the same cognitive neighborhood (and rating scales functionally can be treated by some users in a Likert-like fashion, that is, "here's how much I agree with the assertion that this article is good"), I think, upon closer look at the articles, that you're right that people who design tests and surveys would prefer to keep the terms differentiated. I will remove the image from this article and rename it at Wikimedia Commons (if it will let me rename—I think regular-schmo users are allowed to rename there now). Thanks for pointing this out. — ¾-10 23:23, 4 August 2011 (UTC)[reply]


TQT, I see. Thanks for removing it and renaming the image. It makes a good image for Rating scale so I fixed the link there (after you renamed it.) — Turadg 16:31, 7 August 2011 (UTC)

Long i /ai/, long e /i/[edit]

In an edit summary Mudd1 asked, "If a short "i" is the one in "lick", isn't a long "i" then more like "leek"?)" The answer is that, thanks to the Great Vowel Shift, English is unlike most other languages that use the Latin alphabet. English conventionally calls the /ai/ diphthong "long i" (mind, bind, right), and calls the /i/ "long e" (meek, leek, repeat). It's why the vowel in "leek" sounds like (IPA) /i/ even though "ee" ought to mean /e/ or /ei/ (which English calls "long a", even though, to most of the world, "a" is /a/, not /ei/). Anyway, the change to "diphthong" was a fine alternative way to say it. — ¾-10 02:46, 30 January 2013 (UTC)[reply]

Sorry for reverting your change. I saw too late that you didn't just revert my edit but posted a comment here. Anyways, I could live with a link to an article explaining this somewhat counterintuitive nomenclature but Long i is about a special character used to transcribe Latin which is confusingly pronounced /iː/, not /ai/. Either way, I really cannot judge which of the two ways to describe /laikərt/ are more readily understood by the majority of readers. "Diphthong" is a word that many will not be familiar with whereas "long i" might confuse many non-native speakers (such as myself). IPA would be a solution but then I understand that, again, many native English speakers are not familiar with this notation. --Mudd1 (talk) 15:01, 31 May 2013 (UTC)[reply]

Recent additions to the article[edit]

Two things concern me about this edit.

Firstly, it appears to be by Ypchawla (talk · contribs). You know, there is no harm in continuing to edit under your user name? No one is going to think bad of you for having an edit reverted.

Secondly, I think the edit misses the point of the Likert scale. The Likert scale is made by adding together scores from a number of Likert items, all of which can take discrete (integer) values. This is already explained in the article. What does the edit mean by "the attribute can also be defined in fractions"? Is this recommending responses on a continuous scale to individual Likert items? That is not the standard method.

Then there is this edit.

Is this saying that those additions to Wikipedia are by YP Chawla? We don't credit every contributor individually within the article. The list of contributors can be seen in the article history. If you have written a paper that relates to the Likert scale we would be interested to see it, but I would discuss it here before citing your own paper.

Yaris678 (talk) 15:14, 6 November 2013 (UTC)[reply]

Consistent language[edit]

I think the language in this article could be made much clearer by following its own advice (in the 'Sample question presented using a five-point Likert' section) about the difference between what an 'item' is vs a 'scale'.

For example in the 'Scoring and analysis' section it states:"by convention Likert items tend to be assigned progressive positive integer values. Likert scales typically range from 2 to 10 – with 5 or 7 being the most common." It seems to me that the usage of 'scale' is incorrect in this passage as the numbers are actually referring to the number of 'points' or categories available in the response to the question (item)?

Thoughts?

Jamesdamillington (talk) 11:05, 7 February 2014 (UTC)[reply]

Level of Measurement[edit]

This statement:

  • Bulleted list item

For example, in a set of items A,B,C rated with a Likert scale circular relations like A>B, B>C and C>A can appear. This violates the axiom of transitivity for the ordinal scale.

Defies mathematical convention and is unclear, there needs to be an example to illustrate what this means. --Leopardtail (talk) 14:27, 17 March 2014 (UTC)[reply]
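(One reading of the quoted sentence, offered here as a guess rather than as the original author's intent: if "A>B" means that a majority of respondents rated item A above item B, such pairwise majorities can indeed cycle, as in a Condorcet paradox. Invented ratings:)

```python
ratings = {  # per respondent: Likert rating of items A, B, C (hypothetical)
    "r1": {"A": 3, "B": 2, "C": 1},
    "r2": {"A": 1, "B": 3, "C": 2},
    "r3": {"A": 2, "B": 1, "C": 3},
}

def majority_prefers(x, y):
    """True if a strict majority of respondents rated x above y."""
    wins = sum(1 for r in ratings.values() if r[x] > r[y])
    return wins > len(ratings) / 2

print(majority_prefers("A", "B"))  # True
print(majority_prefers("B", "C"))  # True
print(majority_prefers("C", "A"))  # True -- a cycle: A>B, B>C, C>A
```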

Since the Likert so-called scale is actually just nominal data, a set of arbitrary categories that the social sciences pretend is numerical (given the careers and billions of dollars staked in inherently flawed research done while failing to observe this fact), your observation is very welcome and I second it (not that these research basics should need a chorus of opinions to be meaningful). We ought to edit the article accordingly, then watch as Likert fans change it back for no mathematically valid reason. 107.188.202.156 (talk) 14:24, 23 December 2016 (UTC)[reply]

Please someone fix the grammar[edit]

- for questions early in a test, an expectation that questions about which one has stronger views may follow, such that on earlier questions one "leaves room" for stronger responses later in the test, which expectation creates bias that is especially pernicious in that its effects are not uniform throughout the test and cannot be corrected for through simple across-the-board normalization"

I am struggling to understand what is trying to be conveyed here. Tastyslowcooker (talk) 03:19, 10 May 2016 (UTC)[reply]

Criticism?[edit]

I do not see any criticism of a Likert scale. For example, "do you prefer to go to a movie theater or a library?", that depends on the movie and whether people have their cell phones on or the quality of the general shift of libraries that more closely resemble a garage sale. Surely there must be a complaint that a Likert scale is illogical. — Preceding unsigned comment added by 174.135.160.82 (talk) 22:02, 17 April 2018 (UTC)[reply]

HOW IS CARSHIELD RATED IE,A.B.C.#[edit]

HOW IS CARSHIELD RATED ? — Preceding unsigned comment added by 199.68.108.109 (talk) 22:08, 15 July 2020 (UTC)[reply]

The Website User Survey Picture contains a Glaring Bad Practice[edit]

Hey All, just popping by to say that the current picture you use as an example of a Likert scale is actually a bad example of one, as it presents the negative side of the scale to the right of the middle point and the positive side to the left.

For Westerners this will be highly counterintuitive, as we are taught to think of numbers, scales, and axes as growing from left to right. A scale such as the one presented here will likely confuse test participants or even lead to erroneous inputs. I literally have a slide showing this scale as a bad example of how to do one.

Would it be possible to change to a good Likert scale?

"part of a series on sociology"?[edit]

The article sits within the series on sociology, but Likert scales are widely used in many social sciences (not just sociology). I don't have data on this, but anecdotally I think they are more common in psychology; they are also considered a core tool of psychometrics, and Rensis Likert himself was a psychologist (not a sociologist). Accordingly, I think the article should sit within the series on psychology. I don't know how to edit which series an article sits within, but if someone can help with this that'd be great. I even note that at the top of this talk page it states that "[T]his article is within the scope of WikiProject Psychology", so at least one other person agrees with me already! Don't forget about it (talk) 00:04, 2 May 2023 (UTC)[reply]