5.1 LIWC results
The LIWC results are presented in two tables. Table 4 shows the LIWC Five Standard Default Measures, the differences between innovators and non-innovators, and a T-test of their significance. Table 5 shows the LIWC Twelve Standard Punctuation results, the differences in the use of punctuation between innovators and non-innovators, and a T-test of their significance. We calculated the t-statistics according to the results of preliminary Levene’s tests, which indicate whether equal variances can be assumed.
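To illustrate how a Levene-type variance check can feed into the choice of t-statistic, here is a minimal Python sketch. It is not the software or exact procedure used in the study. The function names `levene_w` and `t_statistic` and the sample values are hypothetical, and the mean-centered form of Levene's statistic is an assumption. Turning W into a p-value requires an F distribution with (1, N - 2) degrees of freedom, which is left to a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def levene_w(a, b):
    """Levene's W statistic for two groups (mean-centered variant, assumed)."""
    groups = [a, b]
    # Absolute deviations of each observation from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_means = [mean(g) for g in z]
    n_total = sum(len(g) for g in z)
    grand = sum(sum(g) for g in z) / n_total
    between = sum(len(g) * (m - grand) ** 2 for g, m in zip(z, z_means))
    within = sum((x - m) ** 2 for g, m in zip(z, z_means) for x in g)
    k = len(groups)
    return (n_total - k) / (k - 1) * between / within


def t_statistic(a, b, equal_var):
    """Return (t, df) for two independent samples.

    equal_var=True  -> pooled-variance (Student) t-test
    equal_var=False -> unequal-variance (Welch) t-test
    """
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    diff = mean(a) - mean(b)
    if equal_var:
        df = na + nb - 2
        pooled = ((na - 1) * va + (nb - 1) * vb) / df
        se = sqrt(pooled * (1 / na + 1 / nb))
    else:
        sa, sb = va / na, vb / nb
        se = sqrt(sa + sb)
        # Welch-Satterthwaite approximation of the degrees of freedom.
        df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return diff / se, df


# Hypothetical per-person scores, for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
w = levene_w(group_a, group_b)
t_pooled, df_pooled = t_statistic(group_a, group_b, equal_var=True)
t_welch, df_welch = t_statistic(group_a, group_b, equal_var=False)
```

With equal group sizes the two t-values coincide and only the degrees of freedom differ. With very unequal group sizes, such as the 173 innovators and 3,581 non-innovators here, the two versions can diverge, which is why the variance check matters.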
Table 4 results indicate three significant differences, highlighted in green. The 173 innovators write substantially more than the 3,581 non-innovators, by 1,481,707 words or 356%. They also write much longer sentences (42.5 vs 19.48 words per sentence, WPS) and use longer words, which indicates more complex language to describe concepts: innovators use about 17.69% more words longer than six letters (Sixltr) than the 3,581 non-innovators. In addition, innovators have a dictionary match rate about one percentage point lower than that of non-innovators (17.04 vs 18.01, respectively), a difference that is close to significant, with a significance value of 0.055. This gap might be attributable to innovators using new words associated with novel products and ideas, terms that are not yet mapped in common dictionaries. Note: a value of 1 in the “Segment” category indicates that an entire file was processed.
The twelve LIWC Default Punctuation Results also provide insights into how the innovators differ from the non-innovators. First, four of the twelve LIWC punctuation categories are not applicable because of data processing procedures. Specifically, the comma and semicolon were used as “csv” data format separators in exporting the data from the online forum, and a period was added at the end of every post to accommodate the WORDij slide procedure, which in turn made the All Punctuation category not applicable. Nevertheless, six punctuation marks significantly differentiate the two groups: non-innovators use the colon, question mark, and exclamation point significantly more often than innovators (highlighted in red), while innovators use the apostrophe, parentheses, and other punctuation more often than non-innovators (highlighted in green). Although the Dash and the Quote mark showed a large numeric difference in favor of the innovators’ usage, these differences were not found to be significant.
5.2 WORDij results
The WORDij results are presented in Table 6 and are sorted by Z-Score from low to high.
Table 6 presents three statistical tests. Two come from WORDij: the Z-Score for two population proportions and the Chi-Square for goodness of fit based on counts, both calculated at the file level. Appended for comparison is the LIWC T-test of means, based on each individual’s posts. The Crovitz words for which one or more of the three tests indicate a significant difference are highlighted in red where non-innovators use the word more, and in green where innovators use it more. The rows highlighted in gray indicate no significant difference. Again, we calculated the t-statistics according to the results of preliminary Levene’s tests, which indicate whether equal variances can be assumed.
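The two file-level tests can be sketched in a few lines of Python. This is a minimal illustration using the textbook forms of the two-proportion z-test and a one-degree-of-freedom chi-square goodness-of-fit test. It is an assumption that these match WORDij's internal formulas, and the function names and counts below are hypothetical, not taken from Table 6. Group "a" stands for innovators and group "b" for non-innovators, so a positive z means the word makes up a larger share of the innovators' text.

```python
from math import erfc, sqrt


def two_proportion_z(count_a, total_a, count_b, total_b):
    """z-score for the difference between two proportions (pooled SE)."""
    p_a = count_a / total_a
    p_b = count_b / total_b
    pooled = (count_a + count_b) / (total_a + total_b)
    se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (p_a - p_b) / se


def chi_square_gof(count_a, total_a, count_b, total_b):
    """Chi-square goodness of fit for one word's counts in two groups.

    Expected counts assume the word is spread across the groups in
    proportion to each group's total word count (1 degree of freedom).
    """
    n = count_a + count_b
    grand = total_a + total_b
    observed = (count_a, count_b)
    expected = (n * total_a / grand, n * total_b / grand)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: a word used 30 times in 1,000 words by group "a"
# and 10 times in 1,000 words by group "b".
z = two_proportion_z(30, 1000, 10, 1000)
chi2, p_value = chi_square_gof(30, 1000, 10, 1000)
```

Both functions pool each group's text into a single count, which mirrors the file-level view described above. The T-test, by contrast, compares means computed per person.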
Overall, 32 of the 37 Crovitz Relational Words (86%) have a significant Z-Score and Chi-Square score, indicating a clear, unambiguous difference in the use of particular words between the 173 innovators and the 3,581 non-innovators. The only exception is the word “across_through,” for which the innovators’ count in Column D is zero, so no Chi-Square can be calculated.
The 20 rows shaded in red indicate where the non-innovators use the Crovitz words significantly more often than innovators (in proportion), with eleven words having a negative Z-Score whose magnitude is greater than 10. They are listed in order of magnitude, from the largest to the smallest difference: “not, if, but, when, as, because, still, then, now, after, [and] out.”
The five rows in italic indicate a mixed result. Neither the Z-Scores nor the T-test scores are significant for these five words: “near_by, opposite, off, till, [and] against.” However, the Chi-Square values are significant for the same words.
The 12 rows shaded in green indicate where the innovators use the Crovitz words significantly more often than non-innovators (in proportion), with four words having a positive Z-Score greater than 10. They are in order of magnitude: “of, in, and, [and] among_between.”
To extend the analysis, we also evaluated the significance of mean differences using the T-tests reported in the last column of Table 6.
Seventy percent (14 of 20) of the Crovitz words used more by non-innovators than by innovators have significant T-test values. They are: “not, if, but, when, as, still, then, now, after, out, across, where, at, [and] for.” The remaining 30% (6 of 20), which do not have significant T-test values, are: “because, so, before, though, under, [and] round.”
The T-test results are consistent with the Z-Score results in indicating no difference between the two groups for four of the five Crovitz words highlighted in gray. They are: “near, opposite, till, [and] against.” The T-test could not be calculated for the word “off” because of a zero numerator, as indicated by N/A in the table.
For seven of the twelve Crovitz words (58%) used more often by innovators than by non-innovators, we find a correspondingly significant T-test value. They are: “of, in, and, among, about, on, [and] with.” For the remaining five of the twelve (42%), the T-tests are not in agreement with the Z-Scores and Chi-Squares. They are: “by, while, or, from, [and] down.”
There is considerable overlap between the results of the T-tests and those of the Z-Scores and Chi-Squares. For the Z-Scores and Chi-Squares, we considered the text written by innovators and by non-innovators each as a whole, whereas the T-tests compare group means. Based on our results, we accept the hypothesis that the Crovitz 42 Relational Words discriminate between employees classified as “innovators” and those classified as “non-innovators” based on their forum text postings.