This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Interactive Journal of Medical Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.i-jmr.org/, as well as this copyright and license information must be included.
Most adult Internet users have searched for health information on the Internet, which has become one of the most important sources of health information and treatment advice. In most cases, the information found is not verified with a medical doctor but is judged independently by the “online-diagnosers” themselves. In view of this, public health authorities have raised concerns about the quality of medical information that laypersons can find on the Internet.
The study aimed to develop a measure to evaluate the credibility of websites that offer medical advice and information. The measure was tested in a quasi-experimental study on two sleeping-disorder websites of different quality.
A total of 45 survey items for rating the credibility of websites were designed using exploratory observations and literature research. The items were tested in a quasi-experimental study in which 454 participants were randomly assigned to exposure to either a high- or a low-quality website. Using principal component analysis, the original items were reduced to 13 and sorted into four factors: trustworthiness, textual deficits of the content, interferences (external links on the website), and advertisements. The first two factors focus on the provided content itself, while the other two describe the embedding of the content in the website.
The final scale showed adequate power and reliability for all factors. The loadings of the principal component analysis ranged satisfactorily (.644 to .854). Significant differences at
The scale reliably distinguished between high- and low-quality medical advice given on websites.
Internet usage is increasing strongly as more and more people have access to it. The increase reaches all age groups, including older people [
Cost and time factors make searching the Internet an attractive alternative to seeing a doctor in a nonacute situation: information is available immediately, whereas a visit to one’s doctor can be time consuming and may cost working time. Individual reasons for searching for medical information differ (some want to prepare for a medical consultation, others seek support or remedies that offer an alternative to treatment advice), but in every case the accuracy of the search results matters to “online-diagnosers”. Hence, public health authorities are concerned about the quality of the health information available on the Internet [
Sleeping disorders, or insomnia, are a very common medical condition in the general population. About 50% of the population complains about such problems in a given year, making them the most common complaint of patients after general pain [
The understanding of trust and credibility factors of Internet health information, and websites in general, has been addressed by research in recent years. Accordingly, various measures and quality criteria for health information on the Internet can be found [
A recent review described some of the tools for assessing the quality as having limited validity [
Another line of research assesses quality aspects of health information websites through predefined key word lists evaluating the provided metadata of websites [
To evaluate how a Web search is conducted, 42 naturalistic observations of individuals searching the Internet for information on sleeping disorders were collected. The participants were asked to search for information about sleeping disorders in general; the search was not limited to a distinct perspective or a certain type of sleeping disorder. Following the individual search on the Internet, post observational, structured, in-depth interviews were conducted to clarify users’ motivation for particular search decisions and obtain additional information on their search behavior.
Undergraduate students were instructed to contact volunteer participants in their neighborhood and to observe their searching behavior. The observers were instructed following the guidelines of DeWalt and DeWalt [
The research group designed a field protocol for this study in order to capture the observed setting and contents, following previous recommendations of Schensul et al [
Based on the conclusions of the observational study and the interviews, a multi-item measure for the credibility of health websites on the Internet was designed. Orientation for this study was found in the previous work on measures of health information quality assessment [
The dimensions of the measure based on the observational study.
Dimension | Number of items | Interest |
Trustworthiness | 8 | Trustworthy source |
Competence | 7 | Content is adequate |
Interference | 7 | Pop-up windows, advertisement |
Layout | 7 | Presentation style |
Textual deficits | 8 | Factor of intelligibility |
Usability | 4 | Access to the information |
Suitability | 4 | Implementation of the advice |
To test the developed scale, an Internet survey was designed that compared a group exposed to a low-quality site with a group exposed to a high-quality site. Participants were recruited over two weeks through a snowball system via email, social networks, and online communities; recruitment was initiated with a sample of 14 undergraduate students. The participants were randomly assigned to one of the two conditions. The high-quality website was rated as such by an independent German consumer foundation involved in investigating and comparing goods and services in an unbiased way [
The Internet survey incorporated the websites, and participants had to explore the content for at least four minutes; otherwise it was not possible to continue. The interfaces of the websites were embedded in the Internet survey mask, while external links on the websites were blocked. Internal paths leading away from sleeping-disorder content were also blocked, and the quality certificates shown on the high-quality website were removed. The survey was technically pretested before being distributed. After the website exposure, the questionnaire started. The 45 items of the credibility scale and the four items of the outcome measure were presented to each participant in a different random order. At the end of the survey, the participants were asked to respond to questions regarding their use of health information sites on the Internet, occupation in a medical profession, and sociodemographic information.
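The random assignment and the per-participant item order described above could be sketched as follows. This is only an illustration: the survey software actually used is not named in the text, the identifiers (`cred_1`, `out_1`, `assign_participant`) are hypothetical, and shuffling the credibility and outcome items together in one pool is an assumption.

```python
import random

CONDITIONS = ("high_quality", "low_quality")
N_CREDIBILITY_ITEMS = 45  # credibility scale items, as reported in the text
N_OUTCOME_ITEMS = 4       # outcome measure items, as reported in the text


def assign_participant(rng):
    """Pick a condition at random and build a random item order for one participant.

    Assumption: both item sets are shuffled together as a single pool.
    """
    condition = rng.choice(CONDITIONS)
    items = [f"cred_{i}" for i in range(1, N_CREDIBILITY_ITEMS + 1)]
    items += [f"out_{i}" for i in range(1, N_OUTCOME_ITEMS + 1)]
    rng.shuffle(items)  # in-place shuffle, a new order for every call
    return condition, items


# A fixed seed is used here only so that the sketch is reproducible.
rng = random.Random(2013)
condition, order = assign_participant(rng)
```

Simple randomization of this kind does not force equal group sizes, which would be consistent with the slightly unequal groups reported for the survey.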
To measure the impact of the website on participants’ behavior, an outcome measure was added. It consisted of four items formulating future intention to consult the site, intention to recommend it, etc (
Outcome measure:
I would recommend this website
I would approach this source for future questions
I can trust the information on this website
If I suffered from sleeping-disorders, I would use the given information
To assess the internal consistency of the measure, a scale reliability analysis was conducted. Correlations were used to check for differences between sociodemographic groups, occupations, and levels of Internet usage for searching medical information. To keep the direction of all items consistent in the analysis, the negatively worded items were reversed using the formula NEWSCORE = (MAX + MIN) - SCORE.
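The reversal formula can be illustrated with a short sketch. The response range of the credibility items is not stated here, so the 1 to 7 default below is an assumption, and the function name `reverse_score` is hypothetical.

```python
def reverse_score(score, scale_min=1, scale_max=7):
    """Reverse a negatively worded item: NEWSCORE = (MAX + MIN) - SCORE.

    The 1-7 default range is an assumption, not taken from the study.
    """
    if not scale_min <= score <= scale_max:
        raise ValueError(f"score {score} is outside {scale_min}-{scale_max}")
    return (scale_max + scale_min) - score


# On an assumed 1-7 scale, the lowest answer maps to the highest and vice versa,
# while the scale midpoint stays unchanged.
reversed_answers = [reverse_score(s) for s in (1, 4, 7)]
```

Using MAX + MIN rather than MAX + 1 keeps the formula correct for scales that do not start at 1, for example a 0 to 7 range.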
Factors were identified when in the simple structure approach eigenvalues greater than 1.0 were computed [
The participants of the observational study (N=42) were mainly male (25/42, 60%) and mostly between 21 and 40 years old, and the largest group held a university degree (20/42, 48%).
When searching for information on sleeping disorders, all participants used the “Google” search engine as a starting point. Other portals or direct access to websites of medical authorities were not considered. This seems to be in accordance with other recent findings [
In the interviews, the participants were asked individually about their personal observation protocol. They reported that the most relevant factor for choosing a specific website was its name. The observation protocols showed that a simpler domain name was more likely to be clicked, especially if the search term was an integral part of the name. As reasons for staying on a website and checking the provided information, most participants mentioned a friendly layout and quality content. Commonly mentioned reasons for leaving were disturbances by advertisements or pop-up boxes and inadequate information (too general or too specific). About 15 participants stressed the importance of a credible author, such as a governmental institution, a medical association, or professional medical personnel, as a reason to open or stay on a website.
Detailed sample description of the observational study, N=42.
Participants | n | % |
Total number, N | 42 | 100 |
Gender | | |
Male | 25 | 60 |
Female | 17 | 41 |
Age (years) | | |
17-20 | 5 | 12 |
21-30 | 10 | 24 |
31-40 | 12 | 29 |
41-50 | 8 | 19 |
50-62 | 7 | 17 |
Education | | |
No school degree | 1 | 2 |
Some school degree | 7 | 17 |
High school degree | 5 | 12 |
Professional school degree | 5 | 12 |
In university education | 4 | 10 |
University degree | 20 | 48 |
The sample of the Internet survey contained 454 participants; 55.1% (250/454) were male, 45.8% (208/454) were between 21 and 30 years old, and 32.2% (146/454) were still at a university. Of the participants, 50.2% (228/454) used the Internet often or very often to search for medical information, and 4.2% (19/454) reported working in the medical sector. In total, the link of the survey was accessed 995 times, implying a completion rate of 45.6% (454/995) among those who had accessed the site. Slightly more than half of the 454 participants (51.1%, n=232) were assigned to the high-quality website. Analysis of the participants’ Internet protocol (IP) addresses showed that all accessed the survey from a German Internet connection. The IP address is a unique number assigned to the computer used for the survey. A complete sample description is shown in
No statistically significant differences could be found between men and women, age groups, levels of Internet usage for health information, or educational levels. Working in the medical sector was negatively related to the ability to distinguish the quality of the website, but due to the small sample size, no further investigation can be done on this point.
Detailed sample description of the Internet survey.
Participants | Total, n | Total, % | High-quality page, n | High-quality page, % | Low-quality page, n | Low-quality page, % |
Total number, N | 454 | 100 | 232 | 100 | 222 | 100 |
Gender | | | | | | |
Male | 250 | 55.1 | 111 | 47.8 | 72 | 32.4 |
Female | 171 | 37.7 | 99 | 42.7 | 139 | 62.6 |
Missing | 33 | 7.3 | 22 | 9.5 | 11 | 5.0 |
Age (years) | | | | | | |
15-20 | 110 | 24.2 | 62 | 26.7 | 48 | 21.6 |
21-30 | 208 | 45.8 | 90 | 38.8 | 118 | 53.2 |
31-40 | 36 | 8.0 | 22 | 9.5 | 14 | 6.3 |
41-50 | 36 | 8.0 | 15 | 6.5 | 21 | 9.5 |
51-64 | 29 | 6.4 | 20 | 8.6 | 9 | 4.1 |
Missing | 35 | 7.7 | 23 | 9.9 | 12 | 5.4 |
Education | | | | | | |
No school degree | 1 | 0.2 | 1 | 0.4 | - | - |
In school education | 26 | 5.7 | 13 | 5.6 | 13 | 5.9 |
Some school degree | 59 | 13.0 | 37 | 15.9 | 22 | 9.9 |
High school degree | 82 | 18.1 | 44 | 19.0 | 38 | 17.1 |
Professional school degree | 4 | 0.9 | 2 | 0.9 | 2 | 0.9 |
In university education | 146 | 32.2 | 70 | 30.2 | 76 | 34.2 |
University degree | 135 | 29.7 | 64 | 27.6 | 71 | 32.0 |
Missing | 1 | 0.2 | 1 | 0.4 | - | - |
Working in the medical sector | | | | | | |
Yes | 19 | 4.2 | 6 | 2.6 | 13 | 5.9 |
No | 423 | 93.2 | 217 | 93.5 | 206 | 92.8 |
Missing | 12 | 2.6 | 9 | 3.9 | 3 | 1.4 |
Internet usage for searching medical information | | | | | | |
Not at all | 13 | 2.9 | 7 | 3.0 | 6 | 2.7 |
1 Little | 39 | 8.6 | 24 | 10.3 | 15 | 6.8 |
2 | 86 | 18.9 | 44 | 19.0 | 42 | 18.9 |
3 | 86 | 18.9 | 45 | 19.4 | 41 | 18.5 |
4 | 66 | 14.5 | 33 | 14.2 | 33 | 14.9 |
5 | 83 | 18.3 | 41 | 17.7 | 42 | 18.9 |
6 | 41 | 9.0 | 20 | 8.6 | 21 | 9.5 |
7 Very often | 38 | 8.4 | 16 | 6.9 | 22 | 9.9 |
Missing | 2 | 0.4 | 2 | 0.9 | - | - |
By means of the principal component analysis, the different dimensions were tested and the number of items was reduced. Out of the 45 items of the scale, four primary factors were identified, accounting in total for 65% of overall variance. Following the analysis of the items’ factor loadings and contexts, two factors were recognized as content-specific and the other two as specific to the surroundings of the content on the website. The 32 items that are not part of the final scale were excluded from further analysis because they displayed high cross-loadings, very low loadings, or no loadings on any factor. Factor 1 accounted for 32.37% (eigenvalue 4.275) of the variance, Factor 2 for 7.96% (eigenvalue 1.035), Factor 3 for 13.37% (eigenvalue 1.738), and Factor 4 for 10.83% (eigenvalue 1.408). The newly grouped items are shown in
Results of the principal component analysis.
Item | Factor 1, content-specifica | Factor 2, content-specifica | Factor 3, surrounding-specifica | Factor 4, surrounding-specifica |
The content convinced me. | .835 | | | |
The website appears to be trustworthy. | .770 | | | |
The website provides good information. | .758 | | | |
The author seems to be knowledgeable due to the academic title. | .737 | | | |
I learned something reading the content. | .688 | | | |
The text is too long. | | .854 | | |
The sentences have a difficult structure. | | .644 | | |
Advertisements distracted me. | | | .796 | |
The website contains dispensable links. | | | .732 | |
Nothing distracts from the content. | | | .706 | |
The website has a blurry layout. | | | .672 | |
In general advertisement pop-ups help to add meaningful information. | | | | .853 |
In general moving advertisement help to draw attention on the content. | | | | .726 |
Rotation method, Varimax with Kaiser Normalizationb |
a Extraction method, principal component analysis
b Rotated component matrix; Rotation converged in 5 iterations.
Factor 1 was labeled “Trustworthiness” and contained five items on the website being perceived as convincing, trustworthy, and informative (Cronbach alpha=.839). Factor 2 was labeled “Textual deficits” and united two items on text length and sentence complexity (Cronbach alpha=.761). Factor 3 was labeled “Interference” and combined items on irritation by advertisements, links, and layout (Cronbach alpha=.592). Finally, Factor 4, “Advertisements”, covered the distraction or usefulness of advertisements (Cronbach alpha=.532).
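For readers unfamiliar with the reliability coefficient reported for each factor, the standard formula for Cronbach alpha can be sketched as below. The software used in the study is not stated, the function name is hypothetical, and the two-item scores are invented purely for illustration; they are not study data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach alpha for a list of items, each a list with one score per respondent.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are needed")
    n = len(item_scores[0])
    if any(len(item) != n for item in item_scores):
        raise ValueError("every item needs one score per respondent")
    # Total score of each respondent across all items of the factor.
    totals = [sum(answers) for answers in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers of five respondents to a two-item factor.
toy_items = [
    [4, 5, 3, 6, 7],
    [5, 5, 2, 6, 6],
]
alpha = cronbach_alpha(toy_items)
```

Because alpha depends on the number of items, factors with only two items, such as Factors 2 and 4 here, tend to yield lower values than longer scales with comparable inter-item correlations.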
The Promax rotation for four factors showed that there were no correlations higher than the threshold of .32. Following Tabachnick and Fidell [
Factor correlations of the principal component analysis.
Factorsa | 1 | 2 | 3 |
2 | .256 | | |
3 | -.157 | -.218 | |
4 | .052 | .067 | .198 |
a Rotation Method, Promax with Kaiser Normalization.
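The comparison of the factor correlations with the .32 threshold can be checked directly from the values in the table above. The sketch below only restates that comparison; the names `FACTOR_CORRELATIONS` and `pairs_above_threshold` are hypothetical.

```python
# Factor correlations from the Promax solution, keyed by factor pair.
FACTOR_CORRELATIONS = {
    (1, 2): 0.256,
    (1, 3): -0.157,
    (2, 3): -0.218,
    (1, 4): 0.052,
    (2, 4): 0.067,
    (3, 4): 0.198,
}
THRESHOLD = 0.32  # threshold named in the text


def pairs_above_threshold(correlations, threshold=THRESHOLD):
    """Return the factor pairs whose absolute correlation exceeds the threshold."""
    return {pair: r for pair, r in correlations.items() if abs(r) > threshold}


flagged = pairs_above_threshold(FACTOR_CORRELATIONS)
largest = max(FACTOR_CORRELATIONS.values(), key=abs)
```

The sign is ignored because the threshold concerns the strength of the association between factors, not its direction.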
The analysis showed significant differences between the high- and the low-quality websites with regard to the perception of three of the four dimensions, all at a
Statistical differences between the two exposures.
Components | Ma | SD | dfb | tc | Significance |
Trustworthiness (Factor 1) | .778 | .112 | 452 | 6.970 | <.001 |
Interference (Factor 3) | .821 | .134 | 452 | 6.132 | <.001 |
Textual deficits (Factor 2) | -.595 | .122 | 452 | -4.905 | <.001 |
Advertisementsd (Factor 4) | -.107 | .134 | 452 | -.802 | .423 |
a M=Mean
b df=degrees of freedom
c
d Equal variances not assumed for this item
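A group comparison of this kind can be sketched with a two-sample t statistic. The exact procedure and software are not stated in the text; the version below uses Welch's form, which does not assume equal variances (the situation noted for the Advertisements factor), and is an assumption rather than the authors' method. The scores are invented for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(group_a) / len(group_a)  # squared standard error, group A
    vb = variance(group_b) / len(group_b)  # squared standard error, group B
    t = (mean(group_a) - mean(group_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (
        va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical factor scores for the two exposure groups (not study data).
high_quality_scores = [1, 2, 3, 4, 5]
low_quality_scores = [2, 3, 4, 5, 6]
t_value, df_value = welch_t(high_quality_scores, low_quality_scores)
```

With equal group sizes and equal variances, Welch's degrees of freedom coincide with the pooled value of n1 + n2 - 2; the two only diverge when the groups differ in size or spread.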
The reliability statistics for the four-item outcome measure (see
This research is based on the experience of average Internet users and quantitative testing of the designed scale. Therefore, it was possible to design a novel measure that covers, on the one hand, similar aspects as the DISCERN scale, but provides, on the other hand, important additional Internet-specific items. The items of the widely used DISCERN measure are divided into two sections that focus on the concepts of quality and credibility of the given information [
The sufficient level of scale reliability and the properties of this measure suggest that it allows the views of health information seekers on the provided information to be examined. The experimental design showed that the ratings obtained with the scale differentiate between a high- and a low-quality website. This makes the measure a useful tool for examining patients’ Internet searches. The measure was not designed based on specific websites, but on the search procedures of the participants of the observational study. Moreover, it is not condition-specific and can be applied to all medical information websites on the Internet. These characteristics allow the tool to be administered relatively easily in either Internet-based or paper-and-pencil research studies. It can thus serve as an easy-to-use measuring tool that can be incorporated alongside other measures. Useful applications can be found in the eHealth area and in website testing for health campaigns.
As is typical for experimental research, several aspects worked differently from what we expected. Participants who were exposed to the high-quality website rated its credibility in this measure higher on the factors trustworthiness and interference, but lower on textual deficits. The unexpected direction of this difference could be due to the different styles of the sites: while the high-quality site had long explanatory text parts, the low-quality site had only simple information. Moreover, unexpected results were found on Factor 4, which groups the advertisement items. The nonsignificant difference between the experimental conditions on this factor seems to stem from the specific item wording. In contrast to all other items in the final measure, these items could have suggested a more general answer that was not limited to the context of the website the participants had seen. Participants therefore answered these items based on their general attitude and opinion, and consequently, the answers were not affected by the website they had seen.
Most of the results regarding the rating of the different quality of the websites matched the prior assumptions of the research group. In this case, the measure seems to provide a sufficient rating tool that is able to produce judgments consistent with experts’ categorizations. Although the testing in this study was done on sleeping-disorder websites, other conditions can be included. As the content of the measure is not bound to a specific disease or medical condition, it can be widely used. Given the growing usage of Internet applications and Internet information by health professionals and laypersons, the catalogue of available measures is still very limited when it comes to the combination of content quality and medical information.
Initiating a research project with a student sample caused some difficulties, which were overcome by using the snowball system in order to include participants from outside the university. Still, the average age of the sample is rather young and, therefore, the sample is not representative of the population of Internet users. It should also be mentioned that health information searches on the Internet are linked to such sociodemographic characteristics as age, gender, and health status [
Further research with another independent sample will make it possible to confirm the factor structure of the scale. Moreover, it would be possible to address some of the limitations and to improve the measure by defining cut-point values as estimators for high- or low-quality content of websites. In this way, the measure could help health information users on the Internet who struggle with identifying quality websites. It would also be practical to continue examining this measure in comparison with the health literacy levels of participants to see whether predictors can be found there. So far, the results showed no differences related to (formal) knowledge in the research population.
This measure provides a practical tool, which will show its relevance for research on health information on the Internet. In contrast to previous attempts, this measure is designed for the Internet-setting of this information channel and the particular users’ behavior. The inclusion of the laypersons’ experience into the measurement development process might be seen as unusual, but crucially, this brings the consumers’ perspective into academic research. Therefore, the initially mentioned concern of public health authorities on the quality of health information provided on the Internet [
The seven dimensions with the original items of the measure based on the observational study (compare
HTML: HyperText Markup Language
IP: Internet protocol
The authors wish to thank Professor Dr Hans-Bernd Brosius, the Center of Advanced Studies, and the undergraduate students from the Ludwigs-Maximilians Universität Munich, Germany for their invitation and the possibility to give a course and their support of this study.
None declared.