Examining the Supervision Work Alliance Scale: A Rasch Model Approach

Agus Taufiq1, Eka Sakti Yudha1, Yusof Hapsah Md2, Dodi Suryana1, *
1 Department of Guidance and Counseling, Faculty of Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
2 Department of Psychology and Counseling, Universiti Pendidikan Sultan Idris, Perak, Malaysia

Creative Commons License
© 2021 Taufiq et al.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Department of Psychology and Counseling, Universiti Pendidikan Sultan Idris, Perak, Malaysia; E-mail:



The supervisory working alliance has a role in facilitating guidance and counseling supervisors in providing understanding of how the service works. Measuring the level of supervision work alliance is one way that can be done to find out whether a supervisor has a good supervisory work alliance or not.


The research aims to describe the quality of the Supervision Work Alliance Scale (SWAS) instrument.

Materials and Methods:

The study employed a cross-sectional method with a quantitative research design. Participants were counseling teachers implementing the internship program, comprising 17 males and 55 females. The item parameters were identified through the category coefficients of the Rasch scoring function model for polytomous responses.


The results show that the 34 items proved to be compatible with the SWAS instrument. The Cronbach alpha of the instrument was 0.91, which places it in the high category of reliability. Misfit items made up only 5.88% of the items, so the items in the SWAS were, on the whole, well understood by the participants.


The SWAS instrument is valid and reliable, so it can be used to measure the supervisory work alliance variable.

Keywords: Supervision work alliance, RASCH Model, Ability, Cronbach alpha, Coefficient, Guidance.


Working alliances in counseling and psychotherapy activities are a central construct that appears in the clinical supervision literature. This alliance is recognized for being significant in increasing the effectiveness of supervision activities. It is carried out between the client and the therapist in a counseling or therapy activity, which plays a crucial role in facilitating the process. Even the changes that occur in the client are a function of developing a solid working alliance rather than the theory or technique used by the therapist [1]. The supervisory alliance can be said to be strong when the supervisor and clients can reach a mutual agreement about the purpose of supervision implementation. The deal will make the supervisory relationship less tense. The implementation of supervision can be achieved optimally, and participants can get new experiences that can foster their professional performance.

Supervision can be adequate when participants, in this case, the counselor, can be competent. Bernard and Goodyear (2014) say that increasing participants' competence through supervision is necessary because their abilities will not improve if there is no identification and assessment of their professional activities [2]. Supervision needs to be done by paying attention to the thoughts, feelings, actions, and ideas expressed or combining these [3].

A study conducted by McCarthy in 2013 discussed the relationship between supervisory work alliances and the results of rehabilitation counseling. The study directs supervisors to build a strong working alliance with the participants, in this case, the counselors. Regular contact is also maintained so that a healthy working alliance develops, and a code of conduct is applied to guide the experience shared by supervisor and client. The study indicates that the supervisory work alliance has a positive relationship with the results of rehabilitation counseling. Therefore, a work alliance needs to be formed to improve the counselor's ability to conduct counseling [4].

Another study, conducted by Parcover and Swanson in 2013, discussed the role of the supervisory work alliance in career counseling carried out by participants, in this case, the counselors. The research used a case study method to understand the effect of the supervisory work alliance in the supervision of career counselors. The results showed that the supervisory work alliance had an impact from the beginning of the activity, before the participants carried out the counseling. In addition, the supervision achieved good results, and the participants internalized an awareness of differences in ability, which fostered their desire to learn more about being competent counselors [5].

The studies mentioned are just a few of the research results showing the positive impact of the supervisory work alliance. Through the work alliance, the counselor's supervision helps support the running of guidance and counseling services [6]. Counselors need to have optimal skills to understand and follow the provisions in carrying out their professional activities [7]. Frank and Gunderson (1990) said that counselors who have the right alliances are predicted to have a better retention rate and better service outcomes than counselors who do not have a supervisory work alliance [8].

Therefore, identifying supervisory work alliances is worthwhile. The identification results will inform interventions so that the counselor has better potential and offers higher quality guidance and counseling services [9]. Accurately delivered interventions will also help solve the problems faced, so that the counselor achieves satisfaction in providing services.

The research discussed in this article is about the formulation of a scaling instrument for the supervisory work alliance named the Supervision Work Alliance Scale (SWAS). The purpose of this research is to describe the quality of SWAS to measure the supervision work alliance. The instruments tested will produce accurate data about the conditions in the field to provide the proper intervention.

In this research, the Rasch model is used to analyze the results of the instrument. The Rasch model has the advantage of producing a measurement scale with equal intervals, which provides accurate information about the participants and the quality of the work [10,11]. The model offers several advantages: (1) it provides a linear scale with equal intervals, (2) it can predict missing data, (3) it gives more accurate estimates, (4) it can detect model inaccuracy, and (5) it provides replicable measurements. The SWAS is analyzed in terms of unidimensionality, the Wright map, item analysis, ability analysis, and instrument analysis.


The research employed a quantitative approach with a cross-sectional research design. Participants were selected by random sampling, with the sample adjusted to the research subjects. The data were analyzed using the Rasch model in WINSTEPS version 3.92.0.

2.1. Population and Research Samples

The research subjects were the guidance and counseling teachers in schools implementing the internship program of guidance and counseling, totaling 72 people. The following is a table of research samples:

Table 1. Research sample.
Gender Amount
Male 17
Female 55
Total 72

Based on Table 1, 72 people participated in this study consisting of 55 females and 17 males.

2.2. Research Variable

The analysis covers unidimensionality, Wright map analysis, item analysis, ability analysis, and instrument analysis, based on the output tables of the WINSTEPS application.


3.1. Unidimensionality

Unidimensionality can be examined in output Table 2 of WINSTEPS. Unidimensionality of measurement is supported if the raw variance explained by measures is ≥ 20% (the general criteria for interpretation are: enough if 20-40%, good if 40-60%, and excellent if above 60%) and if the unexplained variance in each of the first to fifth contrasts of residuals is < 15% [12]. Table 2 describes the unidimensionality:

Table 2. Standardized residual variance.
Variance component                      Eigenvalue   Empirical   Modeled
Total raw variance in observations      70.2         100%        100%
Raw variance explained by measures      36.2         51.6%       51.6%
Raw variance explained by persons       8.2          11.6%       11.6%
Raw variance explained by items         28.0         39.9%       39.8%
Raw unexplained variance (total)        34.0         48.4%       100%
Unexplained variance in 1st contrast    11.2         16.0%       33.1%
Unexplained variance in 2nd contrast    2.9          4.1%        8.4%
Unexplained variance in 3rd contrast    2.0          2.8%        5.8%
Unexplained variance in 4th contrast    1.8          2.6%        5.4%
Unexplained variance in 5th contrast    1.4          2.0%        4.1%

Table 2 shows the results of the unidimensionality analysis. The raw variance explained by measures was 51.6%, which falls in the good range. The unexplained variance in the first to fifth contrasts of residuals was 16.0%, 4.1%, 2.8%, 2.6%, and 2.0%, respectively. The second to fifth contrasts are each below the 15% limit and are therefore consistent with measuring the SWAS variable. The first contrast, at 16.0%, exceeds the limit, which means the instrument cannot yet be said to measure a single variable of the supervision work alliance.
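The two screening rules described above can be written as a short check. The following is a minimal sketch in Python, using the percentages from Table 2 and the thresholds quoted in the text. The function names, the "insufficient" label for values below 20%, and the treatment of the 40% and 60% boundaries are assumptions made for illustration. It is not part of the WINSTEP output.

```python
# Percentages of total raw variance copied from Table 2.
RAW_VARIANCE_EXPLAINED = 51.6
CONTRASTS = {1: 16.0, 2: 4.1, 3: 2.8, 4: 2.6, 5: 2.0}

MIN_EXPLAINED = 20.0  # lower limit for explained variance, as stated in the text
MAX_CONTRAST = 15.0   # upper limit for each residual contrast, as stated in the text


def explained_label(pct):
    """Map explained variance to the interpretation bands quoted in the text.

    The label "insufficient" and the handling of the exact boundaries
    (40 and 60) are assumptions; the text only gives the ranges.
    """
    if pct < MIN_EXPLAINED:
        return "insufficient"
    if pct <= 40.0:
        return "enough"
    if pct <= 60.0:
        return "good"
    return "excellent"


def failing_contrasts(contrasts, limit=MAX_CONTRAST):
    """Return the contrast numbers whose unexplained variance is not below the limit."""
    return [number for number, pct in sorted(contrasts.items()) if pct >= limit]


label = explained_label(RAW_VARIANCE_EXPLAINED)
failing = failing_contrasts(CONTRASTS)
print(f"explained variance: {label}; contrasts at or above the limit: {failing}")
```

With the Table 2 values, only the first contrast is flagged, which matches the interpretation given in the text.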

3.2. Wright Map Analysis

The Wright map, shown in output Table 1, indicates that the scale map of the supervisory work alliance spreads over the -2 to 4 logit range. The participants' abilities lie between -1 SD and +3 SD. Comparing the scale map with the participants' abilities, three participants are outliers: their abilities are higher than the range covered by the scale map. Output Table 7, which reports the participants' ability, and output Table 3, which reports the item measures, show that the participants' average ability is 0.25 logits and the average item measure is 0.00 logits. These results show that the participants' average level on the supervisory work alliance scale is above the average difficulty level of the items [13].

3.3. Item Analysis

Item analysis was carried out by examining the item difficulty level, suitability level, diagnostic rating scale, and bias detection. The first analysis determines the difficulty level of each item. Output Table 1 gives an SD value of 0.63. Combining this value with the logit average of 0.00, the item difficulty level can be categorized as follows:

Table 3. Difficulty level category.
Range             Category
> 0.63            Hard
0.00 to 0.63      Difficult
-0.63 to -0.01    Easy
< -0.63           Very easy

Based on Table 3, the 34 items can be classified by item difficulty level category. The items are listed by level of difficulty in Table 4.

Table 4. Item classification based on the level of difficulty.
Category     Item numbers
Hard         22, 33, 24, 21, 25, 34, 30, 23
Difficult    31, 28, 12, 26, 19, 27, 20
Easy         5, 11, 6, 32, 13, 29, 15, 16, 18, 9, 14, 2, 7, 10
Very easy    17, 4, 3, 8, 1
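The banding rule in Table 3 (item mean plus or minus one SD) can be sketched as a small function. This is an illustrative Python sketch only. The item logits in `examples` are hypothetical values chosen to hit each band, since the article does not list the individual item measures, and the function name is an assumption.

```python
ITEM_MEAN = 0.00  # mean item logit reported in the text
ITEM_SD = 0.63    # item SD reported in the text


def difficulty_category(logit, mean=ITEM_MEAN, sd=ITEM_SD):
    """Classify an item logit using the bands of Table 3."""
    if logit > mean + sd:
        return "Hard"
    if logit >= mean:
        return "Difficult"
    if logit >= mean - sd:
        return "Easy"
    return "Very easy"


# Hypothetical logits for illustration; these are NOT measures from the study.
examples = {"item_a": 0.95, "item_b": 0.30, "item_c": -0.20, "item_d": -1.10}
categories = {name: difficulty_category(value) for name, value in examples.items()}

for name, category in categories.items():
    print(name, examples[name], category)
```

Deriving the cut points from the mean and SD, rather than hard-coding 0.63, keeps the rule reusable if the instrument is recalibrated on a new sample.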

The next item analysis concerns item suitability (fit). It identifies whether each item functions normally when the measurement is taken. Items that function normally give participants an appropriate conception of what the items ask. The analysis was carried out using Table 4 on the item fit order.

The analysis was carried out by observing the OUTFIT mean square (MNSQ), OUTFIT Z-standard (ZSTD), and point measure correlation (PT MEASURE CORR) columns. Boone (2014) stated that the following standards need to be considered to identify misfitting items: (1) the OUTFIT MNSQ value is greater than 0.5 and less than 1.5, and the closer to 1, the better; (2) the OUTFIT ZSTD value is greater than -2.0 and less than +2.0, and the closer to 0, the better; and (3) the PT MEASURE CORR value is greater than 0.4 and less than 0.85. An item can be considered fit if it meets at least one of the three criteria [14].
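The three criteria and the "at least one" rule can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, and the statistics passed in the usage lines are invented values, not figures from the WINSTEP output of this study.

```python
def fit_checks(outfit_mnsq, outfit_zstd, pt_measure_corr):
    """Evaluate each of the three criteria quoted in the text."""
    return {
        "outfit_mnsq": 0.5 < outfit_mnsq < 1.5,
        "outfit_zstd": -2.0 < outfit_zstd < 2.0,
        "pt_measure_corr": 0.4 < pt_measure_corr < 0.85,
    }


def is_fit(outfit_mnsq, outfit_zstd, pt_measure_corr):
    """Apply the rule used in the text: fit if at least one criterion is met."""
    return any(fit_checks(outfit_mnsq, outfit_zstd, pt_measure_corr).values())


# Hypothetical statistics for illustration only.
print(is_fit(1.02, 0.10, 0.55))  # meets all three criteria
print(is_fit(1.20, 2.40, 0.30))  # meets only the MNSQ criterion
print(is_fit(1.80, 2.60, 0.20))  # meets none of the criteria
```

Returning the individual checks from `fit_checks` makes it possible to report which criterion an item fails, which is the information summarized per criterion in a misfit table.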

Based on these data, the items that can be considered inappropriate, or misfit, are listed in Table 5.

Table 5. Misfit items.
Criteria                                  Misfit items
OUTFIT MNSQ (x < 0.5 or x > 1.5)          34, 5
OUTFIT ZSTD (x < -2.0 or x > 2.0)         34, 28, 12, 5, 11, 13, 16, 9, 7
PT MEASURE CORR (x < 0.4 or x > 0.85)     34, 12, 5, 11, 1

As Table 5 shows, two items, numbers 34 and 5, meet none of the criteria. The participants did not adequately understand these two items, so they do not measure the supervision work alliance scale well. The other 32 items can be considered fit because each of them meets at least one predetermined criterion. Therefore, 32 items are well understood and can be used to measure the supervisory work alliance scale.

The next item analysis is the rating scale diagnostic. It examines whether the participants understood the differences between the answer alternatives [14]. The rating (partial credit) scale output (Table 3) is used for this analysis. The ANDRICH THRESHOLD values in the output must be ordered, increasing steadily across answer alternatives 1, 2, 3, 4, 5, 6, and 7 [15]. The Andrich threshold values are given in Table 6.

Table 6. Andrich threshold.
Answer alternative    Andrich threshold
1                     None
2                     -0.74
3                     -0.33
4                     -1.12
5                     0.46
6                     0.21
7                     1.51

The Andrich threshold values in Table 6 are disordered: the value does not increase at answer alternatives 4 and 6. Thus, the differences between answer choices 1, 2, 3, 5, and 7 can be understood by the participants, while answer choices 4 and 6 cannot be clearly distinguished by the participants.
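The ordering check on the thresholds can be sketched in a few lines of Python, using the values from Table 6. The function name is an assumption, and the rule implemented is the one stated in the text: each threshold should be higher than the one for the preceding answer alternative.

```python
# Andrich thresholds from Table 6, keyed by answer alternative.
# Alternative 1 has no threshold and is omitted.
THRESHOLDS = {2: -0.74, 3: -0.33, 4: -1.12, 5: 0.46, 6: 0.21, 7: 1.51}


def disordered_alternatives(thresholds):
    """Return the alternatives whose threshold does not exceed the previous one."""
    alternatives = sorted(thresholds)
    return [
        current
        for previous, current in zip(alternatives, alternatives[1:])
        if thresholds[current] <= thresholds[previous]
    ]


flagged = disordered_alternatives(THRESHOLDS)
print("disordered answer alternatives:", flagged)
```

Applied to Table 6, the check flags alternatives 4 and 6, the same two answer choices identified in the text.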

The next item analysis is item bias detection. As a check on validity, it ensures that the instrument and its items do not contain bias, that is, do not favor specific individuals or groups. Item bias was detected using the output (Table 4) by looking at the probability value of each item. An item is considered biased if its probability value is below 0.05 [16]. The item bias in this analysis was examined by gender.

The bias analysis by gender identified four biased items, namely numbers 17 (p = 0.0476), 24 (p = 0.0456), 29 (p = 0.0180), and 32 (p = 0.0060). Items 17, 29, and 32 favor male participants, and item 24 favors female participants.

3.4. Ability Analysis

This analysis was conducted to determine each participant's ability and the level of conformity of the results with the participant's ability. The output table used in performing the ability analysis is output (Table 7) regarding person measures and output (Table 6) regarding person fit orders.

The analysis of participant ability was carried out by looking at the SD and the mean in output Table 17. The SD and mean values were 0.62 and 0.25, respectively. The two values are combined to obtain the participant ability categories [17], which are shown in Table 7:

Table 7. Participants ability category.
Range            Category
> 0.87           High
-0.37 to 0.87    Medium
< -0.37          Low

Based on the value of each participant and considering Table 7 regarding the category of participant's level of ability, it is found that there are 4 participants in the high ability category, 62 participants in the moderate ability category, and 6 participants in the low ability category.
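The cut points in Table 7 follow from the person mean plus or minus one SD (0.25 + 0.62 = 0.87 and 0.25 - 0.62 = -0.37). A minimal Python sketch of that rule is given below. The function names are assumptions, and the person measures in the usage lines are hypothetical, since individual measures are not listed in the article.

```python
PERSON_MEAN = 0.25  # mean person measure reported in the text
PERSON_SD = 0.62    # person SD reported in the text


def ability_cutoffs(mean=PERSON_MEAN, sd=PERSON_SD):
    """Return the (low, high) cut points as mean - SD and mean + SD."""
    return round(mean - sd, 2), round(mean + sd, 2)


def ability_category(measure, mean=PERSON_MEAN, sd=PERSON_SD):
    """Classify a person measure using the bands of Table 7."""
    low, high = ability_cutoffs(mean, sd)
    if measure > high:
        return "High"
    if measure < low:
        return "Low"
    return "Medium"


print(ability_cutoffs())
# Hypothetical person measures for illustration only.
for measure in (1.20, 0.25, -0.50):
    print(measure, ability_category(measure))
```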

The suitability of the results with the participants' ability was analyzed using output Table 6 on the person fit order. The analysis was carried out by looking at the OUTFIT mean square (MNSQ), OUTFIT Z-standard (ZSTD), and point measure correlation (PT MEASURE CORR) columns. Boone (2014) states the following criteria for determining the suitability of the results with the participant's ability: (1) the OUTFIT MNSQ value is greater than 0.5 and less than 1.5, and the closer to 1, the better; (2) the OUTFIT ZSTD value is greater than -2.0 and less than +2.0, and the closer to 0, the better; and (3) the PT MEASURE CORR value is greater than 0.4 and less than 0.85. A participant can be considered fit if he or she meets at least one of the three criteria [14].

Based on these criteria, it is known that 69 participants were declared fit in the sense of providing answers according to their level of ability. Meanwhile, the other 3 gave answers that were not according to their level of ability.

3.5. Instrument Analysis

Instrument analysis is also carried out by paying attention to output (Table 8) regarding summary statistics. Table 8 describes the mean value, SD, separation, reliability, and Cronbach alpha. The following is Table 8, which is used to analyze the instrument:

Table 8. Statistic summary.
          Mean    SD      Separation    Reliability    Cronbach alpha
Person    0.25    0.62    3.01          0.90           0.91
Item      0.00    0.63    6.03          0.97

Table 8 summarizes the participants' ability, the item difficulty, the interaction between participants and items, and the reliability. The participants' average ability of 0.25 and the item average of 0.00 indicate that the participants' ability was higher than the items' difficulty. The Cronbach alpha value was 0.91, which places the interaction between participants and items in the high category. The person reliability was 0.90, which means that the consistency of the participants' answers was also in the high category. The item reliability was 0.97, which means that the quality of the items was in the special category [18].

From the output in Table 8, the separation is 3.01 for persons and 6.03 for items. The greater the separation value, the better the overall quality of the persons and the instrument. The separation can be expressed more precisely as a number of strata using the formula H = [(4 × separation) + 1] / 3 [19]. This gives 4.35 for persons, rounded to 4, and 8.37 for items, rounded to 8. It implies that the quality of the research participants is very good, and the quality of the instrument is in the special category [20].
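The strata formula can be checked with a short calculation. The Python sketch below applies the formula quoted in the text to the separation values in Table 8. As an additional cross-check that is not taken from the article, it also uses the standard Rasch relation between separation G and reliability, R = G² / (1 + G²), to confirm that the reported separations and reliabilities agree. The function names are assumptions.

```python
def strata(separation):
    """Number of strata, H = (4 * separation + 1) / 3, as quoted in the text."""
    return (4.0 * separation + 1.0) / 3.0


def reliability_from_separation(separation):
    """Standard Rasch relation R = G^2 / (1 + G^2); not stated in the article."""
    return separation ** 2 / (1.0 + separation ** 2)


PERSON_SEPARATION = 3.01  # from Table 8
ITEM_SEPARATION = 6.03    # from Table 8

person_strata = strata(PERSON_SEPARATION)
item_strata = strata(ITEM_SEPARATION)
print(f"person strata: {person_strata:.3f}, item strata: {item_strata:.3f}")

person_reliability = reliability_from_separation(PERSON_SEPARATION)
item_reliability = reliability_from_separation(ITEM_SEPARATION)
print(f"implied reliabilities: {person_reliability:.2f}, {item_reliability:.2f}")
```

The unrounded strata are about 4.347 for persons and 8.373 for items, which round to the whole-number values of 4 and 8 given in the text. The implied reliabilities round to 0.90 and 0.97, matching Table 8.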

Other data from the Table 8 output that can be used are the INFIT MNSQ and OUTFIT MNSQ values, in both the Person table and the Item table [14]. In the Person table, the average INFIT MNSQ and OUTFIT MNSQ values are 1.05 and 1.01, respectively. In the Item table, the average INFIT MNSQ and OUTFIT MNSQ values are 0.98 and 1.01, respectively. The ideal value is 1, so the closer the values are to 1, the better. Thus, the averages for persons and items approach the ideal criterion.

As for INFIT ZSTD and OUTFIT ZSTD, the average values for persons are -0.20 and -0.20, respectively, and the average values for items are -0.20 and -0.10, respectively. The ideal value of ZSTD is 0, and the closer to 0, the better. Thus, it can be said that the quality of the persons and items is acceptable.

Information about the results of the measurement focus is illustrated in the following figure:

Based on Fig. (1), the information function test curve shows that the item separation has a high value [16]. Thus, 34 items given to 72 participants indicated that they were suitable for determining the supervision alliance scale.

Fig (1). Test information function.


The supervisory work alliance scale is one instrument that can be used to determine the supervisory work alliance's quality for supervisors and guidance and counseling teachers [21]. The results of the measurements carried out provide notes to improve the instrument to obtain better results.

The analysis of the instrument items indicates that several revisions are needed to obtain an instrument whose items can measure the supervisory work alliance. The unidimensionality analysis shows that the scale still cannot measure the variable as a whole: the unexplained variance in the first contrast exceeds the limit, so the scale does not measure a single variable [12,18]. Furthermore, two items, namely 34 and 5, and answer alternatives 4 and 6 need to be improved, because the participants could not understand them well and they therefore do not yield optimal results. Bias was also found in items 17, 24, 29, and 32. Such bias benefits certain groups, so these items produce values that are unfair to other groups [22].

The results of the participant ability analysis showed that 62 participants were in the moderate category. This indicates that most participants have not yet attained a high-quality supervisory work alliance. Healthy relationships need to be formed through training activities [23,24]. Facilitating learning by providing space to express anxieties, monitoring the counseling process, and so on will stimulate participants to build a healthy relationship in the supervisory work alliance [25]. Although bonding between trainees and supervisors is seen as the primary key, the trainees' salient characteristics also help form a healthy relationship in the supervisory work alliance [26]. Participants' attachment also needs attention in training, with regard to personality, relationships, and work behavior [27-30].

The instrument reliability results are outstanding. The reliability of the supervisory work alliance scale instrument is in the special category [18]. The SWAS instrument will give consistent results if the measurement is carried out more than once [31]. Consistent results provide confidence in the results obtained from the SWAS instrument. Reliable results will lead to further action according to the identified needs [32].


The analysis results show that the SWAS instrument is still unable to measure the supervisory work alliance variable and several items need to be corrected. There is also a bias in several item questions that need to be addressed. Unidimensionality analysis shows that the instrument is still unable to measure the supervision work alliance variable as a whole because there are results that exceed the predetermined measurement standards.

Meanwhile, the reliability of the instrument is in a special category. The instrument provides the right consistency when the instrument is tested more than once, so the results will not differ from the previous results. Therefore, the results obtained will lead to each participant's objective conditions, and the supervisory work alliance scale can be used to measure the supervision work alliance.


This study was approved by Indonesia Guidance and Counseling Association (ABKIN) Jawa Barat, Indonesia under approval no. 05/ABKIN Jabar/04/2020.


No animals were used in this research. All human research procedures followed were in accordance with the ethical standards of the committee responsible for human experimentation (institutional and national), and with the Helsinki Declaration of 1975, as revised in 2013.


Informed consent was obtained from all participants when they were registered.


The data are available from the corresponding author [D.S.] upon reasonable request.


Letter of Agreement on the Implementation of the Research Grant Program in the Universitas Pendidikan Indonesia in 2020 between the Chair of the Institute for Research and Community Service at the University and the Chair of Research Lecturer at the University Number 894 / UN40.D / IPT / 2020.


There is no conflict of interest.


The authors thank Universitas Pendidikan Indonesia for its support of this research testing the SWAS using the Rasch model, and the guidance and counseling teachers who contributed as participants.


[1] Bordin ES. A working alliance based model of supervision. Couns Psychol 1983.
[2] Logan JD. The relationship among counseling supervision satisfaction, counselor self-efficacy, working alliance and multicultural factors. ProQuest Diss Theses 2014.
[3] Bordin ES. Supervision in counseling: II. Contemporary models of supervision: A working alliance based model of supervision. Couns Psychol 1983.
[4] McCarthy AK. Relationship between supervisory working alliance and client outcomes in state vocational rehabilitation counseling. Rehabil Couns Bull 2013.
[5] Parcover JA, Swanson JL. Career counselor training and supervision: Role of the supervisory working alliance. J Employ Couns 2013.
[6] Garner CM, Webb LK, Chaffin C, Byars A. The soul of supervision: Counselor spirituality. Couns Values 2017.
[7] ACA. Licensure & certification—State professional counselor licensure board 2012. available at:
[8] Lambert MJ, Barley DE. Introduction—Research summary on the therapeutic relationship and psychotherapy outcome. Psychotherapy 2001; 4(38): 357.
[9] Crockett S, Hays DG. The influence of supervisor multicultural competence on the supervisory working alliance, supervisee counseling self-efficacy, and supervisee satisfaction with supervision: A mediation model. Couns Educ Superv 2015.
[10] Perdana SA. Analisis kualitas instrumen pengukuran pemahaman konsep persamaan kuadrat melalui teori tes klasik dan rasch model. Jurnal Kiprah 2018; 6(1): 41-8.
[11] Cresswell JW, Plano Clark VL. Designing and conducting mixed methods research 2nd ed. 2011.
[12] Beglar D. A Rasch-based validation of the vocabulary size test. Lang Test 2010.
[13] Abdullah N, Noranee S, Khamis MR. The use of rasch wright map in assessing conceptual understanding of electricity. Pertanika J Soc Sci Humanit 2017.
[14] Boone WJ, Yale MS, Staver JR. Rasch analysis in the human sciences 2014.
[15] Andrich D. Rasch models. International Encyclopedia of Education 2010.
[16] Sumintono B. Rasch model measurements as tools in assessment for learning 2018.
[17] Stout JL, Gorton GE III, Novacheck TF, et al. Rasch analysis of items from two self-report measures of motor function: Determination of item difficulty and relationships with children’s ability levels. Dev Med Child Neurol 2012; 54(5): 443-50.
[18] Susongko P. Validation of science achievement test with the Rasch model 2016.
[19] Yasin RM, Yunus FAN, Rus RC, Ahmad A, Rahim MB. Validity and reliability learning transfer item using rasch measurement model. Procedia Soc Behav Sci 2015.
[20] Fisher WP. Rating scale instrument quality criteria. Rasch Meas Trans 2007.
[21] Efstation JF, Patton MJ, Kardash CM. The supervisory working alliance inventory: A validity study. J Couns Psychol 1990.
[22] Adel SMR, Davoudi M, Ramezanzadeh A. A qualitative study of politeness strategies used by Iranian EFL learners in a class blog. Iran J Lang Teach Res 2016.
[23] Gunn JE, Carole Pistole M. Trainee supervisor attachment: Explaining the alliance and disclosure in supervision. Train Educ Prof Psychol 2012.
[24] Holloway EL. Clinical supervision: A systems approach 1995.
[25] Bernard JM, Goodyear RK. Fundamentals of clinical supervision 4th ed. 2009.
[26] Ladany N, Friedlander ML, Nelson ML. Critical events in psychotherapy supervision: An interpersonal approach.
[27] Bennett CS. Attachment-informed supervision for social work field education. Clin Soc Work J 2008.
[28] Fitch JC, Pistole MC, Gunn JE. The bonds of development: An attachment-caregiving model of supervision. Clin Superv 2010.
[29] Neswald-McCalip R. Development of the secure counselor: Case examples supporting Pistole & Watkins’s (1995) discussion of attachment theory in counseling supervision. Couns Educ Superv 2001.
[30] Pistole MC, Fitch JC. Attachment theory in supervision: A critical incident experience. Couns Educ Superv 2008.
[31] Duruturk N, Tonga E, Gabel CP, Acar M, Tekindal A. Cross-cultural adaptation, reliability and validity of the Turkish version of the lower limb functional index. Disabil Rehabil 2015; 37(26): 2439-44.
[32] Strobl C, Kopf J, Zeileis A. Rasch Trees: A new method for detecting differential item functioning in the rasch model. Psychometrika 2015; 80(2): 289-316.