Birgit Hawelka

August 14, 2025

Teaching Concepts

Exam diversity instead of a standardised exam: Flexible assessment with specification grading

At the end of every semester, students regularly complain about the mental strain and pressure of sitting a large number of exams within a short period. Lecturers, for their part, are dissatisfied with students’ short-lived cramming (so-called bulimic learning) and with the large pile of marking they face. Alternative approaches to assessment such as specifications grading (or specs grading for short) promise relief from this.

Specs grading can be understood as modular performance assessment; no corresponding German term appears to have become established yet. The approach was first proposed by Nilson (2014) and is now widely used, particularly at US universities. In essence, specs grading works as follows: lecturers design assignments, based on the module or course objectives, with which students can demonstrate that they have achieved an intended learning outcome. Then, and this is the unusual part of the method, the students themselves decide what grade they are aiming for. Depending on their goal, they choose the number and/or depth of the assignments to work on during the semester.

What might initially seem to be an anything-goes approach tinged with anarchy actually follows clear, didactically grounded principles. Howitz et al. (2025) have now compiled the first systematic review summarising the method behind the approach and the findings that speak for or against its use. A total of 90 individual studies were included in the analysis

that were published in peer-reviewed journals,
cross-referenced at least two aspects of specs grading and
were from the field of higher education (Bachelor’s or Master’s degree programmes).

The work identifies three core elements of specs grading: (1) bundles of assignments, (2) clear and simple evaluation and (3) the use of tokens.

(1) Bundles of assignments

The key element of specs grading lies in different assignments based on intended learning outcomes which students work on throughout the semester.

Howitz et al. (2025) identified three basic variants:
In variant 1, lecturers put together bundles of assignments for basic and in-depth skills (see Figure 1a). All students have to demonstrate the basic skills in order to pass the course. Students aiming for a higher grade work on additional in-depth assignments. Figure 1b uses a fictitious example to illustrate how a grade could come about depending on the assignments completed.

*Figure 1a.* Assignments set for variant 1 (left).
*Figure 1b*. Assignments completed for variant 1 (right).

This form of bundles of assignments specifically fosters basic knowledge and the fundamental specialist skills the students require for their further studies. It is mainly used in introductory courses, where the knowledge serves as a foundation for later modules.
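The logic of variant 1 can be sketched in a few lines of Python. This is only an illustration: the assignment names, the number of in-depth assignments and the grade thresholds are assumptions made for the example and do not come from Howitz et al. (2025).

```python
# Hypothetical assignment names, for illustration only.
BASIC = {"B1", "B2", "B3"}
IN_DEPTH = {"D1", "D2", "D3"}

# Assumed thresholds: number of passed in-depth assignments needed
# for each grade label, checked from best to worst.
THRESHOLDS = [(3, "very good"), (2, "good"), (1, "satisfactory"), (0, "pass")]


def variant1_grade(passed: set[str]) -> str:
    """Return a grade label for the set of assignments a student passed."""
    # Every basic assignment is mandatory for passing the course.
    if not BASIC <= passed:
        return "fail"
    depth_count = len(passed & IN_DEPTH)
    for needed, label in THRESHOLDS:
        if depth_count >= needed:
            return label
    return "fail"  # not reached, because the last threshold is 0


print(variant1_grade(BASIC | {"D1", "D2"}))
```

The point of the sketch is that in-depth work never compensates for a missing basic assignment: a student who passes all in-depth assignments but skips one basic assignment still fails.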

In variant 2, lecturers give equal weighting to all the intended learning outcomes of a course or module. When grading, it is not which intended learning outcomes were achieved that is decisive but how many assignments were completed successfully overall (see Figures 2a and 2b).

*Figure 2a.* Assignments set for variant 2 (left).
*Figure 2b*. Assignments completed for variant 2 (right).

This approach fosters the capacity to deal with all the course content in a way that is evenly balanced. It enables students to collect points systematically without individual content being weighted disproportionately. This clear assessment structure is particularly suitable for in-depth courses or courses in optional subjects.
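As a rough sketch, variant 2 reduces to counting passed assignments. The cut-offs and the ten assignment names below are hypothetical and chosen only to make the example concrete.

```python
# Assumed cut-offs: minimum number of passed assignments per grade label,
# checked from best to worst.
CUTOFFS = [(9, "very good"), (7, "good"), (5, "satisfactory"), (4, "pass")]


def variant2_grade(results: dict[str, bool]) -> str:
    """Grade by how many assignments were passed, not by which ones."""
    passed = sum(results.values())
    for minimum, label in CUTOFFS:
        if passed >= minimum:
            return label
    return "fail"


# Ten hypothetical assignments, of which the first seven were passed.
example = {f"A{i}": i <= 7 for i in range(1, 11)}
print(variant2_grade(example))
```

Because only the count matters, two students with the same grade may have demonstrated different intended learning outcomes, which is exactly the equal weighting described above.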

A third variant is suitable for formats with a small number of broadly formulated intended learning outcomes (e.g. laboratory or writing courses). The focus here is on the learning process and the continuous development of skills. In writing courses, for example, this would involve the development from ideation through structuring and drafting to revising and linguistic polishing. Each assignment covers intended learning outcomes at different depths, and the outcomes may well overlap between assignments. Students demonstrate the same intended learning outcome (e.g. producing different scientific texts) several times, at different levels of complexity and in different variations depending on the grade they are aiming for. In the example shown in Figures 3a and 3b, four assignments are set. The grade results from how many assignments students complete and the level at which they complete them (acceptable or very interesting). Three assignments at an acceptable level would be enough to pass, for example, and four assignments at an acceptable level would earn a satisfactory result. To achieve a grade of 2, at least one of three assignments would have to be very interesting. An excellent performance would require two out of three assignments to be very interesting.

*Figure 3a.* Assignments set for variant 3 (left).
*Figure 3b*. Assignments completed for variant 3 (right). Circles with a star indicate a particularly interesting quality.
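The example rules for variant 3 can be written down as a small function. This is one possible reading of the example, not a definitive scheme: it assumes that a very interesting assignment also counts as completed, and that the higher grades require at least three completed assignments. The grade labels are taken from the example in the text.

```python
def variant3_grade(levels: list[str]) -> str:
    """Map the levels reached on up to four assignments to a grade label.

    Each entry is "acceptable", "very interesting" or anything else
    (treated as not completed). The interpretation is an assumption.
    """
    stars = levels.count("very interesting")
    completed = stars + levels.count("acceptable")
    if completed < 3:
        return "fail"
    if stars >= 2:
        return "excellent"
    if stars >= 1:
        return "grade 2"
    if completed >= 4:
        return "satisfactory"
    return "pass"


print(variant3_grade(["very interesting", "acceptable", "acceptable"]))
```

Under this reading, quality outweighs quantity: three assignments with one very interesting result rank above four merely acceptable ones.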

(2) Clear and simple evaluation

In traditional examination systems, the assessment of student performance is based on differentiated criteria that are clearly defined. Assignments are not given a single overall judgement; instead, they are analysed and assessed against different requirement areas (Schaper et al., 2013).

The requirements for specs grading are also formulated transparently. Students know exactly what they have to do to pass an assignment. In most of the cases investigated by Howitz et al. (2025), however, the assessment is binary – a performance either meets the requirements or it does not; no gradations are provided for.

Occasionally, lecturers also work with three assessment levels, such as ‘clearly passed’ – ‘just passed’ – ‘needs revision’. This option enables more differentiated feedback and makes progress in the learning process more visible. It is usually used for bundles of assignments where there is a small number of intended learning outcomes which are comprehensively formulated (see Figure 3).
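A binary assessment of this kind amounts to a checklist in which every specification must be met. The following minimal sketch assumes three hypothetical specifications for a written assignment; none of them is taken from the review.

```python
def unmet_specs(checks: dict[str, bool]) -> list[str]:
    """Return the names of all specifications that are not yet met."""
    return [name for name, met in checks.items() if not met]


def assess(checks: dict[str, bool]) -> str:
    """Binary result: every specification must be met, with no partial credit."""
    return "meets requirements" if not unmet_specs(checks) else "does not meet requirements"


# Hypothetical specifications for one submission.
submission = {
    "states a research question": True,
    "cites at least five sources": False,
    "stays within the word limit": True,
}
print(assess(submission))
print(unmet_specs(submission))
```

Listing the unmet specifications gives students a concrete starting point for a revision, which fits the transparent formulation of requirements described above.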

(3) Tokens

Of the 90 studies included in the review, 46 explicitly describe the use of a token system in the context of specifications grading.

Some teachers allocate a set number of tokens at the beginning of the course, which students can then use flexibly, for example for assignments involving extra experiments, for an extended deadline or to make up for periods of absence where attendance is compulsory. In other courses, students earn tokens by handing in short reflection assignments or completing low-stakes quizzes. Mixed forms are also possible: students receive a basic allocation of tokens at the start and can earn extra tokens during the semester.

As a rule of thumb, some authors recommend basing the number of possible tokens per student on the number of assignments available in the course. The included studies suggest that this is a generous allowance: only in a few exceptional cases did students use all their tokens. Instead, they tended to use them in a strategic and targeted manner.
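A mixed token system, with a starting allocation plus tokens earned during the semester, could be tracked with a simple ledger. The class name `TokenLedger`, the cost of one token per use and the reasons shown are assumptions made for this sketch.

```python
class TokenLedger:
    """Tracks one student's tokens (hypothetical helper, not from the review)."""

    def __init__(self, initial: int = 0) -> None:
        if initial < 0:
            raise ValueError("initial token count must not be negative")
        self.balance = initial
        self.log: list[tuple[str, int]] = []  # (reason, change in balance)

    def earn(self, reason: str, amount: int = 1) -> None:
        """Add tokens, e.g. for a short reflection assignment or a quiz."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount
        self.log.append((reason, amount))

    def spend(self, reason: str, cost: int = 1) -> bool:
        """Use tokens if enough are left; return whether the request succeeded."""
        if cost <= 0:
            raise ValueError("cost must be positive")
        if cost > self.balance:
            return False
        self.balance -= cost
        self.log.append((reason, -cost))
        return True


ledger = TokenLedger(initial=3)
ledger.spend("extended deadline")
ledger.earn("reflection assignment")
ledger.spend("resubmission")
print(ledger.balance)
```

Keeping a log of reasons alongside the balance would also let a lecturer see afterwards how students actually used their tokens.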

What are the effects of the changeover to specs grading?

Although many of the individual publications referenced in the review (Howitz et al., 2025) describe the effects of the changeover, the available results are mainly based on anecdotal evidence; no controlled studies on the effects of specs grading have been conducted as yet. Nevertheless, a reasonably clear picture emerges on many points of didactic relevance.

Time commitment for lecturers

The changeover from a traditional grading system to specs grading is sometimes associated with the promise of minimised time expenditure for lecturers (Nilson, 2014). However, the literature review paints a more differentiated picture:

Although a significant proportion of lecturers (42.8%) reported a noticeable reduction in time expenditure, in a good quarter of the studies examined (28.5%) the total expenditure remained constant, and in a further 28.5% it even increased over the course of the changeover. The initial design of suitable bundles of assignments is described as particularly labour-intensive. Developing tailored assignments requires a considerable amount of preparation, which is only partially compensated for by the simplified marking process.

Lecturers only report noticeable time savings once courses have run several times, especially when specific structural adjustments have been made. These include, for example,

reducing the number of assignments assessed,
limiting the number of repetitions per assignment (e.g. using a token system) and
setting clear deadlines for revisions and resubmissions.

Students’ learning behaviour

Numerous lecturers observed a noticeable change in students’ learning behaviour as a result of specs grading being introduced. In particular, the possibility of using tokens to revise assignments several times often leads to feedback being used more intensively. The students’ focus shifts from simply collecting points to their dealing with the learning content in more depth and in a more sustained way. This can be interpreted as an indication of a shift in performance orientation: away from extrinsic motivation and strategic performance optimisation towards a stronger focus on content competency and genuine mastery of the subject matter.

In other cases, the option of making repeat attempts can lead to students submitting lower quality work on their first attempts. If a solid performance counts just as much as high quality, there may be a lack of incentive to process the assignments as well as possible. Some lecturers also observed a tendency towards increased procrastination in the students.

Learning success

The majority of publications (66.6%) report an improvement in final grades as a result of specs grading. In only four out of 22 cases did the grades deteriorate. As there are still no systematic and controlled comparative studies with traditional assessment systems available, it remains to be seen whether this is due to improved learning behaviour (see above) or to lower expectations (in the sense of grade inflation).

Stress and exam nerves

A core element of specs grading is the ability to revise and resubmit assignments. From the students’ point of view, this leeway reduces stress and performance anxiety in many cases. Several aspects may explain this: firstly, the increased clarity regarding expectations could contribute to a noticeable reduction in stress and exam nerves. In addition, the results of individual assignments become less important – grades are no longer awarded based on peaks in performance at specific times but are based on the continuous fulfilment of clearly defined criteria. The structured design and comprehensibility of the assessment system also give students a sense of orientation and security – they can work specifically towards the final grade they desire. Finally, some students also perceive the option of choosing certain assignments as particularly motivating and stress-reducing.

Where students did feel increased pressure to perform under specs grading, this was mainly due to a (mis)interpretation of the binary grading system. Students interpreted a rejection of their work as a complete failure rather than as an integral and valuable part of the learning process. Other students felt increased pressure to perform because they considered the threshold for passing an assignment to be very high.

Overall evaluation

Even if the results are not uniform across all the courses in the study, the overall mood among lecturers and students who have taught and learned with specs grading is quite positive. Three points seem to be particularly important for it to be successful:

Transparency. It must be clear to students which assignments they have to complete in order to achieve a certain grade and when they have passed these assignments.

Quality standards. High quality standards are important in order to make it clear that merely working through an assignment is not enough to pass it. At the same time, the performance expected for a pass must be realistic rather than the best possible.

Rules. Strictly regulated deadlines and repeat attempts in connection with a token system appear to ensure greater acceptance among lecturers and students alike.

From a didactic point of view, this system is certainly interesting. When lecturers work on creating the grade bundles, they are encouraged to reassess the intended learning outcomes of their course and the assignments that correspond to these intended learning outcomes in terms of constructive alignment. This could be a good way to make teaching more targeted and examinations more valid.

References

Howitz, M., Sommer, S., & Bastian, J. (2025). Specifications grading in higher education: A systematic review. Education Sciences, 15(1), 83. https://doi.org/10.3390/educsci15010083

Nilson, L. B. (2014). Specifications grading: Restoring rigor, motivating students, and saving faculty time. Stylus Publishing.

Schaper, N., Hilkenmeier, F., & Bender, E. (2013). Umsetzungshilfen für kompetenzorientiertes Prüfen: HRK‑Zusatzgutachten (Projekt nexus). Hochschulrektorenkonferenz. https://www.hrk-nexus.de/fileadmin/redaktion/hrk-nexus/07-Downloads/07-03-Material/zusatzgutachten.pdf

Suggestion of citation for this blog post

Hawelka, B. (2025, August 14). Exam diversity instead of a standardised exam: Flexible assessment with specification grading. Lehrblick – ZHW Uni Regensburg. https://doi.org/10.5283/ZHW.20250814.EN

Birgit Hawelka


Dr. Birgit Hawelka is a research associate at the Center for University and Academic Teaching at the University of Regensburg. Her research and teaching focus on teaching quality and evaluation. She is also curious about all developments and findings in the field of university teaching.
