The rationale for establishing a national text labelling and annotation service for the education sector


Establishing a national text labelling and annotation service for the education sector would open up a large number of opportunities. Such a service could be used to address a wide range of problems encountered by the individuals and institutions that operate within the field. In this short article I explore the rationale for such a service.

Labelling and annotating text
As mentioned in previous articles, schools, colleges, universities and the other organisations and companies that operate within the education sector create, process and store large quantities of text. At present, the bulk of that text is not labelled or annotated (Hussain, 2022a and 2022b). Consider a pool of anonymised text that an institution holds for all the complaints it has received from students in recent years. How could that text be labelled, and what can be derived from it once it is labelled and annotated? Firstly, the team that manages complaints will need to agree upon the set of labels that can be assigned to this body of text. At a basic level, these labels may simply reference the names of departments or services. Once the body of complaint text is added to the labelling and annotation platform, the sentences that pertain to specific departments or services can be labelled with the relevant department or service name(s). The labelled sentences are then used to train a classification model, which is routinely tested as training progresses. Once the model is ready, a service can be deployed that handles incoming complaints and automatically directs them to the relevant department or service managers. If institutions agree to share or pool their labelled data for this specific use case, then that dataset can be utilised by subscribers to the national platform. This simple example can be repeated in other use cases and contexts by individual institutions, by groups of institutions or collectively across the education sector.
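
The complaint-routing workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any existing platform: the example sentences, the department labels and the function names (tokenize, train, classify) are all hypothetical, and a small Naive Bayes classifier with add-one smoothing stands in for whatever model a real service might use.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labelled sentences. The labels are department names of the
# kind a complaints team might agree upon; none of this is real data.
LABELLED = [
    ("The library closes too early during exam weeks", "Library"),
    ("I could not find the books on the reading list in the library", "Library"),
    ("My tuition fee invoice was wrong", "Finance"),
    ("The refund for my fee payment has not arrived", "Finance"),
    ("The heating in my room in the halls is broken", "Accommodation"),
    ("My accommodation room has damp on the walls", "Accommodation"),
]


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label; these counts are the whole 'model'."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the most probable label under Naive Bayes (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_examples = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Log prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_examples)
        denominator = sum(word_counts[label].values()) + len(vocab)
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(LABELLED)

# Route a new, unseen complaint to a department.
print(classify("I was charged the wrong fee on my invoice", model))
```

With only six training sentences this is a toy, but it shows why pooled data matters: the quality of the routing depends directly on how many labelled sentences are available for each department.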

A national text labelling and annotation service

Enabling the development of AIED services
At present there is a scarcity of operational AIED (artificial intelligence in education) services within the education sector. There are various reasons for this. For instance, the time and cost that it takes to develop machine learning models that utilise labelled and annotated text should not be overlooked. If the education sector, and the companies and organisations that serve schools, colleges and universities, have access to a national text labelling and annotation service, they then have the means to bring down the time and cost associated with developing new digital services to support students, teachers and campus administrators. One of the key benefits of a national text labelling and annotation service will be its ability to aid and foster the development of new AIED services. For example, if subject topics are classified, labelled, annotated and made available to the EdTech sector, one can easily envisage an array of products and services that could emerge from that dataset. The potential use cases for a national text labelling and annotation service are many and wide-ranging and, equally, a wide range of products and services may emerge from it.

A large and growing library of labelled and annotated text may serve the needs of education researchers as they further their exploration of the teaching, learning and assessment landscape. The library could serve multiple research teams as they explore writing proficiency, evaluate automated processes for assessing student writing, assess the efficacy of automated feedback, study the symbiotic relationship between Norbert Wiener's agents (Hussain, 2019) and teachers and students, and much more. A text labelling and annotation service could also serve researchers as they conduct qualitative research, especially with large sample sizes. In addition, the service could be well placed to support the knowledge exchange activities of higher education institutions, especially those that pertain to the stores of labelled and annotated text.

Open or closed?
When I write about a national text labelling and annotation service, I envisage it as an open educational resource (OER) that is made available to the education sector. I also assume that the majority of individuals and organisations within the education sector will favour and support an OER model for sustaining, developing and nurturing the service. Furthermore, schools, colleges, universities, publishers, awarding bodies and others will be more confident in sharing their stores of text if the national text labelling and annotation service can demonstrate good governance arrangements and ethical practices in the handling, processing, distribution and use of labelled text.