Data Labellers within the education sector
The data labelling role is common in industries such as finance, insurance, mapping and medicine. These roles support the classification of financial information, insurance claims, buildings, vehicles, medical records, clinical images and more. Within the education sector, however, data labelling roles are not yet prevalent. In this article I focus on data labelling roles that support text labelling and text annotation within the education sector. Please note that there are concerns regarding the labelling of education datasets that pertain to postal codes, gender, race, age, learning support needs, facial images or emotions, and I have expressed those concerns in an earlier article.
How may the education sector benefit from the work of data labellers?
The large repositories of text that are generated and stored by schools, colleges and universities across the globe offer the education sector the opportunity to create products and services that support students, teachers and campus administrators at every point in the student life cycle. These may include conversational services such as campus chatbots and digital assistants, formative assessment solutions, or the myriad workflows and processes that support students in campus settings. If these products and services are to be realised, they need access to an abundant supply of labelled and annotated text. Data labellers may label local repositories of text that serve the needs of a single institution; they may work for companies that offer digital services to education clients, or for research teams who wish to explore the use of annotated data to support teaching, learning and assessment. Data labellers may also support national data labelling and annotation projects, which exist to serve the needs of students, teachers and campus support teams on a much larger scale.
How is Bolton College utilising the services of data labellers?
Bolton College has created a platform called FirstPass, which is designed to support students and teachers with the formative assessment of open-ended questions. The platform's ability to mediate this process is underpinned by a growing library of subject topic classifiers, each trained on labelled sentences. As the volume of labelled text grows, the classifiers become more accurate at assigning the most appropriate label or labels to sentences that they have not seen before. This ability to label text enables the FirstPass platform to offer real-time feedback to students as they compose their responses to the open-ended questions posed by their teachers.
The training method that Bolton College has used to train the FirstPass classifiers is called supervised learning, and subject-specialist data labellers are required to support this process. As mentioned earlier in this article, the data labellers will be subject-specialist teachers or individuals who have a wealth of experience of working in a specialist field or industry. Funding from the NCFE's Assessment Innovation Fund is being used to pay teachers and vocational experts as they carry out this data labelling work. During the current academic year, colleagues at Bolton College will be supported by academics at the University of Bolton to gather qualitative and quantitative data to assess the success of hiring data labellers to support the supervised learning of subject topic classifiers on the FirstPass platform.
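To make the idea of training a topic classifier on labelled sentences more concrete, here is a minimal sketch in Python. It is not the FirstPass implementation, which is not described in this article; it assumes a simple Naive Bayes word-count model, and the example sentences, the topic labels and the function names (tokenize, train, predict) are all hypothetical and chosen purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lower-case a sentence and split it into simple word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def train(labelled_sentences):
    """Count words per label from (sentence, label) pairs supplied by labellers."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of sentences
    for sentence, label in labelled_sentences:
        label_counts[label] += 1
        word_counts[label].update(tokenize(sentence))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def predict(model, sentence):
    """Return the most probable label for an unseen sentence (Naive Bayes)."""
    word_counts, label_counts, vocabulary = model
    total_sentences = sum(label_counts.values())
    scores = {}
    for label, n_sentences in label_counts.items():
        # Log prior: how common this label is in the training data.
        score = math.log(n_sentences / total_sentences)
        # Denominator for add-one (Laplace) smoothing.
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(sentence):
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical labelled sentences, as a subject specialist might provide them.
TRAINING = [
    ("Plants use sunlight to make glucose", "photosynthesis"),
    ("Chlorophyll absorbs light in the leaves", "photosynthesis"),
    ("Cells release energy from glucose using oxygen", "respiration"),
    ("Mitochondria are where energy is released", "respiration"),
]

model = train(TRAINING)
print(predict(model, "Leaves absorb sunlight"))            # photosynthesis
print(predict(model, "Oxygen is used to release energy"))  # respiration
```

With only four labelled sentences the model is easily misled, which is one way of seeing why a steady supply of sentences labelled by subject specialists matters: each additional labelled sentence gives the classifier more evidence about which words belong to which topic.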
Potential roles and responsibilities for data labellers within the education sector
At present, data labelling roles are not well defined within the education sector. Perhaps there are very few educational institutions, education technology companies, publishers or awarding bodies that currently require the support of data labellers. However, once the value of annotated data is recognised by the sector, data labelling roles are likely to become commonplace. As in so many other professions, multiple tiered roles will emerge within the education sector. The need for domain experts means that data labellers will have the opportunity to undertake roles such as data labelling, text curation, content management, testing, quality assurance, the training of other domain experts, team leadership and management.