7. Statement by the Minister for Education and the Welsh Language: The Welsh Language Technology Action Plan

– in the Senedd at 5:38 pm on 20 February 2024.


Elin Jones Plaid Cymru 5:38, 20 February 2024

(Translated)

Next, a statement by the Minister for Education and the Welsh language on the Welsh language technology action plan. The Minister to make the statement.

Jeremy Miles Labour

(Translated)

Thank you, Llywydd. Earlier today, I published the final report on our Welsh language technology action plan. And I'm here today to celebrate the progress that has been made. The report looks back at the five years since we published the plan. It also marks the end of the plan’s lifespan. There is more in the report than I can talk about in this short session today, of course, and the report brings everything together for you to see.

The three themes of the plan back in 2018 were Welsh language speech technology, computer-assisted translation, and conversational artificial intelligence. We have taken the task seriously. Since then, we have funded, created and worked on many of the things that a language needs to thrive in the digital age. For example, with our grant, Bangor University’s Welsh language transcriber turns spoken Welsh into typed text. This helps Welsh-speaking users who have specific needs. It can also generate subtitles automatically for Welsh language videos and sound files.

By now, people who use Welsh language synthetic voices—for medical or accessibility reasons, for example—have more choice of voices, male and female, in different accents. Wales is a bilingual nation, and so are these voices. They reflect how speakers switch between Welsh and English in everyday conversations.

Machine translation has benefited from our website BydTermCymru. Here we share translation memories freely. With support from our grant, Bangor University has created specialist machine translation engines, one for health and care, and the other for legislation. The university has also worked with the company behind ChatGPT, OpenAI, to improve how their most powerful chatbot, GPT-4, processes the Welsh language. This is an example of partnership working, and partnership is central to our work. For example, as part of our partnership with Microsoft, we have collaborated to create a simultaneous interpretation facility in Microsoft Teams meetings.

The plan’s philosophy was to foster a culture of open innovation by ensuring that Welsh language digital resources and data were available to everyone, without unnecessary restrictions on their use. This has enabled developers and Welsh speakers to use and reuse them for their own purposes, and to create new products and services that benefit our language. It has also ensured that the resources that we’ve created are sustainable for the future. Our vision puts our language at the heart of digital developments. The work that we have done supports our goal of doubling the number of us who use Welsh every day, and, of course, of reaching a million Welsh speakers by 2050.

Samuel Kurtz Conservative 5:42, 20 February 2024

(Translated)

Thank you, Minister, for bringing forward this statement this afternoon and informing us about the current situation.

As an enthusiastic supporter of the Welsh language and as one who is always very excited about new technologies, I recognise how important it is to ensure that technology plays a role in Wales’s journey to realise the ambitions of 'Cymraeg 2050'.

Moving on to this report, it’s almost seven years since the launch of the action plan by your predecessor, who is now the health Minister. According to the report, although technology is all around us, there’s little opportunity to use the Welsh language and, when using the Welsh language is an option, it’s not always clearly accessible. I do feel that we have made important progress since 2017, and today’s statement has endorsed that. Technology should be there to assist our use of the Welsh language, to develop the confidence of Welsh learners or those who are renewing their language skills, and provide a smoother, more user-friendly experience for the user if they do want to use technology through the medium of Welsh. The fact that the action plan recognises that technology is a priority area in terms of securing a place for the Welsh language in our lives is laudable, and, in light of today’s statement, I do feel that progress has been made.

The role of technology in supporting the use of a second language means engagement. I was pleased to see how SaySomethinginWelsh, the developers of Welsh learning apps, have created a free-of-charge short course to help people to learn the Welsh national anthem in time for the six nations match this weekend. They claim that one will be able to learn ‘Hen Wlad fy Nhadau’ in just four lessons, using the app. This kind of innovation embraces the power of technology, with the desire to develop new skills and to make the targets of 'Cymraeg 2050' realisable.

The 2017 report had three areas of priority, as you’ve mentioned: Welsh language speech technology, computer-assisted translation and conversational artificial intelligence, and I know that there are major developments across these three areas.

In focusing on some specific details in the report, therefore, in terms of what will be developed during the time of the action plan, I’m interested in learning how the voice banking facility for individual Welsh voices has been developed, because it is crucial that technology can identify the full range of accents and dialects that are part of our language. So, how will you ensure that all aspects of our language are recognised by technology?

Finally, in terms of Welsh language digital content, what support do we have at present to improve spell-checkers, grammar-checkers and mutation-checkers in Welsh, with the aim of ensuring that more facilities are available free of charge? Have you had any meetings with Microsoft or Google, for example, to discuss the issues, especially given the rapid growth of AI?

Minister, as I said at the outset, realising the ambitions of 'Cymraeg 2050' is something that I want to see happen. The success of the Welsh language technology implementation plan, which is under way, will make that possible.

Perhaps this is the last opportunity for me to discuss Welsh language issues with you in your current role, so, I would like to express my thanks to you for the work that you have done in promoting the Welsh language during your time as Minister. Thank you, Llywydd.

Jeremy Miles Labour 5:46, 20 February 2024

(Translated)

Thank you very much for that, and thank you for very important and incisive questions, if I may say. In terms of what Sam Kurtz said at the outset of his contribution, it is very important, isn't it, that as we set ourselves the aim of doubling the daily use of Welsh, we ensure that that is something that is real in our lives in every way. The contribution of technology in our everyday lives is, of course, a reality and has been for a while, and increasingly so, so it's obviously important for us to continue to be relevant and to ensure that we evolve our strategy in the light of that. Today's statement talks about the infrastructure, if you like, the building blocks behind some of the public offer, but there's also other work happening in terms of learning Welsh through virtual reality (VR), the Aberwla programme and so forth. So, it's important that we continue to be relevant.

The Member talked about the importance of the statement and the strategy of 2017, and in terms of health and care, how important technology can be in that context, so we have funded bilingual synthetic voices that are used in the health sector by speech and language therapists, and the voices are used in Bangor University's Lleisiwr project, which helps patients use the Welsh language after treatment. We've also funded 16 augmentative and alternative communication voices for young people who depend on technology to communicate, for example, and they include voices, as he mentioned, that have accents from north Wales and south Wales, and for girls and boys and for teenagers. Therefore, a diverse range of voices to communicate through.

It is important—

Elin Jones Plaid Cymru 5:48, 20 February 2024

(Translated)

I do apologise for interrupting you. My mistake, I thought you'd concluded your remarks.

Jeremy Miles Labour

(Translated)

Just to say, the idea of banking, language banking, is very important. It contributes to the resource available for others to develop what we are investing in. So, Mozilla Common Voice, for example, shows how we can use this to create a bank of diverse Welsh voices, and then data that can be used to train speech and language systems in Welsh.

Just to say, the partnership working is very important. We have a very good relationship with Microsoft and others, other companies that haven't yet been announced, but I hope to be able to do that before long, and the work with OpenAI is also very exciting.

Heledd Fychan Plaid Cymru 5:49, 20 February 2024

(Translated)

Thank you, Minister, for your statement. I enjoyed reading the report very much and I'd like to congratulate everyone who's been involved with this innovative and important work. It is really valuable for people to know the breadth of what is going on, and I think that one of the challenges is how we promote the fact that these resources are available, so that as many people as possible can benefit from them.

So, one of my questions is: what work will be done now to increase the use of all of these resources to ensure that everyone is aware of them? Because, certainly, I've found out about new things in reading the report, and I do think it's extremely important that we do ensure that that information is out there for people, because there's no point in having these resources unless they are used for their intended purpose, and that people use them in their daily lives.

I also reflect on the fact that what we have here is a final report. Clearly, technology is evolving quickly and all the time, so what are the next steps? Will there be a new plan? How will this work proceed and progress, because we're at the beginning of the journey for many of these projects, and they will evolve, I'm sure?

I would also reflect on the fact that it's seven years since Llyr Gruffydd spoke to Siri in this very Chamber and failed to get a response when communicating in Welsh, and unfortunately I think if you spoke to Siri in Welsh now, the same thing would happen today. It doesn't always understand me when I try and speak English to it, because of my accent. [Laughter.] But there is great work to be done to ensure that that range of technology that is part of our daily lives is available. So, can I ask specifically on Siri, have there been any conversations specifically with Apple, because I note in the report in terms of Alexa and the steps taken there, but clearly Apple forms a large part of many people's daily lives, and Siri, therefore?

One of the other things that I wanted to focus on—. Samuel Kurtz has already raised many of these issues. I do very warmly welcome the partnership on Duolingo. That's a key app, and I do think that as we revive and normalise the use of the Welsh language, ensuring that all of these apps are available will be important too. We've talked about standards previously, in relation to banking and so on, but there are so many barriers now. Almost all of our lives—every form and every helpline—is now online rather than face to face, so how will we, through the standards perhaps, ensure that there is that emphasis on technology, and how do we ensure that companies that previously were enthusiastic about the Welsh language when they had an individual available ensure that those services can be as easily provided through technology, so that the Welsh language can be used on a day-to-day basis, because it's becoming increasingly more difficult, in my view, to use the Welsh language? Where we have all this investment and so much to celebrate, as I said, why then do we find it sometimes more difficult to use the Welsh language? So, can I ask, how are we going to work with those organisations that are increasingly pushing us towards technology, to ensure that this work does progress, so that we all reach the aim of normalising the use of the Welsh language, but also to make that easy?

Jeremy Miles Labour 5:52, 20 February 2024

(Translated)

That's a fair point. In terms of the questions asked by Heledd Fychan—. It's resources that we are creating here, which are useful in terms of the work that technology developers, on the whole, can do. There are elements that are public facing, but many of the resources involve language infrastructure. I just want to recognise—. I've talked about Bangor University's work in my statement, but also the University of South Wales, Cardiff, Aberystwyth and Lancaster have also been involved in elements of this work, so there is an established network of very enthusiastic people working on this on the academic side. But I also want to recognise the work that a broad range of volunteers are doing, as in all technological areas, and that work is a completely natural part of this development landscape as well, so I want to recognise their role in disseminating the message too.

One of the things that we've been clear about in terms of the principle of this, in order that we can promote in a way that does encourage other innovative developments, is that we do license for free, so the investment that we're making is creating material that we can license for free to the world, so that that can encourage innovation, and that's something that I'm very proud of. It's an important investment, but at the end of the day, to provide this openly is the way to ensure progress in this area.

In terms of a new plan or new scheme, I haven't decided yet whether a new plan is the right decision. At the end of the day, my starting point is that we need the activity to continue, certainly, and maybe prioritising that rather than making another plan is the best thing to do, given that the principles are very clear in the first plan and the relationships and the network of relationships have been established by now. So, I do think that building on that is the most productive thing to do, but I am keeping that under a watching brief, and resources have been earmarked for allowing us to do that in the future, so the work certainly is not coming to an end.

Heledd Fychan asked about Siri. We're not yet in a position to be able to do that. To develop that kind of technology, you need a significant range of linguistic data. Developers in the field require tens, if not hundreds of thousands of hours of data before they can create that kind of technology, and at the moment we have about 200 hours of relevant data in the Welsh language. That's completely normal for a minority language, internationally, so it's not unexpected, but it is a challenge. That's why we've prioritised linguistic banking—increasing that—and, in the end, that will enable that to happen.

In terms of Duolingo, that is important. Of course, the experience of people using Duolingo, on the whole, hasn't changed. For nearly all users, the experience is exactly the same, and they've agreed to have a discussion with us on a continuing basis. If we did decide that further development was needed, well, we do have dialogue to be able to discuss that with them, so that's constructive.

In terms of the apps, I just want to say, regarding the use of everyday technology, if you like, Bangor University has a chatbot called Macsen that is available through the medium of Welsh. So, there is an opportunity to develop further from that kind of development. We also made the Welsh interface of Office365 the default in Welsh-medium schools, and that has meant, overnight, that the use of that is now completely natural, and there's no need to make that specific choice. And I think that that kind of thing does show how we can make progress. At the end of the day, people do become accustomed to it very quickly, if you make the decision to do it. So, there are opportunities to do that in other ways as well.

Alun Davies Labour 5:57, 20 February 2024

(Translated)

Thank you, Minister. I'm very pleased to see this report. I'm delighted that you've published it, and that you have made the statement that you've made this afternoon. I warmly welcome it. Like Heledd, I am very eager to know when I'll be able to communicate through the medium of Welsh with Alexa and Siri, and I would like to know if—. You have answered a previous question, but I would like to know if you have a timetable for that additional development work that needs to be done in order to complete the process.

But I'm also eager to understand how AI is going to change the way you develop Welsh language policy, because it's extremely important that we do look at future developments for the Welsh language and ensure that the Welsh language is part of those developments, and also that the Welsh language is part of the developments undertaken by these global multinationals that we've been discussing, but the way we do this is also important to me. I'm very eager that we should focus on open-source technology rather than technologies that are closed or restricted, and I would like to know what the Government's stance on that is.

And then, how does the Government collaborate with large companies and businesses to ensure that apps are available through the medium of Welsh? I use an iPhone, for example, and the operating system is almost wholly available through the medium of Welsh, so I use the Welsh language every day there. But when I go into the apps on the phone, I have to turn to English, so I would like to ensure that we can have a Welsh language environment or digital landscape where we can use the Welsh language all the time.

Jeremy Miles Labour 5:59, 20 February 2024

(Translated)

Thank you for those questions. In terms of the work with Alexa and Siri and Android and so forth, the next challenge, and we're already working on it, is to create the bank of hundreds of thousands of hours of data. That's what we need to do now, and we have quite a long journey before we reach that aim, just because of the size of the Welsh language compared to the international languages that are used most often. But that work is already ongoing to collect more data, and that work feeds into the resources that we're funding, such as Macsen, which I mentioned, which is Bangor University's chatbot, but also to external resources. Alun Davies asked how we're collaborating with these companies. There are external resources, such as Amazon's MASSIVE dataset, which is a parallel dataset of about a million sayings, which lies behind what feeds into the Alexa system, so we have that relationship already.

I agree entirely with him that there's a great opportunity here in terms of AI. That's part of the reason why we're funding Bangor University to build on Macsen and work with OpenAI to see how the chatbot that they have, namely GPT-4, can process the Welsh language. That's then a major step forward in terms of provision, and importantly provision in terms of apps. My mobile phone has been set to the Welsh language, but, similar to his experience, when I look at an app, it's not in Welsh, but I anticipate that, when we're able to grow this opportunity of using Welsh through ChatGPT and so forth, that will be easier to do.

Just to say, finally, I do agree entirely with him that open licensing is the way ahead here. That's what's expected in this area, and that will ensure that we can continue to innovate.