Online English Language Testing – Increased Access Comes With Security Challenges
By Jennifer Maguire
August 2, 2022
Denver, Co. – English language testing has long been used for a variety of purposes, such as hiring and promotions, college and university admissions and immigration. The shift to online delivery of language tests did not start with the COVID-19 pandemic, but the pandemic certainly accelerated growth in the industry and acceptance of online tests by many more organizations, including US and Canadian schools. Need drove testing companies to expand or transition to the online format, and during the last few years, one recurring challenge has been ensuring testing security. Security has always been a focus for high-stakes tests, as some percentage of test takers continually seek ways to cheat the system. However, as tests move from secure testing centers to candidates’ homes or any location with a laptop or desktop computer and Wi-Fi access, testing companies experience new, continually evolving methods of cheating and have found the need to enact both technology- and human-driven security methods of increasing sophistication.
We interviewed nine global English language testing organizations to provide an overview of online language testing trends. Some of these companies were already leaders in online testing, while others expanded testing options due to the pandemic closures. In either case, the shift to remote, at-home testing highlighted new challenges in testing security. Conversations with the testing company representatives focused on digging into current security trends and issues, as well as the impact of the shift from paper-based to online English language testing. The most common security theme is layering of security measures. These layers may include artificial intelligence (AI) or computer-based security flagging systems coupled with human evaluation at differing levels. The combination of technology and human assessment creates a system with multiple safety nets that test providers hope will catch both simple security issues, such as candidates sneaking peeks at a mobile phone, and larger-scale operations, such as item harvesting or fraudulent candidate swapping. Interviews included representatives from Bright Language, Duolingo, Cambridge University Press and Assessment, IELTS, ETS, iTEP, Kaplan International Tools for English, LanguageCert and Pearson English.
Testing Companies Pave New Ground
“When you introduce more technology into the equation, there are more opportunities for cheating.”
Bright Language provides corporations, training organizations, universities, businesses, engineering schools and more with tests that assess the language proficiency of employees, staff or students. The testing company’s main office is in Paris, France, with additional offices in the US. Professional tests are used to assess mastery of items in training and development of programs, as well as within the selection process for hiring. Academic tests are used for international candidate university admissions and to assess whether students may obtain their undergraduate or graduate diploma. English is one of 11 languages tested by Bright Language. The Bright test consists of a written and an oral online questionnaire in two parts, comprising 60 questions each and lasting 40 to 60 minutes in total. Candidates do not need an appointment and can take the test at home or in a secure, private environment of their choosing.
“Language schools also use our tests at the beginning and at the end of their courses,” Bright Language Business Development Consultant Cinthia Cristaldi said.
Testing given at the beginning and end of courses provides data on the effectiveness of teaching strategies, as well as trends in student proficiency levels.
Cambridge University Press and Assessment offers what Operations Manager David Budd considers low- to mid-stakes testing, mainly for adults and typically for institutions. Language schools may also use the test to assess whether students need to retake a course. IELTS (International English Language Testing System) is a partnership of Cambridge University Press and Assessment, the British Council and IDP. “They produce the test. The British Council and IDP manage the centers,” Budd said. Tests offered include Linguaskill, IELTS and OET (Occupational English Test).
Duolingo, an EdTech company that produces the popular Duolingo language app, also offers an online high-stakes English proficiency test. The Duolingo English Test (DET) can be taken from anywhere and is primarily used for university admissions, both graduate and undergraduate. Headquartered in Pittsburgh, Pennsylvania, Duolingo also has offices in New York, Seattle and Beijing and has teams all over the world. The Duolingo English Test launched six years ago.
ETS (Educational Testing Service) provides TOEFL iBT (English-language test used for study, work and immigration, preferred by universities and institutions), TOEFL Essentials (used for study, work and immigration) and TOEIC (assessing English-language proficiency for the workplace). TOEFL iBT is available in 2,000+ test centers in 180+ countries, but now that it’s available online from home, it can be taken anywhere.
“We do a lot of different testing, paper-based, at home and in testing centers,” said Raymond Nicosia, Principal of Test Security for ETS. Nicosia has been with ETS for 33 years, overseeing test security. TOEFL is accepted in more than 160 countries, predominantly for international students who want to study in the US.
IELTS is an English language test for study, migration or work. The test is popular with candidates looking to migrate to Australia, Canada, New Zealand, the UK or the US. With its US headquarters in New York, IELTS has a presence all over the world. IELTS is offered in both paper-based and computer-delivered formats.
“We’re the world’s leading test for admission and immigration,” said IELTS USA Executive Director/CEO Ariel Foster. “The test is mainly for higher education and occupation, health care, as well as immigration.”
IELTS partners, Cambridge University Press and Assessment, the British Council and IDP, are with the former EU, so tests adhere to the GDPR (General Data Protection Regulation). In October, IELTS announced the option for at-home testing with IELTS Online. The online test option will roll out this year. The test will be delivered online by trained IELTS examiners and will cover the four skills of IELTS Academic: listening, reading, writing and speaking. The online, at-home test will be the same test candidates take in testing centers. “IELTS is focused on delivering the same test construct,” Foster said. “The test is the same however you take it.”
iTEP (International Test of English Proficiency) is headquartered in Los Angeles with operations in 61 countries. iTEP language testing is used for several purposes, including university admissions, ESL and IEP programs, student and English teacher assessment and business hiring assessment.
“In Ecuador, we assess all English teachers,” iTEP President Jim Brosam said. iTEP Ecuador is a testing center for English teachers, as well as a public testing center for study abroad. Additionally, the center provides direct sales for companies, government agencies and nonprofit organizations. iTEP also provides English language proficiency testing in Egypt as part of the public-school hiring process.
iTEP still has paper-based exams, but they are limited. “There are a few areas of the world that still need that,” Brosam said. “If you do a paper test in Africa, things have to be uploaded and sent.”
The shift to online, at-home testing did introduce new risks. “You’re taking testing [in] a very controlled environment, machines, computers, proctors watching,” Brosam said. “When you introduce more technology into the equation, there are more opportunities for cheating.”
Brosam gave the example of the paper-based testing practice of allowing students to make notes when completing the writing and speaking portion of language testing. In an at-home testing environment, there is no way of monitoring what is on any papers, so iTEP developed an on-screen, keyboard-based note system that allowed proctors to see any notes.
Other safeguards against cheating include plagiarism recognition software and an autosave feature that prevents students from purposefully crashing the system or faking power or Wi-Fi outages to buy time for outside assistance or to look at hidden notes. In the case of a system crash, students have instructions to wait five minutes, reboot the system and sign back in with their ID. New content appears, so any attempt at cheating based on the previous content cannot be made.
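The crash-recovery behavior described above can be pictured with a short sketch. It is only an illustration under stated assumptions: the `ExamSession` class, its method names and the item bank are hypothetical and are not iTEP's actual software. The idea is that answers are saved as they are given, and an item that was on screen during an interruption is retired so the candidate sees new content after signing back in.

```python
import random


class ExamSession:
    """Hypothetical session model: autosaves answers and never re-serves
    an item that was on screen when the session was interrupted."""

    def __init__(self, item_bank, seed=None):
        self.remaining = list(item_bank)   # items not yet shown
        self.saved_answers = {}            # autosaved after every response
        self.current = None
        self.rng = random.Random(seed)

    def next_item(self):
        if not self.remaining:
            raise RuntimeError("item bank exhausted")
        index = self.rng.randrange(len(self.remaining))
        self.current = self.remaining.pop(index)
        return self.current

    def answer(self, text):
        # Autosave: the response is stored immediately, so a crash loses nothing.
        self.saved_answers[self.current] = text
        self.current = None

    def resume_after_crash(self):
        # The interrupted item was already removed from the bank, so the
        # candidate gets fresh content instead of the item seen before the crash.
        self.current = None
        return self.next_item()
```

In this sketch, crashing the system on purpose gains nothing: the saved answers persist, and the question the candidate saw before the outage does not come back.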
Kaplan International Tools for English, previously based in the US but now based in the UK, has offices all over the world. The testing company provides language assessments, first developed in 2009, for university admissions, academic placement/progress, language proficiency, course evaluation, talent management and lead generation. Kaplan International Tools for English specializes in online assessments, with no paper-based test options. Tests may be taken in the home or any private environment with a computer or laptop and Wi-Fi access.
LanguageCert, the language testing division of PeopleCert, is headquartered in Athens and London, with offices based worldwide. The US division headquarters is in Boston.
LanguageCert offers a range of tests for different purposes, including the International ESOL (English for Speakers of Other Languages) exam, which is used to evidence English proficiency for study purposes. The IESOL exam can be taken online or at one of their test centers in over 120 countries. The exam is also approved by migration authorities, including UK Visas and Immigration. PeopleCert is a global organization that delivers business and information technology certifications. LanguageCert provides language testing and qualifications through different tests, such as LanguageCert Test of English (LTE), LanguageCert USAL esPro and LanguageCert Test of Classical Greek (LCTG), for different needs and for both adults and young learners, for whom testing is not high-stakes. Tests include both high-stakes and low-stakes options.
LanguageCert uses a layered approach of AI and human proctors, as well as recorded test sessions, to ensure test security. “It is critical, especially for US schools and universities, that this is a secure test and that the students’ levels reflect what the students are capable of,” said Bram van Kempen, LanguageCert’s University Partnership Manager for the US.
Pearson English provides both high- and low-stakes tests for multiple audiences. “Our three key audiences are individuals, institutions and businesses, and we have a comprehensive portfolio of English language tests as part of our ELL ecosystem that meet the needs of committed learners in these spaces, from young learners to students and working professionals,” said Pearson English President of English Language Learning Giovanni Giovannelli. “We offer both formative assessments generally used in the classroom and summative assessments usually taken in a test [center]. For instance, English Benchmark Young Learners, which can be taken on a tablet at school, or PTE Academic, which is a test for academic study and visa purposes and is taken in a test [center].”
Pearson English also has remote proctoring for online tests, including PTE Academic Online, the Versant test and the Pearson English International Certificate (PEIC). This year, Pearson English began offering the PEIC test both in testing centers and at home through a remote proctor.
Pandemic Is the Key Driver of Online Delivery
“We were always thinking about innovative ways to make our tests more accessible; the pandemic just accelerated that vision.”
Like many other industries, the COVID-19 pandemic greatly impacted language testing companies. In some cases, the companies already provided online tests, and so the impact was a rush to scale up the availability of tests and increase security measures. In other cases, traditionally paper-based and/or testing center-based tests were expanded to include remote options.
Cristaldi said Bright Language was already remote when the pandemic began, but scores-aligned remote testing exploded as the need arose.
“We changed our proctoring protocol to add another level of verification and security,” she said. “That grew a lot, as well as the amount of people and organizations open to remote testing. Universities were not very keen on doing online testing before. Now, more and more universities are adopting this. Now, they realize it’s a secure option for them.”
The challenge is to provide a quality remote test with secure, reliable testing conditions.
IELTS was mostly paper-based prior to COVID-19, but the Cambridge University Press and Assessment product Budd works on was always delivered online. “The product I worked on was always digital,” Budd said. “But we saw a big expansion. That put our name in the limelight, and our numbers have increased.” Tests prior to the pandemic were completed in testing centers, but when the centers began shutting down, Cambridge partnered with its first proctoring vendor, Sumadi, to offer a remote testing option.
When the school closures began, Duolingo was well-prepared, having launched the DET in 2016, several years before the pandemic.
“We were the only high-stakes English proficiency test available online and on demand,” said Sophie Wodzak, Duolingo Research Communications Specialist.
During the pandemic, a huge challenge for the testing industry was how to deliver a secure product online. “Duolingo had already tackled this problem,” Wodzak said. “We saw a lot of security flaws in an in-person testing model [and] with the things we’ve done to make it more accessible, it also makes it more secure.”
Duolingo was already accepted by more than 1,000 programs when the pandemic hit, and the pandemic accelerated growth and adoption by more universities.
ETS has been in testing centers for decades, but in recent years they started looking at at-home delivery options with high security. Nicosia said ETS started considering proctoring vendors several years ago, and by 2020, the TOEFL test was available at home.
Before COVID, iTEP English language tests were delivered in testing centers or on university campuses. “COVID changed everything, not only for us but other companies with the shift to online testing,” Brosam said. “The big change COVID brought is that now, American and Canadian universities will accept home testing. The genie is out of the bottle; I don’t think they’ll ever put it back in.”
During the pandemic, Kaplan International Tools for English saw increased interest from colleges and universities to use online tests. “We already had a great product,” Director of Business Operations Rachel Kimber said. “We used it more widely.”
The Kaplan International Tools for English online test was originally developed in the US, but it moved to the UK during the pandemic, which is when Kimber’s staff came on board to manage it. The test was used by some universities and by talent management, but as schools and testing centers around the world shut down, the testing company saw increased demand, particularly for university admissions.
“That launched it onto another level,” Kimber said. “We’re promoting it much more. We want universities to promote it on their website. We’re becoming more vocal about the test and its advantages. I think COVID just made everybody say, ‘Actually, maybe there is something to these online tests; maybe there’s something we should be looking at.’”
LanguageCert already had online testing in 2019, so it was well placed to help candidates affected by testing center closures. The testing company saw huge demand from candidates and institutions.
“I think because we manage everything from end to end, from proctors to software downloaded, we were able to scale up pretty quickly,” Alison (McCale) Woolnough, Head of Educational Partnerships at LanguageCert, UK, said. “We weren’t dependent on external providers.”
“There was no scramble to create some kind of online platform,” van Kempen added. “We were very well set up and it was just a matter of volume.”
“We were always thinking about innovative ways to make our tests more accessible; the pandemic just accelerated that vision,” Giovannelli said of the impact on Pearson English Language Learning. “We also, however, wanted to get it right; for instance, it took us almost a year to fully develop our PTE Academic Online remote proctored test.
“What is great to see is that the pandemic did not reduce demand; test takers are still interested in global mobility and the doors that an English test like PTE Academic can unlock. The whole world started doing more things remotely, and taking a test remotely is a natural extension of that.”
Security Technology Facilitates Transition to Online Testing
While paper-based testing centers may seem more secure, technology such as laptop and desktop camera recording, keystroke recording, AI grading, adaptive tests and more provides additional methods of security in many ways, according to the consensus of testing company representatives. While some of these companies may have already been considered EdTech companies, providing only web-based testing prior to the COVID-19 pandemic, many transitioned to online testing or expanded these testing options to include language testing as schools and testing centers shut down in 2019 and 2020. The shift to web-based testing felt uncomfortable to some educational institutions due to the ingrained belief that testing centers provided a more secure environment and that test takers would perform more poorly online. However, as time passed, many language test stakeholders became increasingly comfortable with the format, and test takers provided positive feedback about the increased accessibility of tests and appreciation for the reduction of testing anxiety that resulted from testing at home.
While in some ways, there are more opportunities for security features in online testing, there are also new ways for candidates to try to cheat. “There are more factors for cheating because people are in their bedrooms in China or [at] their kitchen table in Mexico,” Brosam said. “But there are really more opportunities to achieve, too.”
Benefits of online testing may include increased test accessibility, reduced travel costs, reduced test anxiety and reduced environmental impact. Lower paper usage and a smaller travel footprint than paper-based testing at testing centers are also worth considering, agreed Woolnough and van Kempen of LanguageCert.
Several representatives also agree that the use of live proctors has many benefits, including increased security of real-time assessment and reduction of testing anxiety.
Bright Language’s Cristaldi said the live proctors who guide the oral portion of the Bright Language tests begin by providing a comfortable, easy-going atmosphere to calm the nerves of the candidates, many of whom are initially nervous, envisioning a highly structured, formal oral interaction.
“We talk a lot about trying to provide a test that reduces anxiety,” Cristaldi said, adding, “At the end of the day, the idea for us is to use the innovation and tech that is available today to increase possibilities and accessibility for the candidate and to always provide the security and reliability but to make things easier without losing quality or security.”
Duolingo’s Head of Security, Basim Baig, stressed the value of accessibility of online language testing. “Pandemic aside, a test center model is a really inaccessible model for a lot of people. There are people all over the world who can’t access a test center.”
This issue of accessibility was the impetus for Duolingo founder Luis von Ahn’s creation of the Duolingo test model.
Sophie Wodzak, Research Communications Specialist with Duolingo, said von Ahn experienced this lack of accessibility when he was a student. Born in Guatemala, the EdTech founder saw the need for greater accessibility for testing when he had to travel to test and realized how difficult and expensive that process would be for many people.
“Our mission as a company is to expand access to education, and testing is a huge component,” Wodzak said. “There is very disparate access to testing centers. There are some places in dense countries where one place has to serve everyone.”
Other language test company representatives saw the same issues with accessibility and have continued to hear accolades from candidates who are thankful for opportunities to test from home and continue their education, apply for a job or work visa or apply for a new position or promotion that requires English language certification.
Kimber, of Kaplan International Tools for English, said, “Some students say, ‘I couldn’t have taken a test unless it was online.’ Some had COVID. Some took it while their children were sleeping. Some said, ‘We didn’t want to leave during Ramadan, but we could take this test instead.’ That whole prohibitive cost of traveling, overnight stay, booking a hotel, waiting for results, that whole thing takes so much time [and money].”
Accessibility, reducing travel costs and easing test anxiety are recurring themes.
“I think the biggest benefit is accessibility,” van Kempen said. “An online testing experience provides access to a wider group of students. Test centers can also be a bit of an intimidating environment for people who have test anxiety or those kinds of things. Being able to take a secure test in the comfort of your own home or environment can take away some of the stresses of taking tests.”
Online English language testing may be high-stakes or low-stakes. Security of the test is always considered but is of greater importance in high-stakes testing, which may determine university admissions, work or academic visa status and whether a candidate meets hiring requirements. Low-stakes testing may be used to ascertain education level for class placement or to identify areas for improvement or study to prepare to take a high-stakes test. In some cases, educational institutions utilize low-stakes tests to create data for assessment of viability of teaching practices or new curriculum.
While accessibility is an asset of online language testing, ensuring the security of tests, particularly high-stakes tests, remains an important issue for testing companies, whose reputations rest on the ability to certify the testing results.
“It is important to strike a balance between accessibility for test takers and control of the testing environment,” Giovannelli said. “Where we offer formative assessments designed for use in the classroom, it’s not necessary for us to apply security controls, such as identity verification, as ultimately the organization is administering the assessment for its own purposes.
“Whereas, where an assessment is used for the purposes of a job, university or visa application, for example, and the test is not administered by the [organization] considering the application, a higher degree of control over the testing environment and process is necessary. Ultimately, the organization receiving that test score needs to have confidence that the score is fair and valid.”
Testing Platforms Manage Layered Security
“We have our own security measures in place to detect fraud. We have thresholds and alerts in place and other measures like ad hoc checks of audio and video recordings where malpractice is suspected.”
Each language testing company uses a testing platform that provides secure testing through many security features, such as software or applications that prevent any other browser, application or extension from opening during the testing session and that block the connection of a second screen.
The companies profiled have developed their own test security software platforms. Bright Language has the Bright Language Management Platform, Bright Secure; Duolingo has the Duolingo application; ETS has its own secure remote testing software; IELTS has its own secure proctoring remote testing platform with integrated video call speaking software; iTEP uses its own secure testing application; Kaplan uses its own KITE (Kaplan International Tools for English) software; and LanguageCert has its own, award-winning secure testing platform, ExamShield, which must be downloaded and used by candidates to ensure security during the testing session.
Because Cambridge University Press and Assessment uses third-party proctors, the testing management platform depends on the proctoring company.
Testing management software has a variety of functions, including preventing other browsers, tabs or extensions from opening. In most cases, sessions are recorded or regular photographs are taken, audio is analyzed, keyboard strokes are recorded and internal systems flag suspicious activities with a series of color-coded warnings, such as red, yellow and green. The flags are typically reviewed by human proctors to assess whether cheating is taking place or whether a false warning is due to something simple, such as a cat walking in the background behind the candidate.
Cambridge University Press and Assessment’s Budd described the process of remote testing as secure but warned that nothing is 100 percent secure. Browsers provided by the test proctors typically block everything, but some candidates find ways around the system and around the camera. Some proctoring services take still images of candidates regularly rather than recording the test taker, but Budd worries that this method of surveillance could create opportunities for test takers to take photos of content and post it online.
“The candidate is seen on camera but, of course, things off camera aren’t seen,” he said, of video or photographic technology. “There are always candidates who will try to beat the system. The best security is to have a large item bank so candidates cannot memorize information found online.”
ETS’ Senior Manager of Public Relations, Stephanie Winters, describes a steep investment that is necessary for these many layers of security to ensure test certifications are a true depiction of candidates’ abilities.
“Security is paramount to ETS. We spend tens of millions of dollars annually on our security infrastructure and technology,” Winters said. “Without this level of investment, we would not be able to stand behind the scores we send each day to the thousands of institutions that rely on them for high-stakes decisions. We hold ourselves to the highest standards that are not only expected by our stakeholders but that we expect of ourselves as an industry leader in security.”
The Kaplan International Tools for English testing platform “is housed on a very secure site, both in the Western World and in China,” Kimber said. “We have our own server in China, which makes it more secure, as well.”
Pearson English and other testing companies each have unique security features that provide ongoing assessment of test security during each session. Security features are always evolving to meet new challenges, as a small percentage of candidates continue to try new ways to circumvent security systems.
“We have our own security measures in place to detect fraud,” Giovannelli said. “We have thresholds and alerts in place and other measures like ad hoc checks of audio and video recordings where malpractice is suspected.”
Identification Verification Is Critical First Step
Confirming the identity of test takers is the first step in ensuring testing security. Testing organizations require an official ID with a photo of the candidate, which may be a passport, driver’s license or other official photo ID. The process differs depending on whether the candidate is taking the online test at home or in a testing center with in-person proctors, but in either case, extensive efforts are made to ensure the person who is identified is the person taking the test throughout the entire session and through all sections of the test, including the speaking portion.
The details of verifying identification vary a bit among the language testing organizations, even for the remote testing sessions, but Bright Language described the verification process with the following steps:
- The candidate provides an official ID with a photo before beginning the test.
- The candidate logs into the testing platform where he or she sees an explanation of the process.
- The screen shows a button to click so the system can check that the computer’s audio and video are working properly.
- The system asks the candidate to present the ID in front of the device webcam, and a photo is taken of the ID.
- At the next security level, which is completed by a person, the ID is checked against the live video of the candidate to ensure that the photo matches the person taking the test.
IELTS requires tests to be taken at testing centers, so confirmation of identity is completed in person by a proctor at the facility. LanguageCert testing may be completed at a testing center or in a home environment, but in online tests, identity is confirmed with a live, online proctor who verifies the official ID and the identity of the test taker in a manner similar to the one described above.
ETS follows a similar process for remote testing but also gathers a voice sample to be used for voice biometrics. The goal is to ensure the person completing the speaking portion of the test is the person identified during onboarding.
LanguageCert proctors are trained in facial recognition, so they assess the person taking the test and photo ID. A 360-degree room check is completed as well, which is also common in other remote testing practices. To increase candidate success, LanguageCert also ensures proctors are multilingual, improving communication between test takers and proctors.
“We are also developing a multilingual solution for proctors,” Woolnough said. The goal is to ensure that, in terms of onboarding, proctors can effectively communicate clearly with all candidates.
Video Recording of Candidates Is Mainstream Practice
Testing sessions are recorded with video surveillance in most cases; however, some testing management software instead takes periodic photos of candidates. In the case of video, many proctors have test takers pan the room with their cameras, including behind doors, in closets, under the desk, etc., to ensure no one else is in the room and no cheating materials are accessible during the test.
Even with a reasonable assurance that the room is secure, there are always other opportunities for candidates to cheat. One way that might not be readily evident is through bathroom breaks.
One thing to bear in mind, Budd said, is toilet breaks. In testing centers, candidates are typically walked to restrooms and the proctor will scan the bathroom to ensure no one is inside. Then, the candidate is escorted back to the testing room. Because online tests may be completed in one’s home, safeguards must be put into place to ensure candidates are not using that time to cheat on the exam. “We will only allow toilet breaks between modules of the test,” Budd said. “There’s just no way to ensure security otherwise.” Since modules are completed before the breaks, candidates cannot seek help on test questions during the breaks.
All test company representatives described flagging systems for anomalies or concerning behavior detected during test sessions. The flagging systems vary somewhat, but in all cases, flags are coded to indicate increasing levels of concerning behavior. For example, a red flag may indicate a serious concern, which could prompt immediate pausing or closing of the test session. A yellow flag may indicate a mid-level concern that prompts a pause and assessment of the activity by the proctor.
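As a rough illustration of the color-coded escalation described above, here is a minimal Python sketch. The `Flag` levels, the `GREEN` default and the action names are hypothetical, chosen only for the example; none of the companies interviewed described their actual implementation.

```python
from enum import IntEnum


class Flag(IntEnum):
    """Hypothetical severity levels; higher means more concerning."""
    GREEN = 0   # assumed default: nothing to review
    YELLOW = 1  # mid-level concern
    RED = 2     # serious concern


def action_for(flag: Flag) -> str:
    """Map a single flag to an illustrative proctoring action."""
    if flag is Flag.RED:
        return "pause_or_close_session"
    if flag is Flag.YELLOW:
        return "pause_for_proctor_review"
    return "continue"


def session_action(flags) -> str:
    """React to the most severe flag raised so far in a session."""
    worst = max(flags, default=Flag.GREEN)
    return action_for(worst)
```

Calling `session_action([Flag.YELLOW, Flag.RED])` escalates to the red-flag action, while a session with no flags simply continues.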
“Candidates are recorded, and any anomalies create a flag to be reviewed,” Duolingo’s Baig said. “The face has to be visible, and the system flags it [for example] if the person keeps looking away.”
ETS also uses gaze tracking, which creates a flag if the candidate is repeatedly looking down or off to the side. “Any flags are then reviewed by proctors and by the ETS Office of Testing Integrity,” Nicosia said. “After the test, there is continued analysis to ensure test integrity.”
In all cases, statistical analysis is done after the test is completed but before candidate scores are released.
“We have three main areas at ETS where we have the experts,” Nicosia said. “These include the Office of Integrity, technical IT people making sure nothing can get hacked from delivery and the area of research and statistics where we look for these abnormalities.”
Testing company examiners and proctors are also trained to identify any potential security breaches to ensure that there are layers of security assessment for every test.
“Our computer-based testing [centers] used to deliver PTE and PEIC represent the gold standard in terms of control,” Giovannelli said. “Staffed by trained test administrators and featuring video and audio monitoring, palm-vein biometrics and ID authentication amongst other security measures.”
Some Suggest Adaptive Tests Increase Security
There is one area of security that remains up for debate: adaptive vs. non-adaptive tests. For companies with computer adaptive tests that adjust in real time based on an initial assessment of candidate responses to questions, the adaptive nature of the test provides an additional layer of security. Tests may be longer or shorter, which some argue may better assess candidate abilities, but as a security measure, this also means that no two candidates receive the same test. This may prevent test takers from cheating in a variety of ways, including receiving help from friends or family who have taken the test and reviewing online leaks of test questions. Adaptive tests that pull from a large item bank may also provide a strong layer of security in this capacity.
Of the test company representatives interviewed, the following companies have adaptive tests: Duolingo, Cambridge University Press and Assessment, ETS, Kaplan, LanguageCert and Pearson English.
The Duolingo test is always relatively short, under an hour. So, the adaptive test does not necessarily get shorter, but within that time frame, questions seen by the candidate may become more difficult or less difficult based on responses to previous questions.
“Because it adapts, you don’t see as many questions that are outside your proficiency level,” Baig said. “It’s faster, but also makes it possible to deliver securely online because the tests differ for everyone. Having this big item bank to pull from for each person’s tests means there’s a far reduced risk of anyone getting the same questions.”
Other tests may become shorter based on the candidate’s responses, and in any case, the questions will be adapted to meet the proficiency level of the test taker. From a security standpoint, the adaptive nature means no two candidates will see the same test, and with a large bank of questions to pull from, the likelihood of seeing the same questions is also reduced.
Additionally, the large item bank ensures that scoring isn’t susceptible to tampering by proctors or administrators.
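To make the idea of adaptivity concrete, the following Python sketch shows one very simple scheme: a staircase rule that raises the difficulty after a correct answer and lowers it after an incorrect one. It is an assumption-laden toy, not the method used by any company named here; the 1-to-10 difficulty scale, the `run_adaptive` and `next_item` names and the item format are all hypothetical.

```python
def next_item(bank, seen, level):
    """Pick the unseen item whose difficulty is closest to the current level."""
    candidates = [item for item in bank if item["id"] not in seen]
    return min(candidates, key=lambda item: abs(item["difficulty"] - level))


def run_adaptive(bank, answer_fn, start_level=5, n_items=5):
    """Administer n_items questions, stepping difficulty up or down by one.

    Assumes the bank holds at least n_items items and that difficulty
    runs from 1 (easiest) to 10 (hardest).
    """
    level, seen, administered = start_level, set(), []
    for _ in range(n_items):
        item = next_item(bank, seen, level)
        seen.add(item["id"])
        administered.append(item["id"])
        correct = answer_fn(item)  # True if the candidate answers correctly
        level = min(10, level + 1) if correct else max(1, level - 1)
    return level, administered
```

Two candidates who answer differently are routed to different questions, which is the security property the representatives describe. Operational adaptive tests typically rely on statistical models of item difficulty rather than a fixed step.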
Some company representatives conversely suggested adaptability may not accurately assess students’ language mastery, arguing that online tests should closely match those created for in-person testing. This argument was typically made for tests that are used for school admissions. As for security, the consensus was that secure identity verification measures, recording of sessions, computer or AI grading and human graders make up the features needed to ensure any cheating attempts will be caught.
Testing companies that do not offer adaptive tests include Bright Language, IELTS and iTEP.
“If you take a look at an IELTS test, you’re producing a lot of language. You’re not identifying and selecting,” Foster said. “We’re really aligned with what the academic needs; the expectation is to be communicative. It’s a lot of short answers, extended writing and, of course, the speaking test.”
Brosam, of iTEP, said, “We have not invested yet in a lot of AI software. A lot of our partners are US institutions; the partners have privacy concerns.” Brosam provided the example of retinal tracking, which tracks eye movements to identify potential cheating actions, such as looking down or to the side repeatedly. “Also, [AI is not used] because a lot is not yet accurate,” he added.
Brosam also argued that an adaptive test does not necessarily mean the test is more secure. “The key is having a unique test format (versus a fixed form with the same questions) and a large inventory of content,” he said.
“You can have an adaptive test; however, if you do not have a large content inventory, you can end up with the students getting the same questions over and over,” he explained. “So, while iTEP does not conduct adaptive testing, we do utilize unique content so test takers get different questions each time. The goal is a more comprehensive assessment (still within 90 minutes).”
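Brosam's point about inventory size can be put in rough numbers. The Python sketch below estimates the chance that two test takers see at least one question in common, under the simplifying assumption (not drawn from any company's practice) that each test is a uniform random draw of distinct items from the bank. The function name and parameters are hypothetical.

```python
from math import comb


def overlap_probability(bank_size: int, items_per_test: int) -> float:
    """Chance that two independently drawn tests share at least one item."""
    if not 0 < items_per_test <= bank_size:
        raise ValueError("items_per_test must be between 1 and bank_size")
    # Ways to draw the second test while avoiding every item in the first,
    # divided by all ways to draw the second test.
    no_overlap = (
        comb(bank_size - items_per_test, items_per_test)
        / comb(bank_size, items_per_test)
    )
    return 1.0 - no_overlap
```

Under this model, `overlap_probability(1000, 20)` is lower than `overlap_probability(100, 20)`, and a bank too small to build two disjoint tests makes overlap certain, regardless of whether the test is adaptive.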
Whether tests adapt to user proficiency or provide standard questions from large test banks, live graders and examiners are often used to ensure accurate assessment of candidates’ submissions. Adaptive tests make the testing process faster by quickly homing in on a student’s level. As for security, the benefit of adaptive testing is that it can make each test unique to a user by pulling from a large bank of questions. This makes it harder for students to cheat.
Humans Still Play a Role
While technology plays a very active role in identifying a variety of security threats and improving testing access and functionality, it’s clear that humans are still a necessary element in the testing process. AI is often used to adapt tests to candidates’ proficiency levels and flag anomalies to increase security, and many companies use computer grading for at least some portions of the test. But, in all cases, testing company representatives stressed the importance of including humans, particularly in security measures. Human proctors, whether live or asynchronous, and human graders are still standard practice and, in the case of remote test delivery, may play an essential role in maintaining test security.
Use of Human Proctoring Services Reflects Limits of Technology
Some testing companies use e-proctoring services, while others have their own internally trained proctors. In either case, proctors are an important element in online testing security, as they perform the task of monitoring the test candidate.
Both Duolingo and iTEP use asynchronous proctors, meaning the test sessions are recorded and reviewed after the test is complete. “For us, one benefit is we don’t have any scheduling issues,” Brosam said of iTEP’s practice of asynchronous proctoring. “As long as you have a camera and web phone, you can take a test any time of the day.”
“Speaking is where most try to cheat, but it’s live videotape and we can watch them speak,” Brosam said. He described the value of asynchronous proctoring as stemming from two reasons: lack of scheduling challenges related to live proctoring and removal of the issue of proctors managing large groups of students simultaneously, leaving room for security breaches.
“We use the PhotoSure and live video recordings for speaking and we have everything documented,” he said. “It’s also great for partner schools, too. The interesting thing about the camera is people pretty quickly forget they’re on camera.”
Duolingo records the test session on the desktop app, which has security features such as shutting down any other programs, as well as recording the candidate and keyboard strokes. Once the test is submitted, the review is completed through AI and human reviewers.
“First, we have AI review for flags,” Baig said. “Then, human proctors go in and review the algorithm’s assessment. To us, this is a much more comprehensive and secure way of doing it.”
Bright Language uses a proctoring vendor, but the proctor is integrated into the Bright Language system. Both the proctor system and the Bright Language system record the test. A live meeting is scheduled with the candidate.
Some testing company representatives touted the importance of live proctoring for increased security, while others preferred subsequent analysis of asynchronous, recorded proctoring sessions. In either case, layers of computer analysis and human analysis are used to identify any fraudulent activity.
Cambridge University Press and Assessment uses third-party vendors, including Talview Proctoring Solution, Sumadi and several others. Cambridge tests are described as mid- to low-stakes; while a secure environment is still important, the stakes are not as high as those in a test that may determine admission into a competitive university or approval of a work visa.
“Because they are lower stakes, we’re able to put a lot of the autonomy in the hands of the agents as to how much scrutiny [goes into the test],” Budd said. “We do run our own analyses as well. The agents are third parties.”
Schools working with Cambridge University Press and Assessment typically choose the proctoring service they want students to use for testing.
iTEP does not use live proctors for the speaking portion of the test, instead reviewing the recording of the test for security and using human graders to assess the student’s performance.
Kaplan International Tools for English uses an e-proctoring vendor. The vendor records the session and reviews it, and then KITE has human graders who also review the recorded sessions.
ETS, which uses live proctoring, maintains a low ratio of candidates to proctors for home delivery, which may also improve security compared to traditional proctoring. In many cases, proctoring at testing centers may involve one proctor for a large group of test takers. The ETS proctoring vendor also utilizes multiple levels of proctors. A test flag may trigger a supervising proctor to log into a test session, and a serious flag may even trigger a member of the ETS Office of Integrity to join the test session for additional oversight.
In some cases, the testing companies train and utilize their own proctors, preferring to maintain control of the proctor training and oversight process.
“IELTS has its own proctors who are recruited, trained and managed,” Foster said. “We have a face-to-face speaking test with a trained examiner who is ESL. In our face-to-face test, it’s focused on giving an accurate ability of the test taker to use English in real life rather than fixing the tech to fit what’s available. It’s a human-led and driven test, and the online test combines those authentic tasks with the online test.”
LanguageCert assigns one proctor per eight test takers, and proctors have supervisors who can assist as needed. The live proctors, supported by AI alerts, react to and manage any security issues in real time as test takers proceed through the tests.
“Our approach is multifaceted,” Woolnough said. “We very much believe in the use of live proctors, not just for onboarding but for continuous monitoring through the whole test. These are proctors that we employ, train and monitor.”
Pearson English also uses a live proctoring system coupled with test session recording to ensure control of the testing environment and real-time responses to any security flags.
“Remote proctoring technologies have allowed us to retain aspects of the control offered in test [centers] whilst offering accessibility to those unable to access a test [center],” Giovannelli said of Pearson English’s live proctoring system, OnVUE. “For instance, The OnVUE solution we use to deliver PTE Academic Online has excellent security features – including constant webcam monitoring by a human proctor.”
Human, Computer-Based Grading Increases Security
For many English language testing companies, human graders complete all grading, but in most cases, it’s a combination of AI or computer grading and human graders. Typically, the speaking and writing sections are graded by human graders or examiners. If there are concerns about security, AI creates a flag relevant to the level of concern, and then human graders review the work. For example, a plagiarism alert may be a red flag. Examiners would then review that portion of the written test to ensure the content is, in fact, plagiarized.
In several interviews, the importance of human graders was stressed. Minimizing multiple-choice or fill-in-the-blank answers not only increases the security of the tests but also provides a more accurate assessment of students’ abilities to utilize the language in an appropriate manner for work or school.
Duolingo uses an automated scoring system, which, Baig said, ensures every student is scored uniformly, reducing the risk of subjectivity in scoring.
Kaplan International Tools for English, like many others, uses plagiarism software to catch any plagiarized content. Listening and grammar are computer graded, but human examiners perform grading and assessment on the recorded writing and speaking portions of the test.
ETS uses a combination of human graders and AI grading for its adaptive test. “We have human graders for TOEFL speaking and writing,” Nicosia said. “Some AI is used to supplement that, as well. But there are humans involved.”
“The LanguageCert test is assessed by LanguageCert trained markers; it’s all live graded,” Woolnough said. “Once the test is finished, the responses are uploaded to a secure system and are assigned to markers anonymously. Closed responses (i.e., right/wrong answers) are graded by the system, but the rest are marked by an examiner.”
The speaking portion of the LanguageCert test is a live, interactive meeting with a LanguageCert-appointed interlocutor. “The interlocutor conducts the room check and onboarding, and they also conduct the speaking part,” Woolnough said. “We feel it’s a more authentic way of assessing speaking skills. How we assess speaking (and other skills) is the same no matter how the test is delivered. The speaking exam is recorded and uploaded for our markers to assess and grade.”
Pearson English tests are scored using AI, as well as trained examiners.
“The spoken section of our exams is delivered by the computer and scored via AI,” Giovannelli said. “Online tests are scored by our automated scoring. Creating an accurate automated scoring system requires a vast amount of data. A test taker’s answers are not just reviewed by one examiner; instead, they are compared against millions of past responses and the combined knowledge of hundreds of examiners to give the most accurate, objective, consistent score possible.”
Some IELTS grading is machine-scored. “If there’s a set of accurate, like fill-in responses, they are machine-scored,” Foster said. “Nothing is sitting on the stove unmonitored here.”
Foster said the IELTS test does not have a lot of “fill-in” responses; instead, candidates are expected to communicate in ways that meet academic communication expectations.
iTEP uses only human graders for the writing and speaking sections. “Our graders are US ESL professionals and have been with us at least five years,” Brosam said. “Our ESL grading only includes writing and speaking prompts. Grammar, listening and reading are all computer graded. We have professional graders and above master graders. So, we have sets of eyes. Admissions, jobs and business careers are high stakes.”
There may not be consensus on the most secure but valid means of grading among the testing companies, but in each case, continued measures are taken to ensure scoring is an accurate representation of candidate abilities.
Regular Internal, External Audits Continue Innovating Security Measures
Security audit practices vary among the testing companies, with some bringing in third-party solutions to check for any vulnerabilities. Most companies perform consistent internal evaluations, and some also complete scheduled external evaluations.
“We do regular diagnostic testing and audits,” Baig said of Duolingo’s security auditing policy. “Near the end of last year, an external firm was hired to try to hack into the test and reveal any issues with security. We engage with that regularly. It’s a very proactive approach. We also regularly hire outside people to audit … Plus, we are always monitoring tests.”
IELTS also combines internal evaluation with outside auditing solutions.
“We use a combination of our own processes and we have leading third-party solutions,” Foster said of IELTS’ auditing practices. “We do have an audit program and have for many years. IELTS is a partnership and there are standards, so we have a regular audit program that we run. And there are other solutions working with third parties.”
LanguageCert, like several other test providers, utilizes statistical analysis, not only to catch candidates who are trying to cheat but also to identify any vulnerabilities. “For example, if someone is taking the exam too many times, that might be a sign of item harvesting [test information],” Woolnough said. “We run statistical analysis on ourselves and on candidates. We are regulated by an external regulator, so there are strict standards of test development and delivery.”
The external regulator is Ofqual (the Office of Qualifications and Examinations Regulation), a non-ministerial department that regulates qualifications, examinations and assessments in England.
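The retake signal Woolnough mentions lends itself to a very simple check. The Python sketch below counts attempts per candidate and lists anyone above a threshold for human review. The log format, the `frequent_test_takers` name and the threshold of three attempts are hypothetical, chosen only for illustration; LanguageCert did not describe its analysis in this detail.

```python
from collections import Counter


def frequent_test_takers(attempt_log, max_attempts=3):
    """Return candidate IDs that appear more than max_attempts times.

    attempt_log is an iterable of candidate IDs, one entry per sitting.
    A high count is only a prompt for review, not proof of item harvesting.
    """
    counts = Counter(attempt_log)
    return sorted(cid for cid, n in counts.items() if n > max_attempts)
```

In practice, a check like this would be one input among several, since repeat sittings are also common among candidates who are simply trying to improve a score.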
Pearson English has a dedicated Academic Standards and Research team, “who are key to the development of our tests,” Giovannelli said. “This team rigorously [analyzes] all data and compares the tests against the language proficiency levels required to ensure they are accurate, fit for purpose and high-quality tests.”
In addition to conducting analysis and research in-house, Pearson English also commissions “research as required with world-leading universities and professionals in the field to provide an independent and expert view of our tests.” Reviews encompass both security and test validity.
Research Endorses Transition to Online Testing
Because online, at-home language testing is a relatively new phenomenon, much of the research on the topic is relatively recent. As the practice of remote testing delivery becomes more commonplace, there is sure to be continued research into all avenues of the topic. Much of the current research has been completed during the past few years, as the pandemic created the necessity for online delivery and educational institutions began accepting online test certifications.
Common findings among the following articles include initial resistance to online proctoring of high-stakes testing, followed by increasing adoption and acceptance of the practice; praise from students for the accessibility of remote tests and the reduction in testing anxiety; identification of new avenues of cheating with online tests, particularly third-party paid assistance; the effectiveness of a large item bank as a means of test security; and a reduction in cheating in exams that are text-rich and supervised by live proctors.
In the article, “Online invigilation of English language examinations: A survey of past China candidates’ attitudes and perceptions,” by David Coniam, published in the International Journal of TESOL Studies in 2022, researchers draw on a previous study of the reactions of past candidates in China to online proctoring of high-stakes English language testing. The paper references the shift to remote testing during the COVID-19 pandemic and the acceleration of remote testing practices, which influenced the adoption and increasing acceptance of remote teaching, as well.
In the Coniam article, 64 (7%) of the 920 respondents to a survey sent to past candidates of LanguageCert IESOL examinations were from China, and responses indicate a positive acceptance of online proctored (OLP) tests, with a strong endorsement of OLP delivery versus traditional paper-based delivery.
Coniam also points out that cheating on tests is not a new phenomenon, suggesting the use of the internet has simply provided a new avenue of digital sources and networks of people who facilitate paid cheating.
In the article, “Technology enhanced assessment (TEA) in COVID 19 pandemic,” by researchers Rehan Ahmed Khan and Masood Jawaid and published in the Pakistan Journal of Medical Sciences in 2020, it is noted that the three areas of teaching, learning and assessment should be equally embraced in the context of access and delivery. The researchers emphasize the importance of changing negative perceptions of online delivery of assessments and embracing technology-enhanced learning and assessment in education.
Another article, “Testing in the time of COVID-19: A sudden transition to unproctored online exams,” by Ted Clark, was published in the Journal of Chemical Education in 2020. In the study, Clark notes that in the context of courses taught remotely, a remote exam provides continuity. Clark also found that use of a large test item bank is among the most effective means of guarding against compromised exams.
The article, “Detecting contract cheating: Examining the role of assessment type,” by Rowena Harper, Tracey Bretag and Kiata Rundle, published in 2021 by Higher Education Research and Development, explores the many issues of cheating in exams over the past decade, noting three areas of fraudulent activity. First, students report most frequently cheating on exams, particularly through the use of third-party cheating, and yet, staff members report rarely identifying evidence of cheating on exams. Secondly, students report cheating less on written assessments, such as essays; however, teaching staff report identifying evidence of cheating on written assessments more frequently than on exams with multiple-choice or fill-in-the-blank type responses. Thirdly, staff detection rates of cheating are highest in text-rich assignments and in invigilated or supervised exams, which bodes well for online language testing rich in textual assignments and with live proctoring.
Despite greater acceptance of remote and hybrid teaching practices, researchers suggest there may be continued resistance to remote testing from some organizations, particularly for high-stakes tests, but over time, the practice will become mainstream. Much of the research on online, at-home language testing is recent, as the pandemic created a situation that may have jumpstarted the at-home testing solution earlier than expected. However, now that the practice has become normalized, with strict security measures in place, it’s likely that researchers will continue monitoring all aspects of the practice, eventually providing long-term data.
Testing Companies Will Continue to Innovate, Meet Evolving Security Needs
The future looks bright for online language testing, but methods of ensuring security will continue to evolve. The consensus of testing company representatives is the need for layers of security, combining AI, humans and statistical analysis.
“We’re always trying to see what’s on the market, see if we can add a new layer or feature regarding the security,” Cristaldi said.
Cambridge University Press and Assessment is experimenting with having two cameras for test candidates. “You’d have the web camera and an integration through the mobile phone through WhatsApp,” Budd said. “[Candidates] have to put their phone to one side, so the face and side of the candidate [would be visible]. This would make it more secure, but there are challenges with privacy and the tech cost of that.”
Duolingo recently launched a new item type, Interactive Reading, which requires candidates to fill in missing words and letters to make a passage whole. The new item helps students exercise skills they’re using in an educational environment, such as isolating different parts of passages.
“We’re trying to make the test a real-life use of language,” said Research Communications Specialist Sophie Wodzak.
ETS will continue to work with machine learning to improve security functions and decrease “false positives” that flag anomalies that aren’t security concerns.
“We are always looking to upgrade. We know it’s a chess game. You always want to be one move ahead,” Nicosia said. “We’re looking at different ways of delivery. Making sure they can’t do anything during the test and to detect it when they do. We have some wonderful tools we use to protect it now, but we’re always working.”
“iTEP has a lot of testing for multinational companies, higher, middle management and executive English levels,” Brosam said. “The growth for us is on the business career side. We have a great hospitality exam.”
Brosam sees the future of language testing as increasingly localized and with greater test prep available online. He also predicts increases in asynchronous usage as AI gets better and privacy concerns are addressed. The goal for iTEP is to put the control of test design into the hands of clients/schools.
“We’re very flexible and can do it,” he said. “We hope that within a few years, [agencies] can call up and say, ‘Here’s what I want,’ and we can do it.”
Kimber said Kaplan International Tools for English’s plans for the future include continuing to work toward a seamless remote testing experience that is secure and robust.
Woolnough said the future for LanguageCert will focus on ensuring that, “As a company, we are continually investing in our platform and our systems, staying one step ahead in terms of security.”
She envisions increased use of biometric information in high-stakes testing. It may be assumed that online testing is a cheaper option for testing organizations, but the cost of these security features is much higher than that of the traditional model.
“It’s a challenge for organized test providers,” Woolnough said. “It’s quite a big investment.”
Pearson English’s Giovannelli shared the sentiment that online testing is here to stay: “Whilst we believe there is still a place and a demand for classroom-based teaching and bricks and mortar test buildings, we also see that online testing is here to stay and will keep evolving.”