Megatester

Megatester is a Web-based Computer Adaptive Testing system whose primary function is to deliver multiple diagnostic versions of different kinds of tests, along with robust test creation and administrative features. It was developed for a company specializing in online training, located in California, USA.

When an examinee takes a test on a computer, an estimate of the examinee's ability can be updated after each response, and that estimate can then be used to choose the next question. This is the idea behind Computer Adaptive Testing (CAT).

CAT is much more efficient than a traditional paper-and-pencil test, which is typically a "fixed-item" test in which all examinees answer the same questions. Since everyone is asked the entire set of questions, all examinees are forced to answer many questions that are too easy or too difficult for them, and these easy and hard items provide relatively little information about an examinee's actual ability level.
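To make the control flow concrete, here is a simplified C++ sketch of an adaptive session (all names are illustrative, and the crude ability update is only a stand-in for the IRT estimation described in the Mathematics section below): after every response the ability estimate is revised, and the next question is the unanswered one whose difficulty is closest to the current estimate.

    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Illustrative item: only the difficulty parameter matters here.
    struct Item {
        double difficulty;   // IRT difficulty score of the question
        bool   asked;        // already presented to the examinee?
    };

    // Pick the unanswered item whose difficulty is closest to the current
    // ability estimate: the simplest possible adaptive selection rule.
    int NextItem(const std::vector<Item>& pool, double theta)
    {
        int best = -1;
        double bestDist = 1e9;
        for (size_t i = 0; i < pool.size(); ++i) {
            if (pool[i].asked) continue;
            double d = std::fabs(pool[i].difficulty - theta);
            if (d < bestDist) { bestDist = d; best = (int)i; }
        }
        return best;   // -1 when the pool is exhausted
    }

    int main()
    {
        std::vector<Item> pool = { {-1.0, false}, {0.0, false}, {1.0, false}, {2.0, false} };
        double theta = 0.0;                      // start from an average ability
        for (int q = NextItem(pool, theta); q != -1; q = NextItem(pool, theta)) {
            pool[q].asked = true;
            bool correct = (q % 2 == 0);         // stand-in for the examinee's answer
            // Placeholder update: the real system recomputes theta with the
            // maximum-likelihood iteration shown in the Mathematics section.
            theta += correct ? 0.5 : -0.5;
            std::printf("item %d, theta now %.2f\n", q, theta);
        }
        return 0;
    }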

Megatester allows:

  • test creation, including the use of multimedia content in questions,
  • test taking controlled by the CAT module,
  • detailed statistical reports,
  • administration of users and tests (including CAT parameters).

A test consists of a set of pages, where each page contains a question with several answers, one or more of which can be correct. A question may be a text, a sound, or a Macromedia Shockwave or Flash demonstration. Every question and every answer has certain CAT scores (according to CAT there are three types of scores: discrimination, difficulty and guessing), which are later used by the CAT subsystem.

The basic workflow is as follows: when a user logs into the system, he gets a screen where he can choose either to take a new test or to view the history of his previous tests. When a new test is chosen, the user is shown a screen where he can select a section of the test to take. After that, every consecutive screen displays a new question in the browser, whose complexity depends on the decision made by the CAT module on the basis of the previous answers.
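A minimal C++ sketch of the data model implied by this description (the type and field names are illustrative, not the actual schema of the system): each page carries one question, its answer options, and the three CAT scores used by the CAT subsystem.

    #include <string>
    #include <vector>

    enum class QuestionKind { Text, Sound, Shockwave, Flash };

    struct Answer {
        std::wstring text;      // answer option shown to the user
        bool         correct;   // one or more answers may be correct
        double       score;     // CAT score attached to this answer
    };

    struct Question {
        QuestionKind        kind;      // text, sound, Shockwave or Flash
        std::wstring        body;      // question text or media reference
        std::vector<Answer> answers;
        // The three CAT scores of the question:
        double discrimination;
        double difficulty;
        double guessing;
    };

    // A test is a set of pages, one question per page.
    struct Page { Question question; };
    struct Test { std::vector<Page> pages; };

    int main()
    {
        Test sample;
        sample.pages.push_back(Page{ Question{ QuestionKind::Text, L"2 + 2 = ?",
            { {L"3", false, 0.0}, {L"4", true, 1.0} }, 1.1, 0.3, 0.2 } });
        return 0;
    }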

User types

  1. User is able to take tests and see the summary review of results.
  2. Administrator is able to manage permissions, create test creators, group administrators, users and other administrators in the system, and publish tests.
  3. Test creator is able to create tests, but not to publish them.
  4. Group administrator is able to create users in his group and to run searches and other reviews across his whole group.

Functionality

Depending on their privileges, users see different screens. The functionality available to a logged-in user, according to privilege:

  1. User section
    • CAT-based testing system
    • Support of multiple types of questions
    • Ability to continue an interrupted test
    • Detailed results reporting, including graphical form and export of results to Excel and XML files
    • Animated tutorials to help the user use the system
    • CAT and 'question type' modules implemented as independent add-ins to the system
  2. Administrator section
    • Grouping of teachers, course creators and administrators
    • Search for user scores by name, test, date or group
    • Ability to manage courses (create new, edit and delete existing)
  3. Test creation section
    • Creation of new tests, and editing and deletion of existing ones.
    • Uploading of audio files and pictures to the server.
  4. Teacher's section
    • Examination of users belonging to the teacher's group, viewing their statistics and results, with search capability.
    • Checking users' essays.
    • Administration of users in the teacher's group.

Basics of implementation

The system uses Microsoft Internet Information Server and Active Server Pages technology for generating dynamic HTML content. Supported browsers: Netscape from version 3.0 and Internet Explorer from version 4.0. The content sent to the user is standard HTML (version 1.3) with small fragments of JavaScript 1.1 code that are independent of the browser platform.

The server side of the system is based on XML. We used XML and XSL intensively in this project, putting them at the core of the whole architecture from the beginning. XML allowed us to store tests (questions and answers) in a very flexible format, which can be read by the different sub-systems as well as by the CAT module. A sample of the XML file for a TOEFL test is here. For XML parsing the system uses the Microsoft XML parser, in the version that is distributed for free with Microsoft Internet Explorer 5.0; this means that Microsoft Internet Explorer 5.0 must be installed on the server.
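As a rough sketch of how such a test file can be read through the Microsoft XML parser's COM interfaces (shown in C++, although in the real system the ASP scripts do most of this work; the file name toefl.xml and the element name question are assumptions, since the sample file itself is not reproduced on this page):

    #include <windows.h>
    #include <msxml2.h>
    #include <comdef.h>
    #include <cstdio>

    // Loads a test definition with the Microsoft XML parser (MSXML, COM).
    bool LoadTest(const wchar_t* path)
    {
        CoInitialize(NULL);
        IXMLDOMDocument* doc = NULL;
        HRESULT hr = CoCreateInstance(__uuidof(DOMDocument), NULL,
                                      CLSCTX_INPROC_SERVER,
                                      __uuidof(IXMLDOMDocument), (void**)&doc);
        if (FAILED(hr)) { CoUninitialize(); return false; }

        doc->put_async(VARIANT_FALSE);              // load synchronously
        VARIANT_BOOL ok = VARIANT_FALSE;
        doc->load(_variant_t(path), &ok);           // parse the XML test file

        if (ok == VARIANT_TRUE) {
            IXMLDOMNodeList* questions = NULL;
            // Assumed element name; the real schema may differ.
            doc->selectNodes(_bstr_t(L"//question"), &questions);
            if (questions) {
                long n = 0;
                questions->get_length(&n);
                std::wprintf(L"%ld questions loaded\n", n);
                questions->Release();
            }
        }
        doc->Release();
        CoUninitialize();
        return ok == VARIANT_TRUE;
    }

    int main()
    {
        return LoadTest(L"toefl.xml") ? 0 : 1;
    }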


Figure 1. The main schema of the system

A Web browser sends a request to the web server (Microsoft Internet Information Server). The request is processed by an ASP script implementing the core functionality of the system. The ASP script reads the XML file containing the course questions and, depending on the type of the question being processed, uses the question template specified by the corresponding question-type module to generate the HTML output representing the question. Which question is sent to the user next depends on the calculations performed by the CAT module.
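The question-type modules behave like pluggable renderers: each type supplies its own template for turning question data into HTML. A hypothetical C++ sketch of that dispatch idea (in the real system the templates are applied by the ASP script, and every name below is invented for illustration):

    #include <functional>
    #include <iostream>
    #include <map>
    #include <string>

    // Hypothetical renderer registry: one HTML generator per question type.
    using Renderer = std::function<std::string(const std::string& questionText)>;

    std::map<std::string, Renderer>& Registry()
    {
        static std::map<std::string, Renderer> r;
        return r;
    }

    std::string RenderQuestion(const std::string& type, const std::string& text)
    {
        auto it = Registry().find(type);
        if (it == Registry().end())
            return "<p>Unsupported question type: " + type + "</p>";
        return it->second(text);   // delegate to the type-specific module
    }

    int main()
    {
        // A plain text question and a sound-based one, as named in the description.
        Registry()["text"]  = [](const std::string& t) { return "<p>" + t + "</p>"; };
        Registry()["sound"] = [](const std::string& t) {
            return "<embed src=\"" + t + "\"></embed>";
        };

        std::cout << RenderQuestion("text",  "2 + 2 = ?") << "\n";
        std::cout << RenderQuestion("sound", "clip42.wav") << "\n";
        return 0;
    }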

If the user exceeds the time limit given for answering a question, the browser automatically submits the answer form to the server and the user receives the next question, with no score for the previous one.

Mathematics

According to Item Response Theory (IRT), the probability of a correct answer (u = 1) to item i by an examinee with true ability $\theta$ is given by the three-parameter logistic model, formula (1):

    $P_i(\theta) = c_i + \dfrac{1 - c_i}{1 + e^{-a_i(\theta - b_i)}}$    (1)

where $a_i$, $b_i$ and $c_i$ are the discrimination, difficulty and guessing parameters of the item.

Item information function:

    $I_i(\theta) = a_i^2 \,\dfrac{Q_i(\theta)}{P_i(\theta)} \left(\dfrac{P_i(\theta) - c_i}{1 - c_i}\right)^2, \qquad Q_i(\theta) = 1 - P_i(\theta)$

To calculate the new proficiency estimate we use the iteration

    $\theta_{s+1} = \theta_s + \dfrac{\sum_{i=1}^{n} P_i'(\theta_s)\,[u_i - P_i(\theta_s)] \,/\, [P_i(\theta_s)\,Q_i(\theta_s)]}{\sum_{i=1}^{n} I_i(\theta_s)}$

where $u_i \in \{0, 1\}$ is the response to item i, $P_i' = \partial P_i/\partial\theta$ and n is the number of items answered so far, repeated until the right side of the equation (the adjustment) becomes extremely small.

The test is finished when the standard error

    $SE(\theta) = \dfrac{1}{\sqrt{\sum_{i=1}^{n} I_i(\theta)}}$

becomes less than some predefined value.
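A compact C++ sketch of these formulas, roughly as they could appear inside the CAT component (the variable names are illustrative, not taken from the actual code): it evaluates the 3PL probability, the item information, runs the maximum-likelihood iteration until the adjustment is negligible, and reports the standard error used as the stopping criterion.

    #include <cmath>
    #include <cstdio>
    #include <vector>

    struct Item { double a, b, c; };   // discrimination, difficulty, guessing

    // 3PL probability of a correct answer at ability theta, formula (1).
    double P(const Item& it, double theta)
    {
        return it.c + (1.0 - it.c) / (1.0 + std::exp(-it.a * (theta - it.b)));
    }

    // Item information function of the 3PL model.
    double Info(const Item& it, double theta)
    {
        double p = P(it, theta), q = 1.0 - p;
        double r = (p - it.c) / (1.0 - it.c);
        return it.a * it.a * (q / p) * r * r;
    }

    // Maximum-likelihood ability estimate: iterate until the adjustment is
    // extremely small, then report the standard error used to stop the test.
    double EstimateAbility(const std::vector<Item>& items, const std::vector<int>& u)
    {
        double theta = 0.0;                            // start from average ability
        for (int step = 0; step < 100; ++step) {
            double score = 0.0, fisher = 0.0;
            for (size_t i = 0; i < items.size(); ++i) {
                double p  = P(items[i], theta), q = 1.0 - p;
                double dP = items[i].a * (p - items[i].c) * q / (1.0 - items[i].c);
                score  += dP * (u[i] - p) / (p * q);   // gradient of the log-likelihood
                fisher += Info(items[i], theta);       // Fisher information
            }
            double adjustment = score / fisher;
            theta += adjustment;
            if (std::fabs(adjustment) < 1e-4) break;
        }
        double info = 0.0;
        for (size_t i = 0; i < items.size(); ++i) info += Info(items[i], theta);
        std::printf("theta = %.3f  SE = %.3f\n", theta, 1.0 / std::sqrt(info));
        return theta;
    }

    int main()
    {
        std::vector<Item> items = { {1.2, -0.5, 0.20}, {0.9, 0.0, 0.25}, {1.5, 0.8, 0.20} };
        std::vector<int>  u     = { 1, 1, 0 };         // 1 = correct, 0 = wrong
        EstimateAbility(items, u);
        return 0;
    }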

Item calibration is particularly difficult. Our implementation is not flexible enough and does not work correctly for each of the three parameters of the IRT model. If you have carried out any research in this area and would like to share it with us, we would be happy to hear from you.

Reports

The reporting section displays the results and allows the user to go through the questions he or she answered, seeing both the given answers and the correct answers.

The user is able to see the summary review in two forms: as a table and as graphs.

  1. Tables
    The results of the tests taken are presented as a table showing the question numbers, the user's score, the maximum score for each question, the time provided to answer the question and the time taken by the user. By clicking on a question number the user can view the visual presentation of the question, with the answer selected by the user and the correct answer shown. If a question was sound-based (Listening section), the sound is replayed.
  2. Graphs
    Two graphical representations of the user results are possible.
    1) The question numbers on the X-axis, scores on the Y-axis, and two graphs: a red graph showing the correct answers with the maximum scores and a blue graph showing the user's answers and scores. By clicking on a question number on the X-axis the user opens a screen with the selected question.
    2) The question numbers on the X-axis, time on the Y-axis (with the maximum equal to the maximal time allotted for a single question), and one graph showing the time spent answering each question. In this way the user can determine which questions were especially difficult for him to answer. As above, clicking on a question number on the X-axis opens a screen with the selected question.

Tools and technologies used:

    User Interface: HTML, DHTML, XML/XSL/XSLT, JavaScript
    Server-side: ASP, ADO, C++/ATL (CAT component)
    Web-servers: MS IIS 4.0
    OS: Windows NT/2000

Time spent:
36 man-months.