Last updated 2017/6/1
The Data Developed by the ICNALE Team
The Data Developed by the 3rd Party
The ICNALE Edited Essays 0.3 (formerly called The ICNALE-Proofread) has been released. It now includes 440 learner essays and the same number of edited essays. Rating information is also included.
The ICNALE-Proofread 0.2 has been released, which comprises 300 essays written by EFL learners and the versions of those essays edited by professional proofreaders. It is a unique dataset for error analysis.
The ICNALE-SW 1.1 has been released, which comprises The ICNALE-Spoken V1.2 and The ICNALE-Written V2.1.
The ICNALE-SW 1.0 has been released, which comprises The ICNALE-Spoken V1.1 and The ICNALE-Written V2.1.
The ICNALE-Spoken 1.0 has been released.
The ICNALE-Spoken Baby 1.3 has been released.
Data from Indonesian learners have been newly added. The new Baby includes transcripts and audio files of 2,900 speeches by 650 learners in Asia and 75 native speakers of English.
The ICNALE ASMS (Automatic Speech Morphing System) Version 1.0 has been released.
This standalone software changes the pitch and formants of collected speech data, which helps to preserve participants' anonymity.
Dr./ Assoc. Prof. Ryo NAGATA of Konan University (Website) designed a new phrase structure annotation system for parsing learner English. Using this system, he tagged part of the ICNALE-Written. You can now download the ICNALE-PSA (phrase structure annotation), Sample Data, which includes 134 parsed texts (33,913 tokens).
What is included in The ICNALE for Download?
Registered users can download the whole corpus data and analyze it with concordancers such as AntConc and WordSmith or with self-made analytical programs.
Released by the ICNALE Development Team
The ICNALE-SW_1.1 Texts [17MB]
---- ICNALE_SW_1.1_Merged Texts (Plain/ Tagged)
---- ICNALE_SW_1.1_Unmerged Texts (Plain)
---- ICNALE_SW_V1.1_Release Note
The ICNALE_SW_1.1_Sounds [Caution! Approx. 1GB]
---- mp3 sound files
Registered users can obtain the whole corpus data and freely use it for academic purposes. If you plan to use the data for commercial purposes, please contact the project team in advance.
Released by the 3rd Party
The ICNALE-PSA (phrase structure annotated edition), Sample Data [280KB] (Compiled by Dr./ Assoc. Prof. Ryo NAGATA of Konan University (Website))
How to Obtain the Data
1. Firstly, download the data you need.
2. Then, register via the ICNALE User Registration Form to obtain the passwords for unzipping.
You do NOT need to register separately for each dataset.
If you cannot reach the registration page, please send your name, your institute, and your position (e.g. Prof./ Grad Student/ Undergrad/ Independent researcher) directly to the project leader.
3. You will receive a password within a few days. If you do not receive a reply, please contact the project team.
If you use Mac OS, you may need software such as StuffIt or ZipEZ to unzip the downloaded file. Alternatively, ask a friend who uses Windows to unzip it for you :-)
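If no suitable unzip utility is at hand, a short script may also work on any platform. The following is only a minimal sketch using Python's standard zipfile module. It assumes the downloaded archive is an ordinary ZIP file protected with the traditional ZIP password scheme, which is the only scheme zipfile can decrypt; if the archive uses a different format or encryption method, extraction will fail with an error and a dedicated tool is needed. The function name and the file names in the usage note are hypothetical.

```python
import zipfile


def extract_with_password(archive_path, password, target_dir):
    """Extract every member of a password-protected ZIP into target_dir.

    Returns the list of member names found in the archive.
    """
    with zipfile.ZipFile(archive_path) as archive:
        # zipfile expects the password as bytes, not str.
        # Assumption: the archive uses traditional ZIP encryption.
        archive.extractall(path=target_dir, pwd=password.encode("utf-8"))
        return archive.namelist()
```

A call might look like extract_with_password("downloaded_corpus.zip", "password-from-the-team", "icnale_data"), where the archive name, the password, and the target folder are all placeholders for your own values.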
All the texts are encoded in UTF-8 with a BOM (byte order mark) [More Info]. When using a concordancer, you may need to set the character encoding before conducting analysis.
AntConc: Global Settings > Character Settings > Edit
When using WordSmith with its default settings, you will be asked to convert each file to Unicode. Please choose No.
Or you can uncheck the "Convert from UTF8" option beforehand.
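For self-made analytical programs, the BOM needs similar care. If a file is decoded as plain UTF-8, the BOM survives as an invisible character (U+FEFF) attached to the first token. The sketch below shows one way to avoid this in Python, using the standard "utf-8-sig" codec, which strips a leading BOM and reads files without a BOM unchanged. The function names and the whitespace tokenizer are illustrative assumptions, not part of the ICNALE distribution.

```python
from pathlib import Path


def read_corpus_text(path):
    """Read a UTF-8 text file, dropping a leading BOM if one is present."""
    # "utf-8-sig" removes the BOM (U+FEFF) at the start of the file;
    # files that have no BOM are decoded exactly as with "utf-8".
    return Path(path).read_text(encoding="utf-8-sig")


def count_tokens(text):
    """Count whitespace-separated tokens (a deliberately naive tokenizer)."""
    return len(text.split())
```

With these helpers, count_tokens(read_corpus_text(some_file)) gives a rough token count for one text; a real study would likely substitute a proper tokenizer.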
Codes for Individual Files