ICNALE: The International Corpus Network of Asian Learners of English
The ICNALE-Written: A collection of 1M+ words of controlled essays written by English learners in 10 countries and areas in Asia.
Project Leader: Dr. Shin'ichiro Ishikawa, Kobe University, Japan

About The ICNALE

Last updated 2014/05/31


The ICNALE-Spoken Baby 1 has been released. It includes the transcripts and mp3 sound files of 1,000 speeches by 200 learners in Asia (JPN, CHN, TWN, and PHL) as well as 50 English native speakers. The ICNALE team plans to compile a new learner speech corpus as a spoken counterpart of the ICNALE-Written by the end of 2015.

See the download page.

What is The ICNALE?

The International Corpus Network of Asian Learners of English (ICNALE) is one of the largest learner corpora focusing on Asian learners. The ICNALE holds 1.3 M words of controlled essays written by 2,600 college students in 10 Asian countries and areas as well as 200 English native speakers.

The ICNALE is designed as a reliable database for international contrastive interlanguage analysis, and it can also be used for studies of World Englishes in Asia. The ICNALE was compiled by Dr. Shin'ichiro Ishikawa of Kobe University, Japan, in a research project supported by the MEXT (Ministry of Education, Science, Sports and Culture of Japan) / JSPS (Japan Society for the Promotion of Science), Grant-in-Aid for Scientific Research (B), 2010-2013, No. 22320104.2.


Ishikawa, S. (2014). Design of the ICNALE-Spoken: A new database for multi-modal contrastive interlanguage analysis. In S. Ishikawa (Ed.), Learner corpus studies in Asia and the world, 2 (pp. 63-76). Kobe, Japan: Kobe University.
Ishikawa, S. (2013). The ICNALE and sophisticated contrastive interlanguage analysis of Asian learners of English. In S. Ishikawa (Ed.), Learner corpus studies in Asia and the world, 1 (pp. 91-118). Kobe, Japan: Kobe University.
Ishikawa, S. (2011). A new horizon in learner corpus studies: The aim of the ICNALE project. In G. Weir, S. Ishikawa, & K. Poonpon (Eds.), Corpora and language technologies in teaching, learning and research (pp. 3-11). Glasgow, UK: University of Strathclyde Publishing.
Ishikawa, S. (2012). Beshikku Kopasu Gengogaku. Tokyo: Hitsuji Shobo. [Basic Corpus Linguistics].

Focus on Asian Learners

The ICNALE includes essays written by EFL learners (China, Indonesia, Japan, Korea, Taiwan, Thailand) and ESL users (Hong Kong, Singapore, Pakistan, Philippines) in Asia, as well as English native speakers (US, UK, Australia, etc.), covering all of the Inner, Outer, and Expanding Circles in Asia (Kachru, 1992).


Countries covered in Asia (Original map from UNESCO)
Country Code  Country/Area           Writers/Essays  # of Tokens
Inner Circle
 ENS*         USA, UK, CAN, AUS, NZ  200/400            88,792
Outer Circle
 HKG          Hong Kong              100/200            46,111
 PAK          Pakistan               200/400            93,100
 PHL          Philippines            200/400            96,586
 SIN          Singapore              200/400            96,733
Expanding Circle
 CHN          China                  400/800           194,613
 IDN          Indonesia              200/400            92,316
 JPN          Japan                  400/800           176,537
 KOR          Korea                  300/600           130,626
 THA          Thailand               400/800           176,936
 TWN          Taiwan                 200/400            89,736
Total         ---                    2,800/5,600     1,282,086*
   *1,306,660 tokens based on the word count by WordSmith.
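
The totals in the table can be checked with simple arithmetic. The following is a minimal sketch in Python; the figures are transcribed from the table above, while the names MODULES and corpus_totals are illustrative and not part of the ICNALE distribution.

```python
# Writers, essays, and token counts per module, transcribed from the table above.
# The dictionary name and layout are illustrative only.
MODULES = {
    "ENS": (200, 400, 88_792),
    "HKG": (100, 200, 46_111),
    "PAK": (200, 400, 93_100),
    "PHL": (200, 400, 96_586),
    "SIN": (200, 400, 96_733),
    "CHN": (400, 800, 194_613),
    "IDN": (200, 400, 92_316),
    "JPN": (400, 800, 176_537),
    "KOR": (300, 600, 130_626),
    "THA": (400, 800, 176_936),
    "TWN": (200, 400, 89_736),
}


def corpus_totals(modules):
    """Sum writers, essays, and tokens over all modules."""
    writers = sum(w for w, _, _ in modules.values())
    essays = sum(e for _, e, _ in modules.values())
    tokens = sum(t for _, _, t in modules.values())
    return writers, essays, tokens


print(corpus_totals(MODULES))  # (2800, 5600, 1282086)
```

The sums agree with the Total row: 2,800 writers, 5,600 essays, and 1,282,086 tokens.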

Control on Writing Conditions

In order to conduct a reliable contrastive study, we need to control the various factors that may influence the language of the essays (Adel, 2008). In the ICNALE, writing conditions are controlled as strictly as possible.

◆Number of topics ... 2
 (A) "It is important for college students to have a part time job."
 (B) "Smoking should be completely banned at all the restaurants in the country."

◆Time ... 20 to 40 minutes

◆Length ... 200-300 words (±10%)

◆Dictionary use ... Prohibited

◆Spell checker use ... Mandatory
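
The length condition above can be illustrated with a short check. This is only a sketch under two assumptions that the page does not state: that words are counted by splitting on whitespace, and that "±10%" extends the 200-300 range to 180-330 words. The function name within_length_limit is hypothetical.

```python
def within_length_limit(essay, low=200, high=300, tolerance_pct=10):
    """Return True if the essay length falls within low..high words,
    widened by tolerance_pct percent on both ends.

    Assumption: words are counted by whitespace splitting, which may
    differ from the count produced by the project's own tools.
    """
    n_words = len(essay.split())
    min_words = low * (100 - tolerance_pct) // 100   # 180 with the defaults
    max_words = high * (100 + tolerance_pct) // 100  # 330 with the defaults
    return min_words <= n_words <= max_words


print(within_length_limit("word " * 250))  # True
print(within_length_limit("word " * 150))  # False
```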

Control on Writers' L2 Proficiency

Another factor influencing essay data is the writer's L2 proficiency. In the project, we classified writers into four proficiency levels based on their scores in standard L2 proficiency tests such as TOEIC or TOEFL, or in the standard vocabulary size test (VST) (Nation & Beglar, 2007): A2 (Waystage), B1_1 (Threshold: Lower), B1_2 (Threshold: Upper), and B2+ (Vantage or higher). These are identical to the levels proposed in the CEFR (Common European Framework of Reference). The table below shows the score thresholds for each level and the percentage of writers at each proficiency level in individual countries and areas.

Levels  A2          B1_1       B1_2       B2+
TOEIC   -545        550+       670+       785+
TOEFL   -56 (-486)  57 (487)+  72 (527)+  87 (567)+
VST     -24         25+        36+        47+
HKG     1.0%        30.0%      52.0%      17.0%
PAK     9.0%        45.5%      44.0%      1.5%
PHL     1.0%        5.5%       88.0%      5.5%
SIN     0.0%        0.0%       67.0%      33.0%
CHN     12.5%       58.0%      26.3%      3.3%
IDN     16.0%       41.0%      41.5%      1.5%
JPN     38.5%       44.8%      12.3%      4.5%
KOR     25.0%       20.3%      29.3%      25.3%
THA     29.8%       44.8%      25.0%      0.5%
TWN     14.5%       43.5%      30.5%      11.5%

2014/06/19 Corrected typos in the TOEFL threshold scores. See the "Data Collection & Processing" page for details.
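
The threshold scores in the table amount to a simple lookup from a test score to a proficiency level. The sketch below shows one way to express that mapping in Python. The cut-off values come from the table, but the names LEVELS, CUTOFFS, and proficiency_level are hypothetical, and reading the two TOEFL figures as two different TOEFL score scales (the smaller values first, the parenthesized values second) is an assumption; the page itself does not label them.

```python
from bisect import bisect_right

LEVELS = ["A2", "B1_1", "B1_2", "B2+"]

# Lowest score for B1_1, B1_2, and B2+ in each test, from the table above.
# The two TOEFL keys are assumed labels for the two figures in the TOEFL row.
CUTOFFS = {
    "TOEIC": [550, 670, 785],
    "TOEFL_SCALE_1": [57, 72, 87],     # assumed: first figure in each cell
    "TOEFL_SCALE_2": [487, 527, 567],  # assumed: parenthesized figure
    "VST": [25, 36, 47],
}


def proficiency_level(test, score):
    """Map a test score to one of the four levels.

    bisect_right counts how many cut-offs the score has reached,
    which is exactly the index of the level in LEVELS.
    """
    return LEVELS[bisect_right(CUTOFFS[test], score)]


print(proficiency_level("TOEIC", 545))  # A2
print(proficiency_level("TOEIC", 550))  # B1_1
print(proficiency_level("VST", 47))     # B2+
```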

Comparable NS Data

The ICNALE also includes 400 essays written by English native speakers. As the topics and writing conditions are identical, you can conduct a reliable comparison between native speakers (NS) and non-native speakers (NNS).

The ENS Module includes 400 essays written by 200 writers, who are subdivided into ENS1 (ENS_001-100) and ENS2 (ENS_101-200). ENS1 writers are college students, while ENS2 writers are employed (the average age is 34.3). Countries included in the ENS Module (ENS1 + ENS2) are the USA (57.0%), the United Kingdom (14.0%), Canada (14.0%), Australia (8.5%), and New Zealand (6.5%). The ICNALE thus covers both British English and American English.
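
The ENS1/ENS2 split by writer number can be expressed as a small helper. This is a sketch that assumes writer IDs take the bare form "ENS_001" to "ENS_200" as written above; actual file names in the downloadable data may carry additional fields, and the function name ens_subgroup is hypothetical.

```python
import re


def ens_subgroup(writer_id):
    """Return "ENS1" for writers 001-100 and "ENS2" for writers 101-200.

    Assumption: writer_id looks like "ENS_001"; other naming schemes
    would need a different pattern.
    """
    match = re.fullmatch(r"ENS_(\d{3})", writer_id)
    if match is None:
        raise ValueError(f"unexpected writer ID: {writer_id!r}")
    number = int(match.group(1))
    if 1 <= number <= 100:
        return "ENS1"  # college students
    if 101 <= number <= 200:
        return "ENS2"  # employed writers
    raise ValueError(f"writer number out of range: {number}")


print(ens_subgroup("ENS_042"))  # ENS1
print(ens_subgroup("ENS_150"))  # ENS2
```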

How to Access The ICNALE

There are two ways to use The ICNALE. One is The ICNALE Online, where you can conduct KWIC Search, Collocation Search, Wordlist Search, and Keywords Search. The other is The ICNALE for Download: you can download the whole data set and analyze it with your favorite concordancer, such as AntConc or WordSmith.

The ICNALE Online
The ICNALE for Download

The ICNALE Development Team

Project Leader --- Shin'ichiro Ishikawa (Kobe University)

Academic Advisers --- Masao Aikawa (Kyoto University of Foreign Studies), Ichiro Akano (Kyoto University of Foreign Studies), Kazuaki Goto (Setsunan University), Tetsuya Enokizono (Chukyo University), Hideo Masuda (Kyoto Institute of Technology), Masamichi Mochizuki (Reitaku University), Yasumi Murata (Meijyo University), Hiroshi Shimatani (Kumamoto University), Masahiro Hori (Kumamoto Gakuen University)

China --- Katsuki Mayumi (Dalian University of Technology), Fang Li (Wuhan University), Lu Yuanwen (School of Foreign Languages, Shanghai Jiaotong University)

Indonesia --- Leonardi Lucky Kurniawan (Polytechnic of Ubaya, Surabaya)

Korea --- Sook Kyung Jung (Daejeon University)/ Oryang Kwon (Seoul National University)

Japan --- Shin'ichiro Ishikawa (Kobe University)/ Yuka Ishikawa (Nagoya Institute of Technology)

Hong Kong --- John Milton (Hong Kong University of Science & Technology)

Pakistan --- Asim Mahmood (Government College University (GCU) Faisalabad)

The Philippines --- Karen L. Gabinete (De La Salle University-Manila)

Singapore --- Vincent Ooi (National University of Singapore)

Taiwan (Chinese Taipei) --- Siaw-Fong Chung (National Chengchi University)

Thailand --- Sonthida Keyuravong/ Punjaporn Pojanapunya (King Mongkut's University of Technology, Thonburi)