Text Simplification for Reading Assistance: A Project Note

Kentaro Inui, Atsushi Fujita, Tetsuro Takahashi, Ryu Iida
Nara Advanced Institute of Science and Technology
Takayama, Ikoma, Nara, 630-0192, Japan
{inui,atsush-f,tetsu-ta,ryu-i}@is.aist-nara.ac.jp

Tomoya Iwakura
Fujitsu Laboratories Ltd.
Kamikodanaka, Nakahara, Kawasaki, Kanagawa, 211-8588, Japan
iwakura.tomoya@jp.fujitsu.com

Abstract

This paper describes our ongoing research project on text simplification for congenitally deaf people. The text simplification we aim at is the task of offering a deaf reader a syntactic and lexical paraphrase of a given text to help her/him understand what it means. In this paper, we discuss the issues that must be addressed to realize text simplification and report on the present results for three aspects of this task: readability assessment, paraphrase representation and post-transfer error detection.

1 Introduction

This paper reports on our ongoing research into text simplification for reading assistance. The potential users targeted in this research are congenitally deaf people (more specifically, students at (junior-)high schools for the deaf), who tend to have difficulties in reading and writing text. We aim to develop text simplification technology with which a reading assistance system lexically and structurally paraphrases a given text into a simpler and plainer one that is thus more comprehensible.

The idea of using paraphrases for reading assistance is not necessarily novel. For example, Carroll et al. (1998) and Canning and Tait (1999) report on a project addressing syntactic transforms aimed at making newspaper text accessible to aphasics. Following this line of research, in this project we address the four unexplored issues described below, in addition to the user- and task-oriented evaluation of the overall system.

Before going into detail, we first clarify these four issues in the next section.
We then report on the present results on three of the four (readability assessment, paraphrase representation and post-transfer error detection) in the subsequent sections.

2 Research issues and our approach

2.1 Readability assessment

The process of text simplification for reading assistance can be decomposed into the following three subprocesses:

a. Problem identification: identify which portions of a given text will be difficult for a given user to read,
b. Paraphrase generation: generate possible candidate paraphrases from the identified portions, and
c. Evaluation: re-assess the resultant texts to choose the one in which the problems have been resolved.

Given this decomposition, it is clear that one of the key issues in reading assistance is the problem of assessing the readability or comprehensibility¹ of text, because it is involved in subprocesses (a) and (c).

Readability assessment is undoubtedly a hard problem (Williams et al., 2003). In this project, however, we argue that, if one targets only a particular population segment and if an adequate collection of data is available, then corpus-based empirical approaches may well be feasible. We have already shown that one can collect such readability assessment data by conducting survey questionnaires targeting teachers at schools for the deaf.

¹ In this paper, we use the terms readability and comprehensibility interchangeably, while strictly distinguishing them from legibility of each fragment (typically, a sentence or paragraph) of a given text.

2.2 Paraphrase acquisition

One encouraging finding from the aforementioned surveys is that there is a broad range of paraphrases that can improve the readability of text. A reading assistance system should therefore be able to generate a sufficient variety of paraphrases of a given input. To create such a system, one needs to feed it with a large collection of paraphrase patterns.
Fortunately, the acquisition of paraphrase patterns has been actively studied in recent years:

• Manual collection of paraphrases in the context of language generation, e.g. (Robin and McKeown, 1996),
• Derivation of paraphrases through existing lexical resources, e.g. (Kurohashi and Sakai, 1999),
• Corpus-based statistical methods inspired by the work on information extraction, e.g. (Jacquemin, 1999; Lin and Pantel, 2001), and
• Alignment-based acquisition of paraphrases from comparable corpora, e.g. (Barzilay and McKeown, 2001; Shinyama et al., 2002; Barzilay and Lee, 2003).

One remaining issue is how effectively these methods contribute to the generation of paraphrases in our application-oriented context.

2.3 Paraphrase representation

One of the findings of previous studies on paraphrase acquisition is that the automatic acquisition of paraphrase candidates is quite feasible for various types of source data, but the acquired collections tend to be rather noisy and need manual cleaning, as reported in, for example, (Lin and Pantel, 2001). Given that, it is important to devise an effective way of facilitating manual correction and a standardized scheme for representing and storing paraphrase patterns as shared resources.

Our approach is (a) to first define a fully expressive formalism for representing paraphrases at the level of tree-to-tree transformation and (b) to devise an additional layer of representation on top of it that is designed to facilitate the handcoding of transformation rules.

2.4 Post-transfer text revision

In paraphrasing, the morpho-syntactic information of a source sentence should be accessible throughout the transfer process, since a morpho-syntactic transformation in itself can often be a motivation or goal of paraphrasing.
Therefore, an approach such as semantic transfer, where morpho-syntactic information is highly abstracted away as in (Dorna et al., 1998; Richardson et al., 2001), does not suit this task. Granting that the morpho-syntactic stratum is an optimal level of abstraction for representing paraphrasing/transfer patterns, one must nevertheless recall that semantic-transfer approaches such as those cited above were motivated mainly by the need to reduce the complexity of transfer knowledge, which could become unmanageable in morpho-syntactic transfer.

Our approach to this problem is to (a) leave the description of each transfer pattern underspecified and (b) implement the knowledge about linguistic constraints that are independent of a particular transfer pattern separately from the transfer knowledge. There is a wide range of such transfer-independent linguistic constraints. Constraints on morpheme connectivity, verb conjugation, word collocation, and tense and aspect forms in relative clauses are typical examples.

These four issues can be considered different aspects of the overall question of how one can make the development and maintenance of a gigantic resource for paraphrasing tractable. (1) The introduction of readability assessment would free us from concerns about the purposiveness of each paraphrasing rule in paraphrase acquisition. (2) Paraphrase acquisition is obviously indispensable for scaling up the resource. (3) A good formalism for representing paraphrasing rules would facilitate their manual refinement and maintenance. (4) Post-transfer error detection and revision would make the system tolerant of flaws in paraphrasing rules. While many researchers have addressed the issue of paraphrase acquisition, reporting promising results as cited above, the other three issues have been left relatively unexplored in spite of their significance in the above sense.
Motivated by this context, in the rest of this paper, we address these remaining three.

3 Readability assessment

To the best of our knowledge, there have been no reports on research to build a computational model of the language proficiency of deaf people, except for the remarkable reports by Michaud and McCoy (2001). As a subpart of their research aimed at developing the ICICLE system (McCoy and Masterman, 1997), a language-tutoring application for deaf learners of written English, Michaud and McCoy developed SLALOM, an architecture for modeling the writing proficiency of a user. SLALOM is designed to capture the stereotypic linear order of acquisition within certain categories of morphological and/or syntactic features of language. Unfortunately, the modeling method used in SLALOM cannot be directly applied to our domain for three reasons.

• Unlike writing tutoring, in reading assistance, target sentences are in principle unlimited. We therefore need to take a wider range of morpho-syntactic features into account.
• SLALOM is not designed to capture the difficulty of any combination of morpho-syntactic features, which it is essential to take into account in reading assistance.
• Given the need to consider feature combinations, the simple linear order model assumed in SLALOM is unsuitable.

3.1 Our approach: We ask teachers

To overcome these deficiencies, we took a different approach: we designed a survey questionnaire targeting teachers at schools for the deaf, and have been collecting readability assessment data. In this questionnaire, we ask the teachers to compare the readability of a given sentence with that of its paraphrases. The use of paraphrases is of critical importance in our questionnaire since it makes manual readability assessment significantly easier and more reliable.

3.1.1 Targets

We targeted teachers of Japanese or English literacy at schools for the deaf for the following reasons.
Ideally, this sort of survey would be carried out by targeting the population segment in question, i.e., deaf students in our study. In fact, pedagogists and psycholinguists have made tremendous efforts to examine the language proficiency of deaf students by giving them proficiency tests. Such efforts are very important, but they have had difficulty in capturing enough of the picture to develop a comprehensive and implementable reading proficiency model of the population, due to the expense of extensive language proficiency testing.

In contrast, our approach is an attempt to model the knowledge of experts in this field (i.e., teaching deaf students). The targeted teachers not only have rich experiential knowledge about the language proficiency of their students but are also highly skilled in paraphrasing to help their students’ comprehension. Since such knowledge gleaned from individual experiences already has some generality, extracting it through a survey should be less costly and thus more comprehensive than investigation based on language proficiency testing.

3.1.2 Questionnaire

In the questionnaire, each question consists of several paraphrases, as shown in Figure 1 (a), where (A) is a source sentence, and (B) and (C) are paraphrases of (A). Each respondent was asked to assess the relative readability of the paraphrases given for each source sentence, as shown in Figure 1 (b). The respondent judged sentence (A) to be the most difficult and judged (B) and (C) to be comparable. A judgment that sentence $s_i$ is easier than sentence $s_j$ means that $s_i$ is judged likely to be understood by a larger subset of students than $s_j$. We asked the respondents to annotate the paraphrases with format-free comments, giving the reasons for their judgments, alternative paraphrases, etc., as shown in Figure 1 (b).

To make our questionnaire efficient for model acquisition, we had to carefully control the variation in paraphrases.
To do that, we first selected around 50 morpho-syntactic features that are considered influential in sentence readability for deaf people. For each of those features, we collected several simple example sentences from various sources (literacy textbooks, grammar references, etc.). We then manually produced several paraphrases from each of the collected sentences so as to remove the feature that characterized the source sentence from each paraphrase. For example, in Figure 1, the feature characterizing sentence (A) is a non-restrictive relative clause (i.e., sentence (A) was selected as an example of this feature). Neither (B) nor (C) has this feature. We also controlled the lexical variety to minimize the effect of lexical factors on readability; specifically, we restricted the vocabulary to a top-2000 basic word set (NIJL, 1991).

3.1.3 Administration

We administered a preliminary survey targeting three teachers. Through the survey, we observed that (a) the teachers largely agreed in their assessments of relative readability, (b) their format-free comments indicated that the observed differences in readability were largely explainable in terms of the morpho-syntactic features we had prepared, and (c) a larger-scale survey was needed to obtain a statistically reliable model. Based on these observations, we conducted a more comprehensive survey, in which we prepared 770 questions and sent questionnaires with a random set of 240 of them to teachers of Japanese or English literacy at 50 schools for the deaf. We asked them to evaluate as many as possible anonymously. We obtained 4080 responses in total (8.0 responses per question).

[Figure 1: Sample question and response]

3.2 Readability ranking model

The task of ranking a set of paraphrases can be decomposed into comparisons between two elements combinatorially selected from the set.
We consider the problem of judging which of a given pair of paraphrase sentences is more readable/comprehensible for deaf students. More specifically, given a paraphrase pair $(s_i, s_j)$, our problem is to classify it as either left ($s_i$ is easier), right ($s_j$ is easier), or comparable ($s_i$ and $s_j$ are comparable).

Once the problem is formulated this way, we can use various existing techniques for classifier learning. So far, we have examined a method using the support vector machine (SVM) classification technique.

A training/testing example is a paraphrase pair $(s_i, s_j)$ coupled with its quantified class label $D(s_i, s_j) \in [-1, 1]$. Each sentence $s_i$ is characterized by a binary feature vector $F_{s_i}$, and each pair $(s_i, s_j)$ is characterized by a triple of feature vectors $\langle F^{C}_{s_i s_j}, F^{L}_{s_i s_j}, F^{R}_{s_i s_j} \rangle$, where

• $F^{C}_{s_i s_j} = F_{s_i} \wedge F_{s_j}$ (features shared by $s_i$ and $s_j$),
• $F^{L}_{s_i s_j} = F_{s_i} \wedge \overline{F_{s_j}}$ (features belonging only to $s_i$),
• $F^{R}_{s_i s_j} = \overline{F_{s_i}} \wedge F_{s_j}$ (features belonging only to $s_j$).

$D(s_i, s_j)$ represents the difference in readability between $s_i$ and $s_j$; it is computed in the following way.

1. Let $T_{s_i s_j}$ be the set of respondents who assessed $(s_i, s_j)$.
2. Given the degree of readability respondent $t$ assigned to $s_i$ ($s_j$), map it to a real value $dor(t, s) \in [0, 1]$ so that the lowest degree maps to 0 and the highest degree maps to 1. For example, the degree of readability assigned to (A) in Figure 1 (b) maps to around 0.1, whereas that assigned to (B) maps to around 0.9.
3. $D(s_i, s_j) = \frac{1}{|T_{s_i s_j}|} \sum_{t \in T_{s_i s_j}} \left( dor(t, s_i) - dor(t, s_j) \right).$

The output score $Sc_M(s_i, s_j) \in [-1, 1]$ for input $(s_i, s_j)$ was given by the normalized distance between $(s_i, s_j)$ and the hyperplane.

3.3 Evaluation and discussion

To evaluate the two modeling methods, we conducted a ten-fold cross validation on the set of 4055 paraphrase pairs derived from the 770 questions used in the survey. To create a feature vector space, we used 355 morpho-syntactic features. Feature annotation was done semi-automatically with the help of a morphological analyzer and dependency parser.

The task was to classify a given paraphrase pair as either left, right, or comparable. Model $M$'s output class for $(s_i, s_j)$ was given by

$$Cls_M(s_i, s_j) = \begin{cases} \text{left} & (Sc_M(s_i, s_j) \le -\theta_m) \\ \text{right} & (Sc_M(s_i, s_j) \ge \theta_m) \\ \text{comparable} & (\text{otherwise}), \end{cases}$$

where $\theta_m \in [-1, 1]$ is a variable threshold used to balance precision with recall.

We used the 473 paraphrase pairs that satisfied the following conditions:

• $|D(s_i, s_j)|$ was not less than threshold $\theta_a$ ($\theta_a = 0.5$). The answer for $(s_i, s_j)$ is given by
  $$Cls_{Ans}(s_i, s_j) = \begin{cases} \text{left} & (D(s_i, s_j) \le -\theta_a) \\ \text{right} & (D(s_i, s_j) \ge \theta_a). \end{cases}$$
• $(s_i, s_j)$ must have been assessed by more than one respondent, i.e., $|T_{s_i s_j}| > 1$.
• The agreement ratio $Agr(s_i, s_j)$ must be sufficiently high, i.e., $Agr(s_i, s_j) \ge 0.9$, where $Agr(s_i, s_j) = (for(s_i, s_j) - agst(s_i, s_j)) / |T_{s_i s_j}|$, and $for(s_i, s_j)$ and $agst(s_i, s_j)$ are the numbers of respondents who agreed and disagreed with $Cls_{Ans}(s_i, s_j)$, respectively.
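The answer-labeling conditions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the project: the function names are ours, each pair is assumed to be given as two parallel lists of $dor$ values (one entry per respondent), and, since the text does not spell out how a single respondent is counted as agreeing or disagreeing, the sketch assumes a respondent agrees when the sign of her/his own difference $dor(t, s_i) - dor(t, s_j)$ matches the answer, with ties counted as neither.

```python
from typing import List, Optional

THETA_A = 0.5        # answer threshold theta_a stated in the text
MIN_AGREEMENT = 0.9  # lower bound on the agreement ratio Agr stated in the text


def readability_difference(dor_i: List[float], dor_j: List[float]) -> float:
    """D(s_i, s_j): mean over respondents t of dor(t, s_i) - dor(t, s_j)."""
    if not dor_i or len(dor_i) != len(dor_j):
        raise ValueError("need one dor value per respondent for each sentence")
    return sum(a - b for a, b in zip(dor_i, dor_j)) / len(dor_i)


def classify(score: float, theta: float) -> str:
    """Three-way decision rule of Cls_M (theta plays the role of theta_m)."""
    if score <= -theta:
        return "left"
    if score >= theta:
        return "right"
    return "comparable"


def gold_answer(dor_i: List[float], dor_j: List[float]) -> Optional[str]:
    """Return Cls_Ans for a pair, or None if the pair fails any condition."""
    n = len(dor_i)
    if n <= 1:                       # condition: |T| > 1
        return None
    d = readability_difference(dor_i, dor_j)
    if abs(d) < THETA_A:             # condition: |D| >= theta_a
        return None
    answer = classify(d, THETA_A)    # can only be "left" or "right" here
    # ASSUMPTION (not from the paper): a respondent agrees with the answer
    # if her/his own difference has the same sign; ties count as neither.
    sign = -1 if answer == "left" else 1
    diffs = [a - b for a, b in zip(dor_i, dor_j)]
    n_for = sum(1 for x in diffs if x * sign > 0)
    n_agst = sum(1 for x in diffs if x * sign < 0)
    if (n_for - n_agst) / n < MIN_AGREEMENT:   # condition: Agr >= 0.9
        return None
    return answer
```

For instance, `gold_answer([0.1, 0.2, 0.1], [0.9, 0.9, 0.8])` passes all three conditions and yields `left`, whereas a pair assessed by a single respondent, or one on which a respondent contradicts the majority strongly enough to push the agreement ratio below 0.9, is discarded.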
We judged output class $Cls_M(s_i, s_j)$ correct if and only if $Cls_M(s_i, s_j) = Cls_{Ans}(s_i, s_j)$. The overall performance was evaluated based on recall $Rc$ and precision $Pr$:

$$Rc = \frac{|\{(s_i, s_j) \mid Cls_M(s_i, s_j) \text{ is correct}\}|}{|\{(s_i, s_j) \mid Cls_{Ans}(s_i, s_j) \in \{\text{left}, \text{right}\}\}|}$$

$$Pr = \frac{|\{(s_i, s_j) \mid Cls_M(s_i, s_j) \text{ is correct}\}|}{|\{(s_i, s_j) \mid Cls_M(s_i, s_j) \in \{\text{left}, \text{right}\}\}|}$$

The model achieved 95% precision with 89% recall. This result confirmed that the data we collected through the questionnaires were reasonably free of noise and thus generalizable. Furthermore, both models exhibited a clear trade-off between recall and precision, indicating that their output scores can be used as a confidence measure.

4 Paraphrase representation

We represent paraphrases as transfer patterns between dependency trees. In this section, we propose a three-layered formalism for representing transfer patterns.

4.1 Types of paraphrases of concern

There are various levels of paraphrases, as the following examples demonstrate:

(1) a. She burst into tears, and he tried to comfort her.
    b. She cried, and he tried to console her.
(2) a. It was a Honda that John sold to Tom.
    b. John sold a Honda to Tom.
    c. Tom bought a Honda from John.
(3) a. They got married three years ago.
    b. They got married in 2000.

Lexical vs. structural paraphrases: Example (1) includes paraphrases of the single word “comfort” and the canned phrase “burst into tears”. The sentences in (2), on the other hand, exhibit structural and thus more general patterns of paraphrasing. Both types of paraphrases, lexical and structural, are considered useful for many applications including reading assistance and thus should be within the scope of our discussion.

Atomic vs. compositional paraphrases: The process of paraphrasing (2a) into (2c) is compositional because it can be decomposed into two subprocesses, (2a) to (2b) and (2b) to (2c). In developing a resource for paraphrasing, we have only to cover non-compositional (i.e., atomic) paraphrases. Compositional paraphrases can be handled if an additional computational mechanism for combining atomic paraphrases is devised.

Meaning-preserving vs. reference-preserving paraphrases: It is also useful to distinguish reference-preserving paraphrases from meaning-preserving ones. The example in (3) is of the reference-preserving type. This type of paraphrasing requires the computation of reference to objects outside the discourse and thus is excluded from our scope for the present purpose.

4.2 Dependency trees (MDSs)

Previous work on transfer-based machine translation (MT) suggests that dependency-based representation has the advantage of facilitating syntactic transforming operations (Meyers et al., 1996; Lavoie et al., 2000). Following this, we adopt dependency trees as the internal representations of target texts. We suppose that a dependency tree consists of a set of nodes, each of which corresponds to a lexeme or compound, and a set of edges, each of which represents the dependency relation between its ends. We call such a dependency tree a morpheme-based dependency structure (MDS). Each node in an MDS is supposed to be annotated with an open set of typed features that indicate morpho-syntactic and semantic information. We also assume a type hierarchy in dependency relations that consists of an open set of dependency classes including dependency, compound, parallel, appositive and insertion.
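To make the structure concrete, an MDS as described above could be held in memory roughly as follows. This is a minimal Python sketch for illustration only, not the data structure used in our engine: the class name `Node`, the helper `find`, and the feature names `pos` and `lex` are assumptions chosen here, relation classes are kept as plain strings because the set is open, and function words are omitted from the example tree for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple

# Dependency classes named in the text. The set is open, so relations are
# stored as plain strings and this tuple is only a list of known examples.
RELATIONS = ("dependency", "compound", "parallel", "appositive", "insertion")


@dataclass
class Node:
    """One MDS node: an open set of features plus labeled outgoing edges."""
    features: Dict[str, str]
    dependents: List[Tuple[str, "Node"]] = field(default_factory=list)

    def attach(self, relation: str, child: "Node") -> "Node":
        """Add an edge of the given dependency class and return the child."""
        self.dependents.append((relation, child))
        return child

    def walk(self) -> Iterator["Node"]:
        """Yield this node and all of its descendants in pre-order."""
        yield self
        for _, child in self.dependents:
            yield from child.walk()


def find(root: Node, **constraints: str) -> List[Node]:
    """Return the nodes whose features match every given constraint."""
    return [
        node for node in root.walk()
        if all(node.features.get(k) == v for k, v in constraints.items())
    ]


# Example (2b), "John sold a Honda to Tom", with function words omitted.
root = Node({"pos": "verb", "lex": "sold"})
root.attach("dependency", Node({"pos": "noun", "lex": "John"}))
root.attach("dependency", Node({"pos": "noun", "lex": "Honda"}))
root.attach("dependency", Node({"pos": "noun", "lex": "Tom"}))

nouns = [node.features["lex"] for node in find(root, pos="noun")]
```

Keeping features in an open dictionary rather than fixed fields mirrors the requirement that nodes carry an open set of typed features; a typed feature-structure system would enforce the types that this sketch leaves unchecked.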
4.3 Three-layered representation

Previous work on transfer-based MT systems (Lavoie et al., 2000; Dorna et al., 1998) and alignment-based transfer knowledge acquisition (Meyers et al., 1996; Richardson et al., 2001) has shown that transfer knowledge can be best represented by declarative structure mapping (transforming) rules, each of which typically consists of a pair of source and target partial structures as in the middle of Figure 2.

[Figure 2: Three-layered rule representation. The figure shows a simplified MDS transfer rule, an MDS transfer rule (a pair of source and target dependency trees over nodes X1 to X12), and MDS processing operators, with the labels rule editing, translation and compilation. The simplified rule reads: N shika V-nai -> V no wa N dake da. (someone does not V to nothing but N -> it is only to N that someone does V). The rule text recovered from the figure begins:

    sp_rule(108, negation, RefNode) :-
        match(RefNode, X4=[pos:postp,lex: shika]),
        depend(X3=[pos:verb], empty, X4),
        depend(X1=[pos:aux_verb,lex: nai], X2=[pos:aux_verb*], X3),
        depend(X4, empty, X5=[pos:noun]),
        replace(X1, X6=[pos:aux_verb,lex: da]),
        substitute(X5, X12=[pos:noun]),
        move_dtrs(X5, X12),
        substitute(X3, X10=[pos:verb]),
        :
]

Adopting such a tree-to-tree style of representation, however, one has to address the trade-off between expressiveness and comprehensibility. One may want a formalism of structural transformation patterns that is powerful enough to represent a sufficiently broad range of paraphrase patterns. However, highly expressive formalisms would make it difficult to create and maintain rules manually.

To mediate this trade-off, we devised a new layer of representation to add on top of the layer of tree-to-tree pattern representation, as illustrated in Figure 2. At this new layer, we use an extended natural language to specify transformation patterns.
The language is designed to facilitate the task of handcoding transformation rules. For example, to define the tree-to-tree transformation pattern given in the middle of Figure 2, a rule editor needs only to specify its simplified form:

(4) N shika V-nai → V no wa N dake da.
    (Someone does V to nothing but N → It is only to N that someone does V)

A rule of this form is then automatically translated into a fully specified tree-to-tree transformation rule. We call a rule of the latter form an MDS rewriting rule (SR rule), and a rule of the former form a simplified SR rule (SSR rule).

The idea is that most of the specifications of an SR rule can usually be abbreviated if a means to automatically complement it is provided. We use a parser and macros to do so; namely, the rule translator complements an SSR rule by macro expansion and parsing to produce the corresponding SR rule specifications.

The advantages of introducing the SSR rule layer are the following:

• The SSR rule formalism allows a rule writer to edit rules with an ordinary text editor, which makes the task of rule editing much more efficient than providing her/him with a complex GUI-based tool for editing SR rules directly.
• The use of the extended natural language also improves the readability of rules for rule writers, which is particularly important in group work.
• To parse SSR rules, one can use the same parser as that used to parse input texts. This also improves the efficiency of rule development because it significantly reduces the burden of maintaining consistency between the POS-tag set used for parsing input and that used for rule specifications.

The SSR rule layer shares underlying motivations with the formalism reported by Hermjakob et al. (2002).
Our formalism is, however, considerably extended: it exploits the full expressiveness of the SR rule representation and can be annotated with various types of rule applicability conditions, including constraints on arbitrary features of nodes, structural constraints, logical specifications such as disjunction and negation, closures of dependency relations, optional constituents, etc.

The two layers for paraphrase representation are fully implemented on our paraphrasing engine KURA (Takahashi et al., 2001), coupled with another layer for processing MDSs (the bottom layer illustrated in Figure 2). The whole system of KURA and part of the transfer rules implemented on it (see Section 5 below) are available at http://cl.aist-nara.ac.jp/lab/kura/doc/.

5 Post-transfer error detection

What kinds of transfer errors tend to occur in lexical and structural paraphrasing? To find out, we conducted a preliminary investigation. This section reports a summary of the results. See (Fujita and Inui, 2002) for further details.

We implemented over 28,000 transfer rules for Japanese paraphrases on the KURA paraphrasing engine, based on the rules previously reported in (Sato, 1999; Kondo et al., 1999; Kondo et al., 2001; Iida et al., 2001) and existing lexical resources such as thesauri and case frame dictionaries. The implemented rules ranged from lexical paraphrases, such as those that replace a word with its synonym, to syntactic/structural paraphrases, such as those that remove a cleft construction from a sentence, divide a sentence, etc. We then fed KURA with a set of 1,220 sentences randomly sampled from newspaper articles and obtained 630 transferred output sentences.

The following are the tendencies we observed:

• The transfer errors observed in the experiment exhibited a wide variety, from morphological errors to semantic and discourse-related ones.
• Most types of errors tended to occur regardless of the types of transfer.
  This suggests that if one creates an error detection module specialized for a particular error type, it will work across different types of transfer.
• The most frequent error type involved inappropriate conjugation forms of verbs. It is, however, a matter of morphological generation and can be easily resolved.
• Errors in regard to verb valency and selectional restriction also tended to be frequent and fatal, and thus should be given priority as a research topic.
• The next most frequent error type was related to the difference in meaning between near synonyms. However, this type of error could often be detected by a model that could detect errors of verb valency and selectional restriction.

Based on these observations, we concluded that the detection of incorrect verb valences and verb-complement cooccurrence was one of the most serious problems and should be given priority as a research topic. We are now conducting experiments on empirical methods for detecting this type of error (Fujita et al., 2003).

6 Conclusion

This paper reported on the present results of our ongoing research on text simplification for reading assistance targeting congenitally deaf people. We raised four interrelated issues that we need to address to realize this application and presented our previous activities focusing on three of them: readability assessment, paraphrase representation and post-transfer error detection.

Regarding readability assessment, we proposed a novel approach in which we conducted questionnaire surveys to collect readability assessment data and took a corpus-based empirical method to obtain a readability ranking model. The results of the surveys show the potential impact of text simplification on reading assistance. We conducted experiments on the task of comparing the readability of a given paraphrase pair and obtained promising results by SVM-based classifier induction (95% precision with 89% recall).
Our approach should be equally applicable to other population segments such as aphasic readers and second-language learners. Our next steps include investigating the drawbacks of the present bag-of-features modeling approach. We also need to consider a method to introduce the notion of user classes (e.g. beginner, intermediate and advanced). Textual aspects of readability will also need to be considered, as discussed in (Inui and Nogami, 2001; Siddharthan, 2003).

Regarding paraphrase representation, we presented our revision-based lexico-structural paraphrasing engine. It provides a fully expressive scheme for representing paraphrases, while keeping paraphrasing rules easy to handcraft by providing an extended natural language as a means of pattern editing. We have handcrafted over a thousand transfer rules that implement a broad range of lexical and structural paraphrasing.

The problem of error detection is also critical. When we find an effective solution to it, we will be ready to integrate the technologies into an application system of text simplification and conduct user- and task-oriented evaluations.

Acknowledgments

The research presented in this paper was partly funded by PREST, Japan Science and Technology Corporation. We thank all the teachers at the schools for the deaf who cooperated in our questionnaire survey and Toshihiro Agatsuma (Joetsu University of Education) for his generous and valuable cooperation in the survey. We also thank Yuji Matsumoto and his colleagues (Nara Advanced Institute of Science and Technology) for allowing us to use their NLP tools ChaSen and CaboCha, Taku Kudo (Nara Advanced Institute of Science and Technology) for allowing us to use his SVM tool, and Takaki Makino and his colleagues (Tokyo University) for allowing us to use LiLFeS, with which we implemented KURA. We also thank the anonymous reviewers for their suggestive and encouraging comments.

References

Barzilay, R. and McKeown, K. 2001. Extracting paraphrases from a parallel corpus. In Proc. of the 39th Annual Meeting and the 10th Conference of the European Chapter of Association for Computational Linguistics (EACL), pages 50–57.

Barzilay, R. and Lee, L. 2003. Learning to paraphrase: an unsupervised approach using multiple-sequence alignment. In Proc. of HLT-NAACL.

Canning, Y. and Tait, J. 1999. Syntactic simplification of newspaper text for aphasic readers. In Proc. of the 22nd Annual International ACM SIGIR Conference (SIGIR).

Carroll, J., Minnen, G., Canning, Y., Devlin, S. and Tait, J. 1998. Practical simplification of English newspaper text to assist aphasic readers. In Proc. of AAAI-98 Workshop on Integrating Artificial Intelligence and Assistive Technology.

Dorna, M., Frank, A., Genabith, J. and Emele, M. 1998. Syntactic and semantic transfer with F-structures. In Proc. of COLING-ACL, pages 341–347.

Fujita, A. and Inui, K. 2002. Decomposing linguistic knowledge for lexical paraphrasing. In Information Processing Society of Japan SIG Technical Reports, NL-149, pages 31–38. (in Japanese)

Fujita, A., Inui, K. and Matsumoto, Y. 2003. Automatic detection of verb valency errors in paraphrasing. In Information Processing Society of Japan SIG Technical Reports, NL-156. (in Japanese)

Hermjakob, U., Echihabi, A. and Marcu, D. 2002. Natural language based reformulation resource and Web exploitation for question answering. In Proc. of the TREC-2002 Conference.

Iida, R., Tokunaga, Y., Inui, K. and Eto, J. 2001. Exploration of clause-structural and function-expressional paraphrasing using KURA. In Proc. of the 63rd Annual Meeting of Information Processing Society of Japan, pages 5–6. (in Japanese)

Inui, K. and Nogami, M. 2001. A paraphrase-based exploration of cohesiveness criteria. In Proc. of the Eighth European Workshop on Natural Language Generation, pages 101–110.

Jacquemin, C. 1999. Syntagmatic and paradigmatic representations of term variations. In Proc. of the 37th Annual Meeting of the Association for Computational Linguistics (ACL), pages 341–349.

Kondo, K., Sato, S. and Okumura, M. 1999. Paraphrasing of “sahen-noun + suru”. Journal of Information Processing Society of Japan, 40(11):4064–4074. (in Japanese)

Kondo, K., Sato, S. and Okumura, M. 2001. Paraphrasing by case alternation. Journal of Information Processing Society of Japan, 42(3):465–477. (in Japanese)

Kurohashi, S. and Sakai, Y. 1999. Semantic analysis of Japanese noun phrases: a new approach to dictionary-based understanding. In Proc. of the 37th Annual Meeting of the Association for Computational Linguistics (ACL), pages 481–488.

Lavoie, B., Kittredge, R., Korelsky, T. and Rambow, O. 2000. A framework for MT and multilingual NLG systems based on uniform lexico-structural processing. In Proc. of ANLP-NAACL.

Lin, D. and Pantel, P. 2001. Discovery of inference rules for question-answering. Natural Language Engineering, 7(4):343–360.

McCoy, K. F. and Masterman (Michaud), L. N. 1997. A tutor for teaching English as a second language for deaf users of American Sign Language. In Proc. of ACL/EACL ’97 Workshop on Natural Language Processing for Communication Aids.

Meyers, A., Yangarber, R. and Grishman, R. 1996. Alignment of shared forests for bilingual corpora. In Proc. of the 16th International Conference on Computational Linguistics (COLING), pages 460–465.

Michaud, L. N. and McCoy, K. F. 2001. Error profiling: toward a model of English acquisition for deaf learners. In Proc. of the 39th Annual Meeting and the 10th Conference of the European Chapter of Association for Computational Linguistics (EACL), pages 386–393.

NIJL, the National Institute for Japanese Language. 1991. Nihongo Kyôiku-no tame-no Kihon-Goi Chôsa (The basic lexicon for the education of Japanese). Shuei Shuppan, Japan. (in Japanese)

Richardson, S., Dolan, W., Menezes, A. and Corston-Oliver, M. 2001. Overcoming the customization bottleneck using example-based MT. In Proc. of the 39th Annual Meeting and the 10th Conference of the European Chapter of Association for Computational Linguistics (EACL), pages 9–16.

Robin, J. and McKeown, K. 1996. Empirically designing and evaluating a new revision-based model for summary generation. Artificial Intelligence, 85(1–2):135–179.

Sato, S. 1999. Automatic paraphrase of technical papers’ titles. Journal of Information Processing Society of Japan, 40(7):2937–2945. (in Japanese)

Shinyama, Y., Sekine, S., Sudo, K. and Grishman, R. 2002. Automatic paraphrase acquisition from news articles. In Proc. of HLT, pages 40–46.

Siddharthan, A. 2003. Preserving discourse structure when simplifying text. In Proc. of European Workshop on Natural Language Generation, pages 103–110.

Takahashi, T., Iwakura, T., Iida, R., Fujita, A. and Inui, K. 2001. KURA: a transfer-based lexico-structural paraphrasing engine. In Proc. of the 6th Natural Language Processing Pacific Rim Symposium (NLPRS) Workshop on Automatic Paraphrasing: Theories and Applications, pages 37–46.

Williams, S., Reiter, E. and Osman, L. 2003. Experiments with discourse-level choices and readability. In Proc. of European Workshop on Natural Language Generation, pages 127–134.