Preparing a Learner Corpus of Non-Iranian Learners of Persian (the Case of the Writing of Chinese Learners of Persian)
Journal of Teaching Persian to Speakers of Other Languages
Article 4, Volume 12, Issue 1 - Serial Number 25, Farvardin 1402, Pages 23-43
Article type: Research article
Digital Object Identifier (DOI): 10.30479/jtpsol.2021.14990.1518
Authors
Mohammad Bagher Mirzaei Hesarian*1; Leyla Golpour2; Amirreza Vakilifard3
1Corresponding author, Assistant Professor, Department of Teaching Persian to Speakers of Other Languages, Imam Khomeini International University, Qazvin, Iran.
2Assistant Professor, Department of Teaching Persian to Speakers of Other Languages, Imam Khomeini International University, Qazvin, Iran.
3Associate Professor, Department of Teaching Persian to Speakers of Other Languages, Imam Khomeini International University, Qazvin, Iran.
Received: 11 Bahman 1399; Revised: 19 Ordibehesht 1400; Accepted: 28 Ordibehesht 1400
Abstract
One of the basic and necessary steps in teaching Persian to speakers of other languages (AZFA) is to collect and record raw linguistic data, to prepare a learner corpus of non-Iranian learners of Persian, and to describe its data using linguistic theories. The present study was carried out as a step towards creating a linguistic corpus of non-Iranian learners of Persian. The raw linguistic data for the corpus were taken from the end-of-course writing test of Chinese learners of Persian at the Persian Language Teaching Center of Imam Khomeini International University (90 learners at the pre-intermediate level (A2) and 36 learners at the upper-intermediate level (B2)). Syntactic tagging was done manually, on the basis of Category and Scale Grammar and Bateni's description of Persian grammar. Ten grammatical tags were recorded in the learners' writing: sentence, clause; ranked clause and rank-shifted clause; finite clause and non-finite clause, verbal group, nominal group (complement and subject), and adverbial group. The resulting corpus consists of 126 written texts comprising 212 paragraphs, 29,857 words, 3,175 sentences, 4,912 clauses, and 19,369 groups, including 4,912 verbal groups, 8,760 nominal groups, and 4,912 adverbial groups (covering adjuncts and prepositional groups). The study also confirms the effectiveness of Bateni's descriptive grammar, which is based on Category and Scale Grammar, in the syntactic tagging of the writing of Chinese learners of Persian.
Keywords
Learner corpus; Category and Scale Grammar; writing; Chinese learners of Persian; pre-intermediate; upper-intermediate
Article Title [English]
Preparation of a Corpus Linguistics of Non-Iranian Learners of Persian: the case of the writings of Chinese learners of Persian
Authors [English]
Mohammad Bagher Mirzaei Hesarian1; Leyla Golpour2; Amirreza Vakilifard3
1Corresponding author, Assistant Professor of Teaching Persian Language to Speakers of Other Languages, Imam Khomeini International University, Qazvin, Iran.
2Assistant Professor of Teaching Persian Language to Speakers of Other Languages, Imam Khomeini International University, Qazvin, Iran.
3Associate Professor of Teaching Persian Language to Speakers of Other Languages, Imam Khomeini International University, Qazvin, Iran.
Abstract [English]
One of the basic and necessary steps in teaching Persian to non-Persian speakers is to collect and record linguistic data, prepare a learner corpus of non-Iranian learners of Persian, and describe its data using linguistic theories. The aim of this study was to take a step towards creating such a corpus. The linguistic data for the corpus are taken from the final writing test of Chinese learners of Persian at the Persian Language Teaching Center of Imam Khomeini International University (90 students at level A2 and 36 students at level B2), and the syntactic labeling, based on Category and Scale Grammar, was done manually. Nine grammatical labels were recorded in the students' writing: sentence, clause, ranked clause, rank-shifted clause, finite clause, non-finite clause, verbal group, nominal group (complement and subject), and adverbial group. The corpus consists of a total of 126 written texts comprising 212 paragraphs, 29,857 words, 3,175 sentences, 4,912 clauses, and 19,369 groups. These groups are 4,912 verbal, 8,760 nominal, and 4,912 adverbial groups (including adjuncts and prepositional groups). The research also confirms the effectiveness of Bateni's descriptive grammar, which is based on Category and Scale Grammar, in the syntactic labeling of the writing of Chinese learners of Persian.
Extended Abstract: Teaching Persian to Foreigners (TPF) is still at an early stage of its development; therefore, one of the basic and necessary steps is to collect and record raw linguistic data, prepare a Corpus of Non-Iranian Learners of Persian (CNLP), and describe its data using linguistic theories.
The present study is the result of an internal university research project carried out with the support of Imam Khomeini International University (IKIU), Qazvin, to take a step towards creating a CNLP, to identify and resolve potential problems, and to meet some of the needs of researchers. The research is based on Bateni's book Description of Persian Grammatical Structure, which follows the theory of Category and Scale Grammar (CSG). CSG works with four categories: "unit", "structure", "class" and "system". "Unit" and "structure" belong to the syntagmatic axis, which represents the sequence of the constituents or elements of language in time, while "class" and "system" belong to the paradigmatic axis, which represents the range of options available to the speaker at each point in the speech chain. The corpus is taken from one of the final writing tests of the general and supplementary Persian language courses of the Persian Language Teaching Center (PLTC) at IKIU. 90 Chinese learners of Persian (CLPs) at the general level and 36 at the supplementary level took the test; hence, a total of 126 test sheets were used as raw data. To prepare the corpus, the writing sheets of the CLPs were first typed in Microsoft Word. As far as possible, the text was typed exactly as the CLPs had written it in their compositions. Then, the typed content was grammatically tagged within the framework of CSG. At this stage, nine grammatical tags were recorded in the writings of the CLPs: sentence, clause, ranked and rank-shifted clause, finite and non-finite clause, verbal group, and nominal and adverbial groups. Since the full stop is taken as the boundary between the end of one sentence and the beginning of the next, the punctuation was reviewed by the researchers and, where necessary, corrected or supplemented.
Next, the sentences were identified and separated. In analyzing the corpus, the components of rank-shifted clauses are identified and counted as constituent elements of the clause (verbal group, nominal group, and adverbial group). In the analysis of nominal groups that contain other nominal dependents, only the main nominal group is counted. Adverbial groups are also identified as single units; that is, nominal groups inside adverbial groups are not labelled separately. Also, because Persian allows subject pronouns to be dropped, in a significant number of sentences in the CLPs' writings the subject is not realized as a nominal group. In labelling the corpus, an attempt was made to analyze and label the learners' texts as written, applying linguistic corrections as little as possible.
The CLPs at the general level were students who attended the PLTC at IKIU for 16 weeks, 20 hours per week, covering the four skills of listening, reading, speaking and writing. They therefore received a total of 320 hours of face-to-face instruction. Considering the quality and quantity of the educational program and the individual characteristics of the CLPs, the general course can be considered equivalent to the pre-intermediate level (A2) of the Common European Framework of Reference for Languages (CEFR). The CLPs at the supplementary level were students who attended the PLTC at IKIU for 32 weeks, 20 hours per week, covering the four skills. They therefore received a total of 640 hours of face-to-face instruction. Considering the quality and quantity of the educational program and the individual characteristics of the CLPs, the supplementary course can be considered equivalent to the upper-intermediate level (B2) of the CEFR.
The most important achievement of the research is the preparation of the initial version of the CNLP, with the following characteristics. A total of 126 writings of CLPs were used as raw data for the CNLP at two levels (90 writings at the general level and 36 writings at the supplementary level). The corpus is therefore composed of a total of 126 written texts comprising 212 paragraphs and 29,857 words. It contains a total of 3,175 sentences, 4,912 clauses, and 19,369 groups (including 4,912 verbal groups, 8,760 nominal groups and 4,912 adverbial groups, covering adjuncts and prepositional groups). The study confirms the effectiveness of CSG in accurately describing the CLPs' writings.
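The tagging described above was done by hand. Purely as an illustration, the sketch below shows how the same kinds of tallies (sentences, clauses, clause types, and groups) could be computed once such tags are available in machine-readable form. The class names and the tag codes `VG`, `NG` and `AG` are hypothetical and do not come from the study; the toy data are invented and are not drawn from the corpus.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List

# Hypothetical tag codes for the three group classes named in the abstract:
# verbal group, nominal group, adverbial group.
GROUP_TAGS = {"VG", "NG", "AG"}


@dataclass
class Clause:
    groups: List[str]           # group tags in order, e.g. ["NG", "AG", "VG"]
    finite: bool = True         # finite vs. non-finite clause
    rank_shifted: bool = False  # rank-shifted vs. ranked clause


@dataclass
class Sentence:
    clauses: List[Clause] = field(default_factory=list)


def tally(sentences: List[Sentence]) -> Counter:
    """Count sentences, clauses, clause types and groups in tagged text."""
    counts: Counter = Counter()
    for sentence in sentences:
        counts["sentence"] += 1
        for clause in sentence.clauses:
            counts["clause"] += 1
            counts["rank_shifted" if clause.rank_shifted else "ranked"] += 1
            counts["finite" if clause.finite else "non_finite"] += 1
            for tag in clause.groups:
                if tag not in GROUP_TAGS:
                    raise ValueError(f"unknown group tag: {tag}")
                counts[tag] += 1       # per-class group count
                counts["group"] += 1   # overall group count
    return counts


# Invented toy data: two sentences, three clauses.
toy = [
    Sentence([Clause(["NG", "AG", "VG"])]),
    Sentence([
        Clause(["AG", "VG"]),
        Clause(["NG", "VG"], finite=False, rank_shifted=True),
    ]),
]
counts = tally(toy)
print(counts["sentence"], counts["clause"], counts["group"])
```

In this sketch the groups inside a rank-shifted clause are counted like those of any other clause, which mirrors the counting convention stated in the extended abstract; a different convention would only require changing the inner loop.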
Keywords [English]
Language learner corpora, Chinese learners of Persian, Category and Scale Grammar, Writing
References
Bijankhan, M. (2004). The role of linguistic corpora in writing grammar: Introducing a computer software. Majalle-ye Zabanshenasi [Journal of Linguistics], 19(2), 48-67. [In Persian]
Assi, S. M. (1997). Farsi linguistic database (FLDB). International Journal of Lexicography, 10(3), 5.
Bateni, M. R. (2013). Description of Persian Grammatical Structure (30th ed.). Tehran: Amirkabir. [In Persian]
Bijankhan, M. (2016). The Corpus of Contemporary Colloquial Persian. Proceedings of the Second National Conference on Corpus Linguistics. Tehran: Nevise-e-Parsi. [In Persian]
Bijankhan, M., Sheikhzadegan, J., Roohani, M. R., Samareh, Y., Lucas, C., & Tebyani, M. (1994). FARSDAT-The Speech Database of Farsi Spoken Language. The Proceedings of the Australian Conference on Speech Science and Technology, 2, 826-831.
Bijankhan, M., Sheykhzadegan, J., Bahrani, M., & Ghayoomi, M. (2011). Lessons from building a Persian written corpus: Peykare. Language Resources and Evaluation, 45(2), 143-164.
Eghbalzadeh, H., Hosseini, B., Khadivi, S., & Khodabakhsh, A. (2012, November). Persica: A Persian Corpus for Multipurpose Text Mining and Natural Language Processing. In Sixth International Symposium on Telecommunications (IST). IEEE. Tehran.
Ghorbanzadeh, F. (2015). Introducing the Contemporary Persian Corpus. Proceedings of the First National Conference on Corpus Linguistics. Tehran: Nevise-e-Parsi.
Halliday, M.A.K., & Matthiessen, C. M. I. M. (2004). An introduction to functional grammar (3rd ed.). London: Arnold.
Jahangardi, K. (2016). An Analysis of Textbooks for Teaching Persian to Non-Persians: A Corpus-Cognitive Approach to Teaching Vocabulary. Ph.D. thesis. Tehran: Institute for Humanities & Cultural Studies.
Rasooli, M. S., Kouhestani, M., & Moloodi, A. S. (2013). Development of a Persian Syntactic Dependency Treebank. In The 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT), Atlanta, USA.
Safari, S. (2015). From Corpus Linguistics to Learner Corpus. Proceedings of the First National Conference on Corpus Linguistics. Tehran: Nevise-e-Parsi.
Sahraei, R. M., & Marsoos, F. (2016). Persian Teaching Reference Standard. Tehran: Allameh Tabatabai Publications. [In Persian]
Shamsfard, M., Hesabi, A., Fadaei, H., Mansoory, N., Famian, A., Bagherbeigi, S., Fekri, E., et al. (2010). Semi Automatic Development of Farsnet; the Persian Wordnet. Proceedings of 5th Global WordNet Conference (GWA2010). Mumbai, India.
Mirzaei, A., & Safari, P. (2014). Building specialized and general documents in Persian based on the frequency of function and content words. In A. Mirzaei (Ed.), Proceedings of the First National Conference on Corpus Linguistics (pp. 175-192). Tehran: Nevise-e-Parsi. [In Persian]
Mirzaei, A., & Safari, P. (2018). Persian Discourse Treebank and Coreference Corpus. In LREC 2018, 4049-4055.