What's in a Word Family? The Assumptions of Lexical Units
| Title: | What's in a Word Family? The Assumptions of Lexical Units |
|---|---|
| Language: | English |
| Authors: | Phil Bennett (ORCID 0000-0002-6313-6760) |
| Source: | Vocabulary Learning and Instruction. 2026, vol. 15. |
| Availability: | Castledown Publishers. Ground Level, 470 St Kilda Road, Melbourne, 3004, Australia. Tel: +61-3-7003-8355; e-mail: contact@castledown.com; Web site: https://www.castledown.com/journals/vli |
| Peer Reviewed: | Y |
| Page Count: | 26 |
| Publication Date: | 2026 |
| Document Type: | Journal Articles; Reports - Research |
| Descriptors: | English, Morphemes, Etymology, Word Lists, Vocabulary, Form Classes (Languages), Dictionaries |
| ISSN: | 2981-9954 |
| Abstract: | Lemmas, flemmas, and level 6 word families (WF6) are three commonly discussed lexical units. Because each makes differing assumptions about learner knowledge, the selection of one unit over another in research or pedagogy has a great impact on interpretations of the lexical challenge. It is therefore important to fully understand these assumptions so that practitioners can select the most suitable unit for a given purpose. This study introduces an enhanced version of Nation's BNC-COCA word lists that can be used to quantify several features of lexical units. The original WF6 lists were adapted by including flemma and lemma groupings, part-of-speech (POS) tags, morphological codings, frequency data, and an expanded list of proper nouns. Analyses of lexical unit composition reveal that, owing to their much greater inclusivity than flemmas or lemmas, WF6 units provide rapid corpus coverage over the 1-2k bands, and that irregular forms make up a considerable proportion of 1k tokens regardless of the unit chosen. Accuracy checks then suggest that POS-tagged lists offer an improvement over untagged lists due to the latter's overestimation of coverage and blocking of homographic concepts. Finally, lexical and morphological profiling shows that threshold coverage values are unlikely to be reached without knowledge of at least mid-frequency headwords and tens of derivational affixes in most genres. |
| Abstractor: | As Provided |
| Notes: | https://osf.io/4mz6y |
| Entry Date: | 2026 |
| Accession Number: | EJ1501238 |
| Database: | ERIC |
| Full Text: | https://eric.ed.gov/contentdelivery/servlet/ERICServlet?accno=EJ1501238 |
| Persistent Link: | https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=eric&AN=EJ1501238 |
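
The abstract's point that the choice of lexical unit changes the apparent lexical challenge can be illustrated with a small sketch. The lexicon below is a hypothetical toy example, not data from the enhanced BNC-COCA lists: each POS-tagged form is assigned by hand to a lemma (headword plus POS), a flemma (headword regardless of POS), and a word-family headword (inflections plus derivations). The function name `coverage` and the tag labels are likewise assumptions made for illustration.

```python
# Hypothetical toy lexicon: (form, POS) -> (lemma, flemma, word-family headword).
# Groupings are hand-assigned for illustration, not taken from any published list.
LEXICON = {
    ("walk", "V"):          ("walk_V", "walk", "walk"),
    ("walked", "V"):        ("walk_V", "walk", "walk"),
    ("walk", "N"):          ("walk_N", "walk", "walk"),
    ("walks", "N"):         ("walk_N", "walk", "walk"),
    ("walker", "N"):        ("walker_N", "walker", "walk"),
    ("develop", "V"):       ("develop_V", "develop", "develop"),
    ("development", "N"):   ("development_N", "development", "develop"),
    ("undeveloped", "ADJ"): ("undeveloped_ADJ", "undeveloped", "develop"),
}

# Position of each unit's label inside the lexicon tuples.
UNIT_FIELD = {"lemma": 0, "flemma": 1, "family": 2}


def coverage(tokens, known, unit):
    """Return the share of tokens whose grouping under `unit` is in `known`."""
    field = UNIT_FIELD[unit]
    covered = sum(1 for tok in tokens if LEXICON[tok][field] in known)
    return covered / len(tokens)


# A toy "text": one token per lexicon entry.
TOKENS = list(LEXICON)

# The same learner knowledge (the verbs "walk" and "develop"),
# expressed in the terms of each lexical unit.
KNOWN = {
    "lemma": {"walk_V", "develop_V"},
    "flemma": {"walk", "develop"},
    "family": {"walk", "develop"},
}

for unit in UNIT_FIELD:
    print(unit, coverage(TOKENS, KNOWN[unit], unit))
```

With this toy data the same two known items yield coverage of 0.375 under lemmas, 0.625 under flemmas, and 1.0 under word families. The gap between the lemma and flemma figures comes entirely from treating the noun "walk" as known once the verb is known, which is the kind of assumption an untagged list makes.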