Module:TaxonItalics/sandbox
To avoid major disruption and server load, any changes should be tested in the module's /sandbox or /testcases subpages, or in your own module sandbox. The tested changes can be added to this page in a single edit. Consider discussing changes on the talk page before implementing them.
Purpose
The module is primarily intended for use by the automated taxobox system. It supports the correct italicization of scientific names. Botanical (ICNafp) names may contain "connecting terms"; these must not be italicized. The hybrid symbol, ×, should also not be italicized. The module optionally wikilinks and abbreviates italicized names.
For non-virus taxa, italics are used at the rank of genus or below. The module does not decide whether a scientific name should be italicized. Use {{Is italic taxon}} for this purpose.
Usage
- {{#invoke:TaxonItalics|main|TAXON_NAME}} – italicizes a taxon name
- {{#invoke:TaxonItalics|main|TAXON_NAME|linked=yes}} – italicizes a taxon name, wikilinking the italicized output to the unchanged input
- {{#invoke:TaxonItalics|main|TAXON_NAME|abbreviated=yes}} – italicizes a taxon name, abbreviating all but the last part to the first letter
- {{#invoke:TaxonItalics|main|TAXON_NAME|dab=yes}} – italicizes a taxon name, treating any parenthesized part as a disambiguation term, and not italicizing it
The parameters can be combined. The module can also be used via {{Taxon italics}}.
Examples
Just italicized
- Connecting terms
- Pinus subg. Pinus → Pinus subg. Pinus
- P. subgenus Pinus → P. subg. Pinus
- P. subsect. Pinaster → P. subsect. Pinaster
- Acer tataricum subsp. ginnala → Acer tataricum subsp. ginnala
- Aster ericoides var. ericoides → Aster ericoides var. ericoides
- A. ericoides varietas ericoides → A. ericoides var. ericoides
- A. e. subvar. ericoides → A. e. subvar. ericoides
Botanical names may contain only one infraspecific epithet; a string like "Fragaria vesca subsp. vesca f. semperflorens" is a classification, not a name, and is not handled by the module.
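The normalization of connecting terms shown above can be sketched in plain Lua. This is a minimal illustration, not the module's code: the table `connecting` is a trimmed-down, hypothetical subset of the variants the module accepts, and `splitWords` and `normalizeInfraspecific` are illustrative helper names (the module itself splits with `mw.text.split`).

```lua
-- Hypothetical, trimmed-down lookup: variant spelling -> canonical form.
local connecting = {
    subspecies = "subsp.", ["subsp."] = "subsp.", ssp = "subsp.", ["ssp."] = "subsp.",
    varietas = "var.", ["var."] = "var.", var = "var.",
}

-- Split a name into words on spaces (plain Lua, no mw library needed).
local function splitWords(name)
    local words = {}
    for word in string.gmatch(name, "[^ ]+") do
        words[#words + 1] = word
    end
    return words
end

-- Return the name with its connecting term normalized, or nil when the
-- name is not of the form "genus species <term> epithet".
local function normalizeInfraspecific(name)
    local words = splitWords(name)
    if #words == 4 and connecting[words[3]] then
        words[3] = connecting[words[3]]
        return table.concat(words, " ")
    end
    return nil
end

print(normalizeInfraspecific("Aster ericoides varietas ericoides"))
-- Aster ericoides var. ericoides
print(normalizeInfraspecific("Fragaria vesca subsp. vesca f. semperflorens"))
-- nil (six words, so it is not treated as a name)
```

Keying the table on every accepted variant keeps the lookup to a single table access and makes the canonical abbreviation the only form that reaches the output.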
- Hybrid symbols
- Elaeagnus × submacrophylla → Elaeagnus × submacrophylla
- ×Beallara → ×Beallara
- × Beallara → × Beallara
- {{hybrid}}Beallara → ×Beallara
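How the hybrid symbol might be kept upright inside an italicized name can be sketched as follows. This is an illustration in plain Lua, assuming the entity spellings `&#215;` and `&times;` are the alternatives to be normalized; `protectHybridSymbol` is a hypothetical helper name.

```lua
local NORMAL_OPEN = '<span style="font-style:normal;">'
local NORMAL_CLOSE = '</span>'

-- Normalize entity spellings of the hybrid symbol, then wrap every
-- occurrence in a span that switches italics off.
local function protectHybridSymbol(name)
    -- Assumption: these two entities are the spellings worth handling.
    name = name:gsub("&#215;", "×"):gsub("&times;", "×")
    -- gsub returns (string, count); keep only the string.
    local wrapped = name:gsub("×", NORMAL_OPEN .. "×" .. NORMAL_CLOSE)
    return wrapped
end

print(protectHybridSymbol("Elaeagnus &times; submacrophylla"))
```

The result is meant to be placed inside italic markup, so only the symbol itself is rendered upright.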
Linked
Using |linked=yes
- Populus sect. Aigeiros → Populus sect. Aigeiros
- Elaeagnus × submacrophylla → Elaeagnus × submacrophylla
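The way a link is combined with italic markup can be sketched like this. It is a simplified illustration in plain Lua; `wrapName` is a hypothetical helper, and `formatted` stands for the name after connecting terms and hybrid symbols have been de-italicized.

```lua
local ITAL = "''"  -- wiki markup for italics

-- name:      the unchanged input, used as the link target
-- formatted: the name after internal formatting has been applied
-- linked:    whether a wikilink is wanted
local function wrapName(name, formatted, linked)
    if not linked then
        return ITAL .. formatted .. ITAL
    end
    if formatted ~= name then
        -- Formatting changed the text, so a piped link is needed.
        return "[[" .. name .. "|" .. ITAL .. formatted .. ITAL .. "]]"
    end
    -- Nothing changed: italicize around a plain link.
    return ITAL .. "[[" .. name .. "]]" .. ITAL
end

print(wrapName("Pinus sylvestris", "Pinus sylvestris", true))
-- ''[[Pinus sylvestris]]''
```

Linking to the unchanged input means the link target never contains span markup, whatever formatting is applied to the displayed text.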
Abbreviated
Using |abbreviated=yes
- Populus sect. Aigeiros → P. sect. Aigeiros
- Acer tataricum subsp. ginnala → A. t. subsp. ginnala
- [also linked] × Sorbaronia fallax → × S. fallax
- [also linked] Elaeagnus × submacrophylla → E. × submacrophylla
- Elaeagnus ×submacrophylla → E. ×submacrophylla
- Elaeagnus {{hybrid}} submacrophylla → E. × submacrophylla
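The abbreviation rule (every part except the last is reduced to its first letter) can be sketched in plain Lua. This is a simplified illustration: it assumes ASCII words, does not skip a leading hybrid symbol, parenthesis, or quote, and treats any word in the hypothetical `keep` table as a connecting term wherever it occurs; `abbreviateLeading` is an illustrative name.

```lua
-- Hypothetical subset of connecting terms that must stay unabbreviated.
local keep = { ["subsp."] = true, ["var."] = true, ["sect."] = true }

-- Abbreviate every word except the last to its first letter plus ".".
local function abbreviateLeading(name)
    local words = {}
    for word in string.gmatch(name, "[^ ]+") do
        words[#words + 1] = word
    end
    for i = 1, #words - 1 do
        if not keep[words[i]] then
            -- Byte-based sub: fine for ASCII, not for "×" or accents.
            words[i] = string.sub(words[i], 1, 1) .. "."
        end
    end
    return table.concat(words, " ")
end

print(abbreviateLeading("Acer tataricum subsp. ginnala"))
-- A. t. subsp. ginnala
```

A single-word name is returned unchanged, because the loop over the leading words never runs.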
Disambiguation terms
By default, a parenthesized part of a taxon name is assumed to be a subgenus name, and is italicized:
- Varanus (Hapturosaurus) → Varanus (Hapturosaurus)
- Caia (plant) → Caia (plant) – wrong
To treat a parenthesized part as a disambiguation term, use |dab=yes:
- Caia (plant) → Caia (plant)
- (also linked) Caia (plant) → Caia (plant)
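The difference between the two treatments of parentheses can be sketched in plain Lua. This is an illustration only; `deitalicizeParens` is a hypothetical helper name, and the output is meant to sit inside italic markup.

```lua
local OPEN = '<span style="font-style:normal;">'
local CLOSE = '</span>'

-- dab = true:  the whole parenthesized part is set upright.
-- dab = false: only the parentheses themselves are set upright, so a
--              subgenus name between them stays italic.
local function deitalicizeParens(name, dab)
    local result
    if dab then
        result = name:gsub("%(", OPEN .. "("):gsub("%)", ")" .. CLOSE)
    else
        result = name:gsub("%(", OPEN .. "(" .. CLOSE)
                     :gsub("%)", OPEN .. ")" .. CLOSE)
    end
    return result
end

print(deitalicizeParens("Caia (plant)", true))
-- Caia <span style="font-style:normal;">(plant)</span>
```

With `dab` set, one span opens at "(" and closes after ")", which relies on the parentheses in the name being balanced.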
For even more examples, see the testcases.
--[[=========================================================================
Italicize a taxon name appropriately by invoking italicizeTaxonName.
The algorithm used is:
* If the name has italic markup at the start or the end, do nothing.
* Else
  * Remove (internal) italic markup.
  * If the name is made up of four words and the third word is a
    botanical connecting term, de-italicize the connecting term and add
    italic markup to the outside of the name.
  * Else if the name is made up of three words and the second word is a
    botanical connecting term or a variant of "cf.", de-italicize the
    connecting term and add italic markup to the outside of the name.
  * Else just add italic markup to the outside of the name.
The module also:
* Ensures that the hybrid symbol, ×, and parentheses are not italicized, as
  well as any string inside parentheses if dab is true.
* Has an option to abbreviate all parts of taxon names other than the last
  to the first letter (e.g. "Pinus sylvestris var. sylvestris" becomes
  "P. s. var. sylvestris").
* Has an option to wikilink the italicized name to the input name.
=============================================================================]]

local p = {}
local l = {} -- used to store purely local functions

--connecting terms in three part names (e.g. Pinus sylvestris var. sylvestris)
local cTerms3 = {
    --subsp.
    subspecies = "subsp.", ["subsp."] = "subsp.", subsp = "subsp.", ["ssp."] = "subsp.", ssp = "subsp.",
    --var.
    varietas = "var.", ["var."] = "var.", var = "var.",
    --subvar.
    subvarietas = "subvar.", ["subvar."] = "subvar.", subvar = "subvar.",
    --f.
    forma = "f.", ["f."] = "f.", f = "f.",
    --subf.
    subforma = "subf.", ["subf."] = "subf.", subf = "subf."
}

--connecting terms in two part names (e.g. Pinus sect. Pinus)
local cTerms2 = {
    --subg.
    subgenus = "subg.", ["subgen."] = "subg.", ["subg."] = "subg.", subg = "subg.",
    --supersect.
    supersection = "supersect.", ["supersect."] = "supersect.", supersect = "supersect.",
    --sect.
    section = "sect.", ["sect."] = "sect.", sect = "sect.",
    --subsect.
    subsection = "subsect.", ["subsect."] = "subsect.", subsect = "subsect.",
    --ser.
    series = "ser.", ["ser."] = "ser.", ser = "ser.",
    --subser.
    subseries = "subser.", ["subser."] = "subser.", subser = "subser.",
    --cf.
    cf = "cf.", ["cf."] = "cf.", ["c.f."] = "cf."
}

--[[=========================================================================
Main function to italicize a taxon name appropriately. For the purpose of the
parameters, see p.italicizeTaxonName().
=============================================================================]]
function p.main(frame)
    local name = frame.args[1] or ''
    local linked = frame.args['linked'] == 'yes'
    local abbreviated = frame.args['abbreviated'] == 'yes'
    local dab = frame.args['dab'] == 'yes'
    return p.italicizeTaxonName(name, linked, abbreviated, dab)
end

--[[=========================================================================
Utility local function to abbreviate an input string to its first character
followed by ".".
Both "×" and an HTML entity at the start of the string are skipped over in
determining first character, as is an opening parenthesis and an opening ",
which cause a matching closing character to be included.
=============================================================================]]
function l.abbreviate(str)
    local result = ""
    local hasParentheses = false
    local isQuoted = false
    if mw.ustring.len(str) < 2 then
        --single character strings are left unchanged
        result = str
    else
        --skip over an opening parenthesis that could be present at the start of the string
        if mw.ustring.sub(str, 1, 1) == "(" then
            hasParentheses = true
            result = "("
            str = mw.ustring.sub(str, 2, mw.ustring.len(str))
        elseif mw.ustring.sub(str, 1, 1) == '"' then
            isQuoted = true
            result = '"'
            str = mw.ustring.sub(str, 2, mw.ustring.len(str))
        end
        --skip over a hybrid symbol that could be present at the start of the string
        if mw.ustring.sub(str, 1, 1) == "×" then
            result = "×"
            str = mw.ustring.sub(str, 2, mw.ustring.len(str))
        end
        --skip over an HTML entity that could be present at the start of the string
        if mw.ustring.sub(str, 1, 1) == "&" then
            --plain search for the terminating ";" (was an undefined variable 'plain')
            local i, dummy = mw.ustring.find(str, ";", 2, true)
            result = result .. mw.ustring.sub(str, 1, i)
            str = mw.ustring.sub(str, i + 1, mw.ustring.len(str))
        end
        --if there's anything left, reduce it to its first character plus ".",
        --adding the closing parenthesis or quote if required
        if str ~= "" then
            result = result .. mw.ustring.sub(str, 1, 1) .. "."
            if hasParentheses then result = result .. ")"
            elseif isQuoted then result = result .. '"'
            end
        end
    end
    return result
end

--[[=========================================================================
The function which does the italicization.
Parameters:
  name (string) – the taxon name to be processed
  linked (boolean) – should a wikilink be generated?
  abbreviated (boolean) – should the first parts of the taxon name be
    reduced to capital letters?
  dab (boolean) – should any parenthesized part be treated as a
    disambiguation term and left unitalicized?
=============================================================================]]
function p.italicizeTaxonName(name, linked, abbreviated, dab)
    name = mw.text.trim(name)
    -- if the name begins with '[', then assume formatting is present
    if mw.ustring.sub(name, 1, 1) == '[' then return name end
    -- otherwise begin by replacing any use of the HTML italic tags
    -- by Wikimedia markup; replace any entity alternatives to the hybrid symbol
    -- by the symbol itself; prevent the hybrid symbol being treated as
    -- a 'word' by converting a following space to the HTML entity
    local italMarker = "''"
    name = string.gsub(mw.text.trim(name), "</?i>", italMarker)
    name = string.gsub(string.gsub(name, "&#215;", "×"), "&times;", "×")
    name = string.gsub(name, "</?span.->", "") -- remove any span markup
    name = string.gsub(name, "× ", "×&#32;")
    -- now italicize and abbreviate if required
    local result = name
    if name ~= '' then
        if string.sub(name, 1, 2) == italMarker or string.sub(name, -2) == italMarker then
            -- do nothing if the name already has italic markers at the start or end
        else
            name = string.gsub(name, italMarker, "") -- first remove any internal italics
            local words = mw.text.split(name, " ", true)
            if #words == 4 and cTerms3[words[3]] then
                -- the third word of a four word name is a connecting term
                -- ensure the connecting term isn't italicized
                words[3] = '<span style="font-style:normal;">' .. cTerms3[words[3]] .. '</span>'
                if abbreviated then
                    words[1] = l.abbreviate(words[1])
                    words[2] = l.abbreviate(words[2])
                end
                result = words[1] .. " " .. words[2] .. " " .. words[3] .. " " .. words[4]
            elseif #words == 3 and cTerms2[words[2]] then
                -- the second word of a three word name is a connecting term
                -- ensure the connecting term isn't italicized
                words[2] = '<span style="font-style:normal;">' .. cTerms2[words[2]] .. '</span>'
                if abbreviated then
                    words[1] = l.abbreviate(words[1])
                end
                result = words[1] .. " " .. words[2] .. " " .. words[3]
            elseif abbreviated then
                -- not a name as above; only deal with abbreviation
                if #words > 1 then
                    result = l.abbreviate(words[1])
                    for i = 2, #words - 1, 1 do
                        result = result .. " " .. l.abbreviate(words[i])
                    end
                    result = result .. " " .. words[#words]
                end
            else
                result = name
            end
            -- deal with any hybrid symbol as it should not be italicized
            result = string.gsub(result, "×", '<span style="font-style:normal;">×</span>')
            -- deal with any parentheses as they should not be italicized
            if dab then
                result = string.gsub(string.gsub(result, "%(", '<span style="font-style:normal;">('), "%)", ')</span>')
            else
                result = string.gsub(string.gsub(result, "%(", '<span style="font-style:normal;">(</span>'), "%)", '<span style="font-style:normal;">)</span>')
            end
            -- any question marks surrounded by spans can have the spans joined
            result = string.gsub(result, '</span>%?<span style="font%-style:normal;">', '?')
            -- add outside markup
            if linked then
                if result ~= name then
                    result = "[[" .. name .. "|" .. italMarker .. result .. italMarker .. "]]"
                else
                    result = italMarker .. "[[" .. name .. "]]" .. italMarker
                end
            else
                result = italMarker .. result .. italMarker
            end
        end
    end
    return result
end

--[[=========================================================================
Utility function used by other modules to check if a connecting term is
present in a name. The value of name is assumed to be plain text.
=============================================================================]]
function p.hasCT(frame)
    return p.hasConnectingTerm(frame.args[1] or '')
end

function p.hasConnectingTerm(name)
    local words = mw.text.split(name, " ", true)
    return (#words == 4 and cTerms3[words[3]])
        or (#words == 3 and cTerms2[words[2]])
end

return p