wikiscript - scripts for wikipedia (get wikitext for page etc.)¶ ↑
-
home :: github.com/wikiscript/wikiscript
-
gem :: rubygems.org/gems/wikiscript
-
rdoc :: rubydoc.info/gems/wikiscript
Usage¶ ↑
Read-only access to wikikpedia pages. Example - Get wikitext source (via en.wikipedia.org/w/index.php?action=raw&title=<title>
):
page = Wikiscript::Page.get( '2022_FIFA_World_Cup' ) # same as Wikiscript.get page.text
prints
The '''2022 FIFA World Cup''' is scheduled to be the 22nd edition of the [[FIFA World Cup]], the quadrennial international men's [[association football]] championship contested by the [[List of men's national association football teams|national teams]] of the member associations of [[FIFA]]. It is scheduled to take place in [[Qatar]] in 2022. This will be the first World Cup ever to be held in the [[Arab world]] and the first in a Muslim-majority country...
Or build your own page from scratch (no download):
page = Wikiscript::Page.new( <<TXT, title: '2022_FIFA_World_Cup' ) The '''2022 FIFA World Cup''' is scheduled to be the 22nd edition of the [[FIFA World Cup]], the quadrennial international men's [[association football]] championship contested by the [[List of men's national association football teams|national teams]] of the member associations of [[FIFA]]. It is scheduled to take place in [[Qatar]] in 2022. This will be the first World Cup ever to be held in the [[Arab world]] and the first in a Muslim-majority country... TXT page.text
prints
The '''2022 FIFA World Cup''' is scheduled to be the 22nd edition of the [[FIFA World Cup]], the quadrennial international men's [[association football]] championship contested by the [[List of men's national association football teams|national teams]] of the member associations of [[FIFA]]. It is scheduled to take place in [[Qatar]] in 2022. This will be the first World Cup ever to be held in the [[Arab world]] and the first in a Muslim-majority country...
Tables¶ ↑
Parse wiki tables into an array. Example:
table = Wikiscript.parse_table( <<TXT ) {| |- ! header1 ! header2 ! header3 |- | row1cell1 | row1cell2 | row1cell3 |- | row2cell1 | row2cell2 | row2cell3 |} TXT # -or- table = Wikiscript.parse_table( <<TXT ) {| ! header1 !! header2 !! header3 |- | row1cell1 || row1cell2 || row1cell3 |- | row2cell1 || row2cell2 || row2cell3 |} TXT # -or- table = Wikiscript.parse_table( <<TXT ) {| |- ! header1 ! header2 ! header3 |- | row1cell1 | row1cell2 | row1cell3 |- | row2cell1 | row2cell2 | row2cell3 |} TXT
resulting in:
pp table #=> [["header1", "header2", "header3"], # ["row1cell1", "row1cell2", "row1cell3"], # ["row2cell1", "row2cell2", "row2cell3"]]
Note: parse_table
will strip/remove (leading) style attributes (e.g. àttribute="value" |
and (inline) bold and italic emphases (e.g. ''
) from the (cell) text. Example:
table = Wikiscript.parse_table( <<TXT ) {| |- ! style="width:200px;"|Club ! style="width:150px;"|City |- |[[Biu Chun Rangers]]||[[Sham Shui Po]] |- |bgcolor=#ffff44 |''[[Eastern Sports Club|Eastern]]''||[[Mong Kok]] |- |[[HKFC Soccer Section]]||[[Happy Valley, Hong Kong|Happy Valley]] |} TXT
resulting in:
pp table #=> [["Club", "City"], # ["[[Biu Chun Rangers]]", "[[Sham Shui Po]]"], # ["[[Eastern Sports Club|Eastern]]", "[[Mong Kok]]"], # ["[[HKFC Soccer Section]]", "[[Happy Valley, Hong Kong|Happy Valley]]"]]
Links¶ ↑
Split links into two parts. Note: The alternate link title is optional. Example:
link, title = Wikiscript.parse_link( '[[La Florida, Chile|La Florida]]' ) link #=> "La Florida, Chile" title #=> "La Florida" link, title = Wikiscript.parse_link( '[[ La Florida, Chile]]' ) link #=> "La Florida, Chile" title #=> nil link, title = Wikiscript.parse_link( 'La Florida' ) link #=> nil title #=> nil
Document Element Structure¶ ↑
Get the document's element structure. Note: For now only section headings (h1
, h2
, h3
, …) and tables are supported. Example:
nodes = Wikiscript.parse( <<TXT ) =Heading 1== ==Heading 2== ===Heading 3=== {| |- ! header1 ! header2 ! header3 |- | row1cell1 | row1cell2 | row1cell3 |- | row2cell1 | row2cell2 | row2cell3 |} TXT pp nodes #=> [[:h1, "Heading 1"], # [:h2, "Heading 2"], # [:h3, "Heading 3"], # [:table, [["header1", "header2", "header3"], # ["row1cell1", "row1cell2", "row1cell3"], # ["row2cell1", "row2cell2", "row2cell3"]]]
That's all for now. More functionality will get added over time.
Install¶ ↑
Just install the gem:
$ gem install wikiscript
License¶ ↑
The wikiscript
scripts are dedicated to the public domain. Use it as you please with no restrictions whatsoever.