Blog:The LWP’s handling of mathematical formulae evolves: Difference between revisions

 
Line 42: Line 42:
Up until 1.39, in the configuration which was adopted on our website, the extension relied on Wikimedia’s so-called “Mathoid” service to convert the LaTeX source code into the visual rendition of the formula. In other words, the MediaWiki site (for example, the LWP’s site) would send a request to the Wikimedia servers via an API; the Mathoid service, [[mediawikiwiki:RESTBase|running on the Wikimedia servers]], would perform the rendering; the rendered formula would be sent back to the MediaWiki site as an SVG image; and it would then simply be displayed as an image in the user’s browser.


[[File:MediaWiki Math extension with Mathoid.png|center|thumb|500px]]
[[File:MediaWiki Math extension with Mathoid.png|center|thumb|600px]]


For the LWP, this system mostly worked. Sometimes, however, the communication between our website and the remote Wikimedia servers would fail, and an ugly error message would be displayed instead of the formula. This outcome was likelier on pages with relatively many formulae (for example, the ''Tractatus'' editions). Moreover, even when the communication was successful, it would increase the loading time of the page significantly. In the case of long pages with very many formulae (for example, the [[Tractatus Logico-Philosophicus (multilingual side-by-side view)|multilingual side-by-side view]] of the ''Tractatus''), the process would often time out and the page would then fail to load altogether.
Line 51: Line 51:
The way SimpleMathJax works is different from the process I described above in that there is no communication between the LWP’s site and the Wikimedia servers. When the page is loaded, the rendering of the formula is performed client-side, in the user’s web browser, thanks to the “[https://www.mathjax.org/ MathJax]” JavaScript library.


[[File:MediaWiki Math extension with MathJax.png|center|thumb|500px]]
[[File:MediaWiki Math extension with MathJax.png|center|thumb|600px]]


This solution is very stable, and the output is as pretty and as accessible as the Mathoid-generated version. Moreover, pages load much faster than they did before.
Line 75: Line 75:
The result of Frederic’s work was [[Project:Downloading, exporting, and manipulating the texts|the LWP’s ebook export feature]], which is not a MediaWiki extension but is also free and open-source software ([https://github.com/wittgenstein-project/wittgenstein-project.github.io the code is hosted on GitHub]). It is mostly written in Python and works as follows: it uses the [https://pypi.org/project/beautifulsoup4/ Beautiful Soup] web scraping library to retrieve the list of books to be converted from our “[[Project:All texts|All texts]]” page; for each of those, it retrieves the plain HTML version which MediaWiki generates when the string <code>?action=render</code> is appended to the page’s URL; it uses a custom-made parser to convert that HTML code into [[wikipedia:Markdown|Markdown]], a simple markup language which is designed to preserve only the formatting which is semantically relevant; it then uses the [https://pandoc.org/ Pandoc] document converter to turn the Markdown code into PDF, EPUB, and MOBI files. In addition to those, the Markdown file remains available as a very clean, plain-text version of the ebook.


[[File:Wittgenstein texts HTML to ebook.png|center|thumb|600px]]
[[File:Wittgenstein texts HTML to ebook.png|center|thumb|700px]]
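
To make two of the steps above more concrete, here is a minimal Python sketch. It is not taken from the actual export code: the base URL <code>https://example.org/wiki/</code> and the function names <code>render_url</code> and <code>pandoc_command</code> are hypothetical and serve only as illustration. It shows how the <code>?action=render</code> string might be appended to a page’s URL, and how a basic Pandoc call that turns a Markdown file into an ebook could be assembled.

```python
from urllib.parse import quote

# Hypothetical base URL, used only for illustration;
# a real script would point at the wiki's own article path.
WIKI_BASE = "https://example.org/wiki/"


def render_url(title: str) -> str:
    """Return the URL of the bare HTML rendition of a wiki page.

    MediaWiki page URLs use underscores instead of spaces; appending
    ?action=render asks for the page content without the site chrome.
    """
    page = quote(title.replace(" ", "_"), safe="()_:,-")
    return f"{WIKI_BASE}{page}?action=render"


def pandoc_command(markdown_path: str, output_path: str) -> list[str]:
    """Build a basic Pandoc invocation as an argument list.

    Pandoc infers the output format from the extension of the
    file passed to -o (for example .epub).
    """
    return ["pandoc", markdown_path, "-o", output_path]


if __name__ == "__main__":
    print(render_url("Tractatus Logico-Philosophicus (English)"))
    print(" ".join(pandoc_command("book.md", "book.epub")))
```

The argument list returned by <code>pandoc_command</code> could be handed to <code>subprocess.run</code> on a machine where Pandoc is installed; the sketch deliberately stops short of any network access or file conversion.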


The procedure runs automatically every 24 hours through GitHub Actions; the output files are hosted on GitHub, but they can be downloaded through direct links from the LWP’s website.