ArXiv Accessibility | In the Dark


Just over a year ago I did a post about the need to make arXiv more accessible, particularly for readers with some form of visual impairment. Although I missed it at the time, there was an announcement from arXiv in December 2023 that new papers (meaning those submitted after 1st December 2023) will be available in HTML as well as PDF format. This has been in development for some time, actually, and HTML versions have been available from arXiv labs by changing the address of a paper from, for example, “” to “”. The latter produces this HTML version of one of the papers we have published at the Open Journal of Astrophysics:

As you can see, it works pretty well for this example.

Naturally I tried out the new “beta” release of the HTML generator, which you can now find on the right-hand panel of the abstract page alongside the PDF download instead of fiddling around with the URL. Here is an example of one of our papers on which it works well:

Here it is on another of our OJAp papers, for which, as you can see, the conversion does not work:

You can see the reason for the failure, which is that the LaTeX used to generate the paper contains a package the HTML generator does not know about. One of the difficulties for arXiv is that new packages are always being developed and it is hard to keep up. I’m told that on average arXiv achieves ~75% successful conversions (and 97% partial success), but the articles from January 2024 (which contain more new packages) convert with a success rate of only 62%. It’s far from perfect, but it will improve, especially if authors follow the advice on best practice produced by arXiv; I actually think authors have a responsibility to help arXiv as much as possible in this regard.
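For authors who want to check their own source before submitting, here is a rough sketch in Python of how one might list the packages a LaTeX file loads and flag any that are not on a list of known-good ones. The function names and the `supported` set are placeholders of my own for illustration; this is not arXiv's actual list of supported packages, nor the way its converter works.

```python
import re

# Matches \usepackage[options]{a,b} and \RequirePackage[options]{a,b}.
PACKAGE_RE = re.compile(
    r"\\(?:usepackage|RequirePackage)\s*(?:\[[^\]]*\])?\s*\{([^}]*)\}"
)


def strip_comments(tex: str) -> str:
    """Remove LaTeX comments: an unescaped % up to the end of the line.

    Simplification: verbatim environments and the rare "\\\\%" sequence
    are not treated specially.
    """
    return re.sub(r"(?<!\\)%.*", "", tex)


def list_packages(tex: str) -> list[str]:
    """Return package names in order of first appearance, without duplicates."""
    names: list[str] = []
    for group in PACKAGE_RE.findall(strip_comments(tex)):
        for name in group.split(","):
            name = name.strip()
            if name and name not in names:
                names.append(name)
    return names


def unsupported(tex: str, supported: set[str]) -> list[str]:
    """Return the loaded packages that are missing from `supported`.

    `supported` is a hypothetical allow-list supplied by the caller.
    """
    return [name for name in list_packages(tex) if name not in supported]


if __name__ == "__main__":
    sample = (
        "\\documentclass{article}\n"
        "\\usepackage[utf8]{inputenc}\n"
        "\\usepackage{amsmath, graphicx}\n"
        "% \\usepackage{hidden}\n"
        "\\usepackage{mynewpkg}\n"
    )
    # Placeholder allow-list, purely for the demonstration.
    print(unsupported(sample, {"inputenc", "amsmath", "graphicx"}))
```

Run on the sample above, the script reports only the made-up `mynewpkg`; the commented-out line is ignored. Anything it flags is merely a candidate to check against arXiv's advice, not proof that the conversion will fail.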

This all reminds me of past experiences I’ve had teaching theoretical physics to blind and partially-sighted students. Years ago this used to involve making Braille copies of notes, but there are now various bits of software to help such people manage LaTeX both for creating and reading documents. In particular there are programs that can read LaTeX documents (including formulae and equations), which means that if a lecturer can supply a LaTeX source version of their notes, the student can hear them spoken out loud as well as make their own annotations/corrections. While HTML may well be better for some fields, I do wonder if physicists and other people in disciplines that make heavy use of mathematics might prefer to use the LaTeX source code which is already downloadable from arXiv?
