I’m deeply concerned about the power that lies in the webometrics league table: http://repositories.webometrics.info/toprep_inst.asp
They give a ranking bonus for your number of “Rich Files”, which basically means “number of PDFs”. This means that if we were to push for using “scholarly HTML” rather than PDF, then our rank would drop.
Currently eprints.ecs.soton.ac.uk is at 22 and eprints.soton.ac.uk is at 60; I couldn’t tell you why, but stats isn’t my strong suit.
My real concern is that this league table will stifle innovation by only measuring common quality factors, rather than promoting new ones. Also, I think the “delta” is more important than the size, and always have. The success criterion for the TARDIS project, which launched eprints.soton, was that it should have a certain number (2000, I think) of records by a given date. I opposed that at the time, and still think it was wrong. A better criterion would have been a sustained deposit rate and (in the first 2 years) a continuously increasing number of contributors.
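To make the “delta” idea a bit more concrete, here is a minimal sketch in Python of the two measures: deposits per month and the running count of distinct contributors. Everything in it is an assumption made for illustration: the record format (a deposit date plus a depositor name), the function names, and the sample data are all made up, and it is not how ROAR or webometrics calculate anything.

```python
from collections import Counter
from datetime import date


def monthly_deposits(records):
    """Count deposits per (year, month) from (date, depositor) pairs."""
    return Counter((d.year, d.month) for d, _ in records)


def cumulative_contributors(records):
    """Map each (year, month) to the number of distinct depositors seen so far."""
    seen = set()
    totals = {}
    for d, who in sorted(records):
        seen.add(who)
        # Later deposits in the same month overwrite the entry,
        # so each month ends up holding its end-of-month total.
        totals[(d.year, d.month)] = len(seen)
    return totals


# Hypothetical sample data, not real figures from any repository.
records = [
    (date(2010, 1, 5), "alice"),
    (date(2010, 1, 20), "bob"),
    (date(2010, 2, 3), "alice"),
    (date(2010, 2, 17), "carol"),
    (date(2010, 3, 9), "dave"),
]

deposits = monthly_deposits(records)
contributors = cumulative_contributors(records)
for month in sorted(deposits):
    print(month, deposits[month], contributors[month])
```

On measures like these, a repository does well if its monthly deposits hold steady and its contributor count keeps climbing, however many records it holds in total.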
http://roar.eprints.org/ is run by one of my colleagues, but I’m very happy to see that they show graphs of “deposit activity” rather than size. This shows that eprints.soton is in very robust health (http://roar.eprints.org/1423/), with a sustained level of daily deposits over the past few years.
What’s unhealthy is that a drop in the ranking for eprints.soton caused the board which oversees the site to discuss how to improve our rankings, and there was no really obvious way I could see to do it without generating unnecessary additional PDF files. Of course this was rejected as a silly idea, but my fear is that other sites may feel pressured to improve their ranking and make bad decisions. The community should be calling the shots on what metrics make a good repository. I’m not sure what those metrics should be, but they should be chosen as carefully as possible to avoid a situation where I can inflate my score by making my repository worse, e.g. by encouraging bad formats like PDF.
If you’ve not heard the PDF rant, then in short it’s that people write and read papers primarily on computers. In most cases they write in a format with some markup (LaTeX or Word) and then convert it to simulated sheets of A4 paper (PDF). Computers rarely have displays where an A4 page is useful. I don’t see how it’s acceptable to produce papers (gah, even the name is inappropriate) which can’t be comfortably viewed on my landscape laptop screen, on my phone, and on the iPad I might justify buying one day. Reading papers is one of the key things an academic does for a living and it’s still easier to read them by printing them out first.
There are some people moving in the right direction, at least: http://scholarlyhtml.org/ but the repository and research-publication community needs to be goaded in this direction, out of its PDF comfort zone.
I’ll repost the comment I made in response to Chris’ original rant on Brian Kelly’s blog:
Chris and Brian, you should take no notice of the webometrics site, which is fundamentally flawed in so many ways (IMHO). For a start, it only ranks those sites that follow its pattern! From its methodology page:
“- Only repositories with an autonomous web domain or subdomain are included:
repository.xxx.zz (YES)
http://www.xxx.zz/repository (NO)”
PS I don’t mind ordinary PDFs too much (always read on-screen, never print out), but I SO HATE 2-column PDFs, which are a roaring pain to read on screen :-(.
Don’t forget EPUB! That’s, IMO, one real direction for scholarly document packaging.