I’m deeply concerned about the power that lies in the Webometrics league table. http://repositories.webometrics.info/toprep_inst.asp
They give a ranking bonus for your number of “Rich Files”, which basically means “Number of PDFs”. This means that if we were to push for using “scholarly HTML” rather than PDF, then our rank would drop.
Currently eprints.ecs.soton.ac.uk is at 22 and eprints.soton.ac.uk is at 60. I couldn’t tell you why, but stats isn’t my strong suit.
My real concern is that this league table will stifle innovation by only measuring common quality factors, rather than promoting new ones. Also, I think the “delta” is more important than the size, and always have. The success criterion for the TARDIS project, which launched eprints.soton, was that it should have a number (2000, I think) of records by a date. I opposed that at the time, and still think it was wrong. Better criteria would have been a sustained deposit rate and (in the first 2 years) a continuously increasing number of contributors.
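A minimal sketch of how such “delta” metrics could be computed, assuming a repository can export its deposits as (date, contributor) pairs. The function name, the data shape, and the sample names are all hypothetical, not anything eprints or ROAR actually provides:

```python
from collections import defaultdict
from datetime import date


def monthly_activity(deposits):
    """Summarise deposit activity per calendar month.

    deposits: iterable of (date, contributor_id) pairs (hypothetical export format).
    Returns a list of ((year, month), deposit_count, new_contributors),
    sorted by month. Months with no deposits at all are not listed.
    """
    counts = defaultdict(int)
    first_seen = {}  # contributor_id -> earliest month with a deposit
    for day, who in deposits:
        month = (day.year, day.month)
        counts[month] += 1
        if who not in first_seen or month < first_seen[who]:
            first_seen[who] = month

    new_by_month = defaultdict(int)
    for month in first_seen.values():
        new_by_month[month] += 1

    return [(m, counts[m], new_by_month[m]) for m in sorted(counts)]


# Made-up sample data, purely for illustration.
sample = [
    (date(2010, 1, 5), "alice"),
    (date(2010, 1, 20), "bob"),
    (date(2010, 2, 3), "alice"),
    (date(2010, 2, 9), "carol"),
]
activity = monthly_activity(sample)
```

A steady deposit count and a non-zero stream of new contributors month after month would say more about a repository’s health than its total size does.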
http://roar.eprints.org/ is run by one of my colleagues, but I’m very happy to see that they show graphs of “deposit activity” rather than size. This shows that eprints.soton is in very robust health: http://roar.eprints.org/1423/ with a sustained level of daily deposits over the past few years.
What’s unhealthy is that a drop in the ranking for eprints.soton caused the board which oversees the site to discuss how to improve our rankings, and there was no really obvious way I could see to do it without generating unnecessary additional PDF files. Of course this was rejected as a silly idea, but my fear is that other sites may feel pressured to improve their ranking and make bad decisions. The community should be calling the shots on which metrics make a good repository. I’m not sure what those metrics should be, but the community should be as careful as it can to avoid a situation where I can inflate my score by making my repository worse, e.g. by encouraging bad formats like PDF.
If you’ve not heard the PDF rant, then in short it’s that people write and read papers primarily on computers. In most cases they write in a format with some markup (LaTeX or Word) and then convert it to simulated sheets of A4 paper (PDF). Computers rarely have displays where an A4 page is useful. I don’t see how it’s acceptable to produce papers (gah, even the name is inappropriate) which can’t be comfortably viewed on my landscape laptop screen, on my phone, and on the iPad I might justify buying one day. Reading papers is one of the key things an academic does for a living and it’s still easier to read them by printing them out first.
There are some people moving in the right direction, at least: http://scholarlyhtml.org/ but the repository and research-publication community needs to be goaded in this direction, out of its PDF comfort zone.
I’ll repost the comment I made in response to Chris’ original rant on Brian Kelly’s blog:
Chris and Brian, you should take no notice of the webometrics site, which is fundamentally flawed in so many ways (IMHO). For a start, it only ranks those sites that follow its pattern! From its methodology page:
“- Only repositories with an autonomous web domain or subdomain are included:
repository.xxx.zz (YES)
http://www.xxx.zz/repository (NO)”
PS I don’t mind ordinary PDFs too much (always read on-screen, never print out), but I SO HATE 2-column PDFs, which are a roaring pain to read on screen :-(.
Don’t forget EPUB! That’s, IMO, one real direction for scholarly document packaging.