{"id":198,"date":"2011-06-28T16:10:19","date_gmt":"2011-06-28T15:10:19","guid":{"rendered":"http:\/\/blog.soton.ac.uk\/oneshare\/?p=198"},"modified":"2011-08-27T09:05:04","modified_gmt":"2011-08-27T08:05:04","slug":"campusroar-finding-feeds","status":"publish","type":"post","link":"https:\/\/blog.soton.ac.uk\/oneshare\/2011\/06\/28\/campusroar-finding-feeds\/","title":{"rendered":"CampusROAR: Finding Feeds"},"content":{"rendered":"<p>Recently I&#8217;ve been looking at methods to get all the feeds from the University web presence and compile them into one big list for later use.\u00a0 This is harder than it looks, since the University website is a large beast, with hundreds of subdomains.<\/p>\n<p>I&#8217;ve been writing a simple web spider using just basic command-line tools and bash, built around wget&#8217;s -r flag, which recursively follows links and downloads the files it finds on a given host.\u00a0 The script then works through each subdomain it finds and downloads those files too.\u00a0 After each subdomain it deletes all files over 1 MB to save space, since web pages are all we&#8217;re concerned about and they&#8217;re unlikely to be that large.\u00a0 Afterwards it looks for RSS tags in all the remaining files and prints the paths to them.<\/p>\n<p>It&#8217;s currently working through downloading everything, which I imagine will take a long while.\u00a0 Oh well.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Recently I&#8217;ve been looking at methods to get all the feeds from the University web presence and compile them into one big list for later use.\u00a0 This is harder than it looks, since the University website is a large &hellip;<\/p>\n<p class=\"read-more\"> <a class=\"more-link\" href=\"https:\/\/blog.soton.ac.uk\/oneshare\/2011\/06\/28\/campusroar-finding-feeds\/\"> <span class=\"screen-reader-text\">CampusROAR: Finding Feeds<\/span> Read More 
&raquo;<\/a><\/p>\n","protected":false},"author":188,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[4013],"class_list":["post-198","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-campusroar"],"_links":{"self":[{"href":"https:\/\/blog.soton.ac.uk\/oneshare\/wp-json\/wp\/v2\/posts\/198","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.soton.ac.uk\/oneshare\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.soton.ac.uk\/oneshare\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.soton.ac.uk\/oneshare\/wp-json\/wp\/v2\/users\/188"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.soton.ac.uk\/oneshare\/wp-json\/wp\/v2\/comments?post=198"}],"version-history":[{"count":2,"href":"https:\/\/blog.soton.ac.uk\/oneshare\/wp-json\/wp\/v2\/posts\/198\/revisions"}],"predecessor-version":[{"id":310,"href":"https:\/\/blog.soton.ac.uk\/oneshare\/wp-json\/wp\/v2\/posts\/198\/revisions\/310"}],"wp:attachment":[{"href":"https:\/\/blog.soton.ac.uk\/oneshare\/wp-json\/wp\/v2\/media?parent=198"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.soton.ac.uk\/oneshare\/wp-json\/wp\/v2\/categories?post=198"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.soton.ac.uk\/oneshare\/wp-json\/wp\/v2\/tags?post=198"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}