{"id":1277,"date":"2015-08-27T11:42:43","date_gmt":"2015-08-27T11:42:43","guid":{"rendered":"http:\/\/blog.soton.ac.uk\/webteam\/?p=1277"},"modified":"2015-08-27T11:46:49","modified_gmt":"2015-08-27T11:46:49","slug":"deliberately-tainted-data","status":"publish","type":"post","link":"https:\/\/blog.soton.ac.uk\/webteam\/2015\/08\/27\/deliberately-tainted-data\/","title":{"rendered":"Deliberately Tainted Data"},"content":{"rendered":"<div id=\"attachment_1279\" style=\"width: 242px\" class=\"wp-caption alignright\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-1279\" class=\"size-full wp-image-1279\" src=\"http:\/\/blog.soton.ac.uk\/webteam\/files\/2015\/08\/button_05.jpg\" alt=\"Do you see tainted data? Call ServiceLine now on...\" width=\"232\" height=\"95\" \/><p id=\"caption-attachment-1279\" class=\"wp-caption-text\">Do you see tainted data? Call now&#8230;<\/p><\/div>\n<p>In iSolutions, we run a fairly common setup in which each service has a live copy, a pre-production copy and one or more development copies. It took me a while to come around to this approach, having mostly done seat-of-the-pants hacking on live services in the past, but come around to it I have.\u00a0 One problem we&#8217;ve encountered a number of times with this approach is that the pre-production service contained more or less the same data as the live service, so either people used it in error, or it sent real emails to real people based on test data.<\/p>\n<p>A long time ago I came up with a way to massively reduce such incidents. Not stop, but reduce. The idea was inspired by the smell of natural gas. Natural gas doesn&#8217;t actually have much of a smell, but a distinctive smell is added artificially, which makes it very easy to notice if there&#8217;s a leak. 
While this approach doesn&#8217;t directly stop explosions, it means that 99% of incidents are caught long before anything bad can happen.<\/p>\n<h2>ChristopheX GutteridgX<\/h2>\n<p>My idea is to add a &#8220;taint&#8221; to some columns in the dev. and pre-production databases to make it obvious to a human that the data is tainted, without affecting testing. To do this I pick some free text columns which are going to be frequently viewed in any user-interface. For example Person_Forename, Person_Surname, Event_Title, Document_title. If a value has 3 or more characters, I replace its last character with a capital X. That way it doesn&#8217;t change the length of any data or notably change the indexing. So I would appear as &#8220;ChristopheX GutteridgX&#8221; and John Wu would be &#8220;JohX Wu&#8221;.\u00a0 It&#8217;s immediately obvious that something is off, but the system can be tested as usual. If ever preprod or dev data accidentally ends up in a live system, it&#8217;s immediately obvious. This can happen if a database hostname is accidentally included in the version controlled part of the application, rather than in a config file outside the normal version control.<\/p>\n<p>This is no substitute for proper checks and processes but it makes an excellent extra line of defence for no significant cost.<\/p>\n<h3>It works! (sample size: 1)<\/h3>\n<p>Today someone told me that their live database is showing tainted data. I&#8217;ve checked the database tables, and they have the correct untainted data, so I can deduce they&#8217;re still using an ODBC connection to the pre-prod database. A small victory, but it&#8217;s the first time this approach has paid off, so I wrote this blog post to celebrate.<\/p>\n<p>I&#8217;m sure I can&#8217;t be the only person who&#8217;s thought of this. Does the technique have a name? 
Is it a good idea or an antipattern?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In iSolutions, we run a fairly common system for services of having a live, a pre-production and one or more development copies of the service. It took me a while to come around to this approach, having mostly done seat-of-the-pants hacking on live services in the past, but come around to it I have.\u00a0 One [&hellip;]<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"footnotes":""},"categories":[198,352,86],"tags":[],"class_list":["post-1277","post","type-post","status-publish","format-standard","hentry","category-best-practice","category-data","category-database"],"_links":{"self":[{"href":"https:\/\/blog.soton.ac.uk\/webteam\/wp-json\/wp\/v2\/posts\/1277","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.soton.ac.uk\/webteam\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.soton.ac.uk\/webteam\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.soton.ac.uk\/webteam\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.soton.ac.uk\/webteam\/wp-json\/wp\/v2\/comments?post=1277"}],"version-history":[{"count":3,"href":"https:\/\/blog.soton.ac.uk\/webteam\/wp-json\/wp\/v2\/posts\/1277\/revisions"}],"predecessor-version":[{"id":1282,"href":"https:\/\/blog.soton.ac.uk\/webteam\/wp-json\/wp\/v2\/posts\/1277\/revisions\/1282"}],"wp:attachment":[{"href":"https:\/\/blog.soton.ac.uk\/webteam\/wp-json\/wp\/v2\/media?parent=1277"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.soton.ac.uk\/webteam\/wp-json\/wp\/v2\/categories?post=1277"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.soton.ac.uk\/webteam\/wp-json\/wp\/v2\/tags?post=1277"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}