Shortly after our ‘Docs group’ meeting on 22nd October (reported in Our meetings), Graham Klyne devised the following dialogue between Carol and Martin to offer a more approachable explanation of what we presently believe active metadata to be about.

As our own understanding within the CREAM project evolves, it is inevitable that we will choose to explain some aspects differently, so naturally we expect to modify this dialogue …

Carol and Martin are walking in the park.  Martin is a software and information architect, and Carol is a research scientist. Martin has been talking for some time about “active metadata”, but Carol is confused about what it all means…

Carol: So what is this “Active Metadata” stuff you’ve been banging on about for the past 2 seasons?

Martin: It’s metadata, or indeed any data, that is used actively in some process.

C: What kind of process?

M: I’ve considered particularly the conduct of a research investigation, or an experiment that is part of such an investigation, but it could also be the process of creating an artwork, or choosing a place to live.

C: Is there something in common about these processes that makes the use of active metadata significant?

M: Yes, they involve decisions that determine how subsequent steps of the process are performed.  The “active metadata” is anything that informs these decisions.

C: So if I were making “active use” of some data, I would be using it to make decisions about how I conduct my research?  For example, when some computation fails to yield a useful result, I might choose to try a different computation, or change the parameters to the computation, or even choose to work with a completely different dataset?

M: Yes, that sounds about right.

C: Then if I run a pre-determined climate prediction model to obtain some result, that’s not making any active use?

M: At that level, maybe not, but if that prediction is part of some larger research project where the result informs how the project proceeds, then that would be a case of active use.  Also, if you make a closer inspection of your prediction model, you might find that there are input parameters that affect the way the computation is performed, so that too might be considered to be a form of active use.

C: Ah, so there’s nothing distinguishing about active metadata per se.  Any data might be “active metadata” in some context of use in which it is used to control an unfolding process.

M: Yes, quite so.

C: But I make decisions about research directions all the time – it’s fundamental to my role as a research scientist.  Why do you think I might be interested in this “Active Metadata” idea?

M: Decisions are often based on tacit knowledge, and it can be difficult for others to understand the decisions you have taken, and hence to follow a path that you have discovered.  Sometimes the tacit knowledge may be so ingrained that you might not even realize that you are actually making decisions.  When key decisions are articulated, it is easier to revisit those decisions and be more agile in your research.  Further, by recording active metadata, you reveal more information about your process, and hence help others to interpret, validate, reproduce and re-use your work, generally increasing its overall value to science.

C: So how would I go about recording active metadata?

M: Well, I’m not sure there’s a single answer for that.  One suggestion is to adopt a reflective approach.  Whenever a decision is made, make note of the decision made, and any inputs upon which it was based.  These notes then form part of the record of an investigation, and may be of varying formality depending on the particular situation.  Some of this information might be captured automatically by research instrumentation, but it seems likely that some of the most important decisions may be out of reach of such automated capture.
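
As an illustration only, the reflective note Martin describes (the decision made, plus the inputs it was based on) could be kept as a small structured record. The sketch below is an assumption for the sake of example, not a format defined by the CREAM project: the names `DecisionRecord` and `append_record`, the `rationale` field, and the sample values are all hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class DecisionRecord:
    """One reflective note: what was decided and what informed it (hypothetical format)."""
    decision: str
    inputs: list[str]
    rationale: str = ""
    # Timestamp is filled in automatically when the note is created.
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_record(log: list[dict], record: DecisionRecord) -> None:
    """Add the note to the investigation's running log as a plain dictionary."""
    log.append(asdict(record))


# Example values are invented, echoing Carol's earlier failed-computation case.
log: list[dict] = []
append_record(
    log,
    DecisionRecord(
        decision="Re-run the computation with changed parameters",
        inputs=["previous run gave no useful result", "parameter notes"],
        rationale="The earlier parameters looked too coarse",
    ),
)

# The log can be serialized and kept with the rest of the investigation record.
print(json.dumps(log, indent=2))
```

A free-text notebook entry would serve equally well in less formal situations; the point of the sketch is only that the decision and its inputs are recorded together.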

C: You said earlier that active metadata could be used with processes other than research.  How might it apply, for example, to creating an artwork?

M: Your question seems to ask how research processes are relevant to creative art, but I think that’s the wrong question.  Rather, I’d ask what research can learn from creative processes.  After all, research itself is, at heart, a creative process, is it not?  The importance of decision points in the overall process is something that has been highlighted by reflecting on creative artistic processes, and in particular on a process model called “Procedural Blending”.  According to this model, a creative process involves a number of key decision points, at which available materials (“inputs”) are selected and blended to form some new idea or artifact.  This appears to apply also to the creation of scientific knowledge and the methods by which it can be probed.

C: So a record of active use of metadata would include information about decisions made and the data or factors which informed those decisions.

M: Yes, I think so, but an open challenge here would be to recognize those decisions that are in some sense significant, and to pass over those that are part of some less important underlying or background activity.  Also, note that the decisions we talk about here are inherently part of some enveloping process, so we might well expect to see the record include a description of the process performed, for example based on models of provenance or workflow planning.

C: It seems that recognizing these key decisions is tricky enough, let alone the form that their record should take.  Are there any hints or pointers you can offer for this?

M: It’s only a start, but there are some techniques that are often used to help make complex or important decisions, such as checklists, pro/con lists, and scoring regimes.  Checklists are common where safety-related decisions are to be made (e.g. do I try to fly this aircraft?).  Lists of advantages and disadvantages of different courses of action are often used informally when trying to choose among a number of options (e.g. which of these houses should I buy?); scoring regimes are a more formal variation (e.g. used in the award of contracts among competing proposals).  These suggest just some models through which active metadata use might be manifested.
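
To make the scoring-regime idea concrete, here is a minimal sketch of a weighted score applied to the house-buying example. Everything in it is assumed for illustration: the criteria, the weights, the scores, and the function name `weighted_score` are hypothetical and come from no particular scoring method.

```python
def weighted_score(scores: dict[str, int], weights: dict[str, int]) -> int:
    """Sum each criterion's score multiplied by the weight given to that criterion."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


# Hypothetical weights: how much each criterion matters to the decision maker.
weights = {"price": 5, "location": 3, "condition": 2}

# Hypothetical scores (0 to 10) for each option against each criterion.
options = {
    "house_a": {"price": 6, "location": 8, "condition": 7},
    "house_b": {"price": 8, "location": 5, "condition": 6},
}

totals = {name: weighted_score(scores, weights) for name, scores in options.items()}
best = max(totals, key=totals.get)

print(totals)  # {'house_a': 68, 'house_b': 67}
print(best)    # house_a
```

In this reading, the weights and scores are the data that informed the decision, so keeping them alongside the chosen option is one way the active use of that data could be recorded.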

C: How does all this fit with other models of information capture and dissemination, such as the DCC life cycle?

M: The DCC life cycle model focuses very broadly on what happens to discovered and/or created knowledge, and is very non-specific about how such knowledge comes to be.  The role and use of active metadata is very much concerned with this discovery or creation phase, and as such it might be seen as exploring in greater detail that part of the overall life cycle of curated information.

C: What help is available if I want to capture and record active metadata?

M: Well, we’re working on that …

C: What should I look for in existing tools that might help me?

M: We don’t have complete answers for this, but we have already noted that any data can be active metadata.  So you’d be looking for tools that don’t restrict the kind of information you can access or record.  Also, because the active use is intimately related to some process, tools that help you to plan and/or record a process or workflow seem likely to be useful.

