Hi @rdmpage ,
Thank you for your message.
I understand your reasoning here, but at scale it just doesn’t work. That’s why we advise our members the way we do.
The advice you quoted is:
"DOIs should not include information that can be understood, interpreted or predicted, especially information that may change. Page numbers and dates are examples of information that shouldn’t be included in suffixes. It is particularly problematic if the suffix includes information that conflicts with the metadata associated with the DOI."
When our members register DOI suffixes using patterns, we find two consistent problems, and they are what led us to that advice:
- People and the systems we build are inconsistent and/or make mistakes.
- When people register DOI suffixes with human-readable patterns, they will inevitably change those patterns over time, thus compromising the persistence of the DOI itself.
As you may know, DOIs for journal articles and other traditional content (books, conference papers, datasets, etc.) are citation identifiers. For us, the most fundamental objective for such a DOI is its own persistence. What we find time and again is that, when human-readable information or patterns are entered into suffixes, our members eventually decide that they want to alter those patterns. It may not happen tomorrow or next month, but it will happen.
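To make the alternative concrete, here is a minimal sketch of minting an opaque suffix, one that carries no dates, page numbers, or other metadata that could later drift out of sync. The prefix `10.5555`, the alphabet, the suffix length, and the function names are all placeholders chosen for illustration, not a Crossref requirement or tool.

```python
import secrets

# Hypothetical alphabet: lowercase letters and digits with look-alike
# characters (0/o, 1/l/i) removed so suffixes are easier to transcribe.
ALPHABET = "abcdefghjkmnpqrstuvwxyz23456789"


def opaque_suffix(length: int = 8) -> str:
    """Return a random suffix that encodes no metadata."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def mint_doi(prefix: str, existing: set[str], length: int = 8) -> str:
    """Draw random suffixes until one is not already in use under this prefix."""
    while True:
        doi = f"{prefix}/{opaque_suffix(length)}"
        if doi not in existing:
            existing.add(doi)
            return doi


if __name__ == "__main__":
    registered: set[str] = set()  # stand-in for a member's list of DOIs
    print(mint_doi("10.5555", registered))
```

Because nothing in the suffix describes the article, there is nothing in it to correct or make uniform later, so there is no reason to register a second DOI for the same work.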
Let’s look at an example you flagged to us in a different thread in the forum, this one: Three DOis for the same article!
In this thread, you highlight that there are three DOIs for the same journal article:
- 10.1002/mmng.20040070108
- 10.1002/mmng.4860070108
- 10.5194/fr-7-153-2004
You can see that there is what looks like some human-readable information in the suffixes of these DOIs, which, like you said above, is helpful for you:
- mmng: likely an identifier for the journal or the organization responsible for the journal (you’d know better than me, but I assume that is meaningful)
- 2004: present in one of the three DOIs, the publication year of this journal article
- 07 and 01: the volume and issue number
- 153: the first page of the article itself, present in that final DOI
- 486: I don’t know; is that meaningful?
I don’t know the complex history of this journal article and why it has three DOIs, but what we sometimes find is that our members decide to change the human-readable information in a DOI suffix so that it is uniform. They, like you, want to be able to look at the suffix and know for sure what each piece of information means, right? In some cases, that’s why we get duplicate DOIs like 10.1002/mmng.4860070108 and 10.1002/mmng.20040070108: the member has modified their suffix pattern and registered new DOIs for existing articles in order to have uniform DOIs. They may have made a mistake entering that human-readable information, or decided to adjust it over time.
When that happens it makes it much more difficult for you, others in the community, and for us to know which is the definitive DOI that will be used going forward for the journal article. It’s harder for you as a human. It’s harder for us to establish relationships, most notably cited-by links. Which one do we match to? Which one do you share with your colleagues?
I asked above which of the human-readable information was meaningful in this example. Where can I find that information? In our APIs. All of the metadata for these DOIs is there and available to all. We don’t have to guess whether, for instance, 486 is meaningful; we can simply review the whole record, because the metadata record itself is the definitive source of meaning.
If you prefer XML: http://doi.crossref.org/search/doi?pid=support@crossref.org&format=unixsd&doi=10.1002%2Fmmng.20040070108
If you prefer JSON:
http://api.crossref.org/works/10.5194/fr-7-153-2004?mailto=support@crossref.org
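For anyone who would rather script that lookup than paste URLs into a browser, here is a small sketch against the JSON endpoint above. The helper names (`works_url`, `summarize`, `fetch`) are made up for this example, and the handful of fields it pulls out (`title`, `issued`, `volume`, `issue`, `page`) are only the ones relevant to this thread; a real record contains much more, and any of these fields may be absent.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.crossref.org/works/"


def works_url(doi: str, mailto: str) -> str:
    """Build the REST API URL for one DOI, percent-encoding the slash in it."""
    query = urllib.parse.urlencode({"mailto": mailto})
    return API_ROOT + urllib.parse.quote(doi, safe="") + "?" + query


def summarize(record: dict) -> dict:
    """Pick out the fields that people tend to guess at from a suffix."""
    msg = record.get("message", {})
    date_parts = msg.get("issued", {}).get("date-parts", [[None]])
    return {
        "doi": msg.get("DOI"),
        "title": (msg.get("title") or [None])[0],
        "year": date_parts[0][0] if date_parts and date_parts[0] else None,
        "volume": msg.get("volume"),
        "issue": msg.get("issue"),
        "page": msg.get("page"),
    }


def fetch(doi: str, mailto: str) -> dict:
    """Download and decode the JSON record for one DOI (needs network access)."""
    with urllib.request.urlopen(works_url(doi, mailto), timeout=30) as resp:
        return json.load(resp)
```

Calling `summarize(fetch("10.5194/fr-7-153-2004", "you@example.org"))`, with your own email address in place of the placeholder, returns the year, volume, issue, and page as recorded in the metadata, so nothing has to be inferred from the suffix.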
While people, including me, like patterns, they don’t hold up over time and at scale. Many of us want those DOI suffixes to have consistent meaning, but they just don’t.
Thanks again,
Isaac