Posted by: Arash Hejazi, 10 JUN 2014
Our current products (including our website, PJOnline.com) were organised around packages of content, such as articles, journals, and issues, an approach that of course has its value. However, it was clear to us, both from surveys and from our website usage analytics, that our readers are trying to find information that will support them in the real world of practice and help them to solve everyday problems. Often, they are searching for information to answer a specific question.
Let's take an example: the short supply of the preparation Syndol.
When we looked at the PJOnline traffic stats, we could easily see that 'syndol' was one of the top terms that readers searched for using the PJOnline search engine. However, when readers went to Google and searched for 'syndol shortage', there was information available, but the search engines assumed that we had nothing to say about it.
This had a clear meaning. Our content was not structured around the real-world topics that actually mattered to our readers, but around things that we 'thought' mattered.
More importantly, although we were quite active on Twitter, we were ignoring how social media has become as important as, if not more important than, search engines in content discovery. Hashtags have become a way to link data and act like keywords or keyphrases for a conversation visible to all users of Twitter. People don't necessarily go directly to sources of information; they look for that information on a search engine first, or find something on a social network. What matters is the 'keyword' or 'keyword phrase' they are looking for, or stumble upon.
We realised that we had to provide a way for our readers to find the information they needed, whether it was via our main web platform or any of our social media spaces, and regardless of whether it was published by us or another trustworthy source. So, our aim became to start organising our content around 'data', and the 'relationship' between these pieces of data, not just storing a repository of our articles. This would make our platform of content and the satellite platforms (such as blogs and other social networks) a destination for pharmacists and pharmaceutical scientists; a destination linked at both the data and user level, around concepts about which our audience needed to answer questions.
We had to take the first steps towards creating a knowledge base or Pharmacy Graph, and for this we needed three things, including a taxonomy and a tool that defined the relationships between these pieces of information. This meant that we had to change the way we create our content and manage our production workflows, and establish a digital-first workflow based on XML (a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable). This would be the foundation for our web ontology, characterised by formal semantics, which would improve search results.
We also had to lay the foundations for building a triplestore that would store the relationships between each piece of content and other pieces of content and information, including articles, data, and people. For this, we needed to have a taxonomy of all the terms that could answer the following questions:
- What a piece of content was about
- Who was involved in this article (e.g. author(s), other people, institutions, etc.)
- What relationship this piece of content has with other pieces of information (e.g. this is a formulation of a drug, which is used in the management of a disease, and is situated within a drug class, and is produced by a pharma company, and is approved for prescribing in a country)
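To make the idea of a triplestore concrete, here is a minimal sketch in plain Python of how the three questions above could be recorded as subject-predicate-object statements. It is only an illustration, not the tool we will use: every identifier and predicate name in it (such as `article:syndol-shortage` or `is_about`) is a hypothetical placeholder, and a real triplestore would add persistence, URIs, and a query language.

```python
from collections import namedtuple

# One statement in the graph: subject --predicate--> obj.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# All identifiers and predicate names below are hypothetical placeholders,
# chosen only to mirror the three questions in the list above.
triples = [
    # What a piece of content is about
    Triple("article:syndol-shortage", "is_about", "product:syndol"),
    # Who was involved
    Triple("article:syndol-shortage", "has_author", "person:example-author"),
    # How the subject relates to other pieces of information
    Triple("product:syndol", "is_formulation_of", "drug:example-drug"),
    Triple("drug:example-drug", "used_in_management_of", "disease:example-disease"),
    Triple("drug:example-drug", "belongs_to_class", "drug-class:example-class"),
    Triple("drug:example-drug", "produced_by", "company:example-pharma"),
    Triple("drug:example-drug", "approved_in", "country:example-country"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return every triple that agrees with the fields that were given."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def articles_about(store, term):
    """Return the sorted identifiers of content tagged as being about a term."""
    return sorted(t.subject for t in match(store, predicate="is_about", obj=term))


if __name__ == "__main__":
    print(articles_about(triples, "product:syndol"))
    for t in match(triples, subject="drug:example-drug"):
        print(t.predicate, "->", t.obj)
```

Storing each relationship as its own small statement means a new kind of relationship can be added later without restructuring the articles themselves.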
Interestingly, although hundreds, maybe thousands, of taxonomies for various fields have been produced, we couldn't find a ready-made taxonomy for pharmacy and pharmaceutical sciences. You may be familiar with the Medical Subject Headings (MeSH) produced by the National Library of Medicine, which has become a standard for indexing medical journals. Although it is probably the most comprehensive medical taxonomy to date, MeSH does not offer a classification suited to the specific disciplines of pharmacy and pharmaceutical sciences. So we eventually reached the conclusion that we should build our own. The taxonomy will remain an experiment, and we plan to review it periodically and improve it based on the usage and behaviour data we receive from our users. Furthermore, the system had to work well with our other platforms, such as BNF or even Medicines Complete as a whole, so that if PJOnline did not have the information the user was looking for, he or she would quickly be directed to another trustworthy source.
It was a significant project to build the taxonomy, but version 1.0 is ready now. You will see the new version on the new website, as soon as it is launched, and together we will take it from there.
The taxonomy is the first step for us in organising our content in a coherent way, as a knowledge graph. There are many more steps to take, including the implementation of a tool that can store the relationships between these pieces of data. We will get there. Our aim is to eventually create a comprehensive pharmacy and pharmaceutical sciences knowledge graph, an infrastructure that will connect our readers with the information they want and need. You can see an example of a graph here:
In simple terms, by adding the right descriptive labels or metadata to the content we produce, and by assigning a unique identifier (URI) to each piece of content, we (and you) will be empowered to do all sorts of things with it. Imagine subject pages that collate everything we have on a subject in a logical and meaningful way, improved search functionality, a better user experience, and easier navigation through an ocean of information to find exactly what you need.
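As a rough illustration of the subject-page idea, the sketch below groups content by the taxonomy terms attached to it. It assumes each piece of content is a simple record with a URI and a list of terms. The `example.org` URIs, the titles, and the term labels are all made up for the example and do not reflect our actual data model.

```python
from collections import defaultdict

# Hypothetical content records: each has a unique identifier (URI)
# and the taxonomy terms it has been labelled with.
content = [
    {
        "uri": "https://example.org/content/syndol-shortage",
        "title": "Example news item on a product shortage",
        "terms": ["syndol", "medicine shortages"],
    },
    {
        "uri": "https://example.org/content/shortage-guidance",
        "title": "Example guidance article",
        "terms": ["medicine shortages"],
    },
]


def build_subject_pages(items):
    """Map each taxonomy term to the sorted URIs of content labelled with it."""
    pages = defaultdict(list)
    for item in items:
        for term in item["terms"]:
            pages[term].append(item["uri"])
    # Sort terms and URIs so the resulting pages are stable and predictable.
    return {term: sorted(uris) for term, uris in sorted(pages.items())}


if __name__ == "__main__":
    for term, uris in build_subject_pages(content).items():
        print(term)
        for uri in uris:
            print("  ", uri)
```

Because the grouping is driven entirely by the labels, a newly published item appears on the relevant subject pages as soon as it is tagged, with no page being edited by hand.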