In the context of our Project Mirabel system that manages DITA content, I need to be able to answer the question “for topic X, what other topics link to it directly or indirectly?”

 

That is, say Topic A links to Topic B, which links to Topic C.

 

Asking the question “What topics ultimately link to topic C?” I would like to get the answer “Topic A, Topic B”.

 

Getting the answer for direct references is easy—I already build a where-used index that captures, for each DITA map or topic, what other maps and topics link directly to it.
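
For illustration only, a where-used index of this kind can be sketched as an inverted map from each link target to the set of topics that link to it directly. The `(source, target)` pair format and the name `build_where_used` are assumptions made for the example, not the actual Mirabel data model.

```python
from collections import defaultdict

def build_where_used(links):
    """Invert (source, target) link pairs into target -> set of direct referrers."""
    where_used = defaultdict(set)
    for source, target in links:
        if source != target:  # a topic referencing itself is not interesting
            where_used[target].add(source)
    return dict(where_used)

# Hypothetical topic IDs: A links to B, B links to C.
links = [("topic-a", "topic-b"), ("topic-b", "topic-c")]
where_used = build_where_used(links)
print(where_used["topic-c"])  # {'topic-b'}
```

This answers only the direct part of the question: Topic C reports Topic B but not Topic A.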

 

But to get the Topic A part of the answer, I need some kind of link graph index, and I’m not sure how best to go about calculating this or capturing it in some index or set of indexes.

 

In our content, the fan-out from a single Topic C to the set of topics that ultimately reference it could be tens of thousands of topics. We have about 45K topics in the content for each version of the ServiceNow Platform, and a number of those topics are used by a large number of other topics, so the explosion can be quite large. That suggests that a simple topic-to-ultimately-referencing-topics index would be very inefficient, in that the entry for any given topic could potentially have 45K – 1 entries (we don’t care that a topic references itself).
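
To make the size concern concrete, here is a rough sketch of the brute-force index under the assumption that direct references are available as a dict mapping each topic to the set of its direct referrers. The dict shape and the name `build_full_index` are hypothetical.

```python
def build_full_index(where_used):
    """Map every topic to all topics that reach it, directly or indirectly."""
    topics = set(where_used) | {r for refs in where_used.values() for r in refs}
    index = {}
    for topic in topics:
        seen, stack = set(), [topic]
        while stack:
            for referrer in where_used.get(stack.pop(), ()):
                if referrer != topic and referrer not in seen:
                    seen.add(referrer)
                    stack.append(referrer)
        index[topic] = seen
    return index

# Hypothetical data: A links to B, B links to C.
where_used = {"topic-b": {"topic-a"}, "topic-c": {"topic-b"}}
index = build_full_index(where_used)
total_entries = sum(len(refs) for refs in index.values())
print(sorted(index["topic-c"]), total_entries)  # ['topic-a', 'topic-b'] 3
```

In the worst case every entry holds every other topic, so with about 45K topics the index could hold up to 45,000 × 44,999, roughly 2 billion, entries in total.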

 

On the other hand, working backwards through chains of direct references can also be expensive and is probably too slow, so maybe the brute-force index is the best option?
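
As a point of comparison, working backwards can be sketched as a breadth-first search over the direct where-used index. The dict shape (topic to set of direct referrers) and the name `ultimate_referrers` are assumptions for the example. Each reachable topic and link is visited at most once, so the cost of one query is bounded by the size of the reachable part of the graph. Whether that is fast enough in practice depends on the data and on how the index is stored.

```python
from collections import deque

def ultimate_referrers(where_used, topic):
    """Return all topics that link to `topic` directly or indirectly."""
    seen = set()
    queue = deque([topic])
    while queue:
        current = queue.popleft()
        for referrer in where_used.get(current, ()):
            # Skip the start topic itself and anything already visited,
            # which also keeps the search from looping on link cycles.
            if referrer != topic and referrer not in seen:
                seen.add(referrer)
                queue.append(referrer)
    return seen

# Hypothetical data: A links to B, B links to C.
where_used = {"topic-b": {"topic-a"}, "topic-c": {"topic-b"}}
print(sorted(ultimate_referrers(where_used, "topic-c")))  # ['topic-a', 'topic-b']
```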

 

At the same time, I would like to be able to quickly visualize the link graph extending from or ending in any given topic, or simply the link graph for the entire information set, which requires capturing the nodes and edges.
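
One possible way to capture nodes and edges for visualization is to emit Graphviz DOT text from the direct where-used data. This is only a sketch: the dict shape, the name `to_dot`, and the assumption that topic IDs contain no double quotes are all hypothetical.

```python
def to_dot(where_used, name="links"):
    """Render target -> direct-referrers data as a Graphviz DOT digraph."""
    lines = [f"digraph {name} {{"]
    for target in sorted(where_used):
        for source in sorted(where_used[target]):
            # One edge per direct link, pointing from the referrer to the target.
            lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical data: A links to B, B links to C.
where_used = {"topic-b": {"topic-a"}, "topic-c": {"topic-b"}}
print(to_dot(where_used))
```

To draw only the graph ending in one topic, the same function can be fed a where-used dict filtered down to the topics reachable from that topic.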

 

My question: does anyone have experience with or insight into this kind of link graph challenge, or know of relevant papers or general discussions of graph processing I might look at?

 

Thanks,

 

Eliot

_____________________________________________

Eliot Kimber

Sr Staff Content Engineer
