fbpx
Wikipedia

Pathfinder network

A method for pruning dense networks to highlight key links

Rationale edit

Relationships among a set of elements are often represented as a square matrix with entries representing the relations between all pairs of the elements. Relations such as distances, dissimilarities, similarities, relatedness, correlations, co-occurrences, conditional probabilities, etc., can be represented by such matrices. Such data can also be represented as networks with weighted links between the elements. Such matrices and networks are extremely dense and are not easily apprehended without some form of data reduction or pruning.

A pathfinder network results from applying a pruning method that removes weaker links from a (usually dense) network according to the lengths of alternative paths (see below).[1][2][3] It is used as a psychometric scaling method based on graph theory and used in the study of expertise, education,[4] knowledge acquisition, mental models,[5] and knowledge engineering. It is also employed in generating communication networks,[6] software debugging,[7] visualizing scientific citation patterns,[8] information retrieval, and other forms of data visualization.[9] Pathfinder networks are potentially applicable to any problem addressed by network theory.

Overview edit

Network pruning aims to highlight the more important links between elements represented in a network. It helps to simplify the collection of connections involved which is valuable in data visualization and in comprehending essential relations among the elements represented in the network.

Several psychometric scaling methods start from pairwise data and yield structures revealing the underlying organization of the data. Data clustering and multidimensional scaling are two such methods. Network scaling represents another method based on graph theory. Pathfinder networks are derived from matrices of data for pairs of entities. Because the algorithm uses distances, similarity data are inverted to yield dissimilarities for the computations.

In the pathfinder network, the entities correspond to the nodes of the generated network, and the links in the network are determined by the patterns of proximities. For example, if the proximities are similarities, links will generally connect nodes of high similarity. When proximities are distances or dissimilarities, links will connect the shorter distances. The links in the network will be undirected if the proximities are symmetrical for every pair of entities. Symmetrical proximities mean that the order of the entities is not important, so the proximity of i and j is the same as the proximity of j and i for all pairs i,j. If the proximities are not symmetrical for every pair, the links will be directed.

Algorithm edit

The pathfinder algorithm uses two parameters.

  1. The   parameter constrains the number of indirect proximities examined in generating the network.   is an integer between   and  , inclusive where   is the number of nodes or items. Shortest paths can have no more than   links. When  , all possible paths are included.
  2. The   parameter defines the metric used for computing the distance of paths (cf. the Minkowski distance).   is a real number between   and  , inclusive.

Path distance   is computed as:  , where   is the distance of the   link in the path and  . For  ,   is simply the sum of the distances of the links in the path. For  ,   is the maximum of the distances of the links in the path because  . A link is pruned if its distance is greater than the minimum distance of paths between the nodes connected by the link. Efficient methods for finding minimum distances include the Floyd–Warshall algorithm (for  ) and Dijkstra's algorithm (for any value of  ).

A network generated with particular values of   and   is called a  . Both of the parameters have the effect of decreasing the number of links in the network as their values are increased. The network with the minimum number of links is obtained when   and  , i.e.,  .

With ordinal-scale data (see level of measurement), the   parameter should be   because the same   would result from any positive monotonic transformation of the proximity data. Other values of   require data measured on a ratio scale. The   parameter can be varied to yield the desired number of links in the network or to focus on more local relations with smaller values of  .

Essentially, pathfinder networks preserve the shortest possible paths given the data. Therefore, links are eliminated when they are not on shortest paths. The   will be the minimum spanning tree for the links defined by the proximity data if a unique minimum spanning tree exists. In general, the   includes all of the links in any minimum spanning tree.

Example edit

Here is an example of an undirected pathfinder network derived from average ratings of a group of biology graduate students. The students rated the relatedness of all pairs of the terms shown, and the mean rating for each pair was computed. The solid blue links are the   (labeled "both" in the figure). The dotted red links are added in the  . For the added links, there are no 2-link paths shorter than the link distance but there is at least one shorter path with more than two links in the data. A minimal spanning tree would have 24 links so the 26 links in   implies that there is more than one minimum spanning tree. There are two cycles present so there are tied distances in the set of links in the cycle. Breaking each cycle would require removing one of the tied links in each cycle.

 

References edit

  1. ^ Schvaneveldt, R.W., ed. (1990). Pathfinder associative networks: Studies in knowledge organization. Norwood, NJ: Ablex. ISBN 978-0893916244.
  2. ^ Schvaneveldt, R.W.; Durso, F.T.; Dearholt, D.W. (1989). "Network structures in proximity data". In Bower, G. (ed.). The psychology of learning and motivation: Advances in research and theory (PDF). Vol. 24. New York: Academic Press. pp. 249–284.
  3. ^ Schvaneveldt, R.W.; Dearholt, D.W.; Durso, F.T. (1989). "Graph theoretic foundations of Pathfinder Networks" (PDF). Computers and Mathematics with Applications. 15: 337–345.
  4. ^ Goldsmith, T.; Johnson, P.; Acton, W. (1991). "Assessing structural knowledge". Journal of Educational Psychology. 83 (4): 88–96.
  5. ^ Kudikyala, U.K.; Vaughn, R.B. (2005). "Software requirement understanding using Pathfinder networks: discovering and evaluating mental models". Journal of System Software. 74 (1): 101–108. doi:10.1016/j.jss.2003.09.028.
  6. ^ Shope, S. M.; DeJoode, J. A.; Cooke, N. J.; Pedersen, H. (2004). "Using Pathfinder to Generate Communication Networks in a Cognitive Task Analysis". Proceedings of the Human Factors and Ergonomics Society Annual Meeting. 48 (3): 678–682. doi:10.1177/154193120404800386.
  7. ^ Serrano, E.; Quinn, A.; Botia, J.A.; Cordón, O. (2010). "Debugging complex software systems by means of pathfinder networks". Information Science. 180 (5): 561–583.
  8. ^ Chen, C.; Song (2017). "Science Mapping Tools and Applications.". Representing Scientific Knowledge. Springer. pp. 57–137.
  9. ^ Cribbin, T. (2010). "Visualising the structure of document search results: A comparison of graph theoretic approaches". Information Visualization. 9 (2): 83–97.

Other Information edit

Further information on pathfinder networks and several examples of the application of PFnets to a variety of problems can be found in the references.

  • The 1990 book is out of print. A zipped copy of pdf chapters can be downloaded.
  • The 1989 chapter in the Bower book is an article summarizing pathfinder networks and some applications. pdf

Three papers describing fast implementations of pathfinder networks:

  • Guerrero-Bote, V.; Zapico-Alonso, F.; Esinosa-Calvo, M.; Gomez-Crisostomo, R.; Moya-Anegon, F. (2006). "Binary pathfinder: An improvement to the pathfinder algorithm". Information Processing and Management. 42 (6): 1484–1490. CiteSeerX 10.1.1.378.5375. doi:10.1016/j.ipm.2006.03.015.
  • Quirin, A; Cordón, O; Santamaría, J; Vargas-Quesada, B; Moya-Anegón, F (2008). "A new variant of the Pathfinder algorithm to generate large visual science maps in cubic time". Information Processing and Management. 44 (4): 1611–1623. doi:10.1016/j.ipm.2007.09.005.
  • Quirin, A.; Cordón, O.; Guerrero-Bote, V. P.; Vargas-Quesada, B.; Moya-Anegón, F. (2008). "A Quick MST-based Algorithm to Obtain Pathfinder Networks". Journal of the American Society for Information Science and Technology. 59 (12): 1912–1924. CiteSeerX 10.1.1.331.1548. doi:10.1002/asi.20904.

(The two variants by Quirin et al. are significantly faster. While the former can be applied with   or   and any value for  , the latter can only be applied in cases where   and  .)

External links edit

  • Extensive list of papers using Pathfinder
  • Pathfinder Software Download Site
  • Implementation of the original, Binary, Fast and MST variants of the algorithm in C
  • Comparison table between the different variants

pathfinder, network, method, pruning, dense, networks, highlight, links, contents, rationale, overview, algorithm, example, references, other, information, external, linksrationale, editrelationships, among, elements, often, represented, square, matrix, with, . A method for pruning dense networks to highlight key links Contents 1 Rationale 2 Overview 3 Algorithm 4 Example 5 References 6 Other Information 7 External linksRationale editRelationships among a set of elements are often represented as a square matrix with entries representing the relations between all pairs of the elements Relations such as distances dissimilarities similarities relatedness correlations co occurrences conditional probabilities etc can be represented by such matrices Such data can also be represented as networks with weighted links between the elements Such matrices and networks are extremely dense and are not easily apprehended without some form of data reduction or pruning A pathfinder network results from applying a pruning method that removes weaker links from a usually dense network according to the lengths of alternative paths see below 1 2 3 It is used as a psychometric scaling method based on graph theory and used in the study of expertise education 4 knowledge acquisition mental models 5 and knowledge engineering It is also employed in generating communication networks 6 software debugging 7 visualizing scientific citation patterns 8 information retrieval and other forms of data visualization 9 Pathfinder networks are potentially applicable to any problem addressed by network theory Overview editNetwork pruning aims to highlight the more important links between elements represented in a network It helps to simplify the collection of connections involved which is valuable in data visualization and in comprehending essential relations among the elements represented in the network Several psychometric scaling methods start from pairwise data and yield structures revealing the underlying organization of the data Data clustering and multidimensional scaling are two such methods Network scaling represents another method based on graph theory Pathfinder networks are derived from matrices of data for pairs of entities Because the algorithm uses distances similarity data are inverted to yield dissimilarities for the computations In the pathfinder network the entities correspond to the nodes of the generated network and the links in the network are determined by the patterns of proximities For example if the proximities are similarities links will generally connect nodes of high similarity When proximities are distances or dissimilarities links will connect the shorter distances The links in the network will be undirected if the proximities are symmetrical for every pair of entities Symmetrical proximities mean that the order of the entities is not important so the proximity of i and j is the same as the proximity of j and i for all pairs i j If the proximities are not symmetrical for every pair the links will be directed Algorithm editThe pathfinder algorithm uses two parameters The q displaystyle q nbsp parameter constrains the number of indirect proximities examined in generating the network q displaystyle q nbsp is an integer between 2 displaystyle 2 nbsp and n 1 displaystyle n 1 nbsp inclusive where n displaystyle n nbsp is the number of nodes or items Shortest paths can have no more than q displaystyle q nbsp links When q n 1 displaystyle q n 1 nbsp all possible paths are included The r displaystyle r nbsp parameter defines the metric used for computing the distance of paths cf the Minkowski distance r displaystyle r nbsp is a real number between 1 displaystyle 1 nbsp and displaystyle infty nbsp inclusive Path distance d p displaystyle d p nbsp is computed as d p i 1 k l i r 1 r displaystyle d p sum i 1 k l i r 1 r nbsp where l i displaystyle l i nbsp is the distance of the i t h displaystyle ith nbsp link in the path and 2 k q displaystyle 2 leq k leq q nbsp For r 1 displaystyle r 1 nbsp d p displaystyle d p nbsp is simply the sum of the distances of the links in the path For r displaystyle r infty nbsp d p displaystyle d p nbsp is the maximum of the distances of the links in the path because lim r d p max i 1 k l i displaystyle lim r rightarrow infty d p max i 1 k l i nbsp A link is pruned if its distance is greater than the minimum distance of paths between the nodes connected by the link Efficient methods for finding minimum distances include the Floyd Warshall algorithm for q n 1 displaystyle q n 1 nbsp and Dijkstra s algorithm for any value of q displaystyle q nbsp A network generated with particular values of q displaystyle q nbsp and r displaystyle r nbsp is called a P F N e t q r displaystyle PFNet q r nbsp Both of the parameters have the effect of decreasing the number of links in the network as their values are increased The network with the minimum number of links is obtained when q n 1 displaystyle q n 1 nbsp and r displaystyle r infty nbsp i e P F N e t n 1 displaystyle PFNet n 1 infty nbsp With ordinal scale data see level of measurement the r displaystyle r nbsp parameter should be displaystyle infty nbsp because the same P F N e t displaystyle PFNet nbsp would result from any positive monotonic transformation of the proximity data Other values of r displaystyle r nbsp require data measured on a ratio scale The q displaystyle q nbsp parameter can be varied to yield the desired number of links in the network or to focus on more local relations with smaller values of q displaystyle q nbsp Essentially pathfinder networks preserve the shortest possible paths given the data Therefore links are eliminated when they are not on shortest paths The P F N e t n 1 displaystyle PFNet n 1 infty nbsp will be the minimum spanning tree for the links defined by the proximity data if a unique minimum spanning tree exists In general the P F N e t n 1 displaystyle PFNet n 1 infty nbsp includes all of the links in any minimum spanning tree Example editHere is an example of an undirected pathfinder network derived from average ratings of a group of biology graduate students The students rated the relatedness of all pairs of the terms shown and the mean rating for each pair was computed The solid blue links are the P F N e t n 1 displaystyle PFNet n 1 infty nbsp labeled both in the figure The dotted red links are added in the P F N e t 2 displaystyle PFNet 2 infty nbsp For the added links there are no 2 link paths shorter than the link distance but there is at least one shorter path with more than two links in the data A minimal spanning tree would have 24 links so the 26 links in P F N e t n 1 displaystyle PFNet n 1 infty nbsp implies that there is more than one minimum spanning tree There are two cycles present so there are tied distances in the set of links in the cycle Breaking each cycle would require removing one of the tied links in each cycle nbsp References edit Schvaneveldt R W ed 1990 Pathfinder associative networks Studies in knowledge organization Norwood NJ Ablex ISBN 978 0893916244 Schvaneveldt R W Durso F T Dearholt D W 1989 Network structures in proximity data In Bower G ed The psychology of learning and motivation Advances in research and theory PDF Vol 24 New York Academic Press pp 249 284 Schvaneveldt R W Dearholt D W Durso F T 1989 Graph theoretic foundations of Pathfinder Networks PDF Computers and Mathematics with Applications 15 337 345 Goldsmith T Johnson P Acton W 1991 Assessing structural knowledge Journal of Educational Psychology 83 4 88 96 Kudikyala U K Vaughn R B 2005 Software requirement understanding using Pathfinder networks discovering and evaluating mental models Journal of System Software 74 1 101 108 doi 10 1016 j jss 2003 09 028 Shope S M DeJoode J A Cooke N J Pedersen H 2004 Using Pathfinder to Generate Communication Networks in a Cognitive Task Analysis Proceedings of the Human Factors and Ergonomics Society Annual Meeting 48 3 678 682 doi 10 1177 154193120404800386 Serrano E Quinn A Botia J A Cordon O 2010 Debugging complex software systems by means of pathfinder networks Information Science 180 5 561 583 Chen C Song 2017 Science Mapping Tools and Applications Representing Scientific Knowledge Springer pp 57 137 Cribbin T 2010 Visualising the structure of document search results A comparison of graph theoretic approaches Information Visualization 9 2 83 97 Other Information editFurther information on pathfinder networks and several examples of the application of PFnets to a variety of problems can be found in the references The 1990 book is out of print A zipped copy of pdf chapters can be downloaded The 1989 chapter in the Bower book is an article summarizing pathfinder networks and some applications pdf Three papers describing fast implementations of pathfinder networks Guerrero Bote V Zapico Alonso F Esinosa Calvo M Gomez Crisostomo R Moya Anegon F 2006 Binary pathfinder An improvement to the pathfinder algorithm Information Processing and Management 42 6 1484 1490 CiteSeerX 10 1 1 378 5375 doi 10 1016 j ipm 2006 03 015 Quirin A Cordon O Santamaria J Vargas Quesada B Moya Anegon F 2008 A new variant of the Pathfinder algorithm to generate large visual science maps in cubic time Information Processing and Management 44 4 1611 1623 doi 10 1016 j ipm 2007 09 005 Quirin A Cordon O Guerrero Bote V P Vargas Quesada B Moya Anegon F 2008 A Quick MST based Algorithm to Obtain Pathfinder Networks Journal of the American Society for Information Science and Technology 59 12 1912 1924 CiteSeerX 10 1 1 331 1548 doi 10 1002 asi 20904 The two variants by Quirin et al are significantly faster While the former can be applied with q 2 displaystyle q 2 nbsp or q n 1 displaystyle q n 1 nbsp and any value for r displaystyle r nbsp the latter can only be applied in cases where q n 1 displaystyle q n 1 nbsp and r displaystyle r infty nbsp External links editExtensive list of papers using Pathfinder Pathfinder Software Download Site Implementation of the original Binary Fast and MST variants of the algorithm in C Comparison table between the different variants Retrieved from https en wikipedia org w index php title Pathfinder network amp oldid 1190614990, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.