Intra-firm information flow: a content-structure perspective Conference Paper uri icon


  • This paper endeavors to bring together two largely disparate areas of research. On one hand, text mining methods treat each document as an independent instance despite the fact that in many text domains, documents are linked and their topics are correlated. For example, web pages of related topics are often connected by hyperlinks and scientific papers from related fields are typically linked by citations. On the other hand, Social Network Analysis (SNA) typically treats edges between nodes according to” flat” attributes in binary form alone. This paper proposes a simple approach that addresses both these issues in data mining scenarios involving corpora of linked documents. According to this approach, after assigning weights to the edges between documents, based on the content of the documents associated with each edge, we apply standard SNA and network theory tools …

publication date

  • October 29, 2011