Sampling heterogeneous networks

Cheng Lun Yang, Perng Hwa Kung, Cheng Te Li, Chun An Chen, Shou De Lin

Research output: Contribution to journal > Conference article

1 Citation (Scopus)

Abstract

Online social networks are mainly characterized by large-scale and heterogeneous semantic relationships. Unfortunately, for online social network services such as Facebook or Twitter, it is very difficult to obtain the fully observed network without privileged internal access to the data. To address this need, social network sampling aims to identify a representative subgraph that preserves certain properties of the network, given that no information about any instance in the network is known before it is sampled. This study tackles heterogeneous network sampling by considering the conditional dependency of node types and link types; we design a property, the Relational Profile, to account for this characterization, and we further propose a sampling method that preserves this property. Lastly, we evaluate our model from three different angles. First, we show that the proposed sampling method can more faithfully preserve the Relational Profile. Second, we evaluate the usefulness of the Relational Profile, showing that such information is beneficial for link prediction tasks. Finally, we evaluate whether the networks sampled by our method can be used to train more accurate prediction models compared to networks produced by other methods.
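
The abstract does not give the formal definition of the Relational Profile, so the following is only a minimal sketch under a stated assumption: that such a profile can be approximated by the empirical conditional distribution of (link type, neighbor type) given a node's type. The function names `relational_profile` and `profile_distance`, the toy schema, and the use of total-variation distance are all hypothetical choices for illustration, not the authors' method.

```python
from collections import Counter, defaultdict


def relational_profile(node_types, edges):
    """Estimate P(link type, neighbor type | node type) from typed edges.

    node_types: dict mapping node id -> node type (hypothetical schema).
    edges: iterable of (u, v, link_type) tuples, treated as undirected.
    This is an assumed stand-in, not the paper's definition.
    """
    counts = defaultdict(Counter)
    for u, v, link_type in edges:
        # Count each edge once from each endpoint's point of view.
        counts[node_types[u]][(link_type, node_types[v])] += 1
        counts[node_types[v]][(link_type, node_types[u])] += 1
    profile = {}
    for src_type, counter in counts.items():
        total = sum(counter.values())
        profile[src_type] = {key: n / total for key, n in counter.items()}
    return profile


def profile_distance(p, q):
    """Sum of total-variation distances between two profiles, over node types."""
    dist = 0.0
    for src_type in set(p) | set(q):
        p_row, q_row = p.get(src_type, {}), q.get(src_type, {})
        keys = set(p_row) | set(q_row)
        dist += 0.5 * sum(abs(p_row.get(k, 0.0) - q_row.get(k, 0.0)) for k in keys)
    return dist


# Toy heterogeneous network: two node types and two link types.
node_types = {"a": "user", "b": "user", "c": "page", "d": "page"}
edges = [
    ("a", "b", "friend"),
    ("a", "c", "like"),
    ("b", "c", "like"),
    ("b", "d", "like"),
]

full = relational_profile(node_types, edges)
# A crude "sample" made of the first two edges only.
sample = relational_profile(node_types, edges[:2])
print(profile_distance(full, sample))
```

Under this assumption, a sampler that preserves the property would be one whose sampled subgraph keeps `profile_distance` to the full network small; the distance measure here is one possible choice among many.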

Original language: English
Article number: 6729629
Pages (from-to): 1247-1252
Number of pages: 6
Journal: Proceedings - IEEE International Conference on Data Mining, ICDM
DOI: 10.1109/ICDM.2013.102
Publication status: Published - 2013 Dec 1
Event: 13th IEEE International Conference on Data Mining, ICDM 2013 - Dallas, TX, United States
Duration: 2013 Dec 7 - 2013 Dec 10

Fingerprint

Heterogeneous networks
Sampling
Semantics

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Cite this

Yang, Cheng Lun ; Kung, Perng Hwa ; Li, Cheng Te ; Chen, Chun An ; Lin, Shou De. / Sampling heterogeneous networks. In: Proceedings - IEEE International Conference on Data Mining, ICDM. 2013 ; pp. 1247-1252.
@article{28e466a1a9fd494385da07bfde79eaa9,
title = "Sampling heterogeneous networks",
author = "Yang, {Cheng Lun} and Kung, {Perng Hwa} and Li, {Cheng Te} and Chen, {Chun An} and Lin, {Shou De}",
year = "2013",
month = "12",
day = "1",
doi = "10.1109/ICDM.2013.102",
language = "English",
pages = "1247--1252",
journal = "Proceedings - IEEE International Conference on Data Mining, ICDM",
issn = "1550-4786",
}

Sampling heterogeneous networks. / Yang, Cheng Lun; Kung, Perng Hwa; Li, Cheng Te; Chen, Chun An; Lin, Shou De.

In: Proceedings - IEEE International Conference on Data Mining, ICDM, 01.12.2013, p. 1247-1252.

TY - JOUR

T1 - Sampling heterogeneous networks

AU - Yang, Cheng Lun

AU - Kung, Perng Hwa

AU - Li, Cheng Te

AU - Chen, Chun An

AU - Lin, Shou De

PY - 2013/12/1

Y1 - 2013/12/1

UR - http://www.scopus.com/inward/record.url?scp=84894683410&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84894683410&partnerID=8YFLogxK

U2 - 10.1109/ICDM.2013.102

DO - 10.1109/ICDM.2013.102

M3 - Conference article

AN - SCOPUS:84894683410

SP - 1247

EP - 1252

JO - Proceedings - IEEE International Conference on Data Mining, ICDM

JF - Proceedings - IEEE International Conference on Data Mining, ICDM

SN - 1550-4786

M1 - 6729629

ER -