graphslim.dataset package

graphslim.dataset.attack module

graphslim.dataset.attack.attack(data, args)[source]

graphslim.dataset.convertor module

graphslim.dataset.convertor.csr2ei(adjacency_matrix_csr)[source]
graphslim.dataset.convertor.dense2sparsetensor(mat: torch.Tensor, has_value: bool = True)[source]
graphslim.dataset.convertor.ei2csr(edge_index, num_nodes)[source]
graphslim.dataset.convertor.from_dgl(g, name, hetero=True)[source]
graphslim.dataset.convertor.loadSparseGraph(dataset_name)[source]

Load the original graph from file (from the paper CHEN Y, YE H, VEDULA S, et al. Demystifying graph sparsification algorithms in graph properties preservation[M/OL]).

The GraphSlim package supports only undirected graphs and does not distinguish between weighted and unweighted graphs. Pipeline: pyg -> nt -> save sparsified nt -> pyg -> evaluation.

Parameters:
  • dataset_name (str) – dataset name

  • config (dict) – config loaded from json

  • undirected_only (bool, optional) – Set to True to override the graph directedness in the config file and load an undirected graph only. Defaults to False. This is used for sparsifiers that support only undirected graphs.

Returns:

original graph

Return type:

nk graph

graphslim.dataset.convertor.networkit_to_pyg(graph)[source]
graphslim.dataset.convertor.pyg2gsp(edge_index)[source]
graphslim.dataset.convertor.pyg_to_networkit(pyg_graph)[source]

graphslim.dataset.loader module

class graphslim.dataset.loader.DataGraphSAINT(root, dataset, **kwargs)[source]

Bases: object

Datasets used in the GraphSAINT paper.

get(idx)[source]
process_labels(class_map)[source]

Set up the vertex property map for output classes.

class graphslim.dataset.loader.LargeDataLoader(*args: Any, **kwargs: Any)[source]

Bases: Module

GCF(adj, x, k=2)[source]

Graph convolution filter.

Parameters:
  • adj (torch.Tensor) – adjacency matrix, must be self-looped

  • x (torch.Tensor) – features

  • k (int) – number of hops

Returns:

torch.Tensor, filtered features
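
The description above does not say how the adjacency matrix is normalized, so the following is only a minimal sketch of a k-hop graph convolution filter. It assumes a dense, symmetric adjacency matrix with self-loops and symmetric normalization D^{-1/2} A D^{-1/2}; the name gcf_sketch is hypothetical and is not the library's implementation.

```python
import torch


def gcf_sketch(adj: torch.Tensor, x: torch.Tensor, k: int = 2) -> torch.Tensor:
    """Hypothetical k-hop filter: multiply x by the normalized adjacency k times."""
    # Degrees are strictly positive because adj is assumed to contain self-loops.
    deg = adj.sum(dim=1)
    d_inv_sqrt = deg.pow(-0.5)
    # Symmetric normalization (an assumption, not confirmed by the source).
    adj_norm = d_inv_sqrt.unsqueeze(1) * adj * d_inv_sqrt.unsqueeze(0)
    out = x
    for _ in range(k):
        out = adj_norm @ out  # one hop of feature propagation
    return out


# Path graph on 3 nodes, self-loops already added.
adj = torch.tensor([[1.0, 1.0, 0.0],
                    [1.0, 1.0, 1.0],
                    [0.0, 1.0, 1.0]])
x = torch.eye(3)
filtered = gcf_sketch(adj, x, k=2)
```

With k=0 the sketch returns the features unchanged; larger k mixes each node's features with those of nodes up to k hops away.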

get_batch(i)[source]
getitem(idx)[source]

For a given idx, return the corresponding node_features, labels, and sub-adjacency matrix.

normalize_data(data)[source]

Normalize data.

Parameters:
  • data (torch.Tensor) – data to be normalized

Returns:

torch.Tensor, normalized data

properties()[source]
split_batch()[source]

Split data into batches.

Parameters:
  • split_method (str) – method used to split the data; default is ‘kmeans’

class graphslim.dataset.loader.OgbDataLoader(*args: Any, **kwargs: Any)[source]

Bases: Module

class graphslim.dataset.loader.TransAndInd(data, dataset, norm=True)[source]

Bases: object

pyg_saint(data)[source]
reset()[source]
retrieve_class(c, num=256)[source]
retrieve_class_sampler(c, adj, args, num=256)[source]
to(device)[source]

Move data to the specified device.

graphslim.dataset.loader.get_dataset(name='cora', args=None, load_path='./data')[source]

graphslim.dataset.utils module

graphslim.dataset.utils.canonical_label_to_naturals(dataset_labels)[source]
graphslim.dataset.utils.disjointed_union(tree_list, class_=None, device=None)[source]

Computes the disjoint union of the trees in tree_list. Trees are in torch_geometric.data.Data format. Returns a torch_geometric.data.Data graph without the roots_to_embed information.
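
As an illustration of what a disjoint union of graphs involves, here is a minimal sketch in plain PyTorch. It represents each tree as a hypothetical (x, edge_index) pair rather than a torch_geometric.data.Data object, and the function name disjoint_union_sketch is an assumption; the library's handling of class_, device, and roots_to_embed is not reproduced.

```python
import torch


def disjoint_union_sketch(graphs):
    """Concatenate node features and shift edge indices so node ids never collide.

    graphs: list of (x, edge_index) pairs, a hypothetical stand-in for Data objects.
    """
    xs, edge_indices, offset = [], [], 0
    for x, edge_index in graphs:
        xs.append(x)
        # Shift this graph's node ids past all nodes of the earlier graphs.
        edge_indices.append(edge_index + offset)
        offset += x.size(0)
    return torch.cat(xs, dim=0), torch.cat(edge_indices, dim=1)


# Two small trees: one edge (0-1), and a star with edges (0-1), (0-2).
t1 = (torch.ones(2, 4), torch.tensor([[0], [1]]))
t2 = (torch.zeros(3, 4), torch.tensor([[0, 0], [1, 2]]))
x, edge_index = disjoint_union_sketch([t1, t2])
```

The second tree's nodes are renumbered 2, 3, 4 in the result, so the two trees remain separate connected components.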

graphslim.dataset.utils.get_data(node_label_map, edge_label_map, node_label_map_original, node_label_map_orig, edge_label_map_orig, node_label_map_full, class_)[source]
graphslim.dataset.utils.get_dataloader(dataset_classwise, **kwargs)[source]
graphslim.dataset.utils.get_invalid_trees(tree_class_count)[source]
graphslim.dataset.utils.get_label_maps(dataset)[source]
graphslim.dataset.utils.get_syn_data(data, args, model_type, verbose=False)[source]

Loads or computes synthetic data for evaluation.

Parameters:
  • data (Dataset) – The dataset containing the graph data.

  • model_type (str) – The type of model used for generating synthetic data.

  • verbose (bool, optional, default=False) – Whether to print detailed logs.

Returns:

  • feat_syn (torch.Tensor) – Synthetic feature matrix.

  • adj_syn (torch.Tensor) – Synthetic adjacency matrix.

  • labels_syn (torch.Tensor) – Synthetic labels.

graphslim.dataset.utils.index2mask(index, size)[source]

Convert an index list to a boolean mask.

Parameters:
  • index (list or tensor) – List or tensor of indices to be set to True.

  • size (int or tuple of int) – Shape of the mask. If an integer, the mask is 1-dimensional.

Returns:

mask – A boolean tensor of the specified size, with True at the given index positions and False elsewhere.

Return type:

tensor

Examples

>>> index = [0, 2, 4]
>>> size = 5
>>> index2mask(index, size)
tensor([True, False, True, False, True], dtype=torch.bool)
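
For readers who want to see the idea behind the conversion, a minimal sketch is shown here. It assumes the mask starts as all False and that the given positions are then set to True; index2mask_sketch is a hypothetical name and may differ from the library's actual code.

```python
import torch


def index2mask_sketch(index, size):
    """Hypothetical re-implementation: boolean mask that is True at `index`."""
    mask = torch.zeros(size, dtype=torch.bool)  # size may be an int or a tuple
    mask[torch.as_tensor(index, dtype=torch.long)] = True
    return mask


mask = index2mask_sketch([0, 2, 4], 5)
```

When size is a tuple, indexing applies along the first dimension, so whole rows are set to True in this sketch.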

graphslim.dataset.utils.load_reduced(args, data=None)[source]

class graphslim.dataset.utils.myDataset(*args: Any, **kwargs: Any)[source]

Bases: Dataset

get(idx)[source]
len()[source]

graphslim.dataset.utils.parse_canonical_label(label)[source]
graphslim.dataset.utils.parse_canonical_label_bak(label)[source]
graphslim.dataset.utils.preprocess_dataset(dataset, full_dataset)[source]
graphslim.dataset.utils.preprocess_dataset_test(dataset, node_label_map, edge_label_map, node_label_map_full)[source]
graphslim.dataset.utils.prettify_canonical_label(label)[source]
graphslim.dataset.utils.process_labels(dataset, node_label_map, node_label_map_full)[source]
graphslim.dataset.utils.pyfpgrowth_wrapper(classwise, freq_thresholds)[source]
graphslim.dataset.utils.roots_to_embed(data)[source]
graphslim.dataset.utils.save_reduced(adj_syn=None, feat_syn=None, labels_syn=None, args=None)[source]
graphslim.dataset.utils.sparsify(model_type, adj_syn, args, verbose=False)[source]

Applies sparsification to the adjacency matrix based on the model type and given arguments.

This function modifies the adjacency matrix to make it sparser according to the model type and method specified. For specific methods and datasets, it adjusts the threshold used for sparsification.

Parameters:
  • model_type (str) – The type of model used, which determines the sparsification strategy. Can be ‘MLP’, ‘GAT’, or other.

  • adj_syn (torch.Tensor) – The adjacency matrix to be sparsified.

  • args (argparse.Namespace) – Command-line arguments and configuration parameters, which may include method-specific settings.

  • verbose (bool, optional) – If True, prints information about the sparsity of the adjacency matrix before and after sparsification. Default is False.

Returns:

adj_syn – The sparsified adjacency matrix.

Return type:

torch.Tensor
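
The documentation mentions a threshold but does not give the rule or its values, so the following is only a sketch of threshold-based sparsification on a dense adjacency matrix. The function name threshold_sparsify_sketch, the default threshold of 0.5, and the ">= threshold keeps the entry" rule are all assumptions for illustration; the per-model and per-dataset threshold selection done by the library is not reproduced.

```python
import torch


def threshold_sparsify_sketch(adj_syn: torch.Tensor, threshold: float = 0.5,
                              verbose: bool = False) -> torch.Tensor:
    """Hypothetical sparsifier: zero out entries below `threshold`."""
    # Fraction of nonzero entries before pruning.
    before = (adj_syn != 0).float().mean().item()
    sparse = torch.where(adj_syn >= threshold, adj_syn, torch.zeros_like(adj_syn))
    if verbose:
        after = (sparse != 0).float().mean().item()
        print(f"density before: {before:.2f}, after: {after:.2f}")
    return sparse


adj = torch.tensor([[0.9, 0.2],
                    [0.2, 0.7]])
sparse = threshold_sparsify_sketch(adj, threshold=0.5)
```

Entries that survive keep their original weights in this sketch; a variant could binarize them instead, depending on what the downstream model expects.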

graphslim.dataset.utils.splits(data, exp='default')[source]
graphslim.dataset.utils.tree_class_ctr(classes, dataset)[source]