CanaDAM 2017
Université Ryerson, 12 - 16 juin 2017

Graphs and Data

The Combinatorial Data Fusion Problem  [PDF]

Different data sources place incompatible relational structures on a set $V$. In credit fraud, $V$ may be merchants, addresses, credit-card accounts, and user profiles. A data fusion algorithm achieves global consistency by cutting certain relationships.

Define two combinatorial structures on vertex set $V$:


\item A connected edge-weighted graph $G = (V,E_0, w)$, weights $w:E_0\rightarrow (0, \infty)$.

\item A hypergraph $(V, \mathcal{F})$ where $\mathcal{F} \cap E_0 = \emptyset$, with no singletons, no $F \in \mathcal{F}$ a proper subset of another. Call $\mathcal{F}$ the {\bf forbidden sets}.


Any $E_1 \subseteq E_0$ induces a partition of $V$ into disjoint graph components. The \textbf{data fusion problem} is to find $E_1 \subset E_0$ of maximum total weight such that no graph component of $(V, E_1)$ contains a forbidden set.

Multicut and multiway cut are special cases where $|F|=2$, $\forall F \in \mathcal{F}$. We present some exact algorithms where applicable, and approximation algorithms otherwise.

KAMAL GUPTA, Indian Institute of Technology Roorkee, India
Fractal Modelling of Earthquake Sequence Information using Iterative Function Systems  [PDF]

Kamal and Ekansh Gupta

Indian Institute of Technology Roorkee, Roorkee, India. email:;\\

We present an efficient method to process and analyze information hidden in earthquake sequences around the globe using Iterative Function Systems (IFS) of a dynamic process. Data created from earthquakes observed around the world in last 30 years are used to generate IFS images for several parameters like magnitude and depth of earthquakes. The entire data range is scanned, using a mathematical method called coarse graining, in four bins and selecting a specific data window. Since Earth is a dynamic system, the variations in the patterns manifest interesting fractal patterns. A four cornered chaos-game was analyzed and definite patterns like sierpinski gasket and straight line are seen indicating preferences in earthquake sequences over a particular magnitude or depth range. The process may be automatize for information processing.

RUCHA JOSHI, Westwood High School
Method and Complexity of Finding Spatially Correlated Fault Zones in Dynamic 3D Networks  [PDF]

In dynamic 3D networks, the connection between all nodes is essential as it ensures proper quality of service. The coverage is modeled by the wireless spherical range of each node that will connect with its neighbors if within range. A 3D spatially correlated fault zone is the spherical volume with center (x, y, z) and radius ‘r’ for which the removal of nodes in the region leads to a decrease in network connectivity. Obtaining the fault zones using the smallest number of deterministic and placed ‘spheres’ is desired. This can identify potential vulnerabilities in the network and be used defensively to strengthen the network or offensively to cause maximum damage. In this paper we find a polynomial number of such spatially correlated faults and their locations in polynomial time to cover any ‘n’ randomly placed nodes in 3D space, when the radius of the fault zone is given.

REBECCA KOTSONIS, Wake Forest University
A New Look at Clustering Coefficients with Generalization to Weighted and Multi-Faction Networks  [PDF]

In this talk we propose a new method for studying local and global clustering in networks employing random walk pairs. The method is intuitive and directly generalizes standard local and global clustering coefficients to weighted networks and networks containing nodes of multiple types. In the case of two-mode networks, the values obtained for commonly considered social networks are in stark contrast to those obtained previously and provide a different viewpoint for clustering. The approach is also applicable in questions related to the study of segregation and homophily, in a general sense. Applications to existent data sets are considered.

XITENG LIU, Advanced Micro Devices, Inc
Essential Data Elements  [PDF]

In tradition, equivalent data matrices or ordered data sets must be in the same size. We introduce a new computing method that breaches the tradition and builds equivalence between data sets in different sizes. Essential data elements are extracted from data sets. And The full data sets can be recovered from the essential elements. Both extraction and recovery may be accomplished in real time. Equivalence between data sets and their essential subsets is thus realized by the invertible operations. This equivalence changes traditional concepts and methodologies in mathematics, data science and data processing engineering. The new method may be extensively applied to image and video technologies, medical imaging, remote sensing, wireless communication and so on. It greatly improves efficiency of data acquisition, compression and visualization, especially in resolution scalable applications. It makes revolutionary improvement in technical performance than existing methods. Demo software and testing data are downloadable at


Atlantic Association for Research in the Mathematical Sciences Centre de recherches mathmatiques The Fields Institute Pacific Institute for the Mathematical Sciences Socit mathmatique du Canada Université Ryerson Office of Naval Research Science and Technology