Great recent article in UDaily about our joint project with Argonne National Lab, JPMorgan Chase, Fujitsu and Menten AI. We discuss the obstacles and opportunities of quantum computing in finance.
We solve algorithmic, modeling and computational problems in AI, machine learning, quantum computing, network science, graphs, combinatorial scientific computing and more.
Tuesday, March 12, 2024
Thursday, January 4, 2024
Benchmarking Biomedical Literature-based Discovery and Hypothesis Generation Systems
Update: The paper is accepted for publication in BMC Bioinformatics.
We introduce a benchmarking framework Dyport for evaluating biomedical hypothesis generation (HG) and literature based discovery (LBD) systems. The evaluation of HG and LBD is still one of the major problems of these systems, especially when it comes to fully automated large-scale general purpose systems. For these, a massive assessment (that is normal in the machine learning and general AI domains) is often infeasible. One traditional evaluation approach is to make a system “rediscover” some of the landmark findings. However, this approach does not scale. Another traditional approach is to automatically discover some information in biomedical texts, train the system on historical data and test it on that "future" information. While this approach does scale well, the reliability and biomedical importance of the extracted test set are far from being illuminating.
Utilizing curated datasets, Dyport tests HG/LBD systems under realistic conditions, enhancing the relevance of our evaluations. We integrate knowledge from the curated databases into a dynamic graph, accompanied by a method to quantify discovery importance. This not only assesses hypothesis accuracy but also their potential impact in biomedical research which significantly extends traditional link prediction benchmarks. Applicability of our benchmarking process is demonstrated on several link prediction systems applied on biomedical semantic knowledge graphs. Being flexible, our benchmarking system is designed for broad application in hypothesis generation quality verification, aiming to expand the scope of scientific discovery within the biomedical research community.
Dyport is available at https://github.com/IlyaTyagin/Dyport
Paper: https://arxiv.org/pdf/2312.03303.pdf
Thursday, December 14, 2023
Ankit Kulshrestha's PhD Defense
Congratulations to Dr. Ankit Kulshrestha for successfully defending his Ph.D. thesis "A Machine Learning Approach To Improve Scalability and Robustness of Variational Quantum Circuits"! Ankit will join the quantum computing research group at Fujitsu Research USA.
Friday, September 1, 2023
Welcoming our new PhD students!
We're pleased to share that our group has expanded with the addition of two PhD students.
Bao Bach, joining from the Quantum Science and Engineering program, will be working on quantum machine learning and variational quantum algorithms.
Saeideh Valipour has started her journey with us through the Computer Science PhD program. Her research areas include text mining, machine learning, and biomedical data analysis.
Let's extend a warm welcome to Bao and Saeideh as they begin their research endeavors with our group.
Tuesday, August 29, 2023
Two papers are accepted at IEEE High-Performance and Extreme Computing
Monday, July 10, 2023
Learning To Optimize Quantum Neural Network Without Gradients
Our paper is accepted in IEEE Quantum Computing and Engineering (QCE) 2023. The preprint is available at https://arxiv.org/abs/2304.07442
Congratulations to PhD student Ankit Kulshrestha and two former students from our lab, now quantum computing research scientists at Fujitsu Research, Xiaoyuan Liu and Hyato Ushijima-Mwesigwa!
Quantum Machine Learning is an emerging sub-field in machine learning where one of the goals is to perform pattern recognition tasks by encoding data into quantum states. This extension from classical to quantum domain has been made possible due to the development of hybrid quantum-classical algorithms that allow a parameterized quantum circuit to be optimized using gradient based algorithms that run on a classical computer. The similarities in training of these hybrid algorithms and classical neural networks has further led to the development of Quantum Neural Networks (QNNs). However, in the current training regime for QNNs, the gradients w.r.t objective function have to be computed on the quantum device. This computation is highly non-scalable and is affected by hardware and sampling noise present in the current generation of quantum hardware. In this paper, we propose a training algorithm that does not rely on gradient information. Specifically, we introduce a novel meta-optimization algorithm that trains a meta-optimizer network to output parameters for the quantum circuit such that the objective function is minimized. We empirically and theoretically show that we achieve a better quality minima in fewer circuit evaluations than existing gradient based algorithms on different datasets.
Thursday, June 1, 2023
Accepted paper in Nature Physics Reviews
Monday, May 1, 2023
Graduate student achievement
Congratulations to our PhD student Ankit Kulshrestha for receiving Rama and Shashi Marda fellowship for graduate studies! Ankit's research is in the areas of quantum and classical machine learning.
Saturday, April 1, 2023
Biomedical hypothesis generation and literature-based discovery for landscape design
It is incredibly exciting to see our work transcend the bounds of its intended domain and find novel applications in unexpected areas. We initially developed AGATHA, a biomedical hypothesis generation system, to facilitate knowledge discovery within biomedical sciences. Yet, to our surprise and excitement, AGATHA's algorithm found an unanticipated yet transformative use in landscape design. AGATHA's unique ability to identify intricate conceptual relationships was exploited to understand the nexus between emerging infectious diseases (EIDs) and deforestation, enabling landscape planners to tread innovative research pathways.
Our paper is available at arXiv https://arxiv.org/abs/2306.02588
David Marasco, Ilya Tyagin, Justin Sybrandt, James H. Spencer, Ilya Safro "Literature-based Discovery for Landscape Planning"
Sunday, January 15, 2023
Generating and optimizing water networks
Our paper is accepted in the Journal of Pipeline Systems Engineering!
Ahmad Momeni, Varsha Chauhan, Abdulrahman Bin Mahmoud, Kalyan Piratla, Ilya Safro "Generation of Synthetic Water Distribution Data Using a Multi-Scale Generator-Optimizer", Journal of Pipeline Systems Engineering and Practice, vol. 14(1), https://ascelibrary.org/doi/full/10.1061/JPSEA2.PSENG-1358, 2023
Rare or limited access to real-world data has widely been a stumbling block for the development and employment of design optimization and simulation models in water distribution systems (WDS). Primary reasons for such accessibility issues could include data unavailability and high security protocols. Synthetic data can play a major role as a reliable alternative to mimic and replicate real-world WDS for modeling purposes. This study puts forth a comprehensive approach to generate synthetic WDS infrastructural data by (1) employing graph-theory concepts to generate multitudinous WDS skeleton layouts through retaining the critical topological features of a given real WDS; and (2) assigning component sizes and operational features such as nodal demands, pump curves, pipe sizes, and tank elevations to the generated WDS skeleton layouts through a multiobjective genetic algorithm (GA)–based design optimization scheme. Thousands of such generated-optimized networks are statistically analyzed in terms of the fundamental WDS characteristics both collectively and granularly. An outstanding novelty in this study includes an automatedly integrated algorithmic function that attempts to (1) simultaneously optimize the generated network in a biobjective scheme, (2) rectify pipe intersections that violate pipeline embedding standards, and (3) correct the unusual triangular loops in the generator by honoring the conventional square-shaped loop connectivity in a WDS. The proposed modeling approach was demonstrated in this study using the popular Anytown water distribution benchmark system. Generation and optimization of such representative synthetic networks pave the way for extensive access to representative case-study models for academic and industrial purposes while the security of the real-world infrastructure data is not compromised.
Friday, September 30, 2022
Optimal Contraction Trees for Tensor Network Quantum Circuit Simulation
We are excited to announce that our paper received the best student paper award at #IEEE High Performance Extreme Computing #HPEC2022! Leading student is Cameron Ibrahim. Congratulations to all!
Cameron Ibrahim, Danylo Lykov, Zichang He, Yuri Alexeev, Ilya Safro "Constructing Optimal Contraction Trees for Tensor Network Quantum Circuit Simulation", 2022, preprint at https://arxiv.org/abs/2209.02895
One of the key problems in tensor network based quantum circuit simulation is the construction of a contraction tree which minimizes the cost of the simulation, where the cost can be expressed in the number of operations as a proxy for the simulation running time. This same problem arises in a variety of application areas, such as combinatorial scientific computing, marginalization in probabilistic graphical models, and solving constraint satisfaction problems. In this paper, we reduce the computationally hard portion of this problem to one of graph linear ordering, and demonstrate how existing approaches in this area can be utilized to achieve results up to several orders of magnitude better than existing state of the art methods for the same running time. To do so, we introduce a novel polynomial time algorithm for constructing an optimal contraction tree from a given order. Furthermore, we introduce a fast and high quality linear ordering solver, and demonstrate its applicability as a heuristic for providing orderings for contraction trees. Finally, we compare our solver with competing methods for constructing contraction trees in quantum circuit simulation on a collection of randomly generated Quantum Approximate Optimization Algorithm Max Cut circuits and show that our method achieves superior results on a majority of tested quantum circuits.
Thursday, July 14, 2022
Recently accepted papers in optimization and quantum and quantum inspired computing.
- Xiaoyuan Liu, Hayato Ushijima-Mwesigwa, Indradeep Ghosh, Ilya Safro "Partitioning Dense Graphs with Hardware Accelerators", International Conference on Computational Science (ICCS), 2022, preprint at https://arxiv.org/pdf/2202.09420.pdf
- Ankit Kulshrestha, Ilya Safro "BEINIT: Avoiding Barren Plateaus in Variational Quantum Algorithms", IEEE Quantum Computing and Engineering (QCE), 2022, preprint at https://arxiv.org/abs/2204.13751
- Xiaoyuan Liu, Ruslan Shaydulin, Ilya Safro "Quantum Approximate Optimization Algorithm with Sparsified Phase Operator", IEEE Quantum Computing and Engineering (QCE), 2022, preprint at https://arxiv.org/abs/2205.00118
- Xiaoyuan Liu, Hayato Ushijima-Mwesigwa, Avradip Mandal, Sarvagya Upadhyay, Ilya Safro, Arnab Roy "Leveraging Special-Purpose Hardware for Local Search Heuristics", Computational Optimization and Applications, 2022, preprint at https://arxiv.org/pdf/1911.09810.pdf
Wednesday, June 1, 2022
Quantum Computing for Finance
Quantum computers are expected to surpass the computational capabilities of classical computers and have transformative impact on numerous industry sectors, particularly finance. Finance is estimated to be one of the first industry sectors to benefit from quantum computing, not only in the medium and long terms, but even in the short term (however, this is subject to overcoming several engineering and algorithmic obstacles).
Our team (Argonne National Lab - JPMorgan Chase - Menten AI - University of Chicago - University of Delaware) presents a comprehensive summary of the state of the art of quantum computing for financial applications, with particular emphasis on stochastic modeling, optimization, and machine learning, describing how these solutions, adapted to work on a quantum computer, can potentially help to solve financial problems, such as derivative pricing, risk modeling, portfolio optimization, natural language processing, and fraud detection, more efficiently and accurately. We also discuss the feasibility of these algorithms on nearterm quantum computers with various hardware implementations and demonstrate how they relate to a wide range of use cases in finance as well as discuss various obstacles to achieve these goals. We hope this article will not only serve as a reference for academic researchers and industry practitioners but also inspire new ideas for future research.
Herman, Dylan, Cody Googin, Xiaoyuan Liu, Alexey Galda, Ilya Safro, Yue Sun, Marco Pistoia, and Yuri Alexeev. "A survey of quantum computing for finance." arXiv preprint https://arxiv.org/abs/2201.02773 (2022).
Monday, May 23, 2022
Graduate Student Achievement Award
Congratulations to my former PhD student Xiaoyuan Joey Liu for winning Frank A. Pehrson Graduate Student Achievement Award! Her thesis is titled "Combinatorial Optimization Algorithms on Quantum and Quantum Inspired Devices". Xiaoyuan is now a research scientist at Fujitsu working in the area of quantum computing. This award is given to a computer science graduate student in recognition of outstanding performance, and potential for future significant contribution to the field of computer science.
Wednesday, December 29, 2021
Accelerating COVID-19 scientific discovery with hypothesis generation system
In 2020, the White House released the, “Call to Action to the Tech Community on New Machine Readable COVID-19 Dataset,” wherein artificial intelligence experts are asked to collect data and develop text mining techniques that can help the science community answer high-priority scientific questions related to COVID-19. The Allen Institute for AI and collaborators announced the availability of a rapidly growing open dataset of publications, the COVID-19 Open Research Dataset (CORD-19). As the pace of research accelerates, biomedical scientists struggle to stay current. To expedite their investigations, scientists leverage hypothesis generation systems, which can automatically inspect published papers to discover novel implicit connections.
Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, Ilya Safro "Accelerating COVID-19 research with graph mining and transformer-based learning", Innovative Applications of Artificial Intelligence (AAAI/IAAI), preprint at https://www.biorxiv.org/content/10.1101/2021.02.11.430789v1, 2022
We present an automated general purpose hypothesis generation systems AGATHA-C and AGATHA-GP for COVID-19 research. The systems are based on graph-mining and the transformer model. The systems are massively validated using retrospective information rediscovery and proactive analysis involving human-in-the-loop expert analysis. Both systems achieve high-quality predictions across domains (in some domains up to 0.97% ROC AUC) in fast computational time and are released to the broad scientific community to accelerate biomedical research. In addition, by performing the domain expert curated study, we show that the systems are able to discover on-going research findings such as the relationship between COVID-19 and oxytocin hormone.
Transferring QAOA parameters from small instances to large
The Quantum approximate optimization algorithm (QAOA) is one of the most promising candidates for achieving quantum advantage through quantum-enhanced combinatorial optimization. However, finding optimal parameters to run QAOA is a hard problem that is usually tackled by variational loop, i.e., the circuit is executed with the current parameters on a quantum device, and then the parameters are updated on a classical device. Such variational loops are extremely time consuming. In our recent paper we find the QAOA parameters for small optimization instances and transfer them to the larger ones accelerating the entire process.
Alexey Galda, Xiaoyuan Liu, Danylo Lykov, Yuri Alexeev, Ilya Safro "Transferability of optimal QAOA parameters between random graphs", IEEE International Conference on Quantum Computing and Engineering (QCE), preprint at https://arxiv.org/pdf/2106.07531.pdf, 2021
In a typical QAOA setup, a set of quantum circuit parameters is optimized to prepare a quantum state used to find the optimal solution of a combinatorial optimization problem. Several empirical observations about optimal parameter concentration effects for special QAOA MaxCut problem instances have been made in recent literature, however, a rigorous study of the subject is still lacking. We show that convergence of the optimal QAOA parameters around specific values and, consequently, successful transferability of parameters between different QAOA instances can be explained and predicted based on the local properties of the graphs, specifically the types of subgraphs (lightcones) from which the graphs are composed. We apply this approach to random regular and general random graphs. For example, we demonstrate how optimized parameters calculated for a 6-node random graph can be successfully used without modification as nearly optimal parameters for a 64-node random graph, with less than 1% reduction in approximation ratio as a result. This work presents a pathway to identifying classes of combinatorial optimization instances for which such variational quantum algorithms as QAOA can be substantially accelerated.
Friday, December 10, 2021
Congratulations to Dr. Joey!
Congratulations to Dr. Xiaoyuan Joey Liu for successfully defending her Ph.D. thesis "Combinatorial Optimization Algorithms on Quantum and Quantum-inspired Devices"! Joey will join quantum computing research group at Fujitsu Research USA.
Wednesday, September 1, 2021
NIH funding for literature based discovery
NIH awarded $2.11M grant for University of South Carolina (PI Shtutman) - University of Delaware (PI Safro) collaborative project "Knowledge discovery and machine learning to elucidate the mechanisms of HIV activity and interaction with substance use disorder." This work will leverage our hypothesis generation model AGATHA that is based on information extracted from full MEDLINE.
https://github.com/IlyaTyagin/AGATHA-C-GP
Here is its recent customization for COVID-19 in which Medline is fused with CORD-19, the dataset of all COVID-19 related papers.
https://arxiv.org/abs/2102.07631
Monday, August 30, 2021
Tutorial on the quantum approximate optimization algorithm, its applications and simulation
We recorded this tutorial for IEEE International Conference on Quantum Computing and Engineering (QCE) 2020. This tutorial consists of four parts:
- QAOA theory and quantum computing basics
- Hands on example of QAOA and MaxCut
- Introduction to problem decomposition and solving large-scale problems with QAOA
- Tensor networks and simulation of QAOA with classical computers
Thursday, February 25, 2021
Multilevel Graph Partitioning for Three-Dimensional Discrete Fracture Network Flow Simulations
Combinatorial scientific computing in action! Our paper on accelerating 3D discrete fracture network flow simulations by multilevel graph partitioning is accepted in Mathematical Geosciences!
Hayato Ushijima-Mwesigwa, Jeffrey D. Hyman, Aric Hagberg, Ilya Safro, Satish Karra, Carl W. Gable, Gowri Srinivasan "Multilevel Graph Partitioning for Three-Dimensional Discrete Fracture Network Flow Simulations", accepted in Mathematical Geosciences, preprint at https://arxiv.org/abs/1902.08029, 2020
We present a topology-based method for mesh-partitioning in three-dimensional discrete fracture network (DFN) simulations that takes advantage of the intrinsic multi-level nature of a DFN. DFN models are used to simulate flow and transport through low-permeability fracture media in the subsurface by explicitly representing fractures as discrete entities. The governing equations for flow and transport are numerically integrated on computational meshes generated on the interconnected fracture networks. Modern high-fidelity DFN simulations require high-performance computing on multiple processors where performance and scalability depends partially on obtaining a high-quality partition of the mesh to balance work work-loads and minimize communication across all processors.
The discrete structure of a DFN naturally lends itself to various graph representations, which can be thought of as coarse-scale representations of the computational mesh. Using this concept, we develop a variant of the multilevel graph partitioning algorithm to partition the mesh of a DFN. We compare the performance of this DFN-based mesh-partitioning with standard multi-level graph partitioning using graphbased metrics (cut, imbalance, partitioning time), computational-based metrics (FLOPS, iterations, solver time), and total run time. The DFN-based partition and the mesh-based partition are comparable in terms of the graph-based metrics, but the time required to obtain the partition is several orders of magnitude faster using the DFN-based partition. The computation-based metrics show comparable performance between both methods so, in combination, the DFN-based partition is several orders of magnitude faster than the mesh-based partition.
ELRUNA: Network Alignment Algorithm
Our network alignment algorithm ELRUNA is accepted in ACM Journal of Experimental Algorithmics. Turns out that three relatively simple node similarity rules can successfully compete with several state of the art algorithms and improve both the running time and alignment quality. You can get it at https://github.com/BridgelessAlexQiu/ELRUNA
Zirou Qiu, Ruslan Shaydulin, Xiaoyuan Liu, Yuri Alexeev, Christopher S. Henry, Ilya Safro "ELRUNA: Elimination Rule-based Network Alignment", accepted in ACM Journal of Experimental Algorithmics, preprint at https://arxiv.org/abs/1911.05486, 2020
Networks model a variety of complex phenomena across different domains. In many applications, one of the most essential tasks is to align two or more networks to infer the similarities between cross-network vertices and discover potential node-level correspondence. In this paper, we propose ELRUNA (Elimination rule-based network alignment), a novel network alignment algorithm that relies exclusively on the underlying graph structure. Under the guidance of the elimination rules that we defined, ELRUNA computes the similarity between a pair of cross-network vertices iteratively by accumulating the similarities between their selected neighbors. The resulting cross-network similarity matrix is then used to infer a permutation matrix that encodes the final alignment of cross-network vertices. In addition to the novel alignment algorithm, we also improve the performance of local search, a commonly used post-processing step for solving the network alignment problem, by introducing a novel selection method RAWSEM (Random walk based selection method) based on the propagation of the levels of mismatching (defined in the paper) of vertices across the networks. The key idea is to pass on the initial levels of mismatching of vertices throughout the entire network in a random-walk fashion. Through extensive numerical experiments on real networks, we demonstrate that ELRUNA significantly outperforms the state-of-the-art alignment methods in terms of alignment accuracy under lower or comparable running time. Moreover, ELRUNA is robust to network perturbations such that it can maintain a close to optimal objective value under a high level of noise added to the original networks. Finally, the proposed RAWSEM can further improve the alignment quality with a less number of iterations compared with the naive local search method.
Thursday, February 11, 2021
Can we outperform Quantum Approximate Optimization Algorithm?
Check our new paper:
Xiaoyuan Liu, Anthony Angone, Ruslan Shaydulin, Ilya Safro, Yuri Alexeev, Lukasz Cincio "Layer VQE: A Variational Approach for Combinatorial Optimization on Noisy Quantum Computers", preprint at https://arxiv.org/abs/2102.05566, 2021
We propose a hybrid quantum-classical algorithm, Layer Variational Quantum Eigensolver (L-VQE), inspired by the Variational Quantum Eigensolver (VQE). L-VQE is a heuristic approach to solve combinatorial optimization problems on near term intermediate-scale quantum devices. We demonstrate the potential of the proposed approach by applying it to the problem of community detection, a famous problem in network science. Our large-scale numerical simulation study shows that L-VQE has the potential to outperform Quantum Approximate Optimization Algorithm (QAOA), and is more robust to sampling noise as compared with standard VQE approaches.
Literature-based knowledge discovery to accelerate COVID-19 research
Wednesday, December 23, 2020
Reboot the Computing-Research Publication Systems
Saturday, December 12, 2020
Do we always need big data in text mining?
Do we always need Big Data text mining? Can we filter it? Check our new paper "Accelerating Text Mining Using Domain-Specific Stop Word Lists" accepted at IWBDR https://arxiv.org/pdf/2012.02294.pdf
Text preprocessing is an essential step in text mining. Removing words that can negatively impact the quality of prediction algorithms or are not informative enough is a crucial storage-saving technique in text indexing and results in improved computational efficiency. Typically, a generic stop word list is applied to a dataset regardless of the domain. However, many common words are different from one domain to another but have no significance within a particular domain. Eliminating domain-specific common words in a corpus reduces the dimensionality of the feature space, and improves the performance of text mining tasks. In this paper, we present a novel mathematical approach for the automatic extraction of domain-specific words called the hyperplane-based approach. This new approach depends on the notion of low dimensional representation of the word in vector space and its distance from hyperplane. The hyperplane-based approach can significantly reduce text dimensionality by eliminating irrelevant features. We compare the hyperplane-based approach with other feature selection methods, namely \c{hi}2 and mutual information. An experimental study is performed on three different datasets and five classification algorithms, and measure the dimensionality reduction and the increase in the classification performance. Results indicate that the hyperplane-based approach can reduce the dimensionality of the corpus by 90% and outperforms mutual information. The computational time to identify the domain-specific words is significantly lower than mutual information.
Thursday, December 10, 2020
Quantum Approximate Optimization Algorithm and Symmetry
Our new paper QAOA symmetry and its predictability using classical symmetry is out! We analyze the connections between the symmetries of the objective function and the symmetries of QAOA dynamics, and applications to performance prediction, simulation and more https://scirate.com/arxiv/2012.04713
Ruslan Shaydulin, Stuart Hadfield, Tad Hogg, Ilya Safro "Classical symmetries and QAOA", 2020
We study the relationship between the Quantum Approximate Optimization Algorithm (QAOA) and the underlying symmetries of the objective function to be optimized. Our approach formalizes the connection between quantum symmetry properties of the QAOA dynamics and the group of classical symmetries of the objective function. The connection is general and includes but is not limited to problems defined on graphs. We show a series of results exploring the connection and highlight examples of hard problem classes where a nontrivial symmetry subgroup can be obtained efficiently. In particular we show how classical objective function symmetries lead to invariant measurement outcome probabilities across states connected by such symmetries, independent of the choice of algorithm parameters or number of layers. To illustrate the power of the developed connection, we apply machine learning techniques towards predicting QAOA performance based on symmetry considerations. We provide numerical evidence that a small set of graph symmetry properties suffices to predict the minimum QAOA depth required to achieve a target approximation ratio on the MaxCut problem, in a practically important setting where QAOA parameter schedules are constrained to be linear and hence easier to optimize.
Friday, November 6, 2020
On improving nonlinear SVM scalability using multilevel frameworks
Our paper "AML-SVM: Adaptive Multilevel Learning with Support Vector Machines" is accepted at 2020 IEEE International Conference on Big Data. https://arxiv.org/abs/2011.02592
The support vector machines (SVM) is one of the most widely used and practical optimization based classification models in machine learning because of its interpretability and flexibility to produce high quality results. However, the big data imposes a certain difficulty to the most sophisticated but relatively slow versions of SVM, namely, the nonlinear SVM. The complexity of nonlinear SVM solvers and the number of elements in the kernel matrix quadratically increases with the number of samples in training data. Therefore, both runtime and memory requirements are negatively affected. Moreover, the parameter fitting has extra kernel parameters to tune, which exacerbate the runtime even further. This paper proposes an adaptive multilevel learning framework for the nonlinear SVM, which addresses these challenges, improves the classification quality across the refinement process, and leverages multi-threaded parallel processing for better performance. The integration of parameter fitting in the hierarchical learning framework and adaptive process to stop unnecessary computation significantly reduce the running time while increase the overall performance. The experimental results demonstrate reduced variance on prediction over validation and test data across levels in the hierarchy, and significant speedup compared to state-of-the-art nonlinear SVM libraries without a decrease in the classification quality.
Saturday, September 19, 2020
How to use NISQ quantum devices to solve large-scale problems?
Our paper on multilevel hybrid quantum-classical optimization "Multilevel Combinatorial Ooptimization Across Quantum Architectures" is accepted in ACM Transactions on Quantum Computing! Big kudos to two lead former students Hayato and Ruslan! https://arxiv.org/abs/1910.09985
Emerging quantum processors provide an opportunity to explore new approaches for solving traditional problems in the post Moore's law supercomputing era. However, the limited number of qubits makes it infeasible to tackle massive real-world datasets directly in the near future, leading to new challenges in utilizing these quantum processors for practical purposes. Hybrid quantum-classical algorithms that leverage both quantum and classical types of devices are considered as one of the main strategies to apply quantum computing to large-scale problems. In this paper, we advocate the use of multilevel frameworks for combinatorial optimization as a promising general paradigm for designing hybrid quantum-classical algorithms. In order to demonstrate this approach, we apply this method to two well-known combinatorial optimization problems, namely, the Graph Partitioning Problem, and the Community Detection Problem. We develop hybrid multilevel solvers with quantum local search on D-Wave's quantum annealer and IBM's gate-model based quantum processor. We carry out experiments on graphs that are orders of magnitudes larger than the current quantum hardware size, and we observe results comparable to state-of-the-art solvers in terms of quality of the solution.
Friday, August 21, 2020
New NSF grant to develop a simulator for quantum computing
NSF awarded grant to develop large-scale QAOA simulator! This is a collaborative project with the Yuri Alexeev@University of Chicago.
Generating biomedical scientific hypotheses with AGATHA
Accepted paper in the 29TH ACM International Conference on Information and Knowledge Management (CIKM)
Sybrandt, Tyagin, Shtutman, Safro "AGATHA: Automatic Graph-mining and Transformer based Hypothesis Generation Approach", preprint at http://arxiv.org/pdf/2002.05635.pdf
Sunday, July 12, 2020
Two papers on graph representation learning
Accepted papers at Workshop on Mining and Learning with Graphs co-located with ACM KDD 2020!
1) Ding, Zhang, Sybrandt, Safro "Unsupervised Hierarchical Graph Representation Learning by Mutual Information Maximization", 2020
Graph representation learning based on graph neural networks (GNNs) can greatly improve the performance of downstream tasks, such as node and graph classification. However, the general GNN models do not aggregate node information in a hierarchical manner, and can miss key higher-order structural features of many graphs. The hierarchical aggregation also enables the graph representations to be explainable. In addition, supervised graph representation learning requires labeled data, which is expensive and error-prone. To address these issues, we present an unsupervised graph representation learning method, Unsupervised Hierarchical Graph Representation (UHGR), which can generate hierarchical representations of graphs. Our method focuses on maximizing mutual information between "local" and high-level "global" representations, which enables us to learn the node embeddings and graph embeddings without any labeled data. To demonstrate the effectiveness of the proposed method, we perform the node and graph classification using the learned node and graph embeddings. The results show that the proposed method achieves comparable results to state-of-the-art supervised methods on several benchmarks. In addition, our visualization of hierarchical representations indicates that our method can capture meaningful and interpretable clusters.
2) Sybrandt, Safro "FOBE and HOBE: First- and High-Order Bipartite Embeddings", 2020
Typical graph embeddings may not capture type-specific bipartite graph features that arise in such areas as recommender systems, data visualization, and drug discovery. Machine learning methods utilized in these applications would be better served with specialized embedding techniques. We propose two embeddings for bipartite graphs that decompose edges into sets of indirect relationships between node neighborhoods. When sampling higher-order relationships, we reinforce similarities through algebraic distance on graphs. We also introduce ensemble embeddings to combine both into a "best of both worlds" embedding. The proposed methods are evaluated on link prediction and recommendation tasks and compared with other state-of-the-art embeddings. While being all highly beneficial in applications, we demonstrate that none of the considered embeddings is clearly superior (in contrast to what is claimed in many papers), and discuss the trade offs present among them.
Wednesday, June 10, 2020
PhD thesis defense
Congratulations to Dr. Ruslan Shaydulin for successfully defending his Ph.D. thesis "Quantum and Classical Multilevel Algorithms for (Hyper)Graphs"! Ruslan will join Argonne National Lab with MGM fellowship in August. This is the second defense in our lab this week!
Monday, June 8, 2020
PhD thesis defense
Congratulations to Dr. Ehsan Sadrfaridpour for successfully defending his Ph.D. thesis "Fast Machine Learning Algorithms for Massive Datasets with Applications in Biomedical Domain"! Ehsan will join Lowe's data science team this summer.
Friday, May 15, 2020
Our work in HPCwire
Argonne Receives Two Awards from DARPA for Quantum Information Science
Tuesday, April 21, 2020
NSF grant to tackle COVID-19
Clemson news coverage: Artificial intelligence could aid in fight against COVID-19
Ilya Safro of @socclemson said that his team will soon roll out a new artificial intelligence system aimed at helping researchers explore the scientific literature as they strive for new discoveries to combat the novel coronavirus.https://t.co/TwQyZqEUra pic.twitter.com/fh3K8iAoWO
— Clemson Engineering, Computing & Applied Sciences (@ClemsonCECAS) April 21, 2020
Tuesday, April 14, 2020
Board game night
Board game night from quarantine! pic.twitter.com/6env2hICaa
— Justin Sybrandt, Ph.D. (@Justin_Sybrandt) March 28, 2020
Monday, April 13, 2020
Friday, April 3, 2020
From SoC@Clemson
Congrats to Dr. Ilya Safro on receiving funding from @DARPA to work on hybrid quantum-classical algorithms for combinatorial optimization problems, a 4 year collaborative research with @argonne. He is looking for Ph.D. students to join the Algorithms & Computational Science lab. pic.twitter.com/YfIYVc7HzO
— School of Computing (@socclemson) April 3, 2020
Thursday, March 26, 2020
PhD thesis defense
Congratulations to Dr. Justin Sybrandt for successfully defending his Ph.D. thesis "Exploiting Latent Features of Text and Graphs"! Justin will join Google Brain this summer.
Wednesday, February 19, 2020
New papers on biomedical NLP and hypothesis generation
New papers on biomedical NLP+hypothesis generation.
Sybrandt, Safro "CBAG: Conditional Biomedical Abstract Generation", http://arxiv.org/pdf/2002.05637.pdf,
Sybrandt, Tyagin, Shtutman, Safro "AGATHA: Automatic Graph-mining and Transformer based Hypothesis Generation" http://arxiv.org/pdf/2002.05635.pdf
Friday, January 17, 2020
Congratulations to PhD student Ruslan Shaydulin
Congratulations to PhD student Ruslan Shaydulin, advised by Dr. Ilya Safro, for receiving 2020 Maria Goeppert Mayer fellowship for postdoctoral studies at Argonne National Lab. Ruslan will start in Fall 2020. He was selected among many candidates from top schools and different disciplines.
Saturday, October 19, 2019
How to find the best location for wireless charging lanes
Our paper is accepted in Journal of Industrial Management and Optimization
Ushijima-Mwesigwa, Khan, Chowdhury, Safro "Optimal Installation for Electric Vehicle Wireless Charging Lanes", 2019
The emergence of electric vehicle wireless charging technology, where a whole lane can be turned into a charging infrastructure, leads to new challenges in the design and analysis of road networks. From a network perspective, a major challenge is determining the most important nodes with respect to the placement of the wireless charging lanes. In other words, given a limited budget, cities could face the decision problem of where to place these wireless charging lanes. With a heavy price tag, a placement without a careful study can lead to inefficient use of limited resources. In this work, the placement of wireless charging lanes is modeled as an integer programming problem. The basic formulation is used as a building block for different realistic scenarios. We carry out experiments using real geospatial data and compare our results to different network-based heuristics.
Reproducibility: all datasets, algorithm implementations and mathematical programming formulation presented in this work are available at https://github.com/hmwesigwa/smartcities.git
Monday, October 14, 2019
Wednesday, September 25, 2019
Multistart Methods for Quantum Approximate Optimization
Best student paper award at IEEE HPEC2019 goes to our lab! Congratulations to the leading student Ruslan Shaydulin!
Ruslan Shaydulin, Ilya Safro, Jeffrey Larson "Multistart Methods for Quantum Approximate Optimization", accepted at IEEE High Performance Extreme Computing Conference (HPEC) 2019, preprint at https://arxiv.org/abs/1905.08768
Hybrid quantum-classical algorithms such as the quantum approximate optimization algorithm (QAOA) are considered one of the most promising approaches for leveraging near-term quantum computers for practical applications. Such algorithms are often implemented in a variational form, combining classical optimization methods with a quantum machine to find parameters to maximize performance. The quality of the QAOA solution depends heavily on quality of the parameters produced by the classical optimizer. Moreover, multiple local optima in the space of parameters make it harder for the classical optimizer. In this paper we study the use of a multistart optimization approach within a QAOA framework to improve the performance of quantum machines on important graph clustering problems. We also demonstrate that reusing the optimal parameters from similar problems can improve the performance of classical optimization methods, expanding on similar results for MAXCUT.
Thursday, June 20, 2019
The best of both worlds: how to solve real problems on modern quantum computers
Our work on hybrid quantum-classical algorithms is featured in news
The best of both worlds: how to solve real problems on modern quantum computers
Wednesday, June 19, 2019
Hybrid quantum-classical algorithms
Our paper on hybrid quantum-classical algorithms is featured in IEEE Computer, the June's issue on quantum realism.
Ruslan Shaydulin, Hayato Ushijima-Mwesigwa, Christian F.A. Negre, Ilya Safro, Susan M. Mniszewski, Yuri Alexeev "Hybrid Approach for Solving Optimization Problems on Small Quantum Computers", IEEE Computer, vol. 52(6), pp. 18-26, 2019
Solving larger-sized problems is an important area of research in quantum computing. Designing hybrid quantum-classical algorithms is a promising approach to solving this. We discuss decomposition-based hybrid approaches for solving optimization problems and demonstrate them for applications related to community detection.
Thursday, May 23, 2019
Solving network community detection problem on quantum computers
Accepted paper in Advanced Quantum Technology journal
Shaydulin, Ushijima-Mwesigwa, Safro, Mniszewski, Alexeev "Network Community Detection On Small Quantum Computers", 2019, preprint at https://arxiv.org/abs/1810.12484
In recent years, a number of quantum computing devices with small numbers of qubits have become available. A hybrid quantum local search (QLS) approach that combines a classical machine and a small quantum device to solve problems of practical size is presented. The proposed approach is applied to the network community detection problem. QLS is hardware‐agnostic and easily extendable to new quantum computing devices as they become available. It is demonstrated to solve the 2‐community detection problem on graphs of sizes of up to 410 vertices using the 16‐qubit IBM quantum computer and D‐Wave 2000Q, and compare their performance with the optimal solutions. The results herein demonstrate that QLS performs similarly in terms of quality of the solution and the number of iterations to convergence on both types of quantum computers and it is capable of achieving results comparable to state‐of‐the‐art solvers in terms of quality of the solution including reaching the optimal solutions.
Monday, May 13, 2019
Monday, May 6, 2019
Synthetic planar graph generation
Accepted paper in Applied Network Science journal
Chauhan, Gutfraind, Safro "Multiscale planar graph generation", 2019, preprint at https://arxiv.org/abs/1802.09617
The study of network representations of physical, biological, and social phenomena can help us better understand their structure and functional dynamics as well as formulate predictive models of these phenomena. However, due to the scarcity of real-world network data owing to factors such as cost and effort required in collection of network data and the sensitivity of this data towards theft and misuse, engineers and researchers often rely on synthetic data for simulations, hypothesis testing, decision making, and algorithm engineering. An important characteristic of infrastructure networks such as roads, water distribution and other utility systems is that they can be (almost fully) embedded in a plane, therefore to simulate these system we need realistic networks which are also planar. While the currently-available synthetic network generators can model networks that exhibit realism, they do not guarantee or achieve planarity. In this paper we present a flexible algorithm that can synthesize realistic networks that are planar. The method follows a multi-scale randomized editing approach generating a hierarchy of coarsened networks of a given planar graph and introducing edits at various levels in the hierarchy. The method preserves the structural properties with minimal bias including the planarity of the network, while introducing realistic variability at multiple scales.
Reproducibility: All datasets and algorithm implementation presented in this work are available at https://bit.ly/2CjOUAS
Monday, April 29, 2019
Designing scalable nonlinear support vector machines
Accepted paper in Machine Learning journal
Sadrfaridpour, Razzaghi, Safro, "Engineering fast multilevel support vector machines", 2019, preprint at https://arxiv.org/abs/1707.07657
The computational complexity of solving nonlinear support vector machine (SVM) is prohibitive on large-scale data. In particular, this issue becomes very sensitive when the data represents additional difficulties such as highly imbalanced class sizes. Typically, nonlinear kernels produce significantly higher classification quality to linear kernels but introduce extra kernel and model parameters which requires computationally expensive fitting. This increases the quality but also reduces the performance dramatically. We introduce a generalized fast multilevel framework for regular and weighted SVM and discuss several versions of its algorithmic components that lead to a good trade-off between quality and time. Our framework is implemented using PETSc which allows an easy integration with scientific computing tasks. The experimental results demonstrate significant speed up compared to the state-of-the-art nonlinear SVM libraries.
Reproducibility: our source code, documentation and parameters are available at https://github.com/esadr/mlsvm
Saturday, March 30, 2019
Finding influential nodes in networks with consumable resources
Accepted paper in Network Science journal
Ushijima-Mwesigwa, Khan, Chowdhury, Safro, "Centralities for Networks with Consumable Resources", 2019, preprint at https://arxiv.org/abs/1903.00642
Identification of influential nodes is an important step in understanding and controlling the dynamics of information, traffic, and spreading processes in networks. As a result, a number of centrality measures have been proposed and used across different application domains. At the heart of many of these measures lies an assumption describing the manner in which traffic (of information, social actors, particles, etc.) flows through the network. For example, some measures only count shortest paths while others consider random walks. This paper considers a spreading process in which a resource necessary for transit is partially consumed along the way while being refilled at special nodes on the network. Examples include fuel consumption of vehicles together with refueling stations, information loss during dissemination with error-correcting nodes, and consumption of ammunition of military troops while moving. We propose generalizations of the well-known measures of betweenness, random-walk betweenness, and Katz centralities to take such a spreading process with consumable resources into account. In order to validate the results, experiments on real-world networks are carried out by developing simulations based on well-known models such as Susceptible-Infected-Recovered and congestion with respect to particle hopping from vehicular flow theory. The simulation-based models are shown to be highly correlated with the proposed centrality measures.
Friday, March 29, 2019
Future of medicine - man or machine
Some thoughts about literature based discovery and our automated biomedical hypothesis generation tool MOLIERE for Clemson World Magazine
Saturday, February 9, 2019
Does it help to charge the electric cars at intersections?
Accepted paper in Computer-Aided Civil and Infrastructure Engineering
Khan, Khan Chowdhury, Safro, Ushijima-Mwesigwa "Wireless Charging Utility Maximization and Intersection Control Delay Minimization Framework for Electric Vehicles"
This study presents the Wireless Charging Utility Maximization (WCUM) framework, which aims to maximize the utility of Wireless Charging Units (WCUs) for electric vehicle (EV) charging through the optimal WCU deployment at signalized intersections. Furthermore, the framework aims to minimize the control delay at all signalized intersections of the network. The framework consists of a two‐step optimization formulation, a dynamic traffic assignment model to calculate the user equilibrium, a traffic microsimulator to formulate the objective functions, and a global Mixed Integer Non‐Linear Programming (MINLP) optimization solver. An optimization problem is formulated for each intersection, and another for the entire network. The performance of the WCUM framework is tested using the Sioux Falls network. We perform a comparative study of 12 global MINLP solvers with a case study. Based on solution quality and computation time, we choose the Couenne solver for this framework.
https://onlinelibrary.wiley.com/doi/abs/10.1111/mice.12439
Sunday, February 3, 2019
Predicting bariatric surgery outcomes
Accepted paper in Annals of Operations Research
Razzaghi, Safro, Ewing, Sadrfaridpour, Scott "Predictive models for bariatric surgery risks with imbalanced medical datasets"
Bariatric surgery (BAR) has become a popular treatment for type 2 diabetes mellitus which is among the most critical obesity-related comorbidities. Patients who have bariatric surgery, are exposed to complications after surgery. Furthermore, the mid- to long-term complications after bariatric surgery can be deadly and increase the complexity of managing safety of these operations and healthcare costs. Current studies on BAR complications have mainly used risk scoring for identifying patients who are more likely to have complications after surgery. Though, these studies do not take into consideration the imbalanced nature of the data where the size of the class of interest (patients who have complications after surgery) is relatively small. We propose the use of imbalanced classification techniques to tackle the imbalanced bariatric surgery data: synthetic minority oversampling technique (SMOTE), random undersampling, and ensemble learning classification methods including Random Forest, Bagging, and AdaBoost. Moreover, we improve classification performance through using Chi-squared, Information Gain, and Correlation-based feature selection techniques. We study the Premier Healthcare Database with focus on the most-frequent complications including Diabetes, Angina, Heart Failure, and Stroke. Our results show that the ensemble learning-based classification techniques using any feature selection method mentioned above are the best approach for handling the imbalanced nature of the bariatric surgical outcome data. In our evaluation, we find a slight preference toward using SMOTE method compared to the random undersampling method. These results demonstrate the potential of machine-learning tools as clinical decision support in identifying risks/outcomes associated with bariatric surgery and their effectiveness in reducing the surgery complications as well as improving patient care.
Monday, January 28, 2019
Saturday, January 12, 2019
Welcome new students!
Two new students, Zirou Qiu (MSc) and Korey Palmer (senior undergrad) are joining our research group.
Friday, January 11, 2019
Relaxation-Based Coarsening for Multilevel Hypergraph Partitioning
Accepted paper at SIAM Multiscale Modeling and Simulations
Ruslan Shaydulin, Jie Chen, Ilya Safro "Relaxation-Based Coarsening for Multilevel Hypergraph Partitioning", 2019, preprint at arXiv:1710.06552
Multilevel partitioning methods that are inspired by principles of multiscaling are the most powerful practical hypergraph partitioning solvers. Hypergraph partitioning has many applications in disciplines ranging from scientific computing to data science. In this paper we introduce the concept of algebraic distance on hypergraphs and demonstrate its use as an algorithmic component in the coarsening stage of multilevel hypergraph partitioning solvers. The algebraic distance is a vertex distance measure that extends hyperedge weights for capturing the local connectivity of vertices which is critical for hypergraph coarsening schemes. The practical effectiveness of the proposed measure and corresponding coarsening scheme is demonstrated through extensive computational experiments on a diverse set of problems. Finally, we propose a benchmark of hypergraph partitioning problems to compare the quality of other solvers.
Monday, November 26, 2018
Thesis defense
Congratulations to Varsha Chauhan for successfully defending her MSc thesis on planar graph generation!
Travel awards
Congratulations to Justin Sybrandt and Ruslan Shaydulin for receiving travel awards to present their papers at IEEE Big Data 2018 and APS 2018!
Thesis defense
Congratulations to Dr. Hayato Ushijima-Mwesigwa for successfully defending his Ph.D. thesis "Models for Networks with Consumable Resources"!
Community detection on NISQ devices
Accepted paper at at 3rd International Workshop on Post Moore's Era 2018
Supercomputing (PMES 2018)
Ruslan Shaydulin, Haayto Ushijima-Mwesigwa, Ilya Safro, Susan Mniszewski, Yuri Alexeev "Community Detection Across Emerging Quantum Architectures", preprint at arXiv:1810.07765, 2018
Sunday, November 25, 2018
Can we predict crimes in Chicago?
Our paper is accepted at IEEE Big Data 2018
Saroj K. Dash, I. Safro, Ravisutha S. Srinivasamurthy "Spatio-temporal prediction of crimes using network analytic approach", preprint at arXiv:1808.06241, 2018
It is quite evident that majority of the population lives in urban area today than in any time of the human history. This trend seems to increase in coming years. Studies say that nearly 80.7% of total population in USA stays in urban area. By 2030 nearly 60% of the population in the world will live in or move to cities. With the increase in urban population, it is important to keep an eye on criminal activities. By doing so, governments can enforce intelligent policing systems and hence many government agencies and local authorities have made the crime data publicly available. In this paper, we analyze Chicago city crime data fused with other social information sources using network analytic techniques to predict criminal activity for the next year. We observe that as we add more layers of data which represent different aspects of the society, the quality of prediction is improved. Our prediction models not just predict total number of crimes for the whole Chicago city, rather they predict number of crimes for all types of crimes and for different regions in City of Chicago.
What we do/Team/In news
Quantum Computing Quantum computers are expected to accelerate scientific discovery spanning many different areas such as medicine, AI, ...
-
Quantum Computing Quantum computers are expected to accelerate scientific discovery spanning many different areas such as medicine, AI, ...
-
We're pleased to share that our group has expanded with the addition of two PhD students. Bao Bach, joining from the Quantum Science an...
-
We have another accepted paper which is the result of our collaboration with amazing colleagues at NASA/USRA, Rigetti and Purdue. Filip B. M...