Transformer models coupled with a simplified molecular line entry system (SMILES) have recently proven to be a powerful combination for solving challenges in cheminformatics. These models, however, are often developed specifically for a single application and can be very resource-intensive to train. In this work we present the Chemformer model—a Transformer-based model which can be quickly applied to both sequence-to-sequence and discriminative cheminformatics tasks. Additionally, we show that self-supervised pre-training can improve performance and significantly speed up convergence on downstream tasks. On direct synthesis and retrosynthesis prediction benchmark datasets we publish state-of-the-art results for top-1 accuracy. We also improve on existing approaches for a molecular optimisation task and show that Chemformer can optimise on multiple discriminative tasks simultaneously. Models, datasets and code will be made available after publication.
ISSN: 2632-2153
Machine Learning: Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights.
Ross Irwin et al 2022 Mach. Learn.: Sci. Technol. 3 015022
Tanujit Chakraborty et al 2024 Mach. Learn.: Sci. Technol. 5 011001
Generative adversarial networks (GANs) have rapidly emerged as powerful tools for generating realistic and diverse data across various domains, including computer vision and other applied areas, since their inception in 2014. Consisting of a discriminative network and a generative network engaged in a minimax game, GANs have revolutionized the field of generative modeling. In February 2018, GAN secured the leading spot on the 'Top Ten Global Breakthrough Technologies List' issued by the Massachusetts Science and Technology Review. Over the years, numerous advancements have been proposed, leading to a rich array of GAN variants, such as conditional GAN, Wasserstein GAN, cycle-consistent GAN, and StyleGAN, among many others. This survey aims to provide a general overview of GANs, summarizing the latent architecture, validation metrics, and application areas of the most widely recognized variants. We also delve into recent theoretical developments, exploring the profound connection between the adversarial principle underlying GAN and Jensen–Shannon divergence while discussing the optimality characteristics of the GAN framework. The efficiency of GAN variants and their model architectures will be evaluated along with training obstacles as well as training solutions. In addition, a detailed discussion will be provided, examining the integration of GANs with newly developed deep learning frameworks such as transformers, physics-informed neural networks, large language models, and diffusion models. Finally, we reveal several issues as well as future research outlines in this field.
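The minimax game and its connection to the Jensen–Shannon divergence mentioned in this abstract can be written out explicitly. What follows is the standard textbook formulation of the original GAN objective, not a result taken from the survey itself; the symbols $D$ (discriminator), $G$ (generator), $p_{\text{data}}$, $p_z$ and $p_g$ are notation assumed here for illustration.

```latex
\begin{align}
\min_G \max_D V(D, G)
  &= \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
   + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big], \\
D^{*}_{G}(x)
  &= \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}, \\
V(D^{*}_{G}, G)
  &= -\log 4 + 2\,\mathrm{JSD}\big(p_{\text{data}} \,\|\, p_g\big).
\end{align}
```

Because the Jensen–Shannon divergence is non-negative and vanishes only when the two distributions coincide, the generator's objective under an optimal discriminator is minimised exactly when $p_g = p_{\text{data}}$.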
Ivan S Novikov et al 2021 Mach. Learn.: Sci. Technol. 2 025002
The subject of this paper is the technology (the 'how') of constructing machine-learning interatomic potentials, rather than the science (the 'what' and 'why') of atomistic simulations using machine-learning potentials. Namely, we illustrate how to construct moment tensor potentials using active learning as implemented in the MLIP package, focusing on the efficient ways to automatically sample configurations for the training set, how expanding the training set changes the error of predictions, how to set up ab initio calculations in a cost-effective manner, etc. The MLIP package (short for Machine-Learning Interatomic Potentials) is available at https://mlip.skoltech.ru/download/.
Steven Dahdah and James Richard Forbes 2024 Mach. Learn.: Sci. Technol. 5 025038
This paper proposes a method to identify a Koopman model of a feedback-controlled system given a known controller. The Koopman operator allows a nonlinear system to be rewritten as an infinite-dimensional linear system by viewing it in terms of an infinite set of lifting functions. A finite-dimensional approximation of the Koopman operator can be identified from data by choosing a finite subset of lifting functions and solving a regression problem in the lifted space. Existing methods are designed to identify open-loop systems. However, it is impractical or impossible to run experiments on some systems, such as unstable systems, in an open-loop fashion. The proposed method leverages the linearity of the Koopman operator, along with knowledge of the controller and the structure of the closed-loop (CL) system, to simultaneously identify the CL and plant systems. The advantages of the proposed CL Koopman operator approximation method are demonstrated in simulation using a Duffing oscillator and experimentally using a rotary inverted pendulum system. An open-source software implementation of the proposed method is publicly available, along with the experimental dataset generated for this paper.
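The regression step described in this abstract (choose a finite set of lifting functions, then fit a linear map in the lifted space) can be sketched as follows. This is a minimal open-loop illustration only, not the closed-loop method the paper proposes. The toy map `step`, the lifting functions in `lift`, and all parameter values are hypothetical choices made so that the lifted dynamics are exactly linear.

```python
import random

def lift(x):
    """Hypothetical lifting functions: the state plus one quadratic term."""
    x1, x2 = x
    return [x1, x2, x1 * x1]

def step(x, a=0.9, b=0.5, c=0.3):
    """Toy nonlinear map (an assumption) that is linear in the lifted space."""
    x1, x2 = x
    return (a * x1, b * x2 + c * x1 * x1)

def solve(G, rhs):
    """Solve G y = rhs by Gaussian elimination with partial pivoting."""
    n = len(G)
    M = [row[:] + [r] for row, r in zip(G, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * y[k] for k in range(i + 1, n))
        y[i] = (M[i][n] - s) / M[i][i]
    return y

def fit_koopman(snapshots):
    """Least-squares K minimising sum ||lift(x_next) - K lift(x)||^2."""
    n = len(lift(snapshots[0][0]))
    G = [[0.0] * n for _ in range(n)]  # sum of z z^T
    A = [[0.0] * n for _ in range(n)]  # sum of z_next z^T
    for x, x_next in snapshots:
        z, zn = lift(x), lift(x_next)
        for i in range(n):
            for j in range(n):
                G[i][j] += z[i] * z[j]
                A[i][j] += zn[i] * z[j]
    # K G = A and G is symmetric, so each row of K solves G k = (row of A).
    return [solve(G, A[i]) for i in range(n)]

# Collect one-step snapshot pairs from random initial states.
random.seed(0)
data = []
for _ in range(200):
    x = (random.uniform(-1, 1), random.uniform(-1, 1))
    data.append((x, step(x)))

K = fit_koopman(data)
```

For this toy map the lifted dynamics are exactly linear, so the fitted `K` should recover the rows `[a, 0, 0]`, `[0, b, c]` and `[0, 0, a*a]` up to rounding error. For a general nonlinear system a finite set of lifting functions gives only an approximation.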
Mario Krenn et al 2020 Mach. Learn.: Sci. Technol. 1 045024
The discovery of novel materials and functional molecules can help to solve some of society's most urgent challenges, ranging from efficient energy harvesting and storage to uncovering novel pharmaceutical drug candidates. Traditionally, matter engineering (generally denoted as inverse design) was based heavily on human intuition and high-throughput virtual screening. The last few years have seen the emergence of significant interest in computer-inspired designs based on evolutionary or deep learning methods. The major challenge here is that the standard string-based molecular representation, SMILES, shows substantial weaknesses in that task because large fractions of strings do not correspond to valid molecules. Here, we solve this problem at a fundamental level and introduce SELFIES (SELF-referencIng Embedded Strings), a string-based representation of molecules which is 100% robust. Every SELFIES string corresponds to a valid molecule, and SELFIES can represent every molecule. SELFIES can be directly applied in arbitrary machine learning models without the adaptation of the models; each of the generated molecule candidates is valid. In our experiments, the model's internal memory stores two orders of magnitude more diverse molecules than a similar test with SMILES. Furthermore, as all molecules are valid, it allows for explanation and interpretation of the internal working of the generative models.
Philippe Schwaller et al 2021 Mach. Learn.: Sci. Technol. 2 015016
Artificial intelligence is driving one of the most important revolutions in organic chemistry. Multiple platforms, including tools for reaction prediction and synthesis planning based on machine learning, have successfully become part of organic chemists' daily laboratory work, assisting in domain-specific synthetic problems. Unlike reaction prediction and retrosynthetic models, the prediction of reaction yields has received less attention in spite of the enormous potential of accurately predicting reaction conversion rates. Reaction yield models, describing the percentage of the reactants converted to the desired products, could guide chemists and help them select high-yielding reactions and score synthesis routes, reducing the number of attempts. So far, yield predictions have been predominantly performed for high-throughput experiments using a categorical (one-hot) encoding of reactants, concatenated molecular fingerprints, or computed chemical descriptors. Here, we extend the application of natural language processing architectures to predict reaction properties given a text-based representation of the reaction, using an encoder transformer model combined with a regression layer. We demonstrate outstanding prediction performance on two high-throughput experiment reaction sets. An analysis of the yields reported in the open-source USPTO data set shows that their distribution differs depending on the mass scale, limiting the data set's applicability in reaction yield prediction.
Moritz Hoffmann et al 2022 Mach. Learn.: Sci. Technol. 3 015009
Generation and analysis of time-series data is relevant to many quantitative fields ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables, dominant transition pathways or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic, thermodynamic and mechanistic properties of the system. Deeptime is a general purpose Python library offering various tools to estimate dynamical models based on time-series data including conventional linear learning methods, such as Markov state models (MSMs), Hidden Markov Models and Koopman models, as well as kernel and deep learning approaches such as VAMPnets and deep MSMs. The library is largely compatible with scikit-learn, having a range of Estimator classes for these different models, but in contrast to scikit-learn also provides deep Model classes, e.g. in the case of an MSM, which provide a multitude of analysis methods to compute interesting thermodynamic, kinetic and dynamical quantities, such as free energies, relaxation times and transition paths. The library is designed for ease of use as well as for easily maintainable and extensible code. In this paper we introduce the main features and structure of the deeptime software. Deeptime can be found under https://deeptime-ml.github.io/.
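To make concrete what estimating a Markov state model from time-series data involves, the sketch below counts transitions in a discretised trajectory, row-normalises the counts into a transition matrix, and obtains the stationary distribution by power iteration. It deliberately does not use the deeptime API. The function names, the toy trajectory and the naive (non-reversible) count-based estimator are assumptions made for illustration.

```python
def transition_matrix(dtraj, n_states, lag=1):
    """Row-normalised transition counts at the given lag time."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    T = []
    for row in counts:
        total = sum(row)
        # States that are never left get an all-zero row in this naive sketch.
        T.append([c / total if total else 0.0 for c in row])
    return T

def stationary_distribution(T, n_iter=1000):
    """Approximate the left eigenvector for eigenvalue 1 by power iteration."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical two-state discrete trajectory.
dtraj = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]
T = transition_matrix(dtraj, n_states=2)
pi = stationary_distribution(T)
```

Quantities such as free energies and relaxation times are derived from the transition matrix and its stationary distribution. A production estimator would also handle disconnected states, reversibility constraints and statistical uncertainty.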
Arsenii Senokosov et al 2024 Mach. Learn.: Sci. Technol. 5 015040
Image classification, a pivotal task in multiple industries, faces computational challenges due to the burgeoning volume of visual data. This research addresses these challenges by introducing two quantum machine learning models that leverage the principles of quantum mechanics for effective computations. Our first model, a hybrid quantum neural network with parallel quantum circuits, enables the execution of computations even in the noisy intermediate-scale quantum era, where circuits with a large number of qubits are currently infeasible. This model demonstrated a record-breaking classification accuracy of 99.21% on the full MNIST dataset, surpassing the performance of known quantum–classical models, while having eight times fewer parameters than its classical counterpart. Also, the results of testing this hybrid model on Medical MNIST (classification accuracy over 99%) and on CIFAR-10 (classification accuracy over 82%) can serve as evidence of the generalizability of the model and highlight the efficiency of quantum layers in distinguishing common features of input data. Our second model introduces a hybrid quantum neural network with a Quanvolutional layer, reducing image resolution via a convolution process. The model matches the performance of its classical counterpart, having four times fewer trainable parameters, and outperforms a classical model with equal weight parameters. These models represent advancements in quantum machine learning research and illuminate the path towards more accurate image classification systems.
Leopoldo Sarra et al 2024 Mach. Learn.: Sci. Technol. 5 025029
Despite rapid progress in the field, it is still challenging to discover new ways to leverage quantum computation: all quantum algorithms must be designed by hand, and quantum mechanics is notoriously counterintuitive. In this paper, we study how artificial intelligence, in the form of program synthesis, may help overcome some of these difficulties, by showing how a computer can incrementally learn concepts relevant to quantum circuit synthesis with experience, and reuse them in unseen tasks. In particular, we focus on the decomposition of unitary matrices into quantum circuits, and show how, starting from a set of elementary gates, we can automatically discover a library of useful new composite gates and use them to decompose increasingly complicated unitaries.
Alexandr Sedykh et al 2024 Mach. Learn.: Sci. Technol. 5 025045
Finding the distribution of the velocities and pressures of a fluid by solving the Navier–Stokes equations is a principal task in the chemical, energy, and pharmaceutical industries, as well as in mechanical engineering and in design of pipeline systems. With existing solvers, such as OpenFOAM and Ansys, simulations of fluid dynamics in intricate geometries are computationally expensive and require re-simulation whenever the geometric parameters or the initial and boundary conditions are altered. Physics-informed neural networks (PINNs) are a promising tool for simulating fluid flows in complex geometries, as they can adapt to changes in the geometry and mesh definitions, allowing for generalization across fluid parameters and transfer learning across different shapes. We present a hybrid quantum PINN (HQPINN) that simulates laminar fluid flow in 3D Y-shaped mixers. Our approach combines the expressive power of a quantum model with the flexibility of a PINN, resulting in a 21% higher accuracy compared to a purely classical neural network. Our findings highlight the potential of machine learning approaches, and in particular HQPINN, for complex shape optimization tasks in computational fluid dynamics. By improving the accuracy of fluid simulations in complex geometries, our research using hybrid quantum models contributes to the development of more efficient and reliable fluid dynamics solvers.
Thomas Penfold et al 2024 Mach. Learn.: Sci. Technol. 5 021001
Computational spectroscopy has emerged as a critical tool for researchers looking to achieve both qualitative and quantitative interpretations of experimental spectra. Over the past decade, increased interactions between experiment and theory have created a positive feedback loop that has stimulated developments in both domains. In particular, the increased accuracy of calculations has led to them becoming an indispensable tool for the analysis of spectroscopies across the electromagnetic spectrum. This progress is especially well demonstrated for short-wavelength techniques, e.g. core-hole (x-ray) spectroscopies, whose prevalence has increased following the advent of modern x-ray facilities including third-generation synchrotrons and x-ray free-electron lasers. While calculations based on well-established wavefunction or density-functional methods continue to dominate the greater part of spectral analyses in the literature, emerging developments in machine-learning algorithms are beginning to open up new opportunities to complement these traditional techniques with fast, accurate, and affordable 'black-box' approaches. This Topical Review recounts recent progress in data-driven/machine-learning approaches for computational x-ray spectroscopy. We discuss the achievements and limitations of the presently-available approaches and review the potential that these techniques have to expand the scope and reach of computational and experimental x-ray spectroscopic studies.
Jonathan Kipp et al 2024 Mach. Learn.: Sci. Technol. 5 025060
The anomalous Hall effect has been front and center in solid state research and material science for over a century now, and the complex transport phenomena in nontrivial magnetic textures have gained an increasing amount of attention, both in theoretical and experimental studies. However, a clear path forward to capturing the influence of magnetization dynamics on the anomalous Hall effect, even in the smallest frustrated magnets or spatially extended magnetic textures, is still intensively sought after. In this work, we present an expansion of the anomalous Hall tensor into symmetrically invariant objects, encoding the magnetic configuration up to arbitrary power of spin. We show that these symmetric invariants can be utilized in conjunction with advanced regularization techniques in order to build models for the electric transport in magnetic textures which are, on one hand, complete with respect to the point group symmetry of the underlying lattice, and on the other hand, depend on a minimal number of order parameters only. Here, using a four-band tight-binding model on a honeycomb lattice, we demonstrate that the developed method can be used to address the importance and properties of higher-order contributions to transverse transport. The efficiency and breadth enabled by this method provide an ideal systematic approach to tackle the inherent complexity of response properties of noncollinear magnets, paving the way to the exploration of electric transport in intrinsically frustrated magnets as well as large-scale magnetic textures.
Anjana S Desai et al 2024 Mach. Learn.: Sci. Technol. 5 025059
This research underscores the profound impact of data cleansing, ensuring dataset integrity and providing a structured foundation for unraveling convoluted connections between diverse physical properties and cytotoxicity. As the scientific community delves deeper into this interplay, it becomes clear that precise data purification is a fundamental aspect of investigating parameters within datasets. The study presents the need for data filtration in the context of machine learning (ML), which has widened its horizon into the field of biological application through the amalgamation of predictive systems and algorithms that delve into the intricate characteristics of cytotoxicity of nanoparticles. The reliability and accuracy of models in the ML landscape hinge on the quality of input data, making data cleansing a critical component of the pre-processing pipeline. The main challenge faced here is that lengthy, broad and complex datasets have to be pared down for further studies. Through a thorough data cleansing process, this study addresses the complexities arising from diverse sources, resulting in a refined dataset. The filtration process employs K-means clustering to derive centroids, revealing the correlation between the physical properties of nanoparticles, viz, concentration, zeta potential, hydrodynamic diameter, morphology, and absorbance wavelength, and cytotoxicity outcomes measured in terms of cell viability. The cell lines considered for determining the centroid values that predict the cytotoxicity of silver nanoparticles are human and animal cell lines which were categorized as normal and carcinoma type. The objective of the study is to simplify the high-dimensional data for accurate analysis of the parameters that affect the cytotoxicity of silver NPs through centroids.
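The centroid-derivation step named in this abstract can be illustrated with a minimal K-means (Lloyd's algorithm) sketch. The data points below are made-up two-dimensional feature vectors, not values from the study, and the function name `kmeans` and its explicit `init` argument are hypothetical. Real nanoparticle features would normally be rescaled to comparable ranges first.

```python
def kmeans(points, init, n_iter=50):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = [tuple(c) for c in init]
    k = len(centroids)
    clusters = [[] for _ in range(k)]
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: each centroid moves to the mean of its members.
        new = []
        for c, members in zip(centroids, clusters):
            if members:
                new.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new.append(c)  # keep an empty cluster's centroid in place
        if new == centroids:
            break
        centroids = new
    return centroids, clusters

# Hypothetical, already-scaled feature vectors forming two groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
          (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
centroids, clusters = kmeans(points, init=[points[0], points[3]])
```

Each resulting centroid summarises one group of samples, which is how a large table of measurements can be reduced to a few representative points for further analysis.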
Stefan Heinen et al 2024 Mach. Learn.: Sci. Technol. 5 025058
For many machine learning applications in science, data acquisition, not training, is the bottleneck even when avoiding experiments and relying on computation and simulation. Correspondingly, and in order to reduce cost and carbon footprint, training data efficiency is key. We introduce minimal multilevel machine learning (M3L) which optimizes training data set sizes using a loss function at multiple levels of reference data in order to minimize a combination of prediction error with overall training data acquisition costs (as measured by computational wall-times). Numerical evidence has been obtained for calculated atomization energies and electron affinities of thousands of organic molecules at various levels of theory including HF, MP2, DLPNO-CCSD(T), DFHFCABS, PNOMP2F12, and PNOCCSD(T)F12, and treating them with basis sets TZ, cc-pVTZ, and AVTZ-F12. Our M3L benchmarks for reaching chemical accuracy in distinct chemical compound sub-spaces indicate substantial computational cost reductions by factors of ∼1.01, 1.1, 3.8, 13.8, and 25.8 when compared to heuristic sub-optimal multilevel machine learning (M2L) for the data sets QM7b, QM9, Electrolyte Genome Project, QM9, and QM9, respectively. Furthermore, we use M2L to investigate the performance for 76 density functionals when used within multilevel learning and building on the following levels drawn from the hierarchy of Jacob's Ladder: LDA, GGA, mGGA, and hybrid functionals. Within M2L and the molecules considered, mGGAs do not provide any noticeable advantage over GGAs. Among the functionals considered and in combination with LDA, the three on average top performing GGA and Hybrid levels for atomization energies on QM9 using M3L correspond respectively to PW91, KT2, B97D, and τ-HCTH, B3LYP(VWN5), and TPSSH.
Charles Fox et al 2024 Mach. Learn.: Sci. Technol. 5 025057
Symbolic regression (SR) can generate interpretable, concise expressions that fit a given dataset, allowing for more human understanding of the structure than black-box approaches. The addition of background knowledge (in the form of symbolic mathematical constraints) allows for the generation of expressions that are meaningful with respect to theory while also being consistent with data. We specifically examine the addition of constraints to traditional genetic algorithm (GA) based SR (PySR) as well as a Markov-chain Monte Carlo (MCMC) based Bayesian SR architecture (Bayesian Machine Scientist), and apply these to rediscovering adsorption equations from experimental, historical datasets. We find that, while hard constraints prevent GA and MCMC SR from searching, soft constraints can lead to improved performance both in terms of search effectiveness and model meaningfulness, with computational costs increasing by about an order of magnitude. If the constraints do not correlate well with the dataset or expected models, they can hinder the search of expressions. We find incorporating these constraints in Bayesian SR (as the Bayesian prior) is better than by modifying the fitness function in the GA.
Jakub Rydzewski et al 2023 Mach. Learn.: Sci. Technol. 4 031001
Analyzing large volumes of high-dimensional data requires dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. Such practice is needed in atomistic simulations of complex systems where even thousands of degrees of freedom are sampled. An abundance of such data makes gaining insight into a specific physical problem strenuous. Our primary aim in this review is to focus on unsupervised machine learning methods that can be used on simulation data to find a low-dimensional manifold providing a collective and informative characterization of the studied process. Such manifolds can be used for sampling long-timescale processes and free-energy estimation. We describe methods that can work on datasets from standard and enhanced sampling atomistic simulations. Unlike recent reviews on manifold learning for atomistic simulations, we consider only methods that construct low-dimensional manifolds based on Markov transition probabilities between high-dimensional samples. We discuss these techniques from a conceptual point of view, including their underlying theoretical frameworks and possible limitations.
James Stokes et al 2023 Mach. Learn.: Sci. Technol. 4 021001
This article aims to summarize recent and ongoing efforts to simulate continuous-variable quantum systems using flow-based variational quantum Monte Carlo techniques, focusing for pedagogical purposes on the example of bosons in the field amplitude (quadrature) basis. Particular emphasis is placed on the variational real- and imaginary-time evolution problems, carefully reviewing the stochastic estimation of the time-dependent variational principles and their relationship with information geometry. Some practical instructions are provided to guide the implementation of a PyTorch code. The review is intended to be accessible to researchers interested in machine learning and quantum information science.
Bahram Jalali et al 2022 Mach. Learn.: Sci. Technol. 3 041001
The phenomenal success of physics in explaining nature and engineering machines is predicated on low dimensional deterministic models that accurately describe a wide range of natural phenomena. Physics provides computational rules that govern physical systems and the interactions of the constituents therein. Led by deep neural networks, artificial intelligence (AI) has introduced an alternate data-driven computational framework, with astonishing performance in domains that do not lend themselves to deterministic models such as image classification and speech recognition. These gains, however, come at the expense of predictions that are inconsistent with the physical world as well as computational complexity, with the latter placing AI on a collision course with the expected end of the semiconductor scaling known as Moore's Law. This paper argues how an emerging symbiosis of physics and AI can overcome such formidable challenges, thereby not only extending AI's spectacular rise but also transforming the direction of engineering and physical science.
Vigl et al
In this work we demonstrate that significant gains in performance and data efficiency can be achieved in High Energy Physics (HEP) by moving beyond the standard paradigm of sequential optimization of reconstruction and analysis components. We conceptually connect HEP reconstruction and analysis to modern machine learning workflows such as pretraining, finetuning, domain adaptation and high-dimensional embedding spaces, and quantify the gains in the example use case of searches for heavy resonances decaying via an intermediate di-Higgs system to four b-jets. To our knowledge this is the first example of a low-level feature extraction network finetuned for a downstream HEP analysis objective.
Gupta et al
Design of high entropy alloys (HEA) presents a significant challenge due to the large compositional space and composition-specific variation in their functional behavior. The traditional alloy design would include trial-and-error prototyping and high-throughput experimentation, which again is challenging due to large-scale fabrication and experimentation. To address these challenges, this article presents a computational strategy for HEA design based on the seamless integration of quasi-random sampling, molecular dynamics (MD) simulations and machine learning (ML). A limited number of algorithmically chosen molecular-level simulations are performed to create a Gaussian process-based computational mapping between the varying concentrations of constituent elements of the HEA and effective properties like Young's modulus and density. The computationally efficient ML models are subsequently exploited for large-scale predictions and multi-objective functionality attainment with non-aligned goals. The study reveals that there exists a strong negative correlation between Al concentration and the desired effective properties of AlCoCrFeNi HEA, whereas the Ni concentration exhibits a strong positive correlation. The deformation mechanism further shows that excessive increase of Al concentration leads to a higher percentage of FCC to BCC phase transformation which is found to be relatively lower in the HEA with reduced Al concentration. Such physical insights during the deformation process would be crucial in the alloy design process along with the data-driven predictions. As an integral part of this investigation, the developed ML models are interpreted based on Shapley Additive exPlanations, which are essential to explain and understand the model's mechanism along with meaningful deployment. The data-driven strategy presented here will lead to devising an efficient explainable machine learning-based bottom-up approach to alloy design for multi-objective non-aligned functionality attainment.
Liu et al
The Particle Swarm Optimization (PSO) algorithm is easy to implement owing to its simple framework, and has been successfully applied to many optimization problems. However, the standard PSO easily falls into local optima and has weak search ability. To enhance the optimization ability of the algorithm, this paper proposes an adaptive particle swarm optimization with information interaction mechanism (APSOIIM). First, a chaotic sequence strategy is used to produce uniformly distributed particles and enhance their convergence speed at the initialization stage of the algorithm. Then, an information interaction mechanism is introduced to enhance the diversity of the population as the search progresses; it interacts with the best information of neighboring particles to maintain the balance between exploration and exploitation. In addition, convergence is proven to verify the robustness and efficiency of the proposed APSOIIM algorithm. Finally, the proposed APSOIIM is applied to solve the CEC2014 and CEC2017 benchmark functions as well as well-known engineering optimization problems. The experimental results show that the proposed APSOIIM has significant advantages over the compared algorithms.
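For reference, the standard PSO baseline that this abstract sets out to improve can be sketched in a few lines. This is the plain global-best variant with an inertia weight, not the proposed APSOIIM: it uses uniform random initialization rather than a chaotic sequence and has no information interaction mechanism. The function name `pso`, the parameter values and the sphere test function are assumptions chosen for illustration.

```python
import random

def pso(f, dim, bounds, n_particles=30, n_iter=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO with inertia weight w and acceleration terms c1, c2."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Convex test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

On a single-minimum function like the sphere, this baseline converges readily. The local-optimum problem the abstract describes shows up on multimodal functions, where every particle is pulled towards the same global-best position and the swarm loses diversity.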
Luce et al
Optimizing the shapes and topology of physical devices is crucial for both scientific and technological advancements, given their wide-ranging implications across numerous industries and research areas. Innovations in shape and topology optimization have been observed across a wide range of fields, notably structural mechanics, fluid mechanics, and more recently, photonics. Gradient-based inverse design techniques have been particularly successful for photonic and optical problems, resulting in integrated, miniaturized hardware that has set new standards in device performance. To calculate the gradients, there are typically two approaches: implementing specialized solvers using automatic differentiation, or deriving analytical solutions for gradient calculation and adjoint sources by hand. In this work, we propose a middle ground and present a hybrid approach that leverages the benefits of automatic differentiation for handling gradient derivation while using existing, proven but black-box photonic solvers for numerical solutions. Utilizing the adjoint method, we make existing numerical solvers differentiable and seamlessly integrate them into an automatic differentiation framework. Further, this enables users to integrate the optimization environment seamlessly with other autodifferentiable components such as machine learning, geometry generation, or intricate post-processing, which could lead to better photonic design workflows. We illustrate the approach through two distinct photonic optimization problems: optimizing the Purcell factor of a magnetic dipole in the vicinity of an optical nanocavity and enhancing the light extraction efficiency of a μLED.
Lee et al
Variational quantum machine learning (VQML) models based on parameterized quantum circuits (PQC) have been expected to offer a potential quantum advantage for machine learning applications. However, comparison between VQML models and their classical counterparts is hard due to the lack of interpretability of VQML models. In this study, we introduce a graphical approach to analyze the PQC and the corresponding operation of VQML models to deal with this problem. In particular, we utilize the Stokes representation of quantum states to treat VQML models as network models based on the corresponding representations of basic gates. From this approach, we suggest the notion of active paths in the networks and relate the expressivity of VQML models to it. We investigate the growth of active paths in VQML models and observe that the expressivity of VQML models can be significantly limited for certain cases. Then we construct classical models inspired by our graphical interpretation of VQML models and show that they can emulate or outperform the outputs of VQML models for these cases. Our result provides a new way to interpret the operation of VQML models and facilitates the interconnection between quantum and classical machine learning areas.