Thursday, August 26, 2021

Updates

My posting has been slower than usual due to taking a holiday and then being busy catching up with work and finishing a grant application. A few things that I have been involved in over the past few weeks:

  • On Monday I gave a (virtual) talk on applications of topological data analysis to photonics at the Photonica conference. Here are the slides. The introduction to machine learning is based heavily on lectures given by David Saad last year at IBS PCS (Lecture 1 and Lecture 2) which I found an excellent introduction to the topic. Nice to see that in-person conferences are finally starting to resume in Europe!
  • We received the referee reports for our manuscript on dark soliton detection using persistent homology (discussed in an earlier post). We are fortunate that both referees took the time to carefully read our manuscript and offer detailed suggestions on how to improve the work. The revised manuscript should be significantly clearer and more accessible to non-specialists.
  • I wrote a Perspective on an article about optimised quantum algorithms for quantum simulation of fermionic systems, recently published in Quantum. While I haven't published any research on this topic (yet), it's an area we are currently working on, and writing this summary was good motivation for me to become more familiar with the recent literature.

Monday, August 23, 2021

IBS Conference on Flatbands: symmetries, disorder, interactions and thermalization

Last week my former affiliation, the Institute for Basic Science, hosted a conference on the physics of flatband lattices. I was fortunate to be invited to virtually present a colloquium talk, where I recounted how I became interested in flatband lattices during my PhD studies. I have uploaded my presentation slides, and the recorded talk is also available to watch on YouTube.

Unfortunately I was on leave for most of the week and missed many of the conference talks, so I hope all of the presentations will be uploaded for later viewing. One of the talks I was able to watch live was by Hal Tasaki, one of the founders of the field back in the early 1990s, who shared some of his early encounters with flatbands, including learning about new results by reading preprints distributed via snail mail.

Tuesday, August 17, 2021

Responding to referee reports - part 1

When referees recommend rejection of your manuscript, it's natural to feel angry and to want to fire back at the negative referees, but this is rarely the best course of action. I have refereed hundreds of papers and managed to get many of my own manuscripts past difficult referees and into high-profile journals. Here are some of my thoughts on how best to respond to difficult reports.

Your response to the referee reports should be targeted at the editor handling your manuscript. Ultimately it is the editor who will decide whether your manuscript is accepted or rejected. Therefore when you craft your response letter, keep in mind what the editor wants!

The editor is looking to publish papers of interest to the readers of the journal, papers that are correct, papers that will be well-cited. Your response therefore needs to convince the editor that your manuscript meets these criteria. The editor's job is NOT to simply tally the referees' recommendations and go with the majority opinion. In rare cases, usually for controversial topics, the manuscript may be published even when all referees recommend rejection!

While the referees are anonymous to you, they are known to (and were likely picked by) the handling editor, who thinks they are experts on the subject of your manuscript. Therefore it is unwise to directly attack the referees, as you are implying that the editor made a bad choice. The opinions of more senior and experienced referees will also be weighted more heavily than a report provided by a junior researcher or student.

Perhaps the worst type of referee report to receive is the very brief report that (perhaps incorrectly) summarises your work in one or two sentences and then recommends rejection on subjective grounds. You spent months (or even years!) working on a project, only for someone to skim your manuscript in 10 minutes and dismiss your work out of hand. While editors do not like this kind of report either, such reports are still of use: what the referee does not say can be just as informative as what is said.

A brief, dismissive referee report indicates that the referee did not find your work interesting enough to engage with it and provide detailed criticism or questions. This suggests your manuscript, as currently written, is not of interest to (some) readers of the journal. That alone is sufficient grounds for rejection from high-profile journals such as Physical Review Letters, even if your results are all correct and of interest to specialists in your research area.

Therefore your response should not be to attack the referee for not fairly considering your manuscript or not being an expert in your research area (the referee could be a leader of your field without time to write a more detailed report!). Instead, you should carefully revise your manuscript to improve the presentation and make your results more accessible and interesting to the journal's readership. You should explicitly highlight these revisions in your response to the editor.

On the other hand, long and detailed reports can be viewed favourably by the editor even if the referee is recommending rejection or significant revisions. A detailed report indicates that the referee was at least interested enough in the manuscript to spend time to read it carefully and provide detailed criticism. Thus, the manuscript is of interest to (some) readers of the journal, and the main challenge is to address any technical criticisms (regarding correctness or novelty) raised by the referee.

I will write more on this later.

Friday, August 6, 2021

A setback for quantum machine learning?

Previously I wrote about how the input and output of data are bottlenecks for quantum machine learning. The measurement problem of efficiently obtaining outputs is more of an issue for NISQ quantum machine learning algorithms such as variational methods. The efficient input of large datasets is a problem even for algorithms intended for eventual fault-tolerant quantum computers, which typically assume the data can be quickly queried in quantum superposition, i.e. using quantum RAM (QRAM).

A paper by Ewin Tang recently published in Physical Review Letters rigorously shows that caution is needed when analyzing potential speedups of quantum machine learning algorithms. Using a model for classical data input that is comparable to QRAM (based on classical sampling of data elements), it is shown that quantum principal component analysis and clustering deliver only a polynomial speedup over their classical counterparts. Unfortunately, polynomial speedups are unlikely to be useful in practice due to the massive overhead of quantum error correction.

Interestingly, this work first appeared on arXiv in 2018 and is a follow-up to the author's undergraduate dissertation! Inspiring!

Tuesday, August 3, 2021

Detecting dark solitons using persistent homology - Part 2

Previously I posted about work-in-progress on using persistent homology to analyze BEC images to identify dark solitons. We recently finished this project and the manuscript is now available on arXiv. The approach we ended up using differs from that outlined in my original post.

Our original method was based on applying persistent homology to point clouds formed by the coordinates of low-intensity pixels. When applied to point clouds, persistent homology can reliably distinguish isolated minima (clusters) from the lines (loops) forming the dark solitons. However, the limitation of this method is that it requires an arbitrary choice of cutoff intensity to identify the low-intensity pixels. Ideally, when building models for machine learning we want to minimise the number of hyper-parameters that need to be optimised. For example, identifying optimal neural network hyper-parameters (number of layers, nodes, and their connectivity) is one of the most time-consuming parts of deep learning.
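As a rough sketch of this point-cloud approach (with a made-up toy image and a hypothetical cutoff value; the real pipeline then feeds the point cloud to a persistent homology library), the extraction step looks like this:

```python
import numpy as np

# Toy "BEC image": a bright cloud with a dark horizontal line
# (a stand-in for a dark soliton) running through it.
img = np.ones((8, 8))
img[4, :] = 0.1  # low-intensity stripe

# The cutoff below is exactly the arbitrary hyper-parameter we
# wanted to avoid having to tune.
cutoff = 0.5
points = np.argwhere(img < cutoff)  # (row, col) coordinates of dark pixels

# 'points' would then be passed to a persistent homology library
# (e.g. a Vietoris-Rips filtration) to distinguish isolated minima
# (clusters) from line-like soliton features.
print(points.shape)
```

Changing `cutoff` changes the point cloud, and hence the computed topological features, which is why removing this parameter was desirable.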

Therefore, in the final manuscript we instead used the "lower star image filtration" approach, which can directly compute persistent topological features of image data. This method avoids hyper-parameters and can achieve reasonable accuracy quite quickly.
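To illustrate the idea behind sublevel-set filtrations of images (this is a minimal sketch of the 0-dimensional case only, written for this post; in the manuscript we used an off-the-shelf lower star filtration implementation), one can sweep the pixels in order of increasing intensity and track when connected components are born and merge, using union-find with the elder rule:

```python
import numpy as np

def lower_star_h0(img):
    """0-dimensional (connected-component) persistence of the
    sublevel-set filtration of a 2D image. Returns sorted
    (birth, death) pairs; the component containing the global
    minimum never dies (death = inf)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    parent, birth = {}, {}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    pairs = []
    # Sweep pixels in order of increasing intensity.
    order = sorted((img[i, j], (i, j)) for i in range(h) for j in range(w))
    for val, p in order:
        parent[p], birth[p] = p, val
        i, j = p
        for q in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if q in parent:  # neighbour already entered the filtration
                rp, rq = find(p), find(q)
                if rp != rq:
                    # Elder rule: the component born later dies now.
                    elder, younger = (rp, rq) if birth[rp] <= birth[rq] else (rq, rp)
                    if birth[younger] < val:  # drop zero-persistence pairs
                        pairs.append((birth[younger], val))
                    parent[younger] = elder
    for r in {find(p) for p in parent}:
        pairs.append((birth[r], np.inf))
    return sorted(pairs)

# Two intensity minima (0 and 1) separated by a ridge of height 2:
# the younger minimum is born at 1 and dies when the ridge connects it
# to the older one at 2.
print(lower_star_h0([[0.0, 2.0, 1.0]]))
```

No intensity cutoff appears anywhere: every pixel value participates in the filtration, which is what makes this approach hyper-parameter-free.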

On the machine learning side, we tried several different methods, including support vector machines and logistic regression, which had comparable performance once we identified the relevant features to use. These methods perform very well while using much less training data than neural networks. One important issue I overlooked in my original post was how to properly evaluate the performance of the trained model, given that the training data set had large imbalances between the three image classes. Luckily, scikit-learn includes a variety of metrics suited to imbalanced supervised learning problems.
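For example (a toy sketch with made-up labels, not our actual data), scikit-learn's balanced accuracy averages the per-class recalls, so a classifier that ignores rare classes is penalised even when its plain accuracy looks respectable:

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Imbalanced toy labels: class 0 dominates, classes 1 and 2 are rare.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 0]  # misses one '1' and the only '2'

# Plain accuracy is inflated by the majority class.
print(accuracy_score(y_true, y_pred))           # 7/9 ~ 0.78
# Balanced accuracy = mean per-class recall = (1 + 0.5 + 0)/3.
print(balanced_accuracy_score(y_true, y_pred))  # 0.5
```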

All in all, this project taught me a lot about persistent homology (the power of point summaries of persistence diagrams), machine learning (the power of simple models such as logistic regression if one can identify the right data features), and Jupyter notebooks (the power of being able to revise all the manuscript figures in a few seconds without having to muck around with touching them up in Inkscape).

I've now spent a year and finished two manuscripts exploring topological data analysis. My next goal is to employ these methods to solve some important and timely problem and publish the results in Physical Review Letters (yes, a generic goal that every physicist has!). This is because I think we need high-impact publications in order to convince other physicists that these methods are really powerful and worth learning more about. I have a few promising ideas on problems to attack, so watch this space!