ICSE 2021 Report

International Conference on Software Engineering
May 25 - June 4, 2021

I attended the ICSE 2021 conference - an online conference this year. My main purpose was to present a paper on company-university collaborations at one of the ICSE workshops. I also viewed some of the paper presentations and keynotes.

Workshop on Software Engineering Research and Industrial Practice

I was at the 8th International Workshop on Software Engineering Research and Industrial Practice (SER&IP) -- more news about this session soon.

Main Conference Panel Sessions

I viewed two very interesting panel sessions during the main ICSE conference. The first panel was a discussion of whether academic researchers in Software Engineering are doing a good job of meeting the needs of the wider world. The panel title was a bit provocative: "Are academics working on the right problem?"

The panelists represented four universities and a large industrial research lab, and they all agreed that Software Engineering research is not doing as well as it should. Here are the main points I collected from the discussion - the panelists' advice to software researchers:

The second panel was a discussion of diversity - with a good collection of young researchers represented on the panel. Part of the discussion addressed the current situation at the University of Michigan. But in general, the discussion centered on things that women and minorities (and their allies) can do to improve the diversity situation across the research community. The co-chairs of the panel plan to put together a report about the discussion - more news soon.

Keynote talks

The most interesting keynote talks were delivered by Prem Devanbu (University of California - Davis) and Elaine Weyuker (University of Central Florida). Both Prem and Elaine are former Bell Labs researchers.

Prem gave an overview of his current research in code analysis: "Naturalness and Bimodality of Code." Prem's explanation is that some of the techniques from Natural Language Processing (including some Machine Learning results) can also apply to analyzing software code. Although we think of code as a very formal engineering product, code actually shares a lot of characteristics with natural languages. The "utterances" in a natural language have certain statistical properties (such as repetitiveness - certain combinations of words and phrases are often used together), and software also has this property. And software, although it is written to run on machines, is also designed to be read by humans, so the structure of real code is not that different from natural languages, which are also used to communicate between people. Prem has been working with researchers who are trying to collect information about large code bases using some of the same Machine Learning algorithms used for NLP. Prem talked about the challenges -- not everything in software development is a "translation from one language to another." But there is some promise in this approach.
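
To make the repetitiveness idea a bit more concrete, here is a minimal sketch of my own (a toy illustration, not Prem's method). It measures how often adjacent token pairs repeat in a snippet of code. The tokenizer, the function names, and the sample snippet are all assumptions made up for this example.

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical, deliberately crude tokenizer: identifiers, integers,
    # or any single non-whitespace character (operators, brackets, ...).
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", text)


def repeated_bigram_fraction(tokens):
    """Fraction of bigram occurrences whose bigram appears more than once."""
    bigrams = list(zip(tokens, tokens[1:]))
    if not bigrams:
        return 0.0
    counts = Counter(bigrams)
    # Count every occurrence of a bigram that is seen at least twice.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(bigrams)


# Invented sample: two loops that follow the same idiom.
code_sample = (
    "for i in range(n): total += a[i]\n"
    "for j in range(m): total += b[j]"
)

if __name__ == "__main__":
    tokens = tokenize(code_sample)
    print(f"{repeated_bigram_fraction(tokens):.2f}")
```

A higher fraction means the token stream reuses the same local patterns more often. Real studies would use proper language models over large code bases; this only shows the kind of statistic involved.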

Elaine Weyuker's talk, "A View from 40 Years in the Research Trenches," was a review of her career working in software, from academic research to industrial labs like Bell Labs. Elaine had two main points. First, there has been a lack of interest in doing effective software testing for a long time. She decided to focus her research on testing approaches because it was going to be important for software quality. But most companies don't use any of the important results from software testing research. Second, she encouraged all software engineering researchers to get some experience in real-world software development, and not just open-source development.

In two other keynote talks, Dario Gil (IBM Research) and Michael Lyu (Chinese University of Hong Kong) spent a lot of time talking about the future of using AI and Machine Learning to improve software development. Michael's presentation was the best -- he outlined a set of combined approaches to use AI and Machine Learning techniques to improve the management of Cloud Computing providers. The scale and criticality of cloud operations will require normal operations to be as automated as possible -- including monitoring the systems for anomalies, diagnosing failures, and performing recovery actions when there are failures. Dario split his talk between AI/ML for software and a discussion of IBM's Quantum Computing efforts. The biggest effort by IBM in the area of using AI/ML is the creation of IBM's Project CodeNet. They are building a large repository of tested software code that can be used as input data for Machine Learning models -- the intent is that this will be as influential as the ImageNet repository of image processing data. He thinks that AI/ML could be used to build tools to modernize legacy code and convert existing applications to microservices.

The last keynote talk, "Data for Good" by Jeannette Wing, was a bit of a pleasant surprise. I had viewed a pretty good talk of the same title in spring 2020, and I expected a simple rerun. However, her new talk was excellent. She addressed a number of important issues about Data Science and AI/ML.

Jeannette explained the problem of building "classifiers" (a common use for Machine Learning), with the property of being both Fair and Robust. The degree of fairness of some of the existing classifiers (such as a system used in some US state courts to assist in determining risk level for people convicted of crimes) seems to be dependent on the tuning parameters in the Machine Learning framework - very small perturbations of the parameters may convert the classifier from fair to unfair. This means that the classifier may not be robust. She is working on approaches that use two-person game theory to come up with approaches that are both fair and robust.
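
The sensitivity problem can be shown with a small sketch of my own (not taken from the talk). It assumes demographic parity as the fairness measure and treats a single decision threshold as the "tuning parameter." The scores, the group names, and the 0.1 tolerance are all invented for illustration.

```python
def positive_rate(scores, threshold):
    """Share of individuals classified as positive (score >= threshold)."""
    return sum(s >= threshold for s in scores) / len(scores)


def parity_gap(scores_a, scores_b, threshold):
    """Demographic parity gap: difference in positive rates between groups."""
    return abs(positive_rate(scores_a, threshold)
               - positive_rate(scores_b, threshold))


# Invented classifier scores for two hypothetical groups.
group_a = [0.30, 0.49, 0.51, 0.70]
group_b = [0.30, 0.45, 0.53, 0.70]

TOLERANCE = 0.1  # assumed cutoff for calling the classifier "fair"

if __name__ == "__main__":
    for threshold in (0.50, 0.52):
        gap = parity_gap(group_a, group_b, threshold)
        verdict = "fair" if gap <= TOLERANCE else "unfair"
        print(f"threshold={threshold:.2f} gap={gap:.2f} {verdict}")
```

With these made-up scores, moving the threshold from 0.50 to 0.52 changes the gap from 0 to 0.25, so a tiny perturbation flips the verdict. That is the sense in which a classifier can look fair without being robust.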

In the question and answer session, I had three big takeaways from the discussion:

Jeannette answered a question about how much we should trust Machine Learning systems. She said "not until we have assurance that the output of the machine learning system is what we want." She thinks that in most cases, there will still need to be a human at the end to evaluate results, especially in life-critical systems. There is a reason we don't have self-driving cars yet.

Information on ICSE 2022 conference

ICSE 2022 will be held the week of May 21-29, 2022 in Pittsburgh. There will be more information on the ICSE 2022 website.


Last modified: May 27, 2021