September 2023, vol. 11, No. 3 
IEEE Geoscience and Remote Sensing Magazine


Announcing a new Special Issue:

MICROWAVES IN CLIMATE CHANGE
Initial Manuscript Submission Deadline: July 1st, 2024

Based on feedback from our special series article, "Making Waves: Microwaves in Climate Change," the IEEE Journal of Microwaves (https://mtt.org/publications/journal-of-microwaves/) is putting together a full special issue on this extremely timely topic. Microwave devices, instruments, systems, measurements, applications, and data analysis provide enabling technology and science retrieval in areas related to climate tracking, atmospheric chemistry and evolution, alternative energy development, efficient generation and usage of non-fossil-based fuels, waste conversion, electrification, transportation management, and every sector of society relying on communications.

We are soliciting articles on the following topics and are also entertaining alternative suggestions from interested contributors:
• Active and Passive Microwave Remote Sensing
• Microwave Heating
• Microwave Generation of Alternative Fuels and Catalysts
• Microwave Power Beaming
• Microwave Energy Harvesting
• Microwaves in Fusion and Energy Generation
• Low-Loss Microwave Transmission
• Microwaves in Waste Management
• Microwave-Assisted Chemistry
• Microwaves in Geophysics
• Microwave Data Analysis for Climate Research
• Microwave Tracking for Habitat Assessment and Animal Science
• Microwave Resource Monitoring
• Other topics related to climate science and relevant resource monitoring

If you have a particular idea or topic you would like to contribute, please contact Peter Siegel, JMW Editor-in-Chief ([email protected]). The deadline for submission of initial manuscripts is Monday, July 1st, 2024. Final upload of accepted paper proofs is September 1st, 2024. All production-ready manuscripts will be posted on IEEE Xplore Early Access and will appear in final paginated sequence in the Special Issue, scheduled for final release in October 2024. Contributed papers should be targeted at ten pages, but review and special invited papers can be longer. All submissions will be reviewed in accordance with the normal procedures of the journal. Please tag uploaded papers as "Special Issue" through our Author Portal. We hope you will consider contributing to this Special Issue of the IEEE Journal of Microwaves and continue to support the journal through your regular research submissions.

Digital Object Identifier 10.1109/MGRS.2023.3306031

SEPTEMBER 2023 VOLUME 11, NUMBER 3 WWW.GRSS-IEEE.ORG

FEATURES

8

The Orbital X-Band Real-Aperture Side-Looking Radar of Cosmos-1500

by Ganna B. Veselovska-Maiboroda, Sergey A. Velichko, and Alexander I. Nosich

OPENER CREDIT: LAVA AND SMOKE BLANKET FAGRADALSFJALL IN ICELAND, JULY 2023 © NASA

21

Airborne Lidar Data Artifacts

by Wai Yeung Yan

46

Interferometric Phase Linking

by Dinh Ho Tong Minh and Stefano Tebaldini

63

There Are No Data Like More Data

by Michael Schmitt, Seyed Ali Ahmadi, Yonghao Xu, Gülşen Taşkin, Ujjwal Verma, Francescopaolo Sica, and Ronny Hänsch

ON THE COVER: Fig. S1 from Schmitt et al., page 63 in this issue, showing a schematic illustration of the size measure used to characterize Earth Observation datasets. BACKGROUND IMAGE LICENSED BY INGRAM PUBLISHING

SCOPE IEEE Geoscience and Remote Sensing Magazine (GRSM) will inform readers of activities in the IEEE Geoscience and Remote Sensing Society, its technical committees, and chapters. GRSM will also inform and educate readers via technical papers, provide information on international remote sensing activities and new satellite missions, publish contributions on education activities, industrial and university profiles, conference news, book reviews, and a calendar of important events.

Digital Object Identifier 10.1109/MGRS.2023.3304525


COLUMNS & DEPARTMENTS

4 FROM THE EDITOR

6 PRESIDENT'S MESSAGE

98 SOFTWARE AND DATA SETS

114 CONFERENCE REPORTS

126 TECHNICAL COMMITTEES

MISSION STATEMENT
The IEEE Geoscience and Remote Sensing Society of the IEEE seeks to advance science and technology in geoscience, remote sensing, and related fields using conferences, education, and other resources.

EDITORIAL BOARD
Editor-in-Chief: Dr. Paolo Gamba, University of Pavia, Department of Electrical, Biomedical, and Computer Engineering, Pavia, Italy, [email protected]
Subit Chakrabarti (Cloud to Street, USA)
Gong Cheng (Northwestern Polytechnical University, P.R. China)
Michael Inggs (University of Cape Town, South Africa)
George Komar (NASA retired, USA)
Josée Levesque (Defence Research and Development, Canada)
Andrea Marinoni (UiT, Arctic University of Norway, Norway)
Fabio Pacifici (Maxar, USA)
Mario Parente (University of Massachusetts, USA)
Nirav N. Patel (Defense Innovation Unit, USA)
Michael Schmitt (Universität der Bundeswehr, Germany)
Vicky Vanthof (Univ. of Waterloo, Canada)
Hanwen Yu (University of Electronic Science and Technology of China, P.R. China)

GRS OFFICERS
President: Mariko Sofie Burgin, NASA Jet Propulsion Laboratory, USA
Executive Vice President: Saibun Tjuatja, The University of Texas at Arlington, USA
Secretary: Dr. Steven C. Reising, Colorado State University, USA
Chief Financial Officer: Dr. John Kerekes, Rochester Institute of Technology, USA
Vice President of Technical Activities: Dr. Fabio Pacifici, Maxar, USA
Vice President of Meetings and Symposia: Sidharth Misra, NASA-JPL, USA
Vice President of Professional Activities: Dr. Lorenzo Bruzzone, University of Trento, Italy
Vice President of Publications: Alejandro C. Frery, Victoria University of Wellington, NZ
Vice President of Information Resources: Keely L. Roth, Salt Lake City, UT, USA

IEEE PUBLISHING OPERATIONS
Journals Production Manager: Sara T. Scudder
Senior Manager, Production: Katie Sullivan
Senior Art Director: Janet Dudar
Associate Art Director: Gail A. Schnitzer
Production Coordinator: Theresa L. Smith
Advertising Production Manager: Felicia Spagnoli
Production Director: Peter M. Tuohy
Senior Director, Publishing Operations: Dawn M. Melley
Director, Editorial Services: Kevin Lisankie
Director, Business Development–Media & Advertising: Mark David, +1 732 465 6473, [email protected], Fax: +1 732 981 1855

IEEE Geoscience and Remote Sensing Magazine (ISSN 2473-2397) is published quarterly by The Institute of Electrical and Electronics Engineers, Inc., IEEE Headquarters: 3 Park Ave., 17th Floor, New York, NY 10016-5997, +1 212 419 7900. Responsibility for the contents rests upon the authors and not upon the IEEE, the Society, or its members. IEEE Service Center (for orders, subscriptions, address changes): 445 Hoes Lane, Piscataway, NJ 08854, +1 732 981 0060. Individual copies: IEEE members US$20.00 (first copy only), nonmembers US$110.00 per copy. Subscription rates: included in Society fee for each member of the IEEE Geoscience and Remote Sensing Society. Nonmember subscription prices available on request. Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy beyond the limits of U.S. Copyright Law for private use of patrons: 1) those post-1977 articles that carry a code at the bottom of the first page,

provided the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923 USA; 2) pre-1978 articles without fee. For all other copying, reprint, or republication information, write to: Copyrights and Permission Department, IEEE Publishing Services, 445 Hoes Lane, Piscataway, NJ 08854 USA. Copyright © 2023 by the Institute of Electrical and Electronics Engineers, Inc. All rights reserved. Application to Mail at Periodicals Postage Prices is pending at New York, New York, and at additional mailing offices. Canadian GST #125634188. Canada Post Corporation (Canadian distribution) publications mail agreement number 40013885. Return undeliverable Canadian addresses to PO Box 122, Niagara Falls, ON L2E 6S8 Canada. Printed in USA. IEEE prohibits discrimination, harassment, and bullying. For more information, visit http://www.ieee.org/web/aboutus/whatis/policies/p9-26.html.

Digital Object Identifier 10.1109/MGRS.2023.3301259


Harness the publishing power of IEEE Access. ®

IEEE Access is a multidisciplinary open access journal offering high-quality peer review, with an expedited, binary review process of 4 to 6 weeks. As a journal published by IEEE, IEEE Access offers a trusted solution for authors like you to gain maximum exposure for your important research.

Explore the many benefits of IEEE Access:
• Receive high-quality, rigorous peer review in only 4 to 6 weeks
• Reach millions of global users through the IEEE Xplore® digital library by publishing open access
• Submit multidisciplinary articles that may not fit in narrowly focused journals
• Obtain detailed feedback on your research from highly experienced editors
• Establish yourself as an industry pioneer by contributing to trending, interdisciplinary topics in one of the many topical sections IEEE Access hosts
• Present your research to the world quickly since technological advancement is ever-changing
• Take advantage of features such as multimedia integration, usage and citation tracking, and more
• Publish without a page limit for $1,750 per article

Learn more at ieeeaccess.ieee.org

FROM THE EDITOR BY PAOLO GAMBA 

Issues and Special Issues

Digital Object Identifier 10.1109/MGRS.2023.3304503
Date of current version: 19 September 2023

In line with what I did for the June issue, I will use my IEEE Geoscience and Remote Sensing Magazine (GRSM) editorial, on one hand, to summarize the content of the current issue and, on the other hand, to introduce a feature of this magazine that may not be well known to (or understood by) all our readers. Specifically, I will describe the possibility of publishing special issues in GRSM. As mentioned in my first editorial this year, GRSM accepts proposals for special issues. In June I spent some time explaining why articles should be submitted in a white paper format; here I will explain how white papers and special issues go together and what types of special issues are welcome in GRSM.

IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE ISSUE CONTENT
Let us first start, however, by describing the contents of this issue, which is full of valuable technical articles, as well as interesting and timely columns providing insights into what happens in our Society. Indeed, this issue includes four technical articles and five columns. As you have seen, I am pushing to have more columns than in past years because columns are meant to provide information about the activities of the Society and are very valuable for IEEE Geoscience and Remote Sensing Society (GRSS) members. Columns have both an informative and an engaging purpose: readers are informed about what happens in the various GRSS committees and working groups and find out how they could be involved, if interested.

The technical content of this issue encompasses hardware and software and refers to multiple sensors and systems. The first article [A1] introduces the structure, history, and performance of the Cosmos-1500 real-aperture radar, with an interesting description of the challenges faced by its developers in the designing and


operating phase. Still considering radar systems but focusing on Interferometric Synthetic Aperture Radar (InSAR), [A2] reviews the phase linking algorithms that have been proposed, in terms of their efficiency, precision, and usefulness for final applications. As preprocessing tools for various SAR applications, these algorithms match well the topic of the following article [A3], which focuses on the many different types of artifacts that affect light detection and ranging (lidar) datasets. The comprehensive analysis of these artifacts and of the most common techniques to reduce them is a very good introduction to the preprocessing techniques applied to lidar data. Finally, the last technical article in the issue is devoted to datasets coming from multiple sources and their relationship with deep learning techniques [A4]. Indeed, Earth observation (EO) datasets are very different from what is usually considered for algorithm development in computer vision, and a better knowledge of the available datasets and their features is a way to strengthen the link between the GRSS and the computer vision community without losing the link with the physics at the very basis of EO data.

The columns are actually well connected to the fourth technical article by means of the first two of them, because they are reports about new datasets that are available, one for deep learning and the other one for classification purposes. The first column [A5] provides details about a large-scale, global, multimodal, and multiseasonal corpus of satellite imagery from multispectral and radar sensors. The second column [A6] introduces an open dataset that tackles the burned area delineation problem using prefire and postfire Sentinel-2 acquisitions of California forest fires that took place starting in 2015. The following two columns refer to activities performed in the framework of the recent 2023 IEEE International Geoscience and Remote Sensing Symposium. In [A7], the interested reader will find the highlights of the awards and opening sessions during the first day of the conference. The next column [A8] summarizes the


activities related to diversity, equity, and inclusion that were promoted and organized by the GRSS Inspire, Develop, Empower, Advance Committee right before and during the conference. Finally, the last column [A9] reports on a session organized to explore the topic of data science at scale, which was part of the summer school, "High Performance and Disruptive Computing in Remote Sensing," hosted at the University of Iceland last May.

IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE AND SPECIAL ISSUES
In this second part of my editorial, I will introduce GRSM special issues, which are collections of technical articles (neither reviews nor tutorials) that are published in GRSM after a rigorous selection and peer review procedure, based on a specific topic that has been suggested by a few guest editors and accepted by the GRSM editorial board. To match the scope of a magazine like GRSM, such a collection cannot be composed of articles showing specific technical results in the area of the special issue. Instead, special issue articles should be technical articles summarizing the achievements, statuses, and challenges of the technical activities connected with one of the aspects of the aforementioned topic. In the last four years, GRSM has published only one special issue, on the topic of hyperspectral imaging, and the interested reader is referred to that issue to better understand the right blend of technical novelty, deep coverage, and clear explanation that the articles in that issue provided. GRSM has not published any special issues since then, and we do not expect to be able to publish one in 2023, but we are already planning for one or two special issues in 2024.

Keeping the previous considerations in mind, interested guest editors who think they have a valuable proposal for a special issue of GRSM are welcome to submit a proposal using the template, which is available at https://www.grss-ieee.org/wp-content/uploads/2023/06/GRSM-TEMPLATE-FOR-SPECIAL-ISSUE-PROPOSALS.pdf. Such a proposal should introduce the overall topic of the special issue and articulate the intended content, applications, and style of the contributions. The guest editors should also describe potential subdivisions of the contributions into separate sections. Subsequently, the main motivation of the proposed special issue should be described in terms of the timeliness and scientific/technical relevance of the proposed topic and its expected interest for GRSM readers (i.e., the whole GRSS membership). A special issue proposal submission should also include a clear description of the relevance of the proposal to geoscience and remote sensing and to the scope of GRSM. In agreement with the approach and style of the magazine, the guest editors should strive to create a balanced mix between scientific depth and dissemination to a wide public, which would encompass remote sensing scientists, practitioners, and students. Finally, the names of the proposed guest editor team must be included, so that for each member, a biosketch in the


style of IEEE publications is available. In the last portion of the proposal, the guest editors should provide a tentative schedule for all the major deadlines involved in the special issue because, according to GRSM policy, submissions will be articulated in a two-step procedure: First, the guest editors should solicit a short white paper from the invited authors (four to five pages in double-column format). The white paper will summarize the foreseen objectives of the article and discuss the importance of the addressed topic, the impact of the contribution, and the authors' expertise and past activities on the topic. Based on the white papers, the guest editors will select the contributions to be submitted as full articles. The full articles will then be peer-reviewed by international experts, and those accepted for publication will be included in the special issue.

APPENDIX: RELATED ARTICLES
[A1] G. B. Veselovska-Maiboroda, S. A. Velichko, and A. I. Nosich, "The orbital X-band real-aperture side-looking radar of Cosmos-1500: A Ukrainian IEEE Milestone candidate," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 3, pp. 8–20, Sep. 2023, doi: 10.1109/MGRS.2023.3294708.
[A2] D. Ho Tong Minh and S. Tebaldini, "Interferometric phase linking: Algorithm, application, and perspective," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 3, pp. 46–62, Sep. 2023, doi: 10.1109/MGRS.2023.3300974.
[A3] W. Y. Yan, "Airborne lidar data artifacts: What we know thus far," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 3, pp. 21–45, Sep. 2023, doi: 10.1109/MGRS.2023.3285261.
[A4] M. Schmitt et al., "There are no data like more data: Datasets for deep learning in Earth observation," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 3, pp. 63–97, Sep. 2023, doi: 10.1109/MGRS.2023.3293459.
[A5] Y. Wang, N. A. A. Braham, Z. Xiong, C. Liu, C. M. Albrecht, and X. X. Zhu, "SSL4EO-S12: A large-scale multimodal, multitemporal dataset for self-supervised learning in Earth observation," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 3, pp. 98–106, Sep. 2023, doi: 10.1109/MGRS.2023.3281651.
[A6] D. R. Cambrin, L. Colomba, and P. Garza, "CaBuAr: California burned areas dataset for delineation," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 3, pp. 106–113, Sep. 2023, doi: 10.1109/MGRS.2023.3292467.
[A7] A. Moreira, F. Bovolo, D. Long, and A. Plaza, "IGARSS 2023 in Pasadena, California: Impressions of the first days," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 3, pp. 114–125, Sep. 2023, doi: 10.1109/MGRS.2023.3303685.
[A8] V. Vanthof, H. McNairn, S. Tumampos, and M. Burgin, "Reinforcing our commitment: Why DEI matters for the IEEE GRSS," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 3, pp. 126–129, Sep. 2023, doi: 10.1109/MGRS.2023.3303874.
[A9] M. Maskey et al., "A summer school session on mastering geospatial artificial intelligence: From data production to artificial intelligence foundation model development and downstream applications," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 3, pp. 129–132, Sep. 2023, doi: 10.1109/MGRS.2023.3302813.

GRS


PRESIDENT’S MESSAGE BY MARIKO BURGIN

Letter From the President

Digital Object Identifier 10.1109/MGRS.2023.3304090
Date of current version: 19 September 2023

Hello again! My name is Mariko Burgin, and I am the IEEE Geoscience and Remote Sensing Society (GRSS) president. You can reach me at [email protected] and @GRSS_President on X, formerly known as Twitter. I am delighted to report that on 17 June 2023, the GRSS signed two separate memorandums of understanding (MoUs) with two of its IEEE Sister Societies. I signed an MoU with Dr. Stefano Maci, president of the IEEE Antennas and Propagation Society (AP-S) (see Figure 1), and another MoU with Dr. Nuno Borges Carvalho, president of the IEEE Microwave Theory and Technology Society (MTT-S) (see Figure 2). Through these MoUs, the IEEE Societies can facilitate knowledge exchange and enable the dissemination of best practices, research findings, and industry trends through joint technical or social meetings that can be organized within local chapters or conferences. The Sister Societies agreed to work jointly to recruit new members and encourage current members to become members of both Sister Societies. The Sister Societies will explore the possibility of a joint distinguished lecturer program and jointly sponsored special issue programs, and they have also agreed to promote each other's community of conferences, meetings, and trade shows. This will enhance visibility within the relevant fields and allow members access to events exclusively organized by the signatory Sister Society. The Sister Societies will be able to offer an array of promising prospects for each Society's members. Aside from the wealth of expertise, resources, and networks that will be shared through these collaborative agreements, I am optimistic that new ideas and intersociety activities will flourish. These joint partnerships will be a positive step toward synergistic growth and will enrich each Society's operations, contributing to the overall success and impact of the GRSS together with the AP-S and the MTT-S.

FIGURE 1. On 17 June 2023, the GRSS signed an MoU with its IEEE Sister Society, the AP-S. On the left, Dr. Mariko Burgin, president of the GRSS. On the right, Dr. Stefano Maci, president of the AP-S.

FIGURE 2. On 17 June 2023, the GRSS signed an MoU with its IEEE Sister Society, the MTT-S. On the left, Dr. Mariko Burgin, president of the GRSS. On the right, Dr. Nuno Borges Carvalho, president of the MTT-S.


FIGURE 3. Group photo of the Society presidents and presidents-elect of the GRSS, the MTT-S, and the AP-S on 18 June 2023, in Chicago, IL. From left to right, Dr. Saibun Tjuatja (GRSS president-elect), Dr. Mariko Burgin (GRSS president), Dr. Nuno Borges Carvalho (MTT-S president), Dr. Stefano Maci (AP-S president), Dr. Maurizio Bozzi (MTT-S president-elect), and Dr. Branislav Notaros (AP-S president-elect).


I am delighted to renew our commitment to collaboration together with the presidents and presidents-elect of the AP-S and the MTT-S (see Figure 3). I look forward to similar conversations around partnership and collaboration with the IEEE Aerospace and Electronic Systems Society and the IEEE Oceanic Engineering Society, as well as with other Societies and institutions in the remainder of 2023. Would you like to be part of the next step on this journey, to turn these MoUs into actions? Reach out! We are looking for enthusiastic volunteers! We need YOU to shape your community!

Warmly,
Mariko Burgin
IEEE GRSS President
GRS


The Orbital X-Band Real-Aperture Side-Looking Radar of Cosmos-1500
A Ukrainian IEEE Milestone candidate
GANNA B. VESELOVSKA-MAIBORODA, SERGEY A. VELICHKO, AND ALEXANDER I. NOSICH

We revisit the development and operation of the orbital X-band real-aperture side-looking radar (RA-SLR) onboard the USSR satellite Cosmos-1500 in its historical context. This radar was conceived, designed, and tested in the early 1980s and then supervised, in orbit, by a team of Ukrainian scientists and engineers led by Prof. Anatoly I. Kalmykov (1936–1996) at the O. Y. Usikov Institute of Radiophysics and Electronics (IRE) of the National Academy of Sciences of Ukraine (NASU). It had a magnetron source, a 12-m deployable slotted-waveguide antenna, and an onboard signal processing unit. Instead of the preplanned meticulous experiments, only five days after placement into a polar Earth orbit in the autumn of 1983, the SLR of Cosmos-1500 rendered truly outstanding service. It provided a stream of microwave images of the polar sea ice conditions that enabled the rescue of freighters in the Arctic Ocean. Two years later, similar imagery was equally important in the rescue of a motor vessel (MV) in the Antarctic. However, the way to success was far from smooth. Besides the technical problems, Kalmykov had to overcome the jealousy and hostility of his home institute's administration, of colleagues from Moscow research laboratories, and of the high-level USSR bureaucracy. Later, Kalmykov's radar was released to industry and became the main instrument of the USSR and Russian series of remote sensing satellites Okean and of the Ukrainian satellites Sich-1 and Sich-1M. We believe that the RA-SLR of Cosmos-1500 is a good candidate for the status of an IEEE Milestone in Ukraine.

Digital Object Identifier 10.1109/MGRS.2023.3294708 Date of current version: 25 July 2023


INTRODUCTION
In NASA's Space Science Coordinated Archive, there is a page devoted to the Earth satellite Cosmos-1500, launched 40 years ago in a country that does not exist anymore, the USSR [1]. It communicates brief information on that mission: "The Cosmos 1500 spacecraft was a precursor to the operational Russian Okean series of oceanographic remote sensing missions. The Cosmos 1500 tested new sensors and methods of data collection and processing. Cosmos 1500 had the capability of overlapping and processing images from its sensors. Data from Cosmos 1500 were sent directly to ships or automated data receiving stations and applied in navigation in northern oceans. The instrument complement was highlighted by an all-weather Side-Looking Real Aperture Radar operating at 9.5 GHz. Other instruments included a multispectral scanner, a scanning high-frequency radiometer, and transponders for collecting data from ice and buoy transmitters."

Although the organizations and, in part, the people who conceived, designed, and built the main remote sensing instrument onboard Cosmos-1500, an X-band RA-SLR, are still alive, time is merciless, and memory tends to turn human experience into legend. We would like to introduce the readers to the history of the creation and operation of Cosmos-1500. Many interesting details of that story can be found in reviews [2], [3], [4], [5] and a book [6]. However, most of them have never been translated into English and remain unknown to international readers. Besides, the years that have passed since 1983 and the experience of the post-USSR developments enable us to reveal important details that escaped earlier publications, ensure proper positioning of that achievement, and add a "human dimension" to the whole story. This article builds upon the preceding short conference paper [7], which has been considerably extended.


THE THREE-HEADED DRAGON OF USSR SCIENCE
In the USSR, science was a state-owned dragon of three heads, tightly controlled by the Communist Party (CP), whose goals—technological efficiency and political control—had always contradicted each other [8]. The first head was the research and development (R&D) establishments of the ministries, each ministry being a "state inside the state" in the USSR, where no companies existed. Many of these establishments, in engineering sciences, were called design bureaus (DBs) [9]. This head, the richest, was responsible for applied research and the designing and testing of prototypes. To facilitate technology transfer to the industry, every DB was associated with some plant. Of the ministries, the most powerful were those of defense, the nuclear industry, the space industry, the radio industry, communications, the aircraft industry, the shipbuilding industry, the maritime fleet, and some others.

The second head was a network of large laboratories called R&D institutes of the Academy of Sciences (AS) of the USSR (in reality, this was the AS of Russia) and similar academies of sciences of the union republics. The AS of the Ukrainian SSR (now NASU) was the largest of the latter, hosting around 25% of all AS research laboratories and manpower [8]. This head, officially, was responsible for fundamental research using direct state funding; however, it was allowed to compete for the projects funded by the ministries. The third head, the poorest, represented university science, where professors were encouraged to take projects funded, again, by the ministries. This activity was concentrated exclusively in large cities.

Since Stalin's times, the research patterns of all academies of sciences and universities were heavily biased toward technical and engineering sciences with either military or double-purpose applications in mind. The CP and government priorities were crystal clear: 1) nuclear weapons, 2) missiles to deliver nuclear weapons, and 3) radars to aim and guide nuclear weapons. From the 1950s to the end of the USSR in 1992, military-flavored research projects contributed sizable funds to the budgets of all AS institutes related to what we can call, for brevity, the IEEE scope of interests. Still, there existed an important difference between the AS R&D institutes in Russia and those outside of Russia; the latter could not have more than 25% of their total budgets coming from the ministries and industry, while the former were allowed to exceed this limit. Such a limitation had, obviously, political origins and reflected the distrust of the "union republics." It was established by the Science Department of the Central Committee of the Communist Party of the Soviet Union (CC CPSU), the supreme supervising and controlling body over all ministries, academies, and universities.

Of some 50 R&D institutes of NASU, the second-largest cluster, after Kyiv, was and still is in Kharkiv. In particular, IRE (now IRE NASU) used to be the national research center for the physics and technology of microwaves and millimeter waves. The IRE is the focus of our story, together with the Institute of Marine Hydrophysics (IMH NASU) in Sebastopol (currently occupied by Russia). Another R&D establishment that played a crucial role was the DB "Yuznoye" [now DB Pivdenne (DBP)] in Dnepropetrovsk (now Dnipro). This is an engineering laboratory, now independent and then associated with the Yuzhmash (now

FIGURE 1. Anatoly Kalmykov in his office at IRE NASU around 1990.


Pivdenmash) Industry, which was, since the mid-1950s, one of three major rocket, missile, and spacecraft industrial complexes in the USSR [9]. Of course, Pivdenmash belonged to the extremely powerful USSR "Ministry of General Machine Building," an Orwell-style cover name for the Ministry of Space Industry. For instance, the famous SS-18 Satan heavy intercontinental ballistic missile (ICBM) and some of the military satellites were developed and manufactured here until 1992.

PREHISTORY, NAMES, AND DATES
Since 1976, IMH in Sebastopol and DBP in Dnipro were involved in the design of the experimental USSR satellites Cosmos-1076 and Cosmos-1151, equipped with low-resolution radar-like sensors called scatterometers [1], [2], [3], [6]. Their task was determining the parameters of sea waves, in line with a secret decree of the CC CPSU on the development of the general-purpose orbital remote sensing system "Resurs." By that time, IMH had already enjoyed collaboration on sea wave research, using coastal and airborne sensors, with the radar group of Kalmykov at IRE NASU in Kharkiv [10], [11], [12], [13], [14], [15] (Figure 1). However, the "scatterometers" of the late 1970s, which were a sort of radar prototype device, had failed to satisfy the customers, who were from various state services and organizations, including polar navigation, maritime and port services, meteorology, etc. This proved the necessity of more concentrated efforts aimed at the development of active microwave sensors, i.e., radar.

Thanks to the fact that the work on the whole subsystem "Resurs-O" (i.e., oceanic survey satellites) was supervised from DBP (see the "Obstacles to Overcome: Not Only Technical" section), Kalmykov could expect to be at the center of the associated design and testing. However, he lacked both equipment and R&D manpower. Part of the problem was the extreme hostility of the then-IRE administration [6]. According to insiders [16], by the summer of 1979, Kalmykov had given up and decided to move to IMH in Sebastopol. As stated by the same source, it was the IMH director who persuaded the top bosses of the extremely powerful USSR Ministry of Space Industry to intervene and rescue Kalmykov's team at IRE. The then-director of IRE, V. P. Shestopalov, received a phone call from Moscow, suggesting that he urgently organize, at IRE, a research unit dealing with space radio oceanography and sea ice sensing. The ministry also promised to allocate IRE significant funds dedicated to such research. As a result, a 20-strong Department of Earth Remote Sensing Techniques was created at IRE on 1 September 1979, headed by Kalmykov. Immediately, the department initiated the R&D of a novel all-weather active orbital sensor, specifically designed to study the sea surface and ice covers. This was an X-band RA-SLR. One group was designing a 100-kW pulse power magnetron source, another group designed a slotted-waveguide antenna, and still another group was responsible for the


signal processing. A prototype airborne system allowing in-flight testing was also designed, and systematic flights onboard a dedicated MI-8 helicopter were organized. Besides, it was decided to add another, passive sensor to the SLR, working in the millimeter-wave range: a Ka-band radiometer, also developed at IRE. Moreover, to produce the images, an onboard electronic data processing block was developed at DBP and IMH and added both to the airborne prototype and to the orbital system. The very first airborne experiments soon confirmed the high efficiency of the designed instruments for studying water and ice surfaces [17], [18], [19], [20]. The joint use of microwave images obtained from the X-band SLR and Ka-band radiometer offered, in principle, a more efficient study of the state of the sea and ice than using the data from each individual sensor. However, initial tests had also shown that obtaining reliable information on water-surface waving needed a much deeper level of data processing than was available at that time. In contrast, quite reliable data were obtained in a simpler way in the helicopter observations of ice. The results of airborne studies convinced Kalmykov of the favorable prospects for radar observations of sea ice from space. Still, attempts to interpret the ice-sounding data beyond simple discrimination between thin and thick ice did not, unfortunately, lead to the creation of an adequate model. The phenomenon of scattering from ice turned out to be much more complicated than scattering from the water surface. Still, other possible applications emerged, such as wind measurements and oil slick detection [18], [20].

The Cosmos-1500 satellite (Figure 2) was launched on 28 September 1983 from Plesetsk by the Tsyklon-3 rocket vehicle (a derivative of the heavy ICBM SS-18) and placed into a low-altitude, near-circular polar orbit. It remained operational until 16 July 1986. This was the first ever civil satellite to carry an X-band RA-SLR, working at the wavelength of 3.16 cm with vertical polarization; the swath width was about 460 km, and the spatial resolution was 2.4–3.2 km in the flight direction and 1.3–0.6 km in the normal direction [4], [22], depending on the incidence angle (see Table 1). The antenna system was based on a 12-m-long slotted waveguide, which was kept folded at launch and then automatically unfolded in orbit. This radar was supplemented with a 37-GHz, horizontally polarized, side-looking passive radiometer, designed at IRE NASU, and a four-channel visible-range imaging system from the Institute of Radio-Engineering and Electronics (IRE RAS) in Moscow. The polar orbit was selected to provide data on the ice conditions in the Arctic, in the hope that they would be useful for the navigation of ships in the northern latitudes, which are not visible from geostationary satellites. The chosen RA-SLR parameters were considered optimal for all-weather studies of the polar sea ice covers and the dynamics of ice formation, migration, and melting.


FIGURE 2. Cosmos-1500 and its microwave remote sensing instruments: (1) bus, (2) solar panels, (3) rotatable instrument panels, (4) SLR antenna, (5) radiometer, (6) optical sensors, (7) telescopic mast, and (8) gravitational stabilizer [3].

TABLE 1. PARAMETERS OF THE SLR OF COSMOS-1500.

Wavelength: 3.1 cm
Polarization: VV
Viewing angle range: 20–46°
Antenna pattern width (azimuthal plane): 0.2°
Antenna pattern width (elevation plane): 42°
Spatial resolution (along the flight direction): 2.4–3.2 km
Spatial resolution (transverse to the flight direction): 1.3–0.6 km
Average resolution in the swath, provided via APT (UHF band): 0.8 × 2.5 km
Average resolution in the swath, provided via APT (VHF band): 2 × 2.5 km
Receiver sensitivity: −140 dB/W
Transmitter power: 100 kW
Pulse duration: 3 µs
Pulse repetition frequency: 100 Hz
Orbit altitude: 650 km
Orbital inclination: 82.6°
Swath: 450 km

Reproduced from [4] after correction of typos and translation mistakes. VV: vertical transmit-vertical receive.
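These figures are mutually consistent: the along-track resolution is simply the 0.2° beam footprint at slant range, while the across-track resolution is the 3-µs pulse projected onto the ground. The following minimal back-of-the-envelope sketch in Python (our own illustration, not part of the original design documentation) reproduces the Table 1 values under a flat-Earth assumption, taking the incidence angle equal to the viewing angle:

import math

# Back-of-the-envelope cross-check of the Table 1 resolution figures
# (flat-Earth geometry; incidence angle taken equal to the viewing angle).
C = 299_792_458.0              # speed of light, m/s
WAVELENGTH = 0.0316            # m, X-band (Table 1 gives 3.1 cm)
ANTENNA_LEN = 12.0             # m, slotted-waveguide antenna length
PULSE = 3e-6                   # s, pulse duration
ALTITUDE = 650e3               # m, orbit altitude
BEAMWIDTH = math.radians(0.2)  # azimuthal antenna pattern width

# A uniform 12-m aperture gives lambda/L of about 0.15 deg; the usual
# taper broadening (~1.2-1.3x) brings this close to the 0.2 deg above.
print(f"lambda/L = {math.degrees(WAVELENGTH / ANTENNA_LEN):.2f} deg")

for look_deg in (20.0, 46.0):                         # viewing-angle range
    look = math.radians(look_deg)
    slant_range = ALTITUDE / math.cos(look)           # m, flat-Earth slant range
    az_res = slant_range * BEAMWIDTH                  # beam footprint along flight
    gr_res = (C * PULSE / 2) / math.sin(look)         # pulse-limited ground range
    print(f"look {look_deg:.0f} deg: ~{az_res / 1e3:.1f} km along track, "
          f"~{gr_res / 1e3:.1f} km across track")

Running this gives roughly 2.4 km along track and 1.3 km across track at a 20° viewing angle, and roughly 3.3 km and 0.6 km at 46°, agreeing with Table 1 to within rounding.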

Besides the carefully selected radar parameters, the high efficiency of the SLR Cosmos-1500 system was expected due to the simultaneous acquisition of overlapping images from two other sensors, so that three different wavelength bands were involved. This could enable the improved interpretation of images and the elimination of errors in retrieved parameters. Further, the onboard preliminary processing of the radar data and the transmission of the synthesized images, using the simple 137.4-MHz Automatic Picture Transmission (APT) channel, to hundreds of users, including a central site in Moscow and autonomous points in Kharkiv and Sebastopol, was also a very big step ahead. Here, Kalmykov had to fight with Moscow colleagues from IRE RAS, who wanted to have a full monopoly on satellite imagery. Many details of the design and operation of the SLR of Cosmos-1500 can be found in reviews [2], [3], [4], [5], [6], [21], [22], [23]. Being a general-purpose instrument, it greatly outperformed the preceding USSR military orbital SLR "Chaika," aimed at the search for massive surface targets such as U.S. Navy air carriers [3] (see the "Other Contemporary Orbital Radar Systems: A Monster in the Shadow" section).

WHEN LENIN WAS HELPLESS: RESCUE MISSIONS OF THE COSMOS-1500 SIDE-LOOKING RADAR
The work program of the new spacecraft envisaged many weeks of meticulous tests and experiments; however, this had to be greatly revised at the very beginning, as the onboard Ka-band radiometer failed to operate. Moreover, only five days after placement into the polar Earth orbit in the autumn of 1983, the SLR of Cosmos-1500 obtained a new task, which was absolutely unexpected. By the time of the launch of the spacecraft, a true drama had developed on the Northern Maritime Route, which runs all the way from the Atlantic to the Pacific along the Arctic coasts of the Russian Federation. That September, extremely strong northwest winds pushed heavy multiyear ice into the De Long Strait near Wrangel Island, where a caravan of 22 freighters (perhaps several caravans, as sometimes 40 and even 57 ships plus five icebreakers are mentioned) got blocked. The ships were loaded with cargo worth some US$8 billion [23], which was carried as winter supplies to the Arctic regions of the USSR. The MV Nina Sagaidak was soon crushed by the ice and sank (Figure 3), and there was a real threat of further losses, especially as the polar night was approaching. The authorities created an ad hoc interservices staff to monitor and guide the caravans. Besides conventional icebreakers, the first USSR nuclear-powered icebreaker

Lenin was sent to the De Long Strait. However, she soon got one propeller crushed by the ice and the other damaged. Her brand-new nuclear-powered sister ship Brezhnev (named after the recently deceased general secretary of the CC CPSU) had also failed to crush the pack ice. In the polar night season, air surveillance was pointless, and the SLR of Cosmos-1500 became the only available source of trusted sea-ice information in and around the De Long Strait (as already mentioned, the other all-weather instrument, the onboard Ka-band radiometer, failed to work). Already, the first radar images of the disaster area (Figure 4) showed that the situation was not hopeless. Indeed, 100 km north of the caravan, near Wrangel Island, an extensive polynya (a sea area where the ice is either absent or very thin) could be seen, together with a strip of wide cracks and crevasses in heavy multiyear ice along which it was possible to direct the caravan to the polynya. Although the ad hoc staff of the rescue operation was reluctant to trust the microwave imagery, in the total absence of alternatives, it accepted the risks and ordered the icebreaker to go north. On reaching the polynya, the icebreaker and the freighters turned southwest and, in a few days, sailed in safe waters.

FIGURE 3. The MV Nina Sagaidak sinking in the Arctic Ocean in September 1983.

FIGURE 4. The rescue mission of the USSR freighter caravan in the De Long Strait, October 1983. (a) A radar image and (b) a topical map demonstrating the ships' location and the route of their escape from the heavy ice area (thin young ice, one-year ice, and thick perennial ice; the rescue route is the yellow line). Pen.: peninsula. (Source: Reproduced from [4].)

Amazingly, within its 33-month lifetime, the SLR of Cosmos-1500 was destined to fulfill another rescue mission, this time in the Southern Hemisphere [2], [3], [4], [5], [6], [23]. The research MV Mikhail Somov, sent in 1985 to the Antarctic to bring a rotation crew to a USSR polar station, was blocked in 5-m-thick ice. To rescue her, a USSR icebreaker was sent all the way from the Northern Hemisphere; this was the conventional vessel Vladivostok, as New Zealand, where the final replenishment had to take place, had already prohibited nuclear-powered ships from using its harbors. What is important for our story is that onboard the Vladivostok, a satellite information reception point was deployed to ensure the quick reception of microwave radar images from Cosmos-1500. The high polar orbit of Cosmos-1500 was well suited for such a mission. Its images (see Figure 5) enabled daily corrections of the icebreaker route in the ice fields, both on the way to the blocked ship and back to clean waters. At the crucial phase of the operation, radar images revealed a wide polynya in heavy ice, stretching toward the drifting ship. Thanks to this, instead of using a helicopter to evacuate the crew of the drifting ship, which would have had to be abandoned, the icebreaker rushed toward Mikhail Somov, freed her from the trap, and led her out of the ice [23].

When preparing this publication, we discovered that in today's Russia, the role of Cosmos-1500 and its RA-SLR in the maritime rescue missions of 1983 and 1985 is subject to total oblivion. In several "documentary" films made in the 2010s, and even in the Russian-language Wikipedia, the existence of Cosmos-1500 is not mentioned at all; instead, it is the "intuition" of the icebreaker captains that is highly credited. Okay, intuition can be a powerful thing, especially when it is supported by microwave radar images received twice a day. A comprehensive 675-page Russian monograph [24], published in 2010 by the leading staff of the USSR space synthetic aperture radar (SAR) works at the Vega State Co., mentions Cosmos-1500 and its SLR. However, it does not mention its polar sea missions; moreover, instead of IRE NASU, the development of this radar is attributed to the Kharkiv Institute of Radio Electronics, which was a technical university.

It should also be noted that within the 19-month period between the two polar rescue missions, and after the second of them, the RA-SLR of Cosmos-1500 was engaged in its main operational tasks: research into the remote sensing of the mesoscale phenomena caused by the interaction between the ocean and the atmosphere. This related, first of all, to the detection and tracing of cyclones, typhoons, and hurricanes; however, less powerful formations, such as quasi-regular convective cell structures, cloud fronts, and vortices, were also studied [2], [6], [23]. Figure 6 shows optical [Figure 6(a)] and microwave [Figure 6(b) and (c)] images of the tropical cyclone Diana dated 11–12 September 1984. The images of Figure 6(a) and (b) were obtained at the same time, and that of Figure 6(c) was obtained 14 h later. These images allowed the correct estimation of the velocity of the cyclone translocation, around 7 km/h, and of its total power, about 1.2 × 10^8 MW.
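The translocation estimate is simple time-lapse arithmetic: locate the cyclone eye in each image and divide the displacement by the 14-h separation. The sketch below is a hypothetical Python illustration of that computation; the eye coordinates are invented for demonstration, since the article does not give them:

import math

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two points, in kilometers.
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

eye_t0 = (30.0, -76.0)   # eye position in the first image (hypothetical)
eye_t1 = (30.6, -76.7)   # eye position 14 h later (hypothetical)
dt_h = 14.0              # time lapse between the two microwave images

dist_km = haversine_km(*eye_t0, *eye_t1)
print(f"displacement ~{dist_km:.0f} km, speed ~{dist_km / dt_h:.1f} km/h")

With these sample positions, the displacement comes to roughly 95 km, i.e., close to the 7 km/h quoted above.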

FIGURE 5. The rescue mission of the USSR MV Mikhail Somov in the Antarctic, July 1985. (a) A radar image and (b) a topical map demonstrating the ship's location and the escape route of the icebreaker Vladivostok. (Reproduced from [4]; the color legends are the same as in Figure 4. Note a misprint: the continent edge is Antarctica.)

FIGURE 6. Tropical cyclone Diana in the Atlantic Ocean, 11 September 1984. (a) An optical image and (b) and (c) microwave images obtained at 6:30 p.m. (b) and at 8:30 a.m. the next day (c).

OBSTACLES TO OVERCOME: NOT ONLY TECHNICAL
When creating the RA-SLR of Cosmos-1500, Kalmykov had to solve many problems of an organizational, technical, and human nature. Until 1972, Kalmykov closely collaborated with IRE's theoreticians and enjoyed support and encouragement from the first IRE director, O. Y. Usikov. All that changed when the latter was replaced by V. P. Shestopalov by the decision of the CPSU Committee of the Kharkiv Region. The new director was a relative newcomer to the R&D institute as, until the end of the 1960s, he was just an associate professor in physics at second-level universities in Kharkiv. His career skyrocketed when a cousin of his wife became a secretary of the CPSU Committee of the Kharkiv Region. It is no surprise that (as explained in [3] and [6]) he looked at space radar research, which promised both heavy responsibility and tight control from Moscow, as a high-risk activity that should be avoided. As mentioned, by 1979, Kalmykov had gotten so desperate that he decided on moving to IMH. In Sebastopol, his friend (and the head of

the collaborating group) was V.V. Pustovoytenko, who had his own troubles with his administration but was supported both by Moscow and Dnipro. As admitted in [3], [6], the resistance of V. P. Shestopalov was partially overcome only due to the extraordinary personal efforts of S. N. Konyukhov, the head of the rocket division of DBP in Dnipro, who was made responsible for the whole remote sensing payload for the “Resurs-O” program. Here, it should be explained that, in the USSR, R&D centers located not in Russia but in other republics seldom coordinated the state programs. Usually, this was entrusted either to the industry bosses, the military, or the R&D centers in


Moscow and Leningrad. Konyukhov was a rare exception. Perhaps this was because of the success of the ICBM SS-18 development at DBP. Besides, it was his predecessor at DBP, V. M. Kovtunenko, who initiated, in 1974, the development of equipment for the study of the oceans from orbit [3], [5], leading to the previously mentioned secret decree of the CC CPSU and the government (1977) about the creation of the system "Resurs." Still, it was the success of the SAR work onboard the U.S. Seasat in 1978 and its huge effect on the USSR political and military leaders that caused a decision to speed up the work. Such was the background for the phone call to IRE's director from the USSR Ministry of Space Industry about establishing a space radar R&D unit headed by Kalmykov. However, even after obtaining his own department at IRE, with rich funding from the ministry, Kalmykov's working conditions remained far from perfect.

Thus, Konyukhov coordinated the work of all three Ukrainian R&D centers, one ministerial (DBP) and two academic (IRE and IMH). Still, he was supervised by his ministry in Moscow, where other powerful organizations, such as IRE RAS and the almighty ministerial Central R&D Institute of Device Building, were developing the optical sensors and, in part, the equipment for information storage, processing, and transmission to customers for Cosmos-1500. As recalled in [3], a mutually beneficial collaboration between the Ukrainian teams was established quickly; however, a similar level of synergy was never reached with the central organizations. In 1983, the conflict culminated in a series of heated discussions where the directors and leading experts of several powerful Moscow R&D centers attacked Kalmykov, Pustovoytenko, and B. Y. Khmyrov (Konyukhov's successor at DBP). They demanded abandoning SLR in favor of SAR and, therefore, transferring the radar development to their laboratories. To rationalize these demands, which were fed by professional jealousy, they referred to the success of the U.S. Seasat and used a vague accusation of the allegedly insufficient "information potential" of Kalmykov's SLR data. Still, the Moscow SAR designs existed only in drawings and needed at least one more year of intensive development and testing (in reality, the first successful USSR SAR was placed into orbit only in 1991), while Kalmykov had an unbeatable argument: the successful operation of the airborne analog of his SLR. Thanks to this circumstance, Khmyrov expressed full support for Kalmykov, the attacks of the Moscow colleagues were rebuked, the IRE team released the radar, and Cosmos-1500 was assembled at DBP and launched according to schedule.

When Kalmykov's SLR got successfully into its orbit, the feud between the developers faded away, at least for a while. However, suddenly, new powerful opponents emerged. As mentioned, already before the placement of the SLR into orbit, a caravan of USSR freighters had gotten blocked by heavy ice in the Eastern Arctic. The situation was made public by the


authorities, although with a delay, as usual for the USSR media. When Kalmykov learned about it, he tried to approach the ad hoc committee put in charge of the rescue mission to propose his aid. This happened to be nearly impossible. A legend tells that Kalmykov wrestled his way into the committee meeting room, showing the satellite imagery printouts to the KGB guards; more probably, however, he had found someone who played the role of mediator. Still, this was not the end. Unexpectedly, the top administration of the USSR Chief Directorate of the Northern Maritime Route, the dominant service in the rescue committee, displayed a huge distrust of the satellite data, which suggested a nontrivial escape route: to the north of the disaster site. At the crucial moment, Kalmykov had to voice a threat to file a complaint to the superpowerful authority, the CC CPSU. This worked out, and a nuclear-powered icebreaker was ordered to move north.

OTHER CONTEMPORARY ORBITAL RADAR SYSTEMS: A MONSTER IN THE SHADOW
The first space-based microwave Earth imaging experiment, using the L-band SAR of the U.S. Seasat-A satellite, was conducted in 1978. That radar worked for three months at the wavelength of 23 cm with a swath of 100 km and provided a spatial resolution of 25 m [25], [26]. The results of this experiment exceeded all expectations and showed the high capabilities of orbital systems. However, the radar images were synthesized not onboard but on the ground, with great delay, which prevented their use in time-sensitive applications. Essentially the same test SARs operated onboard the Space Shuttle Columbia in 1981 (five days) and 1984 (seven days) [25], [26], [27]. It should be noted that, in parallel to Kalmykov's SLR, USSR SAR systems were also being developed, in Moscow. The test SAR "Travers" was installed onboard the spacecraft Resurs-O-1, launched as Cosmos-1689 in 1985 [2], [3] and, later, on the Priroda module of the orbital station (OS) Mir. The other SAR required the full OS power; it was launched in 1987 onboard Cosmos-1870 and in 1991 on OS Almaz-1 [6], [24]. Despite an order-of-magnitude lower resolution than SAR, SLR was attractive due to its higher radiometric accuracy and an order-of-magnitude wider swath. It could use an available simple magnetron source, which had less stable characteristics than needed for SAR; additionally, onboard image processing, lower cost, and much quicker delivery were also very important.

The orbital system of Cosmos-1500 had no contemporary analogs in the day-to-day practical monitoring of the ocean and ice. It is true that the Seasat and shuttle SAR experiments (and later ERS-1, RADARSAT, and other SAR systems) were primarily designed to serve oceanography and generally met and even exceeded expectations. However, they turned out to be even more useful for land applications, where a several-day delay in signal processing was not as critical as in maritime navigation. As a result, the practical components of their space


radar programs were focused mainly on monitoring land parameters [25], [26], [27].

Still, in deep secrecy, there existed another USSR orbital RA-SLR system for oceanic observations, perhaps a hundred times more expensive than Cosmos-1500 and all its derivatives. It was initiated as early as 1960, first placed into low-Earth orbit in 1975, and finally closed in 1988 after at least 39 launches. This radar was called Chaika, and it and the spacecraft equipped with it were part of the top-secret naval reconnaissance and targeting system "Legenda" [24]. In the West, they were known as Radar Ocean Reconnaissance Satellites (RORSATs) [28], [29], [30]. Each RORSAT had two magnetrons (principal and backup); one or, after 1985, two RA-SLRs working at the frequency of 8.2 GHz; and one or two 10-m-long slotted-waveguide antennas to provide left-side and right-side swaths, each 450 km wide (Figure 7). These satellites were designed to find and track U.S. Navy air carriers, first of all in the North Atlantic and North Pacific, and to release the targeting data to the USSR assault triad: Navy bombers of the Tu-22M3 type, superheavy cruise missile submarines of the Oskar-I and Oskar-II (Kursk) types, and heavy guided missile cruisers of the Pyotr Velikiy and Moskva types. Each component of the triad had to launch many dozens of cruise missiles with conventional and nuclear warheads. For instance, according to a comprehensive description by a retired USSR Navy officer [31], to attack one U.S. air carrier from the air, as many as three full regiments of Tu-22M3 medium-range strategic bombers (i.e., 100 aircraft) were assigned. Some of the bombers and all dedicated submarines and cruisers were equipped with receivers of the "Legenda" system (Figure 8). Through the network of communication satellites known as Parus, RORSAT data were passed on to these assets and to a dedicated USSR Navy control center in Noginsk near Moscow [24]. The task of the system was not just to locate and identify naval ships but to provide targeting data that could, allegedly, be fed directly into antiship missiles, such as the 6-ton, liquid-propellant X-22 carried by the Tu-22M3.

The reason for such tremendous concern was that in the 1970s and early 1980s, the USSR submarine-launched ballistic missiles (SLBMs) had limited range and accuracy, so to fire them, the submarines had to come nearer to the U.S. East and West Coasts. Therefore, the U.S. Navy air carriers were viewed as an extremely dangerous force, able to block or destroy the USSR submarine fleet in its home bases at the Kola and Kamchatka Peninsulas. Still, traditionally, according to the USSR and Russian military doctrine and ethos, all bombers, submarines, and cruisers taking part in a raid on a U.S. air carrier were viewed as expendable. As for the bombers, probable losses were estimated at 50%; "Legenda" was not trusted by the pilots and the air staff, and a suicide raid of two dedicated Tu-16 reconnaissance aircraft was always envisaged to make visual contact with the air carriers [31]. Similarly, the Navy staff always sent a destroyer or even a trawler to follow every U.S. air carrier task force. Both submarines and cruisers shared the same nickname of "single-shot assets," as the reloading of their missiles was not available. According to [31], they expected no more than 30 min of life after firing their first and last salvos of 24 or 20 (or 16) "Granit" cruise missiles, respectively. Moreover, the submarine reflector antenna of the "Legenda" receiver was a huge retractable structure christened Punch Bowl by the North Atlantic Treaty Organization (NATO) fleets (Figure 9). To use it, the submarines had to stay at periscope depth for long hours preceding their attack, which should have added to the kamikaze spirit of their crews.

[Figure 7(a) diagram, based on sketches in [24]: conceptual configuration of the US-A spacecraft, showing the ejection of the reactor core and the resulting NaK droplets, the deployment of the side-looking radar antennas (f = 8.2 GHz, one on each side of the spacecraft), the reactor with its container and shield, the radiator, the attitude thruster pod, the 19.542- and 166-MHz transmitters, and the orbit insertion and reactor-raising propulsion systems; image from russianspaceweb.com.]

FIGURE 7. (a) RORSAT concept and configuration. (Source: [28].) (b) A demonstration copy at the DB "Arsenal" in St. Petersburg showing the nuclear reactor at the forward end and two unfolded SLR antennas in the rear. (Source: [24].)


FIGURE 8. "Legenda" satellite receiver antennas (a) on the Pyotr Velikiy nuclear-powered USSR cruiser (the white radome on a vertical structure at the side of the tower) and (b) on the sunken Moskva cruiser (a similar light-grey radome just above the rear launch tube). To eliminate shadowing of the antennas, the same equipment was placed on the portside.

The whole USSR orbital naval reconnaissance and targeting system "Legenda" carried the same stigma of hopeless gigantomania and kamikaze spirit. The USSR electronics of that time were quite backward and unreliable; additionally, the signals backscattered from ships are accompanied by intense clutter due to sea waves and precipitation. To compensate for insufficient sensitivity and poor signal processing, the USSR developers chose a truly monstrous approach. First, the RORSATs used small fast-neutron nuclear reactors ("Buk") to provide the 3 kW of power needed to feed the radar; second, they always flew in low orbits of 250-270 km with a 65° inclination, which made their lifetimes short, less than two months on average. Moreover, to enable determining the direction and speed of a sea target with such an incoherent sensor as an SLR, a primitive but efficient solution was found: nuclear RORSATs were launched in pairs and placed into identical orbits with a half-hour separation [28]. The combination of a low orbit and a nuclear power source introduced a serious risk of accident or uncontrolled reentry [29]. "To counter the problem, each RORSAT consisted of three major components: the payload and propulsion section, the reactor, and a disposal stage used to lift the reactor into a higher orbit, with an altitude of 900 km, at the end of the mission." Each of the at least 33 reactors launched in 1975-1988 contained more than 30 kg of weapon-grade (enriched to 90%) uranium-235, besides the sodium-potassium coolant. This means that presently about 940 kg of highly enriched uranium and a further 15 tons of material, mostly in the form of tens of thousands of radioactive coolant droplets 0.6-2 cm in diameter, orbit Earth [29]. Several RORSAT malfunctions drew public attention to the danger these satellites presented. On 24 January 1978, five years before Kalmykov's success, Cosmos-954 failed to throw its nuclear reactor into a high "graveyard" orbit and instead crashed over Canada's Great Slave Lake, contaminating a wide area. The debris was examined by U.S. Lawrence Livermore National Laboratory scientists, which gave them a better understanding of the design and mission of RORSATs. This catastrophe led to a two-year break in RORSAT launches, used for improvements in their design. Still, a similar accident happened at the beginning of 1983 with Cosmos-1402, when separate parts of the reactor fell into the Indian and Atlantic Oceans.

FIGURE 9. "Legenda" receiver antenna in a radome on the top of the conning tower of the USSR Oskar-type submarines (one of them was the ill-fated Kursk, which exploded in 2000), (a) in the harbor and (b) at sea. (Sources: [32], [33].)

According to a U.S. Central Intelligence Agency assessment, RORSATs were able to track U.S. aircraft carriers in good sea and weather conditions; however, they became useless otherwise. These spacecraft were so tremendously expensive and slow to produce that their launches were usually tied to the massive naval drills of the U.S. and NATO fleets, thus leaving lengthy gaps in carrier tracking. By the end of the 1980s, the USSR SLBMs had improved in range and accuracy to the extent that the submarines could stay on patrol just in the Sea of Okhotsk, which earned the name of "a submarine aquarium." Therefore, the RORSAT program was terminated in 1988, although the other part of the "Legenda" system, the higher-orbit electronic intelligence satellites (EORSATs), survived, and their derivatives are still in operational use by the Russian Navy [9]. For our story, it is interesting to note that the frequency of operation, the type of microwave source and antenna, and the swath width of Kalmykov's RA-SLR and the "Chaika" SLR were rather similar. Kalmykov had security clearance and should have known about the existence of the naval RORSATs designed by the rocket and missile DB "Prikladnaya Mekhanika" (Applied Mechanics) in Moscow (now Khrunichev State Co.). He could even have known about their design principles because EORSAT satellites were designed at DBP and produced serially at the Yuzhmash Industry in Dnipro.

FIGURE 10. The spacecraft Cosmos-1500 with microwave equipment for remote sensing of Earth at the permanent USSR Exhibition of Achievements of National Economy in 1985. (Source: [4].)

Moreover, he should have known from his IMH colleagues that the development of the signal processing for the SLR "Chaika" was facilitated by Black Sea experiments with its airborne analog on a dedicated turboprop aircraft [24]. However, given the USSR spy mania and the intense, even brutal, rivalry between the rocket DBs, Kalmykov could not have known anything beyond general terms. Note the difference in satellite configuration: vertical for Cosmos-1500 versus longitudinal for RORSATs (see Figures 2 and 7). This was apparently connected to the flight altitude; low-orbit RORSATs had to have a more "aerodynamic" shape to reduce the effect of the atmosphere, while Cosmos-1500 could neglect it. It is also important that his SLR was developed 10 years later than "Chaika" and integrated Kalmykov's multiyear collaboration with IRE's theoreticians on sea and ice backscattering research. In terms of onboard signal processing, Cosmos-1500 implemented two operations that were relatively new, at least for the USSR space programs: incoherent integration of eight successive images along the swath and compression of the image intensity dynamic range across the swath using automatic gain control. Additionally, telemetric data on the current parameters of the SLR units were transmitted to the ground stations. Therefore, the SLR of Cosmos-1500 can safely be considered an original instrument.

POST-HISTORY: FROM USSR AWARDS TO A POSSIBLE IEEE MILESTONE
Two polar sea rescue missions of Cosmos-1500 gave the USSR authorities a rare chance to present the USSR space program as a peaceful, truly useful, and efficient activity, while in reality it was heavily biased toward military use, poorly balanced, plagued by the fierce rivalries of different players, and marred by numerous accidents and catastrophes [9]. These missions were widely covered in the USSR newspapers and on TV. Already in 1985, a full-size copy of Cosmos-1500 was displayed in Moscow at a permanent exhibition (Figure 10). This publicity and attention helped obtain fair recognition at the national level. In 1987, Kalmykov and nine of his colleagues were awarded the National Prize of Ukraine in Science and Technology with the citation, "For the development and implementation of radar methods of Earth remote sensing from aerospace platforms." Recognition at the USSR level was restricted to several state orders, the highest of which, the Order of Lenin, was given to the same IRE director who had nearly pushed Kalmykov out in 1979. For comparison, the developers of the secret naval SLR "Chaika" and the whole "Legenda" system were awarded, despite its low efficiency and the RORSAT disasters, a secret Lenin Prize of the USSR. As mentioned previously, the design and development of the X-band orbital RA-SLR of Cosmos-1500 led to the successful overcoming of a wide range of scientific and technical problems. This enabled the technology transfer to the R&D Institute of Radio Measurements in Kharkiv (now the RADMIR Institute), which had a small-series production line.


In all, essentially the same SLR was operated on six remote sensing satellites of the USSR/Russia Space Operative System Okean in 1986-2004 and on two Ukrainian satellites named Sich in 1997 and 2004. It was successfully used to detect and monitor many critical situations and natural phenomena on a global scale [2], [3], [4], [5], [6], [19], [20], [21], [22], [23]. When Ukraine gained its independence in 1991, its space industry, centered around DBP and Pivdenmash, hoped to keep working, although Ukraine had no launch sites and only a few tracking and control facilities. Indeed, Russia was dependent on DBP for the maintenance of its major nuclear ICBM force of several hundred silo-based SS-18 missiles and for the supply of Tsiklon boosters and EORSAT satellites. As part of the bargain, two Ukrainian radar remote sensing satellites named Sich were launched from Russian launch sites. The first, Sich-1, was fully operational for three years, from 1997 to 2000. However, the second, Sich-1M, was placed into the wrong orbit and failed to deliver the expected data. Given that Russia's president in 2004 was the same as today, and in view of the Russian invasion of Ukraine, one can guess that the "wrong orbit" of Sich-1M was one more secret-service operation aimed at denying Ukraine sensitive information and spoiling its reputation as a reliable spacecraft developer. After 1992, Kalmykov had to restrict his work to the airborne analog of his SLR. Such a system, called MARS, was developed and used by various Ukrainian ministries and services [34], [35], [36], [37], [38], [39]. However, the unexpected death of Kalmykov in 1996 from kidney disease left a void that was hard to fill. Broadly speaking, the development and operation of the RA-SLR of Cosmos-1500 initiated, 40 years ago, a new research area and discipline in the Ukrainian microwave community: remote sensing of Earth from aerospace platforms. Therefore, at the national level, its impact is truly huge. In the context of USSR science and technology, it played the role of a prototype for the family of sea ice-monitoring satellites that provided safe navigation in the Arctic from 1986 to 2004. Besides, it was used by Russian researchers to study the formation and dynamics of the ice covers of the Sea of Okhotsk [40] (it is quite possible that this research was initiated by the USSR military, as that distant sea, bounded by the sparsely inhabited Russian Far East territories and the Kuril Islands, became a sanctuary for the USSR fleet of nuclear-powered Typhoon-type SLBM submarines in the late 1980s). At the global level, it was one of the cornerstones of what was christened oceanography from space [25], and it initiated the systematic use of radar images for safe polar navigation. We believe that all of the information presented previously suggests that the SLR of Cosmos-1500 satisfies the requirements of the IEEE History Committee to be nominated as an IEEE Milestone in Ukraine.


ACKNOWLEDGMENT
We thank the Physics Benevolent Fund of the Institute of Physics, U.K., for one-off emergency support in the context of solidarity with Ukraine. Alexander I. Nosich acknowledges the support of the European Federation of Academies of Sciences and Humanities via a research grant from the European Fund for Displaced Scientists and the hospitality of the Institute of Electronics and Numerical Technologies of the University of Rennes, France.

AUTHOR INFORMATION
Ganna B. Veselovska-Maiboroda ([email protected])

is with the Department of Physical Foundations of Radar, O.Y. Usikov Institute of Radiophysics and Electronics, National Academy of Sciences of Ukraine, 61085 Kharkiv, Ukraine. Sergey A. Velichko ([email protected]) is with the Department of Earth Remote Sensing, O.Y. Usikov Institute of Radiophysics and Electronics, National Academy of Sciences of Ukraine, 61085 Kharkiv, Ukraine. Alexander I. Nosich ([email protected]) is with the Laboratory of Micro and Nano Optics, O.Y. Usikov Institute of Radiophysics and Electronics, National Academy of Sciences of Ukraine, 61085 Kharkiv, Ukraine. He is a Fellow of IEEE.

REFERENCES
[1] "NASA space science coordinated archive, Cosmos 1500," NASA, Washington, DC, USA. Accessed: Jun. 30, 2023. [Online]. Available: https://nssdc.gsfc.nasa.gov/nmc/spacecraft/display.action?id=1983-099A
[2] A. I. Kalmykov, A. S. Kurekin, and V. N. Tsymbal, "Radiophysical research of the Earth's natural environment from aerospace platforms," Telecommun. Radio Eng., vol. 52, no. 3, pp. 41-52, 1998, doi: 10.1615/TelecomRadEng.v52.i3.100.
[3] G. K. Korotayev et al., "Thirty years of domestic space oceanology," Space Sci. Technol., vol. 13, no. 4, pp. 28-43, 2007.
[4] V. K. Ivanov and S. Y. Yatsevich, "Development of the Earth remote sensing methods at IRE NAS of Ukraine," Telecommun. Radio Eng., vol. 68, no. 16, pp. 1439-1459, 2009, doi: 10.1615/TelecomRadEng.v68.i16.40.
[5] V. V. Pustovoytenko et al., "Space pilot of the nuclear-powered vessels," in Proc. Int. Crimean Conf. Microw. Telecommun. Technol. (CriMiCo), 2013, pp. 19-22.
[6] A. G. Boyev et al., Aerospace Radar Monitoring of Natural Disasters and Critical Situations. Kharkiv, Ukraine: Rozhko Publication, 2017.
[7] G. Veselovska-Maiboroda, S. A. Velichko, and A. I. Nosich, "Orbital X-band side-looking radar of Cosmos-1500: Potential IEEE Milestone candidate," in Proc. Int. Conf. Ukrainian Microw. Week (UKRMW), Kharkiv, Ukraine, 2022, pp. 670-673.
[8] P. Kneen, Soviet Scientists and the State. Albany, NY, USA: State Univ. New York Press, 1984.
[9] B. Harvey, The Rebirth of the Russian Space Program. Chichester, U.K.: Springer-Verlag, 2007.


[10] A. I. Kalmykov, I. E. Ostrovskii, A. D. Rozenberg, and I. M. Fuchs, "Influence of the state of the sea surface upon the spatial characteristics of scattered radio signals," Sov. Radiophysics, vol. 8, no. 6, pp. 804-810, Nov. 1965, doi: 10.1007/BF01038278.
[11] A. D. Rosenberg, I. E. Ostrovskii, and A. I. Kalmykov, "Frequency shift of radiation scattered from a rough sea surface," Sov. Radiophysics, vol. 9, no. 2, pp. 161-164, Mar. 1966, doi: 10.1007/BF01038952.
[12] F. G. Bass, I. M. Fuks, A. I. Kalmykov, I. E. Ostrovsky, and A. D. Rosenberg, "Very high frequency radiowave scattering by a disturbed sea surface part I: Scattering from a slightly disturbed boundary," IEEE Trans. Antennas Propag., vol. 16, no. 5, pp. 554-559, Sep. 1968, doi: 10.1109/TAP.1968.1139243.
[13] F. G. Bass et al., "Radar methods for the study of ocean waves," Sov. Phys. Uspekhi, vol. 18, no. 8, pp. 641-642, 1975, doi: 10.1070/PU1975v018n08ABEH004920.
[14] F. G. Bass et al., "Radiophysical investigations of sea roughness (radiooceanography) at the Ukrainian Academy of Sciences," IEEE Trans. Antennas Propag., vol. 25, no. 1, pp. 43-52, Jan. 1977, doi: 10.1109/JOE.1977.1145324.
[15] Y. M. Galaev et al., "Radar detection of oil slicks on a sea surface," Izvestia SSSR Fizika Atmos. Okeana, vol. 13, no. 4, pp. 406-414, 1977.
[16] V. D. Yeryomka, private communication, 2018.
[17] A. I. Kalmykov and A. P. Pichugin, "Special features of the detection of sea surface inhomogeneities by the radar methods," Izvestia SSSR Fizika Atmos. Okeana, vol. 17, no. 7, pp. 754-761, 1981.
[18] A. I. Kalmykov, A. P. Pichugin, Y. A. Sinitsyn, and V. P. Shestopalov, "Some features of radar monitoring of the oceanic surface from aerospace platforms," Int. J. Remote Sens., vol. 3, no. 3, pp. 311-325, 1982, doi: 10.1080/01431168208948402.
[19] V. B. Efimov et al., "Study of ice covers by radiophysical means from aerospace platforms," Izvestiya SSSR Fizika Atmos. Okeana, vol. 21, no. 5, pp. 512-520, 1985.
[20] S. A. Velichko, A. I. Kalmykov, Y. A. Sinitsyn, and V. N. Tsymbal, "Influence of wind waves on radar reflection by the sea surface," Radiophysics Quantum Electron., vol. 30, no. 7, pp. 620-631, Jul. 1987, doi: 10.1007/BF01036296.
[21] A. I. Kalmykov et al., "Information content of radar remote sensing systems of Earth from space," Radiophysics Quantum Electron., vol. 32, no. 9, pp. 779-785, 1989, doi: 10.1007/BF01038802.
[22] A. I. Kalmykov et al., "Kosmos-1500 satellite side-looking radar," Sov. J. Remote Sens., vol. 5, no. 3, pp. 471-485, 1989.
[23] A. I. Kalmykov, S. A. Velichko, V. N. Tsymbal, Y. A. Kuleshov, J. A. Weinman, and I. Jurkevich, "Observations of the marine environment from spaceborne side-looking real aperture radars," Remote Sens. Environ., vol. 45, no. 2, pp. 193-208, Aug. 1993, doi: 10.1016/0034-4257(93)90042-V.
[24] V. S. Verba, Ed., Spaceborne Earth Surveillance Radar Systems. Moscow, Russia: Radiotekhnika Publications, 2010.
[25] W. S. Wilson et al., "A history of oceanography from space," in Remote Sensing of the Environment, Manual of Remote Sensing, vol. 6. Baton Rouge, LA, USA: American Society for Photogrammetry and Remote Sensing, 2005, pp. 1-31.


[26] C. Elachi et al., "Spaceborne synthetic-aperture imaging radars: Applications, techniques, and technology," Proc. IEEE, vol. 70, no. 10, pp. 1174-1209, Oct. 1982, doi: 10.1109/PROC.1982.12448.
[27] J. Cimino, C. Elachi, and M. Settle, "SIR-B: The second Shuttle imaging radar experiment," IEEE Trans. Geosci. Remote Sens., vol. GE-24, no. 4, pp. 445-462, Jul. 1986, doi: 10.1109/TGRS.1986.289658.
[28] A. Siddiqi, "Staring at the sea: The Soviet RORSAT and EORSAT programmes," J. Brit. Interplanetary Soc., vol. 52, nos. 11-12, pp. 397-416, 1999.
[29] S. Grahn, "The US-A program (Radar Ocean Reconnaissance satellites - RORSAT) and radio observations thereof." Accessed: Jun. 30, 2023. [Online]. Available: http://www.svengrahn.pp.se/trackind/RORSAT/RORSAT.html
[30] R. Kopets and S. Skrobinska, "Russia's space program is deadlocked: A space naval reconnaissance incident," Int. Relations, Public Commun. Regional Stud., vol. 2, no. 6, pp. 28-38, 2019.
[31] M. Y. Tokarev, "Kamikazes: The Soviet legacy," Nav. War College Rev., vol. 67, no. 1, 2014, Art. no. 7.
[32] "Soviet submarine fleet 1945-1990, part 3-11" (in Russian), Moremhod. Accessed: Jun. 30, 2023. [Online]. Available: http://moremhod.info/index.php/library-menu/16-morskaya-tematika/188-pf7?start=10
[33] V. V. Bychkov and V. G. Cherkashin, "Strategic concept of development of naval reconnaissance and targeting system," Nat. Security Strategic Planning, vol. 2021, no. 2, pp. 30-37, 2021, doi: 10.37468/2307-1400-2021-2-30-37.
[34] A. Kalmykov et al., "Radar observations of strong subsurface scatterers. A model of backscattering," in Proc. Int. Geosci. Remote Sens. Symp. (IGARSS), 1995, vol. 3, pp. 1702-1704, doi: 10.1109/IGARSS.1995.524001.
[35] A. I. Kalmykov et al., "The two-frequency multi-polarisation L/VHF airborne SAR for subsurface sensing," AEU Int. Electron. Commun., vol. 50, no. 2, pp. 145-149, 1996.
[36] S. A. Velichko, A. I. Kalmykov, and V. N. Tsymbal, "Possibilities of hurricanes investigations by real aperture radars Cosmos-1500/Okean type," Turkish J. Phys., vol. 20, no. 4, pp. 305-307, 1996, doi: 10.55730/1300-0101.2567.
[37] E. N. Belov et al., "Application of ground-based and air/spaceborne radars for oil spill detection in sea areas," Telecommun. Radio Eng., vol. 51, no. 1, pp. 1-8, 1997, doi: 10.1615/TelecomRadEng.v51.i1.10.
[38] A. I. Kalmykov et al., "Multipurpose airborne radar system 'MARS' for remote sensing of the Earth," Telecommun. Radio Eng., vol. 53, nos. 9-10, pp. 120-130, 1999, doi: 10.1615/TelecomRadEng.v53.i9-10.150.
[39] M. V. Belobrova et al., "Experimental studies of the spatial irregularities of radio-wave scattering in the Gulf Stream zone," Radiophysics Quantum Electron., vol. 44, no. 12, pp. 949-955, Dec. 2001, doi: 10.1023/A:1014873927923.
[40] L. M. Mitnik and A. I. Kalmykov, "Structure and dynamics of the Sea of Okhotsk marginal ice zone from 'Ocean' satellite radar sensing data," J. Geophys. Res., vol. 97, no. C5, pp. 7429-7445, 1992, doi: 10.1029/91JC01596. GRS



Airborne Lidar Data Artifacts

What we know thus far

WAI YEUNG YAN 

Data artifacts are a common occurrence in airborne lidar point clouds and their derivatives [e.g., intensity images and digital elevation models (DEMs)]. Defects, such as voids, holes, gaps, speckles, noise, and stripes, not only degrade the visual quality of lidar data but also compromise subsequent data-driven analyses. Despite significant progress in understanding these defects, end users of lidar data confronted with artifacts are stymied by the scarcity of both resources disseminating topical advances and analytic software tools. The situation is exacerbated by the wide-ranging array of potential internal and external factors that underlie most data artifact issues, with examples including weather/atmospheric/Earth surface conditions, system settings, and laser receiver-transmitter axial alignment. In this article, we provide a unified overview of artifacts commonly found in airborne lidar point clouds and their derivatives and survey the existing literature for solutions to resolve these issues. The presentation is from an end-user perspective to facilitate rapid diagnoses of issues and efficient referrals to more specialized resources during the data collection and processing stages. We hope that the article can also serve to promote coalescence of the scientific community, software developers, and system manufacturers for the ongoing development of a comprehensive airborne lidar point cloud processing bundle. Achieving this goal would further empower end users and move the field forward.

Digital Object Identifier 10.1109/MGRS.2023.3285261
Date of current version: 5 July 2023



INTRODUCTION
A number of recent surveys published between 2020 and 2022 project that the lidar market will likely reach US$2.9-4.71 billion between 2026 and 2030 [1], [2], [3]. Among all the multimodal platforms, airborne lidar is projected to capture the majority of the market share because of its wide area coverage and enhanced data collection capability. Different advanced airborne lidar systems have been rolled out lately, including dual/triple laser channel systems operated with green and infrared laser wavelengths (e.g., Optech Titan and RIEGL VQ-1560i-DW), dual-wavelength airborne lidar bathymetry (ALB) (e.g., Optech CZMIL and Leica Chiroptera-5), dual-channel topographic lidar (e.g., RIEGL LMS-Q1560 and Optech Galaxy T-2000+G2), and single-photon lidar (e.g., Leica SPL100). Despite these high-end commercial products, the data analytical capability of lidar is unable to keep pace with hardware development. These market surveys also emphasize that insufficient data processing software tools set a barrier to further expansion of the lidar market. As a result, raw lidar data likely grow at a rate that exceeds our processing capacity with current software tools. Traditional satellite remote sensing and aerial photogrammetric imagery data ride on the advancement of computer vision, machine learning, and image processing algorithms. Therefore, commercial software bundles handling these geospatial imagery data are widespread in the market, with ever-advancing capability and applicability. In contrast, airborne lidar data encompass a wealth of topographic information represented by a set of georeferenced 3D point clouds together with backscattered laser signal strengths.


Although such an active remote sensing approach overcomes the limitations of traditional imaging techniques [4], the unstructured and bulky 3D point clouds are challenging for end users to handle. The deficiency of airborne lidar data processing software on the market can also be ascribed to end users' primitive understanding of the physical mechanism behind how a lidar system works. Most end users simply conduct visual examination of the collected 3D point clouds through generating intensity images and DEMs/digital surface models (DSMs). As a result, users may be unaware or dismissive of the potential causes of data artifacts. These underlying issues, if unmanaged, could affect the overall reliability of any subsequent analysis based on the compromised data. System manufacturers and service providers make every effort to collect and produce the highest quality of airborne lidar data. While the lidar community puts strong emphasis on the quality assurance (QA) and quality check (QC) of data [5], [6], most of the QA and QC efforts are in modeling system biases, which, in turn, rectifies only geometric accuracy. Collected lidar data have unavoidable artifacts in the form of speckle and/or random noise, anomalies, defects, systematic biases, spikes, pits, stairstepping, periodic striping, corduroy, and so on, especially if users examine the data at a sufficiently fine scale. Improvements in the pulse repetition frequency (PRF) and a finer laser pulsewidth [7] further drive a significant increase in point cloud density [8] and ultimately magnify these artifacts. These defects not only result in unpleasant visual effects but also interfere with subsequent analyses. Commercial remote sensing and photogrammetric software tools are usually bundled with data preprocessing modules for radiometric/geometric correction, noise removal, image filtering, and spatial enhancement. However, software tools for the airborne lidar community are comparatively rudimentary, with the exception of a few prototypes created by researchers, e.g., LAStools [9], OPALS [10], and lidR [11]. Most commercial software tools for lidar lack preprocessing functionalities for handling data artifacts. The inability to resolve these artifacts at an early stage may render the extracted information unreliable.

This review summarizes the artifacts commonly found in airborne lidar data and their derivatives, the internal and external factors underlying these defects, and the current status of research on handling and resolving these issues. It should be noted, however, that this review centers on artifacts that have significant visual and analytical impacts rather than on defects or noise reported in backscattered lidar waveform signals [12]. In addition, although the term "airborne lidar system" in this article mainly refers to topographic airborne lidar, it also covers ALB, unmanned aircraft system (UAS), and airborne photon-counting lidar. Specifically, the focus of this concise review is on three types of data artifacts (see Table 1 and Figure 1):
1) Data voids and holes appear as blank regions that frequently occur in most topographic airborne lidar point clouds. Data voids appear because of occlusions (i.e., shadows caused by elevated objects nearby), laser dropouts found on water surfaces, ground objects with an extremely low reflectance (i.e., dark objects), swath gaps between data strips, or ground filtering. The presence of data voids may cause unpleasant triangular meshes in the resulting DEMs or intensity images, which, in turn, affects the visual quality, topographic analysis, semantic labeling, and 3D model reconstruction.
2) Stripe artifacts are commonly found in the resulting intensity image. If stripes appear on an individual lidar data strip, this is mainly attributed to misalignment between the axes of the laser transmitter and receiver; this scenario is characterized as intensity banding. Due to signal attenuation at swath edges, i.e., long range and a large scan angle, intensity striping may appear when combining two overlapping lidar data strips. Similarly, striping may also appear in the overlapping region of the resulting DEM if the respective lidar data strips are not properly georeferenced and/or calibrated.
3) Random/speckle noise appears in both intensity images and DEMs. Noisy random data points can be caused by backscattered returns from unwanted floating objects (e.g., aerosols, rain, clouds, or birds), timing jitters, or range walk errors (RWEs). Moreover, photon-counting lidar is sensitive to solar background noise, resulting in noisy returns. Forest scientists often suffer from data pits appearing in lidar-derived canopy height models (CHMs). All these conditions ultimately lead to the appearance of spikes and blunders in DEMs. Bragg scattering of lasers on water surfaces also causes speckle noise in the resulting intensity images. The noise-induced intensity inhomogeneities pose serious challenges to using intensity data for mapping water surfaces.

TABLE 1. AIRBORNE LIDAR DATA ARTIFACTS AND THEIR EFFECTS ON THE DERIVATIVES.

Point clouds
• Data voids and holes: occlusions/shadows; laser dropouts; swath gaps; objects with low reflectance; dark objects/weak targets; ground filtering
• Stripes: imperfect instrument calibration and alignment; banding (i.e., axial displacement between emitter and receiver); signal attenuation at large incident angles and long ranges
• Speckle/random noise: floating objects (e.g., clouds and birds); Bragg scattering and specular reflection (e.g., water surfaces); solar background noise (e.g., photon counting); range walk error/timing jitter

Intensity image
• Data voids and holes: huge triangular facets; missing intensity metrics; inaccurate understory information
• Stripes: periodic striping in individual strips; striping found at swath edges in the overlap region
• Speckle/random noise: heterogeneous intensity on homogeneous surfaces; intensity pits

DEM/DSM
• Data voids and holes: triangular facets; discontinuous slope; inaccurate estimation of normals
• Stripes: periodic striping in the overlapping region
• Speckle/random noise: craters/divots; blunders/spikes; pits/spikes in canopy height models


VOIDS, HOLES, AND GAPS
Data voids, holes, and gaps are among the artifacts most commonly found in airborne lidar point clouds, and they adversely impact data derivatives in general. These defects, apart from those present in the original data, may also be induced by the ground filtering of aboveground features for extracting the bare Earth. The presence of data holes results in undesired triangular facets after conducting a surface interpolation, whether for a DEM or an intensity image. These unfavorable effects are particularly prominent in the vicinity of data discontinuities. For instance, the surface normal estimated at the edge of a hole may tilt in a wrong direction. Therefore, such data loss or missing information is considered a major obstacle to subsequent analyses and modeling. Various airborne lidar specifications, including the United States Geological Survey (USGS) lidar base specification, also stress that "a data void is considered to be any area greater than or equal to (4 × aggregate nominal pulse spacing (ANPS))², which is measured using first returns only" [13], [14]. In general, there are four possible causes of these data voids, holes, and gaps: 1) occlusions/shadows, 2) laser dropouts on water surfaces, 3) swath gaps, and 4) weak targets and objects with a low reflectance. Each of these is discussed in the following sections, covering the causes, impacts, and possible solutions currently available.
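To make the quoted criterion concrete, the following is a minimal sketch, assuming square ANPS-sized grid cells and treating any fully empty 4 × 4 block of cells as a void of area (4 × ANPS)²; the gridding strategy and the function name has_void are our own illustrative choices, not part of the USGS specification.

import numpy as np

def has_void(first_returns_xy: np.ndarray, anps: float) -> bool:
    """Flag whether a tile contains a void of at least (4 x ANPS)^2.

    first_returns_xy : (n, 2) array of x, y coordinates of first returns
    anps             : aggregate nominal pulse spacing (m)
    """
    mins = first_returns_xy.min(axis=0)
    # bin the first returns into a grid of ANPS-sized cells
    idx = np.floor((first_returns_xy - mins) / anps).astype(int)
    nx, ny = idx.max(axis=0) + 1
    occupied = np.zeros((nx, ny), dtype=bool)
    occupied[idx[:, 0], idx[:, 1]] = True
    # a fully empty 4 x 4 block of cells covers an area of (4 x ANPS)^2
    for i in range(nx - 3):
        for j in range(ny - 3):
            if not occupied[i:i + 4, j:j + 4].any():
                return True
    return False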

OCCLUSIONS/SHADOWS
An occlusion is a hindrance caused by a shadow cast by a nearby elevated object or terrain undulation. The line of sight from the lidar system to the occluded region is thus blocked, with no illumination by the laser footprints, leading to data sparsity and voids [see Figure 2(a)]. Common occlusions can be visualized as a shading effect caused by nearby elevated objects. Another example of occlusion is shown in Figure 3, where a number of occluded regions on the terrain can be found with shadows cast from atmospheric smoke and clouds. On the other hand, occluded data voids can also be found with sizes as small as the height of a typical curbstone [15]. The shadow region $s$ blocked by an elevated object with a known height $h_b$ can be estimated with respect to the lidar system's flying altitude $H$ and scanning angle [16], using the following equation:

$s = \dfrac{h_b \, d_b}{H - h_b}$ (1)

where $d_b$ refers to the horizontal distance between the nadir point and the object (see Figure 2). If prior knowledge of the study environment is available, better mission plans can be devised in advance to reduce shadows by having multiple overlapping strips with parallel and crossing flight paths.
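As a quick worked illustration of (1), the sketch below assumes flat terrain and the geometry of Figure 2(a); the function name occluded_length and the sample numbers are our own.

def occluded_length(H: float, h_b: float, d_b: float) -> float:
    """Ground shadow length s behind an elevated object, per (1).

    H   : flying altitude of the lidar system (m)
    h_b : height of the occluding object (m)
    d_b : horizontal distance from the nadir point to the object (m)
    """
    if not 0.0 < h_b < H:
        raise ValueError("expects 0 < h_b < H")
    return h_b * d_b / (H - h_b)

# Example: a 20-m building located 300 m off-nadir, surveyed from
# 1,000 m, occludes roughly a 6.1-m strip of ground behind it.
print(round(occluded_length(H=1000.0, h_b=20.0, d_b=300.0), 1))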

FIGURE 1. Airborne lidar data artifacts' potential causes (cloud/rain/smoke, ground conditions, system settings, and moving objects) and adverse effects (voids/holes, stripes/banding, speckle noise, and random noise).

FIGURE 2. (a) The occluded region, $s$, is blocked by an elevated object. Here, $H$ is the flying altitude, $d_b$ is the horizontal distance between the nadir point and the object, and $h_b$ is the height of the object. (b) A laser beam penetrates the tree crown, resulting in multiple returns. The final return may cause the appearance of a data void due to occlusion caused by the overstory and/or a data pit in a CHM.

FIGURE 3. (a) Smoke and clouds appear during airborne lidar data collection, resulting in occlusions on the terrain. (b) Unnatural artifacts (dark patches) also appear in the resulting intensity image.

FIGURE 4. (a) Occlusions found in the lower structure of (b) a multilayer highway.


In some special cases, the appearance of occluded data voids causes discontinuities in the outlines of objects and thus hinders the automation of detection and 3D reconstruction processes. Various studies address such a challenge under different scenarios, especially in urban infrastructure, for instance, mapping curbstones and roadsides where there are occlusions caused by on-road parked vehicles [15], automatic rooftop extraction with very close canopies situated nearby [17], [18], road segments represented by segregated lidar data points due to the blockage of objects found right next to roads [19], and reconstruction of a lower-level structure occluded by the upper bridge in a multilayer-interchange highway [20] (see Figure 4 for an example). Most of these studies focus on object detection and 3D reconstruction, and thus, they respectively propose corresponding workflows and algorithms to restore the missing parts of the structure, based on prior knowledge of the objects' geometric characteristics. In all these scenarios, the occluded regions produce notable visual effects. Aside from urban infrastructure, occlusion caused by tree canopies, though inducing less visual salience, may impose a burden for retrieving understory vegetation information. If a laser footprint has a projected area larger than the foliage, part of the laser energy is backscattered at the first intercepted surface, and the rest propagates down the canopy. The footprint may subsequently reach the twigs, branches, and trunk and finally land on the ground, resulting in a number of backscattered laser echoes [see Figure 2(b)].


Therefore, the overstory reduces the possibility of laser penetration to the understory, and such an issue appears to be particularly serious for closed canopies. As a result of the occlusion, the ground vegetation receives considerably fewer laser pulses compared to the upper canopy layer, inducing data voids and holes. This causes a number of adverse impacts. First, the generation of the bare ground in dense forest regions becomes challenging. Even if a lidar end user applies advanced ground filtering and interpolation algorithms, the resulting digital terrain model may be inaccurate and suffer from a loss of rugged details [21], [22]. The situation gets even worse on mountainous terrain with steep slopes [22], [23]. In addition, forest scientists may encounter difficulties in computing lidar metrics and predicting the substantial numbers of understory vegetation elements when there is a sparsity of data points [24], [25]. Apart from these notable effects, the radiometric measurement of the lower vegetation layer may suffer from the transmission loss of laser energy due to canopy occlusion. The intensities of understory vegetation are underrepresented since the laser signals have been intercepted by earlier backscattering from the overstory. Indeed, it is mostly impossible to correct or normalize the intensity values of the second to the last returns, especially for understory vegetation, based on a discrete-return or multiecho lidar system [26]. Unless the recorded waveform echoes are available for radiometric correction [27], an accurate estimation of the spectral reflectance of understory vegetation can hardly be achieved. Unlike the scenarios found in urban areas that may be resolved by multiple overlapping data strips, the occlusions caused by forest canopies are impossible to avoid. Increasing the PRF and restricting the swath to a narrow off-nadir angle may allow better laser penetration deeper into canopies [26], [28] but only to a limited extent. Possible solutions are to either conduct the airborne lidar measurement during the leaf-off season [24], [25], [29], [30] or compensate the understory coverage with multiple ground measurements using mobile or terrestrial lidar systems [25], [31], [32].

FIGURE 5. NIR lasers usually experience laser dropouts at swath edges, with a high variation of intensity values backscattered on the water surface along each scan line [36].

LASER DROPOUTS ON WATER SURFACES
Topographic airborne lidar may record null returns when scanning over water bodies, resulting in holes appearing in the point clouds [33], [34]. These are commonly known as laser dropouts [35], which are found to be evident in data collected by linear array scanners, i.e., lidar systems operated with either rotating prisms or oscillating mirrors (see Figure 5). Although the USGS mentions such unavoidable data voids in water regions in the lidar base specification [13], these missing regions, if left aside, may adversely impact visualization and subsequent analyses. Depending on the laser wavelength and instantaneous water conditions, a laser pulse usually undergoes (Fresnel) reflection, refraction, and (Bragg) scattering when it interacts with a water body. A near-infrared (NIR) laser, operated at 1,064 nm, may suffer from pulse absorption if the water is clean and the surface is calm. On the other hand, a NIR laser undergoes Bragg scattering with instantaneous water waves or turbulence [37] or water with a high degree of suspended sediments. A green laser (e.g., operated at 532 nm) backscatters from the water surface, refracts and penetrates along the water column, and finally reaches the seabed (depending on the water depth). This is the reason why ALB systems, such as the Optech SHOALS or CZMIL, are usually equipped with dual lasers, with a NIR channel capturing the water surface and a green channel mapping the water bottom. Laser dropouts usually occur at large incident angles [35], [36] and can be explained by a number of reasons: 1) the recorded laser energy is usually low at a large incident angle, as attributed to the laser backscattering geometry [38]; 2) laser pulses experience Fresnel reflectance, which can be as low as 2% [39]; and 3) atmospheric turbulence and high near-shore wind speeds further attenuate the laser energy, leading to the failure of the laser signal to reach the threshold needed to be processed [39].


With the combination of these effects, null returns may be found at swath edges and occasionally appear in the close-to-nadir region (see Figure 5 for an illustration). Apart from these, the backscattered laser pulses may have a high variance of signal strength in these water regions, resulting in some form of speckle noise in the intensity images (discussed in the "Speckle Intensity Noise on Water Surfaces" section). The presence of data voids in water regions causes unnatural triangular facets when generating DEMs. Indeed, the USGS has emphasized replacing these defects in the resulting DEM with a flat virtual water surface [13]; this process is called hydroflattening. Existing solutions mainly rely on manual digitization or the incorporation of existing break lines along coastal regions and river banks [40], [41], depth sounding data [34], or instantaneous water surface measurements via boat surveys [42] to locate water regions, followed by hydroflattening. The shortcomings of relying on manual intervention or a semiautomatic process have been addressed lately. Water data points can first be extracted based on an airborne lidar ratio index, the scan line intensity-elevation ratio (SLIER) [36]. The SLIER exploits how a laser interacts with a water body: along each scan line $s$, the water surface usually has a relatively lower variance in terms of elevation $z$ but a higher degree of fluctuation in terms of intensity $I$ for a given lidar dataset $L$. For every scan line $s \in L$, the SLIER can be defined as

$\mathrm{SLIER} = \dfrac{\sigma_I}{\sigma_z}.$ (2)



To further take advantage of the laser dropouts found at large incident angles, (2) can be revised as

$\mathrm{SLIER} = \dfrac{\sigma_I}{\sigma_z} \cdot \cos\theta \cdot \dfrac{N_s}{n_s}$ (3)

where $\theta$ refers to the scan angle, $N_s$ equals the maximum number of points found among the scan lines in $L$, and $n_s$ refers to the number of data points found in the current scan line $s$. If the swath of $s$ completely covers a water region, the laser dropouts cause fewer data points, leading to a small $n_s$ value. This further boosts the SLIER, causing higher values to be computed on the water surface and lower SLIER values to be associated with the land. Sample water surface data points can be located by their higher SLIER values (e.g., the top 10% within $L$). Then, a virtual water surface can be defined based on the elevation of these sample water data points [36]. These data points can further serve as training datasets for classifiers to distinguish between land and water regions [43]. Figure 6 provides an example of the SLIER and the result of water mapping.
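A minimal sketch of (2) and (3) follows, assuming the point cloud has already been grouped into scan lines (e.g., by GPS time stamps); representing $\theta$ by the mean absolute scan angle of each line and using a top-10% quantile cutoff are our own simplifications of the procedure described in [36].

import numpy as np

def slier(intensity, elevation) -> float:
    """Basic SLIER for one scan line, per (2): sigma_I / sigma_z.
    Assumes a nonzero elevation spread along the scan line."""
    return float(np.std(intensity) / np.std(elevation))

def slier_revised(intensity, elevation, scan_angle_deg, n_max) -> float:
    """Revised SLIER, per (3), boosted on lines thinned by dropouts."""
    theta = np.deg2rad(np.mean(np.abs(scan_angle_deg)))
    n_s = len(intensity)
    return slier(intensity, elevation) * np.cos(theta) * (n_max / n_s)

def water_scanline_mask(scan_lines, top_fraction=0.10):
    """scan_lines: list of (intensity, elevation, scan_angle_deg)
    arrays, one tuple per scan line; returns True for likely water."""
    n_max = max(len(i) for i, _, _ in scan_lines)
    scores = np.array([slier_revised(i, z, a, n_max)
                       for i, z, a in scan_lines])
    return scores >= np.quantile(scores, 1.0 - top_fraction)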

With the SLIER aiding in water identification, those data voids caused by laser dropouts can be located and compensated. Recently, [44] proposed two scan line void-filling algorithms to add artificial data points to the water gaps in the close-to-nadir region and at the swath edges. The algorithm handling the close-to-nadir region adds artificial data points via interpolation if the distance between two consecutive points along the scan line is larger than the mean point spacing (or ANPS). The second algorithm first estimates the maximum swath width $w$ within $L$ and then adds artificial data points to the swath until the length of $s$ reaches $w$. Assuming that the elevation of these artificial data points equals the mean value of the water data points, the entire $L$ and these artificial data points are combined to generate a DEM. This process leads to a hydroflattened water surface, and those unpleasant triangular facets disappear from the resulting DEM accordingly (see Figure 7 for an example).
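The close-to-nadir case can be sketched as follows, under the assumption stated above that artificial points take the mean water-surface elevation; the evenly spaced placement of the inserted points and the function name fill_scanline_gaps are illustrative choices rather than the exact procedure of [44].

import numpy as np

def fill_scanline_gaps(xy: np.ndarray, water_z: float) -> np.ndarray:
    """Insert artificial water points into over-wide gaps of a scan line.

    xy      : (n, 2) planimetric coordinates, ordered along the scan line
    water_z : mean elevation of the detected water points
    Returns an (m, 3) array of artificial (x, y, z) points.
    """
    gaps = np.linalg.norm(np.diff(xy, axis=0), axis=1)
    mean_spacing = gaps.mean()  # stand-in for the ANPS
    artificial = []
    for p, q, gap in zip(xy[:-1], xy[1:], gaps):
        if gap > mean_spacing:
            k = int(gap // mean_spacing)  # number of points to insert
            for t in np.linspace(0.0, 1.0, k + 2)[1:-1]:
                x, y = (1.0 - t) * p + t * q
                artificial.append((x, y, water_z))
    return np.asarray(artificial, dtype=float).reshape(-1, 3)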

FIGURE 6. (a) Laser dropouts found in a water region, particularly at the swath edges. (b) A high variance of the intensity values on the water surface. (c) High SLIER values computed on the water region, particularly on laser channels of 1,064 and 1,550 nm. (d) The use of the SLIER for land–water classification [36].



FIGURE 7. (a) Laser dropouts cause data voids along an inland river. (b) The implementation of a scan line void-filling algorithm at the close-to-nadir region and swath edges. (c) The resulting hydroflattened DEM. (d) The corresponding 3D view of the original DEM, with unpleasant triangular facets along the river. (e) The hydroflattened DEM [44].

SWATH GAPS
Another explanation for void appearance can be attributed to the gaps found between swaths. In common practice, a 15%-50% overlap between adjacent data strips is a rule of thumb during mission planning [13], [14], [45]. Combining a number of partially overlapping data strips can lead to a seamless mapping of the study region. Putting aside the possibility of poor flight planning, adjacent flight lines may have gaps due to various internal and external factors. Instantaneous flight conditions, such as air turbulence or a change of heading direction, likely cause undesired movements of the aircraft, resulting in a crooked flight trajectory. This is a common problem not only for airborne lidar systems but also for UAS platforms. As demonstrated in Figure 8, a portion of the airborne lidar data strip is twisted with a curvy swath due to possible turbulence. A notable void is thus found between the swaths, a scenario considered an unintentional cause of data voids. On the other hand, a change of system settings or a system error/bias may also lead to the presence of swath gaps. This situation can be further broken down into the following two scenarios. First, a significant change of terrain height, particularly over rugged mountainous terrain, may alter the swath of the airborne lidar if the flying altitude remains unchanged. Combining two adjacent strips under this circumstance may lead to the appearance of a swath gap. Although topographic airborne lidar systems operated with oscillating mirrors are capable of maintaining a consistent swath width by adjusting the degree of angular rotation, this may not be the case for systems equipped with rotating prisms.


The change of swath width also happens if the flight altitude changes during the flight mission, which may likewise lead to the formation of a gap between two swaths. Second, data gaps may be found between two data strips due to the limited accuracy of the GNSS and the inertial measurement unit (IMU). Such gaps may be particularly obvious along the vertical direction in the overlapping region, a case that would also lead to stripe artifacts in the resulting DEM. Strip adjustment should therefore be applied to remove system errors and biases so as to improve the geometric accuracy. Details can be found in the "Stripe Artifacts in Digital Elevation Models" section.

FIGURE 8. A swath gap with an annulus-shaped sector due to a crooked flight trajectory.

DARK OBJECTS/WEAK TARGETS
Objects with a low reflectance, which can be categorized as dark or weak targets, may result in less backscattered laser energy traveling to the receiver. Energy absorption by dark objects is not limited to water bodies. In some cases, atmospheric absorption and scattering may further lead to significant energy attenuation, resulting in a null return recorded by the system. As a result, the cause of such a data void depends on two issues: instantaneous ground conditions and system settings (i.e., gain control and emitted laser power). Various studies report null returns or a comparative sparsity of point density for objects with a low reflectance or made of highly reflective materials. Transparent structures, such as glass, may also cause a sparsity of point clouds found on rooftops since laser beams likely penetrate through the glass and backscatter from the objects underneath. This makes automatic building or rooftop detection and segmentation challenging [17]. Asphalt roads or tar-coated rooftops, such as in the example in Figure 9, absorb laser pulses and thus lead to null returns and data voids. Objects with a high degree of moisture and wetness (e.g., after a rainfall) are also likely to hinder laser backscattering. Indeed, a negative correlation exists between surface moisture (or wetness) and backscattered laser energy, and this phenomenon is particularly obvious for NIR lasers. Surface moisture further causes a reduction of the intensity difference among surface objects, for instance, sand-rich and shale-rich surfaces, as reported in [46], which makes object segmentation challenging. Moisture may reduce the backscattered reflectance (after conducting radiometric correction) by 7%-27%, comparing wet and dry stretches of road surfaces [47]. In another experiment [48], the backscattered reflectance was further reduced by 30%-50% for different sand and gravel samples that were thoroughly soaked.

FIGURE 9. Concrete rooftops and asphalt roads showing a sparsity of data points and holes.


Water standing on rooftops may cause a lack of data points that further hinders subsequent automatic 3D reconstruction [49]. Such an influence has a greater overall effect on hard objects, while the impacts of moisture are limited on vegetative covers [50]. Indeed, a lidar system is usually configured to expect both weak returns and retroreflections. Nevertheless, extremely high-energy returns may saturate the laser receiver, which is also looking for weak returns from dark objects at the same time. Some airborne lidar systems are equipped with automatic gain control (AGC), which can accommodate the varying strengths of backscattered laser returns and flying altitudes. The airborne lidar ramps down the detector's AGC when surveying retroreflective targets, while weak targets or dark objects drive up the receiver's gain [51]. Despite that, the mechanism of the AGC, though commonly known to be a linear system, is considered a "black box" by end users. There have been several attempts to utilize the AGC mechanism and the recommended intensity corrections to retrieve the (pseudo)spectral reflectance for a Leica ALS50 with the AGC setting [51], [52]. It is commonly known that a lidar system increases the PRF to achieve an improved data density. The increase, in turn, causes a decrease in the emitted laser pulse energy [53] as a side effect and further reduces the possibility of weak target detection. Therefore, some airborne lidar systems may produce a varying laser signal strength to leverage the PRF. For instance, the Optech Titan is equipped with three laser channels with a linear detector response. As a result, the recorded backscattered intensities are linear across the entire scale. However, this particular system has an intensity transfer function [36], which can change the power ratio of the green (532-nm) and shortwave infrared (SWIR) (1,550-nm) channels with respect to the infrared (1,064-nm) channel. The power ratio of the green channel can be twice as much as that of the infrared channel, regardless of the PRF settings. The SWIR channel has a varying power ratio, depending on the PRF; it can generate a laser power ratio of up to three times with the reduction of the PRF [36]. Although an increased power ratio can improve the capability of target detection and minimize the presence of data voids, it may also induce a high saturation of signal returns that appears like "speckle noise" in the resulting intensity image. More details are covered in the "Speckle Intensity Noise on Water Surfaces" section.

SOLUTIONS AND FUTURE DIRECTIONS
With respect to the previously mentioned causes, the detection of data voids becomes critical, and subsequent void-filling strategies should be implemented to resolve these defects. Indeed, good mission planning can alleviate the effects of occlusions and swath gaps. Various lidar specifications recommend a 15%-50% overlap between adjacent flight strips [13], [14], [45]. Also, seamless mapping should include multiple scans of the study area, with different flight directions, so as to minimize the effects of occlusions and swath gaps.


on the ground or not [57]. Despite a recent study on terresdirections so as to minimize the effects of occlusions and trial lidar that classified occlusions and laser dropouts on swath gaps. Very often, end users remove the data points terrains [58], research in point cloud void classification rewith large incident angles (215°) since these off-nadir mains in an early stage. A specific filling mechanism should data points suffer from significant laser energy attenuation, be designed for gradient changes of water slopes, such as at resulting in stripe artifacts in the intensity image. If these waterfalls or downstream locations [59]. For those artificial data points are incorporated in subsequent analyses, radioobjects, point cloud completion or generative adversarial metric correction should be implemented. More details can networks (GANs) can be introduced to repair the missing be found in the “Intensity Striping at Swath Edges” section. regions if intensive training data are available [60], [61], Detection of data voids within airborne lidar point [62], [63]. Persistent homology, which discerns topological clouds can be made based on certain assumptions or features from the point cloud through algebraic theories, threshold techniques. Zhou and Vosselman [15] identified can also be introduced to determine significant holes from the gaps of curb lines caused by the occlusions of on-road the topological noise [64]. All in all, a comprehensive workvehicles, and only those gaps with a predefined length flow for automatic void detection, classification, filling/in(based on trial and error) were bridged. Yan [44] used the painting, and reconstruction is desired. time stamp and distance between two consecutive points along the scan line as a criterion to look for laser dropouts STRIPE ARTIFACTS on water surfaces. Feng et al. [18] fitted a minimum boundStripe artifacts are also referred to as rippling, periodic banding rectangle to estimate whether a rooftop suffers from ing, stairstepping, or corduroy in the literature. Such a defect partial occlusion caused by adjacent trees. Elberink and may not be spotted out until one interpolates the airborne Vosselman [49] adopted the rooftop area, orthogonal dislidar point clouds to an intensity image or a DEM. An intance, and shortest distance with corresponding predefined dividual lidar data strip may suffer from intensity banding thresholds to assess whether there exists a lack of data due to the misalignment between the field of view (FOV) of points or gaps. Apart from making use of the raw lidar data the receiver and emitter. When combing multiple overlapproperties, various computational geometry and computer vision methods are proposed via detecting and filling the ping lidar data strips, intensity striping may appear at swath holes in triangular meshes [54], [55]. edges. This can be explained by the significant attenuation After gap/void detection, the design of a corresponding of laser beams with longer ranges and larger incident angles. filling mechanism highly depends on the type of voids/ Geometric misalignment between two overlapping swaths gaps being found. A flattened water surface with a specific may also lead to a rippling effect in the resulting DEM. elevation value can be assigned after compensating the laser dropouts in the water region [44]. 
After gap/void detection, the design of a corresponding filling mechanism highly depends on the type of voids/gaps found. A flattened water surface with a specific elevation value can be assigned after compensating for the laser dropouts in the water region [44]. Studying saliency features, such as normal and curvature, can determine the corresponding way to close the gaps found on rooftops with a sparsity of data points [17]. On the other hand, a sigmoid function along a spline can be used to reconstruct the distorted curb lines caused by occlusion [15]. Therefore, the filling and inpainting processes are object and shape dependent.

Most of the time, data voids within an airborne lidar dataset can be detected and located. Nevertheless, it is hard to decide on the corresponding filling mechanism, especially if the dataset contains various voids found in water regions, terrains, and objects with low reflectance, as well as at steep slopes [56]. As a result, a future direction should focus on a machine learning approach to determine and classify the types of voids found. Classification of data voids/gaps is critical since it may affect the inference of change detection. Data gaps caused by building occlusions should be considered since doing so may infer whether there is a change on the ground or not [57]. Despite a recent study on terrestrial lidar that classified occlusions and laser dropouts on terrains [58], research in point cloud void classification remains at an early stage. A specific filling mechanism should be designed for gradient changes of water slopes, such as at waterfalls or downstream locations [59]. For artificial objects, point cloud completion or generative adversarial networks (GANs) can be introduced to repair the missing regions if intensive training data are available [60], [61], [62], [63]. Persistent homology, which discerns topological features from the point cloud through algebraic theories, can also be introduced to distinguish significant holes from topological noise [64]. All in all, a comprehensive workflow for automatic void detection, classification, filling/inpainting, and reconstruction is desired.

STRIPE ARTIFACTS
Stripe artifacts are also referred to as rippling, periodic banding, stairstepping, or corduroy in the literature. Such a defect may not be spotted until one interpolates the airborne lidar point clouds into an intensity image or a DEM. An individual lidar data strip may suffer from intensity banding due to the misalignment between the field of view (FOV) of the receiver and the emitter. When combining multiple overlapping lidar data strips, intensity striping may appear at the swath edges. This can be explained by the significant attenuation of laser beams with longer ranges and larger incident angles. Geometric misalignment between two overlapping swaths may also lead to a rippling effect in the resulting DEM.

INTENSITY BANDING
If an airborne lidar data strip has a notable systematic striping pattern in the interpolated intensity image, the lidar data highly likely suffer from intensity banding (see Figure 10). Intensity banding occurs only if the lidar data strip is collected by an airborne lidar system operated with an oscillating mirror that has an incorrect optical alignment [65], [66]. Therefore, the stripe artifacts caused by intensity banding are always along the across-track direction. Also, one should note that the term “intensity banding” here does not share the same meaning as “banding” in lidar point sampling, where a condensed or stretched point distribution is found along the data strip due to a change of the aircraft’s speed during the survey [67].

FIGURE 10. (a) Stripe artifacts found in an intensity image generated by an individual lidar data strip. (b) The intensity image after the removal of the banding effect.

Intensity banding is mainly attributed to the misalignment between the axes of the laser transmitter and the receiver’s FOV. Under normal circumstances, the two axes should be properly aligned and parallel to each other (see Figure 11). In this case, the backscattered laser beam should be completely captured within the receiver’s FOV. Due to improper calibration, mechanical failure, or a collision during installation or delivery, the oscillating mirror may cause the laser beam’s axis to lean away from or bend toward the receiver’s axis. As a result, the backscattered laser beams collected in the scanning direction in which the laser beam’s axis leans away from the receiver’s axis may partially fall outside the receiver’s FOV. This causes a consistent attenuation of the collected laser signal strength in that particular scanning direction and ultimately leads to the stripe artifacts found in the lidar dataset. Interested readers are referred to [66].

FIGURE 11. The imperfect alignment of a laser beam’s axis and receiver’s FOV in a specific direction, causing intensity banding [66].

Currently, a number of advances have been made to resolve intensity banding. Image filters are one of the common ways to handle this issue since the stripe artifacts can be treated as systematic intensity noise. Nobrega et al. [68] proposed an anisotropic diffusion filter, which is capable of smoothing intraregional areas preferentially over interregional areas. This not only removes the striping but also preserves the edges of objects. However, although the filter can successfully achieve denoising and outperform low-pass filters, it also oversmoothens the intensity image, causing a blurry effect. On the other hand, [65] proposed to adopt histogram matching to normalize the intensity histograms collected by the two opposite scans. Recently, lidar scan line correction (LSLC) was proposed in [66] to adjust the radiometric misalignment found between the two opposite scans within a lidar data strip. The LSLC is, in essence, a polynomial function that normalizes all the scan lines of the particular direction having a lower mean intensity (i.e., I_l) with reference to the opposite direction, which has a higher mean intensity (i.e., I_h). The model parameters, involving both the intensity and the scan angle θ, can be estimated by first pairing up the nearest points of the two scans and then solving the polynomial function by using an iteratively reweighted least-squares adjustment, as follows:

$$I_h = a_0 + a_1 I_l + a_2\theta + a_3 I_l\theta + a_4 I_l^2 + a_5\theta^2. \qquad (4)$$

All the scan lines suffering from the banding are corrected using the LSLC together with the estimated parameters. The resulting intensity image after the LSLC should be striping-free. The authors of [66] reported a reduction of the coefficient of variation (cv) by 13%–80% on three monochromatic lidar datasets suffering from light, mild, and extreme banding effects after implementing the LSLC. The work in [69] further applied the LSLC to the two laser channels (1,064 nm and 1,550 nm) of multispectral airborne lidar data collected by an Optech Titan to remove the intensity banding, with reductions of cv by 4.8%–66.9% and 0.2%–23.1%, respectively.
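
The following is a minimal sketch of how the coefficients of (4) could be estimated by iteratively reweighted least squares; the Huber-style weighting, iteration count, and function name are assumptions for illustration rather than the implementation of [66].

```python
import numpy as np

def fit_lslc(I_l, I_h, theta, n_iter=10, delta=1.0):
    # Design matrix of the second-order polynomial in (4).
    A = np.column_stack([np.ones_like(I_l), I_l, theta,
                         I_l * theta, I_l**2, theta**2])
    w = np.ones(len(I_h))
    for _ in range(n_iter):
        sw = np.sqrt(w)
        a, *_ = np.linalg.lstsq(A * sw[:, None], I_h * sw, rcond=None)
        r = I_h - A @ a                            # residuals of the current fit
        s = np.maximum(np.abs(r), 1e-12)
        w = np.where(s <= delta, 1.0, delta / s)   # downweight outliers
    return a                                       # a_0 ... a_5

# Usage: evaluate the polynomial on every low-intensity scan line to
# map it onto the radiometric level of the opposite direction.
```
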

If point cloud density is not a major concern, some of the existing studies simply abandon the lidar scan lines of the particular scanning direction that have a consistently lower intensity value than the opposite ones. This can be achieved by computing the mean intensity values of the respective two scanning directions by using the “scan flag” field of the LAS file. For instance, Okhrimenko and Hopkinson [70] omitted half of the lidar dataset (i.e., those scan lines suffering from intensity banding) to compute spectral vegetation indices when using an Optech Titan. The resulting scan line pattern is similar to that collected by a rotating prism. In [71], the authors adopted a similar approach in their radiometric analysis but retained the entire dataset for the geometric/spatial analysis of canopies. Nevertheless, Goodbody et al. [72] argued that the reduced point cloud density, particularly in the green channel (532 nm) of an Optech Titan, affected their voxelization results when computing vegetation indices for modeling forest inventory attributes.

Although the intensity banding issue has been addressed in the literature lately, all the previously mentioned methods have their own drawbacks. Specifically, the banding issue is prominent at the close-to-nadir region but more subdued at the swath edges [66]. This can be explained by the speed of the oscillating mirror, which decreases when approaching the swath edges, causing a high overlap factor between the laser beam and the receiver’s FOV. On the other hand, the oscillating mirror rotates at a high speed when approaching the nadir; as a result, the overlap factor between the laser beam and the receiver’s FOV decreases. Therefore, image filtering and histogram equalization [65] may not be able to appropriately handle the different levels of banding in these regions. Although the LSLC considers the scan angle as one of the correction parameters, the method relies on global parameter estimation, which lacks robustness and flexibility [73].

In the future, it is recommended to develop a local parameter estimation method within, say, a pair of scan lines or an individual region along the across-track direction and to consider other physical parameters and/or overlap functions. This can lead to a more robust solution to resolve intensity banding.

INTENSITY STRIPING AT SWATH EDGES
Stripe artifacts at swath edges can be explained by laser attenuation (see Figure 12). When a laser beam travels through the atmosphere and finally reaches the ground, it undergoes aerosol/molecular absorption and scattering. As a result, the longer the range, the greater the attenuation of the emitted and backscattered laser echoes. A laser beam reaching the swath edge implies a large incident angle, which also significantly reduces the backscattered laser power. Combining data points with laser attenuation occurring at the swath edges with the data points from the nadir of another strip thus causes the stripe artifacts notable in the overlapping region of the intensity image. To physically model the laser attenuation, the radar (range) equation [38] is used to describe the received laser power with respect to various parameters:

$$P_r = \frac{P_t G_t}{4\pi R^2} \cdot \frac{\sigma}{4\pi R^2} \cdot \frac{\pi D^2}{4}\,\eta_{atm}\,\eta_{sys} \qquad (5)$$

where P_r refers to the recorded laser power that is typically scaled to an 8- to 16-b intensity value, P_t is the transmitted laser power, R is the laser range, G_t refers to the antenna’s gain factor, σ is the laser cross section, D is the aperture diameter, and η_sys is a transmission factor that is system dependent. The atmospheric attenuation factor, η_atm, can be modeled based on the Beer–Lambert law:

$$\eta_{atm} = e^{-2aR} \qquad (6)$$

where a is the extinction coefficient that can be determined by summing up the aerosol (Mie) scattering, Rayleigh scattering, and aerosol/molecular absorption, all of which are wavelength-dependent parameters [74].

FIGURE 12. (a) Laser attenuation found at long ranges and large incident angles, resulting in smaller intensity values at the swath edges compared to those at the nadir region. (b) Striping found at the swath edges when combining two overlapping data strips. (c) Striping found in the entire overlapping region, due to different flight altitudes.

The laser cross section σ in (5) can be formulated based on the assumption of Lambertian reflectance:

$$\sigma = 4\pi\rho A\cos(\theta_r) \qquad (7)$$

where ρ is the (pseudo)spectral reflectance of the target object, A is the area of the projected laser footprint, and θ_r is the laser incident angle. Indeed, radiometric calibration or intensity correction models are mainly built upon the previously mentioned radar (range) equation, which describes the relationship between the emitted and recorded laser energy with respect to the range, incident angle, surface condition, atmospheric attenuation, and other system factors [75], [76]. Theoretically, the collected intensity data represent the amplitude of the laser backscattered signal strength after linearization, and thus, radiometric calibration/correction aims to convert the intensity value into a relative (pseudo)spectral reflectance after considering the preceding attenuation factors [74], [75], [77] through reformulating (5) as follows:

$$I \propto \rho \cdot \frac{1}{R^2} \cdot \cos\theta \cdot e^{-2aR}. \qquad (8)$$

Range normalization is also widely adopted to correct the intensity data (I) [78], [79], where it simply considers the effects of the range, without taking into account the rest of the parameters. This results in the normalized intensity (I_n) being directly proportional to R raised to the power of a factor f; i.e.,

$$I_n = I \cdot \left(\frac{R}{R_r}\right)^f \qquad (9)$$

where R_r refers to a reference range value that is commonly set to be the minimum or median range value within the collected lidar data strips, and the value of f is commonly two, according to the radar (range) equation [38]. Nevertheless, such a setting handles only simple land cover scenarios surveyed by monochromatic lidar systems with stable settings. Different research studies have attempted to look for an optimal f value through a cross-validation approach with a range of predefined values, e.g., an f between zero and four [52], [80], [81]. Indeed, the value of f varies with respect to the study area (or the ground objects to be surveyed) and the specific lidar system, so a fixed f value is highly unlikely to apply across lidar systems or study areas. Also, the cross-validation approach (i.e., setting a range of f values to yield the lowest cv) to look for an optimal f is not practical at all. Therefore, an automatic approach, one that can estimate the correction parameters from the overlapping lidar data strips, is desired, especially for the increased number of laser channels of multispectral lidar systems.
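
As a hedged sketch of how (8) and (9) translate into practice, the snippet below applies range normalization and a radar-equation-style correction, and mirrors the cv-based search for f that the preceding paragraph describes (and critiques); the function names, extinction coefficient, and candidate grid are illustrative assumptions.

```python
import numpy as np

def range_normalize(I, R, R_ref, f=2.0):
    # Range normalization of (9): I_n = I * (R / R_ref)**f.
    return I * (R / R_ref) ** f

def radar_equation_correct(I, R, theta, a=2e-5):
    # Relative reflectance following (8): rho ~ I * R^2 * e^(2aR) / cos(theta).
    # `a` is an assumed extinction coefficient in 1/m.
    return I * R**2 * np.exp(2.0 * a * R) / np.cos(theta)

def search_f(I, R, R_ref, candidates=np.arange(0.0, 4.1, 0.1)):
    # Pick the exponent minimizing the coefficient of variation over a
    # (presumed homogeneous) target, e.g., an asphalt surface.
    best, best_cv = None, np.inf
    for f in candidates:
        In = range_normalize(I, R, R_ref, f)
        cv = np.std(In) / np.mean(In)
        if cv < best_cv:
            best, best_cv = f, cv
    return best
```
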

FIGURE 13. Two overlapping swaths of a multispectral airborne lidar dataset, one with a large scan angle at the bottom and the other with a small scan angle (near the nadir region) at the top. The (a) original intensity, (b) original intensity + LSLC (removal of the intensity banding), and (c) original intensity + LSLC + overlap intensity correction (further removal of the striping at the swath edge).

Recently, Yan et al. [69] reported a two-stage correction for multispectral airborne lidar data by first removing the intensity banding, using the LSLC in each of the individual data strips [refer to (4)], followed by an overlap-driven intensity correction (see Figure 13 for the removal of the striping caused by the laser attenuation found at the swath edges [44]). The overlap intensity correction estimates the correction parameters of range, angle, and atmospheric attenuation from the overlapping data strips through an iteratively reweighted least squares with a Huber estimator to resist outliers. The removal of striping can, in turn, improve not only the lidar data quality but also subsequent analyses. Given that the lidar community is moving toward the use of intensity and/or waveform features for surface classification and object extraction, a number of studies reap the benefits of radiometric calibration, range normalization, and intensity correction, resulting in improved intensity homogeneity and classification accuracy [74], [75], [77]. On the other hand, radiometric treatment seems to provide a marginal impact on lidar intensity-derived forest metrics [69], [82], [83]. Despite these well-developed techniques, a stratified approach to intensity correction, similar to topographic correction in satellite remote sensing, is apparently desired so that heterogeneous land covers receive different levels of correction for the laser range, incident angle, and atmospheric attenuation [84].

STRIPE ARTIFACTS IN DIGITAL ELEVATION MODELS
Aside from the unpleasant triangular facets found in the lidar-derived DEM (refer to the “Laser Dropouts on Water Surfaces” section), the existing literature also reports the appearance of stripes in the resulting DEMs, especially those collected by small-footprint airborne lidar systems. These stripes appear notably on rooftops and flat ground, disrupting terrain analyses (e.g., the computation of slope and aspect). Stripe artifacts in DEMs intrinsically imply an imperfect calibration and improper alignment of the overlapping data strips. The causes can be explained by GNSS and IMU measurement errors. An airborne lidar system comes with a positioning and orientation system (POS), including the GNSS and an IMU for direct georeferencing [85], [86]. The GNSS determines the instantaneous 3D coordinates of the aircraft, while the IMU collects the angular motion, or orientation, i.e., R_{ω,φ,κ}, of the aircraft. For a linear lidar system, the lidar equation thus provides a direct georeferencing solution to estimate the 3D coordinates (X_G) of each range measurement (r) with respect to a known coordinate system (X_0), as follows [87]:

$$\vec{X}_G = \vec{X}_0 + R_{\omega,\varphi,\kappa}\,\vec{P}_G + R_{\omega,\varphi,\kappa}\,R_{\Delta\omega,\Delta\varphi,\Delta\kappa}\,R_{S_\theta\theta}\begin{bmatrix}0\\0\\-(r+\Delta r)\end{bmatrix}. \qquad (10)$$

All the components (i.e., the laser ranger, POS, and oscillating mirror) in an airborne lidar system certainly have their own error sources [88], and these components cannot perfectly align to a common physical origin (see Figure 14). The laser ranger may be intrinsically subject to a systematic error (i.e., Δr) in each of the range measurements (r). The oscillating mirror, which distributes laser beams on the ground with scan angle θ, is embedded with a scale factor S_θ, as shown in (10). In terms of the physical alignment, the spatial offset and orientation difference between the laser ranger and the POS respectively correspond to the issues of the lever arm (P_G) and the boresight angles (R_{Δω,Δφ,Δκ}). Besides, the sampling rate of each component is different and difficult to precisely synchronize. These uncertainties and errors thus accumulate in each of the lidar data strips, leading to swath mismatch and striping in the resulting DEM (see Figure 15).
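
A minimal sketch of the direct georeferencing in (10) is given below; the Euler rotation sequence, the axis carrying the scaled scan angle, and the function names are assumptions made for illustration only.

```python
import numpy as np

def rot(omega, phi, kappa):
    # Rotation matrix built from an assumed X-Y-Z Euler sequence.
    cw, sw = np.cos(omega), np.sin(omega)
    cp, sp = np.cos(phi), np.sin(phi)
    ck, sk = np.cos(kappa), np.sin(kappa)
    Rx = np.array([[1, 0, 0], [0, cw, -sw], [0, sw, cw]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[ck, -sk, 0], [sk, ck, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def georeference(X0, attitude, P_G, boresight, theta, r, S_theta=1.0, dr=0.0):
    R_att = rot(*attitude)                   # platform attitude from the IMU
    R_bore = rot(*boresight)                 # boresight misalignment angles
    R_scan = rot(0.0, S_theta * theta, 0.0)  # scan-angle rotation with scale S_theta
    beam = np.array([0.0, 0.0, -(r + dr)])   # ranging vector, pointing downward
    return X0 + R_att @ P_G + R_att @ R_bore @ R_scan @ beam
```
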

FIGURE 14. The system calibration of airborne lidar data strips.

FIGURE 15. (a) Striping on a DEM. (b) A DEM generated by geometrically calibrated lidar data strips based on a quasi-rigorous calibration procedure [87].


Striping in DEMs can be resolved in two major ways, depending on whether the original point clouds or the resulting DEMs are available. If the original lidar data strips are available, strip adjustment offers a way to improve the geometric data quality by minimizing the discrepancies found in the overlapping region of adjacent strips [89]. In general, the process of strip adjustment begins with the identification of common shapes (such as rooftops or planar patches) extracted from the point clouds or the derived DEM in the overlapping region. Surface matching then determines and quantifies the discrepancies between the two overlapping data strips. A number of options are available for surface matching; these include point-to-point analysis (e.g., the iterative closest point), surface or linear feature matching [90], and so on. Apart from using specific common shapes/objects for surface matching, lidar intensity values can also serve as another constraint for surface matching and subsequent adjustment; this is found to be predominantly useful in reducing the horizontal discrepancy [91]. Finally, the discrepancy between two strips can be minimized by applying a 3D geometric transformation to the strips (such as a 3D affine transformation) or a correction to the previously mentioned sensor parameters in (10) (i.e., a system calibration approach). Interested readers are referred to [89].

If end users are unable to obtain the original lidar data strips, there are also several solutions to resolve the stripe artifacts in the resulting DEMs. Albani and Klinkenberg [92] proposed a cross-smoothing algorithm with a probability replacement function to constrain the elevation of the striping in the DEM. Gallant [93] adopted a multiscale adaptive smoothing filter to aid in DEM noise removal with varying characteristics. Image filters based on wavelet and/or Fourier transforms can help eliminate the stripe artifacts in different orientations [94], [95].
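
To make the frequency-domain option concrete, the sketch below notches the spectral band that periodic stripes occupy in a gridded DEM; the notch geometry and width are illustrative assumptions, not the filters of [94], [95].

```python
import numpy as np

def fft_destripe(dem, width=3):
    # Stripes running along the rows concentrate spectral energy in the
    # central column of the shifted 2D spectrum; zero that band out
    # while preserving the low-frequency (DC) block.
    F = np.fft.fftshift(np.fft.fft2(dem))
    rows, cols = F.shape
    cr, cc = rows // 2, cols // 2
    mask = np.ones((rows, cols))
    mask[:, cc - width:cc + width + 1] = 0.0          # notch the stripe band
    mask[cr - width:cr + width + 1,
         cc - width:cc + width + 1] = 1.0             # keep the DC neighborhood
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
```
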


All these provide solutions to resolve the striping in lidar-derived DEMs, which ultimately improves the DEM quality for various applications, such as volcanic analyses [96].

SPECKLE/RANDOM NOISE
Similar to stripe artifacts, noisy returns may appear scattered in the original point clouds as well as in their data derivatives, such as intensity images and CHMs. 3D point cloud noise can be removed by postprocessing algorithms based on spatial filtering, differences in elevations, or local densities. Speckle intensity noise found on water surfaces, although appearing difficult to handle, sheds light on certain information for automatic water mapping.

SPECKLE INTENSITY NOISE ON WATER SURFACES
When an airborne lidar system surveys a water region, the resulting intensity image may be corrupted by speckle noise in addition to the effects of laser dropouts (i.e., data voids), as previously mentioned (refer to the “Laser Dropouts on Water Surfaces” section). Unlike passive optical remote sensing images, which capture a homogeneous reflectance over a water region, the emitted laser beams undergo a series of scattering, reflection, and refraction processes. When the incident angle is almost perpendicular, specular reflection or even retroreflection occurs, causing a strong reading of the echo amplitude [97]. Since the instantaneous conditions of water bodies keep changing, all these processes cause different levels of energy transfer, leading to a variation of the backscattered laser pulse strength [36]. In addition, if the gain control keeps changing (as mentioned in the “Dark Objects/Weak Targets” section), the collected intensity also varies even when the surveyed objects remain the same [51]. All these effects result in the presence of speckle noise in the intensity image (see Figure 16). The drawbacks of speckle noise can be summarized in various ways.


FIGURE 16. Speckle intensity noise appears in different types of water regions, such as (a) inland rivers, (b) natural shores, (c) inland lakes, and (d) rocky shores.


In optical remote sensing, multispectral image bands can aid in deriving biophysical indices not only for the retrieval of vegetation parameters but also for assessing water conditions, e.g., the Normalized Difference Water Index (NDWI). With the invention of dual- and tri-channel lidar systems, green and NIR laser channels can conceptually serve such a purpose. However, Yan et al. [36] have shown that speckle intensity noise found on water surfaces degrades the quality of the derived NDWI used for delineating water bodies from land. Such an issue is particularly serious when dealing with high waves, turbulence, and whitecaps because laser pulse returns tend to interact with the water surface via Bragg scattering [37]. Apart from the inappropriateness of deriving the water index, the direct use of lidar intensity in a machine learning classifier to extract water regions may suffer from the issues of between-class spectral confusion and within-class spectral variation [43]. This may result in salt-and-pepper noise in the classification results. Therefore, proper treatment of the intensity should be instituted prior to information extraction. A number of attempts have been reported to mitigate speckle intensity noise through image filtering; these include the median filter [98] and the mean filter [99]. However, these filters may oversmoothen the entire dataset, in which most regions are, in fact, free of speckle noise contamination. Incekara et al. [100] proposed to implement a mean-shift segmentation on the intensity image for image smoothening since it maintains the edge of a water lake while eliminating the noise. To the best of the author’s knowledge, a comprehensive solution that completely removes the intensity speckle noise on water surfaces is yet to be available. Nevertheless, since speckle noise can already be used to distinguish water regions from other topographic features (the use of the SLIER by [36], as described in the “Laser Dropouts on Water Surfaces” section), other potentially useful information that aids downstream analyses could, in principle, also be harnessed. Our deepened understanding of the nature of the noise from these efforts will eventually establish the foundations for a more comprehensive solution of noise removal.
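
As a simple illustration of the dual-channel idea (and of why despeckling should precede thresholding), the sketch below derives an NDWI-style index from gridded green and NIR intensity images after a median filter; the channel names and kernel size are assumptions.

```python
import numpy as np
from scipy.ndimage import median_filter

def lidar_ndwi(I_green, I_nir, k=3, eps=1e-9):
    # Knock down isolated speckle first; otherwise noisy water pixels
    # leak across the land-water decision boundary.
    g = median_filter(I_green.astype(float), size=k)
    n = median_filter(I_nir.astype(float), size=k)
    ndwi = (g - n) / (g + n + eps)   # values > 0 tend toward water
    return ndwi
```
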

3D POINT CLOUD NOISE
End users of airborne lidar data may notice noisy returns hovering over the 3D point clouds. These anomalies, if not properly resolved, may cause strong convexities in the data, leading to spikes in DEMs or heterogeneous intensity in the projected images. The causes of these defects can be ascribed to system- and environmentally induced factors, such as random/systematic errors of range detection and instantaneous atmospheric conditions, e.g., rain, snow, the solar background, and so on.

A time-of-flight (TOF) lidar system determines the range of an on-ground object by measuring the arrival time of a backscattered laser echo whose leading-edge amplitude exceeds a specific detection threshold. The target range can then be computed by considering the round-trip travel time and the speed of light. As a result, the accuracy of range detection highly depends on the backscattered echo’s amplitude. If a certain degree of high-frequency noise, caused by fluctuations in air currents, for instance, is superimposed on the backscattered laser echo, it may cause an upward noisy artifact that preempts the actual signal. The TOF mechanism thus records a return earlier than it should and recognizes it as a data point that is closer to the system than its intended position. Such a scenario is named timing jitter [101] [see Figure 17(a)].

Depending on the target range, size, and reflectivity, the backscattered laser echoes may have similar shapes but different amplitudes. As shown in Figure 17(b), although the echoes’ peaks are identical to each other in the time domain, the echo with the higher peak amplitude has an upward shift of the signal, passing the detection threshold earlier than the one with the lower peak amplitude. Similar to the scenario of timing jitter, the TOF mechanism may recognize a return earlier than it should, resulting in a noisy point hovering over the point cloud. Such a scenario is commonly named RWE [101]. The timing jitter problem is indeed hard to solve since it is a random error. The RWE, on the other hand, can be treated as a systematic error that can be adjusted by considering the time offset between the pulse rise times of the transmitted and recorded echoes. Adjusting the threshold over time can also account for the variation of target properties and atmospheric attributes.

Apart from the aforementioned system-induced noisy returns, airborne lidar systems are sensitive to instantaneous environmental and atmospheric conditions, resulting in unwanted returns recorded in the point clouds. Fire smoke and cumulus clouds appear as various clustered points located on top of the terrain surface (see the examples in Figure 18). Although end users can adjust the point clouds by removing these obvious anomalies manually, the corresponding occlusion still causes the problem of data voids (refer to the “Occlusions/Shadows” section).
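
The toy experiment below reproduces the RWE geometry just described with a fixed-threshold leading-edge detector and two synthetic echoes of different amplitudes; the pulse shape, threshold, and timing values are purely illustrative.

```python
import numpy as np

C = 3.0e8  # speed of light, m/s

def leading_edge_range(t, waveform, threshold):
    # One-way range from the first sample whose amplitude exceeds the
    # detection threshold (leading-edge discrimination).
    idx = np.argmax(waveform > threshold)
    return C * t[idx] / 2.0

t = np.linspace(0.0, 60e-9, 6001)                 # 60-ns window, 10-ps steps
echo = lambda amp: amp * np.exp(-((t - 30e-9) / 2e-9) ** 2)

r_strong = leading_edge_range(t, echo(1.0), threshold=0.2)
r_weak = leading_edge_range(t, echo(0.4), threshold=0.2)
print(r_strong - r_weak)  # negative: the stronger echo triggers earlier (RWE)
```
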

FIGURE 17. (a) Timing jitter: a certain degree of high-frequency noise superimposed on the backscattered laser echo causes an upward noisy signal that advances the time, exceeding the threshold. (b) RWE: the leading edge detection of a backscattered laser echo influenced by the pulse amplitude and width that may result in an earlier record (t1) of the backscattered laser pulse. Regardless of (a) and (b), floating points may hover on the point cloud.


FIGURE 18. Noisy 3D returns caused by (a) smoke, (b) unknown flying objects, (c) cloud, and (d) solar background noise or rain.

Instantaneous rainfall or snowstorms during airborne lidar surveys generate randomly scattered returns. Similar scenarios have been reported in photon-counting lidar systems, such as the Leica SPL-100, where the system’s detector is sensitive to solar background photons [7]. As a result, the solar background noise contaminates the collected point clouds. Unlike the previously mentioned unwanted returns caused by clouds or smoke, these random noisy returns are challenging to remove manually. There are two ways to remove the outliers when one faces the preceding situations. Users can perform denoising on the point cloud based on different 3D spatial filters [102]; clustering approaches, such as principal component analysis [103] and density-based spatial clustering of applications with noise [104]; or deep learning approaches [105]. Unlike point clouds collected by low-cost sensors, the density of outliers is, in most cases, significantly lower than that of the point cloud itself. Therefore, one may simply investigate the distance between any pair of closest points or construct a voxel grid to assess the local density [7] so as to look for the outliers.
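
A minimal sketch of the nearest-neighbor distance test follows, using a k-d tree to flag points that are statistically isolated; the neighbor count and sigma multiplier are illustrative parameter choices.

```python
import numpy as np
from scipy.spatial import cKDTree

def remove_isolated_points(xyz, k=8, n_sigma=3.0):
    tree = cKDTree(xyz)
    d, _ = tree.query(xyz, k=k + 1)   # the first neighbor is the point itself
    mean_d = d[:, 1:].mean(axis=1)    # mean distance to the k nearest neighbors
    # Points whose neighborhoods are unusually sparse are likely noise.
    keep = mean_d < mean_d.mean() + n_sigma * mean_d.std()
    return xyz[keep], keep
```
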

FIGURE 19. (a) A CHM suffering from data pits. (b) The CHM after the data pits are removed.

If end users are unable to have the original point clouds on hand, a despiking algorithm, such as the despike virtual deforestation algorithm [106] or the spike-free algorithm [107], can be applied to the triangulated irregular network (TIN) to remove the spiky features and negative blunders as well as to smoothen the terrain.

PITS AND SPIKES IN CANOPY HEIGHT MODELS
The penetrating property of lasers and the ability of lidar systems to probe and reconstruct environments in 3D have allowed scientists to adopt airborne lidar data to retrieve and understand forest structures and properties. Although airborne lidar systems have been proved to be superior to optical imaging sensors in fine-scale forestry studies [108], [109], the collected datasets have certain imperfections disrupting the retrieval of forest inventory parameters. Aside from the occlusion effects caused by foliage and upper canopies, as mentioned in the “Occlusions/Shadows” section, the retrieved CHMs often suffer from randomly distributed data pits and spikes [see Figure 19(a)]. These unnatural pits cause a high fluctuation of elevations in the CHMs, which may influence the estimation of tree heights, crown boundaries, basal areas, and stand volumes or induce errors in treetop detection [107], [110], [111].

Due to the natural variation of canopies and irregularities in tree heights, the emitted laser beams usually produce a number of returns from the upper canopies, tree trunks, branches, and foliage before reaching the understory or even hitting the ground [see Figure 2(b)]. Occasionally, laser beams may not interact with any overstory and simply penetrate deeply into the crown, resulting in the first return being backscattered from the lower canopy branches, foliage, or ground. These first returns represent a significantly lower elevation than their immediate surroundings, and thus, they appear like dark holes or negative blunders in the resulting CHMs. Combining multiple overlapping data strips may further worsen the issue of data pits since laser beams may “peek under” different locations from multiple flight lines [107]. The appearance of data pits can also be ascribed to the interpolation of CHMs for filling missing/empty regions [110].

Forest modelers and scientists desperately look for pit-free CHMs prior to the retrieval of forest parameters. Traditional image filters, such as mean or median filters, can work to some extent. Nevertheless, the kernel size is certainly a critical factor since data pits appear in various sizes. Also, such an image filtering approach may oversmoothen all the elevations (i.e., pixel values) in the CHMs [112]. Among all the existing methods, the pit-free algorithm proposed by [111] is considered the most favorable method to generate a pit-free CHM. Such an approach relies on a cascading workflow by first generating CHMs at different elevations (e.g., at 2 m, 5 m, and so on) using all first returns and subsequently stacking all the resulting CHMs while taking the maximum height values [see the example in Figure 19(b)].

Another way to tackle data pits involves pit detection and filling. It usually begins with ground filtering to delineate the ground and nonground points. Then, the process carries on by ignoring the data pits and spikes to generate the CHMs. A common approach simply looks around each data point with a spherical kernel and assesses whether the elevation of the surrounding nonground points exceeds a certain threshold so as to conduct an interpolation for CHM generation [113]. A similar mechanism is found in [107], which studies whether the length of a triangular edge in the resulting TIN is within a freeze distance, where the value depends on the ANPS. All these approaches require the determination of a specific threshold, and a parameter-free approach is often desired. Recently, cloth simulation has been introduced to remove data pits without the need of undergoing the pit detection and filling processes [114]. All the previously mentioned data pits appear in monochromatic lidar-derived CHMs. Indeed, similar data pits are also noted in the forest covers of multispectral airborne lidar intensity data. However, there is a lack of studies thus far addressing these intensity pits, providing opportunities for further investigations.
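
The stacking step of the pit-free idea can be sketched in a few lines, assuming the partial CHMs have already been rasterized from first returns above successive height thresholds (the rasterization itself, done via TIN interpolation in [111], is omitted here):

```python
import numpy as np

def stack_pit_free(partial_chms):
    # partial_chms: list of equally sized 2D arrays (one per height
    # threshold, e.g., 0 m, 2 m, 5 m, ...), with NaN where a layer has
    # no first returns.
    stack = np.stack(partial_chms)        # (n_layers, rows, cols)
    return np.nanmax(stack, axis=0)       # the highest surface wins per cell
```
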

OTHER ARTIFACTS

UNWANTED POINTS PRESENT BETWEEN TWO NEARBY OBJECTS
Occasionally, end users may notice the existence of undesired data points between two objects or surfaces that are located close to each other in discrete-return or full-waveform lidar data. As described in Figure 20, a set of points appears between the stiffener and the plating located on a stainless steel T-bar. This set of data points forms a line, which is believed to be the lidar’s line of sight. Nevertheless, these data returns are unlikely to be backscattered from the T-bar itself since the geometry of the bar does not have an inclined surface or plate supporting the stiffener and plating. Indeed, the appearance of such defects can be explained by the limited range resolution and pulsewidth of the lidar systems used.

FIGURE 20. (a) Laser pulses are backscattered from the plating of a T-bar at i and the stiffener at j. (b) If the distance between the plating (i) and the stiffener (j) is larger than a laser pulsewidth, two discrete returns can be identified. (c) If the distance between the plating (i) and the stiffener (j) is small, the two laser returns from i and j mix, and (d) waveform decomposition, such as Gaussian decomposition, may treat this as a single return with the echo peak located between i and j, i.e., at k, which is the location of the unwanted points.

Common lidar systems adopt a laser pulsewidth ranging from 2 to 5 ns. This range converts to a distance of approximately 0.6 to 1.5 m (considering the speed of light ≈ 3 × 10^8 ms^-1). A lidar system is unlikely to be able to distinguish two returns, or, technically, two objects/surfaces, if they have an offset distance smaller than this. This is mainly because the two backscattered echoes are closely mixed with each other, and a signal decomposition method, such as the Gaussian mixture model, may return a solution with a single return that lies between these two objects/surfaces (see Figure 20). As a result, in the example of the T-bar, instantaneous data returns are found within the T-bar, where the laser beam first partially illuminates the plating and the final return comes from the stiffener. The mixed echoes backscattered from these two surfaces result in a return in between. Such a scenario also causes challenges for ALB in mapping the seafloor of very shallow water regions [115].


Since it is nearly impossible to change the pulsewidth of the emitted laser echoes or increase the sampling rate of the waveform digitizer, these artifacts can be removed only in the postprocessing stage.

MOTION ARTIFACTS
Modern terrestrial lidar systems, such as the Leica RTC360, are equipped with fast double scans to remove abnormal point clouds caused by rain, water waves, walking pedestrians, and so on. However, this function is considered suitable only for stationary measurements. Although an airborne lidar survey captures the instantaneous conditions of the topography, motion artifacts are notable when moving objects, especially on-road vehicles, appear in the scene. As a result, the shapes of moving vehicles are likely to be distorted. Motion artifacts in airborne lidar datasets appear stretched and/or sheared depending on whether an along-track motion or an across-track motion occurs. To model the geometry of a distorted vehicle, the following equation can be adopted [116]:

$$l_s = \frac{l_v \times v_a}{v_a - v_v\cos(\theta_v)} \qquad (11)$$

where l_s and l_v refer to the sheared and original lengths of the vehicle, respectively; v_a and v_v refer to the speeds of the aircraft and the vehicle, respectively; and θ_v refers to the angle of intersection between the headings of the aircraft and the vehicle. The sheared angle (θ_s) of the distorted vehicle can be modeled using the following formula:

$$\theta_s = \arctan\left(\frac{v_v\sin(\theta_v)}{v_a - v_v\cos(\theta_v)}\right) + 90^\circ. \qquad (12)$$

As detailed in Figure 21, a combination of shearing and stretching effects can be found in different on-road moving vehicles. One should note that if the vehicle is moving in the opposite direction to the aircraft [i.e., θ_v = 180° in (11)], it results in a compressing effect [see Figure 21(b) and compare the stationary vehicles parked on the roadside]. On the other hand, if the vehicle is heading in the same direction as the aircraft, the moving vehicle suffers from a stretching effect [see Figure 21(c)]. Either scenario may also cause a certain degree of shearing since the heading vectors of the aircraft and moving vehicles are unlikely to be exactly the same as or opposite to one another.

Despite being a type of visible artifact, the sheared or stretched vehicles in lidar point clouds can nevertheless be shown to reveal their motion information based on their distortions. Yao et al. [116] proposed a workflow to first detect and extract the data points of vehicles, parameterize the shapes of the vehicles, classify their motions, and, finally, estimate their speeds. Such an approach can aid in traffic analysis over a large spatial extent, which can, in turn, contribute to the macroscopic fundamental diagram that requires information on both traffic speed and density [117].
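
Equations (11) and (12) are straightforward to evaluate; the short sketch below checks the compression case discussed above (vehicle heading opposite to the aircraft), with speeds chosen purely for illustration.

```python
import numpy as np

def sheared_length(l_v, v_a, v_v, theta_v):
    # Apparent vehicle length after distortion, (11).
    return l_v * v_a / (v_a - v_v * np.cos(theta_v))

def shear_angle_deg(v_a, v_v, theta_v):
    # Shear angle of the distorted vehicle, (12).
    return np.degrees(np.arctan2(v_v * np.sin(theta_v),
                                 v_a - v_v * np.cos(theta_v))) + 90.0

# A 4.5-m car at 25 m/s against a 60-m/s aircraft (theta_v = 180 deg):
print(sheared_length(4.5, 60.0, 25.0, np.pi))   # ~3.2 m, i.e., compressed
```
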

FIGURE 21. (a) A vehicle moving in an across-track direction, resulting in a sheared effect. (b) A vehicle moving in the opposite direction to the aircraft, leading to a compressed effect. (c) A vehicle moving in the same direction as the aircraft, causing a stretching effect. Aside from the motion artifacts, the DEM suffers from the effect of divots on the ground.

HARDWARE MALFUNCTIONS
System manufacturers usually conduct preflight mission inspection and calibration. Also, system malfunctions should be reported through flight reporting and logs, as mentioned in existing specifications and guidelines [14], [118]. Still, it is inevitable to experience hardware failure before or during flight missions. As mentioned in the “Intensity Banding” section, if the lidar system is jarred during transportation or installation, the misalignment between the axes of the laser emitter and receiver results in intensity banding. A mechanical solution to such a problem can be found in [119], where the incorporation of a field lens in the telescope’s focal plane is proved to improve the overlap factor between the receiver’s FOV and the laser beam. Stripe artifacts in DEMs can be ascribed to systematic and random errors in the POS and lidar systems [89].

Figure 22 presents an airborne lidar dataset suffering from a mechanical problem.


The laser scanning process literally starts when the transmitter emits a laser pulse to the folding and scan mirrors and passes it through the sensor window; the equipment needs to flip the sensor window so as to avoid too much backscattering or retroreflection. However, due to an unknown reason, it turns out that at a certain range of scan angles (>13° and <23°), the backscattered light finds a way to travel back to the detector. The light is too strong and overwhelming, leading the recorded intensity to drop to one. The resulting intensity image thus shows a specific direction of scans having a significant energy loss in addition to the banding issue mentioned in the “Intensity Banding” section. Postprocessing is unable to restore the intensity in this scenario, and thus, the hardware needs to be sent back to the manufacturer for repair.

THE WAY FORWARD

ALGORITHMIC DEVELOPMENT AND TRAINING DATABASE
Regarding algorithmic development, geometric and radiometric preprocessing (including calibration, correction, or normalization), ground filtering, and object extraction (such as buildings) have indeed reached a mature stage. Nevertheless, some of the issues and artifacts covered in this article still require further research efforts to improve robustness and accuracy. The latest developments on this front include a stratified strategy for the intensity correction of multispectral airborne lidar data, in which different land cover features undergo different levels of range, angle, and atmospheric attenuation correction [84], and the use of persistent homology to create a threshold-free approach for void detection [64] and classification, coupled with a corresponding point cloud inpainting strategy [120].

Despite the availability of image, audio, and video training datasets, there appears to be a lack of open, well-labeled datasets for airborne lidar point clouds. Indeed, the lidar community should contribute to working on open benchmark datasets that can serve different applications and purposes. GANs can be trained and adopted to generate artificial data points to cover up voids for specific types of objects. In view of the improved mapping scale of regional and global land cover maps, a large number of labeled airborne lidar point clouds are also needed for training deep neural networks for large-area land cover classification, semantic segmentation, and point cloud completion. Also, a subjective point cloud quality assessment database should be created so that it can act as a reference for assessing point cloud quality. This can be achieved by assessing the geometric and radiometric discrepancies between the reference points in the database and the collected samples, using measures such as the Point-Centered Quarter Method [121] and GraphSIM [122]. Further discussion can be found in the following section.

INFERRING THE POINT CLOUD QUALITY
Airborne lidar flight surveys usually go along with ground surveys or adopt existing ground controls to assess the geometric accuracy of the collected point clouds.


Thus, the geometric accuracy can be measured by assessing the root-mean-square errors in both the horizontal and vertical directions. This can be achieved by comparing selected target points from the 3D point clouds or the resulting DSM/DEM against their known coordinates from the ground survey. Nevertheless, there is a lack of objective quality metrics that are capable of inferring the radiometric quality of lidar data. In view of the improved spectral information collected by airborne lidar systems, there should be a way to quantify the quality of airborne lidar point clouds.

A few initial attempts have been reported lately in the computer science community for red–green–blue point clouds collected by low-cost sensors. Yang et al. [122] proposed a process of resampling the key points, local graph construction, aggregation of color gradients, and similarity derivation. The objective of the workflow is to assess the graph similarity of key points extracted at high spatial frequency from the point cloud and compare them against a noise-free reference point cloud. Projecting a 3D point cloud onto a 2D image for image quality assessment also seems to be a viable approach [123], especially for airborne lidar point clouds. Through comparing the image entropy difference (e.g., Boltzmann entropy) and the level of information extracted (e.g., classification accuracy), both before and after radiometric correction (refer to the “Intensity Banding” and “Intensity Striping at Swath Edges” sections), the radiometric quality of airborne lidar data can be quantified.
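
A minimal sketch of the check-point comparison reads as follows, with the measured lidar coordinates and the surveyed ground truth as n-by-3 arrays; the function name is an assumption.

```python
import numpy as np

def rmse_xyz(measured, truth):
    d = measured - truth
    rmse_h = np.sqrt(np.mean(d[:, 0]**2 + d[:, 1]**2))  # horizontal (planimetric)
    rmse_v = np.sqrt(np.mean(d[:, 2]**2))               # vertical
    return rmse_h, rmse_v
```
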

FIGURE 22. The mechanical failure of a sensor window, causing incorrect intensity records at certain scan angles in addition to intensity banding.

ADVANCEMENTS IN OPTICS AND PHOTONICS
Aside from algorithmic development, new findings and developments in optics and photonics can certainly improve the signal-to-noise ratio of lidar systems. Recent advances in avalanche photodiodes are driving next-generation lidar systems. Jones et al. [124] adopted molecular beam epitaxy to grow a digital alloy, which has long-wavelength sensitivity and low excess noise. Although further research efforts are needed to improve the sensitivity to the single-photon level and extend the operating wavelength to the infrared region [125], the reported study has made a significant breakthrough in maintaining eye safety while keeping a high laser power, which is considered a bottleneck in existing avalanche photodiode technologies.

On the other hand, quantum parametric mode sorting (QPMS) appears to be a promising new method for single-photon noise reduction [126]. The QPMS imprints certain quantum properties on the emitted laser beams and filters the backscattered laser pulses in such a way that only those photons with matching quantum properties are accepted and recorded. This can aid in cleaning up unwanted noisy returns from ambient light sources, e.g., sunlight, where existing solutions mainly rely on offline noise filtering algorithms to improve the data quality of photon-counting lidar systems [7]. An integrated antifogging and antireflective coating on lidar optical surfaces can facilitate water permeability while maintaining low reflectivity [127]. This can boost the performance of lidar systems, avoid unwanted reflections, and ultimately improve lidar data quality. The white laser invention spans the full visible spectrum, which could further improve lidar’s radiometric quality if the laser output stability and eye safety can be guaranteed for long-range measurements [128]. Techniques based on the principle of entanglement, such as two-photon interference lidar [129] or two-photon dual-comb lidar [130], can facilitate improved depth resolution to capture finer details of objects through the mechanism of wave interference. A significant enhancement of the detector’s or waveform digitizer’s sampling rate (e.g., up to a picosecond or femtosecond) can also improve the range resolution, which can help resolve the issue addressed in the “Unwanted Points Present Between Two Nearby Objects” section. All these advancements open up avenues for the future development of lidar hardware with improved precision and quality from both radiometric and geometric perspectives.

PREPROCESSING FUNCTIONS IN SOFTWARE
The availability of airborne lidar data, particularly those collected from unmanned aerial vehicle platforms, is increasing dramatically, at a rate faster than software development for data exploitation. Unlike aerial photogrammetry and satellite remote sensing, there is a lack of off-the-shelf software tools for airborne lidar point cloud processing, especially tools embedded with functions that handle the aforementioned artifacts.

Current software tools may include functions for data interoperability, data compression, terrain generation, ground filtering, object extraction, segmentation, and classification. Nevertheless, preprocessing functions, including void detection, classification, and reconstruction; geometric correction; intensity correction; data noise and pit removal; and so on, are needed to improve the data quality prior to information extraction. Also, near-future software development certainly rides on the wave of multicore CPU and GPU processing, on which stand-alone and cloud-based platforms are expected to be based. In five to 10 years, data processing for large-area, fine-resolution, multimodal lidar point clouds may arrive when the development of quantum computing reaches a mature stage [131].

CONCLUSIONS
Information extraction from airborne lidar point clouds has gone viral in the remote sensing community lately. Nevertheless, most of the studies, riding on the wave of artificial intelligence and machine learning, do not pay much attention to lidar data artifacts. Ensuring data quality, i.e., completeness, accuracy, and reliability, is always crucial prior to data analysis in any application domain. Reducing these lidar data defects and artifacts can, in turn, maximize the potential of point cloud classification, semantic labeling, the inference of topographic properties, and so on. Therefore, this article provided a comprehensive review of three common types of airborne lidar data artifacts, i.e., voids, stripes, and noise, by going through their causes, potential impacts, current solutions, and future directions. Most of the artifacts can be attributed to the geometric and radiometric discrepancies of lidar data with respect to instantaneous atmospheric and ground conditions as well as hardware deficiencies. One should bear in mind that these artifacts, though usually treated as undesired defects, sometimes shed light on certain semantic information. A rectangular data void in a residential area may well be a swimming pool, for instance.

The intention of preparing this article was to alert the community to the existence and implications of these data artifacts. On one hand, this article hopefully serves as a guide for beginner lidar users to understand the causes of data artifacts and their possible solutions. On the other hand, it hopefully urges the research community and industry to incorporate functions handling these artifacts in their lidar data processing bundles. The current unavailability of these functions severely handicaps lidar users in resolving data defects. Aside from algorithmic development, advancements in optics and photonics also drive the development of next-generation lidar systems with high stability and an improved signal-to-noise ratio. Quantum computing will also likely boost the computational power and efficiency of point cloud processing in the near future.


ACKNOWLEDGMENT
This work was supported by the General Research Fund (Project 15221022, Quantifying Airborne Lidar Data Artifacts) of the Research Grants Council of the Hong Kong Special Administrative Region (HKSAR). The author would like to express sincere appreciation to Dr. Paul E. LaRocque, Dr. Ana P. Kersting, and Alex Yeryomin, of Teledyne Optech, for discussing lidar system problems and providing sample datasets. Acknowledgment also goes to Prof. Ayman Habib, of Purdue University; Prof. Ahmed Shaker, of Toronto Metropolitan University; Dr. Karin van Ewijk and Prof. Paul Treitz, of Queen's University; McElhanney Consulting Services, British Columbia, Canada; and the Department of Civil Engineering and Development, Government of the HKSAR, for providing sample airborne lidar datasets. In addition, the author thanks Dr. Ernest Ho for his meticulous proofreading and the anonymous referees for suggesting improvements. This work specifically pays tribute to the late Dr. Martin Isenburg for his dedication and contribution to data processing tools and open data/formats/methods for lidar point clouds.

AUTHOR INFORMATION
Wai Yeung Yan ([email protected]) received his Ph.D. degree in civil engineering from Toronto Metropolitan University (formerly Ryerson University) in 2012. He is currently an assistant professor with the Department of Land Surveying and Geo-Informatics, Hong Kong Polytechnic University, Kowloon, Hong Kong, and an adjunct professor with the Department of Civil Engineering, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada. His research interests include improving lidar data quality, point cloud processing, laser scanning, and remote sensing. He is a Senior Member of IEEE.

REFERENCES
[1] "LiDAR market by technology (2D, 3D, 4D), component (Laser Scanners, Navigation and Positioning Systems), installation type (Airborne and Ground Based), range, service, end-use application, and region (2021-2026)." Markets and Markets. Accessed: Jun. 1, 2022. [Online]. Available: https://www.marketsandmarkets.com/Market-Reports/lidar-market-1261.html
[2] "LiDAR market size, share & trends analysis report by type, application, end-user, and segment forecasts, 2020–2027." Valuates Reports. Accessed: Jun. 1, 2022. [Online]. Available: https://reports.valuates.com/market-reports/ALLI-Manu-4A11/lidar
[3] "LiDAR market size worth $4.71 billion by 2030." Grand View Research. Accessed: Jun. 1, 2022. [Online]. Available: https://www.grandviewresearch.com/press-release/global-lidar-market
[4] W. Y. Yan, A. Shaker, and N. El-Ashmawy, "Urban land cover classification using airborne LiDAR data: A review," Remote Sens. Environ., vol. 158, pp. 295–310, Mar. 2015, doi: 10.1016/j.rse.2014.11.001.


[5] A. Habib, K. I. Bang, A. P. Kersting, and D.-C. Lee, "Error budget of LiDAR systems and quality control of the derived data," Photogrammetric Eng. Remote Sens., vol. 75, no. 9, pp. 1093–1108, Sep. 2009, doi: 10.14358/PERS.75.9.1093.
[6] A. Habib, "Accuracy, quality assurance, and quality control of LiDAR data: Principles and processing," in Topographic Laser Ranging and Scanning. Boca Raton, FL, USA: CRC Press, 2017, pp. 269–294.
[7] A. Swatantran, H. Tang, T. Barrett, P. DeCola, and R. Dubayah, "Rapid, high-resolution forest structure and terrain mapping over large areas using single photon LiDAR," Scientific Rep., vol. 6, no. 1, pp. 1–12, Jun. 2016, doi: 10.1038/srep28277.
[8] D. F. Laefer, S. Abuwarda, A.-V. Vo, L. Truong-Hong, and H. Gharibi, "2015 aerial laser and photogrammetry survey of Dublin city collection record," NYU Spatial Data Repository, New York, NY, USA, 2017. [Online]. Available: https://geo.nyu.edu/catalog/nyu_2451_38684
[9] M. Isenburg. "Efficient tools for LiDAR processing." LAStools. Accessed: Jun. 1, 2022. [Online]. Available: https://rapidlasso.de/
[10] N. Pfeifer, G. Mandlburger, J. Otepka, and W. Karel, "OPALS – A framework for airborne laser scanning data analysis," Comput., Environ. Urban Syst., vol. 45, pp. 125–136, May 2014, doi: 10.1016/j.compenvurbsys.2013.11.002.
[11] J.-R. Roussel et al., "lidR: An R package for analysis of Airborne Laser Scanning (ALS) data," Remote Sens. Environ., vol. 251, Dec. 2020, Art. no. 112061, doi: 10.1016/j.rse.2020.112061.
[12] C. Mallet and F. Bretar, "Full-waveform topographic LiDAR: State-of-the-art," ISPRS J. Photogrammetry Remote Sens., vol. 64, no. 1, pp. 1–16, Jan. 2009, doi: 10.1016/j.isprsjprs.2008.09.007.
[13] H. K. Heidemann, "LiDAR base specification version 2.1," in U.S. Geological Survey Techniques and Methods. Reston, VA, USA: U.S. Geological Survey, 2019, ch. B4.
[14] "Federal airborne LiDAR data acquisition guideline version 2.0," Natural Resources Canada and Public Safety Canada, Ottawa, ON, Canada, General Inf. Product 117e, 2018. [Online]. Available: https://www.afn.ca/wp-content/uploads/2021/03/2.-Federal-Airborne-LiDAR-Data-Acquisition-Guideline-Version-2.0.pdf
[15] L. Zhou and G. Vosselman, "Mapping curbstones in airborne and mobile laser scanning data," Int. J. Appl. Earth Observ. Geoinf., vol. 18, pp. 293–304, Aug. 2012, doi: 10.1016/j.jag.2012.01.024.
[16] P. T.-Y. Shih and C.-M. Huang, "The building shadow problem of airborne LiDAR," Photogrammetric Rec., vol. 24, no. 128, pp. 372–385, Nov. 2009, doi: 10.1111/j.1477-9730.2009.00550.x.
[17] S. A. N. Gilani, M. Awrangjeb, and G. Lu, "Segmentation of airborne point cloud data for automatic building roof extraction," GIScience Remote Sens., vol. 55, no. 1, pp. 63–89, 2018, doi: 10.1080/15481603.2017.1361509.
[18] M. Feng, T. Zhang, S. Li, G. Jin, and Y. Xia, "An improved minimum bounding rectangle algorithm for regularized building boundary extraction from aerial LiDAR point clouds with partial occlusions," Int. J. Remote Sens., vol. 41, no. 1, pp. 300–319, 2020, doi: 10.1080/01431161.2019.1641245.
[19] L. Liu and S. Lim, "A framework of road extraction from airborne LiDAR data and aerial imagery," J. Spatial Sci., vol. 61, no. 2, pp. 263–281, Apr. 2016, doi: 10.1080/14498596.2016.1147392.

41

[20] L. Cheng, Y. Wu, Y. Wang, L. Zhong, Y. Chen, and M. Li, “Threedimensional reconstruction of large multilayer interchange bridge using airborne LiDAR data,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 2, pp. 691–708, Feb. 2015, doi: 10.1109/JSTARS.2014.2363463. [21] A. Kobler, N. Pfeifer, P. Ogrinc, L. Todorovski, K. Oštir, and S. Džeroski, “Repetitive interpolation: A robust algorithm for DTM generation from aerial laser scanner data in forested terrain,” Remote Sens. Environ., vol. 108, no. 1, pp. 9–23, May 2007, doi: 10.1016/j.rse.2006.10.013. [22] X. Liu, “Airborne LiDAR for DEM generation: Some critical issues,” Prog. Physical Geography, vol. 32, no. 1, pp. 31–49, Feb. 2008, doi: 10.1177/0309133308089496. [23] A. L. Montealegre, M. T. Lamelas, and J. De La Riva, “A comparison of open-source LiDAR filtering algorithms in a m ­ editerranean forest environment,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 8, pp. 4072–4085, Aug. 2015, doi: 10.1109/ JSTARS.2015.2436974. [24] R. Hill and R. K. Broughton, “Mapping the understorey of deciduous woodland from leaf-on and leaf-off airborne LiDAR data: A case study in lowland Britain,” ISPRS J. P­ hotogrammetry Remote Sens., vol. 64, no. 2, pp. 223–233, Mar. 2009, doi: 10.1016/j.isprsjprs.2008.12.004. [25] D. Kükenbrink, F. D. Schneider, R. Leiterer, M. E. Schaepman, and F. Morsdorf, “Quantification of hidden canopy volume of airborne laser scanning data using a voxel traversal algorithm,” Remote Sens. Environ., vol. 194, pp. 424–436, Jun. 2017, doi: 10.1016/j.rse.2016.10.023. [26] I. Korpela, A. Hovi, and F. Morsdorf, “Understory trees in airborne LiDAR data — Selective mapping due to transmission losses and echo-triggering mechanisms,” Remote Sens. Environ., vol. 119, pp. 92–104, Apr. 2012, doi: 10.1016/j. rse.2011.12.011. [27] K. Richter and H.-G. Maas, “Radiometric enhancement of full-waveform airborne laser scanner data for volumetric representation in environmental applications,” ISPRS J. Photogrammetry Remote Sens., vol. 183, pp. 510–524, 2022, doi: 10.1016/j. isprsjprs.2021.10.021. [28] L. Chasmer, C. Hopkinson, B. Smith, and P. Treitz, “Examining the influence of changing laser pulse repetition frequencies on conifer forest canopy returns,” Photogrammetric Eng. Remote Sens., vol. 72, no. 12, pp. 1359–1367, Dec. 2006, doi: 10.14358/ PERS.72.12.1359. [29] B. M. Wing, M. W. Ritchie, K. Boston, W. B. Cohen, A. Gitelman, and M. J. Olsen, “Prediction of understory vegetation cover with airborne LiDAR in an interior ponderosa pine forest,” Remote Sens. Environ., vol. 124, pp. 730–741, Sep. 2012, doi: 10.1016/j.rse.2012.06.024. [30] M. E. Hodgson et al., “An evaluation of LiDAR-derived elevation and terrain slope in leaf-off conditions,” Photogrammetric Eng. Remote Sens., vol. 71, no. 7, pp. 817–823, Jul. 2005, doi: 10.14358/PERS.71.7.817. [31] T. Hilker et al., “Comparing canopy metrics derived from terrestrial and airborne laser scanning in a Douglas-fir dominated forest stand,” Trees, vol. 24, no. 5, pp. 819–832, Oct. 2010, doi: 10.1007/s00468-010-0452-7.

42

[32] T. Hilker et al., “Comparison of terrestrial and airborne LiDAR in describing stand structure of a thinned lodgepole pine forest,” J. Forestry, vol. 110, no. 2, pp. 97–104, Mar. 2012, doi: 10.5849/jof.11-003. [33] B. B. Worstell, S. Poppenga, G. A. Evans, and S. Prince, “LiDAR point density analysis: Implications for identifying water bodies,” U.S. Geological Survey, Reston, VA, USA, USGS Numbered Series 2014-5191, 2014. [34] J. B. Coleman, X. Yao, T. R. Jordan, and M. Madden, “Holes in the ocean: Filling voids in bathymetric LiDAR data,” Comput. Geosci., vol. 37, no. 4, pp. 474–484, Apr. 2011, doi: 10.1016/j. cageo.2010.11.008. [35] B. Höfle, M. Vetter, N. Pfeifer, G. Mandlburger, and J. Stötter, “Water surface mapping from airborne laser scanning using signal intensity and elevation data,” Earth Surf. Processes Landforms, vol. 34, no. 12, pp. 1635–1649, Sep. 2009, doi: 10.1002/esp.1853. [36] W. Y. Yan, A. Shaker, and P. E. LaRocque, “Scan line intensityelevation ratio (SLIER): An airborne LiDAR ratio index for automatic water surface mapping,” Remote Sens., vol. 11, no. 7, Apr. 2019, Art. no. 814, doi: 10.3390/rs11070814. [37] S. Tamari, J. Mory, and V. Guerrero-Meza, “Testing a near-infrared LiDAR mounted with a large incidence angle to monitor the water level of turbid reservoirs,” ISPRS J. Photogrammetry Remote Sens., vol. 66, no. 6, pp. S85–S91, Dec. 2011, doi: 10.1016/j.isprsjprs.2011.01.009. [38] A. V. Jelalian, Laser Radar Systems. Norwood, MA, USA: Artech House, 1992. [39] J. L. Bufton, F. E. Hoge, and R. N. Swift, “Airborne measurements of laser backscatter from the ocean surface,” Appl. Opt., vol. 22, no. 17, pp. 2603–2618, Sep. 1983, doi: 10.1364/AO.22.002603. [40] L. S. Rose et al., “Challenges and lessons from a wetland LiDAR project: A case study of the Okefenokee Swamp, Georgia, USA,” Geocarto Int., vol. 28, no. 3, pp. 210–226, 2013, doi: 10.1080/10106049.2012.681707. [41] L. P. Rampi, J. F. Knight, and C. F. Lenhart, “Comparison of flow direction algorithms in the application of the CTI for mapping wetlands in Minnesota,” Wetlands, vol. 34, no. 3, pp. 513–525, Feb. 2014, doi: 10.1007/s13157-014-0517-2. [42] J. A. Czuba, S. R. David, D. A. Edmonds, and A. S. Ward, “Dynamics of surface-water connectivity in a low-gradient meandering river floodplain,” Water Resour. Res., vol. 55, no. 3, pp. 1849–1870, Mar. 2019, doi: 10.1029/2018WR023527. [43] A. Shaker, W. Y. Yan, and P. E. LaRocque, “Automatic land-­ water classification using multispectral airborne LiDAR data for near-shore and river environments,” ISPRS J. Photogrammetry Remote Sens., vol. 152, pp. 94–108, Apr. 2019, doi: 10.1016/j. isprsjprs.2019.04.005. [44] W. Y. Yan, “Scan line void filling of airborne LiDAR point clouds for hydroflattening DEM,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 6426–6437, Jun. 2021, doi: 10.1109/JSTARS.2021.3089288. [45] D. Gatziolis and H.-E. Andersen, “A guide to LiDAR data acquisition and processing for the forests of the Pacific Northwest,” U.S. Dept. Agriculture, Forest Service, Pacific Northwest Research Station, Portland, OR, USA, General Tech. Rep. PNWGTR-768, 2008. IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

SEPTEMBER 2023

[46] D. Burton, D. B. Dunlap, L. J. Wood, and P. P. Flaig, “LiDAR intensity as a remote sensor of rock properties,” J. Sedimentary Res., vol. 81, no. 5, pp. 339–347, May 2011, doi: 10.2110/jsr.2011.31. [47] S. Kaasalainen et al., “Effect of target moisture on laser scanner intensity,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 4, pp. 2128–2136, Apr. 2010, doi: 10.1109/TGRS.2009.2036841. [48] S. Kaasalainen et al., “Radiometric calibration of LiDAR intensity with commercially available reference targets,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 2, pp. 588–598, Mar. 2009, doi: 10.1109/TGRS.2008.2003351. [49] S. O. Elberink and G. Vosselman, “Quality analysis on 3D building models reconstructed from airborne laser scanning data,” ISPRS J. Photogrammetry Remote Sens., vol. 66, no. 2, pp. 157–165, Mar. 2011, doi: 10.1016/j.isprsjprs.2010.09.009. [50] K. Garroway, C. Hopkinson, and R. Jamieson, “Surface moisture and vegetation influences on LiDAR intensity data in an agricultural watershed,” Can. J. Remote Sens., vol. 37, no. 3, pp. 275–284, Jun. 2011, doi: 10.5589/m11-036. [51] A. Vain, X. Yu, S. Kaasalainen, and J. Hyyppa, “Correcting airborne laser scanning intensity data for automatic gain control effect,” IEEE Geosci. Remote Sens. Lett., vol. 7, no. 3, pp. 511–514, Jul. 2010, doi: 10.1109/LGRS.2010.2040578. [52] I. Korpela, H. O. Ørka, J. Hyyppä, V. Heikkinen, and T. Tokola, “Range and AGC normalization in airborne discrete-return LiDAR intensity data for forest canopies,” ISPRS J. Photogrammetry Remote Sens., vol. 65, no. 4, pp. 369–379, Jul. 2010, doi: 10.1016/j.isprsjprs.2010.04.003. [53] F. Pirotti, A. Guarnieri, and A. Vettore, “State of the art of ground and aerial laser scanning technologies for highresolution topography of the earth surface,” Eur. J. Remote Sens., vol.  46, no.  1, pp. 66–78, Feb. 2013, doi: 10.5721/ EuJRS20134605. [54] P. Liepa, “Filling holes in meshes,” in Proc. Eurograph./ACM SIGGRAPH Symp. Geometry Process., 2003, pp. 200–205. [55] W. Zhao, S. Gao, and H. Lin, “A robust hole-filling algorithm for triangular mesh,” Vis. Comput., vol. 23, no. 12, pp. 987–997, Dec. 2007, doi: 10.1007/s00371-007-0167-y. [56] A. Ruiz, W. Kornus, J. Talaya, and J. Colomer, “Terrain modeling in an extremely steep mountain: A combination of airborne and terrestrial LiDAR,” Int. Soc. Photogrammetry Remote Sens., vol. 36, no. 3, pp. 281–284, Jan. 2004. [57] Z. Zhang, G. Vosselman, M. Gerke, C. Persello, D. Tuia, and M. Y. Yang, “Detecting building changes between airborne laser scanning and photogrammetric data,” Remote Sens., vol. 11, no. 20, Oct. 2019, Art. no. 2417, doi: 10.3390/ rs11202417. [58] M. S. O’Banion, M. J. Olsen, J. P. Hollenbeck, and W. C. Wright, “Data gap classification for terrestrial laser scanning-derived digital elevation models,” ISPRS Int. J. Geo-Inf., vol. 9, no. 12, Dec. 2020, Art. no. 749, doi: 10.3390/ijgi9120749. [59] J.-F. Parrot and C. R. Núñez, “LiDAR DTM: Artifacts, and correction for river altitudes,” Investigaciones Geográficas, Boletín del Instituto de Geografía, vol. 2016, no. 90, pp. 28–39, Aug. 2016, doi: 10.14350/rig.47372. [60] J. Chen, J. S. K. Yi, M. Kahoush, E. S. Cho, and Y. K. Cho, “Point cloud scene completion of obstructed building facades with SEPTEMBER 2023

IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

[61]

[62]

[63]

[64]

[65]

[66]

[67]

[68]

[69]

[70]

[71]

[72]

[73]

generative adversarial inpainting,” Sensors, vol. 20, no. 18, Sep. 2020, Art. no. 5029, doi: 10.3390/s20185029. Z. Huang, Y. Yu, J. Xu, F. Ni, and X. Le, “PF-Net: Point fractal network for 3D point cloud completion,” in Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit., 2020, pp. 7662–7670, doi: 10.1109/CVPR42600.2020.00768. X. Wang, M. H. Ang Jr., and G. H. Lee, “Cascaded refinement network for point cloud completion,” in Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit., 2020, pp. 790–799, doi: 10.1109/CVPR42600.2020.00087. X. Wen, T. Li, Z. Han, and Y.-S. Liu, “Point cloud completion by skip-attention network with hierarchical folding,” in Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit., 2020, pp. 1939–1948, doi: 10.1109/CVPR42600.2020.00201. A. Zomorodian and G. Carlsson, “Computing persistent homology,” Discrete Comput. Geometry, vol. 33, no. 2, pp. 249–274, Feb. 2005, doi: 10.1007/s00454-004-1146-y. H. O. Ørka, T. Gobakken, E. Næsset, L. Ene, and V. Lien, “Simultaneously acquired airborne laser scanning and multispectral imagery for individual tree species identification,” Can. J. Remote Sens., vol. 38, no. 2, pp. 125–138, Feb. 2012, doi: 10.5589/m12-021. W. Y. Yan and A. Shaker, “Airborne LiDAR intensity banding: Cause and solution,” ISPRS J. Photogrammetry Remote Sens., vol. 142, pp. 301–310, Aug. 2018, doi: 10.1016/j.isprsjprs.2018.06.013. C. L. Leigh, D. B. Kidner, and M. C. Thomas, “The use of ­LiDAR in digital surface modelling: Issues and errors,” Trans. GIS, vol. 13, no. 4, pp. 345–361, Aug. 2009, doi: 10.1111/j.1467-9671. 2009.01168.x. R. A. Nobrega, J. A. Quintanilha, and C. G. O’Hara, “A noiseremoval approach for LiDAR intensity images using anisotropic diffusion filtering to preserve object shape characteristics,” in Proc. ASPRS Annu. Conf., Tampa, FL, USA, May 2007. W. Y. Yan, K. van Ewijk, P. Treitz, and A. Shaker, “Effects of radiometric correction on cover type and spatial resolution for modeling plot level forest attributes using multispectral airborne LiDAR data,” ISPRS J. Photogrammetry Remote Sens., vol. 169, pp. 152–165, Nov. 2020, doi: 10.1016/j.isprsjprs.2020.09.001. M. Okhrimenko, C. Coburn, and C. Hopkinson, “Multispectral LiDAR: Radiometric calibration, canopy spectral reflectance, and vegetation vertical SVI profiles,” Remote Sens., vol. 11, no. 13, Jun. 2019, Art. no. 1556, doi: 10.3390/ rs11131556. M. Okhrimenko and C. Hopkinson, “Investigating the consistency of uncalibrated multispectral LiDAR vegetation indices at different altitudes,” Remote Sens., vol. 11, no. 13, Jun. 2019, Art. no. 1531, doi: 10.3390/rs11131531. T. R. Goodbody, P. Tompalski, N. C. Coops, C. Hopkinson, P. Treitz, and K. van Ewijk, “Forest inventory and diversity attribute modelling using structural and intensity metrics from multispectral airborne laser scanning data,” Remote Sens., vol.  12, no. 13, Jul. 2020, Art. no. 2109, doi: 10.3390/rs12132109. P. Knott, T. Proctor, A. Hayes, J. Ralph, P. Kok, and J. Dunningham, “Local versus global strategies in multiparameter estimation,” Physical Rev. A, vol. 94, no. 6, Dec. 2016, Art. no. 062312, doi: 10.1103/PhysRevA.94.062312.

43

[74] W. Y. Yan, A. Shaker, A. Habib, and A. P. Kersting, “Improving classification accuracy of airborne LiDAR intensity data by geometric calibration and radiometric correction,” ISPRS J. Photogrammetry Remote Sens., vol. 67, pp. 35–44, Jan. 2012, doi: 10.1016/j.isprsjprs.2011.10.005. [75] B. Höfle and N. Pfeifer, “Correction of laser scanning intensity data: Data and model-driven approaches,” ISPRS J. Photogrammetry Remote Sens., vol. 62, no. 6, pp. 415–433, Dec. 2007, doi: 10.1016/j.isprsjprs.2007.05.008. [76] A. G. Kashani, M. J. Olsen, C. E. Parrish, and N. Wilson, “A review of LiDAR radiometric processing: From ad hoc intensity correction to rigorous radiometric calibration,” Sensors, vol. 15, no. 11, pp. 28,099–28,128, Nov. 2015, doi: 10.3390/s151128099. [77] W. Y. Yan and A. Shaker, “Radiometric correction and normalization of airborne LiDAR intensity data for improving land-cover classification,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 12, pp. 7658–7673, Jun. 2014, doi: 10.1109/TGRS.2014.2316195. [78] C. Hopkinson, “The influence of flying altitude, beam divergence, and pulse repetition frequency on laser pulse return intensity and canopy frequency distribution,” Can. J. Remote Sens., vol. 33, no. 4, pp. 312–324, Aug. 2007, doi: 10.5589/m07-029. [79] S. Kaasalainen et al., “Radiometric calibration of ALS intensity,” Int. Arch. Photogrammetry, Remote Sens. Spatial Inf. Sci., vol. 36, no. Part3/W52, pp. 201–205, Jan. 2007. [80] I. S. Korpela, “Mapping of understory lichens with airborne discrete-return LiDAR data,” Remote Sens. Environ., vol. 112, no. 10, pp. 3891–3897, Oct. 2008, doi: 10.1016/j.rse.2008.06.007. [81] D. Gatziolis, “Dynamic range-based intensity normalization for airborne, discrete return LiDAR data of forest canopies,” Photogrammetric Eng. Remote Sens., vol. 77, no. 3, pp. 251–259, Mar. 2011, doi: 10.14358/PERS.77.3.251. [82] H. You, T. Wang, A. Skidmore, and Y. Xing, “Quantifying the effects of normalisation of airborne LiDAR intensity on coniferous forest leaf area index estimations,” Remote Sens., vol. 9, no. 2, Feb. 2017, Art. no. 163, doi: 10.3390/rs9020163. [83] M. Kukkonen, M. Maltamo, L. Korhonen, and P. Packalen, “Multispectral airborne LiDAR data in the prediction of boreal tree species composition,” IEEE Trans. Geosci. Remote Sens., vol. 57, no. 6, pp. 3462–3471, Jan. 2019, doi: 10.1109/TGRS.2018.2885057. [84] W. Y. Yan, “Intensity correction of multispectral airborne laser scanning data,” in Proc. IEEE Int. Geosci. Remote Sens. Symp., Brussels, Belgium: IEEE, Jul. 2021, pp. 7712–7715, doi: 10.1109/IGARSS47720.2021.9553543. [85] N. El-Sheimy, “Georeferencing component of LiDAR systems,” in Topographic Laser Ranging and Scanning. Boca Raton, FL, USA: CRC Press, 2017, pp. 195–214. [86] A. Habib, K. I. Bang, A. P. Kersting, and J. Chow, “Alternative methodologies for LiDAR system calibration,” Remote Sens., vol. 2, no. 3, pp. 874–907, Mar. 2010, doi: 10.3390/rs2030874. [87] A. F. Habib, A. P. Kersting, A. Shaker, and W.-Y. Yan, “Geometric calibration and radiometric correction of LiDAR data and their impact on the quality of derived products,” Sensors, vol. 11, no. 9, pp. 9069–9097, Sep. 2011, doi: 10.3390/s110909069. [88] G. Vosselman, “Analysis of planimetric accuracy of airborne laser scanning surveys,” Int. Arch. Photogrammetry, Remote Sens. Spatial Inf. Sci., vol. 37, no. 3a, pp. 99–104, 2008.

44

[89] C. K. Toth and Z. Koppanyi, “Strip adjustment,” in Topographic Laser Ranging and Scanning. Boca Raton, FL, USA: CRC Press, 2018, pp. 259–289. [90] A. Gruen and D. Akca, “Least squares 3D surface and curve matching,” ISPRS J. Photogrammetry Remote Sens., vol. 59, no. 3, pp. 151–174, May 2005, doi: 10.1016/j.isprsjprs.2005.02.006. [91] H.-G. Maas, “On the use of pulse reflectance data for laserscanner strip adjustment,” Int. Arch. Photogrammetry Remote Sens. Spatial Inf. Sci., vol. 34, no. 3/W4, pp. 53–56, Jan. 2001. [92] M. Albani and B. Klinkenberg, “A spatial filter for the removal of striping artifacts in digital elevation models,” Photogrammetric Eng. Remote Sens., vol. 69, no. 7, pp. 755–765, Jul. 2003, doi: 10.14358/PERS.69.7.755. [93] J. Gallant, “Adaptive smoothing for noisy DEMs,” in Proc. Geomorphometry, Redlands, CA, USA, Sep. 2011, pp. 37–40. [94] B. Münch, P. Trtik, F. Marone, and M. Stampanoni, “Stripe and ring artifact removal with combined wavelet--Fourier filtering,” Opt. Exp., vol. 17, no. 10, pp. 8567–8591, Jun. 2009, doi: 10.1364/OE.17.008567. [95] T. H. Tarekegn and T. Sayama, “Correction of SRTM DEM artefacts by Fourier transform for flood inundation modeling,” J. Jpn. Soc. Civil Eng., B1 (Hydraul. Eng.), vol. 69, no. 4, pp. I_193–I _198, Jan. 2013, doi: 10.2208/jscejhe.69.I_193. [96] M. Favalli, A. Fornaciai, and M. T. Pareschi, “LiDAR strip adjustment: Application to volcanic areas,” Geomorphology, vol. 111, nos. 3–4, pp. 123–135, Oct. 2009, doi: 10.1016/j.geomorph.2009.04.010. [97] A. Zlinszky, E. Boergens, P. Glira, and N. Pfeifer, “Airborne laser scanning for calibration and validation of inshore satellite altimetry: A proof of concept,” Remote Sens. Environ., vol. 197, no. 4, pp. 35–42, Aug. 2017, doi: 10.1016/j.rse.2017.04.027. [98] J.-H. Song, S.-H. Han, K. Yu, and Y.-I. Kim, “Assessing the possibility of land-cover classification using LiDAR intensity data,” Int. Arch. Photogrammetry Remote Sens. Spatial Inf. Sci., vol. 34, no. 3/B, pp. 259–262, May 2012. [99] X. Lai, X. Zheng, and Y. Wan, “A kind of filtering algorithms for LiDAR intensity image based on flatness terrain,” in Proc. IEEE Int. Symp. Spatio-temporal Model., Spatial Reasoning, Anal., Data Mining Data Fusion, 2005. [100] A. H. Incekara, D. Z. Seker, and B. Bayram, “Qualifying the LiDAR-derived intensity image as an infrared band in NDWIbased shoreline extraction,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 12, pp. 5053–5062, Dec. 2018, doi: 10.1109/JSTARS.2018.2875792. [101] X. Li, B. Yang, X. Xie, D. Li, and L. Xu, “Influence of waveform characteristics on LiDAR ranging accuracy and precision,” Sensors, vol. 18, no. 4, Apr. 2018, Art. no. 1156, doi: 10.3390/ s18041156. [102] X.-F. Han, J. S. Jin, M.-J. Wang, W. Jiang, L. Gao, and L. Xiao, “A review of algorithms for filtering the 3D point cloud,” Signal Process., Image Commun., vol. 57, pp. 103–112, May 2017, doi: 10.1016/j.image.2017.05.009. [103] Y. Duan, C. Yang, H. Chen, W. Yan, and H. Li, “Low-complexity point cloud denoising for LiDAR by PCA-based dimension reduction,” Opt. Commun., vol. 482, Mar. 2021, Art. no. 126567, doi: 10.1016/j.optcom.2020.126567. IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

SEPTEMBER 2023

[104] E. Schubert, J. Sander, M. Ester, H. P. Kriegel, and X. Xu, “DBSCAN revisited, revisited: Why and how you should (Still) use DBSCAN,” ACM Trans. Database Syst., vol. 42, no. 3, pp. 1–21, Jul. 2017, doi: 10.1145/3068335. [105] L. Zhou, G. Sun, Y. Li, W. Li, and Z. Su, “Point cloud denoising review: From classical to deep learning-based approaches,” Graphical Models, vol. 121, May 2022, Art. no. 101140, doi: 10.1016/j.gmod.2022.101140. [106] R. A. Haugerud and D. Harding, “Some algorithms for virtual deforestation (VDF) of LiDAR topographic survey data,” Int. Arch. Photogrammetry Remote Sens., vol. 34, no. 3/W4, pp. 211– 218, Jan. 2001. [107] A. Khosravipour, A. K. Skidmore, and M. Isenburg, “Generating spike-free digital surface models using LiDAR raw point clouds: A new approach for forestry applications,” Int. J. Appl. Earth Observ. Geoinf., vol. 52, pp. 104–114, Oct. 2016, doi: 10.1016/ j.jag.2016.06.005. [108] K. Lim, P. Treitz, M. Wulder, B. St-Onge, and M. Flood, “LiDAR remote sensing of forest structure,” Progr. Physical Geography, vol. 27, no. 1, pp. 88–106, Mar. 2003, doi: 10.1191/ 0309133303pp360ra. [109] M. A. Wulder et al., “LiDAR sampling for large-area forest characterization: A review,” Remote Sens. Environ., vol. 121, pp. 196– 209, Jun. 2012, doi: 10.1016/j.rse.2012.02.001. [110] J. R. Ben-Arie, G. J. Hay, R. P. Powers, G. Castilla, and B. StOnge, “Development of a pit filling algorithm for LiDAR canopy height models,” Comput. Geosci., vol. 35, no. 9, pp. 1940– 1949, Sep. 2009, doi: 10.1016/j.cageo.2009.02.003. [111] A. Khosravipour, A. K. Skidmore, M. Isenburg, T. Wang, and Y. A. Hussin, “Generating pit-free canopy height models from airborne LiDAR,” Photogrammetric Eng. Remote Sens., vol. 80, no. 9, pp. 863–872, Sep. 2014, doi: 10.14358/PERS.80.9.863. [112] C. Chen, Y. Wang, Y. Li, T. Yue, and X. Wang, “Robust and parameter-free algorithm for constructing pit-free canopy height models,” Int. J. Geo-Inf., vol. 6, no. 7, Jul. 2017, Art. no. 219, doi: 10.3390/ijgi6070219. [113] H. Liu and P. Dong, “A new method for generating canopy height models from discrete-return LiDAR point clouds,” Remote Sens. Lett., vol. 5, no. 6, pp. 575–582, Jun. 2014, doi: 10.1080/2150704X.2014.938180. [114] W. Zhang et al., “Cloth simulation-based construction of pitfree canopy height models from airborne LiDAR data,” Forest Ecosystems, vol. 7, no. 1, pp. 1–13, Jan. 2020, doi: 10.1186/ s40663-019-0212-0. [115] T. Allouis, J.-S. Bailly, Y. Pastol, and C. Le Roux, “Comparison of LiDAR waveform processing methods for very shallow water bathymetry using Raman, near-infrared and green signals,” Earth Surf. Processes Landforms, J. Brit. Geomorphological Res. Group, vol. 35, no. 6, pp. 640–650, May 2010, doi: 10.1002/esp.1959. [116] W. Yao, M. Zhang, S. Hinz, and U. Stilla, “Airborne traffic monitoring in large areas using LiDAR data – Theory and experiments,” Int. J. Remote Sens., vol. 33, no. 12, pp. 3930–3945, Jun. 2012, doi: 10.1080/01431161.2011.637528. [117] C. F. Daganzo and N. Geroliminis, “An analytical approximation for the macroscopic fundamental diagram of urban traf-

SEPTEMBER 2023

IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

fic,” Transp. Res. B, Methodological, vol. 42, no. 9, pp. 771–781, Nov. 2008, doi: 10.1016/j.trb.2008.06.008. [118] “Specifications for airborne LiDAR for the province of British Columbia,” Ministry of Forests, Lands, Natural Resources Operations and Rural Development, Victoria, BC, Canada, Version 5.0, 2020, p. 54. [119] R. Chen, Y. Jiang, and H. Wang, “Calculation method of the overlap factor and its enhancement for airborne LiDAR,” Opt. Commun., vol. 331, pp. 181–188, Nov. 2014, doi: 10.1016/j.optcom. 2014.05.063. [120] S. Crema et al., “Can inpainting improve digital terrain analysis? Comparing techniques for void filling, surface reconstruction and geomorphometric analyses,” Earth Surf. Processes Landforms, vol. 45, no. 3, pp. 736–755, Mar. 2020, doi: 10.1002/ esp.4739. [121] G. Meynet, Y. Nehmé, J. Digne, and G. Lavoué, “PCQM: A fullreference quality metric for colored 3D point clouds,” in Proc. 12th IEEE Int. Conf. Qual. Multimedia Exp. (QoMEX), 2020, pp. 1–6, doi: 10.1109/QoMEX48832.2020.9123147. [122] Q. Yang, Z. Ma, Y. Xu, Z. Li, and J. Sun, “Inferring point cloud quality via graph similarity,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 6, pp. 3015–3029, Jun. 2022, doi: 10.1109/ TPAMI.2020.3047083. [123] Q. Yang, H. Chen, Z. Ma, Y. Xu, R. Tang, and J. Sun, “Predicting the perceptual quality of point cloud: A 3D-to-2D projectionbased exploration,” IEEE Trans. Multimedia, vol. 23, pp. 3877– 3891, 2021, doi: 10.1109/TMM.2020.3033117. [124] A. H. Jones, S. D. March, S. R. Bank, and J. C. Campbell, “Lownoise high-temperature AlInAsSb/GaSb avalanche photodiodes for 2-μm applications,” Nature Photon., vol. 14, no. 9, pp. 559–563, Sep. 2020, doi: 10.1038/s41566-020-0637-6. [125] S. Kodati et al., “Low noise AlInAsSb avalanche photodiodes on InP substrates for 1.55 μm infrared applications,” in Proc. Infrared Technol. Appl., Bellingham, WA, USA: Int. Soc. Opt. Eng., 2021, vol. 11741, Art. no. 117411X, doi: 10.1117/12.2587884. [126] A. Shahverdi, Y. M. Sua, L. Tumeh, and Y.-P. Huang, “Quantum parametric mode sorting: Beating the time-frequency filtering,” Scientific Rep., vol. 7, no. 1, pp. 1–12, 2017, doi: 10.1038/ s41598-017-06564-7. [127] A. Gärtner et al., “Combined antifogging and antireflective double nanostructured coatings for LiDAR applications,” Appl. Opt., vol. 62, no. 7, pp. B112–B116, Jan. 2023, doi: 10.1364/AO.476974. [128] F. Fan, S. Turkdogan, Z. Liu, D. Shelhammer, and C.-Z. Ning, “A monolithic white laser,” Nature Nanotechnol., vol. 10, no. 9, pp. 796–803, Jul. 2015, doi: 10.1038/nnano.2015.149. [129] R. Murray and A. Lyons, “Two-photon interference LiDAR imaging,” Opt. Exp., vol. 30, no. 15, pp. 27,164–27,170, Jul. 2022, doi: 10.1364/OE.461248. [130] H. Wright, J. Sun, D. McKendrick, N. Weston, and D. T. Reid, “Two-photon dual-comb LiDAR,” Opt. Exp., vol. 29, no. 23, pp. 37,037–37,047, Nov. 2021, doi: 10.1364/OE.434351. [131] S. S. Gill et al., “Quantum computing: A taxonomy, systematic review and future directions,” Softw., Pract. Exp., vol. 52, no. 1, pp. 66–114, Jan. 2022, doi: 10.1002/spe.3039. GRS

45

Interferometric Phase Linking
Algorithm, application, and perspective

DINH HO TONG MINH AND STEFANO TEBALDINI

Digital Object Identifier 10.1109/MGRS.2023.3300974
Date of current version: 19 September 2023

Mitigating decorrelation effects on interferometric synthetic aperture radar (InSAR) time series data is challenging. The phase linking (PL) algorithm has been the key to handling signal decorrelations in the past 15 years. Numerous studies have been carried out to enhance its precision and computational efficiency. Different PL algorithms have been proposed, each with unique phase optimization approaches, such as the quasi-Newton method, equal-weighted and coherence-weighted factors, component extraction and selection SAR (CAESAR), and the eigendecomposition-based algorithm (EMI). The differences among the PL algorithms can be attributed to the weight criteria adopted in each algorithm, which can be coherence-based, sparsity-based, or other forms of regularization. The PL algorithm has multiple applications, including SAR tomography (TomoSAR), enhancing distributed scatterers (DSs) to combine with persistent scatterers (PSs) in PS and DS (PSDS) techniques, and compressed PSDS InSAR (ComSAR), where it facilitates the retrieval of the optimal phase from all possible measurements. This article aims to review PL techniques developed in the past 15 years. The review also underscores the importance of the PL technique in various SAR applications (TomoSAR, PSDS, and ComSAR). Finally, the deep learning (DL) approach is discussed as a valuable tool to improve the accuracy and efficiency of the PL process.

INTRODUCTION
InSAR has become an increasingly popular tool for high-precision deformation monitoring, due to its ability to detect small changes over time and provide a unique view of Earth's surface [1]. One critical challenge in InSAR is extracting


meaningful information from the interferometric phase, which is affected by atmospheric conditions, topography, and decorrelations [2], [3]. Several techniques have been proposed to address these signal decorrelations and improve the interferometric phase's quality [1].

The first category of techniques is based on PS InSAR, which utilizes individual scatterers dominating the signal from a resolution cell to track deformation through time [4]. PS interferometry (PSI) techniques provide high-quality deformation information at point target locations. However, in natural scenes, PSI technology, widely used for deformation estimation in urban areas, may not be sufficient to obtain accurate results, due to a low density of PSs [5].

To address the challenges associated with the limited information in InSAR data, an alternative approach is based on DSs, which offers the potential to leverage information more effectively. DS targets are commonly found in natural environments, such as meadows, fields, and bare soil, where multiple scatterers with similar brightness contribute to the information in a resolution cell. However, to account for signal decorrelation, one should select interferogram subsets for a temporal analysis using short spatial and temporal baselines, known as small baseline subsets (SBASs) [6], [7], [8]. This approach has demonstrated promising results in various applications, such as ground deformation monitoring and surface elevation mapping. The SBAS approach offers a valuable alternative to traditional InSAR methods by leveraging the information from DSs [9]. However, deformation measurements on distributed targets are often of lower quality and require spatial multilooked filtering.

Another approach is the PL method introduced by Guarnieri and Tebaldini [10]. PL is defined as a statistical method used in interferometry to combine multiple interferometric phases into a single equivalent single-reference (ESR) phase. Suppose N single-look complex (SLC) images are available. The PL algorithm is the maximum likelihood estimation (MLE) of the N − 1 ESR phases from all N(N − 1)/2 possible combinations. Before applying the SBAS approach, it is necessary to unwrap the interferograms. However, the PL method exploits all wrapped interferometric phases to optimize the phase quality. SqueeSAR technology [11], which uses a phase triangulation algorithm, is one example of the PL method. These optimized ESR phases of DS targets can be used in conventional PSI processing.


Accurately estimating linked phases is critical in coherent analysis for mitigating decorrelation effects on SAR data. Hence, numerous studies have been carried out to enhance the precision and computational efficiency of PL estimation since the work of Guarnieri and Tebaldini [10]. Ferretti et al. [11] proposed a quasi-Newton method for unconstrained nonlinear optimization, the Broyden–Fletcher–Goldfarb–Shanno algorithm [12], for the MLE solution. This optimization technique is efficient in minimizing the nonlinear cost function of the PL problem. Cao et al. [13] introduced equal-weighted and coherence-weighted factors for phase optimization, which provide flexibility for incorporating coherence information. The CAESAR algorithm [14] was proposed for phase optimization under multiple scattering mechanisms. It is based on the coherence matrix's eigenvalue decomposition (EVD) and can extract different scattering components. The EMI algorithm proposed by Ansari et al. [15] is also EVD based and MLE based. It is efficient in computation and estimation due to its exploitation of coherence information. Ho Tong Minh and Ngo [16] proposed a compression technique for PL estimation. The method involves dividing massive data into ministacks and then compressing them to enhance the noise-free short-lived interferometric components. Finally, Zwieback [17] proposed regularization methods to improve the estimation of the coherence matrix. The improvement in the compression [16] and regularization [17] approaches is most significant for low long-term coherences, due to more reliable coherence matrix estimates. Overall, the differences among the PL algorithms can be attributed to the weight criteria adopted in each algorithm, which can be coherence-based, sparsity-based, or other forms of regularization.

Addressing signal decorrelations to improve the interferometric phase's quality is crucial for many applications. The PL algorithm has been instrumental in handling this issue in the past 15 years. In the past five years, 826 review articles have been published, while the total number of papers related to PL since its introduction is 1,286, according to results from Google Scholar as of June 2023. However, there is a need for a comprehensive review of the different PL algorithms proposed so far. Notably, no comprehensive review has been conducted thus far on the relationship between PL and its applications in diverse SAR methodologies, including TomoSAR, PSs and DSs, and the ComSAR algorithm. This article aims to bridge this knowledge gap by providing an overview of the PL techniques developed over the past 15 years and emphasizing the significance of the PL technique in various SAR applications. Furthermore, the potential of employing DL as a valuable tool to enhance the precision and efficiency of the PL process is explored and discussed. Table 1 provides the nomenclature used throughout the article.

PHASE LINKING MODELS AND PROPOSED ALGORITHMS

CLOSURE PHASE PROBLEM
Suppose that N SLC SAR images are available for a specific area of interest. The images are coregistered on a reference grid, and phase contributions due to terrain topography and orbit have been compensated. Let $y_n$ be the nth coregistered SLC image, in the form of

$y_n = S_n \exp(j \varphi_n)$  (1)

where $S_n$ is the amplitude and $\varphi_n$ is the phase of $y_n$. With three SAR images, it is possible to generate three single-look interferograms, written as

$y_{12} = y_1 y_2^*, \quad y_{13} = y_1 y_3^*, \quad y_{23} = y_2 y_3^*$  (2)

where * denotes the complex conjugate. The closure phase is the circular combination of the three interferometric phases:

$\Xi_{1,2,3} = W\{\varphi_{12} + \varphi_{23} + \varphi_{31}\}$  (3)

where $W$ is the wrapping (modulo-$2\pi$) operator, $\varphi_{nm} = \varphi_n - \varphi_m$ is the single-look interferometric phase of $y_{nm}$, and $\varphi_{nm}$ includes contributions related to the residual

topography, deformation, atmosphere, and noise [2]. For single pixels, the closure phase $\Xi_{1,2,3}$ is always zero by definition, a property called the phase consistency condition or phase triangularity condition [18], [19]. However, the phase triangularity condition is not necessarily valid for multilooked interferometric pixels, as the closure phase can be nonzero [11], [19], [20]. Figure 1 shows an example of a full-scene closure phase from three multilooked interferometric phases. It has been derived from Advanced Land Observing Satellite (ALOS) 2/Phased-Array L-Band Synthetic Aperture Radar (PALSAR) 2 data acquired over Vietnam. Ho Chi Minh City occupies the west center of the image.

Notably, it has been demonstrated that nonsymmetric volumetric targets have a nonzero phase closure [20]. This implies that the zero-closure model, which assumes no phase closure, is inadequate for volume scattering scenarios, such as forests and glaciers. In such cases, multilooked interferograms assume a mathematical model representing the volumetric target as an "equivalent point target" with a "phase center" position. Consistent with this interpretation, the mathematical model of multilooked interferograms assumed in PL algorithms is given as

$E\{y_n y_m^*\} = \gamma_{nm} \sigma_n \sigma_m \exp(j(\theta_n - \theta_m))$  (4)

where $E$ denotes spatial averaging, $\gamma_{nm}$ is the coherence of the nmth interferometric pair, $\sigma_n = \sqrt{E\{S_n^2\}}$, $\theta_n$ is the multilook interferometric phase for the nth acquisition, and $\theta_{nm} = \theta_n - \theta_m$ is the multilook interferometric phase of $y_{nm}$. Equation (4) requires reevaluating the phase estimation in a stack of SAR images. The selection and weighting of the interferograms can affect the accuracy of the reconstructed phase history. PL is a technique to address this statistical misclosure.

TABLE 1. NOMENCLATURE USED IN THIS ARTICLE.
CAESAR: Component extraction and selection synthetic aperture radar
CNN: Convolutional neural network
ComSAR: Compressed persistent scatterers and distributed scatterers interferometric SAR algorithm
CRLB: Cramér–Rao lower bound
DL: Deep learning
DS: Distributed scatterer
EMI: Eigendecomposition-based maximum likelihood estimator of interferometric phase
ESA: European Space Agency
ESR: Equivalent single reference
EVD: Eigenvalue decomposition
GAN: Generative adversarial network
InSAR: Interferometric SAR
MLE: Maximum likelihood estimator
PCA: Principal component analysis
PL: Phase linking
PS: Persistent scatterer
PSDS: PSs and DSs
PSI: PS interferometry
PU: Phase unwrapping
RNN: Recurrent NN
SAR: Synthetic aperture radar
SBAS: Small baseline subsets
SHP: Statistically homogeneous pixel
SLC: Single-look complex
TomoSAR: SAR tomography
TSPA: Two-stage programming approach

FIGURE 1. The closure phase corresponding to three ALOS-2/PALSAR-2 acquisitions over Vietnam, acquired on 12 January, 23 March, and 10 August 2018. Ho Chi Minh City is recognizable in the west center of the image. The color scale spans −30° to +30°.
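Returning to the closure phase in (2) and (3): the following minimal sketch (ours, not from the article) forms three multilooked interferograms from three coregistered SLC arrays and evaluates the wrapped phase sum. The boxcar multilook window is an illustrative assumption; with a single look, the result is identically zero, as stated above.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def closure_phase(slc1, slc2, slc3, looks=5):
    # Multilooked closure phase of three coregistered SLC images, Eq. (3).
    def multilook(ifg):
        # Boxcar complex averaging of a single-look interferogram
        return uniform_filter(ifg.real, looks) + 1j * uniform_filter(ifg.imag, looks)

    i12 = multilook(slc1 * np.conj(slc2))
    i23 = multilook(slc2 * np.conj(slc3))
    i31 = multilook(slc3 * np.conj(slc1))
    # The angle of the product implements the wrapping operator W{.}
    return np.angle(i12 * i23 * i31)
```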

COHERENCE MATRIX
Under the hypothesis of distributed scattering, the probability density function of the data may be regarded as a zero-mean multivariate circular normal distribution. Therefore, an ensemble of the second-order moments (i.e., a covariance matrix or a coherence matrix) represents sufficient statistics to infer information from the data. Compared to a covariance matrix, the coherence matrix can avoid amplitude disturbance among SAR images. For simplicity, we assume, without any loss of generality, that the images are normalized such that $\sigma_n = 1$ for every n [see (4)] in the remainder of this article. Under this assumption, the covariance matrix is identical to a coherence matrix. The sampled coherence matrix can be defined as

$\hat{C} = E[y y^H] \approx \frac{1}{L} \sum_{y \in \Omega} y y^H$  (5)



where H denotes the conjugate transpose and $\Omega$ represents a homogeneous patch containing L adjacent pixels with similar scattering properties. The absolute values and phases of $\hat{C}$ are the estimated coherence values $\hat{\gamma}_{nm}$ and the interferometric phases $\theta_{nm}$, respectively. Therefore, $\hat{C}$ can be expressed as

$\hat{C} = \begin{bmatrix} 1 & \hat{\gamma}_{12} e^{j\theta_{12}} & \cdots & \hat{\gamma}_{1N} e^{j\theta_{1N}} \\ \hat{\gamma}_{21} e^{j\theta_{21}} & 1 & \cdots & \hat{\gamma}_{2N} e^{j\theta_{2N}} \\ \vdots & \vdots & \ddots & \vdots \\ \hat{\gamma}_{N1} e^{j\theta_{N1}} & \hat{\gamma}_{N2} e^{j\theta_{N2}} & \cdots & 1 \end{bmatrix} = |\hat{C}| \circ \Phi$  (6)

where $\Phi$ is an $N \times N$ matrix whose elements $e^{j\theta_{nm}}$ indicate the interferometric phases between the nth and mth acquisitions, and $|\hat{C}|$ represents an $N \times N$ matrix with elements $\hat{\gamma}_{nm}$.
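As a concrete illustration of (5) and (6), the sketch below (ours) estimates the sample coherence matrix of one SHP family from an SLC stack; the array layout is an assumption made for the example.

```python
import numpy as np

def coherence_matrix(stack):
    # stack: complex array of shape (N, L) -- N SLC acquisitions, L SHPs.
    power = np.mean(np.abs(stack) ** 2, axis=1, keepdims=True)
    y = stack / np.sqrt(power)        # normalize so that sigma_n = 1
    L = y.shape[1]
    C = (y @ y.conj().T) / L          # (1/L) * sum over Omega of y y^H, Eq. (5)
    return C                          # |C| holds coherences, np.angle(C) the phases
```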

MAXIMUM LIKELIHOOD ESTIMATION PHASE LINKING
Based on the central limit theorem, we assume that the normalized SAR data vector y follows a complex multivariate normal distribution with zero mean and dispersion matrix R. The dispersion matrix R represents the scattering properties of a DS. Recalling R as a model for the underlying covariance of a complex circular Gaussian process, it is known that the probability density function of $\hat{C}$ follows a complex Wishart distribution with L degrees of freedom [13]:

$p(\hat{C} \mid R) = \frac{L^{LN} \det(\hat{C})^{L-N} \exp\{-\mathrm{tr}[L R^{-1} \hat{C}]\}}{\pi^{N(N-1)/2} \det(R)^{L} \prod_{j=1}^{N} \Gamma(L - j + 1)}$  (7)

where tr(·) and det(·) indicate the trace and determinant operators. The R of a generic pixel can be expressed using "true" coherence values and "true" phase values as

$R = H G H^H$  (8)

where

$H = \begin{bmatrix} e^{j\vartheta_1} & 0 & \cdots & 0 \\ 0 & e^{j\vartheta_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{j\vartheta_N} \end{bmatrix}, \quad G = \begin{bmatrix} 1 & \gamma_{12} & \cdots & \gamma_{1N} \\ \gamma_{21} & 1 & \cdots & \gamma_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_{N1} & \gamma_{N2} & \cdots & 1 \end{bmatrix}.$

Here, $\boldsymbol{\vartheta} = [\vartheta_1, \vartheta_2, \ldots, \vartheta_N]^T$ is the optimal phase vector that needs to be estimated from the filtered $N(N-1)/2$ phases. It should be noted that the $N(N-1)/2$ interferometric phase values are not redundant, due to the spatial filtering process. Furthermore, the estimation of the absolute phase series is ambiguous. The phase of an arbitrary image in the time series is set to zero, and the remaining values are measured relative to this arbitrary datum. Without any loss of generality, the first value can be set to zero, such that $\boldsymbol{\vartheta} = [0, \vartheta_2, \ldots, \vartheta_N]^T$. Therefore, only N − 1 phase values need to be estimated. PL is a technique to estimate the optimal N − 1 interferometric phases from the possible $N(N-1)/2$ phases. In other words, PL can be understood as combining multiple interferometric phases into an ESR phase.

PL is commonly formulated as an optimization problem. Given $\hat{C}$ obtained from (5), the MLE of R follows from the maximization of the Wishart probability density function:

$\hat{R} = \arg\max_{R} \{ \ln p(\hat{C} \mid R) \} = \arg\max_{R} \{ -\mathrm{tr}(L R^{-1} \hat{C}) - L \ln \det(R) \} = \arg\min_{G,H} \{ \mathrm{tr}(H G^{-1} H^H \hat{C}) + \ln \det(G) \}.$  (9)



resulting in unregularized PL and guaranteeing nonnegativeness [17]. We can then write (9) as

$\hat{R} = \arg\min_{H} \{ \mathrm{tr}(H |\hat{C}|^{-1} H^H \hat{C}) \}.$  (11)

Equation (11) was originally proposed by Guarnieri and Tebaldini [10]. To ensure a unified analysis, considering the variations in PL approaches, (11) can be reformulated based on the employed weighting strategy. In detail, at row n and column m, the MLE of the phase value is [13]

$\hat{\boldsymbol{\vartheta}}_{\mathrm{MLE}} = \arg\min_{\boldsymbol{\vartheta}} \left\{ \sum_{n=1}^{N} \sum_{m=n+1}^{N} \tilde{c}_{nm} \cos(\theta_{nm} - \vartheta_n + \vartheta_m) \right\}$  (12)

where $\tilde{c}_{nm}$ is the element of the Hadamard product $|\hat{C}|^{-1} \circ \hat{C}$, defined as the weight factor in the optimization, and $\circ$ represents the Hadamard entry-wise product. In practice, a damping factor is used in the algorithm to remove small negative or null eigenvalues of the matrix $\hat{C}$ before matrix inversion. The MLE may be interpreted as a temporal filter that compresses the information of $N(N-1)/2$ interferograms to a phase series of size N. The solution of (12) is achieved by iteratively minimizing the cost function [10]. After this procedure, the DS phase values are filtered, resulting in more reliable PU. With this assumed model, the scattering behavior of the DS neighborhood is approximated by a PS-resembling mechanism.
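A minimal sketch of such an iterative solution is given below. It performs coordinate-descent updates on unit-modulus phasors, with the damping noted above applied to $|\hat{C}|$ before inversion. The damping value, iteration count, and initialization are illustrative assumptions, not the settings of [10].

```python
import numpy as np

def phase_linking_mle(C, n_iter=50, damping=1e-3):
    # Iterative MLE phase linking, Eqs. (11)-(12), in the spirit of [10].
    # C: N x N sample coherence matrix. Returns N linked phases with the
    # first image taken as the zero-phase reference.
    N = C.shape[0]
    # Damped |C| keeps the inversion stable when |C| is nearly singular
    M = np.linalg.inv(np.abs(C) + damping * np.eye(N)) * C  # Hadamard product
    x = np.exp(1j * np.angle(C[:, 0]))   # initialize w.r.t. the first image
    for _ in range(n_iter):
        for n in range(N):
            b = M[n, :] @ x - M[n, n] * x[n]   # sum over m != n
            x[n] = -b / np.abs(b)              # unit-modulus coordinate update
    return np.angle(x * np.conj(x[0]))
```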

RECENT ADVANCES
Estimating the linked phase is the crucial step in coherent analysis to account for decorrelating targets. Consequently, dedicated research has focused on improving the precision and computational efficiency of the original approach of Guarnieri and Tebaldini [10]. These techniques can be categorized into two main groups. The first approach is based on computation methods involving optimization, iteration, or EVD [11], [13], [14], [15]. The second approach involves estimating the coherence matrix, which can be achieved through regularization or compression techniques [16], [17].

TABLE 2. CHARACTERISTICS OF THE MAIN PL APPROACHES.

Computation-based methods:
- Guarnieri and Tebaldini [10] (MLE); weight $\tilde{c}_{nm}$: the element of the Hadamard product $|\hat{C}|^{-1} \circ \hat{C}$, with an iterative solution.
- Ferretti et al. [11] (MLE); weight $\tilde{c}_{nm}$: similar to Guarnieri and Tebaldini [10], with the solution obtained by the Broyden–Fletcher–Goldfarb–Shanno algorithm.
- Cao et al. [13] (Coherence); weight $\hat{\gamma}_{nm}$: the element of the coherence matrix $\hat{C}$ (with the equal-weighted factor $\hat{\gamma}_{nm} = 1$).
- Fornaro et al. [14] (EVD); weight $\hat{\gamma}_{nm}\tilde{h}_{nm}$: $\tilde{h}_{nm}$ is the element of the matrix $|h_1||h_1|^T$, where $|h_1|$ is the maximum eigenvector of the coherence matrix $\hat{C}$.
- Ansari et al. [15] (EMI); weight $\tilde{c}_{nm}$: similar to Guarnieri and Tebaldini [10], with an iterative solution initialized as the minimum eigenvector of the matrix $|\hat{C}|^{-1} \circ \hat{C}$.

Coherence matrix-based methods:
- Ho Tong Minh and Ngo [16] (MLE); weight $\tilde{c}_{nm}^{\mathrm{compression}}$: the element of the Hadamard product $|\hat{C}_{\mathrm{compression}}|^{-1} \circ \hat{C}_{\mathrm{compression}}$.
- Zwieback [17] (MLE); weight $\tilde{c}_{nm}^{\mathrm{regularization}}$: the element of the Hadamard product $|\hat{C}_{\mathrm{regularization}}|^{-1} \circ \hat{C}_{\mathrm{regularization}}$.

Ferretti et al. [11] proposed a quasi-Newton method for unconstrained nonlinear optimization, i.e., the Broyden–Fletcher–Goldfarb–Shanno algorithm [12], for the MLE solution. Cao et al. [13] introduced equal-weighted and coherence-weighted factors in the phase optimization. They proposed a modified MLE algorithm incorporating the coherence matrix. The coherence-weighted algorithm can be more accurate than the equal-weighted one under temporal decorrelation. Fornaro et al. [14] proposed the CAESAR algorithm for estimating the phase in the presence of multiple scattering mechanisms. The authors used an EVD-based approach to extract different scattering components from the coherence matrix. The algorithm was shown to be accurate and computationally efficient for estimating the phase in the presence of multiple scattering mechanisms. Ansari et al. [15] proposed an EVD-based MLE (EMI) algorithm to estimate the phase. They used an EVD approach to reduce the problem's dimensionality and improve the algorithm's computational efficiency. The initial solution can be found as the minimum eigenvector of the matrix $|\hat{C}|^{-1} \circ \hat{C}$. Ho Tong Minh and Ngo [16] proposed a data compression technique to improve the precision of the phase estimation algorithm. The authors divided massive data into many ministacks and then compressed them. The improvement of the phase estimation from the compression is due to the noise-free short-lived interferometric components. Zwieback [17] proposed a method for improving the estimation of the coherence matrix by regularizing the coherence matrix estimates. For instance, with spectral regularization, $\hat{C}_{\mathrm{regularization}} = \beta I + (1 - \beta)\hat{C}$, with $\beta$ varied from zero to one. The regularization techniques impose constraints on the coherence matrix estimates to improve their accuracy.
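The sketch below (ours) combines two of the ingredients just described: spectral regularization of the sample coherence matrix [17] and an EMI-style closed-form phase estimate as the minimum eigenvector of $|\hat{C}|^{-1} \circ \hat{C}$ [15]. It is a simplified illustration, not the reference implementation of either method.

```python
import numpy as np

def emi_phase(C, beta=0.5):
    # Spectral regularization C_reg = beta*I + (1 - beta)*C, after [17],
    # followed by an EMI-style estimate [15]: the eigenvector of minimum
    # eigenvalue of the Hadamard product |C_reg|^-1 o C_reg.
    N = C.shape[0]
    C_reg = beta * np.eye(N) + (1.0 - beta) * C
    M = np.linalg.inv(np.abs(C_reg)) * C_reg   # element-wise (Hadamard) product
    eigvals, eigvecs = np.linalg.eigh(M)       # ascending eigenvalues
    v = eigvecs[:, 0]                          # minimum-eigenvalue eigenvector
    return np.angle(v * np.conj(v[0]))         # phases w.r.t. the first image
```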

FIGURE 2. PL performances using a Sentinel-1 temporal coherence model. The coherence is modeled as two exponential decays and a long-term coherent component [21], [16]. The root-mean-square error (rad) is plotted against the SLC image index for the coherence, EVD, EMI, MLE, regularization, and compression estimators, together with the CRLB. The performances are ordered to facilitate the visualization.

In the compression [16] and regularization [17] approaches, the improvements are most significant for low long-term coherences, due to more reliable coherence matrix $\hat{C}$ estimates. In summary, the difference among the PL algorithms can be interpreted through the adopted weight criteria, as in (12). Table 2 gives the characteristics of the different PL approaches.

Figure 2 illustrates the PL performances. We employ a well-documented Sentinel-1 coherence model to simulate the behavior of temporal coherence over time [21], [16]. This model generates a coherence matrix for a three-year time series of 180 temporally ordered measurements taken at six-day intervals. Each measurement includes an ensemble of 300 statistically homogeneous samples. The simulation is repeated 1,000 times. The EMI solution corresponds to the minimum eigenvector of the matrix $|\hat{C}|^{-1} \circ \hat{C}$. The spectral regularization is used with a $\beta$ of 0.5. We set the ministack size to 10 for the compression method. The CRLB is a theoretical measure that employs the simulated coherence for the calculation, as described in [10]. The EVD and coherence weight results are very similar. The compressed estimator performs better than the other approaches, closely approximating the CRLB. The compressed estimator's reduced error is attributed to the lack of noise in short-lived components [16].

It is essential to highlight that the significance of PL becomes more pronounced in ill-posed InSAR scenarios. However, with specific well-posed InSAR techniques, such as the TSPA, PU methods are already integrated into the processing workflow [22]. Thus, exploring the synergy between PL and the TSPA could lead to exciting research opportunities and further advancements in handling challenging InSAR scenarios.

Since the introduction of PL, there have been numerous SAR applications. Among them, the PSDS technique has gained immense popularity, as it surpasses PSI in terms of performance [11]. In tomographic focusing, the PL algorithm plays a crucial role in phase calibration, as it requires an optimal phase model to compensate for potential phase residuals that may affect 3D imaging [23]. The ComSAR technique is a recent advancement that utilizes PL algorithms to select the most coherent interferograms based on their linked phases [16]. These applications are discussed in the "Applications" section.
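For readers who wish to reproduce this kind of experiment, the sketch below draws one simulated SHP ensemble from a coherence matrix built from two exponential decays plus a long-term coherent component. All parameter values are illustrative assumptions, not the calibrated Sentinel-1 values of [21] or [16].

```python
import numpy as np

rng = np.random.default_rng(0)

N, dt, L = 180, 6.0, 300              # 180 acquisitions, 6-day sampling, 300 SHPs
t = np.arange(N) * dt
lags = np.abs(t[:, None] - t[None, :])

# Two exponential decays plus a long-term coherent component
a1, tau1 = 0.4, 30.0                  # fast-decaying component (days)
a2, tau2 = 0.4, 200.0                 # slowly decaying component (days)
gamma_inf = 0.1                       # long-term coherence
coh = a1 * np.exp(-lags / tau1) + a2 * np.exp(-lags / tau2) + gamma_inf
np.fill_diagonal(coh, 1.0)

# Draw one zero-mean circular Gaussian SHP ensemble with this coherence
Lc = np.linalg.cholesky(coh + 1e-6 * np.eye(N))
noise = (rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))) / np.sqrt(2)
y = Lc @ noise                        # simulated normalized SLC stack (N x L)
```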


APPLICATIONS

SYNTHETIC APERTURE RADAR TOMOGRAPHY
TomoSAR is a relatively new technique that has emerged as a powerful tool for the 3D imaging of complex scenes [24]. TomoSAR builds upon the capability of SAR systems to acquire data from multiple angles, which enables the reconstruction of the 3D structure of the imaged object [25]. In the case of forests, TomoSAR can be used to retrieve the vertical distribution of scatterers within the canopy, providing valuable information on forest structure [26], [27], classification [28], and biomass [23].

Conventional SAR imaging of forests is challenging due to the complex interaction between the radar signal and the forest canopy. The forest canopy is composed of vegetation layers that attenuate and scatter the radar signal in a complex way, making it difficult to retrieve information on the underlying terrain and vegetation [23]. TomoSAR can overcome some of these limitations by exploiting the 3D structure of the forest. When the radar's wavelength is long enough to penetrate the forest canopy, it becomes possible to use multiple SAR acquisitions with slightly different look angles over the same area to quantify the three dimensions of forest reflectivity. This principle is demonstrated in Figure 3(a) and (c), where acquisitions from traditional SAR and TomoSAR are shown, respectively.

To further elaborate on the TomoSAR process, consider a scenario where a sensor carefully flies along N parallel tracks and acquires a multibaseline dataset of SAR images. Each pixel at slant range r and azimuth location x in the nth image is denoted by $y_n(r, x)$. The azimuth axis x is defined by the direction of the aircraft platform, whereas the slant range r is the line-of-sight (LOS) distance linking the SAR sensor to targets on the ground, as in Figure 3(c). It is assumed that each image within the multibaseline dataset has been coregistered and resampled on a common grid (i.e., the reference track) and that phase components due to terrain topography and platform motion have been compensated. Thus, the multibaseline SAR model can be written as [24], [23]

$y_n(r, x) = \int S(p, r, x) \exp\left(j \frac{4\pi}{\lambda r} b_n p\right) dp$  (13)

where p is the cross-range coordinate, defined by the direction orthogonal to the LOS and azimuth coordinates; $b_n$ is the normal baseline of the nth image with respect to the reference image; $\lambda$ is the carrier wavelength; and $S(p, r, x)$ is the average scene complex reflectivity within the slant range, azimuth, and cross-range resolution cell, as described in Figure 3(d). It is worth noting that the SAR scene and its geometric configuration are linked. Specifically, the distribution of the SAR scene's reflectivity in the cross-range direction is directly related to the multibaseline SAR data. These two components form a Fourier pair [as shown in (13)]. As a result, it is possible to reconstruct the cross-range distribution of the scene's complex reflectivity by taking the Fourier transform, as follows [23]:

$\hat{S}(p, r, x) = \sum_{n=1}^{N} y_n(r, x) \exp\left(-j \frac{4\pi}{\lambda r} b_n p\right).$  (14)

Consequently, TomoSAR processing enables us to obtain the cross-range distribution of the SAR scene's reflectivity at every range and azimuth location. By doing so, we can obtain a 3D image that provides comprehensive information on the reflectivity of a forest in three dimensions. This information can be used to derive valuable forest structure characteristics, such as height and biomass [23], [29], [30].

FIGURE 3. A comparison of traditional SAR and TomoSAR acquisitions. (a) and (b): Traits of traditional SAR. (c) and (d): Traits of TomoSAR. (a) A SAR acquisition. (b) A SAR resolution cell. (c) A TomoSAR acquisition. (d) A TomoSAR resolution cell. The figure was adapted from [29, Fig. 2].

Tomographic analysis relies on the outlined theoretical model, which assumes a disturbance-free propagating signal. Before focusing, a phase calibration procedure is required to compensate for phase residuals potentially affecting 3D focusing [20], [31], [32]. In addition, analyzing the backscattered power from different heights in a forest makes it possible to gain insight into the scattering mechanisms within the canopy [23]. To successfully implement this concept, a vertical axis reference must be established, which enables height measurements with respect to terrain elevation. The ground surface is often called the "zero-meter layer." However, the ground phase contribution must be separated from the vegetation phase to prevent it from influencing the 3D focusing. These two points can be addressed by removing the ground phase contribution from the tomographic data [23].

The ground phases are determined not only by the terrain height $z_g$ but also by the phase disturbances $\eta$ deriving from the platform motion. In a formula, $\varphi_{\mathrm{ground}} = k_z z_g + \eta$, where $k_z = 4\pi b_n / (\lambda \sin\theta)$ is the height-to-phase factor and $\theta$ is the local incidence angle. The multipolarimetric multibaseline covariance matrix W can be approximated by retaining the first two terms of the sum of Kronecker products [33]. In a formula, $W \approx C_G \otimes R_G + C_V \otimes R_V$, where R and C are referred to as interferometric information and polarimetric information, respectively, and G and V are associated with ground-only and volume-only contributions, respectively. PL is a fundamental component in facilitating the retrieval of the ground phase contribution from $R_G$. Indeed, applying PL to forested areas allows for representing forest scattering in terms of the "equivalent point target," with well-defined distances from the radar in different trajectories. This allows for simultaneous target and radar position estimation, after which platform motion can be corrected with subwavelength accuracy [20]. Figure 4 presents an example of SAR and TomoSAR imaging at the Paracou tropical forest site (French Guiana, South America).

FIGURE 4. A comparison of traditional SAR and TomoSAR imaging. (a) A traditional SAR image from the Paracou, French Guiana, forest site. (b) TomoSAR layers, each related to a certain height above the ground. The figure was adapted from [29, Fig. 4(b)].
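The Fourier relationship in (14) is straightforward to apply per pixel. The sketch below (ours; function and argument names are assumptions) focuses a single (range, azimuth) pixel by steering the multibaseline measurement vector to a set of cross-range positions.

```python
import numpy as np

def tomosar_focus(y, b, wavelength, r, positions):
    # Tomographic focusing of one (range, azimuth) pixel, Eq. (14).
    # y: length-N complex vector of multibaseline SLC values;
    # b: length-N normal baselines (m); r: slant range (m);
    # positions: cross-range coordinates (m) at which to evaluate S.
    k = 4.0 * np.pi * np.asarray(b) / (wavelength * r)
    steering = np.exp(-1j * np.outer(positions, k))   # (n_positions, N)
    return steering @ np.asarray(y)                   # complex reflectivity profile
```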

PERSISTENT SCATTERERS AND DISTRIBUTED SCATTERERS TECHNIQUE
The PSDS technique is an approach that leverages the phase change over time of both PS and DS targets [1]. The technique involves two main steps: 1) PL and signal decorrelation removal and 2) estimation of the parameters of interest. PSDS refers to techniques that exploit the time series phase


change of both PS and DS targets. SqueeSAR technology [11] is one example of the PSDS technique. The PL technique is applied first to all the $N(N-1)/2$ interferograms available from N images. It jointly exploits these interferograms to squeeze out the best estimates of the N − 1 interferometric phases. Once the estimate of the N − 1 linked phases has been produced, the second step removes the signal decorrelations and estimates the parameters of interest, such as the elevation error and constant velocity, similar to the PSI processing algorithm [1].

The PSDS technique is widely used in InSAR applications, as it provides a reliable way to detect and monitor changes in Earth's surface [34], [35], [36], [37]. It allows for detecting small and slow surface movements over a large area, which is helpful in surface deformation applications [38]. An example of how interferogram networks can be exploited in InSAR time series techniques is provided in Figure 5. The figure was generated using Constellation of Small Satellites for Mediterranean Basin Observation data over the Ha Noi, Vietnam, area. It illustrates how interferogram networks can be exploited to analyze PSDS targets over time [39].

FIGURE 5. Interferogram networks, plotting the normal baseline (m) against the acquisition date (2011–2015). (a) The single master network in PSI processing. (b) The subset network, consisting of interferograms with short spatial and temporal baselines, in SBASs. (c) The fully connected network in the PSDS technique.

It is important to note that PSI can be considered a particular case of the maximum likelihood interferometry approach [1]. In PSI, all PS targets are assumed to be equally correlated in all images. This assumption obviates the need for the joint processing of all available interferograms to estimate interferometric phase information. Instead, only removing the reference phase from all other phases is necessary to obtain the linked phases required for estimating parameters, such as surface deformation. The PSDS algorithm is an extension of the PSI method that considers both PS and DS targets. The joint processing of all available interferograms in PSDS can help account for variations in the radar signal caused by different surface characteristics, including changes in moisture content, surface roughness, and vegetation cover. As a result, the PSDS approach can offer greater accuracy and detail than the PSI method in detecting and measuring surface deformation.

The DS target is known for its low average temporal coherence, primarily due to geometrical and temporal decorrelation phenomena [3]. As a result, this target often has a low signal-to-noise ratio, making it challenging to work with. However, it is possible to enhance the DS target's signal-to-noise ratio and treat it as a PS target by identifying pixels within a neighborhood that exhibit similar behavior. These similar pixels are called SHPs. They can be identified using a two-sample Kolmogorov–Smirnov [11] or Baumgartner–Weiss–Schindler [40] test on the amplitude-based time series of the current pixel and its neighbors within a specified window. The pixels with a similar cumulative probability distribution are grouped as "brothers," resulting in a family of SHPs [see Figure 6(a)]. A DS candidate is identified if it has a sufficient number of SHPs that exceed a certain threshold [16].
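A minimal sketch of this amplitude-based SHP identification, using the two-sample Kolmogorov–Smirnov test mentioned above, is given below. The window geometry and significance level are illustrative assumptions; SqueeSAR-like implementations differ in detail.

```python
import numpy as np
from scipy.stats import ks_2samp

def shp_mask(amplitude, row, col, half_win=(4, 17), alpha=0.05):
    # amplitude: (N, H, W) amplitude time series of N coregistered images.
    # half_win (4, 17) corresponds to a 9 x 35 search window.
    N, H, W = amplitude.shape
    ref = amplitude[:, row, col]
    r0, r1 = max(0, row - half_win[0]), min(H, row + half_win[0] + 1)
    c0, c1 = max(0, col - half_win[1]), min(W, col + half_win[1] + 1)
    mask = np.zeros((r1 - r0, c1 - c0), dtype=bool)
    for i in range(r0, r1):
        for j in range(c0, c1):
            # Two-sample Kolmogorov-Smirnov test on the amplitude series
            _, p_value = ks_2samp(ref, amplitude[:, i, j])
            mask[i - r0, j - c0] = p_value > alpha   # same distribution -> SHP
    return mask
```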

Once a DS candidate is identified, its sample coherence matrix can be computed using its SHP family. The coherence matrix fully characterizes the target statistics and can be used to invert for the N − 1 linked phases using PL. The quality of the estimated N − 1 phase values can be assessed using the PL coherence, defined as [11]

$\gamma_{\mathrm{PL}} = \frac{2}{N(N-1)} \mathrm{Re} \left\{ \sum_{n=1}^{N} \sum_{m=n+1}^{N} \exp(i(\theta_{nm} - \vartheta_n + \vartheta_m)) \right\}.$  (15)

If the γ_PL coherence is above a certain threshold, a DS point with the linked N − 1 phase values will replace the original points. Finally, the selected DSs are jointly processed using the same PSI technique as the PSs. Figure 6(b) is an example of PL coherence at full resolution, allowing for a more comprehensive view of PL performance.

Improving PL and selecting SHPs are critical aspects of PSDS techniques. Various modified approaches have been proposed to select SHPs, including the Anderson–Darling test [34], time series likelihood ratios [41], the t-test [42], fast SHP selection, the Baumgartner–Weiss–Schindler test [40], the mean amplitude difference [43], and similar time series interferometric phase [44], among others. These approaches aim to increase the density of DSs, mitigating sample coherence bias.
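For illustration, the sketch below implements amplitude-based SHP selection with the two-sample Kolmogorov–Smirnov test [11]; the window size, significance level, and function interface are assumptions chosen for this example rather than prescriptions from the literature.

```python
import numpy as np
from scipy.stats import ks_2samp

def shp_family(amp_stack, row, col, win=(9, 35), alpha=0.05):
    """Select statistically homogeneous pixels (SHPs) around (row, col).

    amp_stack : (N, H, W) amplitude time series
    Returns a boolean mask over the window marking the SHP 'brothers'.
    """
    N, H, W = amp_stack.shape
    hr, hc = win[0] // 2, win[1] // 2
    ref = amp_stack[:, row, col]
    mask = np.zeros(win, dtype=bool)
    for i in range(-hr, hr + 1):
        for j in range(-hc, hc + 1):
            r, c = row + i, col + j
            if 0 <= r < H and 0 <= c < W:
                # Two-sample KS test on the two amplitude time series
                _, p = ks_2samp(ref, amp_stack[:, r, c])
                mask[i + hr, j + hc] = p > alpha  # same distribution: brother
    return mask
```

Counting the True entries of such a mask per pixel yields an SHP-count map like the one shown in Figure 6(a).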


FIGURE 6. Internal results from PSDS-based processing. (a) The number of SHPs identified using the Baumgartner–Weiss–Schindler test on a 9 × 35 window. (b) The PL coherence corresponding to the SHP map. The figure was adapted from [16, Fig. 4].


Additionally, the conventional DS assumption of independent small scatterers with a uniform scattering mechanism can be relaxed by considering DS targets dominated by two or more scattering mechanisms [36], [45]. Engelbrecht and Inggs [45] showed that incorporating multiple scattering mechanisms in L-band ALOS PALSAR data improved deformation measurement extraction in dynamic agricultural regions. Recently, it has been demonstrated that adding polarimetric information can increase the number of coherent pixels by a factor of eight compared to a single-polarization channel [46]. Overall, improving SHP selection and PL, as well as considering more complex scattering mechanisms, can enhance the performance of PSDS techniques in deformation monitoring and other applications.

COMPRESSED PERSISTENT SCATTERERS AND DISTRIBUTED SCATTERERS INTERFEROMETRIC SYNTHETIC APERTURE RADAR TECHNIQUE
The recent trend of spaceborne SAR missions focuses on systematic Earth monitoring with high temporal resolution [47], [48]. This has resulted in unprecedented SAR data volumes due to short revisit cycles, as brief as six to 12 days, with missions such as the ESA's Sentinel-1 [49] and the NASA–Indian Space Research Organization Synthetic Aperture Radar mission [50]. However, interferometric processing of these large data stacks with currently available algorithms is infeasible. To address this demand, efforts have focused on reducing product latency through parallelized computation and cloud computing. One of the first stacking techniques that allows for efficient PL on isolated data batches of the time series is the sequential estimator [51]. The ComSAR algorithm, introduced by Ho Tong Minh and Ngo [16], has been proposed for PSDS processing. Since most deformation phenomena develop slowly, a processing scheme can be devised using reduced-volume datasets. The algorithm divides massive data into many ministacks and compresses them.

Conventional MLE PL suffers from a high computational time complexity driven by the number of SLCs involved, primarily due to the iterative maximum likelihood optimization for the phase estimation and secondarily due to the regularization and inversion of the complex coherence matrix, both of which are affected by the number of interferograms. Data compression is a classic approach to this high-data-volume problem. In the case of multipass SAR, the objective is to compress a stack of coregistered SAR images in the temporal direction so that the length of the time series is reduced while the spatial size of each image remains intact. PCA is a well-known compression technique [52], [53]. However, PCA fails to correctly incorporate the statistical properties of the complex covariance matrix, as it is a geometrical rather than a probabilistic approach [53]. MLE PL, on the other hand, is a purely probabilistic approach and is well known for its precise phase estimation [10], [51]. Therefore, the simple idea is to build the most coherent interferograms from the linked phases, because the linked phases are optimal among all possible interferometric phases. This approach reduces the data volume by retaining only the most informative interferograms for further processing, thereby allowing for efficient interferometric processing.

Let us assume that the N SAR images (ordered temporally) can be divided into small batches, or ministacks, of M images each. The compressed version \tilde{S} of the M SAR images for the kth sequence can be determined by a coherent summation, as follows [51], [16]:

\tilde{S}(r, x) = \sum_{m=1}^{M} S_m(r, x)\, g_m, \quad (16)

where g = \hat{K} / \lVert \hat{K} \rVert, with \hat{K} = [p_1, p_2, \ldots, p_M] = \exp(j\hat{\theta}_k) the vector of linked phases estimated from the M SAR images, and S_m(r, x) is the scene complex-valued SLC at the slant range and azimuth position (r, x). The vector g weights each image's contribution to the coherent summation within the ministack. For each ministack, its compression is formed using (16), resulting in a strong data reduction. As an example of the algorithm's efficiency, from a stack of 90 images, we can set M as 10 (see Figure 7). The processing will then estimate the 10 optimum phases by using the PL technique in (12). These phases allow us to coherently focus the stack subset and produce a single compressed image that represents the first 10 images of the stack. The same procedure is repeated on the following 10 images until the end of the stack, producing nine compressed images. We note that this process has to be performed on an SHP-family basis, since the PL estimation is valid only locally.

The ComSAR scheme has benefits beyond reducing the computational burden. It also avoids the need to update and re-estimate the entire phase history upon every single new acquisition. The processing scheme reduces the data volume from the entire stack to the compressed SLCs, making storage easier. These compressed images can then be used as a reference point to link historical ministacks with recent acquisitions and reconstruct the full phase time series [see Figure 7(d)]. In detail, PL is performed on the compressed components \tilde{S}, producing a vector \hat{\theta}_{\mathrm{cal}} = [\hat{\theta}_{\mathrm{cal}}(1), \hat{\theta}_{\mathrm{cal}}(2), \ldots, \hat{\theta}_{\mathrm{cal}}(K)] that contains the calibration phases for connecting the ministacks. The datum connection for the kth sequence is then carried out by [16]

\hat{\theta}_k^{\mathrm{unified}} = \hat{\theta}_k + \hat{\theta}_{\mathrm{cal}}(k), \quad (17)



1 20

0.8

40

0.6 0.4

60

0.2

80 20

40

60

0

80

(b)

2 4 6

(c)

8 10 2

4

6

8

10

(a)

(d)

FIGURE 7. The ComSAR algorithm. (a) A full coherence matrix in the PSDS technique. The data are divided into ministacks with 10 images

to improve the process efficiency. (b) The PL technique is employed to compress each ministack. This generates linked phases that enable a coherent focus on the stack subset, resulting in a compressed image representing the first 10 images. This compression procedure is repeated on the following 10 images, creating nine compressed images. These compressed images can be utilized to link prior ministacks with new acquisitions and reconstruct the full phase time series without the need to recalculate everything. ComSAR can work with full and compressed time series, but the (c) compressed version typically outperforms the (d) full time series version [16]. SEPTEMBER 2023

IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

55

where the index k indicates the time series sequence and \hat{\theta}_{\mathrm{cal}}(k) is the kth element of the calibration vector. Figure 8(a) displays an original interferogram covering 670 days in Mexico City, Mexico, while Figure 8(b) is a compressed version (with M = 10). Compared to the original [in Figure 8(d) and (e)], the compressed interferogram exhibits superior quality and coherence. The average coherence improves from 0.4 to 0.8 [shown in Figure 8(c)], mainly because the noise component of the data is reduced in the compression process. Essentially, the noisy short-lived components are eliminated from the artificially compressed interferograms created from the ministacks. Consequently, these interferograms have a higher signal-to-noise ratio than the initial ones. Since the estimation of the linked phase is ambiguous, we set the phase of the first image in each ministack to zero. Finally, PS values at these multireference images are extracted from the original SLCs and integrated into the compressed phase time series for PSDS analysis [16].

FIGURE 8. Mexico interferograms. (a) A raw 670-day interferogram. (b) A compressed 670-day interferogram. (c) The coherence distribution. (d) and (e) Zoomed-in versions of (a) and (b), respectively. The figure is adapted from [54, Fig. 3].

An interesting development by Ho Tong Minh and Ngo is the implementation of the PSDS and ComSAR algorithms, which have been made available as an open-source TomoSAR package [16], [54], [56]. To the best of our knowledge, this package is the first publicly accessible tool that enables the simultaneous handling of both PS and DS targets (https://github.com/DinhHoTongMinh/TomoSAR).

PERSPECTIVE WITH DEEP LEARNING
This section highlights the potential of the DL approach for SAR data, emphasizing the value of DL in enhancing the accuracy and efficiency of PL. The "Brief Review of Deep Learning for Synthetic Aperture Radar Data" section serves as a bridge, introducing DL's potential for SAR data processing and paving the way for its application to improving PL techniques in the "Perspective for Phase Linking" section.

BRIEF REVIEW OF DEEP LEARNING FOR SYNTHETIC APERTURE RADAR DATA
In recent years, the increase in SAR Earth observation missions, such as TerraSAR-X, Sentinel-1, ALOS, and RADARSAT, has led to a new scenario characterized by the continuous generation of a massive amount of data. On the one hand, this trend has allowed us to observe the inadequacy of classical algorithms in regard to generalization capabilities and computational performance; on the other hand, it has paved the way for the new artificial intelligence paradigm, including the DL approach [57].




DL has profoundly impacted many scientific fields, such as machine vision, natural language processing, video processing, speech, image processing, and Earth data science [58], [59]. Over the past decade, significant advances have been made in developing and applying DL techniques to problems in Earth observation [59], [60], [61]. These techniques have proved highly effective for classification and parameter retrieval tasks, as they can process large amounts of data and deal efficiently with complex spatial and temporal structures [59].

DL has also been applied to the field of SAR imaging. CNNs, GANs, and RNNs are the leading neural network architectures that have been applied to SAR data analysis. CNNs have been widely used in various tasks, such as ship detection [62], building detection [63], deformation observation [64], and land cover classification [61], [65], and they have been shown to be effective in these tasks. GANs, on the other hand, have been used for SAR image superresolution and to enhance the quality of SAR images [66]. RNNs have been applied to classify time series SAR data and have shown good results [60], [67].

Several studies have explored DL approaches for SAR filtering with promising results. For instance, Mullissa et al. [68] proposed deSpeckNet, a DL-based approach for SAR despeckling that achieved higher accuracy and efficiency than traditional methods. Similarly, Wu et al. [69] used a deep CNN for polarimetric SAR filtering and demonstrated improved filtering performance.

Promising outcomes have also been achieved by exploring DL techniques for SAR PU tasks. Traditional PU methods assume that the phase possesses spatial continuity, but their effectiveness is hampered by decorrelation noise and aliasing fringes that invalidate such assumptions. To enhance the reliability of unwrapping outcomes, Wu et al. [70] proposed a deep CNN, known as a discontinuity estimation network, that predicts the probabilities of phase discontinuities in interferograms. Similarly, Zhou et al. [71] converted the PU problem into a learnable image semantic segmentation problem and presented a DL-based branch cut deployment approach (BCNet). Experimental results demonstrate that the proposed BCNet-based PU method is a near-real-time PU algorithm with higher accuracy than traditional PU methods.

Accurate identification of PSs is crucial for obtaining reliable phase information in PSI and PSDS. To address this, a novel deep CNN named PSNet was proposed by Yang et al. [72] for PS identification. The significant advantage of PSNet lies in its deep architecture, which can learn the distinguishing features of PSs from vast training images with diverse topography and landscapes. Using the combined feature images of the average amplitude, amplitude dispersion, and coherence of interferograms as inputs, PSNet was trained to classify PS and non-PS pixels. The results demonstrate that PSNet accurately distinguishes between PS and non-PS pixels. Notably, PSNet outperformed the StaMPS algorithm by detecting more than double the number of PSs.

PERSPECTIVE FOR PHASE LINKING
Traditional PL methods rely on handcrafted algorithms, which can be time-consuming and may not always provide optimal results. DL has the potential to revolutionize the PL process in SAR imaging by providing an end-to-end solution that can learn the complex relationships between the interferometric phases and the ESR phase from data. The principle is quite simple: a DL model consists of multiple layers of interconnected nodes [58], [59], [73]. A general DL scheme for PL can include a first network layer that extracts low-level features from the input data, such as the interferometric phases. The subsequent layers learn increasingly complex data representations until the final layer produces the desired output, such as the ESR phase. Unfortunately, we could not find any report on a DL approach to the PL technique in the literature. For this reason, we discuss the possibility of DL as a valuable tool for PL by addressing a few selected points:
◗◗ Multipass SAR data are typically represented as complex-valued time series, which cannot be directly used as input to most DL models. One common approach is to separate the real and imaginary components and treat them as separate input channels. Alternatively, derived quantities, such as magnitude and phase, can be used as input. In both cases, the topographic phase must be subtracted to improve the spatial stationarity in homogeneous regions before feeding the data to the DL model [16]. Specialized NN architectures that handle complex-valued data directly, such as complex CNNs, also exist [74]. In addressing the PL problem, examining the data input approach is essential. Although few algorithms leverage the complex-valued nature of radar data [75], [76], separating the real and imaginary components or using magnitude and phase quantities as input are commonly used and practical approaches (a minimal sketch is given after this list).
◗◗ One advanced CNN model is U-Net, an encoder–decoder CNN initially employed for semantic segmentation in medical images [77]. For example, a U-Net model could be trained to take SAR interferograms as input and output unwrapped phase maps, using large annotated datasets of phase and deformation maps [78]. This approach could improve the accuracy of the PU process and make it more robust to noise and other sources of error. Similarly, an autoencoder architecture has been designed to effectively separate ground deformation signals from noise in InSAR time series without requiring any prior information about the location of a fault or its slip behavior [64]. Indeed, U-Net is an architecture designed to learn a model in an end-to-end setting. U-Net's encoder path compresses the input image information by extracting relevant features computed at different resolution scales. As a result, this hierarchical feature extraction provides representations at various abstraction levels. U-Net's decoder path, in turn, reconstructs the original image by mapping the intermediate representation back to the input spatial resolution. During this reconstruction process, the information is restored at different resolutions by stacking many upsampling layers, which preserves relevant information during the decoding stage [77]. In this way, the reconstructed image accuracy can be well preserved, and in this sense, there is a potential link between CNN U-Net models and PL.
◗◗ DL training can be performed by providing a large dataset of SAR images and their corresponding ESR phases. The network will use these data to learn the relationships between the interferometric phases and the ESR phase, and it can then generalize these relationships to new SAR images. The DL approach's end-to-end nature means that a deep NN can perform the entire PL process, from feature extraction to ESR phase estimation. This results in a faster and more accurate PL process, as the network can automatically learn the most relevant features and relationships from the data [59], [73].
◗◗ Exploiting DL for PL can bring two benefits. The first is that it can handle nonstationary phase noise and outliers more effectively than traditional methods, which rely on handcrafted algorithms that may not always be robust to such disturbances; DL can instead learn to handle these issues from the data during the training process [79], resulting in a more robust PL process. The second is a reduction of the computational cost. Traditional PL methods can be computationally expensive, especially with large SAR datasets. DL, however, can be implemented on parallel architectures, such as GPUs, which can significantly reduce the computational cost [55]. This is particularly important for near-real-time processing of big SAR data, where the computational cost of the PL process is a significant concern.

FIGURE 9. A synthetic example of PL using DL. (a) A simulated deformation signal for interferograms, using the first acquisition as the reference image. (b) Interferograms after adding decorrelation noise. (c) The results of the MLE method using all the interferograms. (d) The residuals of the MLE method [i.e., the difference between (a) and (c)]. (e) The results of the DL method using the U-Net model. (f) The residuals of the DL method [i.e., the difference between (a) and (e)].
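Referring to the first point in the list above, a minimal sketch of the two common input representations is given below; the function names and array layout are illustrative assumptions, not part of a published PL pipeline.

```python
import numpy as np

def real_imag_channels(ifg):
    """Split complex interferograms into real/imaginary input channels.

    ifg : (N - 1, H, W) complex-valued interferograms, with the
          topographic phase already subtracted [16]
    """
    return np.concatenate([ifg.real, ifg.imag], axis=0)

def mag_phase_channels(ifg):
    """Alternative representation: magnitude and phase channels."""
    return np.concatenate([np.abs(ifg), np.angle(ifg)], axis=0)
```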

In summary, DL holds considerable promise for various applications in PL, offering opportunities to enhance precision estimation and computational efficiency. Nevertheless, it is important to acknowledge the challenges and limitations associated with its implementation. These include data quality, feature extraction, model complexity, and interpretability concerns. While DL presents an enticing avenue for future research trends in PL, it necessitates a meticulous design and thorough evaluation of both the models and the data involved.

EXAMPLE WITH U-NET MODEL
To demonstrate the applicability of DL in PL, we conducted a proof-of-concept test on a synthetic dataset. The simulation settings included generating a radar data stack with temporal noise behaviors, consisting of 10 SLC images with a revisit time of 35 days. We simulated a deformation signal by assuming a simple Gaussian deformation bowl with a maximum LOS deformation rate of 14 mm/year at the center and a radius of 600 m. The simulation was conducted on a flat area, resulting in a zero topographic signal. We assumed a crop of 1,280 × 1,280 m, a radar wavelength of 56 mm, and a pixel size of 20 × 20 m. We employed the Sentinel-1 coherence model to simulate the behavior of temporal coherence over time, which generates a coherence matrix for a one-year time series of 10 temporally ordered measurements taken at 35-day intervals [21], [16]. Each SLC included 64 × 64 homogeneous pixels, and the simulation was repeated 5,000 times to produce the DL training data. Figure 9(a) and (b) provides a visual representation of both the noise-free and noisy simulated datasets. Figure 9(c) and (d) demonstrates the application of the MLE spectral regularization algorithm using coherence matrix estimation over 11 × 11 windows and a β value of 0.5 (see the "Recent Advances" section).

We used the MATLAB function unetLayers to define the U-Net architecture, with a depth of three in the encoder network. The complex-valued interferometric data were separated into magnitude and phase quantities as input for the U-Net model, resulting in only 2N − 1 channels, since the first phase is zero. The patch size was specified as 64 × 64 × 19, and the number of output channels was nine ESR phases. We modified the U-Net architecture by replacing the original softmax layer with a regression layer and trained the model on a dataset of 4,800 measurements, with 200 for validation, setting MaxEpochs to 10. Figure 9(e) and (f) indicates that the DL approach can be expected to be comparable to the handcrafted MLE algorithm. However, it is important to note that the synthetic simulation used in this example was relatively basic. For future studies to effectively address the PL challenge using DL, it is crucial to have access to high-quality datasets containing both synthetic and real-world interferometric patch images from various landscapes. Additionally, utilizing a more advanced U-Net model with greater depth could significantly enhance the accuracy and efficiency of the PL process.
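As a rough illustration of the simulation settings above, here is a minimal NumPy sketch of the noise-free deformation phase stack; treating the 600-m radius as the Gaussian scale parameter and the LOS phase sign convention are assumptions made for this example, not specifications from the study.

```python
import numpy as np

wavelength = 0.056                  # radar wavelength (m)
revisit = 35.0 / 365.0              # revisit time (years)
n_slc, size, pix = 10, 64, 20.0     # 10 SLCs, 64 x 64 pixels of 20 m

# Gaussian deformation bowl: 14 mm/year LOS rate at the center
y, x = np.mgrid[:size, :size]
dist2 = ((x - size / 2) * pix) ** 2 + ((y - size / 2) * pix) ** 2
rate = 0.014 * np.exp(-dist2 / (2 * 600.0 ** 2))   # m/year

# noise-free LOS phase of each acquisition w.r.t. the first one
phase_stack = np.stack(
    [(4 * np.pi / wavelength) * rate * revisit * k for k in range(n_slc)]
)
```

Adding decorrelation noise drawn according to the Sentinel-1 temporal coherence model of [21] then yields training pairs like those shown in Figure 9(a) and (b).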


CONCLUSIONS
Accurate estimation of linked phases is crucial for mitigating decorrelation effects on SAR data in the PL technique. Researchers have proposed various algorithms, such as quasi-Newton, CAESAR, and EMI, to improve the precision and computational efficiency of PL estimation. Additionally, new compression and regularization techniques have been developed to enhance the estimation of the coherence matrix. PL is widely used in TomoSAR, PSDS, and ComSAR applications, and the adoption of DL is expected to improve the accuracy and efficiency of the process. The future of the DL approach for PL is promising, as ongoing research in various areas can shape the development of better algorithms and techniques. This will help improve the detection and measurement of surface deformation and parameter estimation in SAR applications, leading to more accurate and efficient results.

ACKNOWLEDGMENT
This work was supported, in part, by the ESA; Centre National d'Etudes Spatiales/Terre, Ocean, Surfaces Continentales, Atmosphere (project MekongInSAR); UMR TETIS; and Institut National de Recherche en Agriculture, Alimentation, et Environnement. The ALOS-2/PALSAR-2 data were kindly provided by the Japanese Aerospace Exploration Agency, under the third Research Announcement Program on Earth Observation, with project ER3A2N097.

AUTHOR INFORMATION
Dinh Ho Tong Minh ([email protected]) is with UMR TETIS, INRAE, University of Montpellier, 34090 Montpellier, France. He is a Member of IEEE.
Stefano Tebaldini ([email protected]) is with Dipartimento di Elettronica, Informazione, e Bioingegneria, Politecnico di Milano, 20133 Milano, Italy. He is a Senior Member of IEEE.

REFERENCES
[1] D. Ho Tong Minh, R. Hanssen, and F. Rocca, "Radar interferometry: 20 years of development in time series techniques and future perspectives," Remote Sens., vol. 12, no. 9, Apr. 2020, Art. no. 1364, doi: 10.3390/rs12091364. [Online]. Available: https://www.mdpi.com/2072-4292/12/9/1364
[2] R. F. Hanssen, Radar Interferometry: Data Interpretation and Error Analysis. Dordrecht, The Netherlands: Kluwer, 2001.
[3] H. A. Zebker and J. Villasenor, "Decorrelation in interferometric radar echoes," IEEE Trans. Geosci. Remote Sens., vol. 30, no. 5, pp. 950–959, Sep. 1992, doi: 10.1109/36.175330.
[4] A. Ferretti, C. Prati, and F. Rocca, "Permanent scatterers in SAR interferometry," IEEE Trans. Geosci. Remote Sens., vol. 39, no. 1, pp. 8–20, Jan. 2001, doi: 10.1109/36.898661.


[5] A. Hooper, H. Zebker, P. Segall, and B. Kampes, "A new method for measuring deformation on volcanoes and other natural terrains using InSAR persistent scatterers," Geophys. Res. Lett., vol. 31, no. 23, pp. 1–5, Dec. 2004, doi: 10.1029/2004GL021737.
[6] P. Berardino, G. Fornaro, R. Lanari, and E. Sansosti, "A new algorithm for surface deformation monitoring based on small baseline differential SAR interferograms," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 11, pp. 2375–2383, Dec. 2002, doi: 10.1109/TGRS.2002.803792.
[7] M.-P. Doin et al., "Presentation of the small baseline NSBAS processing chain on a case example: The Etna deformation monitoring from 2003 to 2010 using Envisat data," in Proc. Fringe ESA Conf. Workshop, Frascati, Italy, 2011, pp. 1–8.
[8] Z. Yunjun, H. Fattahi, and F. Amelung, "Small baseline InSAR time series analysis: Unwrapping error correction and noise reduction," Comput. Geosci., vol. 133, Dec. 2019, Art. no. 104331, doi: 10.1016/j.cageo.2019.104331. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0098300419304194
[9] D. Ho Tong Minh, R. Hanssen, M.-P. Doin, and E. Pathier, Advanced Methods for Time-Series InSAR. Hoboken, NJ, USA: Wiley, 2022, ch. 5, pp. 125–153.
[10] A. M. Guarnieri and S. Tebaldini, "On the exploitation of target statistics for SAR interferometry applications," IEEE Trans. Geosci. Remote Sens., vol. 46, no. 11, pp. 3436–3443, Nov. 2008, doi: 10.1109/TGRS.2008.2001756.
[11] A. Ferretti, A. Fumagalli, F. Novali, C. Prati, F. Rocca, and A. Rucci, "A new algorithm for processing interferometric data-stacks: SqueeSAR," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 9, pp. 3460–3470, Sep. 2011, doi: 10.1109/TGRS.2011.2124465.
[12] R. Battiti and F. Masulli, BFGS Optimization for Faster and Automated Supervised Learning. Dordrecht, The Netherlands: Springer Netherlands, 1990, pp. 757–760.
[13] N. Cao, H. Lee, and H. C. Jung, "Mathematical framework for phase-triangulation algorithms in distributed-scatterer interferometry," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 9, pp. 1838–1842, Sep. 2015, doi: 10.1109/LGRS.2015.2430752.
[14] G. Fornaro, S. Verde, D. Reale, and A. Pauciullo, "CAESAR: An approach based on covariance matrix decomposition to improve multibaseline-multitemporal interferometric SAR processing," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 4, pp. 2050–2065, Apr. 2015, doi: 10.1109/TGRS.2014.2352853.
[15] H. Ansari, F. De Zan, and R. Bamler, "Efficient phase estimation for interferogram stacks," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 7, pp. 4109–4125, Jul. 2018, doi: 10.1109/TGRS.2018.2826045.
[16] D. Ho Tong Minh and Y.-N. Ngo, "Compressed SAR interferometry in the big data era," Remote Sens., vol. 14, no. 2, pp. 1–13, Jan. 2022, doi: 10.3390/rs14020390. [Online]. Available: https://www.mdpi.com/2072-4292/14/2/390
[17] S. Zwieback, "Cheap, valid regularizers for improved interferometric phase linking," IEEE Geosci. Remote Sens. Lett., vol. 19, pp. 1–4, Aug. 2022, doi: 10.1109/LGRS.2022.3197423.
[18] J. Biggs, T. Wright, Z. Lu, and B. Parsons, "Multi-interferogram method for measuring interseismic deformation: Denali fault, Alaska," Geophys. J. Int., vol. 170, no. 3, pp. 1165–1179, Sep. 2007, doi: 10.1111/j.1365-246X.2007.03415.x.


[19] F. De Zan, A. Parizzi, P. Prats-Iraola, and P. López-Dekker, "A SAR interferometric model for soil moisture," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 1, pp. 418–425, Jan. 2014, doi: 10.1109/TGRS.2013.2241069.
[20] S. Tebaldini, F. Rocca, M. Mariotti d'Alessandro, and L. Ferro-Famil, "Phase calibration of airborne tomographic SAR data via phase center double localization," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 3, pp. 1775–1792, Mar. 2016, doi: 10.1109/TGRS.2015.2488358.
[21] H. Ansari, F. De Zan, and A. Parizzi, "Study of systematic bias in measuring surface deformation with SAR interferometry," IEEE Trans. Geosci. Remote Sens., vol. 59, no. 2, pp. 1285–1301, Feb. 2021, doi: 10.1109/TGRS.2020.3003421.
[22] H. Yu, Y. Lan, Z. Yuan, J. Xu, and H. Lee, "Phase unwrapping in InSAR: A review," IEEE Geosci. Remote Sens. Mag., vol. 7, no. 1, pp. 40–58, Mar. 2019, doi: 10.1109/MGRS.2018.2873644.
[23] D. Ho Tong Minh, T. Le Toan, F. Rocca, S. Tebaldini, M. Mariotti d'Alessandro, and L. Villard, "Relating P-band synthetic aperture radar tomography to tropical forest biomass," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 2, pp. 967–979, Feb. 2014, doi: 10.1109/TGRS.2013.2246170.
[24] A. Reigber and A. Moreira, "First demonstration of airborne SAR tomography using multibaseline L-band data," IEEE Trans. Geosci. Remote Sens., vol. 38, no. 5, pp. 2142–2152, Sep. 2000, doi: 10.1109/36.868873.
[25] S. Tebaldini, D. Ho Tong Minh, M. M. d'Alessandro, L. Villard, T. Le Toan, and J. Chave, "The status of technologies to measure forest biomass and structural properties: State of the art in SAR tomography of tropical forests," Surv. Geophys., vol. 40, no. 4, pp. 779–801, May 2019, doi: 10.1007/s10712-019-09539-7.
[26] M. Pardini, M. Tello, V. Cazcarra-Bes, K. P. Papathanassiou, and I. Hajnsek, "L- and P-band 3-D SAR reflectivity profiles versus lidar waveforms: The AfriSAR case," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 10, pp. 1–16, Oct. 2018, doi: 10.1109/JSTARS.2018.2847033.
[27] I. El Moussawi et al., "Monitoring tropical forest structure using SAR tomography at L- and P-band," Remote Sens., vol. 11, no. 16, Aug. 2019, Art. no. 1934, doi: 10.3390/rs11161934. [Online]. Available: https://www.mdpi.com/2072-4292/11/16/1934
[28] D. Ho Tong Minh, Y.-N. Ngo, and T. T. Lê, "Potential of P-band SAR tomography in forest type classification," Remote Sens., vol. 13, no. 4, Feb. 2021, Art. no. 696, doi: 10.3390/rs13040696. [Online]. Available: https://www.mdpi.com/2072-4292/13/4/696
[29] D. Ho Tong Minh et al., "SAR tomography for the retrieval of forest biomass and height: Cross-validation at two tropical forest sites in French Guiana," Remote Sens. Environ., vol. 175, pp. 138–147, Mar. 2016, doi: 10.1016/j.rse.2015.12.037.
[30] Y.-N. Ngo, Y. Huang, D. H. T. Minh, L. Ferro-Famil, I. Fayad, and N. Baghdadi, "Tropical forest vertical structure characterization: From GEDI to P-band SAR tomography," IEEE Geosci. Remote Sens. Lett., vol. 19, pp. 1–5, Sep. 2022, doi: 10.1109/LGRS.2022.3208744.
[31] I. El Moussawi et al., "L-band UAVSAR tomographic imaging in dense forests: Gabon forests," Remote Sens., vol. 11, no. 5, Feb. 2019, Art. no. 475, doi: 10.3390/rs11050475. [Online]. Available: https://www.mdpi.com/2072-4292/11/5/475
[32] V. Wasik, P. C. Dubois-Fernandez, C. Taillandier, and S. S. Saatchi, "The AfriSAR campaign: Tomographic analysis with phase-screen correction for P-band acquisitions," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 10, pp. 1–13, Oct. 2018, doi: 10.1109/JSTARS.2018.2831441.
[33] S. Tebaldini, "Algebraic synthesis of forest scenarios from multibaseline PolInSAR data," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 12, pp. 4132–4142, Dec. 2009, doi: 10.1109/TGRS.2009.2023785.
[34] K. Goel and N. Adam, "A distributed scatterer interferometry approach for precision monitoring of known surface deformation phenomena," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 9, pp. 5454–5468, Sep. 2014, doi: 10.1109/TGRS.2013.2289370.
[35] D. Ho Tong Minh, L. Van Trung, and T. L. Toan, "Mapping ground subsidence phenomena in Ho Chi Minh City through the radar interferometry technique using ALOS PALSAR data," Remote Sens., vol. 7, no. 7, pp. 8543–8562, Jul. 2015, doi: 10.3390/rs70708543. [Online]. Available: https://www.mdpi.com/2072-4292/7/7/8543
[36] N. Cao, H. Lee, and H. C. Jung, "A phase-decomposition-based PSInSAR processing method," IEEE Trans. Geosci. Remote Sens., vol. 54, no. 2, pp. 1074–1090, Feb. 2016, doi: 10.1109/TGRS.2015.2473818.
[37] J. Cohen-Waeber, R. Bürgmann, E. Chaussard, C. Giannico, and A. Ferretti, "Spatiotemporal patterns of precipitation-modulated landslide deformation from independent component analysis of InSAR time series," Geophys. Res. Lett., vol. 45, no. 4, pp. 1878–1887, Feb. 2018, doi: 10.1002/2017GL075950. [Online]. Available: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1002/2017GL075950
[38] B. Fruneau, D. Ho Tong Minh, and D. Raucoules, Anthropogenic Activity: Monitoring Surface-Motion Consequences of Human Activities With Spaceborne InSAR. Hoboken, NJ, USA: Wiley, 2022, ch. 1, pp. 283–313.
[39] D. Ho Tong Minh et al., "Measuring ground subsidence in Ha Noi through the radar interferometry technique using TerraSAR-X and Cosmo-SkyMed data," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 10, pp. 3874–3884, Oct. 2019, doi: 10.1109/JSTARS.2019.2937398.
[40] M. Jiang, X. Ding, R. F. Hanssen, R. Malhotra, and L. Chang, "Fast statistically homogeneous pixel selection for covariance matrix estimation for multitemporal InSAR," IEEE Trans. Geosci. Remote Sens., vol. 53, no. 3, pp. 1213–1224, Mar. 2015, doi: 10.1109/TGRS.2014.2336237.
[41] X. Lv, B. Yazıcı, M. Zeghal, V. Bennett, and T. Abdoun, "Joint-scatterer processing for time-series InSAR," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 11, pp. 7205–7221, Nov. 2014, doi: 10.1109/TGRS.2014.2309346.
[42] R. Shamshiri, H. Nahavandchi, M. Motagh, and A. Hooper, "Efficient ground surface displacement monitoring using Sentinel-1 data: Integrating distributed scatterers (DS) identified using two-sample t-test with persistent scatterers (PS)," Remote Sens., vol. 10, no. 5, May 2018, Art. no. 794, doi: 10.3390/rs10050794. [Online]. Available: https://www.mdpi.com/2072-4292/10/5/794


[43] K. Spaans and A. Hooper, "InSAR processing for volcano monitoring and other near-real time applications," J. Geophys. Res., Solid Earth, vol. 121, no. 4, pp. 2947–2960, Apr. 2016, doi: 10.1002/2015JB012752. [Online]. Available: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1002/2015JB012752
[44] A. B. Narayan, A. Tiwari, R. Dwivedi, and O. Dikshit, "A novel measure for categorization and optimal phase history retrieval of distributed scatterers for InSAR applications," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 10, pp. 5843–5849, Oct. 2018, doi: 10.1109/TGRS.2018.2826842.
[45] J. Engelbrecht and M. R. Inggs, "Coherence optimization and its limitations for deformation monitoring in dynamic agricultural environments," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 12, pp. 5647–5654, Dec. 2016, doi: 10.1109/JSTARS.2016.2593946.
[46] A. G. Mullissa, D. Perissin, V. A. Tolpekin, and A. Stein, "Polarimetry-based distributed scatterer processing method for PSI applications," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 6, pp. 3371–3382, Jun. 2018, doi: 10.1109/TGRS.2018.2798705.
[47] G. Krieger et al., "Advanced L-band SAR system concepts for high-resolution ultra-wide-swath SAR imaging," in Proc. ESA Adv. RF Sens. Remote Sens. Instrum. (ARSI), Sep. 2017. [Online]. Available: https://elib.dlr.de/113598/
[48] J. Mittermayer, G. Krieger, A. Bojarski, M. Zonno, and A. Moreira, "A MirrorSAR case study based on the X-band high resolution wide swath satellite (HRWS)," in Proc. Eur. Conf. Synthetic Aperture Radar, 2021, pp. 1–6.
[49] R. Torres et al., "GMES Sentinel-1 mission," Remote Sens. Environ., vol. 120, pp. 9–24, May 2012, doi: 10.1016/j.rse.2011.05.028. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0034425712000600
[50] P. A. Rosen, Y. Kim, R. Kumar, T. Misra, R. Bhan, and V. R. Sagi, "Global persistent SAR sampling with the NASA-ISRO SAR (NISAR) mission," in Proc. IEEE Radar Conf. (RadarConf), May 2017, pp. 0410–0414, doi: 10.1109/RADAR.2017.7944237.
[51] H. Ansari, F. De Zan, and R. Bamler, "Sequential estimator: Toward efficient InSAR time series analysis," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 10, pp. 5637–5652, Oct. 2017, doi: 10.1109/TGRS.2017.2711037.
[52] I. T. Jolliffe, Principal Component Analysis. New York, NY, USA: Springer-Verlag, 2002.
[53] L. R. Brigitte and H. Rouanet, Geometric Data Analysis, From Correspondence Analysis to Structured Data Analysis. Dordrecht, The Netherlands: Kluwer, 2004.
[54] D. Ho Tong Minh and Y.-N. Ngo, "ComSAR: A new algorithm for processing big data SAR interferometry," in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), 2021, pp. 820–823, doi: 10.1109/IGARSS47720.2021.9553675.
[55] M. Pandey et al., "The transformational role of GPU computing and deep learning in drug discovery," Nature Mach. Intell., vol. 4, no. 3, pp. 211–221, Mar. 2022, doi: 10.1038/s42256-022-00463-x.
[56] D. Ho Tong Minh and Y.-N. Ngo, "TomoSAR platform supports for Sentinel-1 TOPS persistent scatterers interferometry," in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), Jul. 2017, pp. 1680–1683, doi: 10.1109/IGARSS.2017.8127297.


[57] Y. Xu et al., "Artificial intelligence: A powerful paradigm for scientific research," Innovation, vol. 2, no. 4, Nov. 2021, Art. no. 100179, doi: 10.1016/j.xinn.2021.100179. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2666675821001041
[58] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, May 2015, doi: 10.1038/nature14539.
[59] R. Baraniuk, D. Donoho, and M. Gavish, "The science of deep learning," Proc. Nat. Acad. Sci., vol. 117, no. 48, pp. 30,029–30,032, Nov. 2020, doi: 10.1073/pnas.2020596117. [Online]. Available: https://www.pnas.org/doi/abs/10.1073/pnas.2020596117
[60] D. Ho Tong Minh et al., "Deep recurrent neural networks for winter vegetation quality mapping via multitemporal SAR Sentinel-1," IEEE Geosci. Remote Sens. Lett., vol. 15, no. 3, pp. 464–468, Mar. 2018, doi: 10.1109/LGRS.2018.2794581.
[61] D. Ienco, R. Interdonato, R. Gaetano, and D. Ho Tong Minh, "Combining Sentinel-1 and Sentinel-2 satellite image time series for land cover mapping via a multi-source deep learning architecture," ISPRS J. Photogrammetry Remote Sens., vol. 158, pp. 11–22, Dec. 2019, doi: 10.1016/j.isprsjprs.2019.09.016. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0924271619302278
[62] Z. Lin, K. Ji, X. Leng, and G. Kuang, "Squeeze and excitation rank faster R-CNN for ship detection in SAR images," IEEE Geosci. Remote Sens. Lett., vol. 16, no. 5, pp. 751–755, May 2019, doi: 10.1109/LGRS.2018.2882551.
[63] H. Li, F. Zhu, X. Zheng, M. Liu, and G. Chen, "MSCDUNet: A deep learning framework for built-up area change detection integrating multispectral, SAR, and VHR data," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 15, pp. 5163–5176, Jun. 2022, doi: 10.1109/JSTARS.2022.3181155.
[64] B. Rouet-Leduc, R. Jolivet, M. Dalaison, P. A. Johnson, and C. Hulbert, "Autonomous extraction of millimeter-scale deformation in InSAR time series using deep learning," Nature Commun., vol. 12, Nov. 2021, Art. no. 6480.
[65] Y. J. E. Gbodjo, O. Montet, D. Ienco, R. Gaetano, and S. Dupuy, "Multisensor land cover classification with sparsely annotated data based on convolutional neural networks and self-distillation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 11,485–11,499, Oct. 2021, doi: 10.1109/JSTARS.2021.3119191.
[66] C. Zhang et al., "Blind super-resolution for SAR images with speckle noise based on deep learning probabilistic degradation model and SAR priors," Remote Sens., vol. 15, no. 2, Jan. 2023, Art. no. 330, doi: 10.3390/rs15020330. [Online]. Available: https://www.mdpi.com/2072-4292/15/2/330
[67] E. Ndikumana, D. Ho Tong Minh, N. Baghdadi, D. Courault, and L. Hossard, "Deep recurrent neural network for agricultural classification using multitemporal SAR Sentinel-1 for Camargue, France," Remote Sens., vol. 10, no. 8, Aug. 2018, Art. no. 1217, doi: 10.3390/rs10081217. [Online]. Available: https://www.mdpi.com/2072-4292/10/8/1217


[68] A. G. Mullissa, D. Marcos, D. Tuia, M. Herold, and J. Reiche, "deSpeckNet: Generalizing deep learning based SAR image despeckling," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–15, 2022, doi: 10.1109/TGRS.2020.3042694.
[69] W. Wu, H. Li, L. Zhang, X. Li, and H. Guo, "High-resolution PolSAR scene classification with pretrained deep convnets and manifold polarimetric parameters," IEEE Trans. Geosci. Remote Sens., vol. 56, no. 10, pp. 6159–6168, Oct. 2018, doi: 10.1109/TGRS.2018.2833156.
[70] Z. Wu, T. Wang, Y. Wang, R. Wang, and D. Ge, "Deep-learning-based phase discontinuity prediction for 2-D phase unwrapping of SAR interferograms," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–16, 2022, doi: 10.1109/TGRS.2021.3121906.
[71] L. Zhou, H. Yu, Y. Lan, and M. Xing, "Deep learning-based branch-cut method for InSAR two-dimensional phase unwrapping," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–15, 2022, doi: 10.1109/TGRS.2021.3099997.
[72] T. Yang, H. Yu, and Y. Wang, "Selection of persistent scatterers with a deep convolutional neural network," in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), 2022, pp. 2912–2915, doi: 10.1109/IGARSS46834.2022.9884199.
[73] L. Zhang and B. Du, "Deep learning for remote sensing data: A technical tutorial on the state of the art," IEEE Geosci. Remote Sens. Mag., vol. 4, no. 2, pp. 22–40, Jun. 2016, doi: 10.1109/MGRS.2016.2540798.
[74] J. A. Barrachina, "Negu93/CVNN: Complex-valued neural networks," Zenodo, Nov. 2022, doi: 10.5281/zenodo.7303587.
[75] T. Scarnati and B. Lewis, "Complex-valued neural networks for synthetic aperture radar image classification," in Proc. IEEE Radar Conf. (RadarConf), 2021, pp. 1–6, doi: 10.1109/RadarConf2147009.2021.9455316.
[76] Y. Zhang, H. Yuan, H. Li, C. Wei, and C. Yao, "Complex-valued graph neural network on space target classification for defocused ISAR images," IEEE Geosci. Remote Sens. Lett., vol. 19, pp. 1–5, Jun. 2022, doi: 10.1109/LGRS.2022.3185709.
[77] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," 2015. [Online]. Available: https://arxiv.org/abs/1505.04597
[78] H. H. Zeyada, M. S. Mostafa, M. M. Ezz, A. H. Nasr, and H. M. Harb, "Resolving phase unwrapping in interferometric synthetic aperture radar using deep recurrent residual U-Net," Egyptian J. Remote Sens. Space Sci., vol. 25, no. 1, pp. 1–10, Feb. 2022, doi: 10.1016/j.ejrs.2021.12.001. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1110982321000958
[79] F. Zhou, H. Zhou, Z. Yang, and L. Gu, "IF2CNN: Towards non-stationary time series feature extraction by integrating iterative filtering and convolutional neural networks," Expert Syst. Appl., vol. 170, May 2021, Art. no. 114527, doi: 10.1016/j.eswa.2020.114527. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0957417420311714


There Are No Data Like More Data
Datasets for deep learning in Earth observation
MICHAEL SCHMITT, SEYED ALI AHMADI, YONGHAO XU, GÜLŞEN TAŞKIN, UJJWAL VERMA, FRANCESCOPAOLO SICA, AND RONNY HÄNSCH

Digital Object Identifier 10.1109/MGRS.2023.3293459
Date of current version: 8 August 2023

Carefully curated and annotated datasets are the foundation of machine learning (ML), with particularly data-hungry deep neural networks forming the core of what is often called artificial intelligence (AI). Due to the massive success of deep learning (DL) applied to Earth observation (EO) problems, the focus of the community has been largely on the development of ever-more sophisticated deep neural network architectures and training strategies. For that purpose, numerous task-specific datasets have been created that were largely ignored by previously published review articles on AI for EO. With this article, we want to change the perspective and put ML datasets dedicated to EO data and applications into the spotlight. Based on a review of historical developments, currently available resources are described and a perspective for future developments is formed. We hope to contribute to an understanding that the nature of our data is what distinguishes the EO community from many other communities that apply DL techniques to image data, and that a detailed understanding of EO data peculiarities is among the core competencies of our discipline.

INTRODUCTION
DL techniques have enabled drastic improvements in many scientific fields, especially in those dedicated to the analysis of image data, e.g., computer vision or remote sensing. Although it was possible to train shallow learning approaches on comparably small datasets, deep learning requires large-scale datasets to reach the desired accuracy and generalization performance.


Therefore, the availability of annotated datasets has become a dominating factor for many cases of modern EO data analysis that develop and evaluate powerful, DL-based techniques for the automated interpretation of remote sensing data.

The main goal of general computer vision is the analysis of optical images, such as photos, which contain everyday objects, e.g., furniture, animals, or road signs. Remote sensing involves a larger variety of sensor modalities and image analysis tasks than conventional computer vision, rendering the annotation of remote sensing data more difficult and costly. Besides classical optical images, multi- or hyperspectral sensors and different kinds of infrared sensors are used; active sensor technologies such as laser scanning, microwave altimeters, and synthetic aperture radar (SAR) are regularly used, too. The fields of application range from computer vision-like tasks, such as object detection and classification, to semantic segmentation (mainly for land cover mapping) to specialized regression tasks grounded in the physics of the used remote sensing system. To provide an illustrative example, a dataset for biomass regression from interferometric SAR data will adopt imagery and annotations very different from the ones needed for the semantic segmentation of urban land cover types from multispectral optical data. Thus, although extensive image databases, such as ImageNet, were created more than 10 years ago and form the backbone of many modern ML developments in computer vision, there is still no similar dataset or backbone network in remote sensing. (Note that, as a prime example of an annotated computer vision dataset, ImageNet contains more than 14 million images depicting objects from more than 20,000 categories.) This lack of generality renders the generation of an ImageNet-like general EO dataset extremely complicated and thus costly: instead of photographs openly accessible on the Internet, many different, and sometimes quite expensive, sensor data would have to be acquired and, instead of "mechanical turks," trained EO experts would have to be hired to link these different sensor data to the multitude of different domain- and task-specific annotations (see "The Mechanical Turk"). Therefore, until now, the trend in ML applied to EO data has been characterized by the generation of numerous remote sensing datasets, each consisting of a particular combination of sensor modalities, applications, and geographic locations. Yet, a review of these developments is still missing in the literature. The only articles that make a small step toward a general review of benchmark datasets are [1], [2], [3], and [4]. All of them provide some sort of review; however, they are always limited to a very narrow aspect, e.g., object detection or scene classification. Furthermore, their focus is on ML approaches and their corresponding datasets, while the historical evolution of datasets is neither discussed in detail nor from a sensor- and task-agnostic point of view. As an extension of our 2021 IEEE International Symposium on Geoscience and Remote Sensing contribution [5], this article intends to close this gap by
◗◗ reviewing current developments in the creation of datasets for DL applications in remote sensing and EO
◗◗ structuring existing datasets and discussing their properties
◗◗ providing a perspective on future requirements.
In this context, we additionally present the Earth Observation Database (EOD) [6], which is the result of the effort and cooperation of voluntary scientists within the IEEE Geoscience and Remote Sensing Society (GRSS) Image Analysis and Data Fusion (IADF) Technical Committee (TC). This database aims to function as a centralized tool that organizes the meta information about existing datasets in a community-driven manner.

The Mechanical Turk
The name mechanical turk comes from a fraudulent chess-playing machine developed in the 18th century. Chess players were made to believe they played against the machine but were in fact competing against a person hidden inside it. Today, the term mostly refers to Amazon Mechanical Turk (MTurk), a crowdsourcing website run by Amazon. On MTurk, users can hire remotely located crowdworkers to perform desired tasks. MTurk is frequently used to create manual annotations for supervised machine learning tasks.

EVOLUTION OF EO-ORIENTED ML DATASETS

HISTORICAL DEVELOPMENT
High-quality benchmark datasets have played an increasingly important role in EO for quite some time and are one of the driving factors for the recent success of DL approaches to analyze remote sensing data. As such, they can be seen as a tool complementary to methodological advancements to push accuracy, robustness, and generalizability. This section reviews and summarizes the historical development of EO-oriented ML datasets to provide insights into the evolution of this "tool," ranging from its historical beginnings to the current state of the art.

The beginnings of ML applied to remote sensing focused on specific applications. Datasets were mainly built by considering a very localized study site, a few specific sensor modalities, and a relatively small number of acquired samples. Therefore, the first datasets were relatively small compared to what is now considered a benchmarking dataset. Training, validation, and testing samples were often taken from the same image. Even with the use of sophisticated shallow learning models, but especially since the advent of DL, such small datasets were no longer sufficient for proper training and evaluation. The need for extended datasets has led to the creation of larger datasets containing multiple images, often acquired at different geographic locations.


Figure 1 illustrates the evolution of benchmark datasets for ML in EO by showing this temporal development.

FIGURE 1. Distribution of remote sensing datasets over the years. The x-axis shows the publication year (the datasets are placed within the region of their publication year with a small random offset to minimize visual overlap in the graph), while the y-axis represents the volume of each dataset in gigabytes on a logarithmic scale. The circle radius indicates the dataset size in terms of the number of pixels. Colors denote the type of task addressed by a dataset (change detection; superresolution/pansharpening; object detection; classification; semantic segmentation; change detection and classification; object detection and classification; object detection and semantic segmentation; classification and semantic segmentation; and others). Each circle is accompanied by an index, allowing for identification of the dataset in the database (see Table 2), which provides further information.

To provide a broad view about the recent evolution of benchmarking datasets, we gathered an extensive list of datasets available in the EO community, resulting in a large collection with currently 400 (i.e., 380 image-based

and 20 point cloud-based datasets) entries (see Tables 1 and 2), including related metadata. We point out that, although extensive, this list is far from complete because a large number of new datasets are published every year. Furthermore, the metadata required to generate the plot of Figure 1 are available only for a subset of 290 datasets (roughly 73%). On the horizontal axis, we indicate the year of publication. The vertical axis shows the volume of a dataset, while the circle radius reflects the number of spatial pixels covered by a dataset. For a more detailed explanation of how we measure dataset size,

please refer to Figure S1 and "How to Measure the Size of a Dataset." Figure 1 provides a straightforward overview of the relation between size and spatial dimension, and therefore of the overall information content given by features such as resolution, sensor modalities, number of bands/channels, and so on. Each circle is accompanied by an index, allowing for identification of the dataset in the database (see Table 2), which provides further information. Note that we use the category "Others" for datasets that do not belong to any of the other categories and are too rare to

TABLE 1. ALTHOUGH A THOROUGH ANALYSIS OF LIDAR DATASETS IS BEYOND THE SCOPE OF THIS SURVEY, WE DO PROVIDE AN OVERVIEW OF SEVERAL EXAMPLE DATASETS. POINT CLOUD DATASETS ARE ANOTHER LARGE GROUP OF BENCHMARK DATA THAT ARE WIDELY USED IN THE LITERATURE AND INDUSTRY. WITHIN EO THE MOST COMMON SOURCE FOR POINT CLOUD DATA ARE LIDAR SENSORS THAT USE LIGHT IN THE FORM OF LASER PULSES TO MEASURE THE DISTANCE TO THE SURFACE. THE PRIMARY SOURCES ARE AIRBORNE LASER SCANNING (ALS), TERRESTRIAL LASER SCANNING (TLS), AND MOBILE LASER SCANNING (MLS) DEVICES. OTHER SOURCES OF POINT CLOUDS AND 3D DATA INCLUDE PHOTOGRAMMETRIC METHODS (STRUCTURE FROM MOTION, MULTI-VIEW STEREO, AND DENSE MATCHING APPROACHES) AND TOMOGRAPHIC SAR. AS 3D DATA TYPICALLY COME WITH FEATURES THAT ARE VERY DIFFERENT FROM 2D IMAGE DATA, SUCH DATASETS ARE BEYOND THE SCOPE OF THIS ARTICLE. NEVERTHELESS, TABLE 1 PROVIDES A SHORT LIST OF EXAMPLE LIDAR/POINT CLOUD DATASETS FOR INTERESTED READERS.

TASK | PLATFORM | TIMESTAMPS | NAME | PUBLICATION DATE | POINT DENSITY (POINTS/M²) | NUMBER OF CLASSES | NUMBER OF POINTS | VOLUME (MB)
Change detection | ALS | Multiple | Abenberg ALS | 2013 | 16 | — | 5,400,000 | 258
Classification | ALS | Single | NEWFOR | 2015 | Varies | Four | — | 97
Classification | ALS | Single | DFC19 | 2019 | — | Six | 167,400,000 | 613
Classification | ALS | Single | ISPRS 3D Vaihingen | 2014 | 8 | Nine | 780,879 | —
Classification | Multiple | Single | ArCH | 2020 | Varies | 10 | 136,138,423 | —
Classification/semantic segmentation | ALS | Single | DublinCity | 2019 | 240–348 | 13 | 260,000,000 | 3,000
Filtering | ALS | Single | OpenGF | 2021 | 6 and 14 | Three | 542,100,000 | 2,280
Object detection/semantic segmentation | TLS | Single | LiSurveying | 2021 | Varies | 54 | 2,450,000,000 | —
Others | ALS | Single | RoofN3D | 2018 | 4.72 | Three | 118,100 | —
Semantic segmentation | ALS | Single | LASDU | 2020 | 3–4 | Six | 3,120,000 | —
Semantic segmentation | ALS | Single | DALES | 2020 | 50 | Eight | 505,000,000 | 4,000
Semantic segmentation | ALS | Single | DALES Object | 2021 | 50 | Eight | 492,000,000 | 5,000
Semantic segmentation | Drone | Single | Campus3D | 2020 | Varies | 24 | 937,000,000 | 2,500
Semantic segmentation | Drone | Multiple | Hessigheim 3D | 2021 | 800 | 11 | 73,909,354 | 5,950
Semantic segmentation | Drone | Single | WildForest3D | 2022 | 60 | Six | 7,000,000 | 81
Semantic segmentation | MLS | Single | Toronto3D | 2020 | 1,000 | Eight | 78,300,000 | 1,100
Semantic segmentation | MLS | Multiple | HelixNet | 2022 | — | Nine | 8,850,000,000 | 235,700
Semantic segmentation | Photogrammetry | Single | SensatUrban | 2020 | — | 13 | 2,847,000,000 | 36,000
Semantic segmentation | Photogrammetry | Single | STPLS3D | 2022 | — | 20 | — | 36,600 (images: 700,000)
Semantic segmentation | TLS | Single | Semantic 3D | 2017 | — | Eight | 4,000,000,000 | 23,940
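Table 1 reports point densities in points/m²; where a dataset does not state this figure, it can be approximated from the tile extent, as in the following minimal Python sketch (the synthetic tile and the helper function are illustrative, not part of any dataset's tooling).

```python
# A minimal sketch of how the "point density (points/m^2)" column of Table 1
# can be estimated for a lidar tile; `points` is a hypothetical (N, 3) array
# of x/y/z coordinates in a metric (e.g., UTM) coordinate system.
import numpy as np

def point_density(points: np.ndarray) -> float:
    """Approximate density as point count over the horizontal bounding-box area."""
    x, y = points[:, 0], points[:, 1]
    area = (x.max() - x.min()) * (y.max() - y.min())  # footprint in m^2
    return len(points) / area

rng = np.random.default_rng(0)
demo = rng.uniform([0, 0, 0], [100.0, 100.0, 30.0], size=(50_000, 3))
print(f"{point_density(demo):.1f} points/m^2")  # ~5 for this synthetic tile
```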


TABLE 2. ALTHOUGH THE COMPLETE COLLECTION OF 380 DATASETS IS TOO EXTENSIVE TO BE INCLUDED IN THE PRINT VERSION OF THIS ARTICLE, IT CAN BE FOUND AT [85]. HERE WE INCLUDE A PORTION OF THIS LIST, WHICH CONTAINS ALL THE DATASETS MENTIONED IN THE ARTICLE, ALONG WITH THEIR INDEX IN THE COMPLETE DATABASE.

(Only the index, task, name, and publication date columns of the printed table could be reliably recovered here; the remaining per-dataset metadata — platform, sensor type, timestamps, image size, number of images, number of classes, size, and volume — are available in the complete database [85].)

INDEX | TASK | NAME | PUBLICATION DATE
1 | Change detection | DFC21-MSD | 2021
10 | Change detection | DFC09 | 2009
11 | Change detection | OneraCD | 2018
15 | Change detection | LEVIR-CD | 2020
21 | Classification | Indian Pines | 2000
22 | Classification | Salinas | 2000
24 | Classification | Kennedy Space Center | 2005
34 | Classification | AIDER | 2019
40 | Classification | BigEarthNet-MM | 2019
52 | Classification | EuroSAT | 2018
63 | Classification | University of California, Merced | 2010
69 | Classification | AID | 2017
76 | Classification | FMoW | 2018
98 | Cloud removal | SEN12MS-CR-TS | 2021
105 | Object detection | SZTAKI AirChange | 2012
109 | Object detection | DOTA v1.0 | 2018
114 | Object detection | DOTA v2.0 | 2020
117 | Object detection | RSOC | 2020
127 | Object detection | BIRDSAI | 2020
138 | Object detection | DroneCrowd | 2022
151 | Object detection | SpaceNet-4 | 2018
162 | Object detection | AI-TOD | 2020
191 | Object detection | xView3-SAR | 2021
204 | Object detection/classification | FUSAR-Ship | 2020
208 | Semantic segmentation (aerial) | DFC08 | 2008
211 | Semantic segmentation (aerial) | HOSD | 2022
212 | Semantic segmentation (aerial) | DFC13 | 2013
213 | Semantic segmentation (aerial) | DFC14 | 2014
214 | Semantic segmentation (aerial) | DFC15 | 2015
215 | Semantic segmentation (aerial) | DFC18 | 2018
217 | Semantic segmentation (aerial) | DFC22-SSL | 2022
219 | Semantic segmentation (aerial) | ISPRS 2D-Potsdam | 2011
220 | Semantic segmentation (aerial) | ISPRS 2D-Vaihingen | 2011
246 | Semantic segmentation (multiple) | DFC17 | 2017
247 | Semantic segmentation (multiple) | SpaceNet-6 | 2020
253 | Semantic segmentation (multiple) | OpenEarthMap | 2022
254 | Semantic segmentation (satellite) | DFC07 | 2007
257 | Semantic segmentation (satellite) | DFC20 | 2020
261 | Semantic segmentation (satellite) | DFC21-DSE | 2021
264 | Semantic segmentation (satellite) | MapInWild | 2022
282 | Semantic segmentation (satellite) | OpenSentinelMap | 2022
286 | Semantic segmentation (satellite) | SpaceNet-1 | 2016
287 | Semantic segmentation (satellite) | SpaceNet-2 | 2017
288 | Semantic segmentation (satellite) | SpaceNet-3 | 2017
292 | Semantic segmentation (satellite) | SpaceNet-5 | 2019
294 | Semantic segmentation (satellite) | SpaceNet-7 | 2020
301 | Semantic segmentation (satellite) | SpaceNet-8 | 2022
312 | Semantic segmentation, classification | SEN12MS | 2019
313 | Semantic segmentation, classification | DFC23 | 2023
316 | Superresolution, pansharpening | DFC06 | 2006
318 | Superresolution, pansharpening | Proba-V Superresolution | 2018
319 | Superresolution, pansharpening | PAirMax (Airbus) | 2021
324 | Superresolution, pansharpening | PAirMax (Maxar) | 2021
325 | Superresolution, pansharpening | WorldStrat | 2022
340 | Others | DFC19 | 2019
360 | Others | DFC16 | 2016

form their own category. Examples in the “Others” category are datasets on cloud removal, visual question answering, and parameter-estimation tasks such as hurricane wind speed prediction, satellite pose estimation, and vegetation phenological change monitoring. The dashed line illustrates an exponential growth of benchmark datasets created by and for the EO community. This map of the evolution of remote sensing datasets offers several interesting insights:
◗◗ The beginnings: In addition to the first IEEE GRSS Data Fusion Contest in 2006 (Table 2; number 316), there are a few other pioneering datasets that fostered ML research applied to remote sensing data in its early stages, e.g.,
•• Hyperspectral datasets [Indian Pines, Salinas Valley, and Kennedy Space Center (Table 2; 21, 22, and 24)]: Published before 2005, these datasets triggered the ML era in remote sensing. Covering a very small area on the ground and comprising a very small number of pixels, such datasets are not suitable for training DL models (or have to be used with excessive caution). On the other hand, due to their rich hyperspectral information, they are still being used for tasks such as dimensionality reduction and feature extraction.
•• The University of California, Merced dataset (Table 2; 63) [7]: Published in 2010, it was the first dataset dedicated to scene classification.
•• ISPRS Potsdam/Vaihingen dataset (Table 2; 219 and 220) [8]: Published in 2012, it was initially intended to benchmark semantic segmentation approaches tailored to aerial imagery. Later, it was also used for other tasks, e.g., single-image height reconstruction (e.g., in [9]).
•• SZTAKI-AirChange dataset (Table 2; 105) [10]: Published in 2011, it was one of the earliest datasets designed for object detection.
All of these pioneering datasets have seen massive use in the early ML-oriented EO literature. It is interesting to note that pansharpening, scene classification, semantic segmentation, and object detection were the first topics in remote sensing to be addressed using ML-based methodologies.
◗◗ The DL boom: As discussed by several review articles on DL and AI applied to EO [11], [12], [13], the year 2015 marked the beginning of the DL boom in the EO community. This is well reflected by a significantly rising number of datasets published from that year onward. It is furthermore confirmed by the fact that the dataset sizes, both in terms of spatial pixels and data volume, started to increase significantly from approximately that time.
◗◗ The diversity of tasks: From the early days to the present, ML-oriented EO datasets have been designed for a multitude of different tasks. The historical evolution depicted in Figure 1 further shows that object detection and semantic segmentation are the most popular tasks, with a significant increase of datasets dedicated to minority categories (denoted as “Others”) from roughly 2019 to the present. This indicates that the rise of DL in EO broadens the overall application scope of the discipline.


Figure 2 provides further insights into the size distribution of available datasets in terms of the 1) number of pixels and 2) data volume in gigabytes for typical remote sensing tasks. Illustrating these two different aspects allows a deeper understanding of the nature of the data.

How to Measure the Size of a Dataset
In this article, we look at the size of datasets from the following two perspectives:
1) Size: data volume in terms of the number of spatial pixels. We count the number of pixels at the highest available image resolution while ignoring multiband, multichannel, and multisensor data. In other words, pixels are only counted once within the spatial coverage provided by the dataset.
2) Volume: data volume in terms of storage. The amount of disk space required for a dataset is a proxy for image resolution and the provided modalities (e.g., multiple bands and sensor types).
Figure S1 highlights the different factors that affect the volume and size of a dataset: the number of bits per pixel (radiometric resolution), the number of spectral bands (spectral resolution, i.e., red, green, blue; multispectral; or hyperspectral), the number of images during a specific time period (temporal resolution), and the number of pixels per unit area (spatial resolution). As mentioned previously, the size is directly related to the unique number of ground-projected resolution cells. A larger dataset in terms of size corresponds to images with higher resolutions or broader coverage.
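To make the two measures concrete, the following minimal Python sketch counts each ground footprint once for the size, while accumulating bytes over bands and timestamps for the volume. The image descriptors and field names are hypothetical, not taken from any dataset.

```python
# A minimal sketch of the two measures defined above; the image descriptors
# are hypothetical. "Size" counts each ground-projected resolution cell once,
# while "volume" accumulates bytes over all bands, timestamps, and images.
from dataclasses import dataclass

@dataclass
class Image:
    width: int            # pixels
    height: int           # pixels
    n_bands: int          # spectral resolution
    n_timestamps: int     # temporal resolution
    bytes_per_pixel: int  # radiometric resolution
    footprint_id: str     # identifies the ground area covered

def dataset_size(images: list[Image]) -> int:
    """Unique spatial pixels: footprints shared by several images count once."""
    seen: dict[str, int] = {}
    for im in images:
        seen[im.footprint_id] = max(seen.get(im.footprint_id, 0),
                                    im.width * im.height)  # highest resolution wins
    return sum(seen.values())

def dataset_volume(images: list[Image]) -> int:
    """Storage in bytes over all bands and acquisition times."""
    return sum(im.width * im.height * im.n_bands * im.n_timestamps *
               im.bytes_per_pixel for im in images)

pair = [Image(256, 256, 13, 1, 2, "tile-A"),   # e.g., a multispectral patch
        Image(256, 256, 2, 1, 2, "tile-A")]    # e.g., a co-located SAR patch
print(dataset_size(pair), dataset_volume(pair))  # 65536 vs. 1966080
```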


FIGURE S1. A schematic illustration of the proposed size measure used to characterize datasets, i.e., pixels are only counted once in the spatial coverage provided by a dataset. For a more detailed definition, see “How to Measure the Size of a Dataset.”


For example, datasets counting a similar number of spatial pixels may differ in data volume, which can indicate, for example, the use of multimodal imagery. Object detection offers the largest data volume among the existing benchmarking datasets, which again confirms its popularity in DL-oriented remote sensing research. However, in terms of the number of pixels, semantic segmentation takes the lead, indicating that a larger spatial coverage is usually involved for this type of application.
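The shares underlying Figure 2 can be derived from the same hypothetical metadata table used in the Figure 1 sketch above, e.g.:

```python
# A minimal sketch of the aggregation behind Figure 2; "datasets.csv" and its
# columns are hypothetical stand-ins for the metadata of Table 2.
import pandas as pd

df = pd.read_csv("datasets.csv")  # columns: task, volume_gb, n_pixels

shares = pd.DataFrame({
    "volume_share_%": 100 * df.groupby("task")["volume_gb"].sum()
                          / df["volume_gb"].sum(),
    "size_share_%":   100 * df.groupby("task")["n_pixels"].sum()
                          / df["n_pixels"].sum(),
})
print(shares.sort_values("volume_share_%", ascending=False))
```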


FIGURE 2. Distribution of EO dataset sizes over typical remote sensing tasks, expressed in (a) volume and (b) size, as defined in Figure S1 and “How to Measure the Size of a Dataset.” In (a), object detection is the predominant task with 30%, followed by semantic segmentation. In (b), in contrast, semantic segmentation is the prevailing task, illustrating that the corresponding datasets involve more complex scenarios such as leveraging multiple sensors or spectral bands. Object detection and semantic segmentation are the dominant image analysis tasks in ML-centered EO.


FIGURE 3. A distribution of available EO datasets over different platforms, sensor types, and number of acquisition times. Single-image red, green, blue (RGB) images acquired by satellites are clearly the dominating modality. MS: multispectral; HS: hyperspectral.


PLATFORMS
A direct overview of the occurrence of platform and sensor types is given in Figure 3. Satellite platforms are the most common, followed by airborne platforms and drones. Optical data cover more than half of the datasets, while all other sensors are almost equally distributed. Interestingly, 20% of the datasets provide time series, while the rest are single-temporal acquisitions. Complementing this, Figure 4 highlights the distribution of tasks between sensors and platforms. The inner ring indicates the platform type, which then splits into different sensor types in the middle ring, while the outer ring denotes the targeted tasks. This graph shows that the datasets acquired by unmanned aerial vehicles (UAVs) and aircraft are mainly dedicated to optical sensors, while satellite-based EO has a much wider and more homogeneous distribution across all sensor and application types. Figure 5(a) and (b) further specifies the previous findings by showing how the datasets acquired by different platforms are distributed across both tasks and sensors.


FIGURE 4. A distribution of tasks between sensors and platforms. Platforms are in the inner ring, sensors are distributed in the middle ring, and the outer ring shows different tasks per sensor. OD: object detection; CD: change detection; SemSeg: semantic segmentation; SR: superresolution; class: classification; MS: multispectral; HS: hyperspectral.



FIGURE 5. Different tasks [CD, classification (class), semantic segmentation (SemSeg), object detection (OD), superresolution (SR), and pansharpening] and task combinations (denoted by “/”) make use of very different platforms and sensors. (a) Combinations of different platforms, sensors, and tasks accumulated over the years. (b) Combinations of different platforms, sensors, and tasks for different years. MS: multispectral; HS: hyperspectral.


Although the former provides an overview, the latter adds a temporal aspect by showing the corresponding applications per year. The x- and y-axes represent tasks and sensors, respectively, while the different markers indicate the type of acquisition platform. From these plots, we can affirm that object detection and classification tasks are mainly performed on optical images. At the same time, semantic segmentation is fairly evenly distributed among optical, multispectral, and other sensor combinations. SAR images are mainly acquired from satellite platforms, while hyperspectral datasets are almost always acquired from airborne systems. UAVs mainly carry optical sensors in the context of semantic segmentation. Some tasks, such as superresolution, naturally make use of multimodal data, e.g., optical and hyperspectral imagery. The year-by-year graph in Figure 5(b) shows that superresolution datasets, together with UAV-based acquisitions, have received more attention in recent years. On the other hand, the EO community has not seen many new hyperspectral datasets since 2018. Optical sensors were the most common source of information, while after 2020 an increasing number of datasets were also acquired by other sensors, such as SAR or hyperspectral systems. Figures 3, 4, and 5(a) show that datasets acquired from “Multiple” platforms or sensors are still the minority, which provides evidence for the earlier statement that state-of-the-art datasets are usually designed to respond to a specific task in EO applications. These figures also show which combinations of EO tasks, platforms, and sensors are currently underrepresented. In particular, Figure 5(a) shows three main gaps: 1) SAR change detection (CD), 2) SAR superresolution, and 3) hyperspectral superresolution. From a sensor perspective alone, the lack of airborne SAR datasets and of drone-based hyperspectral benchmarks constitutes another obvious gap. A task-by-sensor cross-tabulation, as sketched below, makes such gaps easy to spot programmatically.
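The following sketch again assumes the hypothetical datasets.csv metadata table used above; zero cells of the cross-tabulation mark the missing task/sensor combinations discussed in the text.

```python
# A minimal sketch of how coverage gaps (e.g., SAR superresolution) can be
# read off a task-by-sensor cross-tabulation; "datasets.csv" and its columns
# are hypothetical.
import pandas as pd

df = pd.read_csv("datasets.csv")  # columns: task, sensor, platform
counts = pd.crosstab(df["task"], df["sensor"])
print(counts)                                  # zero cells mark missing combinations

gaps = counts.stack()
print("Underrepresented combinations:\n", gaps[gaps == 0])
```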

GEOGRAPHIC DIVERSITY
The geographic diversity and spatial distribution of EO imagery are an important attribute of a benchmark dataset, as they are directly related to the geographic generalization performance of data-driven models. Figure 6 shows the geographic distribution of datasets (i.e., of roughly 300 datasets (75%) from Table 2 that provided, or where we could find, their geographic information). Many of the datasets are globally distributed (“Global”) and contain images from around the world, while others cover a limited set of multiple cities or countries (“Multiple Locations”). Maybe surprisingly, “Synthetic” datasets show a notable presence as well, illustrating the benefits of being able to control image acquisition parameters (such as the viewing angle), environmental factors (such as atmospheric conditions, illumination, and cloud cover), and scene content (e.g., the type, size, shape, and number of objects). Figure 6 also illustrates an important and seldom-discussed issue within the EO community: there exists a strong geographic bias within the available EO datasets. Although 25% of the datasets contain samples from globally distributed locations, nearly 40% of the available datasets cover regions in Europe (21%) and North America (18%) only. Asia is covered by 10% of the datasets, while Africa (5%), South America (4%), and Australia (1%) are barely included. This raises the question of whether many of the findings and conclusions in corresponding research articles would generalize to these geographic areas. In any case, the need for more spatially diverse datasets becomes apparent, in particular, datasets also covering underdeveloped countries.


FIGURE 6. A geographic distribution of EO benchmark datasets (which provided clear location information). UAE: United Arab Emirates.



SOURCE OF THE REFERENCE DATA
Another important aspect is the source of the reference data. Although most of the scientific papers that introduce a new benchmark dataset are very detailed regarding the source and properties of the provided image data, they often contain only sparse information about the exact process by which the reference data were obtained. However, knowledge about the source of the reference data, as well as about measures for quality assessment and quality control, is essential to judge the potential of a benchmark and to interpret the obtained results. For example, achieving high “accuracy” on a dataset that contains a high number of annotation errors in its reference data only means that the model reproduces the same bias as the method used to annotate the data, not that it actually produces accurate results. Furthermore, information about the source of the reference data is not only scarce but also very heterogeneous. Examples include manual annotation for semantic segmentation [e.g., OpenEarthMap (Table 2; 253)] or object detection [e.g., BIRDSAI (Table 2; 127) and DroneCrowd (Table 2; 138)], the use of keywords in image search engines for classification [e.g., AIDER (Table 2; 34)], leveraging existing resources [e.g., BigEarthNet (Table 2; 40) uses the CORINE land use land cover (LULC) map, MapInWild (Table 2; 264) uses the World Database on Protected Areas, AI-TOD (Table 2; 162) uses other existing datasets, ship detection datasets usually employ AIS data, and OpenSentinelMap (Table 2; 282) uses OSM], automatic reference methods [e.g., HSI Oil Spill (Table 2; 211) and SEN12MS-CR-TS (Table 2; 98)], or a mixture of these [e.g., RSOC (Table 2; 117) uses an existing dataset (DOTA; Table 2; 109) as well as manual annotations for new images]. The quality of class definitions varies among different datasets, often depending on whether a dataset is designed for a specific real-world application (in which case class definitions are driven by the application context) or whether it is just an example dataset used to train and evaluate an ML-based approach (in which case class definitions are more arbitrary). If human interaction is involved (e.g., via manual annotations), annotation protocols or precise class definitions are often not shared (if they even exist). With the evolution of datasets, the quality of the meta-information about the reference data needs to increase, together with the quality and quantity of the image data.

APPLICATIONS AND TASKS
Finally, we analyze a last aspect concerning the distribution of some characteristics of the datasets, namely, the number of classes and images. In Figure 7(a), we show the evolution of the number of classes per publication year. In Figure 7(b), we plot the number of classes against the number of images in the dataset. In both plots, the radius indicates the size of the images (width or height), while the color indicates the type of task. We can see that there is no clear trend or correlation between the year of publication and the number of classes in a dataset. Instead, we find that more recently published datasets increase the variety in the number of classes, again highlighting an increased interest in using benchmarking datasets in a wider range of applications. Furthermore, although there is no clear correlation, we can confirm that increasing the number of classes reduces the size of the images and increases the number of images. More populated datasets, in terms of the number of images, also have smaller image sizes and typically consider a smaller number of classes. Conversely, larger images are found in less populated datasets. This overview is a snapshot of the currently published benchmarking datasets. Although the list of datasets continues to grow, we believe that the observed trends will not change much over the next few years. Instead, we expect long-term developments that will lead to two divergent aspects of datasets: specificity and generality. This will be further discussed in the “Open Challenges and Future Directions” section.

FIGURE 7. The number of classes provided by the reference data of a given dataset not only varies for different tasks (e.g., object detection is dominated by datasets with only a single class) but also with (a) publication year and (b) the number of images of a dataset.

EXEMPLARY DATASETS
This section provides short descriptions of a selection of EO datasets for various tasks: semantic segmentation, classification, object detection, CD, and superresolution/pansharpening. Given the vast number of publicly available EO datasets, it is only possible to present some of them in this article. Thus, this selection cannot be comprehensive and certainly follows a subjective view influenced by the experience of the authors. However, the selected datasets are representative of their respective tasks and were selected based on their observed relevance: they either are the largest in terms of size (see the “Evolution of EO-Oriented ML Datasets” section for a definition), the first to be introduced for a given task, or the most popular dataset for the chosen application. The popularity was determined based on the number of citations the original paper introducing the dataset received. The popularity of a dataset is influenced by multiple factors. One is certainly the size of a dataset, i.e., larger datasets are often preferred; however, there are exceptions. For instance, Functional Map of the World (FMoW) (Table 2; 76), introduced in 2018 [14], is the largest dataset for remote sensing scene classification in terms of the number of images (1 million) but has yet to gain a high level of popularity [with 200 citations and 128 GitHub stars, compared to EuroSAT (Table 2; 52) with 796 citations and 276 GitHub stars, or AID (Table 2; 69) with 1,000 citations, which were all published in 2018]. Several other factors affect the popularity of a dataset, too, such as ease of access to the hosting servers [Google Drive, IEEE DataPort, Amazon Web Services (AWS), university/personal servers, and so on], accompanying documentation and presentation (standard metadata format, detailed description, availability of template code and support, suitable search engine optimization, and so on), or ease of downloading the data (temporary or permanent links, bandwidth of the hosting server, sign-in/-up requirements, and so forth). Finally, an already-established dataset is more likely to be used in new studies to enable a comparison to related prior works, even if newer (and potentially better) datasets might exist. Along with brief descriptions, this section provides insights into the different dataset characteristics.

SEMANTIC SEGMENTATION
Semantic segmentation refers to assigning a class label to each pixel in the image. Partitioning the image into semantically meaningful parts creates a more detailed map of the input image and provides a foundation for further analysis such as LULC classification, building footprint extraction, landslide mapping, and so on [15].

instance, Functional Map of the World (FMoW) (Table 2; 76), introduced in 2018 [14], is the largest dataset for remote sensing scene classification in terms of the number of images (1 million) but has yet to gain a high level of popularity [with 200 citations and 128 GitHub MORE POPULATED stars, compared to EuroSAT DATASETS, IN TERMS OF (Table 2; 52) with 796 citaNUMBER OF IMAGES, ALSO tions and 276 GitHub stars, or HAVE SMALLER IMAGE AID (Table 2; 69) with 1,000 SIZES AND TYPICALLY citations, which were all pubCONSIDER A SMALLER lished in 2018]. Several other NUMBER OF CLASSES. factors affect the popularity of a dataset, too, such as ease of access to the hosting servers [Google Drive, IEEE DataPort, Amazon Web Services (AWS), university/personal servers, and so on], accompanying documentation and presentation (standard metadata format, detailed description, availability of template code and support, suitable search engine optimization, and so on), or ease of downloading the data (temporary or permanent links, bandwidth of hosting server, sign-in/-up requirements, and so forth). Finally, an already-established dataset is more likely to be used in new studies to enable a comparison to related prior works, even if newer (and potentially better) datasets might exist. Along with brief descriptions, this section provides insights into the different dataset characteristics. SEMANTIC SEGMENTATION Semantic segmentation refers to assigning a class label to each pixel in the image. Partitioning the image into semantically meaningful parts creates a more detailed map of the input image and provides a foundation for further analysis such as LULC classification, building footprint extraction, landslide mapping, and so on [15]. Figure 8 shows the

FIGURE 8. Relative volume distribution among datasets addressing semantic segmentation.

Figure 8 shows the relative volume of the corresponding benchmark datasets and illustrates a wide spread of dataset sizes. Here, we present two examples: SEN12MS (focusing on medium-resolution satellite data and weak annotations) and the ISPRS Vaihingen dataset (focusing on very high-resolution airborne data with high-quality annotations).
1) SEN12MS [49] is among the most popular and largest datasets (Table 2; 312) in terms of data volume, as shown in Figure 1. A total of 541,986 image patches of size 256 × 256 pixels with high spectral information content is contained in this dataset [16]. It contains dual-polarimetric SAR image patches from Sentinel-1, multispectral image patches from Sentinel-2, and Moderate Resolution Imaging Spectroradiometer (MODIS) land cover maps (Figure 9).

The image patches are acquired at random locations around the globe and cover the four meteorological seasons of the Northern Hemisphere. The patches were further processed to ensure that the images do not contain any clouds, shadows, or artifacts. In addition to the images and the land cover maps, the results of two baseline convolutional neural network classifiers (ResNet50 and DenseNet121) are also discussed to demonstrate the usefulness of the dataset for land cover applications [17]. A minimal sketch of loading such a patch triplet is given after the figure captions below.
2) ISPRS Vaihingen [18] is the earliest semantic segmentation dataset used for identifying land cover classes in aerial images, as shown in Figure 1 (Table 2; 220). This dataset contains a subset of images acquired during the test of digital aerial cameras carried out by the German Association of Photogrammetry and Remote Sensing [19]. The dataset was prepared as part of a 2D semantic labeling contest organized by the International Society for Photogrammetry and Remote Sensing (ISPRS). The dataset contains images acquired over the city of Vaihingen in Germany. In total, orthorectified images of varying sizes and a digital surface model (DSM) are provided for 33 patches covering Vaihingen (Figure 10). The ground-sampling distance of the images and the DSM is 9 cm. The three channels of the orthorectified images contain the infrared, red, and green bands as acquired by the camera. The images are manually labeled with six common land cover classes: impervious surfaces, buildings, low vegetation, tree, car, and clutter/background.

FIGURE 9. Several patch triplet examples from the SEN12MS dataset: (a) false-color Sentinel-1 SAR (R: VV; G: VH; B: VV/VH), (b) Sentinel-2 red, green, blue, (c) Sentinel-2 short-wave infrared, and (d) IGBP and LCCS land cover.

FIGURE 10. An illustration of the 33 patches from the ISPRS Vaihingen dataset. (Source: [18].)
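As referenced above, the following sketch loads one SEN12MS patch triplet and assembles the false-color SAR composite of Figure 9(a). The file names only approximate the dataset's naming scheme and should be treated as placeholders.

```python
# A minimal sketch of loading one SEN12MS patch triplet with rasterio; the
# file names are hypothetical stand-ins that only approximate the dataset's
# actual naming scheme.
import numpy as np
import rasterio

def load_patch(path: str) -> np.ndarray:
    with rasterio.open(path) as src:
        return src.read()  # array of shape (bands, height, width)

s1 = load_patch("ROIs1158_spring_s1_1_p30.tif")   # 2 bands: VV, VH
s2 = load_patch("ROIs1158_spring_s2_1_p30.tif")   # 13 Sentinel-2 bands
lc = load_patch("ROIs1158_spring_lc_1_p30.tif")   # land cover map

# False-color SAR composite as in Figure 9(a): R=VV, G=VH, B=VV/VH.
vv, vh = s1[0].astype("float32"), s1[1].astype("float32")
composite = np.stack([vv, vh, vv / (vh + 1e-6)], axis=-1)
print(s2.shape, lc.shape, composite.shape)  # e.g., (13, 256, 256) ...
```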

SCENE CLASSIFICATION
Remote sensing scene classification is closely related to semantic segmentation and has been used in various application domains such as urban planning [20], environment monitoring [21], LULC estimation [22], and so forth. The main difference is that instead of a pixelwise classification and a resolution-preserving map as output, in scene classification only one or more global labels are predicted for a given input image, aiming at a more comprehensive, generalized, and context-aware understanding of the underlying scene. Similar to image classification, which was the driving force behind the early days of DL in computer vision, research on remote sensing scene classification has led to the creation of more diverse and large-scale high-resolution image datasets. Figure 11 shows the relative volume distribution of remote sensing scene classification datasets. This section covers EuroSAT as one of the earliest and FMoW as one of the largest datasets (representing roughly 50% of the available data in this task category) as well as BigEarthNet-MM, which introduces an additional interesting aspect by providing multiple labels for each image; a sketch contrasting the single-label and multilabel settings follows the dataset descriptions below.
◗◗ EuroSAT [50] is one of the earliest large-scale datasets tailored for training deep neural networks for LULC classification tasks, as depicted in Figure 1 (Table 2; 52). The dataset includes 10 land cover classes (Figure 12), each containing 2,000–3,000 images, for a total of 27,000 annotated and georeferenced images with 13 spectral bands and 64 × 64 pixels. It contains multispectral images with a single label acquired from Sentinel-2 satellite images of cities in 34 European countries [23], [24]. This dataset has been widely used for classification tasks; however, it may also be used in a variety of real-world EO applications, such as detecting changes in land use and land cover as well as improving geographical maps, as Sentinel-2 images are freely available [23], [25], [26].

◗◗ BigEarthNet-MM [51] (Table 2; 40) is a benchmark archive that introduced an alternative nomenclature for images as compared to the traditional CORINE Land Cover (CLC) map for multilabel image classification and image-retrieval tasks. The CLC level-3 nomenclature is rearranged into 19 semantically coherent and homogeneous land cover classes. The archive was created using Sentinel-1 SAR and Sentinel-2 multispectral satellite images acquired between June 2017 and May 2018 over 10 European countries [27]. The first version of BigEarthNet included only Sentinel-2 image patches; it has since been augmented with Sentinel-1 image patches to form a multimodal benchmark archive (also called BigEarthNet-MM). The archive comprises 590,326 image patches with 12 spectral and two polarimetric bands. As shown in Figure 13, each image patch is annotated with several land cover classes (multilabels). One of the key features of the BigEarthNet dataset is its large number of classes and images, which makes it suitable for training DL models. However, because it is a multilabel dataset, the complexity of the problem is significantly increased compared to single-label datasets [28].
◗◗ FMoW [52] (Table 2; 76) is the largest dataset (Figure 1) for remote sensing scene classification in terms of the number of images [14]. The FMoW dataset is composed of approximately 1 million satellite images, along with metadata and multiple temporal views, as well as subsets of the dataset that are classified into 62 classes and used for training, evaluation, and testing. Each image in the dataset comes with one or multiple bounding boxes indicating the regions of interest. Some of these regions may be present in multiple images acquired at different times, adding a temporal dimension to the dataset. The FMoW dataset is available to the public in two image formats: FMoW-Full and FMoW-RGB (red, green, blue). FMoW-Full is in TIFF format and contains four- and eight-band multispectral imagery with a high spatial resolution, resulting in 3.5 TB of data, while FMoW-RGB has a much smaller size of 200 GB and includes all multispectral data converted to RGB in JPEG format. Examples of the classes in the dataset include flooded roads, military facilities, airstrips, oil and gas facilities, surface mines, tunnel openings, shipyards, ponds, and towers (see Figure 14 for examples). The FMoW dataset has a number of important characteristics, such as global diversity, multiple images per scene captured at different times, multispectral imagery, and metadata linked to each image.

FIGURE 11. Relative volume distribution among datasets addressing scene classification.

FIGURE 12. Sample image patches of all 10 classes covered in the EuroSAT dataset (AnnualCrop, Forest, HerbaceousVegetation, Highway, Industrial, Pasture, PermanentCrop, Residential, River, and SeaLake).
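As announced above, the following PyTorch sketch contrasts the two settings: single-label scene classification in the style of EuroSAT (one mutually exclusive class per patch) and multilabel classification in the style of BigEarthNet-MM (several classes per patch). The batch shapes and the 19-class head are illustrative only.

```python
# A minimal sketch of the difference between single-label and multilabel
# scene classification; shapes are illustrative, not tied to any dataset.
import torch
import torch.nn as nn

logits = torch.randn(8, 19)                    # batch of 8 patches, 19 classes

# Single-label (EuroSAT-style): one class per patch -> softmax cross-entropy.
single_targets = torch.randint(0, 19, (8,))
single_loss = nn.CrossEntropyLoss()(logits, single_targets)

# Multilabel (BigEarthNet-style): each class decided independently
# -> per-class sigmoid + binary cross-entropy.
multi_targets = torch.randint(0, 2, (8, 19)).float()
multi_loss = nn.BCEWithLogitsLoss()(logits, multi_targets)

# At inference, multilabel predictions are thresholded instead of argmax'd.
predicted_labels = torch.sigmoid(logits) > 0.5
print(single_loss.item(), multi_loss.item(), predicted_labels.shape)
```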

200 GB and includes all multispectral data converted to RGB in JPEG format. Examples of the classes in the dataset include flooded roads, military facilities, airstrips, oil and gas facilities, surface mines, tunnel openings, shipyards, ponds, and towers (see Figure 14 for examples). The FMoW dataset has a number of important characteristics, such as global diversity, multiple images per scene captured at different times, multispec t ral imager y, and metadata linked to each image.

Urban Fabric, Arable Land, Coniferous Forest, Mixed Forest, Transitional Woodland, Shrub

Peatbogs

Permanently Irrigated Permanently Irrigated Discontinuous Urban Land, Estuaries ... Land ... Fabric ...

Sea and Ocean

Pastures, Mixed Forest

Nonirrigated Arable Land ...

Nonirrigated Arable Land ...

Coniferous Forest, Water Bodies ...

FIGURE 13. Sample image patches of several classes in the BigEarthNet-MM dataset with multiple labels being assigned to each image.

78

OBJECT DETECTION
The aim of object detection is to locate and identify the presence of one or more objects within an image, including objects with clear boundaries, such as vehicles, ships, and buildings, as well as those with more complex or irregular boundaries, for example, LULC parcels [29]. As seen in Figure 2, object detection is one of the most widely studied research tasks. Figure 15 shows the relative volume of the corresponding datasets. The xView3-SAR dataset is by far the largest one. On the other hand, the family of DOTA datasets has played a pioneering role in this field, placing them among the most popular datasets for object recognition tasks in remote sensing images.


◗◗ The xView3-SAR [53] Ship Detection dataset is the largest labeled dataset (Table 2; 191), as shown in Figure 1, for training ML models to detect and classify ships in SAR images. The dataset includes nearly 1,000 analysis-ready SAR images from the Sentinel-1 mission, which are annotated using a combination of automated and manual analysis (see Figure 16 for image samples). The images are accompanied by co-located bathymetry and wind state rasters, and the dataset is intended to support the development and evaluation of ML approaches for detecting “dark vessels” not visible to conventional monitoring systems.

FIGURE 14. Sample image patches for several classes from the FMoW dataset.

FIGURE 15. Relative volume distribution among the datasets addressing object detection.

◗◗ DOTA [54] is one of the most popular and largest object detection datasets (Table 2; 109 and 114) in terms of labeled object instances. It includes 2,806 images acquired from Google Earth (GE) and the China Center for Resources Satellite Data and Application [31], [32], [33]. The DOTA dataset is available in three different versions: DOTA-v1.0, DOTA-v1.5, and DOTA-v2.0. The image size in the initial version ranges from 800 × 800 pixels to 4,000 × 4,000 pixels, with 188,282 object instances of various scales and angular orientations and a total of 15 object categories. DOTA-v1.5 adds various small objects (fewer than 10 pixels) and a new container crane category, for 402,089 instances, whereas DOTA-v2.0 adds two further categories, airport and helipad, growing to 11,268 images and 1,793,658 instances. Some image samples of the DOTA dataset are presented in Figure 17; a sketch of parsing its annotation format follows the figure caption below.

FIGURE 16. (a) An example image stack of dual-polarimetric SAR images (VV and VH, in decibels) and a water mask, and (b) several wind properties (wind direction, wind quality, and wind speed) from the xView3-SAR dataset.
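As referenced above, the following sketch parses a DOTA-style annotation file. The format assumed here — optional header lines followed by eight polygon coordinates, a category, and a difficulty flag per line — summarizes the dataset documentation from memory, so it should be verified against the official development kit.

```python
# A minimal sketch of parsing a DOTA-style annotation file; the assumed
# per-line format (x1 y1 x2 y2 x3 y3 x4 y4 category difficult) should be
# checked against the official DOTA devkit before use.
from dataclasses import dataclass

@dataclass
class OrientedBox:
    corners: list[tuple[float, float]]  # four (x, y) polygon vertices
    category: str
    difficult: bool

def parse_dota(path: str) -> list[OrientedBox]:
    boxes = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 10:          # skip 'imagesource:'/'gsd:' headers
                continue
            coords = list(map(float, parts[:8]))
            corners = list(zip(coords[0::2], coords[1::2]))
            boxes.append(OrientedBox(corners, parts[8], parts[9] == "1"))
    return boxes
```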

CD
CD in remote sensing aims to identify temporal changes by analyzing multitemporal satellite images of the same location. CD is a popular task in EO as it fosters monitoring environmental changes caused by artificial or natural phenomena. Figure 2 shows that the number of dedicated CD datasets is small compared to other applications. Figure 18 shows that the available data are dominated by the DFC21 dataset (MSD track), which focuses on semantic CD, followed by xView2, which tackles building damage assessment in the context of natural disasters. We chose LEVIR-CD as a recent dataset example and the Onera Satellite Change Detection dataset as one of the first large-scale datasets. A classical baseline for this task is sketched below.
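The sketch below shows the simplest classical CD baseline, image differencing with a global threshold, on co-registered bitemporal images. DL models trained on the datasets discussed next replace the thresholding step but share the same input/output structure.

```python
# A minimal classical CD baseline: image differencing with a global
# threshold on two co-registered images of the same scene. The synthetic
# "new building" is illustrative only.
import numpy as np

def change_map(t1: np.ndarray, t2: np.ndarray, k: float = 2.0) -> np.ndarray:
    """Binary change mask from two (H, W, C) images of the same scene."""
    diff = np.linalg.norm(t1.astype("float32") - t2.astype("float32"), axis=-1)
    threshold = diff.mean() + k * diff.std()   # simple global threshold
    return diff > threshold

rng = np.random.default_rng(1)
before = rng.uniform(0, 1, (256, 256, 3))
after = before.copy()
after[100:140, 100:140] += 0.8                 # simulate a new building
print(change_map(before, after).sum())         # ~1,600 changed pixels
```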


FIGURE 17. Several example image samples from the DOTA dataset, covering categories such as harbor, large vehicle, swimming pool, baseball diamond, airport, ground track field, ship, storage tank, tennis court, small vehicle, basketball court, bridge, helipad, plane, roundabout, and soccer ball field.

◗◗ LEVIR-CD [55] is one of the most recent and largest CD datasets, as seen in Figure 1 (Table 2; 15). It was mainly developed for the evaluation of DL algorithms on building-related changes, including building growth (the transition from soil/grass/hardened ground or building under construction to new built-up regions) and building decline [33]. The dataset comprises 637 optical image patch pairs extracted from GE with a size of 1,024 × 1,024 pixels, acquired over a time span of five to 14 years. LEVIR-CD covers various structures, including villa residences, tall apartments, small garages, and large warehouses, from 20 distinct regions in multiple Texas cities. The fully annotated LEVIR-CD dataset comprises 31,333 distinct change-building instances, some of which are illustrated in Figure 19, generated from the bitemporal images by remote sensing image interpretation specialists.
◗◗ Onera Satellite Change Detection [56] is one of the first, larger CD datasets, as shown in Figure 1 (Table 2; 11), containing multispectral image pairs from Sentinel-2. This dataset includes 24 pairs of multispectral images acquired by Sentinel-2 satellites between 2015 and 2018, focusing on urban changes such as new buildings or roads [34]. The locations were selected from around the world, including Brazil, the United States, Europe, the Middle East, and Asia. The reference data for pixel-level change are provided for all 14 training and 10 test image pairs. As illustrated in Figure 20, the annotated changes are primarily associated with urban changes, such as new buildings or roads.

FIGURE 18. Relative volume distribution among the datasets addressing change detection.

FIGURE 19. Examples of annotated samples from the LEVIR-CD dataset.

FIGURE 20. Examples of annotated samples from the Onera Satellite Change Detection dataset, covering locations such as Chongqing, Dubai, Las Vegas, Brasilia, Valencia, and Milano.

SUPERRESOLUTION/PANSHARPENING
Pansharpening is one of the oldest data fusion approaches in remote sensing and aims to increase the spatial resolution of a multispectral image by combining it with a panchromatic image. Due to the resolution difference between panchromatic and multispectral sensors, pansharpening is an exciting topic in remote sensing, as it provides a means to obtain higher-resolution data without better sensor technology (a minimal numerical sketch of this operation follows the dataset examples below). We select the Proba-V, PAirMax, and WorldStrat datasets as examples to showcase the peculiarities of datasets designed for this particular application.
1) Proba-V [57] is the earliest dataset available for superresolution [35], as Figure 1 shows (Table 2; 318). This dataset contains radiometrically and geometrically corrected top-of-atmosphere reflectances in Plate Carrée projection from the PROBA-V mission of the European Space Agency (Figure 21). The RED and near-infrared (NIR) spectral band data at 300-m and 100-m resolution are collected for 74 selected regions across the globe. Superresolution might be affected by the presence of pixels with clouds, cloud shadows, and ice/snow cover; therefore, this dataset contains a quality map indicating pixels affected by clouds that should not be considered for superresolution. The dataset contains one 100-m resolution image and several 300-m resolution images of the same scene.

2) PAirMax [58] is a recently published (Table 2; 319 and 324) yet popular dataset for pansharpening. It contains 14 multispectral and panchromatic image pairs from six sensors on board satellite constellations of Maxar Technologies and Airbus [36]. As seen in Figure 22, most of the images are acquired over urban areas, highlighting several challenges for pansharpening (such as high-contrast features and adjacent regions with different spectral features). The work in [36] also discusses best practices for preparing high-quality full- and reduced-resolution images for pansharpening applications. In this dataset, the panchromatic band has 4 × 4 times as many pixels as the multispectral bands. The multispectral bands lie in the visible and near-infrared region. The dataset also contains nine reduced-resolution test cases prepared per Wald’s protocol.
3) WorldStrat [59] is a recently introduced dataset for superresolution [37] and the largest in terms of volume (107 GB), as Figure 1 shows (Table 2; 325). This dataset contains high-resolution images from Airbus SPOT 6 and 7, along with 16 temporally matched low-resolution images from Sentinel-2 satellites. The high-resolution images cover five spectral bands: the panchromatic band at a 1.5-m pixel resolution and the RGB and NIR bands at 6 m per pixel. The low resolution ranges from 10 m per pixel to 60 m per pixel (Figure 23). In total, the dataset covers an area of approximately 10,000 km² and attempts to represent all types of land use across the world. Notably, the dataset contains nonsettlement and underrepresented locations such as illegal mining sites, settlements of persons at risk, and so on.

FIGURE 21. Sample images from the Proba-V dataset. Each sample consists of one high-resolution (HR) and several low-resolution (LR) images, each with a quality map (QM) showing which pixels are concealed (e.g., through clouds and so on).

FIGURE 22. Sample images from the PAirMax dataset. (Source: [58].)

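As referenced in the introduction to this subsection, the following sketch implements Brovey-style component substitution, one of the simplest classical pansharpening schemes; it is chosen here purely for illustration and is not tied to any of the datasets above.

```python
# A minimal sketch of Brovey-style component substitution: each upsampled
# multispectral band is rescaled by the ratio between the panchromatic band
# and a crude multispectral intensity estimate. Arrays are synthetic.
import numpy as np

def brovey(ms: np.ndarray, pan: np.ndarray) -> np.ndarray:
    """ms: (bands, h, w) already upsampled to the pan grid; pan: (h, w)."""
    intensity = ms.mean(axis=0)               # crude intensity estimate
    return ms * (pan / (intensity + 1e-6))    # inject pan spatial detail

rng = np.random.default_rng(2)
ms_up = rng.uniform(0, 1, (4, 512, 512))  # e.g., 4 VNIR bands, bicubically upsampled
pan = rng.uniform(0, 1, (512, 512))       # panchromatic band at full resolution
print(brovey(ms_up, pan).shape)           # (4, 512, 512)
```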

WORKING WITH REMOTE SENSING DATASETS
This section provides guidance on how to leverage available datasets to their full potential. Two of the main difficulties caused by information asymmetry (i.e., the information imbalance between the data providers and the data users) [5] are finding suitable datasets and easily prototyping ML approaches using such datasets. Here we discuss resources for gaining an overview of existing datasets and downloading actual data, but we also provide examples of EO-oriented ML programming libraries.

DATA AVAILABILITY
Data availability covers two distinct aspects. On the one hand, there is access to curated benchmark datasets, i.e., how such datasets are made available to the public; this section provides several examples of the most common data sources. On the other hand, the actual noncurated measurements acquired by the different sensors on satellites, planes, and UAVs are often available too. Many data providers offer either their complete database for public access (e.g., the European Copernicus Open Access Hub [60]) or at least portions of their image acquisitions (e.g., via Open Data Programs such as those from Maxar [61] and Capella Space [62], or through scientific proposals, as is possible for TerraSAR-X/TanDEM-X data [63]). In principle, these data sources are highly valuable as they offer free access to petabytes of EO data, which can either be used to compile benchmark datasets by augmenting them with reference data for a specific task or be leveraged for modern training paradigms such as self-supervised learning. An important aspect that is unfortunately sometimes ignored is the licensing of the corresponding data products: although direct usage (at least for scientific purposes) is usually free, redistribution of these data is often prohibited. Nevertheless, such image data find their way into public benchmark datasets, essentially causing copyright infringements. In addition to being in direct conflict with scientific integrity, such behavior is less likely to be tolerated in the future given the rising commercial value of EO imagery. Creators of future benchmark datasets should be fully aware of the license under which the leveraged EO data are accessible and how it limits their use in public datasets. In parallel, data providers should consider licensing schemes that allow noncommercial redistribution for scientific/educational purposes to further facilitate the development and evaluation of modern ML techniques.

FIGURE 23. Sample images from the WorldStrat dataset. (Source: [37].)

The following is a listing of the most common data sources:
◗◗ IEEE DataPort [64] is a valuable and easily accessible data platform that enables users to store, search, access, and manage data. The platform is designed to accept all formats and sizes of datasets (up to 2 TB), and it provides both downloading capabilities and access to datasets in the cloud. Both individuals and institutions can indefinitely store datasets and make them easily accessible to a broad set of researchers, engineers, and industry. In particular, most of the datasets used for previous IEEE GRSS Data Fusion Contests have been published on this platform (see “The Role of the IEEE Geoscience and Remote Sensing Society Data Fusion Contests”). However, unless a dataset is associated with a competition or submitted as open access, an IEEE account is required to access and download it.
◗◗ Radiant MLHub [65] enables users to access, store, register, and share high-quality open datasets for training ML models in EO. It is designed to encourage widespread collaboration and the development of trustworthy applications. The available datasets on this platform cover research topics like crop type classification, flood detection, tropical storm estimation, and so on.
◗◗ Euro Data Cube (EDC) [66] provides a global archive of analysis-ready data (ARD) (Sentinel, Landsat, MODIS, and so forth), where users can search for and order data using the EDC Browser graphical interface (see Figure 24). It also enables users to store, analyze, and distribute user-contributed data content with simple application programming interfaces (APIs).

IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

◗◗

◗◗

◗◗

◗◗

atmospheric data, and land cover data, and offers a number of programming APIs. For example, the Application for E xtracting and E xploring A nalysis Ready Samples API enables users to perform data access and transformation processes easily and efficiently. Maxar Open Data Program [61] provides access to high-resolution remote sensing imagery collected since 2017, amounting to a total area of 1,834,152 km 2. It also features several programming APIs, such as the Maxar Catalog API and the ARD API. In addition, this program seeks to assist the humanitarian community by furnishing essential and useful information to support response efforts when crises occur. OpenAerialMap [68] is a community-driven platform that provides access to openly licensed satellite and UAV imagery from across the globe, accompanied by programming APIs for data access and processing. The platform currently hosts 12,614 images captured by 721 sensors, and all the images can be traced in OpenStreetMap [69]. EarthNets [70] is an interactive data-sharing platform with more than 400 publicly available datasets in the geoscience and remote sensing field, which covers essential EO tasks like land use/cover classification, change/ disaster monitoring, scene understanding, climate change, and weather forecasting [38]. Each benchmark dataset provides detailed meta information like spatial resolution and volume. Moreover, it also supports standard data loaders and codebases for different remote sensing tasks, which enables users to conduct a fair and consistent evaluation of DL methods on the benchmark datasets. EOD [71] is an interactive online platform for cataloging different types of datasets leveraging remote sensing imagery, which is developed by the IEEE GRSS IADF TC [6]. The key feature of EOD is to build a central c­ atalog that provides an exhaustive list of available datasets with their basic information, which can be accessed and extended by the community and queried in a structured

FIGURE 24. An illustration of the EDC Browser, where users can easily search for and order satellite data. (Source: [87].)


The Role of the IEEE Geoscience and Remote Sensing Society Data Fusion Contests
The IEEE Geoscience and Remote Sensing Society (GRSS) Data Fusion Contest (DFC) has been organized as an annual challenge since 2006 by the IEEE GRSS Image Analysis and Data Fusion (IADF) Technical Committee (TC). The GRSS is an IEEE Society whose stated goal is to bring together researchers and practitioners to monitor and understand Earth's ecosystems and identify potential risks. The IADF is one of seven GRSS TCs aiming at technical contributions within the scope of geospatial image analysis (e.g., machine learning (ML), deep learning, image and signal processing, and big data) and data fusion (e.g., multisensor, multiscale, and multitemporal data integration). In general, the contest promotes the development of methods for extracting geospatial information from large-scale, multisensor, multimodal, and multitemporal data. It focuses on challenging problem settings and establishes novel benchmark datasets for open problems in remote sensing image analysis (see Table S1). Historically, the DFC developed from less formal activities related to the distribution of datasets between 2002 and 2006 by means of a collaboration between the GRSS and the International Association of Pattern Recognition. In 2006, the first DFC was organized by what was then called the Data Fusion TC of the GRSS. It addressed the fusion of multispectral and panchromatic images, i.e., pansharpening, and provided one of the first public benchmark datasets for ML in remote sensing. Since 2006, various sensors have played a role in the DFC, including optical (SPOT [S1], DEIMOS-2 [S2], WorldView-3 [S3], aerial [S4], [S5], [S6], [S7], [S8], [S9]), multispectral (Landsat [S10], [S11], [S12], [S13], WorldView-2 [S14], [S15], QuickBird [S15], Sentinel-2 [S11], [S13], [S16], and aerial [S12]) as well as hyperspectral images [S7], [S17], [S18], SAR (ERS [S1], [S10], TerraSAR-X [S15], Sentinel-1 [S13], [S16], Gaofen [S9]) and lidar [S5], [S6], [S7], [S14], [S17], [S18] data, but also less common sources of Earth observation (EO) data such as thermal images [S4], digital elevation models [S8], video from space (ISS [S2]),

nighttime images (Visible Infrared Imaging Radiometer Suite [S13]), and OpenStreetMap [S11]. Another meaningful change happened in 2017, when the DFC moved away from providing data over a single region only and instead allowed the training of models over five cities worldwide (Berlin, Hong Kong, Paris, Rome, and Sao Paulo) and testing on four other cities (Amsterdam, Chicago, Madrid, and Xi'an). This enabled the creation of models that generalize to new and unseen geographic areas instead of overfitting to a single location. This commendable trend was continued in 2020 by using SEN12MS (Table 2; 87) [S20] as training data [S16], by providing data for the whole state of Maryland, USA, in 2021 (Table 2; 1 and 291) [S12] and, in total, 19 different conurbations in France in 2022 (Table 2; 241) [S8], and by providing data from 17 cities across six continents in 2023 [S9]. The tasks addressed by the DFC over the years have been dominated by semantic mapping [S3], [S4], [S5], [S6], [S7], [S9], [S10], [S11], [S13], [S17], [S18], [S19] but also include pansharpening (Table 2; 320) [S21], change detection (in the context of floods) (Table 2; 10) [S1], and 3D reconstruction (Table 2; 193) [S3], [S9]; modern challenges such as weakly supervised (Table 2; 1/287/291) [S12], [S16] and semisupervised learning (Table 2; 241) [S8]; as well as open task contests [S2], [S4], [S14], [S15], which allowed participants to freely explore the potential of new and uncommon EO data. In 2006, seven teams from four different countries participated in the first DFC [S21], despite public contests being a new concept within the EO community. Being organized by an internationally well-connected Society, providing exciting challenges, and establishing new benchmarks led to a quick rise in popularity. From seven teams in four countries in 2006, participation jumped quickly to 21 teams in 2008 and 42 teams in 2014, and reached its peak with 148 teams (distributed over four different tracks, however) in 2019. In terms of registrations, the peak of popularity was around 2012, when the DFC attracted

TABLE S1. AN OVERVIEW OF THE IEEE GRSS DATA FUSION CONTESTS FROM 2006 TO 2023.

YEAR | DATA | GOAL | REFERENCE
2023 | Very high-resolution (VHR) optical and SAR satellite images | Classification, height estimation | [S9]
2022 | VHR aerial optical images, DEM | Semisupervised learning | [S8]
2021 | Multitemporal multispectral (aerial and Landsat-8) imagery | Weakly supervised learning | [S12]
2021 | Multispectral (Landsat-8, Sentinel-2), SAR (Sentinel-1), nighttime (Visible Infrared Imaging Radiometer Suite) images | Semantic segmentation | [S13]
2020 | Multispectral (Sentinel-2) and SAR (Sentinel-1) imagery | Weakly supervised learning | [S16]
2019 | Optical (WorldView-3) images and lidar data | Semantic 3D reconstruction | [S3], [S19]
2018 | Multispectral lidar data, VHR optical and hyperspectral imagery | Semantic segmentation | [S7]
2017 | Multispectral (Landsat, Sentinel-2) images and OpenStreetMap | Semantic segmentation | [S11]
2016 | Very high temporal resolution imagery (DEIMOS-2) and video from space (ISS) | Open for creative ideas | [S2]
2015 | Extremely high-resolution lidar and optical data | Semantic segmentation | [S5], [S6]
2014 | Coarse-resolution thermal/hyperspectral data and VHR color imagery | Semantic segmentation | [S4]
2013 | Hyperspectral imagery and a lidar-derived digital surface model | Semantic segmentation | [S18]
2012 | VHR optical (QuickBird and WorldView-2), SAR (TerraSAR-X), and lidar data | Open for creative ideas | [S15]
2011 | Multiangular optical images (WorldView-2) | Open for creative ideas | [S14]
2009–2010 | Multitemporal optical (SPOT) and SAR (ERS) images | Change detection | [S1]
2008 | VHR hyperspectral imagery | Semantic segmentation | [S17]
2007 | Low-resolution SAR (ERS) and optical (Landsat) data | Semantic segmentation | [S10]
2006 | Multispectral and panchromatic images | Pansharpening | [S21]

ERS: European Remote Sensing.


more than 1,000 registrations for the contest from nearly 80 different countries. Influenced by different factors, including the overwhelming success of datasets (and connected contests) in the Computer Vision community, an increasing number of EO sensors with easier access to their data, as well as improved options for data hosting, the number of available benchmark datasets (which were not always, but often, connected to a contest) increased dramatically around 2015 (see also Figure 1). Interestingly, the growing influence of Computer Vision on remote sensing, in particular due to deep learning, is also reflected in the renaming of the Data Fusion Technical Committee to the Image Analysis and Data Fusion (IADF) Technical Committee in 2014. Since then, participation in the DFC has been more or less constant (with a few positive outliers such as 2019), with approximately 40 teams from roughly 20 countries. Another commendable development concerns the participants: at the beginning, the DFC was dominated by well-established scientists with solid experience in their respective fields. Although this group still plays a vital role in more recent DFCs, a large number of participants (and winners!) are students. This shows that improved data availability helped lower the entry barrier to analyzing various EO data by standardizing data formats, easing access to theoretical knowledge, and open sourcing software libraries and tools.

References
[S1] N. Longbotham et al., "Multi-modal change detection, application to the detection of flooded areas: Outcome of the 2009–2010 data fusion contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 5, no. 1, pp. 331–342, Feb. 2012, doi: 10.1109/JSTARS.2011.2179638.
[S2] L. Mou et al., "Multitemporal very high resolution from space: Outcome of the 2016 IEEE GRSS data fusion contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 10, no. 8, pp. 3435–3447, Aug. 2017, doi: 10.1109/JSTARS.2017.2696823.
[S3] S. Kunwar et al., "Large-scale semantic 3D reconstruction: Outcome of the 2019 IEEE GRSS data fusion contest—Part A," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 922–935, 2021, doi: 10.1109/JSTARS.2020.3032221.
[S4] W. Liao et al., "Processing of multiresolution thermal hyperspectral and digital color data: Outcome of the 2014 IEEE GRSS data fusion contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 6, pp. 2984–2996, Jun. 2015, doi: 10.1109/JSTARS.2015.2420582.
[S5] M. Campos-Taberner et al., "Processing of extremely high-resolution LiDAR and RGB data: Outcome of the 2015 IEEE GRSS data fusion contest—Part A: 2-D contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 12, pp. 5547–5559, Dec. 2016, doi: 10.1109/JSTARS.2016.2569162.
[S6] A.-V. Vo et al., "Processing of extremely high resolution LiDAR and RGB data: Outcome of the 2015 IEEE GRSS data fusion contest—Part B: 3-D contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 9, no. 12, pp. 5560–5575, Dec. 2016, doi: 10.1109/JSTARS.2016.2581843.
[S7] Y. Xu et al., "Advanced multi-sensor optical remote sensing for urban land use and land cover classification: Outcome of the 2018 IEEE GRSS data fusion contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 6, pp. 1709–1724, Jun. 2019, doi: 10.1109/JSTARS.2019.2911113.
[S8] R. Hänsch et al., "The 2022 IEEE GRSS data fusion contest: Semisupervised learning [Technical Committees]," IEEE Geosci. Remote Sens. Mag., vol. 10, no. 1, pp. 334–337, Mar. 2022, doi: 10.1109/MGRS.2022.3144291.
[S9] C. Persello et al., "2023 IEEE GRSS data fusion contest: Large-scale fine-grained building classification for semantic urban reconstruction [Technical Committees]," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 1, pp. 94–97, Mar. 2023, doi: 10.1109/MGRS.2023.3240233.
[S10] F. Pacifici, F. Del Frate, W. J. Emery, P. Gamba, and J. Chanussot, "Urban mapping using coarse SAR and optical data: Outcome of the 2007 GRSS data fusion contest," IEEE Geosci. Remote Sens. Lett., vol. 5, no. 3, pp. 331–335, Jul. 2008, doi: 10.1109/LGRS.2008.915939.
[S11] N. Yokoya et al., "Open data for global multimodal land use classification: Outcome of the 2017 IEEE GRSS data fusion contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 11, no. 5, pp. 1363–1377, May 2018, doi: 10.1109/JSTARS.2018.2799698.
[S12] Z. Li et al., "The outcome of the 2021 IEEE GRSS data fusion contest—Track MSD: Multitemporal semantic change detection," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 15, pp. 1643–1655, Jan. 2022, doi: 10.1109/JSTARS.2022.3144318.
[S13] Y. Ma et al., "The outcome of the 2021 IEEE GRSS data fusion contest—Track DSE: Detection of settlements without electricity," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 12,375–12,385, Nov. 2021, doi: 10.1109/JSTARS.2021.3130446.
[S14] F. Pacifici and Q. Du, "Foreword to the special issue on optical multiangular data exploitation and outcome of the 2011 GRSS data fusion contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 5, no. 1, pp. 3–7, Feb. 2012, doi: 10.1109/JSTARS.2012.2186733.
[S15] C. Berger et al., "Multi-modal and multi-temporal data fusion: Outcome of the 2012 GRSS data fusion contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 6, no. 3, pp. 1324–1340, Jun. 2013, doi: 10.1109/JSTARS.2013.2245860.
[S16] C. Robinson et al., "Global land-cover mapping with weak supervision: Outcome of the 2020 IEEE GRSS data fusion contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 3185–3199, Mar. 2021, doi: 10.1109/JSTARS.2021.3063849.
[S17] G. Licciardi et al., "Decision fusion for the classification of hyperspectral data: Outcome of the 2008 GRS-S data fusion contest," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 11, pp. 3857–3865, Nov. 2009, doi: 10.1109/TGRS.2009.2029340.
[S18] C. Debes et al., "Hyperspectral and LiDAR data fusion: Outcome of the 2013 GRSS data fusion contest," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 7, no. 6, pp. 2405–2418, Jun. 2014, doi: 10.1109/JSTARS.2014.2305441.
[S19] Y. Lian et al., "Large-scale semantic 3-D reconstruction: Outcome of the 2019 IEEE GRSS data fusion contest—Part B," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 1158–1170, 2021, doi: 10.1109/JSTARS.2020.3035274.
[S20] M. Schmitt, L. H. Hughes, C. Qiu, and X. X. Zhu, "SEN12MS – A curated dataset of georeferenced multi-spectral Sentinel-1/2 imagery for deep learning and data fusion," ISPRS Ann. Photogrammetry Remote Sens. Spatial Inf. Sci., vol. IV-2/W7, pp. 153–160, Sep. 2019, doi: 10.5194/isprs-annals-IV-2-W7-153-2019.
[S21] L. Alparone, L. Wald, J. Chanussot, C. Thomas, P. Gamba, and L. M. Bruce, "Comparison of pansharpening algorithms: Outcome of the 2006 GRS-S data-fusion contest," IEEE Trans. Geosci. Remote Sens., vol. 45, no. 10, pp. 3012–3021, Oct. 2007, doi: 10.1109/TGRS.2007.904923.


FIGURE 25. An illustration of the map view of the EOD data catalog.

and interactive manner (see Figure 25 for an example of the user interface). For more information, see "The IEEE Geoscience and Remote Sensing Society Earth Observation Database."
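Several of the sources above can also be queried programmatically; Radiant MLHub, for example, builds on the SpatioTemporal Asset Catalog (STAC) specification, and a growing number of open archives expose STAC search endpoints. The following is a minimal sketch of such a query, assuming the third-party pystac-client package and Element 84's public Earth Search endpoint; the endpoint, collection name, and search parameters are illustrative assumptions rather than part of any platform described above:

from pystac_client import Client

# Open a public STAC API (here: Element 84's Earth Search index of
# open Sentinel-2 data on AWS; endpoint and collection are assumptions)
catalog = Client.open("https://earth-search.aws.element84.com/v1")

# Search for low-cloud Sentinel-2 L2A scenes over an area and date range
search = catalog.search(
    collections=["sentinel-2-l2a"],
    bbox=[11.4, 48.0, 11.8, 48.3],        # illustrative bounding box
    datetime="2022-06-01/2022-09-01",
    query={"eo:cloud_cover": {"lt": 10}},
)

for item in search.items():
    # each item's assets dict maps band/product names to downloadable URLs
    print(item.id, item.properties.get("eo:cloud_cover"))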

EO-ORIENTED ML LIBRARIES

Most of the existing ML libraries are developed for classic computer vision tasks, where the input image is usually single-channel (grayscale) or RGB. EO datasets, in contrast, are often of large volume, with highly diverse data types, numbers of spectral bands, and spatial resolutions, as illustrated in Figure S1. The code base for processing such data samples is often highly complex and difficult to maintain. One approach to increase readability, reusability, and maintainability is to modularize the code and encapsulate different tasks by decoupling dataset handling (data loaders, visualization, preprocessing, and so on) from the application of ML models (training, prediction, model selection, evaluation, and so forth). A major challenge in training advanced ML models for EO tasks is the implementation of an easy-to-use yet efficient data loader explicitly designed for geoscience data, i.e., one that loads and preprocesses a complex dataset and produces an iterable list of data samples in a customizable way. This section introduces several well-known packages designed explicitly for geoscience and remote sensing data. As PyTorch [39] and TensorFlow [40] (note that Keras is now a part of TensorFlow) are the most widely used DL frameworks, we focus mainly on existing libraries that use these two DL frameworks as the back end.

1) TorchGeo [72] is an open source PyTorch-based library that provides datasets, samplers, transforms, and pretrained models specific to geospatial data [41]. The main goal of this library is to simplify the process of interacting with complex geospatial data and make it easier for researchers to train ML models for EO tasks. Figure 26 provides an example of sampling pixel-aligned patch data from heterogeneous geospatial data layers using the TorchGeo package. As different layers usually have different coordinate systems and spatial resolutions, patches sampled from these layers over the same area may not be pixel aligned. Therefore, in a practical application scenario, researchers would need to conduct a series of preprocessing operations, such as reprojecting and resampling the geospatial data, before training ML models, which is time consuming and laborious. To address this challenge, TorchGeo provides data loaders tailored for geospatial data, which transparently load data from heterogeneous geospatial data layers with relatively simple code, per the following example:


from torch.utils.data import DataLoader
from torchgeo.datasets import CDL, Landsat8, stack_samples
from torchgeo.samplers import RandomGeoSampler

# Load the Landsat8 and CDL layers
landsat8 = Landsat8(root="...")
cdl = CDL(root="...", download=True, checksum=True)

# Take the intersection of Landsat8 and CDL
dataset = landsat8 & cdl

# Sample 10,000 patches of 256 x 256 pixels
sampler = RandomGeoSampler(dataset, size=256, length=10000)

# Build a normal PyTorch DataLoader with the sampler
dataloader = DataLoader(dataset, batch_size=128, sampler=sampler, collate_fn=stack_samples)

for batch in dataloader:
    image = batch["image"]
    mask = batch["mask"]

    # Train a model
    ...
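In addition to such map-style geospatial layers, TorchGeo also bundles curated benchmark datasets behind the same dictionary-based sample interface. A minimal sketch, assuming TorchGeo's EuroSAT dataset class (argument names should be checked against the library documentation):

from torch.utils.data import DataLoader
from torchgeo.datasets import EuroSAT

# Download the curated EuroSAT benchmark and wrap it in a standard loader
dataset = EuroSAT(root="data/eurosat", split="train", download=True)
dataloader = DataLoader(dataset, batch_size=64, shuffle=True)

sample = dataset[0]  # a dict with an "image" tensor and a "label"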

A more detailed introduction to the geospatial datasets supported by TorchGeo can be found in [41].

2) RasterVision [73] is an open source Python framework that aims to simplify the procedure of building DL-based computer vision models on satellite, aerial, and other types of geospatial data (including oblique drone imagery). It enables users to efficiently construct a DL pipeline, including training data preparation, model training, model evaluation, and model deployment, without any expert knowledge of ML. Specifically, RasterVision supports chip classification, object detection, and semantic segmentation with the PyTorch back end on both CPUs and GPUs, with built-in support for running in the cloud using AWS. The framework is extensible to new data sources, tasks (e.g., instance segmentation), back ends (e.g., Detectron2), and cloud providers. Figure 27 shows the pipeline of the RasterVision package. A more comprehensive tutorial for this package can be found in [74].

3) Keras Spatial [75] provides data samplers and tools designed to simplify the preprocessing of geospatial data for DL applications with the Keras back end. It provides a data generator that reads samples directly from raster layers without creating small patch files before running the model.


The IEEE Geoscience and Remote Sensing Society Earth Observation Database
The core of the Earth Observation Database (EOD) is its catalog of user-submitted datasets. It can be searched by a combination of various queries, including location as well as different combinations of sensors and tasks. Search results (as well as the whole database) can be viewed either in an illustrated list view or in an interactive map view, which marks datasets that cover only a specific location (see Figure 25). Clicking on one of the markers in the map view limits the list to datasets at that specific location. Clicking on a dataset in the list opens a new window displaying detailed information about this dataset, which includes
◗◗ geographic location
◗◗ sensor modality
◗◗ task/application
◗◗ data size
◗◗ a Uniform Resource Locator to access the data
◗◗ number of views
◗◗ a graphical representation
◗◗ a brief description of the dataset.
A helpful function is the compare option, which allows a direct side-by-side comparison of this information for two or more datasets in a newly opened window. EOD is meant as a community-driven catalog maintained by the IEEE Geoscience and Remote Sensing Society Image Analysis and Data Fusion (IADF) Technical Committee, i.e., adding datasets is limited neither to IADF members nor to the creators of a dataset: anybody with an interest in increasing the visibility and accessibility of a certain dataset can request its inclusion in EOD. Submission requests are reviewed by the IADF for completeness and correctness before the dataset is added to the database and becomes visible to the public.

It supports loading raster data from local files or cloud services. Necessary preprocessing, like reprojecting and resampling, is also conducted automatically. Keras Spatial supports sample augmentation


FIGURE 26. An illustration of sampling from heterogeneous geospatial data layers, here a Landsat 8 scene (EPSG:32617) and the Cropland Data Layer (EPSG:5072). (Source: [41].) With the TorchGeo package, users can focus directly on training ML models without manually preprocessing the heterogeneous geospatial data, including aligning data layers by reprojection and resampling.

using a user-defined callback system to improve the flexibility of data management. Here is a simple demo using the SpatialDataGenerator class from Keras Spatial to prepare a training set for a DL model:

from keras_spatial import SpatialDataGenerator

# Load labels from a local file
labels = SpatialDataGenerator()
labels.source = '/path/to/labels.tif'
# Sample 128 x 128 pixel patches
labels.width, labels.height = 128, 128
# Create a geodataframe with a regular grid of 200 x 200
# cells in projection units of the original raster
df = labels.regular_grid(200, 200)

# Load images from a file in the cloud
samples = SpatialDataGenerator()
samples.source = 'https://server.com/files/data.tif'
samples.width, samples.height = labels.width, labels.height

# The training set generator (image patches paired with label patches)
train_gen = zip(samples.flow_from_dataframe(df), labels.flow_from_dataframe(df))

# Train a (previously compiled) Keras model
model.fit(train_gen)

GEOSPATIAL COMPUTING PLATFORMS

In addition to ML libraries, public geospatial computing platforms, such as Google Earth Engine (GEE), offer a series of benefits for EO tasks in practical application scenarios, including the following:
◗◗ Access to large-scale datasets: geospatial computing platforms usually provide access to large and diverse geospatial datasets, such as satellite imagery, weather data, and terrain data.

These datasets may be time consuming and expensive to acquire on one's own but can be easily accessed through a geospatial computing platform using the cloud service.
◗◗ Scalability: geospatial computing platforms are designed to address large-scale geospatial data processing and analysis. They usually provide cloud-based computing resources that can be easily scaled up or down to meet researchers' needs. This makes it easier to perform complex geospatial analyses that would be difficult to carry out on local machines with limited computing resources.
◗◗ Prebuilt tools and APIs: geospatial computing platforms usually provide prebuilt tools and programming APIs for image processing, geocoding, and data visualization, making it much easier for researchers to work with geospatial data.
Several representative geospatial computing platforms are as follows:
◗◗ GEE [76] is a cloud-based platform designed for large-scale geospatial data analysis and processing. It provides a range of tools and APIs that allow users to analyze and visualize geospatial data, including raster and vector processing tools, ML algorithms, and geospatial modeling tools (a minimal usage sketch follows this list). In addition, GEE provides access to powerful computing resources, such as virtual machines and storage, to enable users to perform complex geospatial analyses. The GEE Data Catalog contains more than 40 years of historical imagery and scientific datasets for Earth science, which are updated and expanded daily. These datasets cover various topics such as climate, weather, surface temperature, terrain, and land cover. Notable datasets available on the GEE Data Catalog include Planet SkySat Public Ortho Imagery (collected for crisis response events) [42] and NAIP [77] (agricultural monitoring data in the United States).
◗◗ AWS [78] is a powerful platform for geospatial computing that offers a range of services for geospatial data storage, processing, and analysis. AWS hosts several representative geospatial datasets, including Digital Earth Africa [79] (Landsat and Sentinel products over Africa), National Oceanic and Atmospheric Administration Emergency Response Imagery [80] (lidar and hyperspectral data over the United States), and datasets from the SpaceNet

FIGURE 27. An illustration of the pipeline from the RasterVision package (stages: ANALYZE, CHIP, TRAIN, PREDICT, EVAL, and BUNDLE; inputs: images, labels, and AOI; outputs: model bundle, batch and live predictions, and custom integrations). AOI: area of interest. (Source: https://github.com/azavea/raster-vision.)


challenges (Table 2; 134, 264, 298, 299, 300, 304, 306, and 312) [43]. Sharing data on AWS allows anyone to analyze them and to build services using a broad range of computing and data analytics products, such as Amazon EC2, which enables data users to spend more time on data analysis rather than data acquisition.
◗◗ Microsoft Planetary Computer [81] provides access to a wide range of EO data and powerful geospatial computing resources. The platform is specifically designed to support the development of innovative applications and solutions for addressing critical environmental challenges, such as climate change, biodiversity loss, and natural resource management. It offers cloud-based computing services that enable users to perform complex geospatial analyses efficiently. The Planetary Computer Data Catalog provides access to petabytes of environmental monitoring data in a consistent and analysis-ready format. Some of the representative datasets available on the Planetary Computer Data Catalog include the HREA dataset [82] (settlement-level measures of electricity access derived from satellite images) and the Microsoft Building Footprints dataset [83].
◗◗ Colab [84] provides a flexible geospatial computing platform that offers users free GPU and tensor processing unit (TPU) computing resources for analyzing geospatial data. It provides a free Jupyter notebook environment that allows users to write and run Python code for accessing and processing geospatial data from various platforms (e.g., the GEE Data Catalog). Colab notebooks can be easily shared and collaborated on with others, which is particularly useful for geospatial analysis projects that involve multiple team members.
◗◗ Kaggle [86] is an online platform widely used for data science and ML competitions. It provides a cloud-based computational environment that allows for reproducible and collaborative analysis in the field of geospatial computing. Kaggle also provides access to a variety of geospatial datasets, including satellite imagery, terrain data, and weather data, along with tools to analyze and visualize these datasets. Users can take advantage of the platform's free GPU and TPU computing resources to train ML models and undertake complex geospatial analysis tasks.
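To illustrate the platform paradigm with a concrete example, the following minimal sketch uses the GEE Python client to build a cloud-free Sentinel-2 composite and reduce it entirely on the server side. It assumes the earthengine-api package and a registered GEE account; the coordinates and thresholds are illustrative:

import ee

ee.Initialize()  # assumes a prior one-time ee.Authenticate()

# Area of interest (illustrative coordinates)
aoi = ee.Geometry.Point(11.57, 48.14).buffer(1000)

# Median composite of low-cloud Sentinel-2 surface reflectance scenes
composite = (
    ee.ImageCollection("COPERNICUS/S2_SR")
    .filterBounds(aoi)
    .filterDate("2022-06-01", "2022-09-01")
    .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", 10))
    .median()
)

# Server-side computation: mean NDVI over the area of interest
ndvi = composite.normalizedDifference(["B8", "B4"])
print(ndvi.reduceRegion(reducer=ee.Reducer.mean(), geometry=aoi, scale=10).getInfo())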

IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

samples that are extremely limited in their spatial and temporal distribution, to the extreme of consisting of only a single, small image. Not only are such datasets prone to leading to biased evaluation protocols, where information from the training set leaks into the test set via spatial correlation, they are usually also not sufficient to train models that are able to generalize to other geographic areas or time points (e.g., different seasons). More modern datasets aim to increase diversity regarding scene content (e.g., more different object instances), environmental factors (e.g., seasons, light conditions, and so on), or aspects regarding image acquisition/processing (e.g., different look angles, resolutions, and so forth). This increase in diversity has thus far always been connected to an increase in dataset size, by including more images and/or larger scenes, or even other modalities. Although an increase in the provided image data is often easily possible, it is usually not feasible to increase the reference data at the same rate. This leads to large-scale datasets where the reference data are much less carefully curated than for early datasets, often based on other existing sources (e.g., OpenStreetMap) and containing more label noise. Although many ML methods can handle a certain extent of label noise during training, its effects on the evaluation are barely understood and often ignored.
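To make the leakage issue concrete: a simple mitigation is to split data by geographic blocks rather than by individual samples, so that spatially correlated patches never end up on both sides of the split. The following minimal sketch uses scikit-learn's GroupShuffleSplit; the patch coordinates and block size are hypothetical:

import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical patch-center coordinates (lon, lat) for 1,000 image patches
rng = np.random.default_rng(0)
coords = rng.uniform([11.0, 47.0], [12.0, 48.0], size=(1000, 2))

# Assign each patch to a coarse geographic block (~0.1 degree grid cells)
blocks = np.floor(coords / 0.1).astype(int)
group_ids = blocks[:, 0] * 10_000 + blocks[:, 1]

# Split whole blocks so that neighboring patches never straddle train/test
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(coords, groups=group_ids))
assert not set(group_ids[train_idx]) & set(group_ids[test_idx])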

CURRENT TRENDS: SPECIFICITY AND GENERALITY

In this context, two main characteristics of a dataset for a given task are the focus of the discussion: specificity and generality. EO applications present numerous and diverse scenarios due to varying object types, semantic classes, sensor types and modalities, spatial, spectral, and temporal resolutions, and coverage (global, regional, or local). High specificity refers to datasets that are strongly tailored to a specific sensor-platform-task combination, perhaps even limited to certain acquisition modes, geographic regions, or meteorological seasons. These can hardly be used for anything beyond their original purpose. Although different application domains, such as agriculture, urban mapping, military target detection, and bio-/geophysical parameter extraction, do require different types of data, i.e., images, point measurements, tables, and metadata, the proliferation of datasets specialized for every small task variation reduces the reusability of the datasets in a broader context and affects scientific transparency and reproducibility. High specificity also contributes to cluttered nomenclature, causing different datasets to appear different while actually sharing very similar content. For example, a high level of detail in class semantics and terminology makes it difficult to compare the reference data of different datasets. A typical example is land cover classification, where similar classes may be aggregated into different subcategories depending on the application. As a result, models trained on different application-specific datasets may actually approximate very similar functional relationships between image data and target variables. Virtually all of the less recent, and still most of the modern, datasets aim for specificity. However, several of the more recent benchmarks follow another direction: generality, i.e., providing more sensor modalities than actually required, plus large-scale, often noisy reference data for multiple tasks, instead of small-scale and carefully curated annotations that address only a single task. The contribution of such general datasets is manifold: first and foremost, the required number of (annotated) training samples for fully supervised ML simply does not scale very well, given the effort of data annotation and curation in remote sensing. Thus, such general, large-scale datasets introduce new factors that increase the relation to realistic application scenarios, such as robustness to label noise (e.g., by leveraging existing semantic maps as reference data, which are often outdated, misaligned, or of coarse resolution) and weakly supervised learning (where the reference data have a lower level of detail than the actual target variable, e.g., training semantic segmentation networks with labels on the image level). Large-scale datasets are also the only option to realistically test the generalization capabilities of learning-based models, e.g., over different geographic regions, seasons, or other domain gaps.

The Ideal Pretraining Dataset A dataset ideally suited for pretraining and/or self-supervised learning should adhere to as many of the following characteristics as possible: ◗◗ multiple platforms (vehicle, drone, airplane, and satellite) ◗◗ multiple sensors (Planet, SPOT, WorldView, Landsat, Sentinel 1/2, and so forth) ◗◗ several acquisition modalities (SAR, RGB, hyperspectral, multispectral, thermal, lidar, passive microwave, and so on) ◗◗ diverse acquisition geometries (viewing angles, e.g., off-nadir conditions and spatial and temporal baselines in multiview data, e.g., interferometric SAR) ◗◗ realistic distortion factors (cloud cover, dust, smog, fog, atmospheric influence, spatial misalignments and temporal changes in multiview data, and so forth) ◗◗ well distributed geographical locations (spatial distribution within the dataset, climate zones, socioeconomic and cultural factors, and different topographies) ◗◗ diverse land cover/use (urban, rural, forest, agricultural, water, and so on) ◗◗ varying spatial resolution (0.1–1 m, 3–10 m, 10–30 m, 100–500 m, and scale distribution) ◗◗ temporally well distributed (seasonality, lighting condition, sun angle, and nighttime imagery) ◗◗ a diverse set of reference data that are well aligned with the EO measurements (semantic information, change, geo/biophysical parameters, and so forth).


Furthermore, while multimodal datasets enable data fusion and cross-modal approaches that leverage the different input sources, multitask datasets allow exploiting the mutual overlap of related tasks regarding feature extraction and representation learning. Finally, the idea of loosening the previously tight relationship between input data and the target variable in datasets (up to the point where a dataset might not offer reference data for any target variable) is to provide data that can be leveraged to learn powerful representations that are useful for a large variety of downstream tasks (as in pretraining or self-supervised learning). However, there is not yet a single "go-to" dataset that can be used for pretraining most newly developed models or for benchmarking specific tasks against state-of-the-art approaches. Collecting such a high-quality benchmark dataset that enables pretraining of models for as many downstream tasks as possible would be of significant value for further pushing performance boundaries. Figure 28 presents a schematic diagram of the properties of an ideal go-to EO benchmark dataset, covering diverse geolocations, multiple modalities, different acquisition scenarios, and various applications. It would ideally be acquired by different types of sensors and platforms with different viewing geometries to cover objects from different look angles. The images would be obtained from different electromagnetic spectrum bands, i.e., visible, infrared, thermal, and microwave, resulting in multi-/hyperspectral, SAR, lidar, optical, thermal, and passive microwave measurements. The reference information or annotations would be provided on a level that allows defining various tasks based on a single annotation. For example, an image with dense semantic annotations allows users to generate their desired object instance annotation files. Extending the dataset to multiple images of a scene with corresponding semantic labels enables not only semantic segmentation but also semantic CD tasks. In summary, we foresee a certain duality in the future development of EO-related datasets: on the one hand, following the paradigm of data-centric ML [44], i.e., moving against the current trend of creating performance gains merely by leveraging more training data and instead focusing on datasets tailored toward specific problems with well-curated, high-quality reference data (e.g., manually annotated or based on official sources); on the other hand, general datasets that cover as many input and output modalities as possible to allow learning generic representations that are of value for a large number of possible downstream tasks.

FINDABILITY, ACCESSIBILITY, INTEROPERABILITY, REUSE, AND ARD

In addition to the content, scope, and purpose of datasets, their organization will gain importance. With only a dozen public datasets available prior to 2015, it was feasible that each was provided with its own data format and meta-information, hosted on individual web pages, and downloaded by everyone who wanted to work with them.

FIGURE 28. An illustration that shows the authors' view of the paramount properties that an ideal benchmark dataset needs to satisfy (different satellites, viewing angles, and geolocations; SAR, lidar, and multi-/hyperspectral modalities; acquisitions at different/multiple times of the year; and annotations supporting pansharpening/superresolution, multimodal data applications, unsupervised and supervised CD, time-series analysis, object detection, and semantic and instance segmentation),

including the type of tasks, sensors, temporal constraints, and geolocalization.

With the hundreds of datasets available today, and many more published every year, this cannot be maintained. Concepts such as findability, accessibility, interoperability, and reuse (FAIR) (see, for example, [45]) were proposed years ago and are still of high relevance. Data catalogs such as EOD (see the "Working With Remote Sensing Datasets" section) are a first step toward structuring datasets that are scattered among different data hosts. ARD (see, e.g., [46]), for example, in the form of data cubes [47] (a toy sketch follows this paragraph), and efforts to homogenize meta-information, e.g., in the form of datasheets [48], will continue to evolve into new standardized data formats. The trend of datasets growing in size and volume (see the "Evolution of EO-Oriented ML Datasets" section), as well as the need for global data products, will soon put a stop to the current practice of downloading datasets and processing them locally. Working with data on distributed cloud services will create new requirements regarding data formats but also lead to new standards and best practices for processing. Finally, the goal of any ML-centered dataset is to train an ML model. These models should be treated similarly to the data they originated from, i.e., they should follow standardized data formats and FAIR principles.
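To give a flavor of the data cube idea mentioned above, the following toy sketch, assuming the xarray package and purely synthetic values, organizes ARD as a labeled (time, band, y, x) array on which typical reductions and subsettings become one-liners:

import numpy as np
import pandas as pd
import xarray as xr

# A synthetic analysis-ready data cube: four monthly acquisitions,
# three spectral bands, 256 x 256 pixels
cube = xr.DataArray(
    np.random.rand(4, 3, 256, 256),
    dims=("time", "band", "y", "x"),
    coords={
        "time": pd.date_range("2022-01-01", periods=4, freq="MS"),
        "band": ["B02", "B03", "B04"],
    },
)

per_band_series = cube.mean(dim=("y", "x"))          # per-band time series
spring = cube.sel(time=slice("2022-02", "2022-03"))  # label-based subsetting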

SUMMARY AND CONCLUSION

This article discussed the relevance of ML-oriented datasets in the technical disciplines of EO and remote sensing. An analysis of historical developments shows that the DL

boom has not only led to a rise in dataset numbers but also to a large increase in size (in terms of spatial coverage and resolution, but also with respect to multimodal and multitemporal imagery) and in the diversity of application tasks. Furthermore, this development has led to the implementation of dedicated software packages and meta databases that help interested users develop solutions for their applications. Eventually, we drew the conclusion that one of the critical challenges in dataset design for EO tasks is the strong heterogeneity of possible sensors, data, and applications, which has led to a jungle of narrowly focused datasets. Although one of the trends in DL is certainly the consideration of smaller, well-curated, task-specific datasets, another direction is the creation of a generic, nearly sensor- and task-agnostic database similar to the well-known ImageNet dataset used in Computer Vision. Such a generic dataset would be especially valuable for the pretraining of large high-capacity models with worldwide applicability.

AUTHOR INFORMATION

MICHAEL SCHMITT ([email protected]) received his Dipl.-Ing. degree in geodesy and geoinformation, his Dr.-Ing. degree in remote sensing, and his habilitation in data fusion from the Technical University of Munich (TUM), Germany, in 2009, 2014, and 2018, respectively. Since 2021, he has been a full professor of Earth observation at the Department of Aerospace Engineering of the University of the Bundeswehr Munich, 85577 Neubiberg, Germany. Before that, he was a professor of applied geodesy

and remote sensing at the Department of Geoinformatics, Munich University of Applied Sciences. From 2015 to 2020, he was a senior researcher and deputy head at the Professorship for Signal Processing in Earth Observation at TUM; in 2019, he was additionally appointed adjunct teaching professor at the Department of Aerospace and Geodesy of TUM. In 2016, he was a guest scientist at the University of Massachusetts, Amherst. He is a cochair of the Active Microwave Remote Sensing Working Group of the International Society for Photogrammetry and Remote Sensing, and also of the Benchmarking Working Group of the IEEE Geoscience and Remote Sensing Society Image Analysis and Data Fusion Technical Committee. He frequently serves as a reviewer for a number of renowned international journals and conferences and has received several Best Reviewer Awards. His research focuses on technical aspects of Earth observation, in particular image analysis and machine learning applied to the extraction of information from multisensor remote sensing observations. Among his core interests is remote sensing data fusion with a focus on synthetic aperture radar and optical data. He is a Senior Member of IEEE.
SEYED ALI AHMADI ([email protected]) received his B.Sc. degree in surveying engineering and his M.Sc. degree in remote sensing from the Faculty of Geodesy and Geomatics, K.N. Toosi University of Technology, Tehran 19697, Iran, in 2015 and 2017, respectively, where he is currently pursuing his Ph.D. thesis on building damage assessment. He worked on image classification and segmentation techniques, machine learning algorithms, and lidar data processing. His thesis was focused on classifying hyperspectral and lidar datasets by combining spectral and spatial features to increase classification accuracy. He is a cochair of the Benchmarking Working Group of the IEEE Geoscience and Remote Sensing Society Image Analysis and Data Fusion Technical Committee, frequently serves as a reviewer for a number of international journals, and received a Best Reviewer Award in 2018. His research interests include machine learning, deep learning, geospatial data analysis, image processing, and computer vision techniques for remote sensing and Earth observation applications.
YONGHAO XU ([email protected]) received his B.S. and Ph.D. degrees in photogrammetry and remote sensing from Wuhan University, China, in 2016 and 2021, respectively. He is currently a postdoctoral researcher at the Institute of Advanced Research in Artificial Intelligence, 1030 Vienna, Austria. His research interests include remote sensing, computer vision, and machine learning. He is a Member of IEEE.
GÜLŞEN TAŞKIN ([email protected]) received her B.S. degree in geomatics engineering and her M.S. and Ph.D. degrees in computational science and engineering from Istanbul Technical University, Turkey, in 2001, 2003, and 2011, respectively. She is currently an associate professor at the Institute of Disaster Management

at Istanbul Technical University, Istanbul 34469, Turkey. She was a visiting scholar at the School of Electrical and Computer Engineering and the School of Civil Engineering at Purdue University from 2008 to 2009 and from 2016 to 2017. She is a reviewer for Photogrammetric Engineering and Remote Sensing, IEEE Transactions on Geoscience and Remote Sensing, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, IEEE Geoscience and Remote Sensing Letters, and IEEE Transactions on Image Processing. Her current research interests include machine learning approaches in hyperspectral image analysis, dimensionality reduction, explainable artificial intelligence, and sensitivity analysis.
UJJWAL VERMA ([email protected]) received his Ph.D. degree from Télécom ParisTech, University of Paris-Saclay, Paris, France, in image analysis and his M.S. degree (Research) from IMT Atlantique, France, in signal and image processing. He is currently an associate professor and head of the Department of Electronics and Communication Engineering at Manipal Institute of Technology, Bengaluru 560064, India. He is a recipient of the ISCA Young Scientist Award 2017–2018 from the Indian Science Congress Association, a professional body under the Department of Science and Technology, Government of India. He is also a recipient of the Young Professional Volunteer Award 2020 from the IEEE Mangalore Subsection in recognition of his outstanding contribution to IEEE activities. He is a co-lead of the Working Group on Machine/Deep Learning for Image Analysis of the IEEE Geoscience and Remote Sensing Society Image Analysis and Data Fusion Technical Committee. He is a guest editor for a Special Stream in IEEE Geoscience and Remote Sensing Letters and a reviewer for several journals, including IEEE Transactions on Image Processing, IEEE Transactions on Geoscience and Remote Sensing, and IEEE Geoscience and Remote Sensing Letters. He is also a sectional recorder for the Information and Communication Technology Section of the Indian Science Congress Association for 2020–2024. His research interests include computer vision and machine learning, focusing on variational methods in image segmentation, deep learning methods for scene understanding, and semantic segmentation of aerial images. He is a Senior Member of IEEE.
FRANCESCOPAOLO SICA ([email protected]) received his laurea (M.S.) degree (summa cum laude) in telecommunications engineering and his Dr. Ing. (Ph.D.) degree in information engineering from the University of Naples Federico II, Italy, in 2012 and 2016, respectively. Since 2022, he has been deputy head of the Earth Observation Laboratory at the Department of Aerospace Engineering of the University of the Bundeswehr Munich, 85577 Neubiberg, Germany. Between 2016 and 2022, he was a researcher at the German Aerospace Center. He received a Living Planet Post-Doctoral Fellowship from the European Space Agency for the High-Resolution Forest Coverage with InSAR & Deforestation Surveillance project. He is a cochair of the Benchmarking Working Group


of the IEEE Geoscience and Remote Sensing Society Image Analysis and Data Fusion Technical Committee and a regular reviewer for international journals and conferences. His research interests cover a wide range of activities related to synthetic aperture radar (SAR) technology, from mission design to SAR signal and image processing, to end-user applications. He is a Member of IEEE.
RONNY HÄNSCH ([email protected]) received his diploma in computer science and his Ph.D. degree from TU Berlin, Germany, in 2007 and 2014, respectively. He is a scientist at the Microwave and Radar Institute of the German Aerospace Center, 82234 Weßling, Germany, where he leads the Machine Learning Team in the Signal Processing Group of the SAR Technology Department. He continues to lecture at TU Berlin in the Computer Vision and Remote Sensing Group. He serves as chair of the IEEE Geoscience and Remote Sensing Society Image Analysis and Data Fusion Technical Committee, cochair of the ISPRS Working Group on Image Orientation and Fusion, IEEE Geoscience and Remote Sensing Society (GRSS) membership chair, organizer of the IEEE Conference on Computer Vision and Pattern Recognition Workshops with EarthVision (2017–2023), Photogrammetric Computer Vision (2019 and 2023), the Machine Learning for Remote Sensing Data Analysis Workshop at the International Conference on Learning Representations (2023), and the IEEE International Geoscience and Remote Sensing Symposium Tutorial on Machine Learning in Remote Sensing (2017–2023). He also serves as editor of the IEEE Geoscience and Remote Sensing Society eNewsletter and associate editor of IEEE Geoscience and Remote Sensing Letters and ISPRS Journal of Photogrammetry and Remote Sensing. He has extensive experience in organizing remote sensing community competitions (e.g., the IEEE GRSS Data Fusion Contest 2018–2023), serves as the GRSS representative within SpaceNet, and was technical lead of the SpaceNet 8 Challenge. His research interest is computer vision and machine learning with a focus on remote sensing (in particular, synthetic aperture radar processing and analysis). He is a Senior Member of IEEE.

REFERENCES
[1] G. Cheng, J. Han, and X. Lu, "Remote sensing image scene classification: Benchmark and state of the art," Proc. IEEE, vol. 105, no. 10, pp. 1865–1883, Oct. 2017, doi: 10.1109/JPROC.2017.2675998.
[2] D. Hong, J. Hu, J. Yao, J. Chanussot, and X. X. Zhu, "Multimodal remote sensing benchmark datasets for land cover classification with a shared and specific feature learning model," ISPRS J. Photogrammetry Remote Sens., vol. 178, nos. 9–10, pp. 68–80, Aug. 2021, doi: 10.1016/j.isprsjprs.2021.05.011.
[3] K. Li, G. Wan, G. Chen, L. Meng, and J. Han, "Object detection in optical remote sensing images: A survey and a new benchmark," ISPRS J. Photogrammetry Remote Sens., vol. 159, pp. 296–307, Jan. 2020, doi: 10.1016/j.isprsjprs.2019.11.023.
[4] Y. Long et al., "On creating benchmark dataset for aerial image interpretation: Reviews, guidances, and million-AID," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 4205–4230, Apr. 2021, doi: 10.1109/JSTARS.2021.3070368.


[5] M. Schmitt, S. A. Ahmadi, and R. Hänsch, "There is no data like more data - Current status of machine learning datasets in remote sensing," in Proc. Int. Geosci. Remote Sens. Symp., 2021, pp. 1206–1209, doi: 10.1109/IGARSS47720.2021.9555129.
[6] M. Schmitt, P. Ghamisi, N. Yokoya, and R. Hänsch, "EOD: The IEEE GRSS Earth observation database," in Proc. Int. Geosci. Remote Sens. Symp., 2022, pp. 5365–5368, doi: 10.1109/IGARSS46834.2022.9884725.
[7] Y. Yang and S. Newsam, "Bag-of-visual-words and spatial extensions for land-use classification," in Proc. SIGSPATIAL Int. Conf. Adv. Geographic Inf. Syst., 2010, pp. 270–279, doi: 10.1145/1869790.1869829.
[8] F. Rottensteiner et al., "The ISPRS benchmark on urban object classification and 3D building reconstruction," ISPRS Ann. Photogrammetry Remote Sens. Spatial Inf. Sci., vol. I-3, no. 1, pp. 293–298, Sep. 2012, doi: 10.5194/isprsannals-I-3-293-2012.
[9] P. Ghamisi and N. Yokoya, "IMG2DSM: Height simulation from single imagery using conditional generative adversarial net," IEEE Geosci. Remote Sens. Lett., vol. 15, no. 5, pp. 794–798, May 2018, doi: 10.1109/LGRS.2018.2806945.
[10] C. Benedek, X. Descombes, and J. Zerubia, "Building development monitoring in multitemporal remotely sensed image pairs with stochastic birth-death dynamics," IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 1, pp. 33–50, Jan. 2012, doi: 10.1109/TPAMI.2011.94.
[11] L. Zhang, L. Zhang, and B. Du, "Deep learning for remote sensing data: A technical tutorial on the state of the art," IEEE Geosci. Remote Sens. Mag. (replaced Newsletter), vol. 4, no. 2, pp. 22–40, Jun. 2016, doi: 10.1109/MGRS.2016.2540798.
[12] X. X. Zhu et al., "Deep learning in remote sensing: A comprehensive review and list of resources," IEEE Geosci. Remote Sens. Mag. (replaced Newsletter), vol. 5, no. 4, pp. 8–36, Dec. 2017, doi: 10.1109/MGRS.2017.2762307.
[13] Q. Yuan et al., "Deep learning in environmental remote sensing: Achievements and challenges," Remote Sens. Environ., vol. 241, Feb. 2020, Art. no. 111716, doi: 10.1016/j.rse.2020.111716.
[14] G. Christie, N. Fendley, J. Wilson, and R. Mukherjee, "Functional map of the world," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 6172–6180, doi: 10.1109/CVPR.2018.00646.
[15] I. Kotaridis and M. Lazaridou, "Remote sensing image segmentation advances: A meta-analysis," ISPRS J. Photogrammetry Remote Sens., vol. 173, pp. 309–322, Mar. 2021, doi: 10.1016/j.isprsjprs.2021.01.020. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0924271621000265
[16] M. Schmitt, L. H. Hughes, C. Qiu, and X. X. Zhu, "SEN12MS – A curated dataset of georeferenced multi-spectral sentinel-1/2 imagery for deep learning and data fusion," ISPRS Ann. Photogrammetry Remote Sens. Spatial Inf. Sci., vol. IV-2/W7, pp. 153–160, Sep. 2019, doi: 10.5194/isprs-annals-IV-2-W7-153-2019.
[17] M. Schmitt and Y.-L. Wu, "Remote sensing image classification with the SEN12MS dataset," ISPRS Ann. Photogrammetry, Remote Sens. Spatial Inf. Sci., vol. V-2-2021, pp. 101–106, Apr. 2021, doi: 10.5194/isprs-annals-V-2-2021-101-2021.


[18] F. Rottensteiner, G. Sohn, M. Gerke, and J. D. Wegner, "ISPRS test project on urban classification and 3D building reconstruction - 2D semantic labeling - Vaihingen data," Int. Soc. Photogrammetry Remote Sens., Hannover, Germany, 2013. Accessed: Oct. 14, 2022. [Online]. Available: https://www.isprs.org/education/benchmarks/UrbanSemLab/2d-sem-label-vaihingen.aspx
[19] M. Cramer, "The DGPF-Test on digital airborne camera evaluation - Overview and test design," Photogrammetrie Fernerkundung Geoinf., vol. 2010, no. 2, pp. 73–82, May 2010, doi: 10.1127/1432-8364/2010/0041.
[20] Y. Wang and M. Li, "Urban impervious surface detection from remote sensing images: A review of the methods and challenges," IEEE Geosci. Remote Sens. Mag., vol. 7, no. 3, pp. 64–93, Sep. 2019, doi: 10.1109/MGRS.2019.2927260.
[21] D. Wen et al., "Change detection from very-high-spatial-resolution optical remote sensing images: Methods, applications, and future directions," IEEE Geosci. Remote Sens. Mag., vol. 9, no. 4, pp. 68–101, Dec. 2021, doi: 10.1109/MGRS.2021.3063465.
[22] A. Karpatne, Z. Jiang, R. R. Vatsavai, S. Shekhar, and V. Kumar, "Monitoring land-cover changes: A machine-learning perspective," IEEE Geosci. Remote Sens. Mag., vol. 4, no. 2, pp. 8–21, Jun. 2016, doi: 10.1109/MGRS.2016.2528038.
[23] P. Helber, B. Bischke, A. Dengel, and D. Borth, "EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 7, pp. 2217–2226, Jul. 2019, doi: 10.1109/JSTARS.2019.2918242.
[24] P. Helber, B. Bischke, A. Dengel, and D. Borth, "Introducing EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification," in Proc. Int. Geosci. Remote Sens. Symp., 2018, pp. 204–207, doi: 10.1109/IGARSS.2018.8519248.
[25] S. A. Yamashkin, A. A. Yamashkin, V. V. Zanozin, M. M. Radovanovic, and A. N. Barmin, "Improving the efficiency of deep learning methods in remote sensing data analysis: Geosystem approach," IEEE Access, vol. 8, pp. 179,516–179,529, Sep. 2020, doi: 10.1109/ACCESS.2020.3028030.
[26] C. Broni-Bediako, Y. Murata, L. H. Mormille, and M. Atsumi, "Searching for CNN architectures for remote sensing scene classification," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–13, 2022, doi: 10.1109/TGRS.2021.3097938.
[27] G. Sumbul et al., "BigEarthNet-MM: A large-scale, multimodal, multilabel benchmark archive for remote sensing image classification and retrieval [Software and Data Sets]," IEEE Geosci. Remote Sens. Mag., vol. 9, no. 3, pp. 174–180, Sep. 2021, doi: 10.1109/MGRS.2021.3089174.
[28] U. Chaudhuri, S. Dey, M. Datcu, B. Banerjee, and A. Bhattacharya, "Interband retrieval and classification using the multilabeled Sentinel-2 BigEarthNet archive," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 9884–9898, Sep. 2021, doi: 10.1109/JSTARS.2021.3112209.
[29] G. Cheng and J. Han, "A survey on object detection in optical remote sensing images," ISPRS J. Photogrammetry Remote Sens., vol. 117, pp. 11–28, Jul. 2016, doi: 10.1016/j.isprsjprs.2016.03.014. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0924271616300144

96

[30] G.-S. Xia et al., “DOTA: A large-scale dataset for object detection in aerial images,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2018. [31] J. Ding, N. Xue, Y. Long, G.-S. Xia, and Q. Lu, “Learning RoI transformer for oriented object detection in aerial images,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2019, pp. 2844–2853, doi: 10.1109/CVPR.2019.00296. [32] J. Ding et al., “Object detection in aerial images: A large-scale benchmark and challenges,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 11, pp. 7778–7796, Nov. 2022, doi: 10.1109/ TPAMI.2021.3117983. [33] H. Chen and Z. Shi, “A spatial-temporal attention-based method and a new dataset for remote sensing image change detection,” Remote Sens., vol. 12, no. 10, May 2020, Art. no. 1662, doi: 10.3390/rs12101662. [Online]. Available: https://www.mdpi. com/2072-4292/12/10/1662 [34] R. C. Daudt, B. Le Saux, A. Boulch, and Y. Gousseau, “Urban change detection for multispectral earth observation using convolutional neural networks,” in Proc. Int. Geosci. Remote Sens. Symp., Jul. 2018, pp. 2115–2118, doi: 10.1109/IGARSS.2018.8518015. [35] M. Märtens, D. Izzo, A. Krzic, and D. Cox, “Super-resolution of PROBA-v images using convolutional neural networks,” 2019. [Online]. Available: https://arxiv.org/abs/1907.01821 [36] G. Vivone, M. Dalla Mura, A. Garzelli, and F. Pacifici, “A benchmarking protocol for pansharpening: Dataset, preprocessing, and quality assessment,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 6102–6118, 2021, doi: 10.1109/ JSTARS.2021.3086877. [37] J. Cornebise, I. Orsolic, and F. Kalaitzis, “Open high-resolution satellite imagery: The worldstrat dataset – With application to super-resolution,” in Proc. 36th Conf. Neural Inf. Process. Syst. Datasets Benchmarks Track, 2022. [Online]. Available: https://open review.net/forum?id=DEigo9L8xZA [38] Z. Xiong, F. Zhang, Y. Wang, Y. Shi, and X. X. Zhu, “EarthNets: Empowering A I in Ear th obser vation,” 2022, arXiv:2210.04936. [39] A. Paszke et al., “PyTorch: An imperative style, high-performance deep learning library,” in Proc. Neural Inf. Process. Syst., 2019, vol. 32, pp. 8026–8037. [40] M. Abadi et al., “TensorFlow: A system for large-scale machine learning,” in Proc. USENIX Symp. Oper. Syst. Des. Implementations, 2016, pp. 265–283. [41] A. J. Stewart, C. Robinson, I. A. Corley, A. Ortiz, J. M. L. Ferres, and A. Banerjee, “TorchGeo: Deep learning with geospatial data,” 2021, arXiv:2111.08872. [42] H. Tamiminia, B. Salehi, M. Mahdianpari, L. Quackenbush, S. Adeli, and B. Brisco, “Google earth engine for geo-big data applications: A meta-analysis and systematic review,” ISPRS J. Photogrammetry Remote Sens., vol. 164, pp. 152–170, Jun. 2020, doi: 10.1016/j.isprsjprs.2020.04.001. [43] A. Van Etten, D. Lindenbaum, and T. M. Bacastow, “SpaceNet: A remote sensing dataset and challenge series,” 2018, arXiv:1807.01232. [44] E. Strickland, “Andrew Ng, AI minimalist: The machine-learning pioneer says small is the new big,” IEEE Spectr., vol. 59, no. 4, pp. 22–50, Apr. 2022, doi: 10.1109/MSPEC.2022.9754503. IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

SEPTEMBER 2023

[45] M. D. Wilkinson et al., “The fair guiding principles for scientific data management and stewardship,” Scientific Data, vol. 3, no. 1, Mar. 2016, Art. no. 160018, doi: 10.1038/sdata.2016.18. [46] J. L. Dwyer, D. P. Roy, B. Sauer, C. B. Jenkerson, H. K. Zhang, and L. Lymburner, “Analysis ready data: Enabling analysis of the Landsat archive,” Remote Sens., vol. 10, no. 9, Aug. 2018, Art. no. 1363, doi: 10.3390/rs10091363. [Online]. Available: https:// www.mdpi.com/2072-4292/10/9/1363 [47] G. Giuliani et al., “Building an earth observations data cube: Lessons learned from the Swiss data cube (SDC) on generating analysis ready data (ARD),” Big Earth Data, vol. 1, nos. 1–2, pp. 100–117, Sep. 2017, doi: 10.1080/20964471.2017.1398903. [48] T. Gebru et al., “Datasheets for datasets,” 2018. [Online]. Available: https://arxiv.org/abs/1803.09010 [49] “SEN12MS toolbox.” GitHub. Accessed: Jul. 23, 2023. [Online]. Available: https://github.com/schmitt-muc/SEN12MS [50] “EuroSAT: Land use and land cover classification with sentinel-2.” GitHub Accessed: Jul. 23, 2023. [Online]. Available: https://github.com/phelber/EuroSAT [51] “Accurate and scalable processing of big data in earth observation.” BigEarth. Accessed: Jul. 23, 2023. [Online]. Available: https://bigearth.eu/ [52] “Functional map of the world (fMoW) dataset.” GitHub. Accessed: Jul. 23, 2023. [Online]. Available: https://github.com/ fMoW/dataset [53] “xView3: Dark vessels.” Accessed: Jul. 23, 2023. [Online]. Available: https://iuu.xview.us/ [54] “A large-scale benchmark and challenges for object detection in aerial images.” DOTA. Accessed: Jul. 23, 2023. [Online]. Available: https://captain-whu.github.io/DOTA/ [55] “LEVIR-CD.” Accessed: Jul. 23, 2023. [Online]. Available: https://justchenhao.github.io/LEVIR/ [56] “Onera satellite change detection dataset.” GitHub. Accessed: Jul. 23, 2023. [Online]. Available: https://rcdaudt.github.io/oscd/ [57] “Data,” European Space Agency, Paris, France, Jun. 2019. [Online]. Available: https://kelvins.esa.int/proba-v-super-resolution/ data/ [58] “PAirMax-Airbus.” Accessed: Jul. 23, 2023. [Online]. Available: https://perscido.univ-grenoble-alpes.fr/datasets/DS353 [59] “The WorldStrat dataset.” WorldStrat. Accessed: Jul. 23, 2023. [Online]. Available: https://worldstrat.github.io/ [60] “Copernicus open access hub.” Copernicus. Accessed: Jul. 23, 2023. [Online]. Available: https://scihub.copernicus.eu/ [61] “Open data program.” Maxar. Accessed: Jul. 23, 2023. [Online]. Available: https://www.maxar.com/open-data [62] “Capella space synthetic aperture radar (SAR) open dataset.” AWS. Accessed: Jul. 23, 2023. [Online]. Available: https://registry.opendata.aws/capella_opendata/ [63] “TerraSAR-X science service system.” TerraSAR. Accessed: Jul. 23, 2023. [Online]. Available: https://sss.terrasar-x.dlr.de/ [64] IEEE DataPort. Accessed: Jul. 23, 2023. [Online]. Available: https://ieee-dataport.org/datasets [65] “Open library for earth observations machine learning,” Radiant Earth Foundation, Washington, DC, USA. Accessed: Jul. 23, 2023. [Online]. Available: https://mlhub.earth

SEPTEMBER 2023

IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

[66] “The earth in a cube.” Euro Data Cube. Accessed: Jul. 23, 2023. [Online]. Available: https://eurodatacube.com [67] “Your gateway to NASA earth observation data,” Earth Data, Nat. Aeronaut. Space Admin., Washington, DC, USA. Accessed: Jul. 23, 2023. [Online]. Available: https://www.earthdata. nasa.gov [68] “OpenAerialMap.” Accessed: Jul. 23, 2023. [Online]. Available: https://openaerialmap.org [69] “OpenStreetMap.” Accessed: Jul. 23, 2023. [Online]. Available: https://openstreetmap.org [70] “EarthNets for earth observation.” EarthNets. Accessed: Jul. 23, 2023. [Online]. Available: https://earthnets.nicepage.io [71] “Earth observation database.” Accessed: Jul. 23, 2023. [Online]. Available: https://eod-grss-ieee.com [72] “Torchgeo.” GitHub. Accessed: Jul. 23, 2023. [Online]. Available: https://github.com/microsoft/torchgeo [73] “Azavea/raster-vision.” GitHub. Accessed: Jul. 23, 2023. [Online]. Available: https://github.com/azavea/raster-vision [74] “Raster vision.” Accessed: Jul. 23, 2023. [Online]. Available: https://docs.rastervision.io/ [75] “Keras spatial.” Github. Accessed: Jul. 23, 2023. [Online]. Available: https://github.com/IllinoisStateGeologicalSurvey/keras-spatial [76] “A planetary-scale platform for Earth science data & analysis.” Google Earth Engine. Accessed: Jul. 23, 2023. [Online]. Available: https://earthengine.google.com [77] “NAIP: National agriculture imagery program.” Earth Engine Data Catalog. Accessed: Jul. 23, 2023. [Online]. Available: https://developers.google.com/earth-engine/datasets/catalog/ USDA_NAIP_DOQQ [78] “Start building on AWS today.” AWS. Accessed: Jul. 23, 2023. [Online]. Available: https://aws.amazon.com [79] “Welcome to Digital Earth Africa (DE Africa).” Digital Earth Africa. Accessed: Jul. 23, 2023. [Online]. Available: https://www. digitalearthafrica.org [80] “NOAA’s emergency response imagery,” National Ocean Service, Silver Spring, MD, USA, 2023. Accessed: Jul. 23, 2023. [Online]. Available: https://oceanservice.noaa.gov/hazards/ emergency-response-imagery.html [81] “A planetary computer for a sustainable future.” Planetary Computer. [Online]. Available: https://planetarycomputer. microsoft.com [82] “High resolution electricity access.” Planetary Computer. Accessed: Jul. 23, 2023. [Online]. Available: https://planetary computer.microsoft.com/dataset/hrea [83] “USBuildingFootprints.” GitHub. [Online]. Available: https:// github.com/Microsoft/USBuildingFootprints [84] “Welcome to Colab!” Colab. Accessed: Jul. 23, 2023. [Online]. Available: https://colab.research.google.com [85] Accessed: Jul. 23, 2023. [Online]. Available: https://www.grss -ieee.org/wp-content/uploads/2023/05/EODatasets.pdf [86] “Start with more than a blinking cursor.” Kaggle. [Online]. Available: https://www.kaggle.com [87] “EDC browser.” Accessed: Jul. 23, 2023. [Online]. Available: https://browser.eurodatacube.com/ GRS

97

SOFTWARE AND DATA SETS
YI WANG, NASSIM AIT ALI BRAHAM, ZHITONG XIONG, CHENYING LIU, CONRAD M. ALBRECHT, AND XIAO XIANG ZHU

SSL4EO-S12: A large-scale multimodal, multitemporal dataset for self-supervised learning in Earth observation

Self-supervised pretraining bears the potential to generate expressive representations from large-scale Earth observation (EO) data without human annotation. However, most existing pretraining in the field is based on ImageNet or on medium-sized, labeled remote sensing (RS) datasets. In this article, we share an unlabeled dataset, Self-Supervised Learning for Earth Observation-Sentinel-1/2 (SSL4EO-S12), which assembles a large-scale, global, multimodal, and multiseasonal corpus of satellite imagery. We demonstrate that SSL4EO-S12 succeeds in self-supervised pretraining for a set of representative methods, namely momentum contrast (MoCo), self-distillation with no labels (DINO), masked autoencoders (MAE), and data2vec, and for multiple downstream applications, including scene classification, semantic segmentation, and change detection. Our benchmark results prove the effectiveness of SSL4EO-S12 compared to existing datasets. The dataset, related source code, and pretrained models are available at https://github.com/zhu-xlab/SSL4EO-S12.

INTRODUCTION
Self-supervised learning (SSL) has attracted wide attention in the RS community for its ability to learn generic representations from unlabeled data. Numerous studies in the literature have proven the potential of SSL in EO beyond natural images [1]. Despite the focus SSL for EO receives, only limited effort has been dedicated to providing large-scale datasets and benchmarks for pretraining. On the one hand, relying on computer vision datasets like ImageNet [2] is not a preferred option due to the domain gap. On the other hand, while RS datasets like SEN12MS [3] or seasonal contrast (SeCo) [4] exist, they are limited by geospatial overlap, sparse geographical distribution, or a lack of diversity in seasonal or multimodal information.

98

Therefore, large EO-specific datasets for unsupervised pretraining need to be developed.
In this work, we introduce a large-scale, globally distributed, multitemporal, and multisensor dataset, SSL4EO-S12. The dataset samples 251,079 locations around the globe, each providing Sentinel-2 level-1C (L1C), Sentinel-2 level-2A (L2A), and Sentinel-1 ground range detected (GRD) images with four snapshots from different seasons (in total, 3 million 2,640 m × 2,640 m patches). Additionally, we guarantee optimal geospatial coverage by avoiding overlap between the randomly sampled locations. This renders SSL4EO-S12 the largest and most generic multispectral/synthetic aperture radar (SAR) dataset in the RS literature [5].
We demonstrate the potential of the SSL4EO-S12 dataset through a series of extensive experiments. Specifically, we evaluate four representative SSL algorithms, namely MoCo [6], DINO [7], MAE [8], and data2vec [9], on three different downstream tasks: scene classification, semantic segmentation, and change detection. Our results indicate that pretraining on SSL4EO-S12 improves downstream performance compared to existing datasets. Moreover, our ablation studies prove the benefits of RS-specific data augmentations, including multisensor and multitemporal inputs as well as atmospheric correction.

RELATED WORK

SELF-SUPERVISED LEARNING
Over the past years, SSL has reached important milestones in computer vision, especially through contrastive methods with joint-embedding architectures. These methods are trained to promote similarity between augmented views of the same input, thereby enforcing invariance to data augmentation. Several families of such methods have emerged: 1) contrasting negative samples for which the representations are encouraged to be dissimilar [6]; 2) knowledge distillation between an asymmetric teacher–student network [7];


3) redundancy reduction among the embedding dimensions [26]; and 4) clustering latent features to common prototypes from different views [10]. Meanwhile, recent developments in masked image modeling reveal promising results for generative methods, which reconstruct the masked input at the pixel [8] or feature [9] level. We benchmark four representative methods (MoCo [6], DINO [7], MAE [8], and data2vec [9]) on the proposed dataset. This way, we cover a reasonably diverse set of representative methods from different categories: MoCo contrasts negative samples, DINO represents a distillation method, MAE is based on masked reconstruction, and data2vec combines the masking mechanism with a joint-embedding architecture.

PRETRAINING DATASETS
Models pretrained on ImageNet are widely used for various computer vision tasks. However, this is less appropriate in the context of RS: 1) RS images are not object-centric, 2) there exist various types of sensors in RS, and 3) temporal effects yield variations on the ground surface. Therefore, EO-specific datasets are needed to provide the above in-domain knowledge. The literature has proven the benefits of pretraining on existing labeled RS datasets [11], [12], yet there are limitations, such as class bias and limited temporal and geographical coverage.

Consequently, there is a need for large-scale pretraining datasets in RS. Two datasets closely related to our efforts are SEN12MS [3] and SeCo [4]. However, SEN12MS is limited in temporal coverage, SeCo contains only optical data, and both datasets include strongly overlapping patches that limit their geospatial coverage. With the above in mind, our proposed SSL4EO-S12 dataset provides improved spatiotemporal coverage by sampling more locations and removing overlapping patches, enclosing multiple seasons, and including Sentinel-1 as well as two Sentinel-2 products (Table 1).

SSL4EO-S12 DATASET

DATA CURATION AND ASSEMBLY
The SSL4EO-S12 dataset (Figure 1) exploits openly available SAR/optical satellite data collected by the European Space Agency's Sentinel mission. Following a well-organized baseline provided by SeCo [4], we utilized Google Earth Engine [14] to download and process the data. We restricted the image patches to the surroundings of the 10,000 most populated cities (https://simplemaps.com/data/world-cities) in the world to guarantee reasonable global coverage. To obtain diverse land cover, we sampled 251,079 locations close to these cities following a Gaussian distribution peaking at the city center with a standard deviation of 50 km, assuming most of the land cover variability is concentrated in the downtown and suburban areas of cities [4].
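As a rough illustration of this sampling strategy, the following is a minimal Python sketch; the kilometers-to-degrees conversion and the function names are assumptions for illustration, not the released pipeline code:

```python
import random

def sample_location(city_lon, city_lat, std_km=50.0):
    """Draw one candidate patch center from a Gaussian around a city center."""
    std_deg = std_km / 111.0  # ~111 km per degree of latitude; a crude conversion
    return random.gauss(city_lon, std_deg), random.gauss(city_lat, std_deg)

# Usage: first sample a city uniformly from the top-10,000 list, then a location around it.
# city_lon, city_lat = random.choice(cities)  # 'cities' assumed loaded from the city list
```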

TABLE 1. SUMMARY OF POPULAR MEDIUM-RESOLUTION PRETRAINING DATASETS IN RS.
DATASET | SPATIAL COVER | TEMPORAL COVER | MODALITY | OVERLAP | PATCH SIZE | NO. OF LOCATIONS | NO. OF PATCHES
BigEarthNet [13] | Europe | One timestamp | SAR/optical | No | 120 × 120 | 590,326 | 1.2 million
SEN12MS [3] | Global | One timestamp | SAR/optical/land cover | Yes | 256 × 256 | 180,662 | 541,986
SeCo [4] | Global | Five timestamps | Optical | Yes | 264 × 264 | ~200,000 | 1 million
SSL4EO-S12 | Global | Four timestamps | SAR/optical ×2 | No | 264 × 264 | 251,079 | 3 million

FIGURE 1. Sample images of the assembled SSL4EO-S12 dataset (panel labels: Earth Observation, Multiple Seasons, Global Coverage, Multiple Modalities).

At each location, we downloaded four images drawn from four annual seasons to capture seasonal variation. We searched for Sentinel-2 tiles with a cloud coverage lower than 10%. We also filtered out most overlapping patches with an efficient grid search strategy. In total, we obtained about 1 million S1-GRD/S2-L1C/S2-L2A image triplets.

DATA IDENTIFICATION
The collection of SSL4EO-S12 differs from SeCo mainly by introducing overlap filtering and multiple sensors. The workflow is as follows:
1) Uniformly sample one city from the top-10,000 most populated cities.
2) Sample one location from a Gaussian distribution with a standard deviation of 50 km around the city center.
3) Check whether a 2,640 m × 2,640 m image patch centered on that location overlaps significantly with previous patches. If not, continue to step 4; otherwise, return to step 1.
4) For a 30-day interval around four reference dates (20 March, 21 June, 22 September, and 21 December) in 2021 (additionally looking in 2020 as a buffer), check whether there exist Sentinel-2 tiles with less than 10% cloud coverage (for both L1C and L2A) and corresponding Sentinel-1 GRD tiles.
5) If valid Sentinel-1/2 tiles exist close to all four dates, process and download them into curated image patches; otherwise, return to step 1.

OVERLAP FILTERING
A simple way to check for significant overlap between two patches is to calculate the distance between their centers. If the distance is smaller than three-quarters of the width of a patch, there is a nonnegligible overlap (>25%). Naively, we would need to execute this computation for every new patch relative to all existing patches. However, this becomes inefficient when the number of patches grows large (250,000+ in our case). Therefore, we employed a grid search strategy to perform efficient overlap filtering. Instead of calculating the distance to all previous patches, we distributed the patch center coordinates into 360 × 180 geographical longitude–latitude grid cells of 1° × 1°. For each new patch, we converted the center coordinates into integer grid coordinates. Subsequently, we searched for existing patches within this grid cell and exclusively calculated distances to those local patches. Assuming the potential overlap of sampled patches from distinct grid cells is statistically negligible, we significantly reduced computing time compared to a global overlap search. Indeed, for SSL4EO-S12 we recorded an overlap for approximately 3% of the tiles of densely populated Tokyo, Japan, 1.5% in Chicago, IL, USA, and below 1% for locations such as Beijing, China; Munich, Germany; Kampala, Uganda; and Brasilia, Brazil. A sketch of this grid-based filter is shown below.
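A minimal Python sketch of the grid-based overlap check, assuming coordinates in degrees and illustrative names; this is a simplified reconstruction of the described procedure, not the released code:

```python
import math
from collections import defaultdict

PATCH_WIDTH_DEG = 2640 / 111_000  # ~2,640 m expressed in degrees of latitude (rough assumption)

grid = defaultdict(list)  # (int lon, int lat) -> accepted patch centers in that 1° × 1° cell

def has_overlap(lon, lat, min_dist=0.75 * PATCH_WIDTH_DEG):
    """True if (lon, lat) is closer than 3/4 patch width to an accepted center in the same cell."""
    for plon, plat in grid[(int(lon), int(lat))]:
        if math.hypot(lon - plon, lat - plat) < min_dist:
            return True
    return False

def try_accept(lon, lat):
    """Accept a candidate location only if it passes the local overlap check."""
    if has_overlap(lon, lat):
        return False
    grid[(int(lon), int(lat))].append((lon, lat))
    return True
```

As in the article, patches falling into different grid cells are assumed not to overlap, so a candidate near a cell boundary is only compared against patches in its own cell.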

DATA CHARACTERISTICS AND VOLUME
The presented SSL4EO-S12 dataset contains 251,079 globally distributed Sentinel-1 dual-pol SAR, Sentinel-2 top-of-atmosphere multispectral, and Sentinel-2 surface reflectance multispectral triplets over four seasonal timestamps. As of summer 2022, SSL4EO-S12 constitutes the biggest geospatial–temporal, multimodal dataset in terms of medium-resolution PolSAR and multispectral imagery, serving more than 3 million images. The total data volume equates to an uncompressed size of 251,079 × 4 × [2 × 4 B + (13 + 12) × 2 B] × 264² ≈ 3.7 TB. Figure 2 depicts the geospatial distribution of the SSL4EO-S12 dataset, highlighting the dense coverage across the globe. Figure 3 depicts the effect of overlap filtering around the Tokyo area.

FIGURE 2. Geographical distribution of the SSL4EO-S12 dataset.

EXPERIMENTAL SETUP
We evaluated the SSL4EO-S12 dataset by self-supervised pretraining and transfer learning on RS downstream tasks. Specific implementation details and additional results are provided in the supplemental material (available at https://doi.org/10.1109/MGRS.2023.3281651).

SELF-SUPERVISED PRETRAINING
We performed pretraining using four representative SSL methods: MoCo-v2/v3 [15], [16], DINO [7], MAE [8], and data2vec [9]. We pretrained ResNet [17] backbones with MoCo(-v2) and DINO, and Vision Transformer (ViT) [18] backbones with all four SSL methods listed above. Unless explicitly noted, Sentinel-2 L1C was used for pretraining. To utilize multitemporal information, we used RandomSeasonContrast as a data augmentation strategy: for MoCo and DINO, the two input views were randomly picked from two seasons; for MAE and data2vec, one random season was assigned to each patch. Pretraining one ResNet/ViT model for 100 epochs takes 7–25 h on four NVIDIA A100 GPUs, as shown in Table 2.

TRANSFER LEARNING
The pretrained models were transferred to various downstream tasks. For scene classification, we evaluated EuroSAT [19] (single-label land cover classification), BigEarthNet [13] (multilabel land cover classification), and So2Sat-LCZ42 [20] (local climate zone classification, culture-10 version).

SEPTEMBER 2023


For semantic segmentation, we included the 2020 Data Fusion Contest (DFC2020) [21] (land cover segmentation) and Onera Satellite Change Detection (OSCD) [22] (change detection) datasets. We performed the commonly used linear probing (freezing the pretrained encoder) and fine-tuning protocols for the downstream tasks. The results are reported as percentage scores.

BENCHMARK RESULTS

CLASSIFICATION

COMPARISON OF SELF-SUPERVISED LEARNING METHODS
We first benchmarked the different SSL methods through linear probing on EuroSAT, BigEarthNet, and So2Sat-LCZ42. As detailed in Table 3, all methods outperformed random initialization by a substantial margin. As expected, linear probing on BigEarthNet with all labels performs worse than fully supervised training. Promisingly, the gap stays below 5%. On small datasets like BigEarthNet with 10% labels or EuroSAT, linear probing provides results comparable to supervised training within approximately ±1%. The trends are slightly different for So2Sat-LCZ42, where the training and testing sets are built upon different cities with a challenging geographical split. Because of this significant domain shift, adding labeled training data does not necessarily improve the testing performance. In fact, fitting the training data distribution does not guarantee out-of-distribution generalization. Nevertheless, the best pretrained models with linear probing beat the supervised baseline by at least 1% and up to about 4%. Furthermore, we benchmarked fine-tuning results in Table 4. All self-supervised methods outperform supervised learning by a margin of 1% to 6%. Top SSL models score 99.1% on EuroSAT (MoCo/DINO) and over 90% on BigEarthNet (MoCo/DINO). Comparing linear probing and fine-tuning results, one interesting phenomenon shows up: in linear probing, contrastive methods (MoCo and DINO) consistently score better than their image-masking (MAE and data2vec) counterparts.
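To make the two evaluation protocols concrete, the following is a minimal PyTorch sketch of linear probing versus fine-tuning; the checkpoint path and the 19-class head are illustrative assumptions, not the authors' released code:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

def build_eval_model(linear_probe: bool, num_classes: int = 19):
    # Backbone initialized from SSL pretraining (checkpoint filename hypothetical).
    model = resnet50()
    state = torch.load("ssl4eo_s12_moco_rn50.pth", map_location="cpu")
    model.load_state_dict(state, strict=False)
    if linear_probe:
        # Linear probing: freeze the pretrained encoder, train only the new head.
        for p in model.parameters():
            p.requires_grad = False
    # Fine-tuning: leave all parameters trainable; in both cases the task head is new.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```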

FIGURE 3. Image patches (a) without and (b) with overlap filtering in the Tokyo metropolitan area. We plotted red circles of radius 1.32 km (132 pixels) for better visualization.

TABLE 2. 100-EPOCH TRAINING TIME OF THE STUDIED SSL METHODS.
BACKBONE | MOCO | DINO | MAE | DATA2VEC
ResNet50 | 18 h | 25 h | n/a | n/a
ViT-S/16 | 24 h | 25 h | 7 h | 14 h

TABLE 3. LINEAR PROBING RESULTS FOR EUROSAT, BIGEARTHNET (BE), AND SO2SAT-LCZ42 (10% AND 100% LABELS).
MODEL\BACKBONE | EUROSAT RN50 | EUROSAT VIT-S/16 | BE-10% RN50 | BE-10% VIT-S/16 | BE-100% RN50 | BE-100% VIT-S/16 | SO2SAT-10% RN50 | SO2SAT-10% VIT-S/16 | SO2SAT-100% RN50 | SO2SAT-100% VIT-S/16
Random initialization | 82.0 | 81.3 | 63.6 | 62.3 | 70.1 | 70.2 | 48.8 | 49.3 | 49.0 | 50.2
Supervised | 98.0 | 96.7 | 83.4 | 81.3 | 88.7 | 87.4 | 57.5 | 59.7 | 57.5 | 59.3
MoCo | 98.0 | 97.7 | 82.1 | 82.3 | 84.2 | 83.1 | 61.3 | 59.6 | 61.8 | 62.2
DINO | 97.2 | 97.7 | 82.0 | 81.7 | 83.9 | 83.4 | 55.5 | 60.9 | 57.0 | 62.5
MAE | n/a | 94.1 | n/a | 77.5 | n/a | 78.2 | n/a | 59.5 | n/a | 60.0
Data2vec | n/a | 96.9 | n/a | 77.3 | n/a | 79.4 | n/a | 58.2 | n/a | 59.7
We report overall accuracy for EuroSAT and So2Sat-LCZ42, and mean average precision (micro) for BigEarthNet. Two backbone networks are trained: ResNet-50 (RN50) and a ViT with a small (-S) embedding dimension subdividing input patches into 16 × 16 tiles (/16). Bold values indicate the best performance per column.


TABLE 4. FINE-TUNING RESULTS FOR EUROSAT, BIGEARTHNET, AND SO2SAT-LCZ42.
MODEL\BACKBONE | EUROSAT RN50 | EUROSAT VIT-S/16 | BE-10% RN50 | BE-10% VIT-S/16 | BE-100% RN50 | BE-100% VIT-S/16 | SO2SAT-10% RN50 | SO2SAT-10% VIT-S/16 | SO2SAT-100% RN50 | SO2SAT-100% VIT-S/16
MoCo | 99.1 | 98.6 | 86.2 | 86.1 | 91.8 | 89.9 | 60.4 | 61.2 | 60.9 | 61.6
DINO | 99.1 | 99.0 | 87.1 | 86.9 | 90.7 | 90.5 | 63.2 | 61.5 | 63.6 | 62.2
MAE | n/a | 98.7 | n/a | 84.8 | n/a | 88.9 | n/a | 60.8 | n/a | 63.9
Data2vec | n/a | 99.1 | n/a | 85.6 | n/a | 90.3 | n/a | 63.2 | n/a | 64.8
All results beat supervised training; compare with Table 3. Bold values indicate the best performance per column.

COMPARISON OF DIFFERENT AMOUNTS OF LABELS
Figure 4 visualizes the performance of transfer learning on BigEarthNet with a varying fraction of labeled samples. Compared to the supervised baseline, self-supervised pretraining on SSL4EO-S12 provides significant benefits when the amount of labeled samples is limited. In fact, fine-tuning on 10% of the labels outperforms supervised training on 50% of the labels; and, with ViT-S/16, fine-tuning on 50% of the labels outperforms supervised training on 100% of the labels.

FIGURE 4. BigEarthNet (BE) performance depending on the number of labels available to train the downstream task. We report linear probing and fine-tuning results (mean average precision, %) with ResNet50 and ViT-S/16 encoders pretrained using MoCo-v2. Bold values indicate the best performance per column.
BE LABEL PERCENTAGE | 1% | 10% | 50% | 100%
RN50 Linear | 75.9 | 82.1 | 82.7 | 84.2
RN50 Fine-Tune | 80.3 | 86.2 | 87.7 | 91.8
RN50 Supervise | 75.7 | 83.4 | 85.2 | 88.7
ViT-S/16 Linear | 78.2 | 82.3 | 83.0 | 83.1
ViT-S/16 Fine-Tune | 78.9 | 86.1 | 88.2 | 89.9
ViT-S/16 Supervise | 69.3 | 81.3 | 84.9 | 87.4

COMPARISON OF PRETRAINING DATASETS
To compare SSL4EO-S12 with other RS pretraining datasets, we report the corresponding linear probing results pretrained with MoCo-v2 (ResNet50 backbone) in Table 5. Similar to SSL4EO-S12, RandomSeasonContrast is used to pick one timestamp image for each geospatial patch in the SeCo dataset. In the first set of comparisons, we used the red/green/blue (RGB) bands only. SSL4EO-S12 significantly outperforms ImageNet by about 10%, SeCo by about 6%, and SEN12MS by 1.7% to 3.5%. In a second set of experiments, we evaluated all multispectral bands. The results indicate a performance gain consistent with the RGB setting when comparing SSL4EO-S12 with SEN12MS and SeCo. In addition, pretraining on SSL4EO-S12 outperforms pretraining on BigEarthNet both on BigEarthNet itself and on EuroSAT (both are European Union only). This proves the benefit of SSL4EO-S12 for model transferability, achieved by learning valuable knowledge from a larger scale and wider geographical coverage.


TABLE 5. COMPARISON OF DIFFERENT PRETRAINING DATASETS UNDER LINEAR PROBING EVALUATION.
PRETRAIN DATASET | EUROSAT | BE-10% | BE-100%
ImageNet (RGB) [4] | 86.4 | 70.5 | 71.8
SeCo (RGB) [4] | 89.5 | 74.5 | 76.3
SEN12MS (RGB) | 94.9 | 76.6 | 79.6
SSL4EO-S12 (RGB) | 96.6 | 80.1 | 82.3
SeCo* (all bands) | 89.2 | 73.7 | 76.6
SEN12MS (all bands) | 95.5 | 79.6 | 82.1
BigEarthNet (all bands) | 94.4 | 80.6 | 83.9
SSL4EO-S12 (all bands) | 98.0 | 82.1 | 84.2
Italics means cited from the literature. Bold values indicate the best performance per column. *SeCo is meant to have 200,000 geographical patches as described in the article, but the available data at https://github.com/ServiceNow/seasonal-contrast has only about 160,000 patches, which may affect our reproduced performance.


SEGMENTATION

LAND COVER SEGMENTATION
We used the DFC2020 [21] dataset to evaluate land cover semantic segmentation. We pretrained a ResNet50 with MoCo-v2 on SSL4EO-S12 L1C products and fine-tuned a DeepLabv3+ [23] for segmentation. Table 6 lists the results, with notable improvements compared to SeCo pretraining. However, SSL4EO-S12 performs worse than SEN12MS in average accuracy and mean intersection over union. This can be expected, since DFC2020 was built with direct reference to SEN12MS and the two have similar data distributions. Nevertheless, the results are still comparable, proving again the transferability of the proposed dataset.


TABLE 6. DFC2020 LAND COVER SEGMENTATION RESULTS.
PRETRAIN DATASET | OA | AA | MIOU
Random initialization | 81.97 | 56.46 | 42.11
SeCo | 87.31 | 57.05 | 49.68
SEN12MS | 88.64 | 67.69 | 54.83
SSL4EO-S12 | 89.58 | 64.01 | 54.68
Bold values indicate the best performance per column. AA: average accuracy; mIoU: mean intersection over union; OA: overall accuracy.

CHANGE DETECTION
We evaluated the pretrained models for change detection on the OSCD [22] dataset. We pretrained a ResNet50 with MoCo-v2 on SSL4EO-S12 L1C products, froze the backbone, and fine-tuned a U-Net [24] for segmentation. The differences between the feature maps of the two timestamps were input to the network. As Table 7 indicates, pretraining on SSL4EO-S12 yields superior performance in recall and F1-score when compared to SeCo and SEN12MS. While SSL4EO-S12 performs worse in precision, this is due to the significant class imbalance: predicting all pixels as unchanged would already result in a good precision score.
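A minimal sketch of this evaluation setup (frozen encoder, feature differencing, trainable segmentation head); `backbone` and `decoder` are hypothetical modules standing in for the ResNet trunk and a U-Net-style decoder, not the released code:

```python
import torch

def change_detection_logits(backbone, decoder, img_t1, img_t2):
    """Frozen-backbone change detection: difference the two dates' feature maps
    and let a trainable decoder predict changed vs. unchanged pixels."""
    with torch.no_grad():  # the pretrained encoder stays frozen
        f1 = backbone(img_t1)  # backbone: feature extractor (classification head removed)
        f2 = backbone(img_t2)
    return decoder(f1 - f2)
```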


TABLE 7. OSCD CHANGE DETECTION RESULTS.
PRETRAIN DATASET | PRECISION | RECALL | F1
Random initialization | 72.31 | 13.75 | 23.10
SeCo | 74.85 | 17.47 | 28.33
SEN12MS | 74.67 | 19.26 | 30.62
SSL4EO-S12 | 70.23 | 23.38 | 35.08
Bold values indicate the best performance per column.

ADDITIONAL STUDIES
We complete our benchmark by reporting a set of additional results that document key characteristics of the SSL4EO-S12 dataset, namely, its multitemporal, multimodal, multiproduct-level, and data-scale properties. For all studies, we pretrained a ResNet50 with MoCo-v2 as a common setting.

ABLATION STUDIES

BENEFITS OF MULTIMODALITY
While the “Benchmark Results” section employs only optical data for a fair comparison to the existing literature, we highlight the benefits of multimodal pretraining in this section. We integrate SAR data by early fusion and use RandomSensorDrop [12] as an additional data augmentation strategy. During training, the model is fed random combinations of SAR/optical patches, thus learning both intra- and intermodality representations. Then, the pretrained model is transferred to different scenarios, where either both modalities or a single one is available. We compare multimodal pretraining (MM) to unimodal pretraining (S1/2) on BigEarthNet. Table 8 presents the results, with a notable improvement of 1–3% for the 100% and 1% label splits. While single-modality pretraining already works well for both Sentinel-2 and Sentinel-1 data, pretraining exploiting both modalities further improves performance.

TABLE 8. LINEAR PROBING RESULTS OF MULTIMODAL SSL.
PRETRAIN METHOD | BE-1%: S1 | BE-1%: S2 | BE-1%: S1+S2 | BE-100%: S1 | BE-100%: S2 | BE-100%: S1+S2
MoCo-S1/2 | 71.1 | 75.9 | n/a | 75.9 | 84.2 | n/a
MoCo-MM | 73.3 | 76.7 | 76.8 | 79.5 | 85.1 | 85.2
Supervised | 66.7 | 75.7 | 76.4 | 77.2 | 88.7 | 88.9
MoCo-S1/2 represents pretraining with one single modality, and MoCo-MM represents pretraining with both modalities. Bold values indicate the best performance per column.
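A minimal sketch of the RandomSensorDrop idea used above (early fusion of stacked S1/S2 channels, with one sensor randomly zeroed out); this illustrates the mechanism only and is not the authors' implementation [12]:

```python
import random
import torch

def random_sensor_drop(s1: torch.Tensor, s2: torch.Tensor) -> torch.Tensor:
    """Present both sensors, S1 only, or S2 only; the dropped sensor is zeroed
    so the early-fusion input always keeps the same channel layout.
    s1, s2 are unbatched CHW tensors of the co-located patch."""
    mode = random.choice(("both", "s1_only", "s2_only"))
    if mode == "s1_only":
        s2 = torch.zeros_like(s2)
    elif mode == "s2_only":
        s1 = torch.zeros_like(s1)
    return torch.cat([s1, s2], dim=0)  # channel-wise stack: early fusion
```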

ABLATION OF SEASONAL INFORMATION
We evaluated the effectiveness of multitemporal information by replacing the seasonal augmentation (compare with the “Experimental Setup” section) with random season (the same randomly selected season for the two positive views) and fixed season (the same season for each patch during training). We pretrained on a 50,000-patch subset of SSL4EO-S12 and evaluated on BigEarthNet-10% and EuroSAT. Table 9 clearly proves the benefits of the seasonal augmentation.

TABLE 9. LINEAR PROBING RESULTS OF THE MULTITEMPORAL ABLATION STUDY.
PRETRAIN SEASON | BE-10% | EUROSAT
Fixed | 75.1 | 93.1
Random | 76.7 | 94.0
Augment | 77.6 | 96.2
Bold values indicate the best performance per column.
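The three settings differ only in how the positive views are drawn from a patch's four seasonal snapshots; a minimal sketch, with illustrative names:

```python
import random

def pick_views(seasons, mode="augment"):
    """seasons: list of four co-located images from different seasons.
    'augment' = two different random seasons (seasonal augmentation);
    'random'  = one random season reused for both views;
    'fixed'   = always the same season (index 0 here) for both views."""
    if mode == "augment":
        return random.sample(seasons, 2)
    img = random.choice(seasons) if mode == "random" else seasons[0]
    return img, img
```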


ATMOSPHERIC CORRECTION AS DATA AUGMENTATION
The motivation for including both Sentinel-2 L1C and L2A products in SSL4EO-S12 is to match the corresponding downstream tasks. However, these product levels, with or without atmospheric correction, can also be considered a natural data augmentation for SSL. Accordingly, we conducted an ablation study on a 50,000-patch SSL4EO-S12 subset utilizing Sentinel-2 L1C, L2A, or both (L1C+L2A). Table 10 summarizes our findings: 1) models pretrained on the same product level as the downstream task have a slight edge (~1%) over models trained on the other product level, and 2) pretraining on both product levels generates a notable improvement of up to 4% compared to pretraining on a single one.

TABLE 10. LINEAR PROBING RESULTS OF DIFFERENT PRODUCT LEVELS OF SENTINEL-2.
PRETRAIN PRODUCT | BE-10% (L2A) | EUROSAT (L1C)
L1C | 74.0 | 93.1
L2A | 75.1 | 92.0
L1C+L2A | 78.0 | 93.8
Bold values indicate the best performance per column.

TABLE 11. LINEAR PROBING RESULTS ON BIGEARTHNET-10% FOR VARIOUS SENTINEL-2 L1C PRETRAINING DATA SIZES.
PRETRAIN DATA | 100,000 | 250,000 | 500,000 | 750,000 | 1 MILLION
Accuracy (%) | 64 | 73 | 78 | 81 | 82

FIGURE 5. t-SNE visualization of EuroSAT image representations. One color represents one class. (a) Random-encoded features; (b) SSL-encoded features. The SSL-encoded features are well clustered even without label information.

IMPACT OF PRETRAINING SCALE
An aspect relevant to large-scale data mining in EO is the scaling of results with training data volume: why don't we add more images to SSL4EO-S12? One reason concerns computational costs. We believe the current dataset (1 million patches for each Sentinel product) is comparable to the scale of ImageNet and can serve as a good baseline in RS for further development. Moreover, as observed by [25], downstream performance saturates beyond 500,000 pretraining images on ImageNet, with 250,000 images yielding acceptable results with as little as 1–2% accuracy loss. We observe such a trend in our dataset, too. As demonstrated by Table 11, we pretrained on various amounts of data and report linear probing results for BigEarthNet-10%. While 50% (500,000 patches) or less of the pretraining data yields significant performance drops, the gaps diminish from 75% (750,000 patches) on. Note that this saturation effect also depends on the model size.

REPRESENTATION VISUALIZATION
We qualitatively evaluated the data representations learned from self-supervised pretraining by visualizing the latent distributions with t-distributed stochastic neighbor embedding (t-SNE) (Figure 5). We pretrained a ResNet50 with MoCo-v2 on SSL4EO-S12 and transferred the frozen encoder to EuroSAT to calculate one 128-d representation vector for each image. We then visualized all the vectors with t-SNE and compared the distribution with that of a randomly initialized encoder.

CONCLUSION
In this work, we present SSL4EO-S12, a large-scale multimodal, multitemporal, unlabeled dataset for SSL in EO. An extensive benchmark on various SSL methods and RS applications proves the promising benefits of the proposed dataset. SSL4EO-S12 has some limitations: 1) there is little coverage of polar regions, 2) geographical bias exists due to cloud filtering, 3) it is not strictly free of geospatial overlap, and 4) medium-resolution radar and multispectral images are a limited subset of EO data. Despite these limitations, we believe SSL4EO-S12 renders a valuable basis to advance self-supervised pretraining and large-scale data mining in RS.

ACKNOWLEDGMENT
This work is jointly supported by the Helmholtz Association through the Framework of Helmholtz Artificial Intelligence (Grant ZT-I-PF-5-01)–Local Unit “Munich Unit @Aeronautics, Space and Transport (MASTr)” and Helmholtz Excellent Professorship “Data Science in Earth Observation–Big Data Fusion for Urban Research” (Grant W2-W3-100); by the German Federal Ministry of Education and Research in the framework of the International Future Artificial Intelligence


Laboratory “AI4EO–Artificial Intelligence for Earth Observation: Reasoning, Uncertainties, Ethics and Beyond” (Grant 01DD20001); and by the German Federal Ministry for Economic Affairs and Climate Action in the framework of the “National Center of excellence ML4Earth” (Grant 50EE2201C). The computing resources were supported by the Helmholtz Association's Initiative and Networking Fund on the HAICORE@FZJ partition. This article has supplementary downloadable material available at https://doi.org/10.1109/MGRS.2023.3281651.

AUTHOR INFORMATION
Yi Wang ([email protected]) is with the Chair of Data Science in Earth Observation, Technical University of Munich, 80333 Munich, Germany, and the Remote Sensing Technology Institute, German Aerospace Center, 82234 Weßling, Germany. He is a Graduate Student Member of IEEE.
Nassim Ait Ali Braham ([email protected]) is with the Chair of Data Science in Earth Observation, Technical University of Munich, 80333 Munich, Germany, and the Remote Sensing Technology Institute, German Aerospace Center, 82234 Weßling, Germany.
Zhitong Xiong ([email protected]) is with the Chair of Data Science in Earth Observation, Technical University of Munich, 80333 Munich, Germany. He is a Member of IEEE.
Chenying Liu ([email protected]) is with the Chair of Data Science in Earth Observation, Technical University of Munich, 80333 Munich, Germany, and the Remote Sensing Technology Institute, German Aerospace Center, 82234 Weßling, Germany. She is a Graduate Student Member of IEEE.
Conrad M. Albrecht ([email protected]) is with the Remote Sensing Technology Institute, German Aerospace Center, 82234 Weßling, Germany.
Xiao Xiang Zhu ([email protected]) is with the Chair of Data Science in Earth Observation, Technical University of Munich, 80333 Munich, Germany. She is a Fellow of IEEE.

REFERENCES
[1] Y. Wang et al., “Self-supervised learning in remote sensing: A review,” IEEE Geosci. Remote Sens. Mag., vol. 10, no. 4, pp. 213–247, Dec. 2022, doi: 10.1109/MGRS.2022.3198244.
[2] J. Deng et al., “ImageNet: A large-scale hierarchical image database,” in Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2009, pp. 248–255, doi: 10.1109/CVPR.2009.5206848.
[3] M. Schmitt et al., “SEN12MS – A curated dataset of georeferenced multi-spectral Sentinel-1/2 imagery for deep learning and data fusion,” ISPRS Ann. Photogrammetry, Remote Sens. Spatial Inf. Sci., vol. IV-2/W7, pp. 153–160, Sep. 2019, doi: 10.5194/isprs-annals-IV-2-W7-153-2019.
[4] O. Mañas et al., “Seasonal contrast: Unsupervised pre-training from uncurated remote sensing data,” in Proc. IEEE/CVF Int. Conf. Comput. Vision, 2021, pp. 9414–9423.
[5] Z. Xiong et al., “EarthNets: Empowering AI in earth observation,” 2022, arXiv:2210.04936.


[6] K. He et al., “Momentum contrast for unsupervised visual representation learning,” in Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit., 2020, pp. 9726–9735, doi: 10.1109/CVPR42600.2020.00975.
[7] M. Caron et al., “Emerging properties in self-supervised vision transformers,” in Proc. IEEE/CVF Int. Conf. Comput. Vision, 2021, pp. 9650–9660.
[8] K. He et al., “Masked autoencoders are scalable vision learners,” 2021, arXiv:2111.06377.
[9] A. Baevski et al., “Data2vec: A general framework for self-supervised learning in speech, vision and language,” 2022, arXiv:2202.03555.
[10] M. Caron et al., “Unsupervised learning of visual features by contrasting cluster assignments,” in Proc. Adv. Neural Inf. Process. Syst., vol. 33, 2020, pp. 9912–9924.
[11] M. Neumann et al., “Training general representations for remote sensing using in-domain knowledge,” in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2020, pp. 6730–6733, doi: 10.1109/IGARSS39084.2020.9324501.
[12] Y. Wang, C. M. Albrecht, and X. X. Zhu, “Self-supervised vision transformers for joint SAR-optical representation learning,” 2022, arXiv:2204.05381.
[13] G. Sumbul et al., “BigEarthNet: A large-scale benchmark archive for remote sensing image understanding,” in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2019, pp. 5901–5904, doi: 10.1109/IGARSS.2019.8900532.
[14] N. Gorelick et al., “Google Earth Engine: Planetary-scale geospatial analysis for everyone,” Remote Sens. Environ., vol. 202, pp. 18–27, Dec. 2017, doi: 10.1016/j.rse.2017.06.031.
[15] X. Chen et al., “Improved baselines with momentum contrastive learning,” 2020, arXiv:2003.04297.
[16] X. Chen, S. Xie, and K. He, “An empirical study of training self-supervised vision transformers,” in Proc. IEEE/CVF Int. Conf. Comput. Vision, 2021, pp. 9640–9649.
[17] K. He et al., “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2016, pp. 770–778, doi: 10.1109/CVPR.2016.90.
[18] A. Dosovitskiy et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” 2020, arXiv:2010.11929.
[19] P. Helber et al., “EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 7, pp. 2217–2226, Jul. 2019, doi: 10.1109/JSTARS.2019.2918242.
[20] X. X. Zhu et al., “So2Sat LCZ42: A benchmark data set for the classification of global local climate zones,” IEEE Geosci. Remote Sens. Mag., vol. 8, no. 3, pp. 76–89, Sep. 2020, doi: 10.1109/MGRS.2020.2964708.
[21] M. Schmitt et al., “IEEE GRSS data fusion contest,” 2020. [Online]. Available: https://ieee-dataport.org/competitions/2020-ieee-grss-data-fusion-contest
[22] R. C. Daudt et al., “Urban change detection for multispectral earth observation using convolutional neural networks,” in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2018, pp. 2115–2118, doi: 10.1109/IGARSS.2018.8518015.
[23] L.-C. Chen et al., “Encoder-decoder with Atrous separable convolution for semantic image segmentation,” in Proc. Eur. Conf. Comput. Vision (ECCV), 2018, pp. 801–818.


[24] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assisted Intervention, 2015, pp. 234–241.
[25] E. Cole et al., “When does contrastive visual representation learning work?” in Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit., 2022, pp. 14,755–14,764.

[26] J. Zbontar, L. Jing, I. Misra, Y. LeCun, and S. Deny, “Barlow twins: Self-supervised learning via redundancy reduction,” in Proc. Int. Conf. Mach. Learn., 2021, pp. 12310–12320. [Online]. Available: https://proceedings.mlr.press/v139/zbontar21a.html

DANIELE REGE CAMBRIN, LUCA COLOMBA, AND PAOLO GARZA

CaBuAr: California burned areas dataset for delineation

Forest wildfires represent one of the catastrophic events that, over the last decades, have caused huge environmental and humanitarian damage. In addition to a significant amount of carbon dioxide emissions, they are a source of risk to society in both the short term (e.g., temporary city evacuation due to fire) and the long term (e.g., higher risk of landslides). Consequently, the availability of tools to support local authorities in automatically identifying burned areas plays an important role in the continuous monitoring required to alleviate the aftereffects of such catastrophic events. The great availability of satellite acquisitions coupled with computer vision techniques represents an important step in developing such tools. This article introduces a novel open dataset that tackles the burned area delineation problem, a binary segmentation problem applied to satellite imagery. The presented resource consists of pre- and postfire Sentinel-2 L2A acquisitions of California forest fires that took place from 2015 to 2022. Raster annotations were generated from the data released by California's Department of Forestry and Fire Protection. Moreover, in conjunction with the dataset, we release three different baselines based on spectral index analyses, SegFormer, and U-Net models.

INTRODUCTION
The Earth observation (EO) field has greatly increased its number of applications in the last decades thanks to the greater data availability, storage capacity, and computational power of modern systems. In fact, leveraging the data


acquired by the Sentinel [1], Landsat [2], and MODIS [3] missions, as an example, it is possible to retrieve information at a continental scale in a short amount of time. This, in conjunction with the development of modern methodologies in the field of machine learning and deep learning, represents an extremely interesting area of research for scientists and authorities from different fields, such as governments and first responders involved in disaster response and disaster recovery missions. Phenomena such as climate change and extreme climate events have a tremendous societal, economic, and environmental impact, also leading to humanitarian and environmental losses (e.g., a higher risk of landslides due to a forest fire). Indeed, leveraging EO and modern deep learning methodologies can provide useful tools in the area of disaster management and disaster recovery. Within the research community, numerous previous works proved the effectiveness of computer vision architectures in the field of disaster response, such as flood delineation [4], change detection [5], [6], and burned area delineation [7], [8], [9]. This article fits into the last-mentioned context. Specifically, we release a dataset to tackle the burned area delineation problem, i.e., a binary image segmentation problem that aims to identify areas damaged by a forest wildfire. Tackling such a problem with modern methodologies requires great data availability. However, the time and cost needed to produce high-quality annotations severely limit the ability to investigate ad hoc solutions in the EO field. For these reasons, we propose a new dataset related to forest fires in California, collecting data from the Sentinel-2 mission [10]. The dataset is publicly available to the research community at https://huggingface.co/datasets/DarthReca/california_burned_areas. Compared to


the few other datasets about wildfires [11], [12], our dataset covers a larger area and spans more years. Ground truth masks for the task of binary image segmentation were generated starting from the public vector data provided by California's Department of Forestry and Fire Protection [13] and rasterized. Satellite acquisitions, i.e., the raw input data, were instead collected from the Sentinel-2 L2A mission through Copernicus Open Access Hub. More precisely, we collected and released both prefire and postfire information associated with the same area of interest. The contributions of this article can be summarized as follows:
◗◗ A novel image segmentation dataset tailored to burned area delineation, consisting of Sentinel-2 pre- and postfire acquisitions. We provide more samples than existing datasets to facilitate the training of (large) deep learning models.
◗◗ Three different baselines evaluated on the proposed dataset: one consisting of the evaluation of several burned area indexes and Otsu's automatic thresholding method [14], one based on the SegFormer model [15], and one based on the U-Net model [16].
The article is structured as follows. The “Related Works” section introduces the related works, the “Dataset” section introduces the collected dataset and the preprocessing steps performed, and the “Tasks” and “Experiments” sections formally introduce the tasks, the experimental settings, and the results. Finally, the last section concludes the article.

RELATED WORKS
Before the development of deep learning-based methodologies, domain experts based their analyses of satellite imagery on spectral index computation and evaluation. In the synthetic aperture radar context, thresholding-based techniques have been adopted to distinguish between flooded and unflooded areas [17]. Different analyses have been performed on various tasks concerning several spectral indexes, such as cloud detection (cloud mask) [18], water presence (water pixels and the normalized difference water index) [19], [20], and vegetation analysis (the normalized difference vegetation index) [21]. Considering the burned area delineation problem, domain experts have developed several indexes: the Normalized Burn Ratio (NBR), NBR2, the Burn Area Index (BAI), and the BAI for Sentinel-2 (BAIS2) [19]. They are computed using different spectral bands to generate an index highlighting the affected areas of interest. Such techniques are often coupled with thresholding methodologies: either fixed or manually calibrated threshold values are chosen [22], or automatic thresholding algorithms are used [23]. Additional studies evaluate index-based techniques with additional in situ information, namely, the Composite Burned Area Index, which, indeed, provides insightful information but does not represent a scalable solution because in situ data are incredibly costly to collect. Furthermore, studies confirmed that finding a unique threshold that is region and


vegetation independent is difficult [24]. These methods assume that burned and unburned areas are linearly separable, which is usually untrue. More recently, researchers started adopting supervised learning techniques to solve several tasks in computer vision and EO. More precisely, convolutional neural network (CNN)-based models proved their effectiveness in image classification and segmentation tasks, achieving state-of-the-art performance compared to index-based methodologies [25], [26]. Deep models proved their effectiveness in similar tasks covering wildfire detection [27] and spreading [28], too. The main drawback is the need for a significant amount of labeled data, possibly covering heterogeneous regions with different morphological characteristics, to learn better representations. Over the years, many of the proposed frameworks have limited their analyses to a few samples collected from a limited number of countries or locations [29]. In a few cases, larger datasets were adopted to tackle the semantic segmentation problem without disclosing the dataset [27]. In the EO domain, different public datasets are available to the research community tackling different problems, such as flood delineation [17], [30], deforestation [31], wild area monitoring [32], sustainable development goal monitoring [33], and crop classification and segmentation [34], but, to the best of our knowledge, only two public datasets are available for the burned area delineation problem, covering some countries in Europe [11] and Indonesia [12]. Our dataset collects more data than these, considering more wildfires and a larger area. It comprises pre- and postfire Sentinel-2 L2A data about California forest fires. Table 1 shows a comparison among the three datasets. The proposed dataset considers the highest number of wildfires (340), globally covering the largest amount of burned area (28 million pixels covering 11,000 km²) and the highest total covered surface (450,000 km²). Figure 1(a) shows the covered areas. Even though the proposed dataset has the greatest amount of burned surface, it has the lowest percentage of burned area compared to the others. However, the CaBuAr dataset provides the highest number of training samples in both supervised (supervised learning in binary segmentation, with the highest amount of burned area) and unsupervised cases (self-supervised learning, with the highest covered area). It is well known that more curated data provide better machine learning models, and the dataset provides many ready-to-use samples without leveraging other sources. Images are larger in terms of pixels (5,490) and are disclosed

as raw data in the original and unaltered state, as directly collected from satellite instrumentation. On the other hand, the European dataset provides data collected from a third-party service for which preprocessing operations are performed. The availability of raw data enables researchers to apply the preferred preprocessing steps without any loss of information. Furthermore, the monitored range of dates of the new dataset spans from 2015 to 2022, whereas the other two datasets span a smaller time period.

TABLE 1. THE COMPARISON AMONG DATASETS. FEATURE

CaBuAr (OURS)

[11]

[12]

Region

California

Europe

Indonesia

Mission

Sentinel-2

Sentinel-1/2

Landsat-8

Resolution (m)

20

10 (S2)

30

Image size

5,490 × 5,490

Up to 5,000 × 5,000

512 × 512

Raw data

ü

û

ü

Channels

12

12

8

Forest fires

340

73

227

Start and end dates

January 2015 to December 2022

July 2017 to July 2019

January 2019 to December 2021

Total surface (km2)

~450,000

~19,000

~46,000

Burned surface [MP/km2]

~28/~11,000

~20/~2,000

~8/~7,000

Postfire

ü

ü

ü

Prefire

ü

ü

û

TD

~1 year

# 2 months

/

The highest value in each line is highlighted in bold, except for the resolution case, in which the lowest numerical value is highlighted. MP: the number of burned pixels in millions; TD: the time difference between prefire and postfire acquisitions.

DATASET The newly created dataset comprises L2A products of Sentinel-2, a European Space Agency mission. The area of interest is California, with the geographical distributions of the events shown in Figure 1(b). We collected images of the same area before and after the wildfire. It is essential to note that the L2A product contains red, green, blue (RGB) channels as well as other spectral bands in the infrared region and ultrablue for a total of 12 channels. Depending on the band, they have a resolution of 10 m, 20 m, or 60 m per pixel. PREPROCESSING The California Department of Forestry and Fire Protection publicly provides the ground truth vector information, which we converted into raster images. Each pixel contains a binary value: one in the burned area and zero in the case of undamaged areas. Although the registered wildfires span from 1898 to 2022, we collected data only for wildfires from 2015 to 2022 because there were no Sentinel-2 images before 2015. We gathered the Sentinel-2 images directly from Copernicus Open Access Hub. To minimize the effects of vegetation regrowth and postwildfire modifications, images are collected within one month after the wildfires are fully contained and extinguished. A total number of 340 acquisitions associated with 340 wildfires were downloaded, each being of size 5,490 × 5,490 pixels with a resolution of 20 m per pixel. The few Sentinel-2 bands with different resolutions were either upsampled or downsampled with bicubic interpolation to reach the target resolution. Prefire images have the same size and resolution as the postfire acquisitions. To enforce coherence and ­similar

FIGURE 1. (a) The satellite tile coverage: the California administrative boundaries (red) versus the satellite tiles of the proposed dataset (blue). (b) The locations of the wildfires (red) inside the California boundaries (blue). Map tiles by Stamen Design, CC BY 3.0; map data © OpenStreetMap contributors.


FIGURE 2. An example of prefire and postfire RGBs and the relative masks.

MANUAL INSPECTION
After collecting the data, we manually evaluated each postfire image using the RGB channels. This was done to 1) discard invalid samples and 2) enrich the dataset with metadata and comments based on our subjective evaluation. We remark that such comments are not helpful for the final prediction task but can be used to better characterize the data. Our evaluation is associated with each satellite acquisition. Each image has a metadata field with a list of numeric codes generated from the manual inspection. Table 2 reports the code-to-comment association. As can be seen, different climatic conditions can be found in the dataset. Figure 3 reports some examples of postfire images. For each postfire acquisition, Figure 3 reports its RGB version (first line), its binary mask (second line), and the comment(s) assigned to it (on top of the RGB image). For instance, the second acquisition has two comments: comments 2 and 11. We noted that some masks seem to overestimate the burned area. However, our perception refers to the RGB version of the images, i.e., to a subset of the available information. Moreover, our subjective perception can also be biased because the regions at the borders of burned areas are usually less damaged than the central ones. These notes extend to other mask-related comments, but they are rarer. Comments are almost equally distributed among the folds. None of the 340 acquisitions include comments that would negatively affect the results and the dataset's quality (i.e., comments 4, 8, and 12). Finally, each prefire image was manually inspected to verify its validity, but no new comment types were added. All invalid prefire acquisitions were discarded.

TABLE 2. THE ASSOCIATION BETWEEN CODES AND COMMENTS.
COMMENT | MEANING
0 | The affected area is in the incomplete region.
1 | The image is incomplete.
2 | There is a small burned area.
3 | The mask has a small offset.
4 | The mask is totally wrong.
5 | There is an extensive burned area.
6 | There are clouds over the burned area.
7 | There are too many clouds over the image.
8 | The wildfire is ongoing.
9 | There is snow on the burned area.
10 | The mask seems smaller than the burned area.
11 | The mask seems bigger than the burned area.
12 | The mask is in the missing data area.
13 | Part of the mask is outside the area.

TASKS
The proposed dataset can be used as a benchmark for different tasks in supervised and unsupervised scenarios. Having at our disposal two sets of images, called PS and PR, containing postfire and prefire samples, respectively, the tasks we considered in this article can be formulated as follows:
1) Binary segmentation through machine learning methods based on postfire acquisitions only: This involves a supervised learning algorithm (AS) performing pixel-level prediction based on samples from PS. AS labels the pixels of an image as burned or undamaged, creating a binary mask for each new image.
2) Binary segmentation through machine learning models based on prefire and postfire acquisitions: This involves a supervised learning algorithm (AS) performing pixel-level prediction considering samples from PS and PR.
3) Binary segmentation through spectral indexes: This involves a spectral index (SI) designed for burned area identification. Taking samples from PS, the SI outputs a value for each pixel, creating a matrix PS^I. Then, a binary mask, burned–unburned, can be made by thresholding PS^I.






4) Binary segmentation through differential spectral indexes: This involves a spectral index (SI) designed for burned area identification based on the comparison of pre- and postwildfire images. Taking samples from PS and PR, the SI outputs a value for each pixel, creating the matrices PS^I and PR^I. Then, a binary mask, burned–unburned, can be made by thresholding the difference PS^I − PR^I.

EXPERIMENTS
Our experiments test various classical threshold-based and deep learning-based methods considering three different data settings:
1) Usage of all the available postfire images: setting 1, tackling tasks 1 and 3.
2) Usage of the subset of postfire images for which the corresponding prefire image is available, without using the prefire image to train the models: setting 2, tackling tasks 1 and 3.
3) Usage of postfire and prefire images: setting 3, tackling tasks 2 and 4. Thus, two input images are considered for each area. Spectral indexes in this setting were evaluated by computing the difference between the prefire and postfire indexes.
The code for the experiments can be found at https://github.com/DarthReca/CaBuAr.

TABLE 2. THE ASSOCIATION BETWEEN CODES AND COMMENTS.
COMMENT   MEANING
0         The affected area is in the incomplete region.
1         The image is incomplete.
2         There is a small burned area.
3         The mask has a small offset.
4         The mask is totally wrong.
5         There is an extensive burned area.
6         There are clouds over the burned area.
7         There are too many clouds over the image.
8         The wildfire is ongoing.
9         There is snow on the burned area.
10        The mask seems smaller than the burned area.
11        The mask seems bigger than the burned area.
12        The mask is in the missing data area.
13        Part of the mask is outside the area.

EXPERIMENTAL SETTINGS
The encoder of SegFormer is initialized with the original ImageNet weights, duplicated four times along the input-channel axis to handle the 12 available channels of the Sentinel-2 L2A acquisitions. U-Net is, instead, randomly initialized. The batch size was set to eight. We used the AdamW optimizer with an initial learning rate of 0.001, decreased by a factor of 10 every 15 epochs, and a weight decay of 0.01 for every considered model. We used the well-known Dice loss [35] as the loss function. All models were trained on one Tesla V100 32-GB GPU. The testing was made using the weights associated with the best validation loss. Due to the size of the original input images (5,490 × 5,490), we split them into patches of size 512 × 512. Furthermore, due to class imbalance, we kept only those patches containing at least one pixel associated with the positive class and no clouds over the area of interest (comment 6 in Table 2). A total of 534 patches for setting 1 and 356 for settings 2 and 3 were obtained. The statistics and performances reported in the remainder of the article refer to the data obtained after this split-and-filter process. All training and evaluation procedures were performed with a cross-validation approach. The same criterion was applied to the spectral index methodologies to obtain comparable results despite the absence of a trainable model. The reported values are expressed as mean and standard deviation computed over the five folds. In Figure 4, we highlight the percentage of burned pixels per image in each fold.
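A minimal PyTorch sketch of this reported setup follows: AdamW with the stated schedule and weight decay, a soft Dice loss (reference [35] describes the generalized variant), and the channel-duplication trick for pretrained encoder weights. The stand-in model, tensor shapes, and loop are schematic assumptions; the authors' repository contains the actual pipeline.

```python
# Schematic PyTorch version of the stated configuration. The Conv2d stands in
# for U-Net/SegFormer; patches are shrunk from 512 x 512 to keep the toy run light.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import StepLR

def dice_loss(logits, target, eps=1.0):
    # Soft Dice on the burned class; eps smooths patches with tiny masks.
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(1, 2, 3))
    denom = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    return (1.0 - (2.0 * inter + eps) / (denom + eps)).mean()

# ImageNet encoders expect 3 input channels; tiling the first-layer kernel four
# times along the input-channel axis yields a 12-channel variant, as described.
w3 = torch.randn(64, 3, 7, 7)            # stand-in for a pretrained 3-channel kernel
w12 = w3.repeat(1, 4, 1, 1)              # -> (64, 12, 7, 7)

model = torch.nn.Conv2d(12, 1, kernel_size=3, padding=1)    # stand-in segmenter
optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
scheduler = StepLR(optimizer, step_size=15, gamma=0.1)      # lr / 10 every 15 epochs

images = torch.randn(8, 12, 128, 128)                       # batch of 8 patches
masks = torch.randint(0, 2, (8, 1, 128, 128)).float()       # binary ground truth
for epoch in range(2):                                      # schematic loop
    optimizer.zero_grad()
    loss = dice_loss(model(images), masks)
    loss.backward()
    optimizer.step()
    scheduler.step()
```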

FIGURE 4. The burned pixels percentage per image per fold: (a) setting 1 and (b) settings 2 and 3.


Even though the data were split randomly, fold 0 is characterized by a larger variability in terms of the number of burned pixels per image.

SPECTRAL INDEXES
We evaluated several spectral indexes (NBR, NBR2, BAI, and BAIS2) for the burned area delineation task. Table 3 summarizes the spectral bands exploited by the various indexes; it is possible to note that many bands are common to several of them. The indexes take as input some bands of a Sentinel-2 product and output a value for each pixel. This value is generally thresholded to create a binary mask providing the burned/unburned information for each pixel. In particular, to assess the performances on the dataset, we computed the Separability Index (SI) [19] (see Table 4), which quantifies how well the index under analysis discerns between burned and unburned regions; i.e., a higher value of SI implies that the classes are more separable from each other. We applied Otsu's thresholding method to quantify the indexes' segmentation performances. The results, shown in Table 4, confirm the poor performances in terms of F1 score and Intersection over Union. Additionally, the availability of prefire images (setting 3) does not significantly improve the evaluation metrics. Figure 5 shows an example of predictions for the cited indexes applying Otsu's method. In this example, BAI and NBR achieve the best scores, but many "disturbances" affect the final result in the unburned regions. Figure 5(b), a zoom on an actually unburned area, shows that many false positive points lie inside the considered unburned area; we refer to this situation as "disturbances."
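The index-plus-threshold baseline can be sketched in a few lines: compute NBR from the B8 (near-infrared) and B12 (shortwave-infrared) bands, as listed in Table 3, and binarize it with Otsu's method. The random arrays below are stand-ins for real reflectances; with real data, burned surfaces fall on the low-NBR side of the cut. For the differential indexes of setting 3, the same thresholding is applied to the difference of the prefire and postfire index matrices.

```python
# Index-plus-threshold baseline: NBR = (B8 - B12) / (B8 + B12), then Otsu.
import numpy as np
from skimage.filters import threshold_otsu

rng = np.random.default_rng(0)
b08 = rng.random((512, 512)).astype(np.float32)   # NIR band stand-in
b12 = rng.random((512, 512)).astype(np.float32)   # SWIR band stand-in

nbr = (b08 - b12) / (b08 + b12 + 1e-6)            # values roughly in [-1, 1]

# Fire lowers NIR and raises SWIR reflectance, so burned surfaces sit on the
# low-NBR side of the Otsu cut.
threshold = threshold_otsu(nbr)
burned = nbr < threshold                          # binary burned/unburned mask
```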


TABLE 3. SPECTRAL BANDS USED BY THE INDEXES.
INDEX   ULTRABLUE   VISIBLE   VNIR              SWIR
NBR     —           —         B8                B12
NBR2    —           —         —                 B11 and B12
BAI     —           B4        B8                —
BAIS2   —           B4        B8A, B7, and B6   B12
Only the visible, VNIR, and SWIR bands are exploited. B: band; SWIR: shortwave infrared; VNIR: visible and near infrared.

TABLE 4. THE SI AND METRICS COMPUTED FOR EACH SETTING AND EACH EVALUATED INDEX.
SETTING   INDEX    SI      F1 SCORE        IOU
1         NBR      0.294   0.15 ± 0.231    0.103 ± 0.18
1         NBR2     0.224   0.226 ± 0.269   0.159 ± 0.209
1         BAI      0.044   0.04 ± 0.121    0.026 ± 0.086
1         BAIS2    0.027   0.194 ± 0.292   0.148 ± 0.252
2         NBR      0.32    0.106 ± 0.196   0.071 ± 0.15
2         NBR2     0.349   0.243 ± 0.278   0.172 ± 0.218
2         BAI      0.052   0.037 ± 0.115   0.024 ± 0.079
2         BAIS2    0.002   0.086 ± 0.174   0.057 ± 0.138
3         dNBR     0.247   0.114 ± 0.212   0.079 ± 0.168
3         dNBR2    0.189   0.218 ± 0.281   0.157 ± 0.225
3         dBAI     0.04    0.066 ± 0.161   0.045 ± 0.127
3         dBAIS2   0.027   0.047 ± 0.126   0.03 ± 0.099
dNBR: delta NBR; dNBR2: delta NBR2; dBAI: delta BAI; dBAIS2: delta BAIS2; IOU: Intersection over Union.

FIGURE 5. An example of segmentation using Otsu's method for different indexes (panels, left to right: image, NBR, NBR2, BAIS2, BAI, and ground truth): (a) predictions and (b) zoomed predictions. In (b), the zoom was applied to the top left corner of the images in (a), boxed in red, to show more clearly an area with false positive predictions.

DEEP LEARNING MODELS
We tested two deep learning architectures for semantic segmentation: a CNN (U-Net [16]) and a vision transformer (SegFormer [15]). We took into account two different versions of SegFormer (B0, the smallest version, and B3, a mid-range version) that differ only in size and, therefore, in the number of parameters: U-Net, SegFormer-B0, and SegFormer-B3 consist of 31 million, 3.8 million, and 47 million parameters, respectively. To deal with setting 3, the two input images (pre- and postfire) are concatenated along the channel axis, creating patches of size 24 × 512 × 512 (C × H × W). We report the results for the different settings and models in Table 5. Without any specific pretraining, U-Net provides the best performance in every setting. SegFormer-B0, which is also lighter than U-Net, provides comparable performance, having some difficulties only with setting 3. SegFormer-B3 does not justify its greater complexity considering its results. In this case, prefire images do not provide any improvements, either. This can be explained by the curse of dimensionality, which affects almost all machine learning models: the concatenation approach we applied increases the number of features without increasing the number of input samples. An open research direction is the design of more sophisticated models to exploit both images effectively.
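The setting-3 input construction described above amounts to a simple channel-axis concatenation; a minimal sketch with random stand-ins for real patches:

```python
# Setting 3 input: prefire and postfire 12-band patches stacked along the
# channel axis into a single 24 x H x W tensor.
import torch

prefire = torch.randn(12, 512, 512)          # C x H x W prefire patch
postfire = torch.randn(12, 512, 512)         # matching postfire patch
x = torch.cat([prefire, postfire], dim=0)    # -> 24 x 512 x 512
assert x.shape == (24, 512, 512)
```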


Figure 6 reports the predictions of these models on the same input sample shown in Figure 5. The most evident difference is that the deep models tend to be more precise and less affected by false positives in the unburned areas (i.e., there are fewer "disturbances"). Looking attentively at Figure 5(b), many false positive points can be seen. The greatest problem of thresholding techniques is that they try to find a linear separation between classes, which is frequently unrealistic. This is why deep learning models, which support nonlinearities, perform better [36], [37]. The substantial variability of the thresholding technique, shown by its higher variance (see Table 4), could be caused by the fact that some areas are more separable than others.

CONCLUSION
This article introduced a new dataset for burned area delineation containing samples with different morphological features of the terrain and different sizes of burned regions. The dataset includes both prefire and postfire data.

TABLE 5. THE METRICS CALCULATED FOR EACH DEEP LEARNING MODEL EVALUATED.
SETTING   METRIC     SEGFORMER-B3    SEGFORMER-B0    U-NET
1         F1 score   0.62 ± 0.009    0.686 ± 0.004   0.707 ± 0.004
1         IoU        0.497 ± 0.008   0.563 ± 0.004   0.583 ± 0.004
2         F1 score   0.583 ± 0.014   0.654 ± 0.003   0.705 ± 0.002
2         IoU        0.447 ± 0.012   0.535 ± 0.003   0.577 ± 0.002
3         F1 score   0.533 ± 0.003   0.499 ± 0.009   0.625 ± 0.002
3         IoU        0.401 ± 0.003   0.37 ± 0.007    0.494 ± 0.002

FIGURE 6. Examples of predictions with deep learning models (panels, left to right: image, SegFormer-B3, U-Net, SegFormer-B0, and ground truth).


We provided baselines based on different approaches to assess the quality of basic methods and encourage further research activities. This publicly available dataset can benefit researchers and public authorities in further tasks, such as recovery planning, constant monitoring of affected areas, and the development of deep learning models for burned area delineation. We plan to extend the dataset to new regions and satellite acquisitions continuously. The collection of satellite acquisitions is made publicly available to encourage future use and research activities.

AUTHOR INFORMATION
Daniele Rege Cambrin ([email protected]) is with Politecnico di Torino, 10129 Turin, Italy.
Luca Colomba ([email protected]) is with Politecnico di Torino, 10129 Turin, Italy.
Paolo Garza ([email protected]) is with Politecnico di Torino, 10129 Turin, Italy.

REFERENCES
[1] "Sentinel missions," European Space Agency, Paris, France, 2023. [Online]. Available: https://sentinel.esa.int/web/sentinel/missions
[2] "Landsat mission," Nat. Aeronaut. Space Admin., Washington, DC, USA, 2023. [Online]. Available: https://landsat.gsfc.nasa.gov/
[3] "Moderate-resolution imaging spectroradiometer (MODIS)," Nat. Aeronaut. Space Admin., Washington, DC, USA, 2023. [Online]. Available: https://www.earthdata.nasa.gov/sensors/modis
[4] Z. Dong, G. Wang, S. O. Y. Amankwah, X. Wei, Y. Hu, and A. Feng, "Monitoring the summer flooding in the Poyang Lake area of China in 2020 based on Sentinel-1 data and multiple convolutional neural networks," Int. J. Appl. Earth Observ. Geoinf., vol. 102, Oct. 2021, Art. no. 102400, doi: 10.1016/j.jag.2021.102400.
[5] A. Asokan and J. Anitha, "Change detection techniques for remote sensing applications: A survey," Earth Sci. Informat., vol. 12, no. 2, pp. 143–160, Feb. 2019, doi: 10.1007/s12145-019-00380-5.
[6] J. Sublime and E. Kalinicheva, "Automatic post-disaster damage mapping using deep-learning techniques for change detection: Case study of the Tohoku tsunami," Remote Sens., vol. 11, no. 9, May 2019, Art. no. 1123, doi: 10.3390/rs11091123.
[7] R. Lasaponara and B. Tucci, "Identification of burned areas and severity using SAR Sentinel-1," IEEE Geosci. Remote Sens. Lett., vol. 16, no. 6, pp. 917–921, Jun. 2019, doi: 10.1109/LGRS.2018.2888641.
[8] D. R. Cambrin, L. Colomba, and P. Garza, "Vision transformers for burned area delineation," in Proc. Conf. Mach. Learn. Princ. Pract. Knowl. Discovery Databases, 2022.
[9] M. A. Tanase et al., "Burned area detection and mapping: Intercomparison of Sentinel-1 and Sentinel-2 based algorithms over tropical Africa," Remote Sens., vol. 12, no. 2, Jan. 2020, Art. no. 334, doi: 10.3390/rs12020334.
[10] "Sentinel-2 mission guide," European Space Agency, Paris, France, 2023. Accessed: Apr. 14, 2023. [Online]. Available: https://sentinel.esa.int/web/sentinel/missions/sentinel-2
[11] L. Colomba et al., "A dataset for burned area delineation and severity estimation from satellite imagery," in Proc. 31st ACM Int. Conf. Inf. Knowl. Manage., 2022, pp. 3893–3897, doi: 10.1145/3511808.3557528.


[12] Y. Prabowo et al., "Deep learning dataset for estimating burned areas: Case study, Indonesia," Data, vol. 7, no. 6, Jun. 2022, Art. no. 78, doi: 10.3390/data7060078.
[13] "The department of forestry and fire protection," California Department of Forestry and Fire Protection, Sacramento, CA, USA, 2023. Accessed: Apr. 14, 2023. [Online]. Available: https://www.fire.ca.gov/
[14] N. Otsu, "A threshold selection method from gray-level histograms," IEEE Trans. Syst., Man, Cybern., vol. 9, no. 1, pp. 62–66, Jan. 1979, doi: 10.1109/TSMC.1979.4310076.
[15] E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, and P. Luo, "SegFormer: Simple and efficient design for semantic segmentation with transformers," in Proc. Adv. Neural Inf. Process. Syst., Red Hook, NY, USA: Curran Associates, 2021, vol. 34, pp. 12,077–12,090.
[16] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Medical Image Computing and Computer-Assisted Intervention, N. Navab, J. Hornegger, W. Wells, and A. Frangi, Eds. Cham, Switzerland: Springer International Publishing, 2015, pp. 234–241.
[17] D. Bonafilia, B. Tellman, T. Anderson, and E. Issenberg, "Sen1Floods11: A georeferenced dataset to train and test deep learning flood algorithms for Sentinel-1," in Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit. (CVPR) Workshops, Jun. 2020, pp. 835–845, doi: 10.1109/CVPRW50498.2020.00113.
[18] R. A. Frey, S. A. Ackerman, R. E. Holz, S. Dutcher, and Z. Griffith, "The continuity MODIS-VIIRS cloud mask," Remote Sens., vol. 12, no. 20, Oct. 2020, Art. no. 3334, doi: 10.3390/rs12203334.
[19] F. Filipponi, "BAIS2: Burned area index for Sentinel-2," Proceedings, vol. 2, no. 7, Mar. 2018, Art. no. 364, doi: 10.3390/ecrs-2-05177.
[20] A. Fisher, N. Flood, and T. Danaher, "Comparing Landsat water index methods for automated water classification in eastern Australia," Remote Sens. Environ., vol. 175, pp. 167–182, Mar. 2016, doi: 10.1016/j.rse.2015.12.055.
[21] N. Pettorelli, J. O. Vik, A. Mysterud, J.-M. Gaillard, C. J. Tucker, and N. C. Stenseth, "Using the satellite-derived NDVI to assess ecological responses to environmental change," Trends Ecology Evol., vol. 20, no. 9, pp. 503–510, Sep. 2005, doi: 10.1016/j.tree.2005.05.011.
[22] L. Saulino et al., "Detecting burn severity across Mediterranean forest types by coupling medium-spatial resolution satellite imagery and field data," Remote Sens., vol. 12, no. 4, Feb. 2020, Art. no. 741, doi: 10.3390/rs12040741.
[23] W. Bin et al., "A method of automatically extracting forest fire burned areas using GF-1 remote sensing images," in Proc. IEEE Int. Geosci. Remote Sens. Symp. (IGARSS), 2019, pp. 9953–9955, doi: 10.1109/IGARSS.2019.8900399.
[24] C. A. Cansler and D. McKenzie, "How robust are burn severity indices when applied in a new region? Evaluation of alternate field-based and remote-sensing methods," Remote Sens., vol. 4, no. 2, pp. 456–483, Feb. 2012, doi: 10.3390/rs4020456.
[25] L. Knopp, M. Wieland, M. Rättich, and S. Martinis, "A deep learning approach for burned area segmentation with Sentinel-2 data," Remote Sens., vol. 12, no. 15, Jul. 2020, Art. no. 2422, doi: 10.3390/rs12152422.


[26] A. Farasin, L. Colomba, G. Palomba, G. Nini, and C. Rossi, "Supervised burned areas delineation by means of Sentinel-2 imagery and convolutional neural networks," in Proc. Int. Conf. Inf. Syst. Crisis Response Manage. (ISCRAM), Blacksburg, VA, USA: Virginia Tech, 2020, pp. 24–27.
[27] D. Rashkovestsky, F. Mauracher, M. Langer, and M. Schmitt, "Wildfire detection from multisensor satellite imagery using deep semantic segmentation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 7001–7016, Jun. 2021, doi: 10.1109/JSTARS.2021.3093625.
[28] F. Huot, R. L. Hu, N. Goyal, T. Sankar, M. Ihme, and Y.-F. Chen, "Next day wildfire spread: A machine learning dataset to predict wildfire spreading from remote-sensing data," IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–13, Jul. 2022, doi: 10.1109/TGRS.2022.3192974.
[29] Q. Safder, H. Zhang, and Z. Zheng, "Burnt area segmentation with densely layered capsules," in Proc. IEEE Int. Geosci. Remote Sens. Symp., 2022, pp. 2199–2202, doi: 10.1109/IGARSS46834.2022.9884323.
[30] F. Montello, E. Arnaudo, and C. Rossi, "MMFlood: A multimodal dataset for flood delineation from satellite imagery," IEEE Access, vol. 10, pp. 96,774–96,787, Sep. 2022, doi: 10.1109/ACCESS.2022.3205419.
[31] D. John and C. Zhang, "An attention-based U-Net for detecting deforestation within satellite sensor imagery," Int. J. Appl. Earth Observ. Geoinf., vol. 107, Mar. 2022, Art. no. 102685, doi: 10.1016/j.jag.2022.102685.
[32] B. Ekim, T. T. Stomberg, R. Roscher, and M. Schmitt, "MapInWild: A remote sensing dataset to address the question of what makes nature wild [Software and Data Sets]," IEEE Geosci. Remote Sens. Mag., vol. 11, no. 1, pp. 103–114, Mar. 2023, doi: 10.1109/MGRS.2022.3226525.
[33] C. Yeh et al., "SustainBench: Benchmarks for monitoring the sustainable development goals with machine learning," in Proc. 35th Conf. Neural Inf. Process. Syst. Datasets Benchmarks Track (Round 2), Dec. 2021. [Online]. Available: https://openreview.net/forum?id=5HR3vCylqD
[34] D. Sykas, M. Sdraka, D. Zografakis, and I. Papoutsis, "A Sentinel-2 multiyear, multicountry benchmark dataset for crop classification and segmentation with deep learning," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 15, pp. 3323–3339, Apr. 2022, doi: 10.1109/JSTARS.2022.3164771.
[35] C. H. Sudre, W. Li, T. Vercauteren, S. Ourselin, and M. Jorge Cardoso, "Generalised Dice overlap as a deep learning loss function for highly unbalanced segmentations," in Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Cham, Switzerland: Springer International Publishing, 2017, pp. 240–248.
[36] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," 2014, arXiv:1409.1556.
[37] A. Garcia-Garcia, S. Orts-Escolano, S. Oprea, V. Villena-Martinez, P. Martinez-Gonzalez, and J. Garcia-Rodriguez, "A survey on deep learning techniques for image and video semantic segmentation," Appl. Soft Comput., vol. 70, pp. 41–65, Sep. 2018, doi: 10.1016/j.asoc.2018.05.018. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1568494618302813


CONFERENCE REPORTS ALBERTO MOREIRA, FRANCESCA BOVOLO, DAVID LONG, AND ANTONIO PLAZA

IGARSS 2023 in Pasadena, California: Impressions of the First Days

After four years of online and hybrid conferences, the 43rd IEEE International Geoscience and Remote Sensing Symposium (IGARSS) was held in person again on 16–21 July 2023. The symposium took place at the Pasadena Convention Center in sunny Pasadena, California, USA, only 11 miles from Los Angeles (Figure 1). At the base of the San Gabriel Mountains in the San Gabriel Valley, filled with historic architecture and national landmarks, Pasadena is known as the "crown of the valley." Organizing such a big conference with thousands of attendees from around the world may have posed a challenge, but it was met most professionally by the local IGARSS 2023 organizing team, Conference Management Services, and the IEEE Geoscience and Remote Sensing Society (GRSS), and it exceeded expectations. The symposium aimed at providing a platform for sharing knowledge and experience on recent developments and advancements in geoscience and remote sensing technologies, particularly in the context of Earth observation, disaster monitoring, and risk assessment. A variety of programs were offered, such as keynote talks, technical sessions, tutorials, exhibitions, a Young Professionals' (YP's) mixer, presentation and writing workshops, a career panel, a Technology, Industry, and Education (TIE) forum, a technical tour, an awards banquet, a 3-min thesis competition, a student paper contest, and a summer school prior to the symposium. The following are some highlights from the IGARSS 2023 opening and plenary session, held on Monday, 17 July 2023.

WELCOME ADDRESSES AT THE PLENARY SESSION
The plenary ceremony of IGARSS 2023 started on Monday with an introduction of the conference by IGARSS 2023 general cochairs Shannon Brown and Sidharth Misra (see Figure 2). They gave the welcome address, described the logistics for the week as well as the social program, and highlighted the most important events of the week. The IGARSS 2023 technical program was presented by Dr. Rashmi Shah and Dr. David Kunkee, technical committee cochairs (see Figure 3). First, Dr. Kunkee presented some astonishing statistics from the conference. The success of such a big conference is unthinkable without the help of many volunteers: 328 session organizers and 1,294 reviewers were involved in the preparations.

FIGURE 1. City of Pasadena, CA, USA, the venue of IGARSS 2023. [Courtesy of Visit Pasadena (visitpasadena.com)]

Digital Object Identifier 10.1109/MGRS.2023.3303685 Date of current version: 19 September 2023


FIGURE 2. Opening remarks of IGARSS 2023 general cochairs Shannon Brown (right) and Sidharth Misra (left).



TABLE 1. PRESENTATIONS AND ATTENDANCE.
TOTAL PAPERS SUBMITTED: 3,688
TOTAL PAPERS ACCEPTED: 2,868
ORAL PAPERS: 1,534
POSTER PAPERS: 1,334
ORAL SESSIONS: 319
POSTER SESSIONS: 154
TOTAL REGISTERED: 2,768
STUDENTS: 961


There was a record number of submissions (about 3,700), which finally resulted in 2,868 oral or poster presentations, assisted by 475 session cochairs and 120 session managers (see Table 1). Several changes were made this year to improve the IGARSS technical program: shorter abstracts were used to streamline abstract submission, and, instead of invited sessions, "community contributed sessions" were introduced to get more of the community involved. Also, the structure of the oral sessions was changed to include more discussion among the community members: presentations were planned to be 12 min long, with a single 15-min slot for questions at the end of each session. Dr. Rashmi also recommended the TIE and YP events and spoke about the 13 tutorials, with more than 300 participants, that were held in the run-up to the conference and were a great success.

FIGURE 3. Technical committee cochairs Dr. Rashmi Shah (right) and Dr. David Kunkee (left) presenting the technical program.

After Dr. Rashmi spoke, 2023 IEEE President-Elect Dr. Tom Coughlin's introduction followed (see Figure 4). He sees IEEE as a resource for technology decisions; as technology of all sorts drives the world's economy, this is something that is very important to be aware of. IEEE is the largest technical professional organization in the world: its members are involved in all aspects of technology creation and use, its research powers patents, and it creates the world's technical standards. IEEE also fosters efforts in future directions, technical road maps, and tracking megatrends, as well as informing public policy and serving as a resource for technical discussions. IEEE has more than 420,000 members in more than 190 countries and sponsors more than 2,000 conferences in 96 countries annually. In addition to its 46 Societies and technical councils, it provides many volunteer opportunities that help Members build networks and learn new concepts. IEEE is the most-cited publisher in new patents from top-patenting organizations, and IEEE research is increasingly valuable to innovators. As part of his IEEE presidency, Dr. Coughlin would like to increase IEEE's outreach to younger members and the broader public, increase engagement with industry, and make investments in new products and services.

FIGURE 4. 2023 IEEE President-Elect Dr. Tom Coughlin's introduction.

After Dr. Coughlin's presentation, GRSS President Dr. Mariko S. Burgin gave a warm welcoming address to all attendees and reported on the activities of the GRSS (see Figure 5). The GRSS is one of 39 IEEE Societies, and it is a truly global community: it has nearly 5,000 members in 144 countries and is organized into 128 Chapters with 12 ambassadors all over the world. The GRSS is governed by the GRSS AdCom.

FIGURE 5. Opening remarks of Dr. Mariko Burgin, 2023 president of the GRSS, during the plenary session.


The GRSS fosters engagement of its members for the benefit of society through science, engineering, applications, and education as related to the development of the field of geoscience and remote sensing. The GRSS is also a group of scientists, researchers, and practitioners with common interests and a common framework for building a community. Dr. Burgin also highlighted five important GRSS areas. The GRSS disseminates premium science by sponsoring four refereed publications [IEEE Geoscience and Remote Sensing Magazine (GRSM), IEEE Transactions on Geoscience and Remote Sensing (TGRS), IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (J-STARS), and IEEE Geoscience and Remote Sensing Letters], cosponsoring IEEE Journal on Miniaturization for Air and Space Systems, and curating an eNewsletter. Eight GRSS technical committees organize a wide variety of technical activities. The GRSS provides connections and networking opportunities, such as the IGARSS, regional symposia, smaller conferences, and cosponsored conferences and workshops. The GRSS also organizes professional activities with distinguished lecturers, YP events, professional development microgrants, the Women Mentoring Women program, and much more. Finally, the GRSS provides learning

FIGURE 6. IEEE Fellow Award recipient Prof. James Garrison.


and other opportunities, like GRSS schools, webinar series, high school and undergraduate student outreach, student grand challenges, travel grants, and more. All of this speaks in favor of choosing the GRSS as a "professional home"—a place for technical excellence, where you are welcome and where you belong. It is a place that you can make your own because the GRSS can help you interact with like-minded researchers, engineers, and developers to make a difference in the world through remote sensing.

MAJOR AWARDS CEREMONY
Following the opening remarks for IGARSS 2023, Prof. Alberto Moreira, chair of the GRSS Major Awards Committee, opened the 2023 awards ceremony. As in the past, the opening and plenary session of IGARSS 2023 was chosen for the recognition of IEEE GRSS members elevated to IEEE Fellow grade and for the four major awards of the GRSS. For each award, 2023 GRSS President Dr. Mariko Burgin and IEEE President-Elect Dr. Tom Coughlin presented the recognitions and congratulated the awardees.

IEEE FELLOW AWARDS
The grade of IEEE Fellow recognizes unusual distinction in the profession and is conferred only by invitation of the IEEE Board of Directors upon a person of outstanding and extraordinary qualifications and experience in IEEE-designated fields. The IEEE bylaws limit the number of Members who can be advanced to Fellow grade in any one year to one per mil of the Institute membership, exclusive of students and affiliates. To qualify, the candidate must be a Senior Member and must be nominated by an individual familiar with the candidate's achievements. Endorsements are required from at least five IEEE Fellows and from the IEEE Society best qualified to judge. The GRSS IEEE Fellow Committee completes the first evaluation of the nominees. After this, the IEEE Fellow Committee, comprising approximately 50 IEEE Fellows, carefully evaluates all nominations, considering the Society rankings, and presents a list of recommended candidates to the IEEE Board of Directors for the final election. The GRSS consistently performs above average with respect to the number of elected Fellows each year. This year, four GRSS members were promoted to IEEE Fellow.
The first Fellow recognition went to Prof. James Garrison with the following citation (see Figure 6): "For contributions to Earth remote sensing using signals of opportunity." James Garrison is a professor in the School of Aeronautics and Astronautics at Purdue University with a courtesy appointment in the School of Electrical and Computer Engineering and the Ecological Sciences and Engineering Interdisciplinary Graduate program. His research interests include Earth remote sensing using GNSSs and signals of opportunity. He is the principal investigator for SNOOPI, a NASA mission to demonstrate remote sensing with P-band signals of opportunity. Prior to his academic position, Prof. Garrison was with NASA. He earned a Ph.D. degree from the University of Colorado Boulder in 1997 and also holds a B.S. degree from the Rensselaer Polytechnic Institute and an M.S. degree from Stanford University. He is a fellow of the Institute of Navigation. From 2018 to 2022, he was editor-in-chief of GRSM.


FIGURE 7. IEEE Fellow Award recipient Prof. Jonathan Li (middle) with Dr. Mariko Burgin (right) and IEEE President-Elect Dr. Tom Coughlin (left).

FIGURE 8. IEEE Fellow Award recipient Prof. Gabriele Moser.


The second Fellow recognition went to Prof. Jonathan Li with the citation (see Figure 7) "For contribution to point cloud analytics in lidar remote sensing." Prof. Jonathan Li received his Ph.D. degree in geomatics engineering from the University of Cape Town, South Africa, in 2000. He is currently a professor of geomatics and systems design engineering with the University of Waterloo, Canada. His main research interests include artificial intelligence (AI)-based 3D geospatial information extraction from Earth observation images and lidar point clouds, photogrammetry and pointgrammetry for high-definition map generation, 3D vision, and GeoAI for digital twin cities. He has coauthored more than 530 publications, more than 330 of which were published in refereed journals, and has also published papers in flagship conferences in computer vision and AI. His publications have received more than 15,000 Google citations, with an h-index of 65. He has supervised more than 120 master's and Ph.D. students as well as postdoctoral fellows to completion. He is a fellow of both the Canadian Academy of Engineering and the Engineering Institute of Canada and is the recipient of more than 20 prestigious awards.
The third Fellow recognition was received by Prof. Gabriele Moser with the following citation (see Figure 8): "For contributions to pattern recognition in remote sensing." Prof. Gabriele Moser is a full professor of telecommunications at the University of Genoa. His research activity is focused on pattern recognition and image processing methodologies for remote sensing and energy applications. He served as chair of the GRSS Image Analysis and Data Fusion Technical Committee (IADF TC) from 2013 to 2015 and as IADF TC cochair from 2015 to 2017. He was publication cochair of IGARSS 2015, technical program cochair of the GRSS EarthVision workshop at the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), and coorganizer of the second edition of EarthVision at CVPR 2017. Since 2019, he has been the head of the M.Sc. program in Engineering for Natural Risk Management at the University of Genoa. Since 2021, he has been a member of the national evaluation committee for national scientific qualification (Abilitazione Scientifica Nazionale) as a full professor in the telecommunications field in Italy.
The fourth Fellow recognition was presented to Prof. Ping Yang with the citation (see Figure 9) "For seminal contributions to radiative transfer and remote sensing of ice clouds and dust aerosols."

FIGURE 9. IEEE Fellow Award recipient Prof. Ping Yang.


Prof. Ping Yang is a University Distinguished Professor at Texas A&M University (TAMU), where he currently serves as senior associate dean for research and graduate studies in the College of Arts and Sciences. He has joint professorship appointments with the Department of Physics & Astronomy and the Department of Oceanography, and he holds the David Bullock Harris Chair in Geosciences. Prof. Yang has supervised the completion of 29 doctoral dissertations and 20 master's degree theses. He has published 359 peer-reviewed journal papers and four monographs. His publications have been cited 23,483 times (Google Scholar), with an h-index of 78. His research focuses on light scattering, radiative transfer, and satellite-based remote sensing. Prof. Yang is a recipient of the NASA Exceptional Scientific Achievement Medal (2017), the American Geophysical Union Atmospheric Sciences Ascent Award (2013), the David and Lucille Atlas Remote Sensing Prize from the American Meteorological Society (2020), and the Van de Hulst Light-Scattering Award from Elsevier (2022). Prof. Yang was named the 2022 Distinguished Texas Scientist by the Texas Academy of Science. Within TAMU, he received a university-level faculty research award bestowed by the Association of Former Students in 2017 and several college-level awards.
The IEEE Fellow recognition part of the awards ceremony concluded with a group photo of the four IEEE Fellows nominated by the GRSS together with GRSS President Dr. Mariko Burgin and IEEE President-Elect Dr. Tom Coughlin (Figure 10).

GRSS MAJOR AWARDS AT THE AWARDS CEREMONY
The call for nominations for the GRSS Education Award, the GRSS Outstanding Service Award, the GRSS Industry Leader Award, and the GRSS Fawwaz Ulaby Distinguished Achievement Award was posted in 2022 on the GRSS website and announced in the eNewsletter of the GRSS. The nomination forms are available at http://www.grss-ieee.org/about/awards/. Any member, with the exception of GRSS AdCom members, can make nominations to recognize deserving individuals. Typically, the lists of nominated candidates comprise three to five names each year. An independent Major Awards Evaluation Committee makes the selection, which is approved by the GRSS president and AdCom. The following major awards were presented:
◗ Outstanding Service Award
◗ Education Award
◗ Industry Leader Award
◗ Fawwaz Ulaby Distinguished Achievement Award.

IEEE GRSS OUTSTANDING SERVICE AWARD
The Outstanding Service Award was established to recognize an individual who has given outstanding service for the benefit and advancement of the GRSS. The award is considered annually but will not be presented unless a suitable candidate is identified. The following factors are suggested for consideration: leadership innovation, activity, service, duration, breadth of participation, and cooperation. GRSS membership is required. The awardee receives a certificate and a plaque.
The 2023 GRSS Outstanding Service Award was presented to Dr. Simon Yueh with the following citation (see Figure 11): "In recognition of his outstanding service for the benefit and advancement of the Geoscience and Remote Sensing Society." Simon Yueh received his Ph.D. degree in electrical engineering in January 1991 from the Massachusetts Institute of Technology.



FIGURE 10. IEEE Fellow Award recipients Prof. Gabriele Moser (second from left), Prof. Jonathan Li, Prof. James Garrison, and Prof. Ping Yang (second from right) with GRSS President Dr. Mariko Burgin (right) and IEEE President-Elect Dr. Tom Coughlin (left).


In September 1991, he joined the Radar Science and Engineering Section at the Jet Propulsion Laboratory (JPL). He was the supervisor of the Radar System Engineering and Algorithm Development Group during 2002–2007, the deputy manager of the Climate, Oceans and Solid Earth Section from July 2007 to March 2009, and the section manager from April 2009 to January 2013. He served as the project scientist of the NASA Aquarius mission from January 2012 to September 2013 and the deputy project scientist of the NASA Soil Moisture Active Passive mission from January 2013 to September 2013, and he has been the Soil Moisture Active Passive project scientist since October 2013. He has been the principal investigator or coinvestigator of numerous NASA and U.S. Department of Defense research projects on remote sensing of ocean salinity, ocean wind, terrestrial snow, and soil moisture. He has authored four book chapters and published more than 300 publications and presentations. He received the 2021 J-STARS Prize Paper Award; the 2014, 2010, and 2002 GRSS Transactions Prize Paper awards; the 2000 Best Paper Award at IGARSS 2000; and the 1995 GRSS Transactions Prize Paper award for a paper on polarimetric radiometry. He received the JPL Lew Allen Award for Excellence in 1998, the Ed Stone Award in 2003, the NASA Exceptional Technology Achievement Medal in 2014, and the NASA Outstanding Public Leadership Medal in 2017. He was an associate editor of Radio Science from 2002 to 2006 and editor-in-chief of TGRS from 2018 to 2022.

FIGURE 11. 2023 IEEE GRSS Outstanding Service Award recipient Dr. Simon Yueh.

IEEE GRSS EDUCATION AWARD
The Education Award was established to recognize an individual who has made significant educational contributions to the field of geoscience and remote sensing. In selecting the individual, the factors considered are the significance of the educational contribution in terms of innovation and the extent of its overall impact. The contribution can be at any level, including K-12, undergraduate, and graduate teaching, professional development, and public outreach. It can also be in any form (e.g., textbooks, curriculum development, and educational program initiatives). GRSS membership or affiliation is required. The awardee receives a certificate and a plaque.
The 2023 GRSS Education Award was presented to Prof. Shutao Li with the citation (see Figure 12) "In recognition of his significant educational contributions to geoscience and remote sensing." Shutao Li received his B.S., M.S., and Ph.D. degrees from Hunan University, Changsha, China, in 1995, 1997, and 2001, respectively. He has been a full professor with the College of Electrical and Information Engineering, Hunan University, since 2004 and is currently the vice rector of Hunan University. Prof. Li's current research interests include remote sensing image processing, pattern recognition, AI, and applications in environmental observation, resource investigation, and precise agriculture. He has authored or coauthored more than 300 refereed journal and international conference papers. He has received more than 28,000 citations in Google Scholar (h-index: 79) and was selected as a Clarivate Analytics Global Highly Cited Researcher in 2018–2022. For his scientific research contributions, he received two Second-Grade State Scientific and Technological Progress Awards of China (in 2004 and 2006), a Second Prize of the National Natural Science Award by the State Council of China (in 2019), and two First Prize Hunan Provincial Natural Science Awards (in 2017 and 2022). Prof. Li is the founder and head of the Hunan Provincial Key Laboratory of Visual Perception and Artificial Intelligence. He also founded the GRSS Changsha Chapter and cofounded the International Joint Research Center for Hyperspectral Imaging and Processing. He is now an associate editor of TGRS and IEEE Transactions on Instrumentation and Measurement and a member of the editorial board of Information Fusion.

FIGURE 12. 2023 IEEE GRSS Education Award recipient Prof. Shutao Li.




IEEE GRSS INDUSTRY LEADER AWARD
The GRSS established the Industry Leader Award to recognize an individual who has made significant contributions over a sustained period of time in an industrial and/or a commercial remote sensing discipline. The evaluation awards committee may give preference to an individual who 1) is a GRSS member, 2) has made significant contributions to remote sensing system engineering, science, and/or technology, 3) has made significant contributions to the dissemination and commercialization of remote sensing products, and 4) has demonstrated leadership to promote remote sensing science and technology. The criteria for selection are the significance, quality, and impact of activities, contributions, and achievements. The award is considered annually and presented if a distinguished candidate is identified.
The 2023 GRSS Industry Leader Award was presented to Robbie Schingler with the following citation (see Figure 13): "For co-founding Planet and for outstanding contributions for the commercialization and dissemination of optical remote sensing data." Robbie Schingler is a director, cofounder, and chief strategy officer (CSO) at Planet. As CSO, Mr. Schingler leads the company's long-term strategic trajectory and oversees Planet's Space Systems, Corporate Development, and Special Projects functions. He spearheaded Planet's acquisition of BlackBridge in 2015, Boundless in 2019, and VanderSat in 2021. Prior to Planet, Mr. Schingler spent nine years at NASA, where he helped build the Small Spacecraft Office at NASA Ames and served as chief of staff for the Office of the Chief Technologist at NASA headquarters. He received an MBA from Georgetown University, an M.S. degree in space studies from the International Space University, and a B.S. degree in engineering physics from Santa Clara University.


IEEE GRSS FAWWAZ ULABY DISTINGUISHED ACHIEVEMENT AWARD
The Fawwaz Ulaby Distinguished Achievement Award was established to recognize an individual who has made significant technical contributions, within the scope of the GRSS, usually over a sustained period. In selecting the individual, the factors considered are the quality, significance, and impact of the contributions; the quantity of the contributions; the duration of significant activity; papers published in archival journals; papers presented at conferences and symposia; patents granted; and advancement of the profession. IEEE membership is preferable, but not required. The award is considered annually and presented only if a suitable candidate is identified. The awardee receives a plaque and a certificate.
The 2023 IEEE GRSS Fawwaz Ulaby Distinguished Achievement Award was presented to Prof. Howard Zebker

FIGURE 13. 2023 GRSS Industry Leader Award recipient Robbie Schingler.

FIGURE 14. IEEE GRSS Fawwaz Ulaby Distinguished Achievement Award recipient Prof. Howard Zebker.


FIGURE 15. Group photo at the end of the major awards ceremony (from left to right): IEEE President-Elect Dr. Tom Coughlin, Prof. Gabriele Moser, Prof. James Garrison, Prof. Ping Yang, Prof. Howard Zebker, Dr. Simon Yueh, GRSS President Dr. Mariko Burgin, and Major Awards Chair Prof. Alberto Moreira.


KEYNOTE SPEECHES AT THE PLENARY SESSION
After the awards ceremony, the plenary session started with presentations by three distinguished plenary speakers:
◗ "Dare Mighty Things Together," by Dr. Laurie Leshin, director of NASA's JPL
◗ "Talking Pigs—Math Stories From the Visual Effects Industry," by Joe Mancewicz, senior software engineer in Nvidia's Omniverse Group
◗ a keynote speech by Elia Saikaly, award-winning adventure filmmaker based in Ottawa, Canada.
The first keynote speaker, Dr. Laurie Leshin, director of NASA's JPL, gave insights into the work of JPL, the current status of activities in Earth science, and future challenges (see Figure 16). As the conference's participants also had the possibility during the week to join a tour of JPL, this presentation was especially interesting and exciting.


with the citation (see Figure 14) "For sustained outstanding contributions and leadership in the field of radar interferometry." Howard Zebker was born in Ventura, CA, USA; he received his B.S. degree from Caltech in 1976, his M.S. degree from UCLA in 1979, and his Ph.D. degree from Stanford in 1984. Dr. Howard Zebker is currently a professor of geophysics and electrical engineering at Stanford University, where his research group specializes in interferometric radar remote sensing. Originally a microwave engineer, he built support equipment for the Seasat satellite synthetic aperture radar (SAR) and designed airborne radar systems. He later developed imaging radar polarimetry, a technique for measurement of the radar scattering matrix of a surface. He is best known for the development of radar interferometry, leading to spaceborne and airborne sensors capable of measuring topography to meter-scale accuracy and surface deformation to millimeter scale. More recently, he has participated in the NASA Cassini mission to Saturn, and he is currently concentrating on the upcoming NASA/Indian Space Research Organisation (ISRO) mission.
The major awards ceremony concluded with a group photo of all awardees together with GRSS President Dr. Mariko Burgin, IEEE President-Elect Dr. Tom Coughlin, and GRSS Major Awards Chair Prof. Alberto Moreira (see Figure 15). The deadline for nominations for the 2024 major awards and special awards of the GRSS is 15 December 2023. A detailed description of the awards is available at https://www.grss-ieee.org/resources/awards/.

FIGURE 16. Dr. Laurie Leshin, Director of NASA’s JPL, provided a speech on “Dare Mighty Things Together” in the IGARSS plenary session.


Dr. Leshin related JPL's history—how everything started as early as back in the 1930s, even before NASA was founded. JPL has 6,500 employees and is a unique organization; nationwide and also international teamwork and collaboration are important parts of JPL's life. JPL has four big areas in its portfolio: Earth science, planetary science, and astrophysics, and it operates for NASA the Deep Space Network, with satellite antennas at three different locations around the globe that are constantly in contact with JPL's deep space missions. Dr. Leshin then presented these topics in more detail.

The work of JPL in Earth science focuses on the areas of biodiversity, greenhouse gases, water availability, air quality, sea level, and natural hazards. JPL is interested not only in the big missions but also in the applications that can improve people's lives. Dr. Leshin showed a couple of examples, like the mission EMIT, a multispectral science mission originally designed to look at the composition of mineral dust that also has the ability to detect methane plumes and see very clearly from space where methane emission is happening. As methane is a very powerful greenhouse gas with a much shorter residence time in the atmosphere than CO2, providing this information offers an excellent possibility to get ahead of global warming in an effective way. The mission Surface Water and Ocean Topography (SWOT) was launched in 2022 in collaboration between the French space agency CNES and NASA. The SWOT mission uses radar interferometry, achieving much higher resolution ocean topography and higher spatial resolution and getting closer to the shore than usual; SWOT is going to represent a spectacular revolution in our understanding of the oceans. The mission helps us to understand what is happening to Earth's surface water and will measure the height of millions of lakes and rivers around the world to within a few centimeters, which will provide revolutionary insights into the distribution of fresh water. JPL's next big science launch, planned for 2024 in southern India, is the NASA-ISRO SAR (NISAR) mission. The dual-frequency SAR satellite with a 12-m reflector is entering the final testing phase and will be able to look at all kinds of land surface changes, for example, changes in biomass, ice, earthquakes, or volcanoes, and as such will be a real game changer in understanding land surface change.

In the field of astrophysics, Dr. Leshin described the Nancy Grace Roman Space Telescope, which will be launched in 2027. The mission is managed by NASA Goddard, but JPL is building the coronagraph instrument, one of its key instruments. The mission aims to image planets in other solar systems. More than 5,000 exoplanets have already been detected, but we cannot see them because of the brightness of their stars. To be able to see one of those planets, one has to block out its star very effectively, and this is what the coronagraph instrument is designed to do.

In the field of planetary science, Dr. Leshin talked about the mission Europa Clipper, which is to be launched in October 2024. Europa is a moon of Jupiter covered by a water–ice shell. Beneath that icy shell, a global ocean is suspected, which could harbor life. Europa Clipper will go into orbit around Jupiter and make multiple very close flybys—some of them only tens of kilometers above the surface—with many instruments to study and understand the environment on Europa. Another important question to be answered is where future missions could most easily access the planet. Another mission, Psyche, is going to be launched in October 2023 and will travel to the asteroid of the same name. The Mars mission with the rover Perseverance and the helicopter Ingenuity was launched in 2021; they are exploring and sampling the inside of a crater that once held a lake. By studying the stored small rocks, they want to answer the question of whether life could have started in the Mars environment at the same time life was starting on Earth. Finally, Dr. Leshin showed an animation about the Mars Sample Return mission, in which all of the rock samples are planned to be transported to Earth. She concluded her speech by referring to the Voyager 1 and 2 missions from 1977 (the spacecraft are still sending signals every day), and she expressed her pleasure at being able to do this kind of work for the benefit of the science community.

The second speaker of the plenary session was Joe Mancewicz, senior software engineer in Nvidia's Omniverse Group (see Figure 17). After showing an astonishing video and explaining technical terms, Mr. Mancewicz talked about problems that arise when creating an animation of a moving figure. The character's behavior, and even its mood and the texture of its clothing, must look realistic. To make the scene imagined by the authors work on the screen, it takes a lot of sophisticated work and a bunch of good ideas. It is a long way from using nature photos to several models until you finally arrive at a well-done computer animation.

FIGURE 17. The plenary session speech on "Talking Pigs—Math Stories From the Visual Effects Industry" was given by Joe Mancewicz, senior software engineer in Nvidia's Omniverse Group.
She concluded her mission uses radar interferometry, achieving much higher speech by referring to Voyager missions 1 and 2 from 1977 resolution ocean topography and higher spatial resolution (the spacecraft are still sending sigand getting closer to the shore than nals every day), and she expressed usual. SWOT is going to represent a her pleasure to be able to do this spectacular revolution in our underkind of work for the benefit of the standing of the oceans. The mission science community. helps us to understand what is hapThe second speaker of t he pening to Earth’s surface water and plenary session was Joe Mancewill measure the height of millions of wicz, senior software engineer in lakes and rivers around the world to Nvidia’s Omniverse Group (see about a few centimeters, which will Figure 17). After showing an astonprovide revolutionary insights into ishing video and explaining techthe distribution of fresh water. nical terms, Mr. Mancewicz talked JPL’s next big science launch is about problems that arise when planned for 2024 in southern India creating an animation of a moving and is called the NASA ISRO SAR figure. The character’s behavior, (NISAR). The dual-frequency SAR and even its mood and the texture satellite with a 12-m reflector is enof its clothing, must look realistic. tering the final testing phase and will To make the scene imagined by be able to look at all kinds of land the authors work on the screen, it surface changes, for example, changtakes a lot of sophisticated work es in biomass, ice, earthquakes, or FIGURE 17. The plenary session speech on “Talkand a bunch of good ideas. It is a volcanoes, and as such will be a real ing Pigs—Math Stories From the Visual Effects long way from using nature photos game changer in understanding land Industry” was given by Joe Mancewicz, senior to several models until you finally surface change. software engineer in Nvidia’s Omniverse Group. 122

IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

SEPTEMBER 2023

SEPTEMBER 2023

IEEE GEOSCIENCE AND REMOTE SENSING MAGAZINE

in the classroom to world-class expeditions. Expeditions and campaigns have enabled students to raise money for a well in Kathmandu, classrooms in Kenya, an orphanage in India, and, most recently, a new school in the village of Banakhu, Nepal, after the 2015 earthquake, which tragically claimed more than 8,000 lives. FindingLife’s mission is to inspire positive change in youth by bringing the world right into their classrooms, and its motto is: Educate. Inspire. Empower. It was in the framework of this project that Elia Saikaly successfully climbed Mount Everest. In his speech he talked about the challenges and choices of such a man-trying enterprise, and he also emphasized that, without the help of the Sherpas, you would not stand a chance. He showed photos of himself and his tent at the base of Mount Everest, where he connected live on Skype to the schools three times a week. He let us experience the excitement of the climb. After this first successful expedition, Elia Saikaly planned the next adventure, which was climbing K2, the world’s second highest peak, in the winter season, which was unprecedented until then. On 16 January 2021, a team of 10 alpinists from Nepal, led by Nirmal Purja and Mingma Gyalje Sherpa, succeeded in making the winter ascent of K2. Other alpinists started too, including John Snorri Sigurjónsson from Iceland, Juan Pablo Mohr Prieto from Chile, and Ali Sadpara from Pakistan, but unfortunately, they went missing during the attempt. Elia Saikaly was part of the international team that was searching for the mountaineers. The search was coordinated together with the Icelandic, Pakistani, and Chilean governments and also involved the Icelandic Space Agency. The agency was mapping the mountain and providing satellite imagery, but it could not find the missing mountaineers. Six months later, in

©IGARSS 2023

arrive at a well-done computer animation. Even the representation of the lighting alone needs many iterations. Mr. Mancewicz demonstrated these challenges on a scene from the movie “Night at the Museum: Battle of the Smithsonian,” where Amy Adams and Ben Stiller are looking at a giant octopus and Ben Stiller throws water onto the animal. A further example was from the movie “Life of Pi”: after a disastrous shipwreck, Pi Patel, son of an Indian zoo director, is floating in a lifeboat in the middle of the ocean—together with a Bengal tiger. Joe Mancewicz talked about the layers of technical animation: the depiction of the boat scene required many carefully coordinated steps, from shooting in a swimming pool to simulating the tiger’s muscles, skin, fur, and movements to placing the scene in an ocean environment. Another issue that was explained was the refraction problem. Mr. Mancewicz demonstrated on a virtual model of an eye the challenge of computing the effects of breaking light, which is also relevant in the field of Earth observation, if the working and movements of satellites are modeled and simulated. At the end of his truly amazing speech, Mr. Mancewicz showed that animations can also support the planning and optimization of satellites. As CubeSat picosatellites have only a very limited area on their walls for solar cells assembling, the available area has to be effectively shared with other parts. With an animation you can choose the most suitable places for solar panels where the sun’s rays will be the strongest; other parts can be placed in other locations. The third keynote speaker of the plenary session, Elia Saikaly (https://eliasaikaly.com/), an award-winning adventure filmmaker based in Ottawa, Canada, started his presentation with a short video with breathtaking images of the top of the world (see Figure 18). Elia Saikaly has participated in more than 25 world-class expeditions, including 10 to Mount Everest—always with a camera in hand. Elia Saikaly illustrated through his own story the power of storytelling, which is also important for researchers when they try to explain their findings. From a state of “had been written off,” Elia Saikaly was able to become a successful documentary filmmaker. In his early 20s he discovered for himself the video camera and fell in love with storytelling. Dr. Sean Egan was one of the key people who had a great influence on him, and Saikaly fulfilled Dr. Egan’s dream of climbing Mount Everest, which he unfortunately could not achieve because he died of heart failure. Dr. Egan’s mission was not about standing on top of the world but about using the platform of Mount Everest to spread the message of hope. He wanted people to get fit, get active, and live healthier, happier, more meaningful lives. After Dr. Egan’s death, Elia Saikaly wanted to ensure that his message could be spread and his legacy would live on. In 2007 he started a project to create a real-time experience for pupils from a remote environment and to connect it back to their curriculum. FindingLife (https://findinglife.ca/) creates immersive educational experiences by connecting students

FIGURE 18. Plenary session speech by Elia Saikaly, award-winning adventure filmmaker based in Ottawa, Canada.




FIGURE 19. AdCom participants at the GRSS AdCom meeting prior to IGARSS 2023: (a) AdCom at large. (b) Executive AdCom members.


the summer season, Elia Saikaly put the original team back together, because the misfortune of their friends had been a shock for all of them, and this time they were able to find the bodies of the three rope companions from Pakistan, Chile, and Iceland. Elia Saikaly was on Mount Everest again this year. He shared with us reflective thoughts about the reasons why people are dying on Mount Everest. The media oversimplify the reasons and blame crowding: this season alone, 478 foreign climbers received government permits to climb Everest, more tourists than usual. Elia Saikaly interviewed mountaineers and families who had lost someone in the attempt to climb Mount Everest.

FIGURE 20. Core members of the IGARSS 2023 organizing and supporting team (from left to right): Mehmet Ogut, Nathan Longbotham, Sidharth Misra, Fairouz Stambouli, David Kunkee, Shannon Brown, Joan Francesc Munoz-Martin, Rashmi Shah, Javier Bosch-Lluis, Sharmilla Padmanabhan, Mariko Burgin, Maryam Salim, Nereida Rodriguez-Alvarez, and Tianlin Wang.

The other side of the problem is not only the foreigners' ambitions but also the ambitions of the locals, for whom the business around Mount Everest is a way out of poverty. Local people are now fighting for insurance and retirement plans. Another issue with Mount Everest is the environmental disaster caused by the climbers. They leave their trash behind, very often as a result of poor planning, because no manpower is budgeted to carry everything back down. A further issue is the overcommercialization of Mount Everest, with clients who are interested only in speed records and multiple peaks. As clients are shuttled across the valley by helicopter, more inexperience and negligence can be observed. Change has to come from within the country, but as storytellers, bystanders and supporters can also help get the word out so that positive solutions are finally found. At the end of his speech, Elia Saikaly addressed the effects of climate change. The ice and snow layer is getting thinner on Mount Everest, too. He compared a photograph of a glacier in Tibet taken by George Mallory in 1921 with another taken from the same position in 2009. According to calculations, the glacier has lost around 300 vertical feet of ice. Finally, Elia Saikaly reminded us of the responsibility of the storyteller. Everyone can communicate their message and share their work with the world, and the possibilities are unlimited.

FUTURE IGARSS CONFERENCES
The GRSS AdCom met on 14–15 July 2023, just before IGARSS. In this meeting, all of the Society's operational and technical issues were discussed and the main decisions were taken. The 2023 members of the GRSS AdCom are shown in Figure 19. The road map for future IGARSS conferences was confirmed, and a decision was made for IGARSS 2027:
◗◗ IGARSS 2024, Athens, Greece, 7–12 July 2024
◗◗ IGARSS 2025, Brisbane, Australia, 3–8 August 2025


◗◗ IGARSS 2026, Washington, D.C., USA, 19–24 July 2026
◗◗ IGARSS 2027, Iceland, 5–9 July 2027.

You are cordially invited to participate in future IGARSS conferences, and we look forward to meeting all of you at IGARSS 2024 in Athens, Greece, on 7–12 July 2024. IGARSS 2023 in Pasadena was a great success and surpassed all expectations. The networking achieved during the IGARSS week was highly appreciated by the participants. Such an outstanding event cannot happen without the hard work of a large team of volunteers. Figure 20 shows some of the key organizing team members. Not shown in the picture are Chris Ruf, Ronny Hänsch, Mustafa Ustuner, Eric Loria, Alex Akins, Omkar Pradhan, Mary Morris, Kazem Bakian Dogaheh, and Alireza Tabatabaeenejad, as well as the following committee members at large: Fabio Pacifici, Paul Rosen, Saibun Tjuatja, Karen St. Germain, Steve Reising, and Upendra Singh. In total, more than 30 members worked on the core organizing team of IGARSS.

ACKNOWLEDGMENT
The authors would like to thank Klara Antesberger for her great support in the compilation of this article.


TECHNICAL COMMITTEES
VICTORIA VANTHOF, HEATHER MCNAIRN, STEPHANIE TUMAMPOS, AND MARIKO BURGIN

Reinforcing Our Commitment: Why DEI Matters for the IEEE GRSS

Shared beliefs and narratives have played a pivotal role in fostering cooperation and progress in human societies. The sense of belonging, common identity, and connection among individuals has driven groundbreaking technological and scientific advancements, among them the International Space Station (ISS). Developed by five international space agencies, the ISS has hosted a wide range of scientific research and experimentation and astronauts from more than 19 countries. These diverse contributors share a common mission of space exploration, united in their pursuit of bettering life on Earth. Akin to the ISS, the IEEE Geoscience and Remote Sensing Society (GRSS) acknowledges the importance of cultivating diverse and inclusive communities to address complex global challenges that transcend geographical boundaries. Within the fields of geoscience and remote sensing, a diverse group of scientists and practitioners is essential to understanding and evaluating our interactions with the environment. This diversity paves the way for comprehensive solutions and advancements that benefit the entire global community. However, the geoscience community at large faces challenges in terms of diversity and inclusion, with groups such as women and people of color remaining underrepresented. This lack of diversity limits the range of perspectives and experiences within the field. When diversity is lacking, certain individuals may struggle to find their place and may not feel fully integrated into the community. The GRSS is actively working to create a more equitable and welcoming environment in the geosciences and among its members. Two recent advancements are the introduction of a committee focused on bringing institutional change to the Society through a diversity, equity, and inclusion (DEI) lens and a facilitated workshop on unconscious bias delivered to the GRSS administrative governance.

Digital Object Identifier 10.1109/MGRS.2023.3303874 Date of current version: 19 September 2023


DEFINING DIVERSITY, EQUITY, INCLUSION, ACCESSIBILITY, AND BELONGING
DEI terms are often combined and used interchangeably, but these concepts are distinct and uniquely important. The concepts of belonging (B) and accessibility (A) are increasingly being associated with DEI efforts. Diversity (D) is about the dimensions, qualities, and characteristics that define us [1]. Our identity is typically intersectional, given that many traits define who we are (race, ethnicity, religion, gender identity, sexual orientation, marital status, disability, age, parental status, etc.). A commitment to equity (E) ensures that access, resources, and opportunities are provided for all to succeed [2]. The terms equity and equality are not interchangeable (Figure 1). Treating everyone the same (equality) does not recognize an individual's needs, in particular for those from underrepresented and historically disadvantaged populations. Inclusion (I) provides all with an opportunity to sit at the table and to cocreate. A culture of inclusion embraces, respects, and values diversity [1]. Belonging (B) is the outcome of inclusion and signifies a sense of being accepted and welcomed [3]. Inclusion without belonging makes one feel like they must fit in with the dominant culture [3]. Accessibility (A) is the intentional design and construction of facilities; information and communication technologies; programs; services; and policies so that all can fully and independently benefit from them [4]. Diversity and, to a lesser extent, inclusion, equity, and accessibility can be quantified and/or more easily expressed. Belonging is difficult to measure, but one senses almost immediately its presence or absence. For example, a meeting accessible to attendees from a diverse community, where all are invited to the table, is not necessarily a meeting where all feel welcomed, valued, and heard.


Organizations committed to advancing diversity, equity, inclusion, accessibility, and belonging (DEIAB) stand to reap significant benefits. Those who view DEIAB efforts as onerous underestimate the value of these benefits. A diversity of views, perspectives, talents, and life experiences improves decision making and client services [3], and in fact, research has demonstrated the following [5]:
◗◗ Companies with more racial or gender diversity have more sales revenue, more customers, and greater profits (506 companies evaluated).
◗◗ Companies with more female executives were more profitable (20,000 companies in 91 countries evaluated).
◗◗ Management teams with a wider range of educational and work backgrounds produced more innovative products.
In contrast, a toxic culture is 10× more important than compensation in predicting employee turnover [6]. In a 2021 report by McKinsey & Company on recruitment and retention [7], 40% of employees (surveyed in five countries) indicated that they were likely to leave their jobs in the next three to six months. The top three reasons cited for leaving were that employees
◗◗ did not feel valued by their organizations (54%)
◗◗ did not feel valued by their managers (52%)
◗◗ did not feel a sense of belonging at work (51%).

ADVANCING DEIAB IN THE GRSS
For a number of years, the GRSS has delivered important initiatives to its members—and the broader scientific community—to advance DEI goals. These are led by our vibrant Young Professional (YP); Educational; Chapter; and Inspire, Develop, Empower, and Advance (IDEA) committees. IDEA runs a Women Mentoring Women (WMW) program, annually awards professional microgrants to support young minorities and women pursuing advanced degrees, and delivers a Women in Africa webinar series. In 2023, the GRSS president, Dr. Mariko Burgin, formed a DEIAB Ad Hoc Committee. This committee complements ongoing GRSS activities (Figure 2) but has a mandate to approach DEIAB efforts more holistically and to identify gaps and opportunities to inform all GRSS programs involved in advancing the Society's DEIAB goals. This Ad Hoc Committee is strategic, working to create institutional changes embedded in Society governance, policies, and procedures. The committee is GRSS's “agent of change” for DEIAB and is striving to make advancements that will outlive any particular cohort of leadership and volunteers. The GRSS is committed to the goals of diversity, equity, and accessibility and strives to create spaces that are inclusive and where all feel that they belong. The newly formed DEIAB Ad Hoc Committee has penned a draft strategic plan, which lays out five short-term goals:
◗◗ Obtain the support and participation of the GRSS leadership, portfolio leads, committee chairs, and volunteers in advancing DEIAB across the Society.


◗◗ Bring broader awareness of DEIAB by providing training and resources to the GRSS leadership, portfolio leads, committee chairs, volunteers, and the broader GRSS membership.
◗◗ Gather diversity data on the GRSS membership and across GRSS committees, publications, initiatives, and events to track trends, determine gaps, and identify opportunities.
◗◗ Advance diversity and equity across GRSS committees, publications, initiatives, and events and encourage policies and practices that promote a sense of inclusion and belonging.
◗◗ Communicate the initiatives, innovations, and accomplishments of the GRSS in DEIAB excellence.
The committee has initiated a review of GRSS diversity data-gathering activities and a review of the Society's awards processes and procedures. A Diversity and

FIGURE 1. Equality versus equity. (Source: Interaction Institute for Social Change. Artist: Angus Maguire: https://interactioninstitute.org/ and http://madewithangus.com/.)

[Figure 2 diagram: the Young Professional, Educational Activities, Chapter Activities, and IDEA committees are programmatic (“doing”), while the DEIAB committee is strategic (an “agent of change”).]

FIGURE 2. GRSS committees promote DEIAB across the Society and the broader geoscience and remote sensing community.


Inclusion Chair was appointed for the GRSS 2023 flagship conference, the International Geoscience and Remote Sensing Symposium (IGARSS), for the purpose of viewing the conference through a DEIAB lens. IGARSS 2023 provided gender-neutral bathrooms and implemented both a family room and a lactation room. These conference facilities are important steps forward in providing a venue and an experience that are more inclusive and accessible and where our broader geoscience and remote sensing community feels at home.

UNCONSCIOUS BIAS WORKSHOP WITHIN THE GRSS LEADERSHIP
As mentioned previously, companies and organizations that invest in DEI training and workshops can reap benefits in terms of productivity and innovation, to name a few. Meanwhile, fostering inclusion and belonging broadens the organization's membership and enables the exchange of ideas and partnerships across the wider scientific community. However, an important topic that needs a deeper dive to advance DEI within groups is understanding that our differences also come with biases. As a further step toward DEI, the GRSS leadership initiated an Unconscious Bias Workshop that involved the Society's Administrative Committee (AdCom). This workshop was delivered by Felicia Jadczak of She+ Geeks Out (SGO) on 15 July, the last day of the GRSS AdCom meeting in Pasadena, CA, USA. The workshop was attended by 45 AdCom members. During the workshop, the foundations of DEI were introduced, followed by a tool kit that detailed the types of unconscious bias, how to spot these biases, and steps for interrupting bias. Some of the AdCom members who participated in the workshop already had an understanding of DEI and its definition, while others were earlier in their journey. Meanwhile, the topic of unconscious bias was fairly new to almost everyone who joined the workshop. The main message presented in this workshop was that most of the biases we have are involuntary and, as such, are referred to as unconscious biases. It is essential to note that unconscious bias is a natural human behavior—whether we like it or not. Unconscious bias is captured by the well-known idiom, “Birds of a feather flock together.” This means that we choose, intentionally or unintentionally, to move toward those with whom we have more in common. Because of this, we learn stereotypes that automatically drive and affect our understanding of, and interactions with, other people. It is important to acknowledge that we can never eliminate bias. If we can recognize and familiarize ourselves with these unintentional biases, we can initiate strategies to interrupt them. The 2.5-h Unconscious Bias Workshop ignited an engaging discussion among the AdCom members—from the confirmation bias, where one tends to cherry-pick information and interpret it to confirm one's beliefs; to the gender bias, where one prefers one gender over another; to the affinity

bias, where one makes decisions as a result of personal connections. According to feedback collected after the workshop, some AdCom members felt more comfortable speaking about DEI and were eager to have additional interactive sessions to spot and mitigate unconscious bias. In addition, some AdCom members suggested that more time could have been dedicated to the workshop and that future workshops should include interactive sessions where situations are enacted and interventions are explored. The GRSS leadership will consider feedback on this first introductory workshop as it continues to develop awareness around DEI. As SGO pointed out, the real work commences once the workshop ends. It is important that Society leaders execute good intervention strategies and encourage members to do the same. In addition, familiarizing oneself with bias and recognizing that we all have biases encourages us to remain open and impartial and to resist falling into assumptions. Of course, the takeaway message from this workshop is to practice and train ourselves to consciously identify biases. Again, this workshop provided an introduction to unconscious bias for GRSS AdCom members. This foundation can be built on by members carrying this knowledge forward as the GRSS continues to develop strategies to advance the goals of DEIAB.

CONCLUSION: THE GRSS DEI CONTINUES THE JOURNEY
Just as the ISS fosters a sense of belonging and shared purpose among its diverse crew, the GRSS recognizes the vital role of DEIAB in cultivating an environment where everyone feels welcomed. This sense of belonging will spur the innovation needed to better understand our interactions with Earth's ecosystems. The journey toward a more diverse and inclusive geoscience community within the GRSS does not end here; it is an ongoing commitment that requires collective effort. Initiatives such as the DEIAB Ad Hoc Committee and the Unconscious Bias Workshop demonstrate ongoing efforts to create an inclusive and welcoming environment for all members. By actively addressing and discussing unconscious biases in the workshop, the GRSS is emphasizing the need to open the dialog and continue to engage on the topic. Moreover, the DEIAB Ad Hoc Committee will work to create institutional changes embedded in Society governance, policies, and procedures. If, after reading this, you have any input on how the GRSS can serve you better as a member and advance inclusiveness and belonging in the Society, please e-mail [email protected].

REFERENCES
[1] “Diversity defined,” Canadian Centre for Diversity and Inclusion, Toronto, ON, Canada, 2023. [Online]. Available: https://ccdi.ca/our-story/diversity-defined/
[2] “Diversity, equity, and inclusion definitions,” Univ. of Washington, Seattle, WA, USA, 2023. [Online]. Available: https://www.washington.edu/research/or/office-of-research-diversity-equity-and-inclusion/dei-definitions/
[3] “The Advocates' Journal,” vol. 39, no. 4, pp. 1–40, Spring 2021. [Online]. Available: https://www.yumpu.com/en/document/read/65310153/the-advocates-journal-spring-2021
[4] “Diversity, equity, inclusion and accessibility: A foundation for meaningful change,” U.S. Department of Labor, Washington, DC, USA, 2022. [Online]. Available: https://blog.dol.gov/2022/02/22/diversity-equity-inclusion-and-accessibility-a-foundation-for-meaningful-change
[5] “Diverse teams feel less comfortable — And that's why they perform better,” Harvard Business Review, Boston, MA, USA, 2016. [Online]. Available: https://hbr.org/2016/09/diverse-teams-feel-less-comfortable-and-thats-why-they-perform-better
[6] “Toxic culture is driving the great resignation,” MIT Sloan Management Review, Cambridge, MA, USA, Jan. 2022. [Online]. Available: https://sloanreview.mit.edu/article/toxic-culture-is-driving-the-great-resignation/
[7] “‘Great attrition’ or ‘great attraction’? The choice is yours,” McKinsey. Accessed: Sep. 8, 2021. [Online]. Available: https://www.mckinsey.com/capabilities/people-and-organizational-performance/our-insights/great-attrition-or-great-attraction-the-choice-is-yours

MANIL MASKEY, GABRIELE CAVALLARO, DORA BLANCO HERAS, PAOLO FRACCARO, BLAIR EDWARDS, IKSHA GURUNG, BRIAN FREITAG, MUTHUKUMARAN RAMASUBRAMANIAN, JOHANNES JAKUBIK, LINSONG CHU, RAGHU GANTI, RAHUL RAMACHANDRAN, KOMMY WELDEMARIAM, SUJIT ROY, CARLOS COSTA, ALEX CORVIN, AND ANISH ASTHANA

A Summer School Session on Mastering Geospatial Artificial Intelligence: From Data Production to Artificial Intelligence Foundation Model Development and Downstream Applications

In collaboration with IBM Research, the NASA Interagency Implementation and Advanced Concepts Team (IMPACT) organized a specialized one-day summer school session focused on exploring the topic of data science at scale (see Figure 1). This session was a part of the “High Performance and Disruptive Computing in Remote Sensing” summer school hosted by the University of Iceland from 29 May to 1 June 2023 in Reykjavik, Iceland [1]. This marked the third edition of the school organized by the High Performance and Disruptive Computing in Remote Sensing (HDCRS) Working Group of the IEEE Geoscience and Remote Sensing Society's (GRSS's) Earth Science Informatics (ESI) Technical Committee (TC). The school aimed to provide participants with a comprehensive understanding of contemporary advancements in high-performance computing (HPC), machine learning (ML), and quantum computing methodologies applied to remote sensing (RS). Additionally, it created a networking platform for students and young professionals to interact with leading researchers and professors in RS, thereby promoting collaboration in HDCRS interdisciplinary research.

Digital Object Identifier 10.1109/MGRS.2023.3302813 Date of current version: 19 September 2023



The session, organized by IBM Research and NASA IMPACT, was a crucial component of the four-day summer school program [1], according to the HDCRS chairs. This article offers a succinct overview of the session.

“As the organizers of the summer school program, we are delighted with our successful collaboration with IBM Research and NASA. In the contemporary era, the unprecedented vast volume of data essential for addressing RS challenges necessitates the application of data science expertise. The specialized data science session offered participants a deep understanding of data science at scale, spanning the data lifecycle, underlying systems, and research. This partnership enriched the HDCRS program significantly, leaving participants inspired and informed.” —HDCRS chairs

Data science has evolved into a diverse field that extends beyond the use of ML and artificial intelligence (AI) algorithms. In practice, it constitutes a unique scientific discipline, marked by complex processes and nuanced intricacies. Professionals working in data-intensive

domains like Earth science and RS need to grasp these layers of data science to excel in their respective fields. To encourage broader participation in scientific research, NASA has launched an open science initiative [2]. This initiative aims to facilitate access to open data, computing platforms, and training opportunities for the next generation of scientific professionals, including those without advanced HPC programming skills. In alignment with the open science visions of both NASA and IBM, the summer school seeks to promote the broader adoption of data science applications. With a focus on geospatial artificial intelligence, it incorporates open science principles. A successful data science team requires a diverse range of expertise that is unlikely to be possessed by a single individual or by a group of individuals focused on a single aspect of data science. Additionally, the expertise and resources needed for such a team are typically not concentrated within a single institution. Therefore, to conduct this session, we fostered collaboration across multiple sectors, including government, industry, and academia. By leveraging the collective knowledge and resources of these different entities, we were able to create a comprehensive and impactful learning experience for the participants.

OBJECTIVES OF THE SESSION
The objectives of the session were
◗◗ to offer participants an in-depth understanding of the complex data lifecycle and the underlying data systems
◗◗ to explore the research lifecycle and its practical implementation
◗◗ to provide participants with comprehensive knowledge about the rapid advancements in AI, with a particular emphasis on foundation models
◗◗ to demystify data science at scale and equip participants with the necessary skills and knowledge to tackle real-world data challenges.

DATA SCIENCE LANDSCAPE
Data science encompasses a wide-ranging concept that entails principles and practices spanning data collection, storage, integration, analysis, inference, communication, and ethics [3]. These fundamental principles are crucial for navigating the data-driven landscape of the big data era, requiring a diverse range of skills and expertise. Data engineers play a crucial role in building and maintaining the infrastructure necessary for handling large volumes of data. They design and develop data pipelines, ensuring data quality, reliability, and scalability. Their expertise lies in managing data storage systems and integrating diverse data sources. Data analysts focus on examining datasets, applying statistical techniques, and identifying patterns and trends. ML engineers specialize in developing and implementing ML models and algorithms. They train models, fine-tune hyperparameters, and deploy solutions that can analyze data, make accurate predictions, and facilitate decision-making tasks. Data visualization experts have expertise in presenting data in visually appealing and intuitive ways. They use various visualization techniques and tools to communicate complex information effectively, transforming data into insightful visual representations for better understanding and interpretation. Domain experts bring their in-depth knowledge of specific domains to the data science process. They ask relevant questions, identify meaningful variables, and provide contextual understanding for data analysis. Their expertise enhances the accuracy and relevance of data-driven solutions. By combining these diverse skills and perspectives, data science teams can address complex problems and unlock valuable insights from data.

[Figure 1 workflow: a data system ingests Sentinel and Landsat data and harmonizes them into the Harmonized Landsat and Sentinel product, which feeds unsupervised learning of foundation models; fine-tuning then supports applications such as analysis (NDVI and so on), classification, segmentation (burn scars, flood), and other use cases.]

FIGURE 1. The practical data science workflow used during the summer school session (green boxes indicate hands-on experience). NDVI: Normalized Difference Vegetation Index.




The composition of our organizing team exemplifies this collaborative approach.

DATA SCIENCE WITHIN THE GEOSPATIAL DOMAIN
During the session, a key focus was placed on highlighting the significance of data science within the geospatial domain. Given NASA's extensive collection of Earth-observation data, it plays a pivotal role in comprehending the Earth as a cohesive system. The sheer volume of data generated by Earth science missions calls for innovative approaches to data analysis, as conventional manual methods prove insufficient. To tackle these challenges, the exploration of technologies like commercial clouds and AI has become essential. Notably, the session took advantage of these very technologies as integral components of the teaching curriculum. By incorporating commercial clouds and AI, participants were exposed to cutting-edge tools and techniques for effectively analyzing and extracting insights from vast and complex Earth-observation datasets. This hands-on experience equipped them with the necessary skills to navigate the evolving landscape of data science within the geospatial domain.

STRUCTURE OF THE SESSION
The session was divided into four chapters, each focusing on a specific aspect of data science and its practical applications.

CHAPTER 1: DATA PRODUCTION/PROCESSING
In this chapter, participants delved into large-scale data harmonization and explored the Harmonized Landsat Sentinel-2 (HLS) [4] data as a live case study. The session covered the use of cloud computing in large-scale data production and processing.

CHAPTER 2: DATA ANALYSIS
Chapter 2 introduced participants to tools such as the NASA Fire Information for Resource Management System, which utilizes HLS datasets and dynamic tiling services [5]. The participants collaboratively developed Jupyter notebooks for analysis and visualization of the HLS dataset, enabling them to engage in interactive data analysis and explore real-world applications with societal implications (a minimal sketch of such an analysis appears below).

CHAPTER 3: THEORY AND APPLICATION OF THE GEOSPATIAL FOUNDATION MODEL
Chapter 3 of the session centered on exploring the theory and practical application of the geospatial foundation model (GeoFM), which was developed through a collaboration between IBM Research and NASA IMPACT [6]. The theory segment provided participants with a comprehensive understanding of foundation models [7], highlighting their emergence as a transformative solution for data-driven science. These models have proven instrumental in overcoming challenges such as the requirement for large-scale labeled datasets and the need for generalization across multiple tasks.
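To give a concrete flavor of the Chapter 2 notebook exercises, the sketch below computes the Normalized Difference Vegetation Index (NDVI) from two bands of an HLS granule. This is a minimal illustration rather than the session's actual notebook: the file names are hypothetical, and the band assignment (B04 red, B05 near infrared) assumes an HLS L30 granule.

```python
# Minimal NDVI sketch for an HLS L30 granule (hypothetical file names).
# NDVI = (NIR - Red) / (NIR + Red); HLS v2.0 reflectance is scaled by 1e-4.
import numpy as np
import rasterio

RED_PATH = "HLS.L30.T10SEG.2017183.v2.0.B04.tif"  # red band (assumed layout)
NIR_PATH = "HLS.L30.T10SEG.2017183.v2.0.B05.tif"  # near-infrared band

with rasterio.open(RED_PATH) as src:
    red = src.read(1).astype("float32") * 1e-4
with rasterio.open(NIR_PATH) as src:
    nir = src.read(1).astype("float32") * 1e-4

# Mask invalid pixels before dividing to avoid NaN/inf propagation.
valid = (red + nir) > 0
ndvi = np.full(red.shape, np.nan, dtype="float32")
ndvi[valid] = (nir[valid] - red[valid]) / (nir[valid] + red[valid])

print(f"NDVI range: {np.nanmin(ndvi):.2f} to {np.nanmax(ndvi):.2f}")
```

In a notebook, the resulting array would typically be rendered with the mapping and plotting tools mentioned above to support interactive exploration.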


The participants were given valuable insights into the pretraining process of the HLS-based GeoFM. Additionally, they engaged in hands-on exercises focused on fine-tuning the HLS foundation model for specific use cases, namely inland flood detection and burn scar delineation. By actively participating in these activities, participants gained practical experience in adapting and customizing the GeoFM to address real-world geospatial challenges related to flood detection and burn scar delineation (a hedged sketch of this fine-tuning pattern appears after the Infrastructure and Tools section). They leveraged the watsonx.ai platform running on NASA's Science Managed Cloud Environment (SMCE) [8] for fine-tuning these models.

CHAPTER 4: INTERACTIVE EXPLORATION OF FINE-TUNED MODELS
Chapter 4 of the session provided an interactive experience for participants, allowing them to delve into the fine-tuned GeoFM models. This hands-on section offered participants the opportunity to actively explore the capabilities of the fine-tuned models and apply their outputs to new data. The results obtained were displayed interactively, leveraging map functionality within the Jupyter environment. This final chapter marked a significant milestone in the participants' data science journey, as it allowed them to complete a full lifecycle of the data science process. By performing inference with the GeoFM models on new data, participants gained valuable experience in deriving meaningful insights and developing real-world applications.

INFRASTRUCTURE AND TOOLS
The GeoFM was pretrained on IBM Research's computing infrastructure. It utilized HLS datasets for the continental United States in 2017, incorporating six different spectral bands. The family of pretrained GeoFMs used consisted of models with 100–300 million parameters. Throughout the session, participants utilized the NASA Visualization, Exploration, and Data Analysis (VEDA) analytics platform [9] for HLS data access and analysis. The SMCE provided the necessary computing resources. The SMCE, operating in NASA's Amazon Web Services environment, facilitated easy access and rapid onboarding for external partners and collaborators. VEDA JupyterHub, facilitated by 2i2c [10], was used as a gateway to the computing platform.
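The following sketch illustrates, in broad strokes, the Chapter 3 fine-tuning pattern: freeze a pretrained encoder and train a lightweight segmentation head for a binary task such as burn scar or flood mapping. The Encoder class below is only a stand-in for the actual pretrained GeoFM (whose weights are available at the Hugging Face link in the Acknowledgment); the tile shape (six HLS bands, 224 × 224 pixels) follows the session setup, while all other details are illustrative.

```python
# Hedged sketch of fine-tuning a frozen pretrained encoder with a small
# segmentation head, in the spirit of the Chapter 3 exercise. Not the
# session's actual code; the Encoder stands in for the real GeoFM.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Stand-in for the pretrained GeoFM encoder (frozen during fine-tuning)."""
    def __init__(self, in_bands=6, dim=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_bands, dim, kernel_size=4, stride=4),  # 224 -> 56
            nn.GELU(),
            nn.Conv2d(dim, dim, kernel_size=2, stride=2),       # 56 -> 28
        )
    def forward(self, x):
        return self.features(x)

class SegmentationHead(nn.Module):
    """Lightweight decoder trained from scratch on the downstream task."""
    def __init__(self, dim=256, n_classes=2):
        super().__init__()
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(dim, 64, kernel_size=4, stride=4),        # 28 -> 112
            nn.GELU(),
            nn.ConvTranspose2d(64, n_classes, kernel_size=2, stride=2),  # 112 -> 224
        )
    def forward(self, feats):
        return self.decode(feats)

encoder, head = Encoder(), SegmentationHead()
encoder.requires_grad_(False)  # keep the pretrained weights fixed
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# One training step on a dummy batch (replace with HLS chips + label masks).
x = torch.randn(4, 6, 224, 224)          # six-band HLS tiles
y = torch.randint(0, 2, (4, 224, 224))   # per-pixel labels (e.g., burn scar)
logits = head(encoder(x))
loss = loss_fn(logits, y)
loss.backward()
optimizer.step()
print(f"fine-tuning step loss: {loss.item():.4f}")

# Chapter 4-style inference: apply the fine-tuned head to new tiles.
with torch.no_grad():
    pred = head(encoder(x)).argmax(dim=1)  # per-pixel class map
```

Freezing the encoder and training only a small head is what makes foundation models attractive for downstream tasks with limited labels: the expensive pretraining is done once, and each use case adds only a lightweight, task-specific decoder.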

FIGURE 2. The participants and organizers of the HDCRS summer school at the University of Iceland.

OUTCOMES
Thirty diverse students from four different continents participated in the summer school. This global representation added to the richness and diversity of perspectives in the learning environment. The students hailed from various backgrounds and brought unique experiences to the table, fostering a collaborative and multicultural atmosphere. The international nature of the participants further emphasized the global relevance and impact of data science in today's interconnected world. The session adopted a practical approach, centering on real-world application scenarios. The participants actively engaged in hands-on exercises and practical demonstrations. The session aimed to foster collaboration and create a platform for exchanging ideas among participants. The success of the session highlighted the value of collaboration, hands-on experience, and the application of foundation models in geospatial analysis. Furthermore, the hands-on interactions enabled the organizers to document the data science tools, platforms, and compute infrastructure requirements needed to support large-scale collaborative projects analyzing Earth-observation data (see Figure 2).

CONCLUSION
The specialized session on data science organized by NASA, IBM Research, and the University of Iceland showcased the power of collaboration, hands-on experience, and the application of foundation models in geospatial analysis. As data science continues to shape the future, it is crucial to empower professionals with the necessary skills and knowledge to harness the potential of data-driven decision making. The session stands as an important milestone in this journey, reflecting a commitment to enhance the capacity for data science and push the boundaries of what is achievable. Given the success of the session, we plan to continue organizing sessions with a data science theme during future summer schools.

ACKNOWLEDGMENT
Our heartfelt gratitude goes to the ESI TC for its generous sponsorship of the workshop and the GRSS for its ongoing support. Additionally, we would like to express our special thanks to the University of Iceland for hosting the event and for the invaluable assistance provided by the dedicated volunteers. A special mention goes to Sean Harkins from Development Seed for his invaluable support in facilitating the TiTiler services and Christopher Phillips from the University of Alabama in Huntsville for putting together training datasets for fine-tuning the GeoFM. The session resources are available at https://github.com/NASA-IMPACT/summer-school-2023. The GeoFM, fine-tuned models, and data are openly available at https://huggingface.co/ibm-nasa-geospatial.

REFERENCES
[1] “HDCRS summer school 2023.” GRSS-IEEE. Accessed: Jul. 16, 2023. [Online]. Available: https://www.grss-ieee.org/community/groups-initiatives/high-performance-and-disruptive-computing-in-remote-sensing-hdcrs/hdcrs-summer-school-2023/
[2] “Open-source science initiative.” NASA Science. Accessed: Jul. 20, 2023. [Online]. Available: https://science.nasa.gov/open-science-overview
[3] National Academies of Sciences, Engineering, and Medicine, Data Science for Undergraduates: Opportunities and Options. Washington, DC, USA: The National Academies Press, 2018.
[4] “HLS overview,” United States Geological Survey, Reston, VA, USA. Accessed: Jul. 16, 2023. [Online]. Available: https://lpdaac.usgs.gov/data/get-started-data/collection-overview/missions/harmonized-landsat-sentinel-2-hls-overview/
[5] “eoAPI-raster.” Cloudfront.net. Accessed: Jul. 16, 2023. [Online]. Available: https://d1nzvsko7rbono.cloudfront.net/docs
[6] “IBM's new geospatial foundation model.” IBM. Accessed: Jun. 30, 2023. [Online]. Available: https://research.ibm.com/blog/geospatial-models-nasa-ai
[7] R. Bommasani et al., “On the opportunities and risks of foundation models,” 2021, arXiv:2108.07258.
[8] Science Managed Cloud Environment. Accessed: Jul. 17, 2023. [Online]. Available: https://smce.nasa.gov/
[9] M. Maskey et al., “Dashboard for earth observation,” in Advances in Scalable and Intelligent Geospatial Analytics, 1st ed. Boca Raton, FL, USA: CRC Press, 2023.
[10] “Interactive computing for your community.” 2i2c. Accessed: Jul. 17, 2023. [Online]. Available: https://2i2c.org/


CALL FOR PAPERS
IEEE Geoscience and Remote Sensing Magazine

Special issue on “AI meets Remote Sensing Image Understanding”

Guest Editors
Prof. Gong Cheng, Northwestern Polytechnical University, China ([email protected])
Prof. Jun Li, China University of Geosciences, China ([email protected])
Prof. Xian Sun, Chinese Academy of Sciences, China ([email protected])
Assoc. Prof. Zhongling Huang, Northwestern Polytechnical University, China ([email protected])
Prof. Mihai Datcu, University POLITEHNICA of Bucharest (UPB), Romania ([email protected])
Prof. Antonio Plaza, University of Extremadura, Spain ([email protected])

Remote sensing (RS) image understanding aims at extracting valuable information and acquiring knowledge from remotely sensed data, and artificial intelligence (AI) plays a significant role in this task. With the increased data availability and the development of techniques for data interpretation, particularly deep learning (DL) techniques, the past few years have witnessed tremendous growth in research efforts focused on the visual interpretation of remote sensing images. Such techniques have made significant breakthroughs in multiple domains, such as scene classification, object detection, feature extraction and recognition, and land-use/land-cover mapping, to name a few. Nevertheless, there are still several challenges in this field, mostly related to the robustness and transferability of interpretation approaches, the efficient perception and understanding of RS images, the effective fusion and utilization of multimodal RS data, etc. This special issue is aimed at investigating new techniques, algorithms, and architectures that can be used to overcome the above-mentioned challenges and at bringing together state-of-the-art research in this field. This special issue accepts review/tutorial papers on the following topics:
• Foundation models for downstream tasks of RS image understanding
• Robust AI architectures for detection, segmentation, and recognition in RS images
• Integration of geographical knowledge and deep neural networks in RS image interpretation
• Development of high-quality and large-scale benchmarks with multisource data for AI understanding of RS images
• Effective AI-based processing of multimodal/multisensor RS data
• Applications, such as smart agriculture, change detection and understanding of RS time series, environmental monitoring and sustainable development, urban/rural monitoring and assessment, and natural disaster warning and management
Articles submitted to this special issue of the IEEE Geoscience and Remote Sensing Magazine must have significant relevance to geoscience and remote sensing and should have noteworthy tutorial/review value. Selection of invited papers will be done on the basis of 4-page White Papers, submitted in double-column format. These papers must discuss the foreseen objectives of the paper, the importance of the addressed topic, the impact of the contribution, and the authors' expertise and past activities on the topic. Contributors selected on the basis of the White Papers will be invited to submit full manuscripts. Manuscripts should be submitted online at http://mc.manuscriptcentral.com/grsm using the Manuscript Central interface. Prospective authors should consult the site http://ieeexplore.ieee.org/servlet/opac?punumber=6245518 for guidelines and information on paper submission. Submitted articles should not have been published or be under review elsewhere. All submissions will be peer reviewed according to the IEEE and Geoscience and Remote Sensing Society guidelines.
Special Issue tentative schedule:
• White paper submission deadline: December 31, 2023
• Invitation notification: January 31, 2024
• Full paper submission deadline: March 31, 2024
• Review notification: July 31, 2024
• Revised manuscript due: October 31, 2024
• Final acceptance notification: January 31, 2025
• Final manuscript due: February 28, 2025
• Publication date: June 2025

Digital Object Identifier 10.1109/MGRS.2023.3307828
