Science & Discovery
Searching Hubble’s Archive for Hidden Gems
Because of its data collection and archival system, the Hubble Space Telescope has changed how science is done and who can do it.
By Sarah Wells
NASA, ESA, The Hubble Heritage Team (STScI/AURA), and A. Riess (JHU/STScI)
For nearly three decades, a treasure trove of data and discoveries about the cosmos has sat safely in a Maryland facility, stored inside hundreds of tapes, laser-written optical disks, magnetic disks, and computer “jukeboxes.” Spurred by requests from around the world, workers in the ’90s and early ’00s would rummage through dense aisles of these data-filled vessels day in and day out to collect, copy, and share the information collected by one of NASA’s most ambitious projects: the Hubble Space Telescope (HST).
In its first 30 years of life, HST has faced peril as well as delights. The telescope’s breathtaking images have taught us to think deeply about our universe and inspired generations of scientists to learn how to explore it. These data have led to the discovery of dark energy and of supermassive black holes at the centers of galaxies, and have helped characterize alien worlds. But while direct observations made using the telescope have been incredibly important for the science HST has been able to do, securing time on the telescope has continued to be competitive and challenging.
Spiral galaxy NGC 1309 (opening image) hosted a supernova in 2012. By searching through archival Hubble data from 2005-2006, astronomers found out the type of star that exploded: a remnant white dwarf.
[NASA, ESA, The Hubble Heritage Team (STScI/AURA), and A. Riess (JHU/STScI)]
That is where NASA’s jukeboxes of data become essential. The Hubble Legacy Archive is a first-of-its-kind data archive of not only directly observed HST data but also of high-level science products created using new analysis processes on decades-old datasets. While originally stored only on physical disks and tapes in the Maryland facility, this massive database of every HST observation has been moved to the cloud in recent years. Organizers have joined that data with the archives of NASA’s K2, the forthcoming James Webb Space Telescope, and other missions in a massive international archive called the Mikulski Archive for Space Telescopes (MAST).
While archives may conjure images of dusty museum basements with miles of drawers filled with ancient beaver pelts, the HST archive is one of the Hubble project’s most active and creative resources. In 2019 alone, published papers using HST observations were based on at least 50 percent archival data. Discoveries made using the Hubble Legacy Archive have shaped our understanding of science in the past decades, and the archive has shaped how we do science as well.
Hubble Space Telescope data initially were etched onto laser-written optical discs, which were placed in “jukeboxes” by archival staff.
[STScI]
From a place to the cloud
Long before the archive became the international superstar it is today, it was simply a back room in Maryland filled to the brim with data on optical disks, says Rick White, the Space Telescope Science Institute Archive Branch chief.
“The only way people could get data from the telescope is that it would flow from the telescope, get processed, get sorted and archived, and then they’d retrieve it from the archive — all data went through the archive,” says White. That’s because the raw data had to first be processed before it could be read by scientists.
The Hubble Legacy Archive has been collecting data since the telescope first launched in 1990, and for many years it was the first stop before any data ever reached its intended scientists. But the influx of data was too large for typical storage systems, says White.
“The way it physically worked was that the volume of data coming in from Hubble in 1990 was too large to store on [normal] disks… so there was this big, complex setup that involved writing the data onto optical disks, which were these big, 12-inch-sized platters,” says White. “There were optical disk ‘jukeboxes’ that had slots to hold something like hundreds of disks.”
The Hubble Space Telescope captured this view of Comet C/2012 S1 (ISON), just a week before the scope captured the "UFO-like" observations (below).
[NASA, ESA, J.-Y. Li (PSI), and the Hubble Comet ISON Imaging Science Team]
These disks used to hold hundreds of gigabytes of data, but today the cloud-based database can hold hundreds of terabytes.
Long before the archive had moved to the cloud and could be easily accessed around the world with just a simple click, archive staff would physically fulfill data requests by locating stored optical disks in their respective jukeboxes and copying the data onto tapes, which they then mailed off to researchers. Each data request would be answered within a 24-hour period. While staffers often had to work hard to complete those requests on time, White recalls a particular event that sent requests through the roof, not from the scientific community but from the general public.
In 2013, a UFO conspiracy website discovered an image in the archive that had strange lines and stripes and told its fan base how to access the HST archive to download the image themselves for proof. “There were several million people who were doing exactly the same thing: they were following the link from this post and they were downloading the image, and it completely saturated our web server,” says White. “Our web server is not set up to handle the interest of millions of people.”
That image was actually a composite of three exposures of Comet C/2012 S1 (ISON), and those strange lines and stripes were just a parallax effect. The comet was so much closer to the telescope than the background stars that it appeared to move, smearing its image between the individual exposures. White says they were able to quell the onslaught of requests by posting a short letter on the archive’s main webpage that showed those exposures and explained that there were in fact no photos of UFOs hidden in the archive.
When Hubble observed Comet C/2012 S1 (ISON) during its approach to the Sun, a composite of three separate exposures led some to think this was no comet but instead a UFO. The Hubble archive was overwhelmed by the onslaught of data requests. [STScI/AURA]
Archived data keeps giving
While this influx of public requests created a temporary problem for the archive, Antonella Nota, associate director of the European Space Agency (ESA), says that the wide accessibility of data from the Hubble Legacy Archive has been an extremely important part of its success. It has not only increased the public’s access, but it has also had a powerful effect on how young astrophysicists access this data.
“In typical old-fashioned astronomy, the astronomers go to the telescope, get their own data and put it on magnetic tape, and then they take them home,” says Nota. “But Hubble basically broke that paradigm [because it] offered all the data in the same location and made them available to everybody.”
Before the Hubble Legacy Archive was established in 1990, observational data were available only to those who conducted the initial observation. The notion that these data should be publicly shared after initial analysis by the observing scientists has created an important shift in how astronomical data are viewed and used, says Nota.
Joshua Peek, a Principal Investigator at MAST, adds that this has been incredibly important for students of his. They are able to work with archival data and start getting their names into astronomy journals.
“One way we like to think about the archive is a way for people to get involved with Hubble data without having to go through the permission process of having their proposal approved,” says Peek. “That ends up as a way for them to join the scientific community and the literature, and then that’s a steppingstone into making proposals [for original observations] that are going to pass the committee.”
And even more than simply continuing the work that scientists set out to do in their initial proposals, the “patchwork” of observations collected and stored in the Hubble Legacy Archive allows scientists to do incredibly creative and innovative work. For example, Peek refers to one of his favorite unlikely discoveries: A team of researchers used archival data from Hubble’s fine-guidance sensors to overturn assumptions about the number of rocky objects at the edge of the Kuiper Belt, the band of objects, including Pluto, just beyond Neptune. They found an icy comet-like object just 3,200 feet (975 meters) across as it passed in front of a star. The tiny size suggests Kuiper Belt objects are being ground down by collisions. “I really like when people use the entirety of the archive in some totally strange way,” Peek adds.
The breadth and history of HST’s observations also make the archive an incredibly useful resource. As new researchers return to old data with new computer algorithms, they can process data in ways that weren’t possible a few decades ago and discover new objects previously hidden in the data.
Archived Hubble data is used not just for discoveries but also to create striking images. This compilation image of a protostar spewing jets of material uses data from observations in 2001, 2009, and 2014.
[NASA, ESA, the Hubble Heritage (STScI/AURA)/Hubble-Europe (ESA) Collaboration, D. Padgett (GSFC), T. Megeath (Univ. of Toledo), and B. Reipurth (Univ. of Hawaii)]
In 1994, the Hubble Space Telescope captured these views of the core of the enormous galaxy M87. The disk of gas and the jet shooting out of it essentially confirmed that M87 holds a supermassive black hole at its center.
[Holland Ford (STScI/JHU); Richard Harms, Linda Dressel, and Ajay K. Kochhar (Applied Research Corp.); Zlatan Tsvetanov, Arthur Davidsen, and Gerard Kriss (JHU); Ralph Bohlin and George Hartig (STScI); Bruce Margon (Univ. of Washington, Seattle); and NASA]
The archive’s legacy
As science missions and projects begin to shift their focus toward deep surveys of the sky, the Hubble Legacy Archive has an important role to play. “The [Rubin Observatory’s] Legacy Survey of Space and Time is going to be doing this enormous survey of the whole sky, many, many, many times over, which is going to give us this incredible time-domain perspective,” says Peek. “People are studying things that change in time, [and] what makes the archive so powerful is that it allows you to go backward in the other direction.”
When reflecting on the history of the Hubble Legacy Archive, as well as its future, it’s important to remember that there could be no archive without the HST itself and the high-quality, science-ready data it provides. There is also a different mindset, says Peek, when it comes to studying the archival data. While directly observing photons from HST may seem immediate, there is a long wait before you can ever analyze that data. With the archive, getting hold of data is a lot more immediate, for everyone.
It is this shift in the way scientists can do science through these archives that will be the archive’s great legacy. “[A] philosophical change has been provided by the establishment of the archive. These last 30 years have been revolutionary for the way people do science,” says Nota. “It goes from the privilege of one to sharing with the entire world, because the archive is available to everybody; you just need to have an internet connection. This is to me the democratization of science.” ✰
(Originally published June 2020)
In 2009, after analyzing Hubble images from 1998, an astronomer found two planets orbiting star HR 8799.
[NASA, ESA, and R. Soummer (STScI)]
SARAH WELLS is a science and technology journalist based in Boston who’s interested in how innovation and research intersect with our daily lives. Her work has been published in Undark, Smithsonian.com, Inverse, Gizmodo, and Space.com, among others.
Mercury is an advertisement-free publication. If you are interested in supporting Mercury, please email us.