Tag Archives: Ryan Kennedy

Cool Things About the Webb Telescope

Everyone is going gaga over the images from the James Webb Space Telescope (JWST).

But for nerds like me, the telescope itself is the interesting part.  It’s a long way away (3000 times as far as the Hubble!), and it’s a pretty slick bit of kit, that’s for sure.  We’ve already noted its origami-inspired sun shade.

The Lagrange points are equilibrium locations where competing gravitational tugs on an object net out to zero. JWST joins two other craft currently occupying L2. (Image Credit: IEEE SPECTRUM) (From [2])

Like my garage, the JWST is solar powered [1].  In fact, its solar array is roughly the same size and capacity as my garage roof. : – )  In space the sunlight is a lot more predictable.  No clouds or trees, and the orbit is simpler than on the surface of a precessing planet.  And L2 is only about 1% farther from the sun than Earth, so the light there is nearly as intense.
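The inverse-square arithmetic is quick to check (distances rounded):

```python
# Back-of-envelope: sunlight intensity at Sun-Earth L2 vs. at Earth.
AU_KM = 149_600_000        # mean Sun-Earth distance, km (rounded)
L2_OFFSET_KM = 1_500_000   # L2 sits ~1.5 million km beyond Earth

r_earth = AU_KM
r_l2 = AU_KM + L2_OFFSET_KM

# Inverse-square law: intensity falls off as 1/r^2.
relative_intensity = (r_earth / r_l2) ** 2
print(f"L2 is {r_l2 / r_earth:.3f}x Earth's distance from the sun")
print(f"sunlight there is {relative_intensity:.1%} of Earth's")
```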

Sunlight is great, indeed vital, but the actual telescope itself needs to be shielded from the sun, both to see in the dark and to stay cold [3].  The Webb is an infrared telescope—we’re trying to image really faint objects—so the optics themselves must be cold, or their own thermal glow would swamp the signal.  They need to be at about 40 K, which is really, really cold.
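Why so cold? Wien’s displacement law says a body at temperature T glows with a thermal spectrum peaking near b/T. A room-temperature telescope glows right in the infrared bands it is trying to observe; at 40 K the glow shifts mostly past them:

```python
# Wien's displacement law: a body at temperature T glows with a thermal
# spectrum peaking near lambda = b / T (b ~ 2898 micrometer-kelvin).
WIEN_B_UM_K = 2897.8

for label, kelvin in [("room-temperature telescope", 300), ("Webb at 40 K", 40)]:
    peak_um = WIEN_B_UM_K / kelvin
    print(f"{label}: thermal glow peaks near {peak_um:.0f} um")
```

At 300 K the peak sits near 10 µm, squarely in the mid-infrared; at 40 K it moves out to roughly 72 µm, beyond most of the observing range.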

This is accomplished with a sun shade, five layers of reflective foil that keep the telescope in complete darkness.  The telescope is exposed to deep space, which is about 2.7 K, so any heat from the machinery radiates away.  This is a simple, elegant passive cooling system.  Honestly, it seems unlikely that anything more complicated would keep working for long.
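The Stefan-Boltzmann law shows why passive radiation suffices: radiated flux scales as T to the fourth power, so a surface facing the 2.7 K sky sheds far more than it receives back. A quick comparison, assuming an ideal black surface:

```python
# Stefan-Boltzmann: an ideal black surface radiates sigma * T^4 per m^2.
SIGMA_W_M2_K4 = 5.670e-8

for kelvin in (300.0, 40.0, 2.7):
    flux = SIGMA_W_M2_K4 * kelvin ** 4
    print(f"{kelvin:5.1f} K surface radiates {flux:.3e} W/m^2")
```

The 2.7 K sky returns microwatts per square meter, so even the feeble ~0.15 W/m² radiated at 40 K is a net outflow.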

Image Credit: STScI/NASA (From [1])

And finally, closer to my own career, the telescope downloads data.  A lot of data.  Up to 57 GB per day, which is more than 25 times what the Hubble produces [2].

As an old data mover, I know that getting data from this kind of remote (1.5 M km!) instrument is a waltz of time and storage space. 

The radio downlink isn’t fast by surface standards (a few MB per second), and it has to share the Deep Space Network (DSN) ground stations with other missions.  So gigabytes of data have to be marshalled on board, compressed, and downloaded in scheduled time slots.  Observations are complicated “programs” which choreograph the capture of photons and the movement of data into storage for download.
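A back-of-envelope check on those scheduled slots. The rate here is my assumption, taking 3.5 MB/s as a stand-in for “a few MB per second”:

```python
# Rough downlink budget: contact time needed to move a day's data.
# The rate is an assumption: ~3.5 MB/s stands in for "a few MB per second".
DAILY_VOLUME_GB = 57.0
RATE_MB_PER_S = 3.5

hours_of_contact = DAILY_VOLUME_GB * 1000 / RATE_MB_PER_S / 3600
print(f"~{hours_of_contact:.1f} hours of DSN contact per day")
```

Call it four to five hours of antenna time per day, which is exactly why the slots have to be scheduled and shared.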

The JWST has 68 GB of onboard storage, which isn’t much by surface standards, but is huge by L2 standards.  This storage can hold about 24 hours of data, which should encompass several opportunities to download it.  This design is intended so that if a single download fails, there should be at least one second chance before the observation is lost forever.
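A toy store-and-forward model shows how a recorder of that size rides out a missed contact. The production rate and capacity come from the figures above; the contact cadence and per-contact drain cap are invented for illustration:

```python
# Toy store-and-forward model: does one missed contact overflow the
# 68 GB recorder? Production and capacity come from the post; the
# contact cadence and per-contact drain cap are invented for illustration.
CAPACITY_GB = 68.0
PRODUCE_GB_PER_H = 57.0 / 24        # ~2.4 GB/h of science data
DRAIN_CAP_GB = 34.0                 # assumed max volume moved per contact
CONTACT_EVERY_H = 12
FAILED_CONTACT_AT_H = 24            # simulate one missed downlink

buffer_gb = peak_gb = 0.0
for hour in range(1, 73):           # three days, hour by hour
    buffer_gb += PRODUCE_GB_PER_H
    peak_gb = max(peak_gb, buffer_gb)
    if hour % CONTACT_EVERY_H == 0 and hour != FAILED_CONTACT_AT_H:
        buffer_gb = max(0.0, buffer_gb - DRAIN_CAP_GB)

print(f"peak buffer: {peak_gb:.1f} GB of {CAPACITY_GB:.0f} GB")
```

In this sketch the backlog peaks at a day’s worth of data and then drains back down over the following contacts, never touching the capacity limit.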

With the first pictures this month, we see that this whole thing is actually working.  As an engineer, that’s amazing to me. : – )

Running at low power near absolute zero.  Executing complicated workflows by (extremely) remote control.  Downloading gigabytes of data.  This shouldn’t work—but it does!

Nice work, all.


  1. Ryan Kennedy, Solar panels power the James Webb telescope, in PV Magazine, July 13, 2022. https://pv-magazine-usa.com/2022/07/13/solar-panels-power-the-james-webb-telescope/
  2. Michael Koziol, The Webb Space Telescope’s Profound Data Challenges, in IEEE Spectrum – Aerospace, July 8, 2022. https://spectrum.ieee.org/james-webb-telescope-communications
  3. Ned Potter, Inside the Universe Machine: The Webb Space Telescope’s Chilly Sun Shield, in IEEE Spectrum – Aerospace, July 7, 2022. https://spectrum.ieee.org/james-webb-telescope-sunshield

Science Article on “Big Data Hubris”

No Big Data story is more famous than Google’s claim to be able to track flu outbreaks in real time, much faster than conventional public health surveillance.

In Science, Lazer and colleagues present an analysis and critique of this claim and of the actual performance of Google Flu Trends.

Their finding is that Google Flu Trends (GFT) consistently overestimates the incidence of flu. In other words, the real time trigger is “too sensitive”, beating the conventional signals in part by “crying wolf”.

These errors are quite important, because this kind of real time prediction is supposed to enable resources to be swiftly deployed, reacting to epidemics much more quickly than slower conventional methods allow. But if the real time prediction is a false positive, these resources will be misallocated, and the deployment effort wasted or misdirected.

To the extent that these errors can be assessed (see below), they appear to be due to the use of poor correlates. The GFT is based on analysis of search terms thought to be related to (i.e., correlated with) the outbreak of flu, such as queries about symptoms and medications. This query behavior is only partly driven by actual symptoms; it may, for instance, be triggered by “winter”. (No points will be awarded for detecting winter via Google searches.) It is also possible that social phenomena, such as media hype, can increase interest and fears regardless of symptoms.
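The “winter” problem is easy to demonstrate: two series that share nothing but seasonality correlate strongly. A toy sketch, with all names and numbers invented:

```python
import numpy as np

# Two series that share only seasonality: a toy "flu incidence" and a
# search term driven purely by winter. Names and numbers are invented.
rng = np.random.default_rng(0)
weeks = np.arange(3 * 52)
season = np.cos(2 * np.pi * weeks / 52)            # peaks each winter
flu = 100 + 40 * season + rng.normal(0, 5, weeks.size)
ski_queries = 500 + 200 * season + rng.normal(0, 30, weeks.size)

r = np.corrcoef(flu, ski_queries)[0, 1]
print(f"correlation: {r:.2f}")   # high, despite no causal link
```

A term like this would sail through a correlation screen, and then fail badly the first season flu and winter decouple.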

Obviously, not every correlated search term accurately predicts the actual outbreak of flu. Worse, the dataset contains 50 million candidate terms, while the data to predict is a few thousand points—overfitting is almost guaranteed.
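That scale mismatch can be sketched numerically. With far more candidate predictors than observations, even pure noise can be fit “perfectly” in-sample (scaled down here from 50 million terms to 200):

```python
import numpy as np

# Overfitting sketch: many more candidate predictors than observations
# means pure noise can be fit perfectly in-sample.
rng = np.random.default_rng(1)
n_points, n_terms = 20, 200
X = rng.normal(size=(n_points, n_terms))   # random "search term" features
y = rng.normal(size=n_points)              # random "flu" target: no signal

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
in_sample_error = np.abs(X @ coef - y).max()
print(f"max in-sample error: {in_sample_error:.2e}")  # essentially zero
```

The fit looks flawless on the training data and is worthless out of sample, which is exactly the trap a 50-million-term screen invites.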

The errors in GFT’s predictions were quite substantial. In fact, the “bad old” conventional reporting, though not real time, was more accurate than GFT for projecting the actual occurrence of flu. This should not be a surprise, since these projections were carefully designed.

Naturally, combining GFT or similar data with other surveillance will be even better than either alone. GFT would also be more useful if commonly used statistical methods were used to model and reduce errors.
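The benefit of combining signals is ordinary statistics: averaging independent estimates of the same quantity reduces error. A toy illustration, with invented stand-in numbers rather than the paper’s actual model:

```python
import numpy as np

# Toy illustration of blending two independent, noisy estimates of the
# same quantity. All numbers are invented stand-ins, not the paper's model.
rng = np.random.default_rng(2)
truth = 100.0
gft_like = truth + rng.normal(5, 10, 10_000)   # biased and noisy
cdc_like = truth + rng.normal(0, 8, 10_000)    # slower, better calibrated
blend = (gft_like + cdc_like) / 2

for name, est in [("gft-like", gft_like), ("cdc-like", cdc_like), ("blend", blend)]:
    rmse = float(np.sqrt(np.mean((est - truth) ** 2)))
    print(f"{name}: RMSE {rmse:.2f}")
```

Even a naive average beats either source alone here; proper statistical weighting would do better still.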

But there are problems in any attempt to use the GFT itself as real data.

The GFT is irreproducible, because it has never been adequately reported. The data is unavailable for study, and the algorithms are closed and ever changing. The GFT could not be published in any reputable scientific journal, and it is difficult to see how you could validate it.

This critique extends to many large data studies: however spectacular the headlines, it is difficult to make useful technology out of opaque, semi-magical processes. I have remarked on the psychology of Big Data, which offers a secular form of prophecy. Lazer et al. call this tendency “Big Data Hubris”: the assumption that big data is always better.

This also demonstrates the reason why we need publicly sponsored science: however wonderful this technology might be, it is owned by Google (or Facebook or Twitter, or whoever), and they release only what they wish to let out.  We have no way to replicate, compare, or even understand exactly what they did.  Whatever this is, it isn’t science (and it isn’t transparent).  Is it evil?  Possibly.

I see the GFT as an example of Google’s overall attitude: a love for quick-and-dirty methods based on the unquestioned assumption that more data is better than anything else, even careful theory and modeling; “open source” that gives people access to selected data, but complete opacity about the actual data and algorithms. “Trust us” is not really good enough for data science.

I would also comment that in GFT and elsewhere, Google has touted the importance of empirical data and analytics for understanding health and safety (and selling advertising). But in the case of Google Glass, the company has not presented any data at all to demonstrate that the device is safe to use, or even useful.  We have been given lots of hype and anecdotes, which their data scientists must cringe at.

As I have pointed out several times, there is strong reason to worry that Glass is really bad for people, possibly producing eye damage, and almost certainly causing distraction. Yet Google has presented no data, and has never publicly said that it has collected or intends to collect such data.

Combining these cases, we see a cavalier and selfish attitude toward science and public safety. A major, monopolistic, for-profit company has a right to act in these ways.  But if it does, then it had better not BS about “not being evil”, because it is, at best, selfish and amoral (just as a capitalist company should be).

Reference:

David Lazer, Ryan Kennedy, Gary King, and Alessandro Vespignani. 2014. The Parable of Google Flu: Traps in Big Data Analysis. Science 343 (14 March): 1203-1205. Copy at http://j.mp/1ii4ETo